Published in: Empirical Software Engineering 7/2022

1 December 2022

SPVF: security property assisted vulnerability fixing via attention-based models

Authors: Zhou Zhou, Lili Bo, Xiaoxue Wu, Xiaobing Sun, Tao Zhang, Bin Li, Jiale Zhang, Sicong Cao



Abstract

In the past few years, machine learning models have been widely applied to fixing vulnerabilities automatically. However, existing approaches do not capture the vulnerability characteristics that would make automated fixing more effective. In this paper, we propose SPVF, a novel approach for automatically fixing vulnerabilities. SPVF extracts security properties from the descriptive information about a vulnerability. It is based on the attention mechanism and uses the abstract syntax tree together with the security properties, integrating them with a pointer generator. Experimental results on two public datasets show that SPVF outperforms state-of-the-art approaches by 13% for C/C++ and 47% for Python, and that it successfully fixes 153 C/C++ vulnerabilities and 276 Python vulnerabilities.
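
The abstract says the inputs are integrated with a pointer generator. The sketch below illustrates only the generic pointer-generator mixing step as commonly described for summarization models: the output distribution blends a vocabulary distribution with attention weights copied from the source tokens. It is not SPVF's implementation; the function name, parameter names and toy values are all assumptions made for illustration.

```python
from collections import defaultdict


def pointer_generator_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Blend generating from a vocabulary with copying from the source.

    final[w] = p_gen * vocab_dist[w]
               + (1 - p_gen) * (sum of attention over positions holding w)

    All names here are hypothetical and chosen for this illustration.
    """
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")
    if len(attention) != len(source_tokens):
        raise ValueError("attention needs one weight per source token")

    final = defaultdict(float)
    # Generation part: scale the vocabulary distribution by p_gen.
    for token, prob in vocab_dist.items():
        final[token] += p_gen * prob
    # Copy part: route attention mass to the source tokens it points at,
    # which lets out-of-vocabulary identifiers receive probability.
    for token, weight in zip(source_tokens, attention):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


# Toy inputs (made up): "buf" and "len" are not in the vocabulary,
# so they can only be produced by copying.
example = pointer_generator_distribution(
    p_gen=0.5,
    vocab_dist={"if": 0.5, "return": 0.3, "<unk>": 0.2},
    attention=[0.5, 0.25, 0.25],
    source_tokens=["buf", "len", "buf"],
)
```

In this toy case the repeated source token "buf" accumulates the attention from both of its positions, which is why copy mechanisms are attractive for code, where project-specific identifiers rarely appear in a fixed vocabulary.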


Footnotes
3
Note that a small number of vulnerabilities cannot be classified into the classifications listed due to the absence of CWE id information in the dataset.
 
Metadata
Title
SPVF: security property assisted vulnerability fixing via attention-based models
Authors
Zhou Zhou
Lili Bo
Xiaoxue Wu
Xiaobing Sun
Tao Zhang
Bin Li
Jiale Zhang
Sicong Cao
Publication date
1 December 2022
Publisher
Springer US
Published in
Empirical Software Engineering / Issue 7/2022
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-022-10216-4
