Published in: Empirical Software Engineering 7/2022

01-12-2022

SPVF: security property assisted vulnerability fixing via attention-based models

Authors: Zhou Zhou, Lili Bo, Xiaoxue Wu, Xiaobing Sun, Tao Zhang, Bin Li, Jiale Zhang, Sicong Cao


Abstract

The past few years have witnessed the wide application of machine learning models to fix vulnerabilities automatically. However, existing approaches fail to capture the characteristics of vulnerabilities that could improve the effectiveness of automated vulnerability fixing. In this paper, we propose SPVF, a novel approach for automatically fixing vulnerabilities. SPVF captures the security property from the descriptive information about the vulnerability. It is based on the attention mechanism and uses both the abstract syntax tree and the security properties, integrating them through a pointer generator. Experimental results on two public datasets show that SPVF outperforms the state-of-the-art approaches by 13% for C/C++ and 47% for Python, and that it successfully fixes 153 C/C++ vulnerabilities and 276 Python vulnerabilities.
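
The pointer generator mentioned in the abstract can be illustrated with a minimal sketch. The code below shows only the generic mixing step of a pointer-generator network in the style of See, Liu and Manning (2017), not SPVF's actual implementation. The function name, its parameters, and the example tokens are hypothetical and chosen for illustration. The idea is that the output distribution is a weighted sum of a vocabulary distribution (generate) and the attention weights over the input tokens (copy), so that tokens absent from the vocabulary, such as project-specific identifiers, can still be produced.

```python
from collections import defaultdict


def pointer_generator_distribution(p_gen, vocab_probs, attention, source_tokens):
    """Mix a generation distribution with a copy distribution.

    Hypothetical helper for illustration only (not taken from the paper).
    p_gen         -- probability in [0, 1] of generating from the vocabulary
    vocab_probs   -- dict mapping each vocabulary token to its probability
    attention     -- attention weights over source positions (should sum to 1)
    source_tokens -- input tokens, one per attention weight
    """
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have equal length")
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")

    final = defaultdict(float)
    # Generation part: scale every vocabulary probability by p_gen.
    for token, prob in vocab_probs.items():
        final[token] += p_gen * prob
    # Copy part: add (1 - p_gen) times the attention mass of each source
    # position to the token found there; repeated tokens accumulate mass.
    for weight, token in zip(attention, source_tokens):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


# Assumed toy input: "buf_len" is not in the vocabulary but appears twice
# in the source, so it can only be produced through the copy path.
example = pointer_generator_distribution(
    p_gen=0.6,
    vocab_probs={"if": 0.5, "return": 0.3, "<unk>": 0.2},
    attention=[0.2, 0.5, 0.3],
    source_tokens=["if", "buf_len", "buf_len"],
)
```

In this toy input the out-of-vocabulary token "buf_len" receives probability only through the copy path, while "if" receives mass from both paths. The mixed distribution still sums to one as long as both input distributions do.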


Footnotes
3
Note that a small number of vulnerabilities cannot be classified into the classifications listed due to the absence of CWE id information in the dataset.
 
Metadata
Title: SPVF: security property assisted vulnerability fixing via attention-based models
Authors: Zhou Zhou, Lili Bo, Xiaoxue Wu, Xiaobing Sun, Tao Zhang, Bin Li, Jiale Zhang, Sicong Cao
Publication date: 01-12-2022
Publisher: Springer US
Published in: Empirical Software Engineering, Issue 7/2022
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI: https://doi.org/10.1007/s10664-022-10216-4
