
Automated Software Engineering

Published: 01-05-2024

An extensive study of the effects of different deep learning models on code vulnerability detection in Python code

Authors: Rongcun Wang, Senlei Xu, Xingyu Ji, Yuan Tian, Lina Gong, Ke Wang

Published in: Automated Software Engineering | Issue 1/2024


Abstract

Deep learning has made great progress in automated code vulnerability detection, and several deep learning-based detection approaches have been proposed. However, few studies have empirically examined the impact of different deep learning models on code vulnerability detection in Python. We therefore cover a broader range of code representation learning models and classification models for vulnerability detection. We design and conduct an empirical study that evaluates eighteen deep learning architectures, derived from combinations of three representation learning models (Word2Vec, fastText, and CodeBERT) and six classification models (random forest, XGBoost, Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU)). Additionally, two machine learning strategies, the attention and bi-directional mechanisms, are empirically compared. Statistical significance and effect size analyses between the different models are also conducted. In terms of precision, recall, and F-score, Word2Vec outperforms both fastText and the BERT-based (Bidirectional Encoder Representations from Transformers) CodeBERT. Likewise, LSTM and GRU are superior to the other classification models we studied. The bi-directional LSTM and GRU with attention, using Word2Vec, are the two best-performing models for code vulnerability detection in Python code. Moreover, they show medium or large effect sizes relative to LSTM and GRU variants that use only a single mechanism. Both the representation learning models and the classification models have an important influence on vulnerability detection in Python code. Likewise, the bi-directional and attention mechanisms affect the performance of code vulnerability detection.
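The study design summarized in the abstract can be illustrated with a minimal Python sketch. It enumerates the eighteen representation/classifier pairings, computes precision, recall, and F-score for binary predictions, and compares two sets of scores with an effect size. The abstract does not name the effect size statistic, so Cliff's delta with the commonly used thresholds (0.147, 0.33, 0.474) is an assumption here, and every function name is hypothetical rather than taken from the authors' code.

```python
from itertools import product

# Model names as listed in the abstract.
REPRESENTATIONS = ["Word2Vec", "fastText", "CodeBERT"]
CLASSIFIERS = ["random forest", "XGBoost", "MLP", "CNN", "LSTM", "GRU"]


def architectures():
    """Return every (representation, classifier) pairing: 3 x 6 = 18."""
    return list(product(REPRESENTATIONS, CLASSIFIERS))


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F-score for binary labels (1 = vulnerable)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def cliffs_delta(xs, ys):
    """Cliff's delta (assumed measure): P(x > y) - P(x < y) over all pairs."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def magnitude(delta):
    """Map |delta| to a label using commonly cited thresholds (assumption)."""
    d = abs(delta)
    if d < 0.147:
        return "negligible"
    if d < 0.33:
        return "small"
    if d < 0.474:
        return "medium"
    return "large"


if __name__ == "__main__":
    print(len(architectures()), "architectures")
    # Illustrative labels only; these are not results from the study.
    print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
    # Made-up F-scores for two hypothetical model variants.
    delta = cliffs_delta([0.81, 0.84, 0.86], [0.74, 0.78, 0.80])
    print(delta, magnitude(delta))
```

In practice each pairing would be trained and scored over repeated runs, and the per-run scores of two models would be the inputs to `cliffs_delta`; the numbers in the usage section above are placeholders chosen only to exercise the functions.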




Title: An extensive study of the effects of different deep learning models on code vulnerability detection in Python code
Authors: Rongcun Wang, Senlei Xu, Xingyu Ji, Yuan Tian, Lina Gong, Ke Wang
Publication date: 01-05-2024
Publisher: Springer US
Published in: Automated Software Engineering / Issue 1/2024
Print ISSN: 0928-8910
Electronic ISSN: 1573-7535
DOI: https://doi.org/10.1007/s10515-024-00413-4
