Deep Learning for Software Defect Prediction: A Survey

Authors:
Safa Omri, Karlsruhe Institute of Technology, Karlsruhe, Germany
Carsten Sinz, Karlsruhe Institute of Technology, Karlsruhe, Germany

ICSEW'20: Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops, June 2020, Pages 209–214. https://doi.org/10.1145/3387940.3391463

Published: 25 September 2020

ABSTRACT

Software fault prediction is an important and beneficial practice for improving software quality and reliability. The ability to predict which components in a large software system are most likely to contain the largest numbers of faults in the next release helps to better manage projects, including early estimation of possible release delays, and affordably guide corrective actions to improve the quality of the software. However, developing robust fault prediction models is a challenging task and many techniques have been proposed in the literature. Traditional software fault prediction studies mainly focus on manually designing features (e.g. complexity metrics), which are input into machine learning classifiers to identify defective code. However, these features often fail to capture the semantic and structural information of programs. Such information is needed for building accurate fault prediction models. In this survey, we discuss various approaches in fault prediction, also explaining how in recent studies deep learning algorithms for fault prediction help to bridge the gap between programs' semantics and fault prediction features and make accurate predictions.
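
The limitation of hand-designed features mentioned in the abstract can be illustrated with a small sketch. It is not taken from the survey: the two snippets, the metric set, and the function names `handcrafted_metrics` and `call_sequence` are all hypothetical, chosen only to show how order-insensitive counts can be identical for programs that behave differently, while a sequence derived from the syntax tree still tells them apart.

```python
import ast

# Two hypothetical snippets: same size and same calls, different call order.
# SNIPPET_B pops before appending, which fails on an empty list.
SNIPPET_A = """
def f(q):
    q.append(1)
    q.pop()
"""

SNIPPET_B = """
def f(q):
    q.pop()
    q.append(1)
"""


def handcrafted_metrics(source):
    """Order-insensitive counts, similar in spirit to classic complexity metrics."""
    nodes = list(ast.walk(ast.parse(source)))
    return {
        "loc": sum(1 for line in source.splitlines() if line.strip()),
        "calls": sum(isinstance(n, ast.Call) for n in nodes),
        "branches": sum(isinstance(n, (ast.If, ast.For, ast.While)) for n in nodes),
    }


def call_sequence(source):
    """Called names in source order: a crude stand-in for an AST token sequence."""
    calls = [n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.Call)]
    calls.sort(key=lambda n: (n.lineno, n.col_offset))
    names = []
    for call in calls:
        func = call.func
        if isinstance(func, ast.Attribute):
            names.append(func.attr)
        elif isinstance(func, ast.Name):
            names.append(func.id)
    return names


if __name__ == "__main__":
    # The metric vectors are identical, the call sequences are not.
    print(handcrafted_metrics(SNIPPET_A) == handcrafted_metrics(SNIPPET_B))
    print(call_sequence(SNIPPET_A), call_sequence(SNIPPET_B))
```

A classifier trained only on the metric dictionary would see the two snippets as the same input. A model that consumes the ordered sequence, which is roughly the kind of input the deep learning approaches in the survey are described as learning from, at least has the information needed to distinguish them.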

Index Terms

Deep Learning for Software Defect Prediction: A Survey
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
    2. Machine learning approaches
      1. Neural networks
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging

Index terms have been assigned to the content through auto-classification.

Published in

ICSEW'20: Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops
June 2020
831 pages
ISBN:9781450379632
DOI:10.1145/3387940

Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
deep learning
machine learning
software defect prediction
software quality assurance
software testing
Qualifiers
- research-article
- Research
- Refereed limited