DOI: 10.1145/3387940.3391463

Deep Learning for Software Defect Prediction: A Survey

Published: 25 September 2020

ABSTRACT

Software fault prediction is an important practice for improving software quality and reliability. Predicting which components of a large software system are likely to contain the most faults in the next release helps teams manage projects better: it supports early estimation of possible release delays and guides corrective actions affordably. Developing robust fault prediction models is challenging, however, and many techniques have been proposed in the literature. Traditional software fault prediction studies focus mainly on manually designed features (e.g., complexity metrics), which are fed into machine learning classifiers to identify defective code. These features often fail to capture the semantic and structural information of programs, which is needed to build accurate fault prediction models. In this survey, we discuss various approaches to fault prediction and explain how, in recent studies, deep learning algorithms help bridge the gap between program semantics and fault prediction features and make accurate predictions.
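The gap between hand-designed metrics and program semantics can be made concrete with a small sketch. The example below is illustrative only and is not taken from the survey: the function names (`handcrafted_metrics`, `call_sequence`), the three toy metrics, and the two snippets are all assumptions chosen for brevity. It shows two functions that receive identical metric vectors although the order of their operations differs, and an ordered call sequence, used here as a simple stand-in for the kind of structural input that learned representations might consume, that does tell them apart.

```python
import ast

# Two hypothetical snippets: same statements, different order.
# In BUGGY, pop() may run on an empty list before anything is appended.
CLEAN = """
def process(items):
    items.append(0)
    for i in range(2):
        items.pop()
        items.append(i)
"""

BUGGY = """
def process(items):
    for i in range(2):
        items.pop()
        items.append(i)
    items.append(0)
"""


def handcrafted_metrics(source: str) -> dict:
    """Count a few toy size/complexity metrics (not a standard metric suite)."""
    nodes = list(ast.walk(ast.parse(source)))
    return {
        "lines": sum(1 for line in source.splitlines() if line.strip()),
        "branches": sum(isinstance(n, (ast.If, ast.For, ast.While)) for n in nodes),
        "calls": sum(isinstance(n, ast.Call) for n in nodes),
    }


def call_sequence(source: str) -> list:
    """Return called names in source order, a crude proxy for structure."""
    calls = [n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.Call)]
    calls.sort(key=lambda n: (n.lineno, n.col_offset))
    names = []
    for call in calls:
        func = call.func
        if isinstance(func, ast.Attribute):
            names.append(func.attr)   # method call such as items.pop()
        elif isinstance(func, ast.Name):
            names.append(func.id)     # plain call such as range(2)
    return names


same_metrics = handcrafted_metrics(CLEAN) == handcrafted_metrics(BUGGY)
same_sequence = call_sequence(CLEAN) == call_sequence(BUGGY)
```

Here `same_metrics` is true while `same_sequence` is false, so a classifier trained only on these counts could not separate the two snippets. Real metric suites are far richer than this toy set, so the sketch illustrates the limitation the abstract describes rather than measuring it.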

Published in

ICSEW'20: Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops
June 2020
831 pages
ISBN: 9781450379632
DOI: 10.1145/3387940

Copyright © 2020 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

• research-article
• Research
• Refereed limited
