Cross project defect prediction for open source software

  • Original Research
  • International Journal of Information Technology

Abstract

Software defect prediction identifies defects early in the life cycle so that testing resources can be optimized and maintenance effort reduced. Defect prediction works well when sufficient data is available to train the prediction model. However, this is not always the case, for example when the software is in its first release or the company has not maintained significant data. In such cases, cross project defect prediction may identify the defective classes. In this work, we study the feasibility of cross project defect prediction and validate it empirically. We conducted our experiments on 12 open source datasets, and the prediction model is built using 12 software metrics. After studying the various train-test combinations, we found that cross project defect prediction was feasible in 35 out of 132 cases. The success of a prediction is determined by the precision, recall and AUC of the prediction model. We also analyzed 14 descriptive characteristics to construct a decision tree. The decision tree learnt from this data has 15 rules that describe when cross project defect prediction is feasible.
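The 132 cases are consistent with training on one of the 12 datasets and testing on each of the other 11, since 12 × 11 = 132; the abstract does not spell this out, so treat that reading as an assumption. The following minimal Python sketch enumerates such ordered train-test pairs and shows how the three reported measures could be computed for a single pair. The dataset names, the helper functions `precision_recall` and `auc`, and the toy labels are hypothetical and are not taken from the paper.

```python
from itertools import permutations

# Placeholder names: the abstract does not list the 12 datasets.
datasets = [f"project_{i:02d}" for i in range(1, 13)]

# Ordered (train, test) pairs with train != test (assumed experimental design).
pairs = list(permutations(datasets, 2))
print(len(pairs))  # 12 * 11 = 132


def precision_recall(y_true, y_pred):
    """Precision and recall for the defective class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def auc(y_true, scores):
    """AUC as the fraction of (defective, clean) pairs ranked correctly.

    Ties between a defective and a clean class count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores for one hypothetical train-test pair.
y_true = [1, 0, 1, 0]
y_pred = [1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.1]
print(precision_recall(y_true, y_pred))  # (0.5, 0.5)
print(auc(y_true, scores))               # 0.75
```

The abstract does not state the cut-offs on precision, recall and AUC that mark a pair as successful, so the sketch stops at computing the measures.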

Author information

Corresponding author

Correspondence to Anushree Agrawal.

Cite this article

Agrawal, A., Malhotra, R. Cross project defect prediction for open source software. Int. j. inf. tecnol. 14, 587–601 (2022). https://doi.org/10.1007/s41870-019-00299-6

  • DOI: https://doi.org/10.1007/s41870-019-00299-6
