
2022 | OriginalPaper | Book chapter

Simplify Your Neural Networks: An Empirical Study on Cross-Project Defect Prediction

Authors: Ruchika Malhotra, Abuzar Ahmed Khan, Amrit Khera

Published in: Computer Networks and Inventive Communication Technologies

Publisher: Springer Nature Singapore


Abstract

Ensuring software quality is of utmost importance now that every industry depends on software. Software bugs can have grave consequences, so identifying and fixing them is imperative for developers. Software defect prediction (SDP) focuses on identifying defect-prone areas so that testing resources can be allocated judiciously. Sourcing historical data is not easy, especially for new software, and this is where cross-project defect prediction (CPDP) comes in. Both SDP and, specifically, CPDP have attracted the attention of the research community. At the same time, the versatility of neural networks (NNs) has pushed researchers to study their application to defect prediction. In most research, the architecture of an NN is arrived at through trial and error. This requires effort, which can be reduced if there is a general idea of what kind of architecture works well in a particular field. In this paper, we tackle this problem by testing six different NNs on a dataset of twenty software projects from the PROMISE repository in a strict CPDP setting. We then compare the best architecture with three methods proposed for CPDP, which cover a wide range of scenarios. We found that the simplest NN with dropout layers (NN-SD) performed best and was also statistically significantly better than the CPDP methods it was compared with. We used the area under the receiver operating characteristic curve (AUC-ROC) as the performance metric and, to test statistical significance, the Friedman chi-squared test and the Wilcoxon signed-rank test.
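The evaluation described in the abstract rests on two quantities: a per-project AUC-ROC and a Friedman chi-squared statistic computed over the AUC values of all compared methods. The following is a minimal sketch of both in plain Python, not the authors' implementation. The function names `auc_roc`, `average_ranks`, and `friedman_chi2` and the numbers in the usage lines are hypothetical and chosen only for illustration; the Friedman statistic is shown without tie correction, and the post hoc Wilcoxon signed-rank test is not covered.

```python
from itertools import product


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a defective module (label 1)
    receives a higher score than a clean one (label 0); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one defective and one clean module")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def average_ranks(row):
    """Rank one row of AUC values, best (highest) first.
    Tied values share the average of the ranks they span."""
    order = sorted(range(len(row)), key=lambda i: -row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def friedman_chi2(table):
    """Friedman chi-squared statistic (no tie correction) for a table with
    one row per target project and one column per compared method."""
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return (12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums)
            - 3.0 * n * (k + 1))


# Illustrative (made-up) predictions for one target project.
example_auc = auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Illustrative (made-up) AUC table: 3 target projects x 3 methods.
example_chi2 = friedman_chi2([[0.90, 0.80, 0.70],
                              [0.85, 0.75, 0.60],
                              [0.95, 0.70, 0.65]])
```

In a strict CPDP setting, each row of such a table would hold the AUC values obtained on one target project by models trained without any data from that project. A large statistic relative to the chi-squared distribution with k - 1 degrees of freedom indicates that the methods do not all perform alike, which is what motivates a pairwise follow-up test such as the Wilcoxon signed-rank test.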


Metadata
Title
Simplify Your Neural Networks: An Empirical Study on Cross-Project Defect Prediction
Authors
Ruchika Malhotra
Abuzar Ahmed Khan
Amrit Khera
Copyright year
2022
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-16-3728-5_7
