Published in: Innovations in Systems and Software Engineering 2/2023

13.06.2022 | Original Article

3PcGE: 3-parent child-based genetic evolution for software defect prediction

Written by: Somya Goyal

Abstract

Software defect prediction (SDP) is an active research area in the software industry, aimed at improving the quality of software products. SDP classifiers predict fault-prone modules in the early development phases, before testing begins, so that testing effort can be focused on the modules predicted to be fault-prone. Early detection of fault-prone modules thus increases the chance of releasing error-free products to clients with reduced testing effort and cost. Because SDP uses voluminous, high-dimensional data, feature selection (FS) has become an essential data preprocessing technique. Over the past three decades, search-based feature selection has been widely used to improve the efficiency of predictors. This paper proposes a new FS approach, named 3PcGE, based on the three-parent child (3Pc) and genetic evolution (GE). 3PcGE is inspired by evolutionary computation and by the three-parent biological process that produces an offspring with the best survival capability. In the 3Pc procedure, the spindle is separated from the mother's cell body, which has defective mitochondria, and placed in an emptied donor cell body that has healthy mitochondria. As a result, the three-parent child is healthier than a two-parent child and free from fatal disease. 3PcGE searches the feature space for an optimal feature subset, using classification performance and the number of selected features as the fitness function. FS is modeled as a multi-objective optimization problem, and a Pareto-optimal solution is sought using the evolutionary algorithm (3PcGE). Its performance is compared with state-of-the-art FS techniques. The experimental results show that the proposed 3PcGE outperforms the competing filter-based FS techniques by 18.98% and the wrapper-based FS techniques by 17.5% in the AUC measure. A statistical comparison with the baseline technique (NSGA-II) shows that the proposed 3PcGE is effective at selecting optimal features and yields more accurate SDP models.
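
The bi-objective view of feature selection described in the abstract can be illustrated with a minimal sketch. This is not the 3PcGE algorithm itself; it only shows how Pareto dominance works when each candidate feature subset is scored on two objectives to be minimized. The objective encoding (1 - AUC, number of selected features), the function names `dominates` and `pareto_front`, and the sample values are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple

# Assumed encoding: (classification error = 1 - AUC, number of selected features).
# Both objectives are minimized.
Objectives = Tuple[float, int]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Objectives]) -> List[Objectives]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical scores for five feature subsets (made-up numbers, not results from the paper).
candidates = [(0.20, 12), (0.25, 5), (0.20, 15), (0.30, 5), (0.18, 20)]
front = pareto_front(candidates)
print(front)
```

In this made-up example, the subset scored (0.20, 15) is dropped because (0.20, 12) reaches the same error with fewer features, and (0.30, 5) is dropped because (0.25, 5) has lower error at the same size. The remaining three subsets are mutually non-dominated trade-offs between error and subset size.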

Metadata
Title
3PcGE: 3-parent child-based genetic evolution for software defect prediction
Written by
Somya Goyal
Publication date
13.06.2022
Publisher
Springer London
Published in
Innovations in Systems and Software Engineering / Issue 2/2023
Print ISSN: 1614-5046
Electronic ISSN: 1614-5054
DOI
https://doi.org/10.1007/s11334-021-00427-1