Skip to main content
Erschienen in: Empirical Software Engineering 1/2014

01.02.2014

Software defect prediction using Bayesian networks

verfasst von: Ahmet Okutan, Olcay Taner Yıldız

Erschienen in: Empirical Software Engineering | Ausgabe 1/2014

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

There are lots of different software metrics discovered and used for defect prediction in the literature. Instead of dealing with so many metrics, it would be practical and easy if we could determine the set of metrics that are most important and focus on them more to predict defectiveness. We use Bayesian networks to determine the probabilistic influential relationships among software metrics and defect proneness. In addition to the metrics used in Promise data repository, we define two more metrics, i.e. NOD for the number of developers and LOCQ for the source code quality. We extract these metrics by inspecting the source code repositories of the selected Promise data repository data sets. At the end of our modeling, we learn the marginal defect proneness probability of the whole software system, the set of most effective metrics, and the influential relationships among metrics and defectiveness. Our experiments on nine open source Promise data repository data sets show that response for class (RFC), lines of code (LOC), and lack of coding quality (LOCQ) are the most effective metrics whereas coupling between objects (CBO), weighted method per class (WMC), and lack of cohesion of methods (LCOM) are less effective metrics on defect proneness. Furthermore, number of children (NOC) and depth of inheritance tree (DIT) have very limited effect and are untrustworthy. On the other hand, based on the experiments on Poi, Tomcat, and Xalan data sets, we observe that there is a positive correlation between the number of developers (NOD) and the level of defectiveness. However, further investigation involving a greater number of projects is needed to confirm our findings.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat Alpaydın E (2004) Introduction to machine learning. The MIT Press, Cambridge, MA Alpaydın E (2004) Introduction to machine learning. The MIT Press, Cambridge, MA
Zurück zum Zitat Amasaki S, Takagi Y, Mizuno O, Kikuno T (2003) A bayesian belief network for assessing the likelihood of fault content. In: Proceedings of the 14th international symposium on software reliability engineering, ISSRE ’03. IEEE Computer Society, Washington, DC Amasaki S, Takagi Y, Mizuno O, Kikuno T (2003) A bayesian belief network for assessing the likelihood of fault content. In: Proceedings of the 14th international symposium on software reliability engineering, ISSRE ’03. IEEE Computer Society, Washington, DC
Zurück zum Zitat Bibi S, Stamelos I (2004) Software process modeling with bayesian belief networks. In: Proceedings of 10th international software metrics symposium Bibi S, Stamelos I (2004) Software process modeling with bayesian belief networks. In: Proceedings of 10th international software metrics symposium
Zurück zum Zitat Boetticher G, Menzies T, Ostrand T (2007) Promise repository of empirical software engineering data http://promisedata.org/ repository. Department of Computer Science, West Virginia University Boetticher G, Menzies T, Ostrand T (2007) Promise repository of empirical software engineering data http://​promisedata.​org/​ repository. Department of Computer Science, West Virginia University
Zurück zum Zitat Boetticher GD (2005) Nearest neighbor sampling for better defect prediction. In: Proceedings of the 2005 workshop on predictor models in software engineering, PROMISE ’05 Boetticher GD (2005) Nearest neighbor sampling for better defect prediction. In: Proceedings of the 2005 workshop on predictor models in software engineering, PROMISE ’05
Zurück zum Zitat Chidamber SR, Kemerer CF (1991) Towards a metrics suite for object oriented design. SIGPLAN 26(11):197–211CrossRef Chidamber SR, Kemerer CF (1991) Towards a metrics suite for object oriented design. SIGPLAN 26(11):197–211CrossRef
Zurück zum Zitat Cooper GF, Herskovits E (1992) A bayesian method for the induction of probabilistic networks from data. Mach Learn 9(4):309–347MATH Cooper GF, Herskovits E (1992) A bayesian method for the induction of probabilistic networks from data. Mach Learn 9(4):309–347MATH
Zurück zum Zitat Dejaeger K, Verbraken T, Baesens B (2012) Towards comprehensible software fault prediction models using bayesian network classifiers. IEEE Trans Softw Eng 99:1. doi:10.1109/TSE.2012.20 Dejaeger K, Verbraken T, Baesens B (2012) Towards comprehensible software fault prediction models using bayesian network classifiers. IEEE Trans Softw Eng 99:1. doi:10.​1109/​TSE.​2012.​20
Zurück zum Zitat Ekanayake J, Tappolet J, Gall H, Bernstein A (2009) Tracking concept drift of software projects using defect prediction quality. In: 6th IEEE international working conference on mining software repositories, 2009. MSR ’09, pp 51–60 Ekanayake J, Tappolet J, Gall H, Bernstein A (2009) Tracking concept drift of software projects using defect prediction quality. In: 6th IEEE international working conference on mining software repositories, 2009. MSR ’09, pp 51–60
Zurück zum Zitat Emam KE, Melo W, Machado JC (2001) The prediction of faulty classes using object-oriented design metrics. J Syst Softw 56:63–75CrossRef Emam KE, Melo W, Machado JC (2001) The prediction of faulty classes using object-oriented design metrics. J Syst Softw 56:63–75CrossRef
Zurück zum Zitat Fenton N, Krause P, Neil M (2002) Software measurement: uncertainty and causal modeling. IEEE Softw 19:116–122CrossRef Fenton N, Krause P, Neil M (2002) Software measurement: uncertainty and causal modeling. IEEE Softw 19:116–122CrossRef
Zurück zum Zitat Fenton N, Neil M, Marquez D (2008) Using Bayesian networks to predict software defects and reliability. J Risk Reliability 222(4):701–712 Fenton N, Neil M, Marquez D (2008) Using Bayesian networks to predict software defects and reliability. J Risk Reliability 222(4):701–712
Zurück zum Zitat Fenton N, Neil M, Marsh W, Hearty P, Marquez D, Krause P, Mishra R (2007) Predicting software defects in varying development lifecycles using bayesian nets. Inf Softw Technol 49(1):32–43 Fenton N, Neil M, Marsh W, Hearty P, Marquez D, Krause P, Mishra R (2007) Predicting software defects in varying development lifecycles using bayesian nets. Inf Softw Technol 49(1):32–43
Zurück zum Zitat Fenton NE, Neil M (1999) A critique of software defect prediction models. IEEE Trans Softw Eng 25:675–689CrossRef Fenton NE, Neil M (1999) A critique of software defect prediction models. IEEE Trans Softw Eng 25:675–689CrossRef
Zurück zum Zitat Gyimothy T, Ferenc R, Siket I (2005) Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Trans Softw Eng 31(10):897–910CrossRef Gyimothy T, Ferenc R, Siket I (2005) Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Trans Softw Eng 31(10):897–910CrossRef
Zurück zum Zitat Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. SIGKDD Explorations 11(1):10–18CrossRef Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. SIGKDD Explorations 11(1):10–18CrossRef
Zurück zum Zitat Henderson-Sellers B (1996) Object-oriented metrics: measures of complexity. Prentice-Hall Henderson-Sellers B (1996) Object-oriented metrics: measures of complexity. Prentice-Hall
Zurück zum Zitat Hu Y, Zhang X, Sun X, Liu M, Du J (2009) An intelligent model for software project risk prediction. In: International conference on information management, innovation management and industrial engineering, 2009, vol 1, pp 629–632 Hu Y, Zhang X, Sun X, Liu M, Du J (2009) An intelligent model for software project risk prediction. In: International conference on information management, innovation management and industrial engineering, 2009, vol 1, pp 629–632
Zurück zum Zitat Jin C, Liu J-A (2010) Applications of support vector machine and unsupervised learning for predicting maintainability using object-oriented metrics. In: 2010 second international conference on multimedia and information technology (MMIT), vol 1, pp 24–27 Jin C, Liu J-A (2010) Applications of support vector machine and unsupervised learning for predicting maintainability using object-oriented metrics. In: 2010 second international conference on multimedia and information technology (MMIT), vol 1, pp 24–27
Zurück zum Zitat Kaur A, Sandhu P, Bra A (2009) Early software fault prediction using real time defect data. In: Second international conference on machine vision, 2009. ICMV ’09, pp 242–245 Kaur A, Sandhu P, Bra A (2009) Early software fault prediction using real time defect data. In: Second international conference on machine vision, 2009. ICMV ’09, pp 242–245
Zurück zum Zitat Khoshgoftaar T, Allen E, Busboom J (2000) Modeling software quality: the software measurement analysis and reliability toolkit. In: 12th IEEE international conference on tools with artificial intelligence, 2000. ICTAI 2000. Proceedings, pp 54–61 Khoshgoftaar T, Allen E, Busboom J (2000) Modeling software quality: the software measurement analysis and reliability toolkit. In: 12th IEEE international conference on tools with artificial intelligence, 2000. ICTAI 2000. Proceedings, pp 54–61
Zurück zum Zitat Khoshgoftaar T, Ganesan K, Allen E, Ross F, Munikoti R, Goel N, Nandi A (1997) Predicting fault-prone modules with case-based reasoning. In: Proceedings the eighth international symposium on software reliability engineering, pp 27–35, 2–5 Khoshgoftaar T, Ganesan K, Allen E, Ross F, Munikoti R, Goel N, Nandi A (1997) Predicting fault-prone modules with case-based reasoning. In: Proceedings the eighth international symposium on software reliability engineering, pp 27–35, 2–5
Zurück zum Zitat Khoshgoftaar TM, Pandya AS, Lanning DL (1995) Application of neural networks for predicting program faults. Ann Softw Eng 1:141–154CrossRef Khoshgoftaar TM, Pandya AS, Lanning DL (1995) Application of neural networks for predicting program faults. Ann Softw Eng 1:141–154CrossRef
Zurück zum Zitat Koru AG, Liu H (2005) An investigation of the effect of module size on defect prediction using static measures. SIGSOFT Softw Eng Notes 30:1–5 Koru AG, Liu H (2005) An investigation of the effect of module size on defect prediction using static measures. SIGSOFT Softw Eng Notes 30:1–5
Zurück zum Zitat Lessmann S, Baesens B, Mues C, Pietsch S (2008) Benchmarking classification models for software defect prediction: a proposed framework and novel findings. IEEE Trans Softw Eng 34:485–496CrossRef Lessmann S, Baesens B, Mues C, Pietsch S (2008) Benchmarking classification models for software defect prediction: a proposed framework and novel findings. IEEE Trans Softw Eng 34:485–496CrossRef
Zurück zum Zitat Menzies T, Butcher A, Marcus A, Zimmermann T, Cok D (2011) Local vs global models for effort estimation and defect prediction. In: Proceedings of the 26st IEEE/ACM international conference on automated software engineering, Lawrence, Kansas, USA Menzies T, Butcher A, Marcus A, Zimmermann T, Cok D (2011) Local vs global models for effort estimation and defect prediction. In: Proceedings of the 26st IEEE/ACM international conference on automated software engineering, Lawrence, Kansas, USA
Zurück zum Zitat Menzies T, Greenwald J, Frank A (2007) Data mining static code attributes to learn defect predictors. IEEE Trans Softw Eng 33:2–13CrossRef Menzies T, Greenwald J, Frank A (2007) Data mining static code attributes to learn defect predictors. IEEE Trans Softw Eng 33:2–13CrossRef
Zurück zum Zitat Menzies T, Shepperd M (2012) Special issue on repeatable results in software engineering prediction. Empir Softw Eng 17(1–2):1–17 Menzies T, Shepperd M (2012) Special issue on repeatable results in software engineering prediction. Empir Softw Eng 17(1–2):1–17
Zurück zum Zitat Mockus A, Fielding RT, Herbsleb JD (2002) Two case studies of open source software development: Apache and Mozilla. ACM Trans Softw Eng Methodol 11:309–346CrossRef Mockus A, Fielding RT, Herbsleb JD (2002) Two case studies of open source software development: Apache and Mozilla. ACM Trans Softw Eng Methodol 11:309–346CrossRef
Zurück zum Zitat Munson JC, Khoshgoftaar TM (1992) The detection of fault-prone programs. IEEE Trans Softw Eng 18:423–433CrossRef Munson JC, Khoshgoftaar TM (1992) The detection of fault-prone programs. IEEE Trans Softw Eng 18:423–433CrossRef
Zurück zum Zitat Myrtveit I, Stensrud E, Shepperd M (2005) Reliability and validity in comparative studies of software prediction models. IEEE Trans Softw Eng 31:380–391CrossRef Myrtveit I, Stensrud E, Shepperd M (2005) Reliability and validity in comparative studies of software prediction models. IEEE Trans Softw Eng 31:380–391CrossRef
Zurück zum Zitat Nagappan N, Murphy B, Basili V (2008) The influence of organizational structure on software quality: an empirical case study. In: Proceedings of the 30th international conference on software engineering, ICSE ’08. ACM, New York, pp 521–530 Nagappan N, Murphy B, Basili V (2008) The influence of organizational structure on software quality: an empirical case study. In: Proceedings of the 30th international conference on software engineering, ICSE ’08. ACM, New York, pp 521–530
Zurück zum Zitat Norick B, Krohn J, Howard E, Welna B, Izurieta C (2010) Effects of the number of developers on code quality in open source software: a case study. In: Succi G, Morisio M, Nagappan N (eds) ESEM. ACM Norick B, Krohn J, Howard E, Welna B, Izurieta C (2010) Effects of the number of developers on code quality in open source software: a case study. In: Succi G, Morisio M, Nagappan N (eds) ESEM. ACM
Zurück zum Zitat Pai G, Dugan J (2007) Empirical analysis of software fault content and fault proneness using bayesian methods. IEEE Trans Softw Eng 33(10):675–686CrossRef Pai G, Dugan J (2007) Empirical analysis of software fault content and fault proneness using bayesian methods. IEEE Trans Softw Eng 33(10):675–686CrossRef
Zurück zum Zitat Pendharkar PC, Rodger JA (2007) An empirical study of the impact of team size on software development effort. Inf Technol Manag 8:253–262CrossRef Pendharkar PC, Rodger JA (2007) An empirical study of the impact of team size on software development effort. Inf Technol Manag 8:253–262CrossRef
Zurück zum Zitat Pérez-Miñana E, Gras J-J (2006) Improving fault prediction using bayesian networks for the development of embedded software applications: research articles. Softw Test Verif Reliab 16(3):157–174CrossRef Pérez-Miñana E, Gras J-J (2006) Improving fault prediction using bayesian networks for the development of embedded software applications: research articles. Softw Test Verif Reliab 16(3):157–174CrossRef
Zurück zum Zitat Perry DE, Porter AA, Votta LG (2000) Empirical studies of software engineering: a roadmap. In: Proceedings of the conference on the future of software engineering, ICSE ’00. ACM, New York, pp 345–355CrossRef Perry DE, Porter AA, Votta LG (2000) Empirical studies of software engineering: a roadmap. In: Proceedings of the conference on the future of software engineering, ICSE ’00. ACM, New York, pp 345–355CrossRef
Zurück zum Zitat Posnett D, Filkov V, Devanbu PT (2011) Ecological inference in empirical software engineering. IEEE, pp 362–371 Posnett D, Filkov V, Devanbu PT (2011) Ecological inference in empirical software engineering. IEEE, pp 362–371
Zurück zum Zitat Shepperd M, Kadoda G (2001) Comparing software prediction techniques using simulation. IEEE Trans Softw Eng 27:1014–1022CrossRef Shepperd M, Kadoda G (2001) Comparing software prediction techniques using simulation. IEEE Trans Softw Eng 27:1014–1022CrossRef
Zurück zum Zitat Shivaji S, Whitehead E, Akella R, Kim S (2009) Reducing features to improve bug prediction. In: 24th IEEE/ACM international conference on automated software engineering, 2009. ASE ’09, pp 600–604 Shivaji S, Whitehead E, Akella R, Kim S (2009) Reducing features to improve bug prediction. In: 24th IEEE/ACM international conference on automated software engineering, 2009. ASE ’09, pp 600–604
Zurück zum Zitat Song Q, Shepperd M, Cartwright M, Mair C (2006) Software defect association mining and defect correction effort prediction. IEEE Trans Softw Eng 32(2):69–82 Song Q, Shepperd M, Cartwright M, Mair C (2006) Software defect association mining and defect correction effort prediction. IEEE Trans Softw Eng 32(2):69–82
Zurück zum Zitat Suffian M, Abdullah M (2010) Establishing a defect prediction model using a combination of product metrics as predictors via six sigma methodology. In: 2010 international symposium in information technology (ITSim), pp 1087–1092 Suffian M, Abdullah M (2010) Establishing a defect prediction model using a combination of product metrics as predictors via six sigma methodology. In: 2010 international symposium in information technology (ITSim), pp 1087–1092
Zurück zum Zitat Thwin MMT, Quah T-S (2002) Application of neural network for predicting software development faults using object-oriented design metrics. In: Proceedings of the 9th international conference on neural information processing, 2002. ICONIP ’02, vol 5, pp 2312–2316 Thwin MMT, Quah T-S (2002) Application of neural network for predicting software development faults using object-oriented design metrics. In: Proceedings of the 9th international conference on neural information processing, 2002. ICONIP ’02, vol 5, pp 2312–2316
Zurück zum Zitat Wolpert D, Macready W (1997) No free lunch theorems for optimization. IEEE Trans Evol Comput 1(1):67–82 Wolpert D, Macready W (1997) No free lunch theorems for optimization. IEEE Trans Evol Comput 1(1):67–82
Zurück zum Zitat Zhang D (2000) Applying machine learning algorithms in software development. In: Proceedings of the 2000 Monterey workshop on modeling software system structures in a fastly moving scenario, pp 275–291 Zhang D (2000) Applying machine learning algorithms in software development. In: Proceedings of the 2000 Monterey workshop on modeling software system structures in a fastly moving scenario, pp 275–291
Zurück zum Zitat Zhou Y, Leung H (2006) Empirical analysis of object-oriented design metrics for predicting high and low severity faults. IEEE Trans Softw Eng 32:771–789CrossRef Zhou Y, Leung H (2006) Empirical analysis of object-oriented design metrics for predicting high and low severity faults. IEEE Trans Softw Eng 32:771–789CrossRef
Metadaten
Titel
Software defect prediction using Bayesian networks
verfasst von
Ahmet Okutan
Olcay Taner Yıldız
Publikationsdatum
01.02.2014
Verlag
Springer US
Erschienen in
Empirical Software Engineering / Ausgabe 1/2014
Print ISSN: 1382-3256
Elektronische ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-012-9218-8

Weitere Artikel der Ausgabe 1/2014

Empirical Software Engineering 1/2014 Zur Ausgabe