Published in: Empirical Software Engineering 5/2008

1 October 2008

Do too many cooks spoil the broth? Using the number of developers to enhance defect prediction models

Authors: Elaine J. Weyuker, Thomas J. Ostrand, Robert M. Bell

Abstract

Fault prediction by negative binomial regression models is shown to be effective for four large production software systems from industry. A model developed originally with data from systems with regularly scheduled releases was successfully adapted to a system without releases to identify 20% of that system’s files that contained 75% of the faults. A model with a pre-specified set of variables derived from earlier research was applied to three additional systems, and proved capable of identifying averages of 81, 94 and 76% of the faults in those systems. A primary focus of this paper is to investigate the impact on predictive accuracy of using data about the number of developers who access individual code units. For each system, including the cumulative number of developers who had previously modified a file yielded no more than a modest improvement in predictive accuracy. We conclude that while many factors can “spoil the broth” (lead to the release of software with too many defects), the number of developers is not a major influence.
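The "20% of the files containing 75% of the faults" style of result can be illustrated with a minimal sketch. It assumes per-file predicted fault counts are already available (for example from a fitted negative binomial regression model) and computes the share of actual faults that fall in the top-ranked fraction of files. The function name, its parameters, and the ten-file data set are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def faults_captured(predicted, actual, fraction=0.20):
    """Percentage of actual faults found in the top `fraction` of files,
    with files ranked by predicted fault count (highest first).

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    total_faults = sum(actual)
    if total_faults == 0:
        return 0.0
    # Rank file indices by descending predicted fault count.
    ranked = sorted(range(len(predicted)),
                    key=lambda i: predicted[i], reverse=True)
    # Number of files selected, rounded to the nearest whole file (at least one).
    k = max(1, round(fraction * len(predicted)))
    found = sum(actual[i] for i in ranked[:k])
    return 100.0 * found / total_faults


# Made-up data for ten files: model predictions and faults later observed.
predicted = [5.1, 0.2, 3.3, 0.1, 0.4, 0.0, 1.2, 0.3, 0.1, 0.2]
actual = [6, 0, 3, 0, 1, 0, 2, 0, 0, 0]

print(faults_captured(predicted, actual))
```

With this toy data the two highest-ranked files hold 9 of the 12 faults. Comparing the value obtained with and without a developer-count variable in the model is one way to express the "modest improvement" the abstract describes.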

Metadata
Title
Do too many cooks spoil the broth? Using the number of developers to enhance defect prediction models
Authors
Elaine J. Weyuker
Thomas J. Ostrand
Robert M. Bell
Publication date
1 October 2008
Publisher
Springer US
Published in
Empirical Software Engineering / Issue 5/2008
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-008-9082-8
