Empirical Software Engineering

01-10-2008

Techniques for evaluating fault prediction models

Authors: Yue Jiang, Bojan Cukic, Yan Ma

Published in: Empirical Software Engineering | Issue 5/2008

Abstract

Many statistical techniques have been proposed to predict the fault-proneness of program modules in software engineering. Choosing the “best” candidate among many available models involves performance assessment and detailed comparison, but these comparisons are not simple because many different performance measures can be applied. Classifying a software module as fault-prone implies the application of some verification activities, thus adding to the development cost. Misclassifying a module as fault-free carries the risk of system failure, which also has cost implications. Methodologies for precise evaluation of fault prediction models should be at the core of empirical software engineering research, but have attracted only sporadic attention. In this paper, we give an overview of model evaluation techniques. In addition to many techniques that have been used in software engineering studies before, we introduce and discuss the merits of cost curves. Using data from a public repository, our study demonstrates the strengths and weaknesses of performance evaluation techniques and points to the conclusion that the “best” model cannot be selected without considering project cost characteristics, which are specific to each development environment.
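The dependence of the “best” model on project cost characteristics can be illustrated with a minimal sketch of the quantity a cost curve plots, the normalized expected cost as a function of the probability cost of the fault-prone class. The two classifiers, their error rates, the fault-prone proportion, and the cost ratios below are hypothetical values chosen for illustration; they are not results from the paper, and the function names are assumptions of this sketch.

```python
def probability_cost(p_pos, cost_fn, cost_fp):
    """Probability cost of the fault-prone (positive) class.

    p_pos   : proportion of fault-prone modules
    cost_fn : cost of misclassifying a fault-prone module as fault-free
    cost_fp : cost of misclassifying a fault-free module as fault-prone
    """
    pos = p_pos * cost_fn
    neg = (1.0 - p_pos) * cost_fp
    return pos / (pos + neg)


def normalized_expected_cost(fnr, fpr, pc_pos):
    """Normalized expected cost of a classifier at one operating condition.

    fnr, fpr : false negative and false positive rates of the classifier
    pc_pos   : probability cost of the positive class, in [0, 1]
    """
    return fnr * pc_pos + fpr * (1.0 - pc_pos)


# Hypothetical classifiers as (false negative rate, false positive rate).
# Illustrative numbers only, not taken from the paper.
MODELS = {
    "A": (0.40, 0.05),  # conservative: few false alarms, misses more faults
    "B": (0.10, 0.30),  # aggressive: catches more faults, more false alarms
}


def best_model(p_pos, cost_fn, cost_fp):
    """Return the name of the model with the lowest normalized expected cost."""
    pc = probability_cost(p_pos, cost_fn, cost_fp)
    return min(MODELS, key=lambda name: normalized_expected_cost(*MODELS[name], pc))


if __name__ == "__main__":
    # Same data (20% fault-prone modules), two assumed cost ratios.
    for cost_fn in (1, 20):
        print(f"cost_fn:cost_fp = {cost_fn}:1 ->", best_model(0.2, cost_fn, 1))
```

Under these assumed numbers, model A has the lower expected cost when both kinds of misclassification cost the same, while model B is preferable when a missed fault-prone module is assumed to be twenty times as costly as a false alarm. The ranking changes with the cost ratio alone, even though the classifiers and the data are unchanged.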

Title: Techniques for evaluating fault prediction models
Authors: Yue Jiang, Bojan Cukic, Yan Ma
Publication date: 01-10-2008
Publisher: Springer US
Published in: Empirical Software Engineering / Issue 5/2008
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI: https://doi.org/10.1007/s10664-008-9079-3
