Published in: Soft Computing 9/2015

01-09-2015 | Focus

Three empirical studies on predicting software maintainability using ensemble methods

Authors: Mahmoud O. Elish, Hamoud Aljamaan, Irfan Ahmad

Abstract

More accurate prediction of software maintenance effort contributes to better management and control of software maintenance. Several research studies have recently investigated the use of computational intelligence models for software maintainability prediction. The performance of these models, however, may vary from dataset to dataset. Consequently, ensemble methods have become increasingly popular, as they take advantage of the capabilities of their constituent computational intelligence models on a given dataset to achieve more accurate, or at least competitive, prediction accuracy compared to individual models. This paper investigates and empirically evaluates different homogeneous and heterogeneous ensemble methods in predicting software maintenance effort and change proneness. Three major empirical studies were designed and conducted, taking into consideration different design issues such as the types of ensemble methods investigated, the types of prediction problems, the datasets used, and other aspects of the experimental setup. Overall, the empirical evidence obtained from the three studies confirms that some ensemble methods provide more accurate, or at least competitive, prediction accuracy compared to individual models across datasets, and thus they are more reliable.
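As a rough illustration of the idea of combining constituent models, the following minimal Python sketch builds a heterogeneous ensemble that averages the predictions of three simple regressors. It is not the method evaluated in the paper: the base models (`fit_mean`, `fit_linear`, `fit_nearest`), the simple-average combination rule, and the toy data are all assumptions chosen only to keep the example self-contained.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline model: always predicts the mean of the training targets."""
    m = mean(ys)
    return lambda x: m


def fit_linear(xs, ys):
    """Ordinary least-squares line fitted to the (x, y) pairs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def fit_nearest(xs, ys):
    """1-nearest-neighbour: returns the target of the closest training x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def fit_average_ensemble(xs, ys, fitters):
    """Heterogeneous ensemble: fit every base model, then average outputs."""
    models = [fit(xs, ys) for fit in fitters]
    return lambda x: mean(model(x) for model in models)


def mean_absolute_error(model, xs, ys):
    """Average absolute difference between predictions and true targets."""
    return mean(abs(model(x) - y) for x, y in zip(xs, ys))


# Hypothetical toy data (e.g. a size metric vs. maintenance effort).
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [2.0, 4.0, 6.0, 8.0]

ensemble = fit_average_ensemble(
    train_x, train_y, [fit_mean, fit_linear, fit_nearest]
)
baseline = fit_mean(train_x, train_y)

print("ensemble MAE:", mean_absolute_error(ensemble, train_x, train_y))
print("baseline MAE:", mean_absolute_error(baseline, train_x, train_y))
```

A homogeneous ensemble would instead combine several instances of the same model type, for example trained on different resamples of the data, and other combination rules (such as weighted averaging) could replace the plain mean used here.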

Metadata
Title
Three empirical studies on predicting software maintainability using ensemble methods
Authors
Mahmoud O. Elish
Hamoud Aljamaan
Irfan Ahmad
Publication date
01-09-2015
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 9/2015
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-014-1576-2
