Published in: Soft Computing 18/2018

30-11-2017 | Focus

On the value of parameter tuning in heterogeneous ensembles effort estimation

Authors: Mohamed Hosni, Ali Idri, Alain Abran, Ali Bou Nassif


Abstract

Accurate software development effort estimation (SDEE) is fundamental to the efficient management of software development projects, as it helps software managers manage their human resources efficiently. Over the last four decades, software engineering researchers have used several effort estimation techniques, including those based on statistical and machine learning methods, but no consensus has been reached on a technique that performs best in all circumstances. To tackle this challenge, ensemble effort estimation, which predicts software development effort by combining more than one single estimation technique, has recently been investigated. In this paper, heterogeneous ensembles based on four well-known machine learning techniques (K-nearest neighbor, support vector regression, multilayer perceptron and decision trees) were developed and evaluated by investigating the impact of the parameter values of the ensemble members on estimation accuracy. In particular, this paper evaluates whether setting the parameters of the ensemble members using two optimization techniques (grid search and particle swarm optimization) permits more accurate SDEE estimates. The heterogeneous ensembles of this study were built using three combination rules (mean, median and inverse ranked weighted mean) over seven datasets. The results obtained suggest that: (1) single techniques optimized using grid search or particle swarm optimization provide more accurate estimates; (2) in general, ensembles achieve higher accuracy than their single techniques whatever the optimization technique used, even though ensembles do not dominate all single techniques; (3) heterogeneous ensembles based on optimized single techniques provide more accurate estimates; and (4) in general, particle swarm optimization and grid search generate ensembles with the same predictive capability.
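
The three combination rules named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the member estimates and the accuracy ranks are hypothetical, and the inverse ranked weighted mean is assumed to follow the common definition in which, among M members, the member ranked r (1 = most accurate) receives the weight M - r + 1 before normalization. The abstract itself does not define this rule.

```python
from statistics import mean, median


def inverse_ranked_weighted_mean(estimates, ranks):
    """Weighted mean where better-ranked members weigh more.

    Assumed definition (not taken from the abstract): with M members,
    the member ranked r (1 = most accurate) gets weight M - r + 1.
    """
    m = len(estimates)
    weights = [m - r + 1 for r in ranks]
    weighted_sum = sum(w * e for w, e in zip(weights, estimates))
    return weighted_sum / sum(weights)


def combine(estimates, rule, ranks=None):
    """Combine the effort estimates of the ensemble members for one project."""
    if rule == "mean":
        return mean(estimates)
    if rule == "median":
        return median(estimates)
    if rule == "irwm":
        if ranks is None:
            raise ValueError("the 'irwm' rule needs the accuracy ranks")
        return inverse_ranked_weighted_mean(estimates, ranks)
    raise ValueError(f"unknown combination rule: {rule}")


# Hypothetical effort estimates (e.g. person-hours) for a single project
# from four members, in the order: KNN, SVR, MLP, decision tree.
estimates = [120.0, 100.0, 150.0, 110.0]
# Hypothetical accuracy ranks of the same members (1 = most accurate).
ranks = [2, 1, 4, 3]

for rule in ("mean", "median", "irwm"):
    print(rule, combine(estimates, rule, ranks))
```

The mean and median treat all members equally, whereas the rank-based rule pulls the combined estimate toward the members that performed best on past data, so it needs an accuracy ranking obtained beforehand.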

Metadata
Title
On the value of parameter tuning in heterogeneous ensembles effort estimation
Authors
Mohamed Hosni
Ali Idri
Alain Abran
Ali Bou Nassif
Publication date
30-11-2017
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 18/2018
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-017-2945-4
