Published in: Empirical Software Engineering 2/2019

11.09.2018

An ensemble-based model for predicting agile software development effort

Authors: Onkar Malgonde, Kaushal Chari


Abstract

To support agile software development projects, an array of tools and systems is available to plan, design, track, and manage the development process. In this paper, we explore a critical aspect of agile development, namely effort prediction, that cuts across these tools and agile project teams. Accurate effort prediction can improve sprint planning by enabling optimal assignment of both stories and developers. We develop a model for story-effort prediction using variables that are readily available when a story is created. We use seven predictive algorithms to predict a story's effort. Interestingly, none of the predictive algorithms consistently outperforms the others in predicting story effort across our test data of 423 stories. We therefore develop an ensemble-based method, built on our model, for predicting story effort. We conduct computational experiments to show that our ensemble-based approach outperforms other ensemble-based benchmarking approaches. We then demonstrate the practical application of our predictive model and our ensemble-based approach by optimizing sprint planning for two projects from our dataset using an optimization model.

Footnotes

1. We thank the anonymous reviewer for pointing this out.

2. Dejaeger et al. (2012) identify 13 data mining techniques used in software effort estimation in a traditional setting. In this paper, we identify representative techniques (regression, decision trees, support vector machines, neural network, Bayesian network) that perform better than their variants, and augment them with newer data mining techniques (k-nearest neighbor and ensemble approaches). For example, a radial kernel outperformed other kernels in support vector machines.

3. Prior effort estimation studies have considered different variants of the algorithms considered in this study. Readers are referred to the studies by Jørgensen and Shepperd (2007), Wen et al. (2012), and Idri et al. (2016) for a review of candidate algorithms in traditional software development projects.

4. A log transformation of the dependent variable gave worse performance.

5. We thank the anonymous reviewer for pointing this out.

6. The measure has four categories: (a) negligible effect (|d| < 0.147), (b) small effect (|d| < 0.33), (c) medium effect (|d| < 0.474), and (d) large effect (|d| ≥ 0.474).

7. To facilitate practical interpretation, we also provide the Vargha and Delaney (2000) statistic (\(\hat{A}_{12}\)) for each pair of predictive algorithms.

8. Increasing β leads to higher computation costs. We choose the maximum value as 8.0 based on our experimental results. Values greater than 8.0 did not significantly improve the results.
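
The effect-size measures in footnotes 6 and 7 can be illustrated with a minimal sketch. It assumes that the measure d in footnote 6 is Cliff's delta, since the listed thresholds match those commonly quoted for it; the page itself does not name the measure. The function names and the two error samples are hypothetical and are not taken from the paper's data.

```python
from itertools import product
from typing import Sequence


def cliffs_delta(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Cliff's delta: share of pairs with x > y minus share with x < y."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    less = sum(1 for x, y in product(xs, ys) if x < y)
    return (greater - less) / (len(xs) * len(ys))


def vargha_delaney_a12(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Vargha-Delaney A12: P(x > y) + 0.5 * P(x == y) over all cross pairs."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    ties = sum(1 for x, y in product(xs, ys) if x == y)
    return (greater + 0.5 * ties) / (len(xs) * len(ys))


def effect_category(d: float) -> str:
    """Map |d| onto the four categories listed in footnote 6."""
    magnitude = abs(d)
    if magnitude < 0.147:
        return "negligible"
    if magnitude < 0.33:
        return "small"
    if magnitude < 0.474:
        return "medium"
    return "large"


# Hypothetical absolute prediction errors of two algorithms (illustrative only).
errors_a = [1.0, 2.0, 2.5, 3.0]
errors_b = [2.0, 3.5, 4.0, 5.0]

delta = cliffs_delta(errors_a, errors_b)
a12 = vargha_delaney_a12(errors_a, errors_b)
category = effect_category(delta)
print(delta, a12, category)
```

Both statistics are computed from the same pairwise comparisons, so A12 equals (d + 1) / 2. A12 reads directly as the probability that a value drawn from the first sample exceeds one drawn from the second, which is what makes it convenient for practical interpretation.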
 
References
Abrahamsson P, Salo O, Ronkainen J, Warsta J (2002) Agile software development methods: Review and analysis. Report, VTT
Abrahamsson P, Moser R, Pedrycz W, Sillitti A, Succi G (2007) Effort prediction in iterative software development processes - incremental versus global prediction models. Empirical Software Engineering and Measurement, pp 344–353
Abrahamsson P, Fronza I, Moser R, Vlasenko J, Pedrycz W (2011) Predicting development effort from user stories. In: International Symposium on Empirical Software Engineering and Measurement, pp 400–403
Aggarwal C (2015) Data Mining: The Textbook. Springer, New York
Azhar D, Riddle P, Mendes E, Mittas N, Angelis L (2013) Using ensembles for web effort estimation. In: ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
Azzeh M, Nassif AB, Minku L (2015) An empirical evaluation of ensemble adjustment methods for analogy-based effort estimation. The Journal of Systems and Software 103:36–52
Bayley S, Falessi D (2018) Optimizing prediction intervals by tuning random forest via meta-validation. arXiv:1801.07194
Beck K, Andres C (2004) Extreme Programming Explained: Embrace Change. Addison-Wesley, Reading
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B (Methodological) 57(1):289–300
Bergmeir C, Benitez JM (2011) Forecaster performance evaluation with cross-validation and variants. In: 11th International Conference on Intelligent Systems Design and Applications (ISDA). IEEE, pp 849–854
Chari K, Agrawal M (2018) Impact of incorrect and new requirements on waterfall software project outcomes. Empir Softw Eng 23(1):165–185
Chowdhury S, Di Nardo S, Hindle A, Jiang ZMJ (2018) An exploratory study on assessing the energy impact of logging on android applications. Empir Softw Eng 23(3):1422–1456
Cinnéide MÓ, Moghadam IH, Harman M, Counsell S, Tratt L (2017) An experimental search-based approach to cohesion metric evaluation. Empir Softw Eng 22(1):292–329
Conboy K (2009) Agility from first principles: Reconstructing the concept of agility in information systems development. Inf Syst Res 20(3):329–354
Dejaeger K, Verbeke W, Martens D, Baesens B (2012) Data mining techniques for software effort estimation: A comparative study. IEEE Trans Softw Eng 38(2):375–97
Grenning J (2002) Planning poker or how to avoid analysis paralysis while release planning. Report, Hawthorn Woods: Renaissance Software Consulting
Hastie T, Tibshirani R, Friedman J (2008) The Elements of Statistical Learning. Springer, New York
Haugen NC (2006) An empirical study of using planning poker for user story estimation. In: Agile Conference, 2006, IEEE, pp 9–pp
Hearty P, Fenton N, Marquez D, Neil M (2009) Predicting project velocity in XP using a learning dynamic bayesian network model. IEEE Trans Softw Eng 35(1):124–137
Hussain I, Kosseim L, Ormandjieva O (2013) Approximation of cosmic functional size to support early effort estimation in agile. Data and Knowledge Engineering 85:2–14
Idri A, Hosni M, Abran A (2016) Systematic literature review of ensemble effort estimation. J Syst Softw 118:151–175
Jahedpari F (2016) Artificial prediction markets for online prediction of continuous variables. PhD thesis, University of Bath, Bath
James G, Witten D, Hastie T, Tibshirani R (2015) An Introduction to Statistical Learning with Applications in R. Springer Texts in Statistics. Springer, New York
Jonsson L, Borg M, Broman D, Sandahl K, Eldh S, Runeson P (2016) Automated bug assignment: Ensemble-based machine learning in large scale industrial contexts. Empir Softw Eng 21(4):1533–1578
Jørgensen M, Shepperd M (2007) A systematic review of software development cost estimation studies. IEEE Trans Softw Eng 33(1):33–53
Karner G (1993) Resource estimation for objectory projects. Objective Systems SF AB, p 17
Kocaguneli E, Menzies T, Keung JW (2012) On the value of ensemble effort estimation. IEEE Trans Softw Eng 38(6):1403–1416
Kultur Y, Turhan B, Bener AB (2008) ENNA: software effort estimation using ensemble of neural networks with associative memory. In: 16th ACM SIGSOFT
Lee D (2016) Alternatives to p value: confidence interval and effect size. Korean Journal of Anesthesiology 69(6):555–562
Li Y, Yue T, Ali S, Zhang L (2017) Zen-ReqOptimizer: A search-based approach for requirements assignment optimization. Empir Softw Eng 22(1):175–234
Logue K, McDaid K, Greer D (2007) Allowing for task uncertainties and dependencies in agile release planning. In: 4th Proceedings of the Software Measurement European Forum, pp 275–284
Lokan C, Mendes E (2014) Investigating the use of duration-based moving windows to improve software effort prediction: A replicated study. Inf Softw Technol 56(9):1063–1075
MacDonell SG, Shepperd M (2010) Data accumulation and software effort prediction. In: ACM-IEEE International Symposium on Empirical Software Engineering and Measurement
Magazinius A, Börjesson S, Feldt R (2012) Investigating intentional distortions in software cost estimation–an exploratory study. J Syst Softw 85(8):1770–1781
Mahnic V, Hovelja T (2012) On using planning poker for estimating user stories. The Journal of Systems and Software 85:2086–2095
Minku L, Yao X (2013) Ensembles and locality: Insight on improving software effort estimation. Inf Softw Technol 55(8):1512–1528
Miyazaki Y, Takanou A, Nozaki H, Nakagawa N, Okada K (1991) Method to estimate parameter values in software prediction models. Inf Softw Technol 33:239–243
Nunes N, Constantine L, Kazman R (2011) iUCP: Estimating interactive-software project size with enhanced use-case points. IEEE Software 28(04):64–73
Palmer S, Felsing J (2002) A Practical Guide to Feature-driven Development. Prentice Hall, Upper Saddle River
Papatheocharous E, Papadopoulos H, Andreou A (2010) Feature subset selection for software cost modelling and estimation. Eng Intell Syst 18:233–246
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: Machine learning in python. J Mach Learn Res 12:2825–2830
Pendharkar P, Subramanian G, Rodger J (2005) A probabilistic model for predicting software development effort. IEEE Trans Softw Eng 31(7):615–624
Perols J, Chari K, Agrawal M (2009) Information market-based decision fusion. Manag Sci 55(5):827–842
Pikkarainen M, Haikara J, Salo O, Abrahamsson P, Still J (2008) The impact of agile practices on communication in software development. Empir Softw Eng 13(3):303–337
Santana C, Leoneo F, Vasconcelos A, Gusmão C (2011) Using function points in agile projects. In: International Conference on Agile Software Development. Springer, pp 176–191
Shmueli G, Bruce P, Patel N (2016) Data Mining for Business Analytics: Concepts, Techniques, and Applications with XLMiner. Wiley, Hoboken
Stapleton J (1997) Dynamic systems development method. Addison-Wesley, Boston
Usman M, Mendes E, Weidt F, Britto R (2014) Effort estimation in agile software development: a systematic literature review. In: 10th International Conference on Predictive Models in Software Engineering, pp 82–91
Usman M, Mendes E, Börstler J (2015) Effort estimation in agile software development: a survey on the state of the practice. In: 19th International Conference on Evaluation and Assessment in Software Engineering
Vargha A, Delaney HD (2000) A critique and improvement of the cl common language effect size statistics of mcgraw and wong. J Educ Behav Stat 25(2):101–132
VersionOne (2016) 10th annual state of agile report. Technical report
Vidgen R, Wang X (2009) Coevolving systems and the organization of agile software development. Inf Syst Res 20(3):355–376
Wen J, Li S, Lin Z, Hu Y, Huang C (2012) Systematic literature review of machine learning based software development effort estimation models. Inf Softw Technol 54(1):41–59
Wolpert DH (1992) Stacked generalization. Neural networks 5(2):241–259
Metadata
Title
An ensemble-based model for predicting agile software development effort
Authors
Onkar Malgonde
Kaushal Chari
Publication date
11.09.2018
Publisher
Springer US
Published in
Empirical Software Engineering, Issue 2/2019
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-018-9647-0
