research-article

Software effort estimation as a multiobjective learning problem

Authors:
Leandro L. Minku, The University of Birmingham, UK
Xin Yao, The University of Birmingham, UK

ACM Transactions on Software Engineering and Methodology, Volume 22, Issue 4, Article No. 35, pp. 1–32. https://doi.org/10.1145/2522920.2522928

Published: 22 October 2013

Abstract

Ensembles of learning machines are promising for software effort estimation (SEE), but they need to be tailored to this task for their potential to be exploited. A key issue when creating ensembles is producing base models that are both diverse and accurate. If different performance measures behave sufficiently differently for SEE, they could serve as a natural way of creating SEE ensembles. We propose to view SEE model creation as a multiobjective learning problem. A multiobjective evolutionary algorithm (MOEA) is used to better understand the tradeoff among different performance measures by creating SEE models through the simultaneous optimisation of these measures. We show that the performance measures behave very differently, sometimes even presenting opposite trends. They are then used as a source of diversity for creating SEE ensembles. A good tradeoff among different measures can be obtained by using an ensemble of MOEA solutions. This ensemble performs similarly to or better than a model that does not consider these measures explicitly. In addition, the MOEA is flexible, allowing a particular measure to be emphasised if desired. In conclusion, the MOEA can be used to better understand the relationship among performance measures and has been shown to be very effective in creating SEE models.
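
The tradeoff idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: no evolutionary algorithm is run, the candidate "models" are just fixed lists of made-up estimates, and the two measures used (MMRE and PRED(25)) are assumed here for illustration because the abstract does not name the measures. The sketch only shows how candidates can be compared on several measures at once, how the non-dominated ones can be kept, and how their estimates can be averaged into an ensemble.

```python
def mmre(actual, predicted):
    """Mean magnitude of relative error; lower is better."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def pred(actual, predicted, level=0.25):
    """Fraction of estimates within `level` of the actual effort; higher is better."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) / a <= level)
    return hits / len(actual)


def dominates(u, v):
    """True if u is no worse than v on every objective and better on at least one.

    All objectives are treated as values to minimise.
    """
    return all(x <= y for x, y in zip(u, v)) and any(x < y for x, y in zip(u, v))


def pareto_front(objectives):
    """Indices of the objective vectors that no other vector dominates."""
    return [
        i
        for i, u in enumerate(objectives)
        if not any(dominates(v, u) for j, v in enumerate(objectives) if j != i)
    ]


def ensemble_estimate(predictions):
    """Average the estimates of several models, project by project."""
    return [sum(column) / len(predictions) for column in zip(*predictions)]


# Hypothetical data: actual efforts and the estimates of three candidate models.
actual = [100.0, 100.0, 100.0, 100.0]
candidates = [
    [130.0, 130.0, 130.0, 130.0],  # consistently 30% off
    [100.0, 100.0, 100.0, 250.0],  # exact on three projects, far off on one
    [200.0, 200.0, 200.0, 200.0],  # consistently 100% off
]

# Express both measures as quantities to minimise: (MMRE, 1 - PRED(25)).
objectives = [(mmre(actual, c), 1.0 - pred(actual, c)) for c in candidates]
front = pareto_front(objectives)
combined = ensemble_estimate([candidates[i] for i in front])

print("objectives:", objectives)
print("non-dominated candidates:", front)
print("ensemble estimates:", combined)
```

In this toy data the first two candidates each win on a different measure, so neither dominates the other, while the third is worse than the first on MMRE and no better on PRED(25). Averaging the non-dominated candidates is one simple way to combine them; other combination rules are equally possible.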

Published in

ACM Transactions on Software Engineering and Methodology Volume 22, Issue 4
Testing, debugging, and error handling, formal methods, lifecycle concerns, evolution and maintenance
October 2013
387 pages
ISSN: 1049-331X
EISSN: 1557-7392
DOI: 10.1145/2522920

Copyright © 2013 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 22 October 2013
- Accepted: 1 December 2012
- Revised: 1 August 2012
- Received: 1 November 2011
Author Tags
Software effort estimation
ensembles of learning machines
multi-objective evolutionary algorithms
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 80
- Total Downloads: 1,011
- Downloads (last 12 months): 26
- Downloads (last 6 weeks): 7