
Software effort estimation as a multiobjective learning problem

Published: 22 October 2013

Abstract

Ensembles of learning machines are promising for software effort estimation (SEE), but they need to be tailored to this task for their potential to be exploited. A key issue when creating ensembles is producing diverse and accurate base models. If different performance measures behave differently enough for SEE, they could be used as a natural way of creating SEE ensembles. We propose to view SEE model creation as a multiobjective learning problem. A multiobjective evolutionary algorithm (MOEA) is used to better understand the tradeoff among different performance measures by creating SEE models through the simultaneous optimisation of these measures. We show that the performance measures behave very differently, sometimes even presenting opposite trends. They are then used as a source of diversity for creating SEE ensembles. A good tradeoff among the different measures can be obtained by using an ensemble of MOEA solutions. This ensemble performs similarly to or better than a model that does not consider these measures explicitly. In addition, the MOEA is flexible, allowing a particular measure to be emphasised if desired. In conclusion, MOEA can be used to better understand the relationships among performance measures and has been shown to be very effective in creating SEE models.
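To make the formulation concrete, the following minimal sketch (not taken from the article) shows how several SEE performance measures can be treated as simultaneous minimisation objectives and how the Pareto-nondominated models can be combined into an ensemble by averaging their predictions. The choice of measures (MMRE, PRED(25), MAE), the helper names objectives, nondominated, and pareto_ensemble_predict, and the assumption that each candidate model exposes a predict method are illustrative only; the article itself uses an MOEA to create the models through simultaneous optimisation of its chosen measures, rather than merely filtering a fixed pool of already trained candidates.

    # Minimal sketch in Python/NumPy; measure choices and helper names are
    # illustrative assumptions, not the article's exact formulation.
    import numpy as np

    def objectives(y_true, y_pred):
        # Express each measure so that lower values are better.
        mre = np.abs(y_true - y_pred) / y_true     # magnitude of relative error per project
        mmre = mre.mean()                          # mean MRE (minimise)
        pred25 = (mre <= 0.25).mean()              # PRED(25): fraction of projects within 25% (maximise)
        mae = np.abs(y_true - y_pred).mean()       # mean absolute error (minimise)
        return np.array([mmre, 1.0 - pred25, mae])

    def nondominated(F):
        # Indices of Pareto-nondominated rows of the objective matrix F (one row per model).
        keep = []
        for i, fi in enumerate(F):
            dominated = any(np.all(fj <= fi) and np.any(fj < fi)
                            for j, fj in enumerate(F) if j != i)
            if not dominated:
                keep.append(i)
        return keep

    def pareto_ensemble_predict(models, X_new, X_val, y_val):
        # Evaluate every candidate on a validation set, keep the Pareto front,
        # and average the front's effort predictions for the new projects.
        F = np.array([objectives(y_val, m.predict(X_val)) for m in models])
        front = nondominated(F)
        return np.mean([models[i].predict(X_new) for i in front], axis=0)

Averaging the Pareto solutions' outputs is one simple way of obtaining the good tradeoff mentioned above; emphasising a particular measure, as the abstract notes the MOEA allows, would instead amount to selecting the front solution that performs best on that measure.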

              Published in

              ACM Transactions on Software Engineering and Methodology, Volume 22, Issue 4
              Testing, debugging, and error handling, formal methods, lifecycle concerns, evolution and maintenance
              October 2013
              387 pages
              ISSN: 1049-331X
              EISSN: 1557-7392
              DOI: 10.1145/2522920

              Copyright © 2013 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 22 October 2013
              • Accepted: 1 December 2012
              • Revised: 1 August 2012
              • Received: 1 November 2011
              Published in TOSEM Volume 22, Issue 4


              Qualifiers

              • research-article
              • Research
              • Refereed
