Abstract
Ensembles of learning machines are promising for software effort estimation (SEE), but they need to be tailored to this task for their potential to be exploited. A key issue when creating ensembles is producing base models that are both diverse and accurate. If different performance measures behave differently enough for SEE, they could serve as a natural way of creating SEE ensembles. We propose to view SEE model creation as a multiobjective learning problem. A multiobjective evolutionary algorithm (MOEA) is used to better understand the tradeoff among different performance measures by creating SEE models through the simultaneous optimisation of these measures. We show that the performance measures behave very differently, sometimes even presenting opposite trends. They are then used as a source of diversity for creating SEE ensembles. A good tradeoff among different measures can be obtained by using an ensemble of MOEA solutions. This ensemble performs similarly to or better than a model that does not consider these measures explicitly. In addition, the MOEA is flexible, allowing a particular measure to be emphasised if desired. In conclusion, the MOEA can be used to better understand the relationship among performance measures and has been shown to be very effective in creating SEE models.
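The idea of treating several performance measures as simultaneous objectives can be illustrated with a minimal sketch. It is not the authors' implementation: it skips the evolutionary search entirely and only shows how candidate models that trade off different measures end up on a Pareto front, and how those nondominated models could be combined into an ensemble. The abstract does not name the measures, so MMRE, PRED(25) and MAE are assumed here for illustration. The effort values, the candidate predictions and the use of a plain average as the combination rule are likewise hypothetical.

```python
from typing import List, Sequence, Tuple


def mmre(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean magnitude of relative error (lower is better)."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def pred25(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Fraction of projects whose relative error is at most 25% (higher is better)."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) / a <= 0.25)
    return hits / len(actual)


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error (lower is better)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def objectives(actual: Sequence[float], predicted: Sequence[float]) -> Tuple[float, float, float]:
    """Objective vector in which every entry is minimised; PRED(25) is negated."""
    return (mmre(actual, predicted), -pred25(actual, predicted), mae(actual, predicted))


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: Sequence[Sequence[float]]) -> List[int]:
    """Indices of the candidates that no other candidate dominates."""
    return [
        i for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def ensemble_predict(predictions: Sequence[Sequence[float]], members: Sequence[int]) -> List[float]:
    """Average the predictions of the selected members (assumed combination rule)."""
    n_projects = len(predictions[0])
    return [sum(predictions[m][k] for m in members) / len(members) for k in range(n_projects)]


# Hypothetical actual efforts and predictions from three candidate models.
actual = [100.0, 200.0, 400.0]
candidates = [
    [110.0, 190.0, 300.0],  # best PRED(25), worse MMRE and MAE than the next one
    [100.0, 260.0, 400.0],  # best MMRE and MAE, worse PRED(25)
    [150.0, 300.0, 600.0],  # worse than both on every measure
]

scores = [objectives(actual, c) for c in candidates]
front = pareto_front(scores)
ensemble = ensemble_predict(candidates, front)

print(front)     # [0, 1]
print(ensemble)  # [105.0, 225.0, 350.0]
```

The first two candidates disagree on which is better depending on the measure, so neither dominates the other and both stay on the front. This is the kind of disagreement among measures that the abstract describes as a source of diversity.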
Index Terms
- Software effort estimation as a multiobjective learning problem