Automated Software Engineering

01-12-2013

Finding conclusion stability for selecting the best effort predictor in software effort estimation

Published in: Automated Software Engineering | Issue 4/2013

Abstract

Background: Conclusion Instability in software effort estimation (SEE) refers to the inconsistent results produced by a diversity of predictors using different datasets. This is largely due to the “ranking instability” problem, which is highly related to the evaluation criteria and the subset of the data being used.

Aim: To determine stable rankings of different predictors.

Method: 90 predictors are applied to 20 datasets and evaluated using 7 performance measures, and the results are subjected to a Wilcoxon rank test (95%). These results are called the “aggregate results”. The aggregate results are then challenged by a sanity check, which focuses on a single error measure (MRE) and uses a newly developed evaluation algorithm called CLUSTER. These results are called the “specific results”.
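To make the evaluation step concrete, the following is a minimal sketch of how two predictors might be compared on one dataset. It is not the authors' implementation. MRE is computed with its standard definition, |actual - predicted| / actual. Everything else is an assumption made for illustration: the abstract does not list the seven performance measures (MMRE, MdMRE and Pred(25) are shown only as common summaries of MRE), it does not say which Wilcoxon variant is used (the rank-sum variant with a normal approximation is assumed here), and the win/tie/loss labelling and all function names are hypothetical.

```python
from statistics import NormalDist, median


def mre(actual, predicted):
    """Magnitude of relative error per project: |actual - predicted| / actual."""
    return [abs(a - p) / a for a, p in zip(actual, predicted)]


def mmre(errors):
    """Mean MRE over all projects."""
    return sum(errors) / len(errors)


def mdmre(errors):
    """Median MRE over all projects."""
    return median(errors)


def pred(errors, level=0.25):
    """Fraction of projects whose MRE is at most `level` (Pred(25) by default)."""
    return sum(e <= level for e in errors) / len(errors)


def rank_sum_p_value(xs, ys):
    """Two-sided Wilcoxon rank-sum p-value (assumed variant).

    Uses average ranks for ties and a normal approximation without
    tie or continuity correction, so it is only a rough sketch.
    """
    pooled = sorted((v, g) for g, vals in enumerate((xs, ys)) for v in vals)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n, m = len(xs), len(ys)
    mean = n * (n + m + 1) / 2
    sd = (n * m * (n + m + 1) / 12) ** 0.5
    z = (rank_sum_x - mean) / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))


def compare(errors_a, errors_b, alpha=0.05):
    """Hypothetical win/tie/loss label for predictor A against predictor B."""
    if rank_sum_p_value(errors_a, errors_b) >= alpha:
        return "tie"  # no significant difference at the 95% level
    return "win" if median(errors_a) < median(errors_b) else "loss"


# Illustrative numbers only, not data from the study.
actual = [100, 200, 400, 800]
errors = mre(actual, [110, 180, 400, 1000])
print(mmre(errors), mdmre(errors), pred(errors))
```

Repeating `compare` over every pair of predictors, every dataset and every performance measure, and tallying the labels, is one plausible way to arrive at an aggregate ranking of the kind the abstract describes.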

Results: The aggregate results show that (1) it is now possible to draw stable conclusions about the relative performance of SEE predictors, and (2) regression trees and analogy-based methods are the best performers. The specific results of the sanity check confirm the aggregate results.

Conclusion: This study offers a means of addressing the conclusion instability issue in SEE, which is an important finding for empirical software engineering.

Title: Finding conclusion stability for selecting the best effort predictor in software effort estimation
Publication date: 01-12-2013
Published in: Automated Software Engineering / Issue 4/2013
Print ISSN: 0928-8910
Electronic ISSN: 1573-7535
DOI: https://doi.org/10.1007/s10515-012-0108-5
