Abstract
Data are a cornerstone of empirical software engineering (ESE) research and practice. Data underpin numerous process and project management activities, including the estimation of development effort and the prediction of the likely location and severity of defects in code. Serious questions have been raised, however, over the quality of the data used in ESE. Data quality problems caused by noise, outliers, and incompleteness have been noted as especially prevalent. Other quality issues, although also potentially important, have received less attention. In this study, we assess the quality of 13 datasets that have been used extensively in research on software effort estimation. The quality issues considered in this article draw on a taxonomy that we published previously based on a systematic mapping of data quality issues in ESE. Our contributions are as follows: (1) an evaluation of the “fitness for purpose” of these commonly used datasets and (2) an assessment of the utility of the taxonomy in terms of dataset benchmarking. We also propose a template that could be used both to improve the ESE data collection/submission process and to evaluate other such datasets, contributing to enhanced awareness of data quality issues in the ESE community and, in time, to the availability and use of higher-quality datasets.
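Two of the problem classes the abstract names, incompleteness and outliers, can be made concrete with a small screening sketch. This is illustrative only: the toy dataset, the field name, and the use of Tukey's IQR fences are assumptions of this sketch, not methods or data from the paper.

```python
# Illustrative screening for two quality problems named in the abstract:
# incompleteness (missing values) and outliers (Tukey's IQR fences).
# The toy dataset and the "effort" field are hypothetical, not from the paper.
import statistics

def missing_rate(rows, field):
    """Fraction of records in which `field` is absent or None."""
    missing = sum(1 for r in rows if r.get(field) is None)
    return missing / len(rows)

def iqr_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR], the usual Tukey fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

# Ten hypothetical projects; effort in person-hours, one value missing,
# one project implausibly extreme.
projects = [{"effort": e} for e in
            (100, 110, 120, 130, 140, 150, 160, 170, 5000)] + [{"effort": None}]

rate = missing_rate(projects, "effort")  # one missing record out of ten
efforts = [p["effort"] for p in projects if p["effort"] is not None]
extremes = iqr_outliers(efforts)         # flags the 5000-hour project
```

Checks like these catch only the mechanically detectable issues; the taxonomy-based assessment in the paper also covers issues (e.g., provenance, timeliness) that no script can verify from the data alone.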
Supplemental Material
Supplemental movie, appendix, image, and software files for "Experience: Quality Benchmarking of Datasets Used in Software Effort Estimation" are available for download.
References
- Pekka Abrahamsson, Ilenia Fronza, Raimund Moser, Jelena Vlasenko, and Witold Pedrycz. 2011. Predicting development effort from user stories. In Proceedings of the 2011 International Symposium on Empirical Software Engineering and Measurement. 400--403.
- Allan J. Albrecht and John E. Gaffney. 1983. Software function, source lines of code, and development effort prediction: A software science validation. IEEE Trans. Softw. Eng. 9, 6 (1983), 639--648.
- Allan J. Albrecht. 1979. Measuring application development productivity. In Proceedings of the Joint SHARE/GUIDE/IBM Application Development Symposium. 83--92.
- Sousuke Amasaki. 2012. Replicated analyses of windowing approach with single company datasets. In Proceedings of the 12th International Conference on Product Focused Software Development and Process Improvement. ACM, 14--17.
- Sousuke Amasaki, Yohei Takahara, and Tomoyuki Yokogawa. 2011. Performance evaluation of windowing approach on effort estimation by analogy. In Proceedings of the 2011 Joint Conference of the 21st International Workshop on Software Measurement and the 6th International Conference on Software Process and Product Measurement. 188--195.
- Mohammad Azzeh, Daniel Neagu, and Peter I. Cowling. 2010. Fuzzy grey relational analysis for software effort estimation. Emp. Softw. Eng. 15, 1 (2010), 60--90.
- Ali Sajedi Badashian, Afsaneh Esteki, Ameneh Gholipour, Abram Hindle, and Eleni Stroulia. 2014. Involvement, contribution and influence in GitHub and Stack Overflow. In Proceedings of the 24th Annual International Conference on Computer Science and Software Engineering. 19--33.
- Rajiv D. Banker, Hsihui Chang, and Chris F. Kemerer. 1994. Evidence on economies of scale in software development. Inf. Softw. Technol. 36, 5 (1994), 275--282.
- K. Bennett, E. Burd, C. Kemerer, M. M. Lehman, M. Lee, R. Madachy, C. Mair, D. Sjoberg, and S. Slaughter. 1999. Empirical studies of evolving systems. Emp. Softw. Eng. 4, 4 (1999), 370--380.
- Nicolas Bettenburg, Sascha Just, and Adrian Schröter. 2008. What makes a good bug report? In Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering. ACM, 308--318.
- Barry W. Boehm. 1981. Software Engineering Economics. Prentice-Hall, Englewood Cliffs, NJ.
- Michael F. Bosu and Stephen G. MacDonell. 2013a. A taxonomy of data quality challenges in empirical software engineering. In Proceedings of the 22nd Australian Conference on Software Engineering. 97--106.
- Michael F. Bosu and Stephen G. MacDonell. 2013b. Data quality in empirical software engineering: A targeted review. In Proceedings of the 17th International Conference on Evaluation and Assessment in Software Engineering. ACM, 171--176.
- Luigi Buglione and Cigdem Gencel. 2008. Impact of base functional component types on software functional size based effort estimation. In Proceedings of the 9th International Conference on Product-Focused Software Development and Process Improvement (PROFES 2008). Springer, Berlin, 75--89.
- Andrea Capiluppi and Daniel Izquierdo-Cortázar. 2013. Effort estimation of FLOSS projects: A study of the Linux kernel. Emp. Softw. Eng. 18, 1 (2013), 60--88.
- Cagatay Catal and Banu Diri. 2009. Investigating the effect of dataset size, metrics sets, and feature selection techniques on software fault prediction problem. Inf. Sci. 179, 8 (2009), 1040--1058.
- Laila Cheikhi and Alain Abran. 2013. PROMISE and ISBSG software engineering data repositories: A survey. In Proceedings of the 2013 Joint Conference of the 23rd International Workshop on Software Measurement (IWSM’13) and the 8th International Conference on Software Process and Product Measurement (Mensura’13). 17--24.
- Jr-Shian Chen and Ching-Hsue Cheng. 2006. Software diagnosis using fuzzified attribute base on modified MEPA. In Proceedings of the International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems. Springer, Berlin, Heidelberg, 1270--1279.
- Sun-Jen Huang and Nan-Hsing Chiu. 2009. Applying fuzzy neural network to estimate software development effort. Appl. Intell. 30, 2 (2009), 73--83.
- Tony Clear and Stephen G. MacDonell. 2011. Understanding technology use in global virtual teams: Research methodologies and methods. Inf. Softw. Technol. 53, 9 (2011), 994--1011.
- Juan J. Cuadrado-Gallego, Luigi Buglione, María J. Domínguez-Alda, Marian Fernández De Sevilla, J. Antonio Gutierrez De Mesa, and Onur Demirors. 2010. An experimental study on the conversion between IFPUG and COSMIC functional size measurement units. Inf. Softw. Technol. 52, 3 (2010), 347--357.
- Michael K. Daskalantonakis. 1992. A practical view of software measurement and implementation experiences within Motorola. IEEE Trans. Softw. Eng. 18, 11 (1992), 998--1010.
- Kefu Deng and Stephen G. MacDonell. 2008. Maximising data retention from the ISBSG repository. In Proceedings of the 12th International Conference on Evaluation and Assessment in Software Engineering. Italy.
- Jean-Marc Desharnais. 1988. Statistical Analysis on the Productivity of Data Processing with Development Projects Using the Function Point Technique. Master's thesis. Université du Québec à Montréal, Canada.
- Norman Fenton, Martin Neil, William Marsh, Peter Hearty, Lukasz Radliński, and Paul Krause. 2008. On the effectiveness of early life cycle defect prediction with Bayesian nets. Emp. Softw. Eng. 13, 5 (2008), 499--537.
- Andreas Folleco, Taghi M. Khoshgoftaar, Jason Van Hulse, and Lofton Bullard. 2008. Software quality modeling: The impact of class noise on the random forest classifier. In Proceedings of the 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence). 3853--3859.
- Pekka Forselius. 2008. Quality of benchmarking data. Retrieved January 29, 2014 from www.4sumpartners.com.
- Cigdem Gencel, Luigi Buglione, and Alain Abran. 2009. Improvement opportunities and suggestions for benchmarking. In Software Process and Product Measurement. Springer, Berlin, 144--156.
- R. L. Glass, I. Vessey, and V. Ramesh. 2002. Research in software engineering: An analysis of the literature. Inf. Softw. Technol. 44, 8 (2002), 491--506.
- María Paula González, Jesús Lorés, and Antoni Granollers. 2008. Enhancing usability testing through datamining techniques: A novel approach to detecting usability problem patterns for a context of use. Inf. Softw. Technol. 50, 6 (2008), 547--568.
- D. Gray, D. Bowes, N. Davey, Y. Sun, and B. Christianson. 2012. Reflections on the NASA MDP data sets. IET Softw. 6, 6 (2012), 549--558.
- Tracy Hall. 2007. Longitudinal studies in evidence-based software engineering. In Empirical Software Engineering Issues: Critical Assessment and Future Directions. Springer, Berlin, 41--41.
- Tracy Hall and Norman Fenton. 1997. Implementing effective software metrics programs. IEEE Softw. 14, 2 (1997), 55--65.
- Zhimin He, Fayola Peters, Tim Menzies, and Ye Yang. 2013. Learning from open-source projects: An empirical study on defect prediction. In Proceedings of the 2013 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. IEEE, 45--54.
- Chao-Jung Hsu and Chin-Yu Huang. 2007. Improving effort estimation accuracy by weighted grey relational analysis during software development. In Proceedings of the 14th Asia-Pacific Software Engineering Conference. 534--541.
- Sun-Jen Huang, Nan-Hsing Chiu, and Li-Wei Chen. 2008. Integration of the grey relational analysis with genetic algorithm for software effort estimation. Eur. J. Operat. Res. 188, 3 (2008), 898--909.
- Jason Van Hulse, Taghi M. Khoshgoftaar, Chris Seiffert, and Lili Zhao. 2006. Noise correction using Bayesian multiple imputation. In Proceedings of the 2006 IEEE International Conference on Information Reuse and Integration. 478--483.
- Jason Van Hulse and Taghi M. Khoshgoftaar. 2008. A comprehensive empirical evaluation of missing value imputation in noisy software measurement data. J. Syst. Softw. 81, 5 (2008), 691--708.
- Jason Van Hulse and Taghi M. Khoshgoftaar. 2014. Incomplete-case nearest neighbor imputation in software measurement data. Inf. Sci. 259 (2014), 596--610.
- Ayelet Israeli and Dror G. Feitelson. 2010. The Linux kernel as a case study in software evolution. J. Syst. Softw. 83, 3 (2010), 485--501.
- Philip M. Johnson and Anne M. Disney. 1999. A critical analysis of PSP data quality: Results from a case study. Emp. Softw. Eng. 4, 1 (1999), 317--349.
- Chris F. Kemerer. 1987. An empirical validation of software cost estimation models. Commun. ACM 30, 5 (1987), 416--429.
- Jacky Keung and Barbara Kitchenham. 2008. Experiments with Analogy-X for software cost estimation. In Proceedings of the 19th Australian Conference on Software Engineering. 229--238.
- Taghi M. Khoshgoftaar, Andres Folleco, Jason Van Hulse, and Lofton Bullard. 2006. Software quality imputation in the presence of noisy data. In Proceedings of the IEEE International Conference on Information Reuse and Integration. 484--489.
- T. M. Khoshgoftaar and P. Rebours. 2004. Generating multiple noise elimination filters with the ensemble-partitioning filter. In Proceedings of the IEEE International Conference on Information Reuse and Integration. 369--375.
- Taghi M. Khoshgoftaar and Jason Van Hulse. 2005. Identifying noise in an attribute of interest. In Proceedings of the 4th International Conference on Machine Learning and Applications.
- Barbara Kitchenham, Tore Dybå, and Magne Jorgensen. 2004. Evidence-based software engineering. In Proceedings of the 26th International Conference on Software Engineering (ICSE’04). 273--281.
- Barbara Kitchenham and Kari Kansala. 1993. Inter-item correlations among function points. In Proceedings of the 15th International Conference on Software Engineering. 229--238.
- Barbara Kitchenham, Shari L. Pfleeger, Beth McColl, and Suzanne Eagan. 2002. An empirical study of maintenance and development estimation accuracy. J. Syst. Softw. 64, 1 (2002), 57--77.
- Ekrem Kocaguneli, Gregory Gay, Tim Menzies, Ye Yang, and Jacky W. Keung. 2010. When to use data from other projects for effort estimation. In Proceedings of the IEEE/ACM International Conference on Automated Software Engineering. ACM, 321--324.
- Ekrem Kocaguneli and Tim Menzies. 2011. How to find relevant data for effort estimation? In Proceedings of the 5th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM’11). 255--264.
- Ekrem Kocaguneli, Tim Menzies, and Jacky W. Keung. 2013. Kernel methods for software effort estimation. Emp. Softw. Eng. 18, 1 (2013), 1--24.
- Ekrem Kocaguneli, Tim Menzies, and Jacky W. Keung. 2012. On the value of ensemble effort estimation. IEEE Trans. Softw. Eng. 38, 6 (2012), 1403--1416.
- Ekrem Kocaguneli, Tim Menzies, and Emilia Mendes. 2015. Transfer learning in effort estimation. Emp. Softw. Eng. 20, 3 (2015), 813--843.
- Luigi Lavazza and Sandro Morasca. 2012. Software effort estimation with a generalized robust linear regression technique. In Proceedings of the 16th International Conference on Evaluation & Assessment in Software Engineering (EASE’12). 206--215.
- Taeho Lee, Taewan Gu, and Jongmoon Baik. 2014. MND-SCEMP: An empirical study of a software cost estimation modeling process in the defense domain. Emp. Softw. Eng. 19, 1 (2014), 213--240.
- Sukumar Letchmunan, Marc Roper, and Murray Wood. 2010. Investigating effort prediction of web-based applications using CBR on the ISBSG dataset. In Proceedings of the 14th International Conference on Evaluation and Assessment in Software Engineering. 1--10.
- Y. F. Li, M. Xie, and T. N. Goh. 2009. A study of project selection and feature weighting for analogy based software cost estimation. J. Syst. Softw. 82, 2 (2009), 241--252.
- Gernot A. Liebchen and Martin J. Shepperd. 2005. Software productivity analysis of a large data set and issues of confidentiality and data quality. In Proceedings of the 11th IEEE International Software Metrics Symposium (METRICS’05).
- Gernot A. Liebchen and Martin J. Shepperd. 2008. Data sets and data quality in software engineering. In Proceedings of the 4th International Workshop on Predictor Models in Software Engineering. ACM Press, New York, NY.
- Gernot A. Liebchen, Bheki Twala, Martin J. Shepperd, and Michelle Cartwright. 2006. Assessing the quality and cleaning of a software project dataset: An experience report. In Proceedings of the 10th International Conference on Evaluation and Assessment in Software Engineering. 1--7.
- Gernot Liebchen, Bheki Twala, Martin J. Shepperd, Michelle Cartwright, and Mark Stephens. 2007. Filtering, robust filtering, polishing: Techniques for addressing quality in software data. In Proceedings of the 1st International Symposium on Empirical Software Engineering and Measurement (ESEM’07). 99--106.
- Chris Lokan and Emilia Mendes. 2006. Cross-company and single-company effort models using the ISBSG database: A further replicated study. In Proceedings of the International Symposium on Empirical Software Engineering.
- Chris Lokan and Emilia Mendes. 2009. Applying moving windows to software effort estimation. In Proceedings of the 3rd International Symposium on Empirical Software Engineering and Measurement. 111--122.
- Stephen G. MacDonell and Martin J. Shepperd. 2007. Comparing local and global software effort estimation models -- reflections on a systematic review. In Proceedings of the 1st International Symposium on Empirical Software Engineering and Measurement (ESEM’07). 401--409.
- Carolyn Mair, Martin J. Shepperd, and Magne Jørgensen. 2005. An analysis of data sets used to train and validate cost prediction systems. In Proceedings of the 2005 Workshop on Predictor Models in Software Engineering (PROMISE’05). 1--6.
- Katrina Maxwell. 2002. Applied Statistics for Software Managers. Prentice-Hall, Englewood Cliffs, NJ.
- Katrina D. Maxwell and Pekka Forselius. 2000. Benchmarking software-development productivity. IEEE Softw. 17, 1 (2000), 80--88.
- D. Mazinanian, M. Doroodchi, and M. Hassany. 2012. WDMES: A comprehensive measurement system for web application development. In Proceedings of the Euro American Conference on Telematics and Information Systems (EATIS’12). 135--142.
- Emilia Mendes, Sergio Di Martino, Filomena Ferrucci, and Carmine Gravino. 2008. Cross-company vs. single-company web effort models using the Tukutuku database: An extended study. J. Syst. Softw. 81, 5 (2008), 673--690.
- Emilia Mendes and Chris Lokan. 2008. Replicating studies on cross- vs single-company effort models using the ISBSG database. Emp. Softw. Eng. 13, 1 (2008), 3--37.
- Emilia Mendes, Sergio Di Martino, Filomena Ferrucci, and Carmine Gravino. 2007. Effort estimation: How valuable is it for a web company to use a cross-company data set, compared to using its own single-company data set? In Proceedings of the 16th International Conference on World Wide Web (WWW’07). 963--972.
- Tim Menzies, Andrew Butcher, Andrian Marcus, Thomas Zimmermann, and David Cok. 2011. Local vs. global models for effort estimation and defect prediction. In Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE’11). 343--351.
- Leandro L. Minku and Xin Yao. 2013. Ensembles and locality: Insight on improving software effort estimation. Inf. Softw. Technol. 55, 8 (2013), 1512--1528.
- Y. Miyazaki, M. Terakado, K. Ozaki, and H. Nozaki. 1994. Robust regression for developing software estimation models. J. Syst. Softw. 27, 1 (1994), 3--16.
- Sandro Morasca. 2009. Building statistically significant robust regression models in empirical software engineering. In Proceedings of the 5th International Conference on Predictor Models in Software Engineering (PROMISE’09).
- Raimund Moser, Witold Pedrycz, and Giancarlo Succi. 2008. A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction. In Proceedings of the International Conference on Software Engineering. 181--190.
- Vu Nguyen, Bert Steece, and Barry W. Boehm. 2008. A constrained regression technique for COCOMO calibration. In Proceedings of the 2nd ACM-IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM’08). 213--222.
- Fayola Peters, Tim Menzies, Liang Gong, and Hongyu Zhang. 2013. Balancing privacy and utility in cross-company defect prediction. IEEE Trans. Softw. Eng. 39, 8 (2013), 1054--1068.
- M. E. Prabhakar and Maitreyee Dutta. 2013. Prediction of software effort using artificial neural network and support vector machine. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 3, 3 (2013), 40--46.
- Rahul Premraj, Martin J. Shepperd, Barbara Kitchenham, and Pekka Forselius. 2005. An empirical analysis of software productivity over time. In Proceedings of the 11th IEEE International Software Metrics Symposium (METRICS’05). 37--46.
- Tomi Prifti, Sean Banerjee, and Bojan Cukic. 2011. Detecting bug duplicate reports through local references. In Proceedings of the 7th International Conference on Predictive Models in Software Engineering (PROMISE’11).
- Fumin Qi, Xiao-Yuan Jing, Xiaoke Zhu, Xiaoyuan Xie, Baowen Xu, and Shi Ying. 2017. Software effort estimation based on open source projects: Case study of GitHub. Inf. Softw. Technol. 92 (2017), 145--157.
- Ch. Satyananda Reddy and Kvsvn Raju. 2009. An improved fuzzy approach for COCOMO's effort estimation using Gaussian membership function. J. Softw. 4, 5 (2009), 452--459.
- Gregorio Robles. 2010. Replicating MSR: A study of the potential replicability of papers published in the mining software repositories proceedings. In Proceedings of the 7th IEEE Working Conference on Mining Software Repositories (MSR’10). 171--180.
- Daniel Rodriguez, Israel Herraiz, and Rachel Harrison. 2012. On software engineering repositories and their open problems. In Proceedings of the 1st International Workshop on Realizing AI Synergies in Software Engineering (RAISE’12). 52--56.
- Marshima M. Rosli, Ewan Tempero, and Andrew Luxton-Reilly. 2013. Can we trust our results? A mapping study on data quality. In Proceedings of the 20th Asia-Pacific Software Engineering Conference (APSEC’13). 116--123.
- Joost Schalken and Hans van Vliet. 2008. Measuring where it matters: Determining starting points for metrics collection. J. Syst. Softw. 81, 5 (2008), 603--615.
- Yeong-Seok Seo, Kyung-A Yoon, and Doo-Hwan Bae. 2008. An empirical analysis of software effort estimation with outlier elimination. In Proceedings of the 4th International Workshop on Predictor Models in Software Engineering (PROMISE’08). 25.
- Martin J. Shepperd, David Bowes, and Tracy Hall. 2014. Researcher bias: The use of machine learning in software defect prediction. IEEE Trans. Softw. Eng. 40, 6 (2014), 603--616.
- Martin J. Shepperd and Chris Schofield. 1997. Estimating software project effort using analogies. IEEE Trans. Softw. Eng. 23, 12 (1997), 736--743.
- Martin J. Shepperd, Qinbao Song, Zhongbin Sun, and Carolyn Mair. 2013. Data quality: Some comments on the NASA software defect datasets. IEEE Trans. Softw. Eng. 39, 9 (2013), 1208--1215.
- Yonghee Shin, Andrew Meneely, Laurie Williams, and Jason A. Osborne. 2011. Evaluating complexity, code churn, and developer activity metrics as indicators of software vulnerabilities. IEEE Trans. Softw. Eng. 37, 6 (2011), 772--787.
- Forrest J. Shull, Jeffrey C. Carver, Sira Vegas, and Natalia Juristo. 2008. The role of replications in empirical software engineering. Emp. Softw. Eng. 13, 2 (2008), 211--218.
- Thomas Tan, Guan Cun, Mei He, and Barry Boehm. 2009. Productivity trends in incremental and iterative software development. In Proceedings of the 3rd International Symposium on Empirical Software Engineering and Measurement. 1--10.
- Wang-Chiew Tan. 2007. Provenance in databases: Past, current, and future. IEEE Data Eng. Bull. 30, 4 (2007), 3--12.
- Wei Tang and Taghi M. Khoshgoftaar. 2004. Noise identification with the k-means algorithm. In Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence. 373--378.
- Choh Man Teng. 2000. Evaluating noise correction. In PRICAI 2000: Topics in Artificial Intelligence. Springer, 188--198.
- Ayse Tosun, Burak Turhan, and Ayse B. Bener. 2009. Feature weighting heuristics for analogy-based effort estimation models. Exp. Syst. Appl. 36, 7 (2009), 10325--10333.
- Burak Turhan, Tim Menzies, Ayşe B. Bener, and Justin Di Stefano. 2009. On the relative value of cross-company and within-company data for defect prediction. Emp. Softw. Eng. 14, 5 (2009), 540--578.
- María C. Valverde, Diego Vallespir, Adriana Marotta, and Jose Ignacio Panach. 2014. Applying a data quality model to experiments in software engineering. In Advances in Conceptual Modeling, Lecture Notes in Computer Science, Vol. 8823. 168--177.
- Isabella Wieczorek. 2002. Improved software cost estimation -- A robust and interpretable modelling method and a comprehensive empirical investigation. Emp. Softw. Eng. 7, 2 (2002), 177--180.
- Kyung-A Yoon and Doo-Hwan Bae. 2010. A pattern-based outlier detection method identifying abnormal attributes in software project data. Inf. Softw. Technol. 52, 2 (2010), 137--151.
- Wen Zhang, Ye Yang, and Qing Wang. 2011. Handling missing data in software effort prediction with naive Bayes and EM algorithm. In Proceedings of the 7th International Conference on Predictive Models in Software Engineering (PROMISE’11).
- Zhihao Chen, Barry Boehm, Tim Menzies, and Daniel Port. 2005. Finding the right data for software cost modeling. IEEE Softw. 22, 6 (2005), 38--46.