ABSTRACT
Agile teams juggle multiple tasks, so professionals are often assigned to multiple projects, especially in service organizations that monitor and maintain large suites of software for a large user base. If we could predict changes in project conditions, then managers could better adjust the staff allocated to those projects.
This paper builds such a predictor using data from 832 open source and proprietary projects. Using a time series analysis of the last 4 months of issues, we can forecast how many bug reports and enhancement requests will be generated in the next month.
The forecasts made in this way require only a frequency count of these issue reports (and do not require a historical record of bugs found in the project). That is, this kind of predictive model is very easy to deploy within a project. We therefore strongly recommend this method for forecasting future issues, enhancements, and bugs in a project.
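To make the idea of forecasting from frequency counts alone concrete, the following is a minimal sketch, not the model used in the paper. It assumes a simple stand-in technique: fit a least-squares straight line to the last 4 monthly counts and extrapolate one month ahead. The function name `forecast_next_month` and the sample numbers are hypothetical and chosen only for illustration.

```python
from typing import Sequence

WINDOW = 4  # months of history per forecast, matching the window in the abstract


def forecast_next_month(counts: Sequence[float], window: int = WINDOW) -> float:
    """Extrapolate a least-squares line fitted to the last `window` monthly counts.

    This is an illustrative stand-in (a linear trend), not the paper's model.
    The only input is a frequency count of issue reports per month.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to fit a line")
    if len(counts) < window:
        raise ValueError(f"need at least {window} monthly counts")

    recent = list(counts[-window:])
    xs = range(window)  # month indices 0 .. window-1
    x_mean = sum(xs) / window
    y_mean = sum(recent) / window

    # Closed-form ordinary least squares for a single predictor.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, recent))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # Predict at the next month index; a count cannot be negative, so clamp at 0.
    return max(0.0, intercept + slope * window)


# Hypothetical monthly bug-report counts for one project.
monthly_bug_reports = [31, 10, 12, 14, 16]
print(forecast_next_month(monthly_bug_reports))
```

Only the four most recent counts influence the result, so the same function can be applied separately to the monthly series of bug reports, enhancement requests, and all issues.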
Index Terms
- What is the connection between issues, bugs, and enhancements?: lessons learned from 800+ software projects