What is the connection between issues, bugs, and enhancements?: lessons learned from 800+ software projects

Authors:
Rahul Krishna, North Carolina State University
Amritanshu Agrawal, North Carolina State University
Akond Rahman, North Carolina State University
Alexander Sobran, IBM Corp
Tim Menzies, North Carolina State University

ICSE-SEIP '18: Proceedings of the 40th International Conference on Software Engineering: Software Engineering in Practice, May 2018, Pages 306–315. https://doi.org/10.1145/3183519.3183548

Published: 27 May 2018

ABSTRACT

Agile teams juggle multiple tasks, so professionals are often assigned to multiple projects, especially in service organizations that monitor and maintain large suites of software for a large user base. If we could predict changes in project conditions, then managers could better adjust the staff allocated to those projects.

This paper builds such a predictor using data from 832 open source and proprietary projects. Using a time series analysis of the last 4 months of issues, we can forecast how many bug reports and enhancement requests will be generated in the next month.

The forecasts made in this way only require a frequency count of these issue reports (and do not require a historical record of bugs found in the project). That is, this kind of predictive model is very easy to deploy within a project. We hence strongly recommend this method for forecasting future issues, enhancements, and bugs in a project.
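The idea of forecasting next month's count from a 4-month window of frequency counts can be illustrated with a minimal sketch. The abstract does not name the time series model, so this example swaps in a simple least-squares linear trend as a stand-in; it is not the authors' predictor. The function name `forecast_next_month`, the `window` parameter, and the sample counts are all hypothetical.

```python
from statistics import mean


def forecast_next_month(monthly_counts, window=4):
    """Extrapolate a least-squares line fitted to the last `window` counts.

    Assumption: a linear trend stands in for the (unspecified) time series
    model; only monthly frequency counts are needed as input.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to fit a trend")
    if len(monthly_counts) < window:
        raise ValueError("need at least `window` months of history")

    recent = monthly_counts[-window:]      # e.g. the last 4 months
    xs = range(window)                     # month indices 0 .. window-1
    x_bar, y_bar = mean(xs), mean(recent)

    # Ordinary least-squares slope and intercept for y = intercept + slope * x.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, recent))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Predict the month right after the window; counts cannot be negative.
    return max(0.0, intercept + slope * window)


# Hypothetical monthly counts of bug reports for one project.
bug_counts = [10, 12, 14, 16]
print(forecast_next_month(bug_counts))
```

The same function could be applied separately to issue, bug, and enhancement counts, since each is just a list of monthly frequencies. A real deployment would likely replace the linear trend with a proper time series model and validate it against held-out months.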

Index Terms

1. Software and its engineering
  1. Software creation and management
    1. Software development process management
      1. Software development methods
        Agile software development

Published in
ICSE-SEIP '18: Proceedings of the 40th International Conference on Software Engineering: Software Engineering in Practice
May 2018
336 pages
ISBN: 9781450356596
DOI: 10.1145/3183519
Conference Chairs:
Frances Paulisch, Siemens Healthineers, Germany
Jan Bosch, Chalmers University of Technology
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
bugs
collaborations
issues
time series analysis
Qualifiers
- research-article