Is it a bug or an enhancement?: a text-based approach to classify change requests

Authors:
Giuliano Antoniol

SOCCER Lab. -- DGIGL, Québec, Canada

Kamel Ayari

SOCCER Lab. -- DGIGL, Québec, Canada

Massimiliano Di Penta

University of Sannio, Benevento, Italy

Foutse Khomh

Université de Montréal, Québec, Canada

Yann-Gaël Guéhéneuc

Université de Montréal, Québec, Canada


CASCON '08: Proceedings of the 2008 conference of the center for advanced studies on collaborative research: meeting of minds, October 2008, Article No.: 23, Pages 304–318. https://doi.org/10.1145/1463788.1463819

Published: 27 October 2008

ABSTRACT

Bug tracking systems are valuable assets for managing maintenance activities. They are widely used in open-source projects as well as in the software industry. They collect many different kinds of issues: requests for defect fixing, enhancements, refactoring/restructuring activities, and organizational issues. These different kinds of issues are often simply labeled as "bug", either because the tracker lacks better classification support or because reporters do not know the possible kinds.

This paper investigates whether the text of the issues posted in bug tracking systems is enough to classify them into corrective maintenance and other kinds of activities.

We show that alternating decision trees, naive Bayes classifiers, and logistic regression can be used to accurately distinguish bugs from other kinds of issues. Results from empirical studies performed on issues for Mozilla, Eclipse, and JBoss indicate that between 77% and 82% of the classification decisions are correct.
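To make the idea concrete, the following is a minimal sketch of text-based issue classification with a multinomial naive Bayes classifier, one of the three classifier families named above. It is not the authors' implementation: the tokenizer, the Laplace smoothing parameter `alpha`, the class name `NaiveBayesTextClassifier`, and the six training issues are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes over bag-of-words counts (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing (assumed value)
        self.word_counts = defaultdict(Counter)  # label -> token -> count
        self.doc_counts = Counter()              # label -> number of issues
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        # Tokens never seen in training carry no evidence and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = (sum(self.word_counts[label].values())
                     + self.alpha * len(self.vocab))
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy issues; real data would come from a bug tracker export.
TRAIN = [
    ("Crash with null pointer exception when opening file", "bug"),
    ("Application freezes and crashes on startup", "bug"),
    ("Wrong result returned, exception thrown on save", "bug"),
    ("Add support for dark theme in the editor", "other"),
    ("Please add an option to export reports", "other"),
    ("Refactor the parser module and improve documentation", "other"),
]

clf = NaiveBayesTextClassifier().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)
print(clf.predict("Editor crashes with exception"))
print(clf.predict("Add option to export theme"))
```

A toy set this small says nothing about real-world accuracy; the percentages reported in the abstract come from the authors' studies on Mozilla, Eclipse, and JBoss issues, not from a sketch like this one.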


Published in
CASCON '08: Proceedings of the 2008 conference of the center for advanced studies on collaborative research: meeting of minds
October 2008
357 pages
ISBN: 9781450378826
DOI: 10.1145/1463788
Conference Chairs: Joanna Ng (Head of the Centre for Advanced Studies), Christian Couturier (National Research Council Canada)
Editors: Marsha Chechik (University of Toronto), Mark Vigder (National Research Council Canada), Darlene Stewart (National Research Council Canada)
Program Chairs: Mark Vigder (National Research Council Canada), Marsha Chechik (University of Toronto)
Copyright © 2008 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
Acceptance Rates
Overall Acceptance Rate: 24 of 90 submissions, 27%