DOI: 10.1145/1463788.1463819
research-article

Is it a bug or an enhancement?: a text-based approach to classify change requests

Published: 27 October 2008

ABSTRACT

Bug tracking systems are valuable assets for managing maintenance activities. They are widely used in open-source projects as well as in the software industry. They collect many different kinds of issues: requests for defect fixing, enhancements, refactoring/restructuring activities, and organizational issues. These different kinds of issues are simply labeled as "bug", either for lack of better classification support or for lack of knowledge about the possible kinds.

This paper investigates whether the text of the issues posted in bug tracking systems is enough to classify them into corrective maintenance and other kinds of activities.

We show that alternating decision trees, naive Bayes classifiers, and logistic regression can be used to accurately distinguish bugs from other kinds of issues. Results from empirical studies performed on issues for Mozilla, Eclipse, and JBoss indicate that between 77% and 82% of issues are classified correctly.
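As a rough illustration of what text-based issue classification can look like, the sketch below trains a small multinomial naive Bayes classifier on issue text. It is not the authors' implementation: the tokenizer, the add-one smoothing, the class labels `bug` and `other`, and the toy issue texts are all assumptions made for this example, and the abstract does not describe the paper's actual features or preprocessing.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # number of issues per class
        self.word_counts = defaultdict(Counter)   # per-class word frequencies
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior of the class plus log likelihood of each known word.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # words never seen in training are ignored
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy issues, invented for illustration (not data from the paper).
TRAIN = [
    ("Application crashes with null pointer exception on startup", "bug"),
    ("Error dialog appears and editor crashes when saving file", "bug"),
    ("Wrong result returned after parsing, stack trace attached", "bug"),
    ("Please add support for dark theme in the editor", "other"),
    ("Feature request: add option to export reports as PDF", "other"),
    ("Refactor the parser module to improve readability", "other"),
]

texts, labels = zip(*TRAIN)
classifier = NaiveBayesTextClassifier().fit(texts, labels)
prediction = classifier.predict("Editor crashes with exception when opening file")
```

With this toy training set, `prediction` comes out as `bug`, because words such as "crashes" and "exception" occur only in the issues labeled `bug`. A realistic study would need far more labeled issues and a held-out evaluation to estimate accuracy.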

Published in

CASCON '08: Proceedings of the 2008 conference of the center for advanced studies on collaborative research: meeting of minds
October 2008
357 pages
ISBN: 9781450378826
DOI: 10.1145/1463788

Copyright © 2008 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 24 of 90 submissions, 27%
