DOI: 10.1145/1117696.1117704
Article

Coping with an open bug repository

Published: 16 October 2005

ABSTRACT

Most open source software development projects include an open bug repository (one to which users of the software can gain full access) that is used to report and track problems with, and potential enhancements to, the software system. There are several potential advantages to the use of an open bug repository: more problems with the system might be identified because of the relative ease of reporting bugs, more problems might be fixed because more developers might engage in problem solving, and developers and users can engage in focused conversations about the bugs, allowing users input into the direction of the system. However, there are also some potential disadvantages, such as the possibility that developers must process irrelevant bugs that reduce their productivity. Despite the rise in use of open bug repositories, there is little data about what is stored inside these repositories and how they are used. In this paper, we provide an initial characterization of two open bug repositories from the Eclipse and Firefox projects, describe the duplicate bug and bug triage problems that arise with these open bug repositories, and discuss how we are applying machine learning technology to help automate these processes.


    Reviews

    Phillip A. Laplante

    With respect to the need for flexibility in humans, science fiction writer Robert Heinlein wrote, "specialization is for insects" [1]. Upon reading this paper, I think you will conclude that specialization is also for bug reporting. This conclusion follows from the study reported in this work, in which the data from the bug repositories for two open source projects are characterized, and an automated triage tool proposed. Indeed, this paper is recommended reading for those involved in open source software. The authors selected two well-known projects, Eclipse V3.0 and Firefox V1.0, for their study. This choice was appropriate, as there are a large number of bugs reported daily in these repositories (around 1,000 per year), and a variety of developers available to resolve the bugs. These large projects surely helped to provide amplification of the kinds of problems found in most other bug repositories. Among the problems the authors identified in managing the bug repository were: the large number of bugs reported daily, and the varied quality of the reports, leading to a great deal of wasted time and effort; a great deal of duplication in the bug reports; and difficulty in assigning the right person to resolve the bug report. Together, these problems face the bug triager (a person who must assign the bug report to an appropriate developer for investigation and resolution). These challenges are compounded by the fact that most bugs are resolved as, cryptically, "INVALID," "DUPLICATE," "WORKSFORME," or "WONTFIX," so that when a "new" bug is reported, the triager must possess a great deal of working knowledge to determine if a new bug has indeed been found. The authors found that, even after a bug had been resolved, determining the developer who resolved a report was nontrivial. While knowing who resolved a particular bug is clearly important for a human triager, this information is particularly vital to automating bug triage. 
    In fact, the authors propose that bug triaging difficulties can be resolved via an artificial intelligence approach. They note, for example, that using spam filtering-like rules can help triage automatically. But, more powerfully, the authors are investigating a machine learning approach using a support vector machine (SVM), which involves separating a multi-dimensional feature space by discriminating hyperplanes. The authors' bug reporting system apparently uses a statistical database of knowledge from past bug reports and the SVM to help cut down on duplication, through better classification of bug types. The system also uses domain analysis techniques in order to assign bugs to an appropriate person for resolution. This system can be incorporated into an existing integrated development environment (IDE) such as Eclipse. Unfortunately, the paper provides little other information on this automated system, and, in fact, it is unclear if the system has even been built. This brings me back to my original statement about specialization: what is not explicitly stated in this work, but what is clearly implied, is that a better, more fine-grained bug classification system is needed to reduce bug report duplication, and assist the machine learning algorithms in assignment. Therefore, I hope the authors eventually develop and report a fully working system, replete with a rich entomological taxonomy.

    Online Computing Reviews Service
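
    The review notes that the paper gives few details of the automated system, so the following is only a minimal sketch of how a new report might be compared against past reports to flag likely duplicates. It is not the authors' method: it swaps the SVM for plain term-count cosine similarity, and the function names, the 0.5 threshold, and the sample reports are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def likely_duplicates(new_report, past_reports, threshold=0.5):
    """Return (report_id, score) pairs at or above the threshold, best first."""
    new_vec = term_vector(new_report)
    scored = [
        (report_id, cosine_similarity(new_vec, term_vector(text)))
        for report_id, text in past_reports.items()
    ]
    hits = [(rid, score) for rid, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical past reports, invented for illustration only.
past = {
    101: "Editor crashes when opening large XML file",
    102: "Toolbar icons missing after update",
}
print(likely_duplicates("Crash when opening a large XML file in editor", past))
```

    A triager could review the flagged candidates before marking a report as DUPLICATE. The threshold trades missed duplicates against false alarms and would need tuning on real repository data.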

    • Published in

      ACM Other conferences
      eclipse '05: Proceedings of the 2005 OOPSLA workshop on Eclipse technology eXchange
      October 2005
      141 pages
      ISBN:1595933425
      DOI:10.1145/1117696

      Copyright © 2005 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate: 38 of 79 submissions, 48%
