research-article

Reducing the effort of bug report triage: Recommenders for development-oriented decisions

Published: 26 August 2011

Abstract

A key collaborative hub for many software development projects is the bug report repository. Although its use can improve the software development process in a number of ways, reports added to the repository need to be triaged. A triager determines if a report is meaningful. Meaningful reports are then organized for integration into the project's development process.

To assist triagers with their work, this article presents a machine learning approach to create recommenders that assist with a variety of decisions aimed at streamlining the development process. The recommenders created with this approach are accurate; for instance, recommenders for which developer to assign a report that we have created using this approach have a precision between 70% and 98% over five open source projects. As the configuration of a recommender for a particular project can require substantial effort and be time consuming, we also present an approach to assist the configuration of such recommenders that significantly lowers the cost of putting a recommender in place for a project. We show that recommenders for which developer should fix a bug can be quickly configured with this approach and that the configured recommenders are within 15% precision of hand-tuned developer recommenders.
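
The precision figures quoted above can be read as the fraction of recommended developers who were in fact appropriate for a report. The following is a minimal sketch of that calculation, not the authors' evaluation code; the function names, the developer names, and the sample data are all hypothetical and chosen only for illustration.

```python
def precision(recommended, appropriate):
    """Fraction of recommended developers that are appropriate for one report."""
    if not recommended:
        return 0.0
    hits = sum(1 for dev in recommended if dev in appropriate)
    return hits / len(recommended)


def mean_precision(cases):
    """Average per-report precision over (recommended, appropriate) pairs."""
    scores = [precision(rec, ok) for rec, ok in cases]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical data: one recommendation per report, with the set of
# developers considered appropriate for that report.
cases = [
    (["alice"], {"alice", "bob"}),
    (["carol"], {"dave"}),
    (["bob"], {"bob"}),
    (["erin"], {"erin", "alice"}),
]

print(mean_precision(cases))
```

With one recommendation per report, per-report precision is either 0 or 1, so the mean is simply the share of reports for which the recommended developer was appropriate.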

            Reviews

            Gerald D Everett

            Anvik and Murphy focus on bug reports, which are traditionally underutilized by developers, as a tool to increase quality and decrease risks in software development. The authors' bug report analysis tool, a recommender, shows definite promise in reducing bug report categorization and pre-grouping costs while achieving predictive results. However, additional work is needed to make the three presented recommenders valuable enough for practical use. It is an international testing best practice to analyze bug reports for corrective development action. The authors demonstrate their recommender tool to be a more cost-effective bug report analysis approach than manual review. In order to establish credibility for the paper's recommender value premise and proof of cost reduction for a nontester professional audience, the authors must provide additional information. The paper's value premise about cost reduction is stated numerous times; however, no benefits are presented to offset the recommender costs, the real value for practical use. In addition, I could not find a list of the Bugzilla bug report fields on which the authors based their recommender case study analysis.

            Online Computing Reviews Service

            • Published in

              ACM Transactions on Software Engineering and Methodology  Volume 20, Issue 3
              August 2011
              176 pages
              ISSN:1049-331X
              EISSN:1557-7392
              DOI:10.1145/2000791

              Copyright © 2011 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 26 August 2011
              • Accepted: 1 June 2009
              • Revised: 1 March 2009
              • Received: 1 May 2008

              Qualifiers

              • research-article
              • Research
              • Refereed
