ABSTRACT
The change history of a software project contains a rich collection of code changes that record previous development experience. Changes that fix bugs are especially interesting, since they record both the old buggy code and the new fixed code. This paper presents a bug-finding algorithm using bug fix memories: a project-specific bug and fix knowledge base developed by analyzing the history of bug fixes. A bug-finding tool, BugMem, implements the algorithm. The approach differs from bug-finding tools based on theorem proving or static model checking, such as Bandera, ESC/Java, FindBugs, JLint, and PMD. Because these tools use predefined common bug patterns, they do not aim to identify project-specific bugs. Bug fix memories are built by a learning process, so their bug patterns are project-specific and project-specific bugs can be detected. The algorithm and tool are assessed by evaluating whether real bugs and fixes in project histories can be found in the bug fix memories. Analysis of five open source projects shows that, for these projects, 19.3%-40.3% of bugs appear repeatedly in the memories, and 7.9%-15.5% of bug and fix pairs are found in the memories. The results demonstrate that project-specific bug fix patterns occur frequently enough to be useful as a bug detection technique. Furthermore, for the bug and fix pairs, it is possible both to detect the bug and to provide a strong suggestion for the fix. However, the false positive rate is also high: 20.8%-32.5% of changes that do not contain bugs also have patterns found in the memories. A comparison of BugMem with another bug-finding tool, PMD, shows that the bug sets identified by the two tools are mostly exclusive, indicating that BugMem complements other bug-finding tools.
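To make the idea of a bug fix memory concrete, the following is a minimal sketch, not BugMem's actual implementation. It assumes that a memory can be modeled as a lookup table from a normalized buggy line to the fixes previously applied to it. The class name BugFixMemory, the methods learn and check, and the whitespace-only normalization are all hypothetical choices made for illustration; the abstract does not specify how code is normalized or matched.

```python
import re
from collections import defaultdict


def normalize(line: str) -> str:
    """Collapse whitespace so trivially different spellings of a line compare equal.

    Assumption: real normalization is likely richer than this.
    """
    return re.sub(r"\s+", " ", line.strip())


class BugFixMemory:
    """Hypothetical project-specific store of (buggy line -> known fixes)."""

    def __init__(self) -> None:
        # Maps each normalized buggy line to the set of normalized fixes seen for it.
        self._fixes = defaultdict(set)

    def learn(self, buggy_line: str, fixed_line: str) -> None:
        """Record one bug fix taken from the project's change history."""
        self._fixes[normalize(buggy_line)].add(normalize(fixed_line))

    def check(self, line: str) -> list:
        """Return previously seen fixes for this line, or an empty list if unknown."""
        return sorted(self._fixes.get(normalize(line), ()))


# Learn from one past fix, then check a new line that repeats the same mistake.
memory = BugFixMemory()
memory.learn("if (foo = null)  {", "if (foo == null) {")
suggestions = memory.check("if (foo = null) {")
```

A non-empty result from check is only a hint that the line resembles a past bug. As the abstract notes, changes that do not contain bugs also match the memories fairly often, so a match would still need human review.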