ABSTRACT
The detection of bugs in software systems has been divided into two research areas: static code analysis and statistical modeling of historical data. Static analysis pinpoints problems at specific line numbers but has the disadvantage of producing many warnings, which are often false positives. In contrast, statistical models use the history of the system to suggest which files or commits are likely to contain bugs. These coarse-grained predictions do not tell the developer the precise reasons for the bug prediction. We combine static analysis with statistical bug models to limit the number of warnings and to provide specific warning information at the line level. Previous research was able to process only a limited number of releases. Our tool, WarningsGuru, can analyze all commits in a source code repository, and we have currently processed thousands of commits and warnings. Since we process every commit, we can give developers more precise information about when a warning is introduced, which allows us to show recent warnings that are introduced in statistically risky commits. Results from two OSS projects show that CommitGuru's statistical model flags 25% and 29% of all commits as risky. When we combine this with static analysis in WarningsGuru, the number of risky commits with warnings is 20% for both projects, and the number of commits with new warnings is only 3% and 6%. We can thus drastically reduce the number of commits and warnings developers have to examine. The tool, source code, and demo are available at https://github.com/louisq/warningsguru.
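The combination described above can be sketched as follows. This is a minimal illustration, not WarningsGuru's actual implementation: the `Finding` and `Commit` types, the `risky` flag standing in for the statistical model's verdict, and the assumption of a linear commit history are all hypothetical. It shows the filtering idea only: a warning is surfaced when it first appears in a commit that the statistical model also considers risky.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Finding:
    """A static-analysis warning (hypothetical shape, for illustration)."""
    file: str
    rule: str     # e.g. a CWE identifier
    snippet: str  # text of the flagged line, so identity survives line shifts


@dataclass
class Commit:
    sha: str
    risky: bool  # assumed verdict of a statistical model for this commit
    findings: frozenset = field(default_factory=frozenset)


def new_findings(parent: Commit, child: Commit) -> frozenset:
    """Warnings present in the child commit but absent from its parent."""
    return child.findings - parent.findings


def findings_to_review(history: list[Commit]) -> dict[str, frozenset]:
    """Keep only warnings introduced by commits flagged as risky.

    `history` is assumed to be a linear list of commits, oldest first.
    """
    result: dict[str, frozenset] = {}
    for parent, child in zip(history, history[1:]):
        introduced = new_findings(parent, child)
        if child.risky and introduced:
            result[child.sha] = introduced
    return result
```

Under these assumptions, a risky commit that adds no warnings and a non-risky commit that adds warnings are both filtered out, which is the reduction in review effort the abstract reports.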
Index Terms
- WarningsGuru: integrating statistical bug models with static analysis to provide timely and specific bug warnings