ABSTRACT
Static bug detectors are becoming increasingly popular and are widely used by professional software developers. While most work on bug detectors focuses on whether they find bugs at all, and on how many false positives they report in addition to legitimate warnings, the inverse question is often neglected: How many of all real-world bugs do static bug detectors find? This paper addresses this question by studying the results of applying three widely used static bug detectors to an extended version of the Defects4J dataset that consists of 15 Java projects with 594 known bugs. To decide which of these bugs the tools detect, we use a novel methodology that combines an automatic analysis of warnings and bugs with a manual validation of each candidate for a detected bug. The results of the study show that: (i) static bug detectors find a non-negligible number of all bugs, (ii) different tools are mostly complementary to each other, and (iii) current bug detectors miss the large majority of the studied bugs. A detailed analysis of bugs missed by the static detectors shows that some bugs could have been found by variants of the existing detectors, while others are domain-specific problems that do not match any existing bug pattern. These findings help potential users of such tools to assess their utility, motivate and outline directions for future work on static bug detection, and provide a basis for future comparisons of static bug detection with other bug finding techniques, such as manual and automated testing.
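The automatic analysis mentioned above pairs each known bug with tool warnings that could plausibly report it, before a human validates each pair. As a rough illustration only, the sketch below matches a warning to a bug when the warning's location falls on or near a line changed by the bug's fix; all names, the line-window criterion, and the data layout are assumptions for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch of matching static-analysis warnings to known bugs
# by fix location. A warning becomes a *candidate* for a detected bug if
# it lies within `window` lines of a line touched by the bug's fix; each
# candidate would still require manual validation, as the study describes.
from dataclasses import dataclass

@dataclass(frozen=True)
class Warning:
    file: str   # file flagged by the static bug detector
    line: int   # line number of the warning

@dataclass(frozen=True)
class Bug:
    bug_id: str
    fixed_lines: dict  # file -> set of line numbers changed by the fix

def candidates(warnings, bugs, window=0):
    """Pair each bug with warnings near its fix location.

    `window` lets a warning within N lines of a fixed line count as a
    candidate, since a detector may flag a neighboring statement.
    """
    matches = []
    for bug in bugs:
        for w in warnings:
            lines = bug.fixed_lines.get(w.file, set())
            if any(abs(w.line - fixed) <= window for fixed in lines):
                matches.append((bug.bug_id, w))
    return matches

# Toy data: one warning sits one line away from a fixed line.
warnings = [Warning("Foo.java", 42), Warning("Bar.java", 7)]
bugs = [Bug("Lang-1", {"Foo.java": {41, 43}})]
print(candidates(warnings, bugs, window=1))
# → [('Lang-1', Warning(file='Foo.java', line=42))]
```

A purely location-based match like this over-approximates (a warning on a fixed line may describe an unrelated issue), which is exactly why the study follows it with manual validation of every candidate.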
How Many of All Bugs Do We Find? A Study of Static Bug Detectors. Andrew Habib and Michael Pradel. ASE '18, September 3–7, 2018, Montpellier, France.