ABSTRACT
A major obstacle to finding program errors in a real system is knowing what correctness rules the system must obey. These rules are often undocumented or specified in an ad hoc manner. This paper demonstrates techniques that automatically extract such checking information from the source code itself, rather than the programmer, thereby avoiding the need for a priori knowledge of system rules.The cornerstone of our approach is inferring programmer "beliefs" that we then cross-check for contradictions. Beliefs are facts implied by code: a dereference of a pointer, p, implies a belief that p is non-null, a call to "unlock(1)" implies that 1 was locked, etc. For beliefs we know the programmer must hold, such as the pointer dereference above, we immediately flag contradictions as errors. For beliefs that the programmer may hold, we can assume these beliefs hold and use a statistical analysis to rank the resulting errors from most to least likely. For example, a call to "spin_lock" followed once by a call to "spin_unlock" implies that the programmer may have paired these calls by coincidence. If the pairing happens 999 out of 1000 times, though, then it is probably a valid belief and the sole deviation a probable error. The key feature of this approach is that it requires no a priori knowledge of truth: if two beliefs contradict, we know that one is an error without knowing what the correct belief is.Conceptually, our checkers extract beliefs by tailoring rule "templates" to a system --- for example, finding all functions that fit the rule template "a must be paired with b." We have developed six checkers that follow this conceptual framework. They find hundreds of bugs in real systems such as Linux and OpenBSD. From our experience, they give a dramatic reduction in the manual effort needed to check a large system. Compared to our previous work [9], these template checkers find ten to one hundred times more rule instances and derive properties we found impractical to specify manually.
- 1.A. Aiken, M. Faehndrich, and Z. Su. Detecting races in relay ladder logic programs. In Proceedings of the 1st International Conference on Tools and Algorithms .for the Construction and Analysis of Systems, April 1998.]] Google ScholarDigital Library
- 2.T. Ball and S. K. Rajamani. Bebop: A symbolic model checker for boolean programs. In SPIN P000 Workshop on Model Checking of Software (LNCS 1885, Springer), August/September 2000.]] Google ScholarDigital Library
- 3.M. Bishop and M. Dilger. Checking for race conditions in file accesses. Computing systems, pages 131-152, Spring 1996.]]Google Scholar
- 4.W.R. Bush, J.D. Pincus, and D.J. Sielaff. A static analyzer for finding dynamic programming errors. Software: Practice and Experience, 30(7):775-802, 2000.]] Google ScholarDigital Library
- 5.J.C. Corbett, M.B. Dwyer, J. Hatcliff, S. Laubach, C.S. Pasareanu, Robby, and H. Zheng. Bandera: Extracting finite-state models from java source code. In ICSE PO00, 2000.]] Google ScholarDigital Library
- 6.R. DeLine and M. Fahndrich. Enforcing high-level protocols in low-level software. In Proceedings of the ACM SIG- PLAN P001 Conference on Programming Language Design and Implementation, June 2001.]] Google ScholarDigital Library
- 7.D.L. Detlefs, R.M. Leino, G. Nelson, and J.B. Saxe. Extended static checking. TR SRC-159, COMPAQ SRC, December 1998.]]Google Scholar
- 8.P. Eidorff, F. Henglein, C. Mossin, H. Niss, M. H. S0rensen, and M. Tofte. AnnoDomini: From type theory to year 2000 conversion toot. In Conference Record of POPL 99: The P6th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, San Antonio, Texas, pages 1-14, New York, NY, 1999.]] Google ScholarDigital Library
- 9.D. Engler, B. Chelf, A. Chou, and S. Hallem. Checking system rules using system-specific, programmer-written compiler extensions. In Proceedings of Operating Systems Design and Implementation (OSDI), September 2000.]] Google ScholarDigital Library
- 10.M.D. Ernst, J. Cockrell, W.G. Griswold, and D. Notkin. Dynamically discovering likely program invariants to support program evolution. IEEE Transactions on Software Engineering, 27(2):1-25, February 2001.]] Google ScholarDigital Library
- 11.D. Evans, J. Guttag, J. Homing, and Y.M. Tan. Lclint: A tool for using specifications to check code. In Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering, December 1994.]] Google ScholarDigital Library
- 12.C. Flanagan, K. Rustan, and M. Leino. Houdini, an annotation assistant for esc/java. In Symposium of Formal Methods Europe, pages 500-517, March 2001.]] Google ScholarDigital Library
- 13.R. W. Floyd. Assigning meanings to programs, pages 19- 32. J.T. Schwartz, Ed. American Mathematical Society, 1967.]]Google Scholar
- 14.D. Freedman, R. Pisani, and R. Purves. Statistics. W.W. Norton, third edition edition, 1998.]]Google Scholar
- 15.G. Holzmann and M. Smith. Software model checking: Extracting verification models from source code. In Invited Paper. Proc. PSTV/FORTE99 Publ. Kluwer, 1999.]] Google ScholarDigital Library
- 16.G. Humpreys. Personal communication. Wasted days configuring a graphics card because of a missing error check in the driver., September 2000.]]Google Scholar
- 17.G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C.V. Lopes, J. Loingtier, and J. Irwin. Aspect-oriented programming. In European Conference on Object-Oriented Programming (ECOOP), June 1997.]]Google ScholarCross Ref
- 18.D. Lie, A. Chou, D. Engler, and D. Dill. A simple method for extracting models from protocol code. In Proceedings of the 28th Annual International Symposium on Computer Architecture, July 2001.]] Google ScholarDigital Library
- 19.K.L. McMillan and J. Schwalbe. Formal verification of the gigamax cache consistency protocol. In Proceedings of the International Symposium on Shared Memory Multiprocessing, pages 242-51. Tokyo, Japan InL Process. Soc., 1991.]]Google Scholar
- 20.G. Nelson. Techniques for program verification. Available as Xerox PARC Research Report CSL-81-10, June, 1981, Stanford University, 1981.]]Google Scholar
- 21.S. Savage, M. Burrows, G. Nelson, P. Sobalvarro, and T.E. Anderson. Eraser: A dynamic data race detector for multithreaded programming. ACM Transactions on Computer Systems, 15(4):391-411, 1997.]] Google ScholarDigital Library
- 22.U. Stern and D.L. Dill. Automatic verification of the SCI cache coherence protocol. In Correct Hardware Design and Verification Methods: IFIP WGIO.5 Advanced Research Working Conference Proceedings, 1995.]] Google ScholarDigital Library
- 23.R E Strom and S Yemini. Typestate a programming language concept for enhancing software reliability. IEEE Transactions on Software Engineering, 1:157-171, January 1986.]] Google ScholarDigital Library
- 24.D. Wagner, J. Foster, E. Brewer, and A. Aiken. A first step towards automated detection of buffer overrun ruinerabilities. In The 2000 Network and Distributed Systems Security Conference. San Diego, CA, February 2000.]]Google Scholar
Index Terms
- Bugs as deviant behavior: a general approach to inferring errors in systems code
Recommendations
Bugs as deviant behavior: a general approach to inferring errors in systems code
A major obstacle to finding program errors in a real system is knowing what correctness rules the system must obey. These rules are often undocumented or specified in an ad hoc manner. This paper demonstrates techniques that automatically extract such ...
Characterizing duplicate bugs: Perceptions of practitioners and an empirical analysis
AbstractBug handling is an essential part of the software development process. Ideally, in a bug‐tracking system, bugs are reported, fixed, verified, and closed. In some cases, bugs have to be reopened mostly due to an incorrect fix. However, instead of ...
Our study investigates the difference between unique and duplicate bugs and further categorizes duplicates into Master‐Unresolved (Category‐1) and Missed‐Reopen (Category‐2) bugs. Through investigation of bug reports, we found that duplicates are up to ...
Comments