ABSTRACT
Whitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. Unfortunately, the current effectiveness of whitebox fuzzing is limited when testing applications with highly-structured inputs, such as compilers and interpreters. These applications process their inputs in stages, such as lexing, parsing and evaluation. Due to the enormous number of control paths in early processing stages, whitebox fuzzing rarely reaches parts of the application beyond those first stages.
In this paper, we study how to enhance whitebox fuzzing of complex structured-input applications with a grammar-based specification of their valid inputs. We present a novel dynamic test generation algorithm where symbolic execution directly generates grammar-based constraints whose satisfiability is checked using a custom grammar-based constraint solver. We have implemented this algorithm and evaluated it on a large security-critical application, the JavaScript interpreter of Internet Explorer 7 (IE7). Results of our experiments show that grammar-based whitebox fuzzing explores deeper program paths and avoids dead-ends due to non-parsable inputs. Compared to regular whitebox fuzzing, grammar-based whitebox fuzzing increased coverage of the code generation module of the IE7 JavaScript interpreter from 53% to 81% while using three times fewer tests.
- D. Aitel. The Advantages of Block-Based Protocol Analysis for Security Testing. Immunity Inc., February, 2002.Google Scholar
- S. Artzi, A. Kie?un, J. Dolby, F. Tip, D. Dig, A. Paradkar, and M. D. Ernst. Finding bugs in dynamic Web applications. Technical Report MIT-CSAIL-TR-2008-006, MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, Feb. 2008.Google ScholarDigital Library
- D. Bird and C. Munoz. Automatic Generation of Random Self-Checking Test Cases. IBM Systems Journal, 22(3):229--245, 1983. Google ScholarDigital Library
- N. Borisov, D. Brumley, H. Wang, J. Dunagan, P. Joshi, and C. Guo. Generic application-level protocol analyzer and its language. In NDSS, 2007.Google Scholar
- C. Boyapati, S. Khurshid, and D. Marinov. Korat: automated testing based on Java predicates. In ISSTA, 2002. Google ScholarDigital Library
- C. Cadar, V. Ganesh, P. Pawlowski, D. Dill, and D. Engler. EXE: automatically generating inputs of death. In CCS, 2006. Google ScholarDigital Library
- K. Claessen and J. Hughes. QuickCheck: A lightweight tool for random testing of Haskell programs. In ICFP, 2000. Google ScholarDigital Library
- D. Coppit and J. Lian. yagg: an easy-to-use generator for structured test inputs. In ASE, 2005. Google ScholarDigital Library
- W. Cui, J. Kannan, and H. J. Wang. Discoverer: Automatic protocol reverse engineering from network traces. In USENIX Security Symposium, 2007. Google ScholarDigital Library
- B. Daniel, D. Dig, K. Garcia, and D. Marinov. Automated testing of refactoring engines. In FSE, 2007. Google ScholarDigital Library
- M. Emmi, R. Majumdar, and K. Sen. Dynamic test input generation for database applications. In ISSTA, 2007. Google ScholarDigital Library
- J. E. Forrester and B. P. Miller. An Empirical Study of the Robustness of Windows NT Applications Using Random Testing. In Proceedings of the 4th USENIX Windows System Symposium, Seattle, August 2000. Google ScholarDigital Library
- P. Godefroid. Compositional Dynamic Test Generation. In POPL, 2007. Google ScholarDigital Library
- P. Godefroid, N. Klarlund, and K. Sen. DART: Directed automated random testing. In PLDI, 2005. Google ScholarDigital Library
- P. Godefroid, M. Levin, and D. Molnar. Active property checking. Technical Report MSR-TR-2007-91, Microsoft, 2007.Google Scholar
- P. Godefroid, M. Levin, and D. Molnar. Automated whitebox fuzz testing. In NDSS, 2008.Google Scholar
- K. Hanford. Automatic Generation of Test Cases. IBM Systems Journal, 9(4), 1970.Google ScholarDigital Library
- J. Hopcroft and J. Ullman. Introduction to automata theory, languages and computation. Addison-Wesley Series in Computer Science, 1979. Google ScholarDigital Library
- S. Khurshid and D. Marinov. TestEra: Specification-Based Testing of Java Programs Using SAT. In ASE, 2004. Google ScholarDigital Library
- J. King. Symbolic execution and program testing. Communications of the ACM, 19(7):385--394, 1976. Google ScholarDigital Library
- R. Lämmel and W. Schulte. Controllable combinatorial coverage in grammar-based testing. In TestCom, 2006.Google ScholarDigital Library
- R. Majumdar and K. Sen. LATEST: Lazy dynamic test input generation. Technical Report UCB/EECS-2007-36, EECS Department, University of California, Berkeley, 2007.Google Scholar
- R. Majumdar and R.-G. Xu. Directed test generation using symbolic grammars. In ASE, 2007. Google ScholarDigital Library
- B. Malloy and J. Power. An interpretation of Purdom?s algorithm for automatic generation of test cases. In ICIS, 2001.Google Scholar
- P. Maurer. Generating test data with enhanced context-free grammars. IEEE Software, 7(4), 1990. Google ScholarDigital Library
- B. McKenzie. Generating strings at random from a context free grammar. Technical Report TR-COSC 10/97, Department of Computer Science, University of Canterbury, 1997.Google Scholar
- D. Melski and T. Reps. Interconvertbility of set constraints and context-free language reachability. In PEPM, 1997. Google ScholarDigital Library
- B. P. Miller, L. Fredriksen, and B. So. An empirical study of the reliability of UNIX utilities. Communications of the ACM, 33(12), 1990. Google ScholarDigital Library
- R. C. Moore. Removing left recursion from context-free grammars. In Proceedings of the first conference on North American chapter of the Association for Computational Linguistics, 2000. Google ScholarDigital Library
- C. Pacheco, S. K. Lahiri, M. D. Ernst, and T. Ball. Feedbackdirected random test generation. In ICSE, 2007. Google ScholarDigital Library
- R. Pang, V. Paxson, R. Sommer, and L. Peterson. binpac: a yacc for writing application protocol parsers. In IMC, 2006. Google ScholarDigital Library
- P. Purdom. A sentence generator for testing parsers. BIT Numerical Mathematics, 12(3), 1972.Google Scholar
- D. J. Salomon and G. V. Cormack. Scannerless NSLR(1) parsing of programming languages. In PLDI, 1989. Google ScholarDigital Library
- K. Sen, D. Marinov, and G. Agha. CUTE: a concolic unit testing engine for C. In FSE, 2005. Google ScholarDigital Library
- E. Sirer and B. Bershad. Using production grammars in software testing. In DSL, 1999. Google ScholarDigital Library
- K. Sullivan, J. Yang, D. Coppit, S. Khurshid, and D. Jackson. Software assurance by bounded exhaustive testing. In ISSTA, 2004. Google ScholarDigital Library
- M. Sutton, A. Greene, and P. Amini. Fuzzing: Brute Force Vulnerability Discovery. Addison-Wesley, 2007. Google ScholarDigital Library
- M. Utting, A. Pretschner, and B. Legeard. A Taxonomy of Model-Based Testing. Department of Computer Science, The University of Waikato, New Zealand, Tech. Rep, 4, 2006.Google Scholar
- G. Wassermann and Z. Su. Sound and precise analysis of Web applications for injection vulnerabilities. In PLDI, 2007. Google ScholarDigital Library
Index Terms
- Grammar-based whitebox fuzzing
Recommendations
Grammar-based whitebox fuzzing
PLDI '08Whitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. Unfortunately, the current effectiveness of whitebox fuzzing is limited when ...
Random testing for security: blackbox vs. whitebox fuzzing
RT '07: Proceedings of the 2nd international workshop on Random testing: co-located with the 22nd IEEE/ACM International Conference on Automated Software Engineering (ASE 2007)Fuzz testing is an effective technique for finding security vulnerabilities in software. Fuzz testing is a form of blackbox random testing which randomly mutates well-formed inputs and tests the program on the resulting data. In some cases, grammars are ...
Compositional dynamic test generation
POPL '07: Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languagesDynamic test generation is a form of dynamic program analysis that attempts to compute test inputs to drive a program along a specific program path. Directed Automated Random Testing, or DART for short, blends dynamic test generation with model checking ...
Comments