DOI: 10.1145/1375581.1375607

Grammar-based whitebox fuzzing

Published: 07 June 2008

ABSTRACT

Whitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. Unfortunately, the current effectiveness of whitebox fuzzing is limited when testing applications with highly-structured inputs, such as compilers and interpreters. These applications process their inputs in stages, such as lexing, parsing and evaluation. Due to the enormous number of control paths in early processing stages, whitebox fuzzing rarely reaches parts of the application beyond those first stages.

In this paper, we study how to enhance whitebox fuzzing of complex structured-input applications with a grammar-based specification of their valid inputs. We present a novel dynamic test generation algorithm where symbolic execution directly generates grammar-based constraints whose satisfiability is checked using a custom grammar-based constraint solver. We have implemented this algorithm and evaluated it on a large security-critical application, the JavaScript interpreter of Internet Explorer 7 (IE7). Results of our experiments show that grammar-based whitebox fuzzing explores deeper program paths and avoids dead-ends due to non-parsable inputs. Compared to regular whitebox fuzzing, grammar-based whitebox fuzzing increased coverage of the code generation module of the IE7 JavaScript interpreter from 53% to 81% while using three times fewer tests.
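To make the core idea concrete, the following is a minimal, hypothetical sketch of what a grammar-based constraint solver does: given a context-free grammar of valid inputs and a token-level constraint collected by symbolic execution (here, a required token prefix), it searches for a grammatically valid input that satisfies the constraint. The toy grammar, the `sentences`/`solve` helpers, and the naive bounded enumeration are all illustrative assumptions, not the paper's implementation, which uses a custom solver and the full JavaScript grammar.

```python
# Illustrative sketch only: a naive grammar-based "constraint solver".
# Symbolic execution of a lexer/parser might yield a constraint such as
# "the input's token sequence starts with '(' 'id'"; the solver must find
# a sentence in the grammar satisfying it, avoiding non-parsable inputs.
from itertools import product

GRAMMAR = {  # toy expression grammar (not the IE7 JavaScript grammar)
    "E": [["T", "+", "E"], ["T"]],
    "T": [["id"], ["(", "E", ")"]],
}

def sentences(symbol, depth):
    """Enumerate token sequences derivable from `symbol` within `depth`."""
    if symbol not in GRAMMAR:          # terminal token
        yield [symbol]
        return
    if depth == 0:                     # bound the derivation depth
        return
    for production in GRAMMAR[symbol]:
        # expand each symbol of the production, then combine the choices
        expansions = [list(sentences(s, depth - 1)) for s in production]
        for combo in product(*expansions):
            yield [tok for part in combo for tok in part]

def solve(prefix, depth=4):
    """Return a valid sentence whose tokens start with `prefix`, or None."""
    for sent in sentences("E", depth):
        if sent[:len(prefix)] == prefix:
            return sent
    return None

print(solve(["(", "id"]))  # → ['(', 'id', ')', '+', 'id', '+', 'id']
```

Brute-force enumeration is exponential in the derivation depth; the point of a dedicated grammar-based solver is to decide satisfiability and produce a witness directly from the grammar and the constraint, so that every generated test is parsable by construction.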



Published in

PLDI '08: Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2008, 396 pages. ISBN 9781595938602, DOI 10.1145/1375581. General Chair: Rajiv Gupta; Program Chair: Saman Amarasinghe.

Also published in ACM SIGPLAN Notices, Volume 43, Issue 6 (PLDI '08), June 2008, 382 pages. ISSN 0362-1340, EISSN 1558-1160, DOI 10.1145/1379022.
Copyright © 2008 ACM

Publisher: Association for Computing Machinery, New York, NY, United States



Overall acceptance rate: 406 of 2,067 submissions, 20%
