Abstract
Packrat parsing is a novel technique for implementing parsers in a lazy functional programming language. A packrat parser provides the power and flexibility of top-down parsing with backtracking and unlimited lookahead, but nevertheless guarantees linear parse time. Any language defined by an LL(k) or LR(k) grammar can be recognized by a packrat parser, in addition to many languages that conventional linear-time algorithms do not support. This additional power simplifies the handling of common syntactic idioms such as the widespread but troublesome longest-match rule, enables the use of sophisticated disambiguation strategies such as syntactic and semantic predicates, provides better grammar composition properties, and allows lexical analysis to be integrated seamlessly into parsing. Yet despite its power, packrat parsing shares the same simplicity and elegance as recursive descent parsing; in fact converting a backtracking recursive descent parser into a linear-time packrat parser often involves only a fairly straightforward structural change. This paper describes packrat parsing informally with emphasis on its use in practical applications, and explores its advantages and disadvantages with respect to the more conventional alternatives.
- Stephen Robert Adams. Modular Grammars for Programming Language Prototyping. PhD thesis, University of Southampton, 1991.Google Scholar
- Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, 1986. Google ScholarDigital Library
- Alfred V. Aho and Jeffrey D. Ullman. The Theory of Parsing, Translation and Compiling-Vol. I: Parsing. Prentice Hall, Englewood Cliffs, N.J., 1972. Google ScholarDigital Library
- Alexander Birman and Jeffrey D. Ullman. Parsing algorithms with backtrack. Information and Control, 23(1):1--34, Aug 1973.Google ScholarCross Ref
- Dmitri Bronnikov. Free Yacc-able Java(tm) grammar, 1998. http://home.inreach.com/bronikov/grammars/java.html.Google Scholar
- Carlos Camarão and Lucília Figueiredo. A monadic combinator compiler compiler. In 5th Brazilian Symposium on Programming Languages, Curitiba - PR - Brazil, May 2001. Universidade Federal do Paraná.Google Scholar
- Luca Cardelli, Florian Matthes, and Martín Abadi. Extensible syntax with lexical scoping. Technical Report 121, Digital Systems Research Center, 1994.Google Scholar
- Jeroen Fokker. Functional parsers. In Advanced Functional Programming, pages 1--23, 1995. Google ScholarDigital Library
- Bryan Ford. Packrat parsing: a practical linear-time algorithm with backtracking. Master's thesis, Massachusetts Institute of Technology, Sep 2002.Google Scholar
- Andy Gill and Simon Marlow. Happy: The parser generator for Haskell. http://www.haskell.org/happy.Google Scholar
- Graham Hutton and Erik Meijer. Monadic parsing in Haskell. Journal of Functional Programming, 8(4):437--444, Jul 1998. Google ScholarDigital Library
- Lillian Lee. Fast context-free grammar parsing requires fast boolean matrix multiplication. Journal of the ACM, 2002. To appear. Google ScholarDigital Library
- Daan Leijen. Parsec, a fast combinator parser. http://www.cs.uu.nl/~daan.Google Scholar
- Terence J. Parr and Russell W. Quong. Adding semantic and syntactic predicates to LL(k): pred-LL(k). In Computational Complexity, pages 263--277, 1994. Google ScholarDigital Library
- Terence J. Parr and Russell W. Quong. ANTLR: A predicatedLL(k) parser generator. Software Practice and Experience, 25(7):789--810, 1995. Google ScholarDigital Library
- Terence John Parr. Obtaining practical variants of LL(k) and LR(k) for kρ 1 by splitting the atomic k-tuple. PhD thesis, Purdue University, Apr 1993.Google Scholar
- Peter Pepper. LR parsing = grammar transformation + LL parsing: Making LR parsing more understandable and more efficient. Technical Report 99-5, TU Berlin, Apr 1999.Google Scholar
- Daniel J. Salomon and Gordon V. Cormack. Scannerless NSLR(1) parsing of programming languages. In Proceedings of the ACM SIGPLAN'89 Conference on Programming Language Design and Implementation (PLDI), pages 170--178, Jul 1989. Google ScholarDigital Library
- Kuo-Chung Tai. Noncanonical SLR(1) grammars. ACM Transactions on Programming Languages and Systems, 1(2): 295--320, Oct 1979. Google ScholarDigital Library
- Masaru Tomita. Efficient parsing for natural language. Kluwer Academic Publishers, 1985. Google ScholarDigital Library
- M. G. J. van den Brand, J. Scheerder, J. J. Vinju, and E. Visser. Disambiguation filters for scannerless generalized LR parsers. In Compiler Construction, 2002. Google ScholarDigital Library
- Eelco Visser. Scannerless generalized-LR parsing. Technical Report P9707, Programming Research Group, University of Amsterdam, 1997.Google Scholar
- Philip Wadler. How to replace failure by a list of successes: A method for exception handling, backtracking, and pattern matching in lazy functional languages. In Functional Programming Languages and Computer Architecture, pages 113--128, 1985. Google ScholarDigital Library
Index Terms
- Packrat parsing:: simple, powerful, lazy, linear time, functional pearl
Recommendations
Parsing expression grammars: a recognition-based syntactic foundation
POPL '04: Proceedings of the 31st ACM SIGPLAN-SIGACT symposium on Principles of programming languagesFor decades we have been using Chomsky's generative system of grammars, particularly context-free grammars (CFGs) and regular expressions (REs), to express the syntax of programming languages and protocols. The power of generative grammars to express ...
Parsing expression grammars: a recognition-based syntactic foundation
POPL '04For decades we have been using Chomsky's generative system of grammars, particularly context-free grammars (CFGs) and regular expressions (REs), to express the syntax of programming languages and protocols. The power of generative grammars to express ...
Packrat parsing:: simple, powerful, lazy, linear time, functional pearl
ICFP '02: Proceedings of the seventh ACM SIGPLAN international conference on Functional programmingPackrat parsing is a novel technique for implementing parsers in a lazy functional programming language. A packrat parser provides the power and flexibility of top-down parsing with backtracking and unlimited lookahead, but nevertheless guarantees ...
Comments