article

Packrat parsing:: simple, powerful, lazy, linear time, functional pearl

Author:
Bryan Ford

Massachusetts Institute of Technology, Cambridge, MA

Massachusetts Institute of Technology, Cambridge, MA
View Profile

Authors Info & Claims

ACM SIGPLAN Notices Volume 37 Issue 9September 2002pp 36–47https://doi.org/10.1145/583852.581483

Published:17 September 2002Publication History

ACM SIGPLAN Notices

Abstract

Packrat parsing is a novel technique for implementing parsers in a lazy functional programming language. A packrat parser provides the power and flexibility of top-down parsing with backtracking and unlimited lookahead, but nevertheless guarantees linear parse time. Any language defined by an LL(k) or LR(k) grammar can be recognized by a packrat parser, in addition to many languages that conventional linear-time algorithms do not support. This additional power simplifies the handling of common syntactic idioms such as the widespread but troublesome longest-match rule, enables the use of sophisticated disambiguation strategies such as syntactic and semantic predicates, provides better grammar composition properties, and allows lexical analysis to be integrated seamlessly into parsing. Yet despite its power, packrat parsing shares the same simplicity and elegance as recursive descent parsing; in fact converting a backtracking recursive descent parser into a linear-time packrat parser often involves only a fairly straightforward structural change. This paper describes packrat parsing informally with emphasis on its use in practical applications, and explores its advantages and disadvantages with respect to the more conventional alternatives.

References

Stephen Robert Adams. Modular Grammars for Programming Language Prototyping. PhD thesis, University of Southampton, 1991.Google Scholar
Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, 1986. Google ScholarDigital Library
Alfred V. Aho and Jeffrey D. Ullman. The Theory of Parsing, Translation and Compiling-Vol. I: Parsing. Prentice Hall, Englewood Cliffs, N.J., 1972. Google ScholarDigital Library
Alexander Birman and Jeffrey D. Ullman. Parsing algorithms with backtrack. Information and Control, 23(1):1--34, Aug 1973.Google ScholarCross Ref
Dmitri Bronnikov. Free Yacc-able Java(tm) grammar, 1998. http://home.inreach.com/bronikov/grammars/java.html.Google Scholar
Carlos Camarão and Lucília Figueiredo. A monadic combinator compiler compiler. In 5th Brazilian Symposium on Programming Languages, Curitiba - PR - Brazil, May 2001. Universidade Federal do Paraná.Google Scholar
Luca Cardelli, Florian Matthes, and Martín Abadi. Extensible syntax with lexical scoping. Technical Report 121, Digital Systems Research Center, 1994.Google Scholar
Jeroen Fokker. Functional parsers. In Advanced Functional Programming, pages 1--23, 1995. Google ScholarDigital Library
Bryan Ford. Packrat parsing: a practical linear-time algorithm with backtracking. Master's thesis, Massachusetts Institute of Technology, Sep 2002.Google Scholar
Andy Gill and Simon Marlow. Happy: The parser generator for Haskell. http://www.haskell.org/happy.Google Scholar
Graham Hutton and Erik Meijer. Monadic parsing in Haskell. Journal of Functional Programming, 8(4):437--444, Jul 1998. Google ScholarDigital Library
Lillian Lee. Fast context-free grammar parsing requires fast boolean matrix multiplication. Journal of the ACM, 2002. To appear. Google ScholarDigital Library
Daan Leijen. Parsec, a fast combinator parser. http://www.cs.uu.nl/~daan.Google Scholar
Terence J. Parr and Russell W. Quong. Adding semantic and syntactic predicates to LL(k): pred-LL(k). In Computational Complexity, pages 263--277, 1994. Google ScholarDigital Library
Terence J. Parr and Russell W. Quong. ANTLR: A predicatedLL(k) parser generator. Software Practice and Experience, 25(7):789--810, 1995. Google ScholarDigital Library
Terence John Parr. Obtaining practical variants of LL(k) and LR(k) for kρ 1 by splitting the atomic k-tuple. PhD thesis, Purdue University, Apr 1993.Google Scholar
Peter Pepper. LR parsing = grammar transformation + LL parsing: Making LR parsing more understandable and more efficient. Technical Report 99-5, TU Berlin, Apr 1999.Google Scholar
Daniel J. Salomon and Gordon V. Cormack. Scannerless NSLR(1) parsing of programming languages. In Proceedings of the ACM SIGPLAN'89 Conference on Programming Language Design and Implementation (PLDI), pages 170--178, Jul 1989. Google ScholarDigital Library
Kuo-Chung Tai. Noncanonical SLR(1) grammars. ACM Transactions on Programming Languages and Systems, 1(2): 295--320, Oct 1979. Google ScholarDigital Library
Masaru Tomita. Efficient parsing for natural language. Kluwer Academic Publishers, 1985. Google ScholarDigital Library
M. G. J. van den Brand, J. Scheerder, J. J. Vinju, and E. Visser. Disambiguation filters for scannerless generalized LR parsers. In Compiler Construction, 2002. Google ScholarDigital Library
Eelco Visser. Scannerless generalized-LR parsing. Technical Report P9707, Programming Research Group, University of Amsterdam, 1997.Google Scholar
Philip Wadler. How to replace failure by a list of successes: A method for exception handling, backtracking, and pattern matching in lazy functional languages. In Functional Programming Languages and Computer Architecture, pages 113--128, 1985. Google ScholarDigital Library

Index Terms

Packrat parsing:: simple, powerful, lazy, linear time, functional pearl
1. Software and its engineering
  1. Software notations and tools
    1. Compilers
      1. Parsers
    2. General programming languages
      1. Language types
        Functional languages
2. Theory of computation
  1. Semantics and reasoning
    1. Program reasoning
      1. Parsing

Recommendations

Parsing expression grammars: a recognition-based syntactic foundation
POPL '04: Proceedings of the 31st ACM SIGPLAN-SIGACT symposium on Principles of programming languages

For decades we have been using Chomsky's generative system of grammars, particularly context-free grammars (CFGs) and regular expressions (REs), to express the syntax of programming languages and protocols. The power of generative grammars to express ...
Read More
Parsing expression grammars: a recognition-based syntactic foundation
POPL '04

For decades we have been using Chomsky's generative system of grammars, particularly context-free grammars (CFGs) and regular expressions (REs), to express the syntax of programming languages and protocols. The power of generative grammars to express ...
Read More
Packrat parsing:: simple, powerful, lazy, linear time, functional pearl
ICFP '02: Proceedings of the seventh ACM SIGPLAN international conference on Functional programming

Packrat parsing is a novel technique for implementing parsers in a lazy functional programming language. A packrat parser provides the power and flexibility of top-down parsing with backtracking and unlimited lookahead, but nevertheless guarantees ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM SIGPLAN Notices Volume 37, Issue 9
September 2002
283 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/583852
Issue’s Table of Contents
ICFP '02: Proceedings of the seventh ACM SIGPLAN international conference on Functional programming
October 2002
294 pages
ISBN:1581134878
DOI:10.1145/581478
Conference Chair:
Mitchell Wand
Northeastern University
,
Program Chair:
Simon Peyton Jones
Microsoft Research Ltd., Cambridge
Copyright © 2002 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 17 September 2002
Check for updates
Author Tags
Haskell
backtracking
lexical analysis
memoization
parser combinators
scannerless parsing
top-down parsing
Qualifiers
- article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 158
  Total Citations
  View Citations
- 1,364
  Total Downloads
- Downloads (Last 12 months)55
- Downloads (Last 6 weeks)10
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Packrat parsing:: simple, powerful, lazy, linear time, functional pearl

ACM SIGPLAN Notices

Abstract

References

Cited By

Index Terms

Recommendations

Parsing expression grammars: a recognition-based syntactic foundation

Parsing expression grammars: a recognition-based syntactic foundation

Packrat parsing:: simple, powerful, lazy, linear time, functional pearl