Skip to main content
main-content

Über dieses Buch

The theory of parsing is an important application area of the theory of formal languages and automata. The evolution of modem high-level programming languages created a need for a general and theoretically dean methodology for writing compilers for these languages. It was perceived that the compilation process had to be "syntax-directed", that is, the functioning of a programming language compiler had to be defined completely by the underlying formal syntax of the language. A program text to be compiled is "parsed" according to the syntax of the language, and the object code for the program is generated according to the semantics attached to the parsed syntactic entities. Context-free grammars were soon found to be the most convenient formalism for describing the syntax of programming languages, and accordingly methods for parsing context-free languages were devel­ oped. Practical considerations led to the definition of various kinds of restricted context-free grammars that are parsable by means of efficient deterministic linear-time algorithms.

Inhaltsverzeichnis

Frontmatter

1. Elements of Language Theory

Abstract
In this chapter we shall review the mathematical and computer science background on which the presentation in this book is based. We shall discuss the elements of discrete mathematics and formal language theory, emphasizing those issues that are of importance from the point of view of context-free parsing. We shall devote a considerable part of this chapter to matters such as random access machines and computational complexity. These will be relevant later when we derive efficient algorithms for parsing theoretic problems or prove lower bounds for the complexity of these problems. In this chapter we shall also discuss a general class of formal language descriptors called “rewriting systems” or “semi-Thue systems”. Later in the book we shall consider various language descriptors and language recognizers as special cases of a general rewriting system. As this approach is somewhat unconventional, we advise even the experienced reader to go through the definitions given in this chapter if he or she wishes to appreciate fully the presentation in this book.
Seppo Sippu, Eljas Soisalon-Soininen

2. Algorithms on Graphs

Abstract
In this chapter we shall develop some basic algorithms for directed graphs and relations which will be of use in later chapters, where the efficient construction of parsers is considered. The constructions needed can be expressed as the computing of certain “relational expressions”. These are expressions whose operands are relations and whose operators are chosen from among multiplication, closure, union and inverse. For this purpose we need to develop an algorithm for computing the closure of a relation. In view of the nature of our applications, the most appropriate way to do this is by a depth-first traversal of the graph that corresponds to the given relation. Other ways of computing the closure of a relation are considered in the exercises.
Seppo Sippu, Eljas Soisalon-Soininen

3. Regular Languages

Abstract
In this chapter we shall consider a subfamily of the family of context-free languages known as the regular languages. The family of regular languages over an alphabet V is the smallest family of languages over V that contains all finite languages over V and is closed under the operations closure, concatenation and finite union. We shall consider three equivalent ways of describing regular languages, namely regular expressions, finite automata and regular grammars. These are “equivalent” in the sense that each defines precisely the same family of regular languages. Moreover, there exist algorithms for transforming any regular expression, finite automaton or regular grammar into an equivalent language description belonging to either of the other two classes.
Seppo Sippu, Eljas Soisalon-Soininen

4. Context-free Languages

Abstract
In this chapter we shall define a class of rewriting systems called context-free grammars. The left-hand side of a rule in a context-free grammar consists of a single symbol, so that symbols are rewritten “context-freely”. Context-free grammars are of central importance to us because they define the class of context-free languages, the parsing of which is the subject of this book. In this chapter we shall consider some structural properties of context-free grammars which are of importance in parsing. Also, a basic method for recognizing context-free languages will be given.
Seppo Sippu, Eljas Soisalon-Soininen

5. Parsing

Abstract
In this chapter we shall introduce the central concept of this monograph, namely the parsing of context-free languages. The theory of parsing plays an important role in the design of compilers for programming languages. Every compiler includes a module called the parser, which has a twofold task in the compilation process. First, the parser checks that the program text to be compiled is syntactically correct, i.e., derivable by the context-free grammar of the programming language. In doing this the parser acts as a language recognizer. Secondly, if the program text proves to be syntactically correct, the parser goes on to produce some intermediate representation of the text for use as input to the module responsible for object code generation. In doing this the parser acts as a text transformer.
Seppo Sippu, Eljas Soisalon-Soininen

Backmatter

Weitere Informationen