
About this book

"Principles of Compilers: A New Approach to Compilers Including the Algebraic Method" introduces the ideas of compilation by drawing on the natural intelligence of human beings, comparing the similarities and differences between the compilation of natural languages and that of programming languages. A notation is created to list the source language, the target language, and the implementation language of a compiler, vividly illustrating the multilevel procedure of compilation. The book thoroughly explains the LL(1) and LR(1) parsing methods to help readers understand both the how and the why. It covers not only established methods used in the development of compilers but also an increasingly important alternative, the algebraic formal method. The book is intended for undergraduates, graduates, and researchers in computer science.

Professor Yunlin Su is Head of the Research Center of Information Technology at Universitas Ma Chung, Indonesia, and of the Department of Computer Science, Jinan University, Guangzhou, China. Dr. Song Y. Yan is Professor of Computer Science and Mathematics at the Institute for Research in Applicable Computing, University of Bedfordshire, UK, and a Visiting Professor at the Massachusetts Institute of Technology and Harvard University, USA.

Table of Contents

Frontmatter

Chapter 1. Introduction

Abstract
If you read the text above, you are engaging in one of the mind's most enchanting processes: the way one mind influences another through language. There is a precondition, however: you have to know English, otherwise the text has no influence on you at all. There are so many languages in the world that no one can say exactly how many there are. Hence there is a need for a bridge that connects different languages so that people can understand each other. That bridge is translation. The subject of this book is the translation between formal languages and machine languages, that is, compilation.
Yunlin Su, Song Y. Yan

Chapter 2. Grammars and Languages

Abstract
In the development of human language, the language itself was created first, without an established grammar. As human knowledge grew, grammars were created to support the study of languages and to normalize them. Since any natural language is very complicated and its grammar was founded after the language, no grammar can completely describe all the phenomena of the language, whatever the language is. In addition, natural languages contain ambiguities. Human beings, in general, can resolve these ambiguities themselves; for computers, however, ambiguity is hard to accept, let alone to understand. Programming languages differ from natural languages in that a language and its grammar are created at almost the same time. The purpose of the grammar is to help users of the language avoid ambiguity and to express meaning correctly. A program must be correctly written in order to run on a computer and produce correct results. Therefore, research on compilers should start with a discussion of the relation between grammars and languages.
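The ambiguity phenomenon described above can be made concrete with a small experiment (a hypothetical sketch, not taken from the book): under the ambiguous grammar E -> E + E | E * E | a, the string a+a*a has two distinct parse trees. The snippet below counts the parse trees of each span with a simple CYK-style dynamic program.

```python
def parse_counts(tokens):
    # Count distinct parse trees for tokens under the ambiguous grammar
    #   E -> E + E | E * E | a
    n = len(tokens)
    count = [[0] * (n + 1) for _ in range(n + 1)]  # count[i][j]: parses of tokens[i:j]
    for i in range(n):
        if tokens[i] == "a":
            count[i][i + 1] = 1
    for length in range(3, n + 1, 2):              # odd-length spans only
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j - 1):          # try each operator position
                if tokens[k] in "+*":
                    count[i][j] += count[i][k] * count[k + 1][j]
    return count[0][n]

print(parse_counts(list("a+a*a")))  # 2: (a+a)*a and a+(a*a)
```

Because two different trees yield two different meanings, a programming-language grammar is normally designed to rule such ambiguity out (for instance, by giving * higher precedence than +).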
Yunlin Su, Song Y. Yan

Chapter 3. Finite State Automata and Regular Languages

Abstract
One of the most important functions of a computer is to recognize specified patterns. For example, text-editing software often needs to replace one string of symbols with another, while a compiler must scan the symbols of a program to locate certain keywords. In fact, the fastest string-search algorithms are based on pattern recognition, which is, in turn, based on automata theory.
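The string-search application mentioned above can be illustrated with a small deterministic finite automaton. The sketch below (an illustrative example, not the book's code) builds a KMP-style DFA whose state records how much of the pattern has been matched so far, then runs it over the text in a single left-to-right pass.

```python
def make_dfa_for_substring(pattern):
    # State q = length of the longest prefix of `pattern` matching a
    # suffix of the input read so far. State len(pattern) is accepting.
    states = len(pattern) + 1
    alphabet = set(pattern)
    delta = {}
    for q in range(states):
        for c in alphabet:
            k = min(q + 1, len(pattern))
            # Find the longest prefix of pattern suffixing pattern[:q] + c.
            while k > 0 and pattern[:k] != (pattern[:q] + c)[-k:]:
                k -= 1
            delta[q, c] = k
    return delta, len(pattern)

def contains(text, pattern):
    delta, accept = make_dfa_for_substring(pattern)
    q = 0
    for c in text:
        q = delta.get((q, c), 0)   # a character outside the alphabet resets the match
        if q == accept:
            return True
    return False

print(contains("abcabcd", "abcd"))  # True
```

Once the transition table is built, the search itself examines each text character exactly once, which is what makes automaton-based search fast.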
Yunlin Su, Song Y. Yan

Chapter 4. Lexical Analysis

Abstract
According to cognitive science, human understanding of language starts with word recognition; without this phase, the understanding of language cannot take place at all. Similarly, as the first phase of a compiler, the main task of the lexical analyzer is to read the input characters of the source program, group them into lexemes, and produce as output a sequence of tokens, one for each lexeme in the source program. Before we discuss the lexical analyzer further, however, we would like to discuss language understanding in terms of intelligence, to explore why lexical analysis is needed in language understanding.
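As a rough illustration of grouping characters into lexemes and emitting tokens, here is a minimal regular-expression scanner. The token names and categories are invented for this example, not taken from the book.

```python
import re

# Each token kind is a named regular expression; the scanner classifies
# every lexeme into a (kind, lexeme) pair and discards whitespace.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def tokenize(source):
    tokens = []
    for m in MASTER.finditer(source):
        if m.lastgroup != "SKIP":          # whitespace only separates lexemes
            tokens.append((m.lastgroup, m.group()))
    return tokens

print(tokenize("count = count + 1"))
# [('IDENT', 'count'), ('OP', '='), ('IDENT', 'count'), ('OP', '+'), ('NUMBER', '1')]
```

The parser that follows then works with the token kinds (IDENT, NUMBER, OP) rather than with raw characters, which is exactly the division of labor the abstract describes.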
Yunlin Su, Song Y. Yan

Chapter 5. Push-Down Automata and Context-Free Languages

Abstract
Push-down automata (PDA) form the most important class of automata between finite automata and Turing machines. As seen in the previous chapter, deterministic finite automata (DFA) cannot accept even very simple languages such as
$$\{\, x^n y^n \mid n \in \mathbb{N} \,\},$$
but fortunately there exists a more powerful machine, the push-down automaton, which can accept it. Just as finite automata come in deterministic (DFA) and nondeterministic (NFA) varieties, there are also two types of push-down automata: deterministic push-down automata (DPDA) and nondeterministic push-down automata (NPDA). The languages that can be accepted by PDA are called context-free languages (CFL), denoted by $L_{CF}$. Diagrammatically, a PDA is a finite state automaton (see Fig. 5.1) equipped with a memory (a push-down stack). In this chapter, we shall study PDA and their associated context-free languages $L_{CF}$. For the sake of completeness of automata theory and formal languages, we shall also study Turing machines and their associated languages.
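The stack discipline that lets a PDA accept this language can be sketched in a few lines: push a symbol for each x, pop one for each y, and accept only if the stack empties exactly at the end of the input. This is an illustrative simulation, not the book's formal construction.

```python
def accepts_xn_yn(s):
    # Stack-based recognizer for { x^n y^n | n in N }.
    stack = []
    seen_y = False
    for c in s:
        if c == "x":
            if seen_y:            # an 'x' after any 'y' can never be matched
                return False
            stack.append(c)
        elif c == "y":
            seen_y = True
            if not stack:         # more y's than x's so far
                return False
            stack.pop()
        else:
            return False          # symbol outside the alphabet
    return not stack              # accept iff every x was matched by a y

print(accepts_xn_yn("xxyy"))   # True
print(accepts_xn_yn("xxy"))    # False
```

A DFA cannot do this because its finite state set cannot count an unbounded number of x's; the unbounded stack is precisely the extra memory that makes the difference.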
Yunlin Su, Song Y. Yan

Chapter 6. Context-Free Grammars

Abstract
In the compilation of source programs, the second phase of the process is the syntactical analysis. Building on the lexical analysis, the syntactical analysis checks the correctness of the source program with respect to the grammar of the language used. It is well known that most properties of programming languages are context-free. Therefore, if we want to check whether a program is syntactically correct, we should naturally check whether its syntax is consistent with a context-free grammar, at least for the most part. To do so, the basis is a knowledge of context-free grammars. This chapter and Chapter 5 together form the preparation for the syntactical analysis.
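As a miniature example of checking input against a context-free grammar, the recognizer below handles the balanced-parentheses grammar S -> ( S ) S | ε by recursive descent. The grammar and function names are illustrative, not taken from the book.

```python
def balanced(s):
    # Recursive-descent recognizer for S -> "(" S ")" S | ε.
    def parse_S(i):
        # Try to parse an S starting at position i; return the position
        # just past it, or None on failure. The ε-alternative applies
        # whenever the next symbol is not "(".
        if i < len(s) and s[i] == "(":
            j = parse_S(i + 1)                     # inner S
            if j is None or j >= len(s) or s[j] != ")":
                return None                        # missing closing ")"
            return parse_S(j + 1)                  # trailing S
        return i                                   # ε-production
    return parse_S(0) == len(s)

print(balanced("(()())"))  # True
print(balanced("(()"))     # False
```

Each nonterminal of the grammar becomes one function, and each production alternative becomes one branch, which is why context-free grammars map so directly onto parsers.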
Yunlin Su, Song Y. Yan

Chapter 7. Syntax Analysis

Abstract
Syntax analysis is an essential step in the compilation of programs written in programming languages. In order to produce object programs executable on the computer, the source program has to be analyzed with respect to its correctness: the correctness of its lexicon, syntax, and semantics. In general, the construction of a compiler puts the first two, the analysis of the lexicon and the analysis of the syntax, into one phase, the analysis phase. Of the two, the syntax analysis is clearly the more important, although the lexical analysis forms its basis. In comparison with lexical analysis, syntax analysis has more issues to explore. In the history of compiler development, many researchers and software engineers have devoted much of their time to designing methods for syntax analysis. It has been regarded as the key to the success of compilers, and also as the most arduous task in their construction. Today there are many successful syntax-analysis techniques that are widely used in the construction of compilers, and a book devoted to the principles of compilers should introduce and discuss them. The aim of this chapter is to explain the role of syntax analysis, to introduce the important techniques used in it, and to compare these techniques in terms of their efficiency, ease of use, and other criteria.
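As a small taste of the top-down techniques this chapter covers, here is a recursive-descent, LL(1)-style parser-evaluator for the toy grammar E -> T { "+" T }, T -> "(" E ")" | number, in which a single token of lookahead selects each production. The grammar and names are illustrative assumptions, not the book's.

```python
def evaluate(tokens):
    pos = 0
    def peek():
        return tokens[pos] if pos < len(tokens) else None
    def eat(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok
    def parse_E():
        value = parse_T()
        while peek() == "+":       # one-token lookahead decides the production
            eat("+")
            value += parse_T()
        return value
    def parse_T():
        if peek() == "(":          # T -> "(" E ")"
            eat("(")
            value = parse_E()
            eat(")")
            return value
        return int(eat())          # T -> number
    result = parse_E()
    if peek() is not None:
        raise SyntaxError("trailing input")
    return result

print(evaluate(["(", "1", "+", "2", ")", "+", "3"]))  # 6
```

Because the next token alone determines which branch to take, the parser never backtracks; that determinism is exactly the LL(1) property.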
Yunlin Su, Song Y. Yan

Chapter 8. Attribute Grammars and Analysis

Abstract
In Chapter 7, we concentrated on parsing methods, i.e., the top-down and bottom-up syntactical methods, especially the LL(1) and LR(1) analysis methods. From that discussion, we can see that carrying out LL(1) or LR(1) analysis rests on the premise that the grammar to be analyzed is context-free; otherwise neither method works. In other words, in order to analyze a programming language with a top-down or bottom-up method, the language must be guaranteed to be context-free. Therefore, we need to explore how a programming language is described or defined. Before any programming language is designed, the designers must take into account the requirements of two sides. One is the requirements of the programmers who will use the language to develop programs: they want the language to be explicit, distinct, authoritative, and unambiguous, while also being easy to read and easy to use. The other is the requirements of the developers of the language's compiler: they want the structure of programs in the language to be easy to implement, so that the development of the compiler is also easy.
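The idea of attribute grammars can be previewed with synthesized attributes: each parse-tree node computes an attribute bottom-up from its children's attributes, as in an S-attributed grammar. The sketch below (hypothetical names, not the book's notation) evaluates a val attribute over a tiny parse tree for 2 + 3 under E -> E + T | T, T -> digit.

```python
class Node:
    def __init__(self, label, children=(), lexeme=None):
        self.label, self.children, self.lexeme = label, children, lexeme

def synthesize_val(node):
    # Post-order walk: children's attributes are computed before the
    # parent's, so each semantic rule sees fully evaluated operands.
    vals = [synthesize_val(c) for c in node.children]
    if node.label == "digit":
        return int(node.lexeme)                    # val = lexical value
    if node.label == "T":
        return vals[0]                             # T.val = digit.val
    if node.label == "E":
        # E -> T  gives one child; E -> E + T gives two (the "+" is implicit here)
        return vals[0] if len(vals) == 1 else vals[0] + vals[1]
    raise ValueError(node.label)

# Parse tree for 2 + 3 via E -> E + T
tree = Node("E", [Node("E", [Node("T", [Node("digit", lexeme="2")])]),
                  Node("T", [Node("digit", lexeme="3")])])
print(synthesize_val(tree))  # 5
```

Inherited attributes, which flow top-down instead, need a more careful evaluation order; that contrast is part of what the chapter develops.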
Yunlin Su, Song Y. Yan

Chapter 9. Algebraic Method of Compiler Design

Abstract
This chapter stands independent of the preceding chapters. It introduces a brand-new method for the design of compilers for procedure-oriented programming languages. The method is based on the fact that these languages satisfy algebraic laws. The practical strategy is to reduce any source program to a canonical form through a series of algebraic transformations, where the canonical form precisely specifies the features of the object machine. The outstanding characteristic of the method is that the correctness of the compiler is ensured by the correctness of each algebraic transformation, while the correctness of these transformations is in turn proven from more basic laws.
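A toy flavor of reduction by algebraic laws (not the book's actual algebra): rewrite a program term using laws such as skip ; P = P and P ; skip = P until no law applies, yielding a miniature "canonical form". The term representation and law set here are invented for illustration.

```python
def reduce(term):
    # term: "skip", an atomic command name, or ("seq", t1, t2) for t1 ; t2.
    # Reduce bottom-up, then apply the unit laws of sequential composition.
    if isinstance(term, tuple) and term[0] == "seq":
        left, right = reduce(term[1]), reduce(term[2])
        if left == "skip":        # law: skip ; P  =  P
            return right
        if right == "skip":       # law: P ; skip  =  P
            return left
        return ("seq", left, right)
    return term

print(reduce(("seq", "skip", ("seq", "x:=1", "skip"))))  # 'x:=1'
```

Each rewrite step is justified by a provable law, so the correctness of the whole reduction follows from the correctness of the individual laws, which is the essence of the approach described above.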
Yunlin Su, Song Y. Yan

Chapter 10. Generation of Intermediate Code

Abstract
After finishing the first phase of compilation, i.e., the analytical phase, we naturally enter the second, synthetic phase. The main task of this phase is to generate the target code. Why don't we directly generate the target code instead of bothering to generate intermediate code first? This question will be answered in this chapter.
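A common shape of intermediate code is three-address code, in which each instruction applies one operator and stores the result in a fresh temporary. The generator below, which flattens an expression tree, is an illustrative sketch with invented names, not the book's notation.

```python
def gen_three_address(expr, code, counter=None):
    # expr: a variable name (str) or a tuple (op, left_subtree, right_subtree).
    # Appends instructions to `code` and returns the name holding the result.
    if counter is None:
        counter = [0]
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    l = gen_three_address(left, code, counter)
    r = gen_three_address(right, code, counter)
    counter[0] += 1
    temp = f"t{counter[0]}"                  # fresh temporary for this operator
    code.append(f"{temp} = {l} {op} {r}")
    return temp

code = []
gen_three_address(("+", "a", ("*", "b", "c")), code)
for line in code:
    print(line)
# t1 = b * c
# t2 = a + t1
```

Such a flat, machine-independent form is easy to optimize and easy to translate to many different target machines, which is one classic answer to the question the abstract poses.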
Yunlin Su, Song Y. Yan

Chapter 11. Debugging and Optimization

Abstract
Before turning from intermediate code generation to target code generation, we still face a number of tasks that need to be done.
Yunlin Su, Song Y. Yan

Chapter 12. Storage Management

Abstract
The task of a compiler is to translate source programs into target programs. Strictly speaking, therefore, a compiler has nothing to do with storage management; storage management is not its task. However, a compiler can only work while it resides in memory, and the target code it generates is also in memory. During compilation, the compiler must consider the layout of the source program, the various tables, and the placement of the intermediate code and target code. If the layout is not appropriate, the compiler cannot access storage efficiently, and its work cannot be efficient either. In this sense, therefore, the compiler has an intimate relation with storage management.
Yunlin Su, Song Y. Yan

Chapter 13. Generation of Object Code

Abstract
Having travelled through lexical analysis, syntactical analysis, semantic analysis, generation of intermediate code, and debugging and optimization, we are now ready for the last phase: the generation of target code. But before we formally discuss the matter, some preparation is still needed.
Yunlin Su, Song Y. Yan

Chapter 14. Compilation of Object-oriented Languages

Abstract
In the previous chapters, we explored grammars, lexical analysis, syntactical analysis, context processing, intermediate code generation, target code generation, debugging, and the optimization of programs in programming languages. The management of memory for the execution of the compiler and of the target code was also covered. These contents are directed at a variety of programming languages rather than any specific language. However, we should also admit that they are more suitable for the so-called procedural, or imperative, languages.
Yunlin Su, Song Y. Yan

Chapter 15. Compilation of Parallel Languages

Abstract
This chapter is quite different from the previous ones, since so far there is no truly new parallel programming language in real use; what people have are only parallel facilities grafted onto existing languages. The discussion so far has mainly focused on the compilation of sequential programming languages; in this chapter and the following one we discuss the compilation of parallel languages. Is it important? The answer is definitely yes. In the new century, parallel computers will no doubt dominate the computer market, and developing programs for such computers will be a must for anyone who wants to be a programmer of the new century. Of course, he or she also needs to know how such programs are compiled by the corresponding compilers. It is no exaggeration to say that parallel compilation will become a main theme of the compiler field.
Yunlin Su, Song Y. Yan

Chapter 16. Compilation of Grid Computing

Abstract
The rise of grid computing is, no doubt, a new development that emerged at the end of the last century and the beginning of this one, and it has brought many new things with it. As it becomes a reality and gradually occupies a certain share of the market, everything involved should be taken into account, especially since it is not yet mature. For the problems that remain open, we should make our effort to solve them and contribute to their solutions. This chapter presents only a brief introduction to grid computing. We believe that more knowledge about it will accumulate as time goes on, and that the contents of this topic will become more bountiful in the future.
Yunlin Su, Song Y. Yan

Backmatter
