
2003 | Book

Compiler Construction

12th International Conference, CC 2003, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2003, Warsaw, Poland, April 7–11, 2003. Proceedings

Edited by: Görel Hedin

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the 12th International Conference on Compiler Construction, CC 2003, held in Warsaw, Poland, in April 2003.

The 20 revised full regular papers and one tool demonstration paper presented together with two invited papers were carefully reviewed and selected from 83 submissions. The papers are organized in topical sections on register allocation, language constructs and their implementation, type analysis, Java, pot pourri, and optimization.

Table of Contents

Frontmatter

Register Allocation

Combined Code Motion and Register Allocation Using the Value State Dependence Graph
Abstract
We define the Value State Dependence Graph (VSDG). The VSDG is a form of the Value Dependence Graph (VDG) extended by the addition of state dependence edges to model sequentialised computation. These express store dependencies and loop termination dependencies of the original program. We also exploit them to express the additional serialization inherent in producing final object code.
The central idea is that this latter serialization can be done incrementally so that we have a class of algorithms which effectively interleave register allocation and code motion, thereby avoiding a well-known phase-order problem in compilers. This class operates by first normalizing the VSDG during construction, to remove all duplicated computation, and then repeatedly choosing between: (i) allocating a value to a register, (ii) spilling a value to memory, (iii) moving a loop-invariant computation within a loop to avoid register spillage, and (iv) statically duplicating a computation to avoid register spillage.
We show that the classical two-phase approaches (code motion followed by register allocation, in both Chow and Chaitin forms) are examples of this class, and propose a new algorithm based on depth-first cuts of the VSDG.
Neil Johnson, Alan Mycroft
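
The interplay of state edges and normalization can be pictured with a small sketch. Below is a minimal Python model, using an invented node representation, of hash-consed VSDG construction: building an identical (op, value-deps, state-deps) node a second time returns the existing node, which is the duplicate removal the abstract describes. This is a sketch of the normalization idea, not the authors' implementation.

```python
# Toy VSDG with hash-consing: identical computations are shared, never duplicated.
class VSDG:
    def __init__(self):
        self._table = {}          # structural key -> existing node id
        self.nodes = []           # node id -> (op, value_deps, state_deps)

    def add(self, op, value_deps=(), state_deps=()):
        key = (op, tuple(value_deps), tuple(state_deps))
        if key in self._table:    # same op, same dependencies:
            return self._table[key]   # reuse the existing node
        self.nodes.append(key)
        nid = len(self.nodes) - 1
        self._table[key] = nid
        return nid

g = VSDG()
a = g.add("load_x")               # two textually separate loads of x...
b = g.add("load_x")               # ...normalize to one node
c = g.add("add", (a, b))
assert a == b and len(g.nodes) == 2
```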
Early Control of Register Pressure for Software Pipelined Loops
Abstract
Register allocation in loops is generally performed after or during software pipelining, because a conventional register allocation performed first, without assuming a schedule, lacks information about the interferences between variable lifetime intervals. The register allocator may therefore introduce an excessive number of false dependences that dramatically reduce the ILP (Instruction Level Parallelism). We present a new framework for controlling register pressure before software pipelining. It is based on inserting anti-dependence edges (register reuse edges) labeled with reuse distances directly into the data dependence graph. In this new graph, we can guarantee that the number of simultaneously alive variables in any schedule does not exceed a given limit. The determination of register reuse and distances is parameterized by the desired critical circuit ratio (MII) as well as by the register pressure constraints; either can be minimized while the other is fixed. After scheduling, register allocation is done cyclically on conventional register sets or on rotating register files. We give an optimal exact model, and an approximate one that generalizes the Ning-Gao [13] buffer optimization heuristic.
Sid-Ahmed-Ali Touati, Christine Eisenbeis
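
The quantity the framework bounds is the number of simultaneously alive variables under any schedule. As an illustration of that quantity alone (not of the reuse-edge machinery, and in acyclic rather than modulo time), here is a Python sweep that computes MaxLive from variable lifetime intervals:

```python
# MaxLive: the peak number of simultaneously alive values in a schedule.
def max_live(intervals):
    """intervals: list of (def_time, last_use_time) pairs."""
    events = []
    for start, end in intervals:
        events.append((start, +1))      # value becomes live at its definition
        events.append((end, -1))        # value dies after its last use
    live = peak = 0
    for _, delta in sorted(events):     # sweep the schedule left to right
        live += delta
        peak = max(peak, live)
    return peak

print(max_live([(0, 4), (1, 3), (2, 6)]))   # -> 3
```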
Register Allocation by Optimal Graph Coloring
Abstract
We present new insights into the properties of real-life interference graphs arising in register allocation, which suggest that the coloring approach can be pushed towards optimal solutions. The conclusions are based on measurements of nearly 28,000 real instances of such interference graphs. All of the instances explored are determined to possess the so-called 1-perfectness property, which appears to make them easy to color optimally. The exact algorithms presented not only produce better solutions than the traditional heuristic methods, but also perform surprisingly fast, according to measurements of our implementations.
Christian Andersson
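
As a rough illustration of why 1-perfectness helps (the chromatic number equals the size of the largest clique), here is a generic greedy colorer in Python; with a suitable node ordering, greedy assignment on such graphs can reach the optimum. The largest-degree-first ordering used here is a simple stand-in, not the paper's algorithm.

```python
# Greedy coloring: each node gets the smallest color unused by its neighbors.
def greedy_color(graph):
    """graph: dict mapping each node to the set of nodes it interferes with."""
    order = sorted(graph, key=lambda v: len(graph[v]), reverse=True)
    color = {}
    for v in order:
        taken = {color[u] for u in graph[v] if u in color}
        c = 0
        while c in taken:               # smallest color not taken by a neighbor
            c += 1
        color[v] = c
    return color

interference = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
print(greedy_color(interference))       # triangle a,b,c needs 3 colors; d reuses one
```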

Language Constructs and Their Implementation

A Compilation and Optimization Model for Aspect-Oriented Programs
Abstract
This paper presents a compilation model for an aspect-oriented programming language, based on its operational semantics. Using partial evaluation, the model can explain several issues in compilation processes, including how to find the places in program text at which to insert aspect code and how to remove unnecessary run-time checks. It also illustrates the optimization of calling-context-sensitive pointcuts (cflow) implemented in real compilers.
H. Masuhara, G. Kiczales, C. Dutchyn
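
A hedged sketch of the partial-evaluation idea: pointcuts whose match is decided at compile time get advice woven unconditionally (or not at all), while only the undecidable residue, such as cflow, keeps a run-time check. The pointcut encoding below is invented for the sketch.

```python
# Statically evaluating a pointcut at each join-point shadow.
import fnmatch
ALWAYS, NEVER, DYNAMIC = "always", "never", "dynamic"

def match(pc, shadow):
    kind = pc[0]
    if kind == "call":                    # call(name-pattern): fully static
        return ALWAYS if fnmatch.fnmatch(shadow, pc[1]) else NEVER
    if kind == "cflow":                   # calling context: not static here
        return DYNAMIC
    if kind == "and":
        a, b = match(pc[1], shadow), match(pc[2], shadow)
        if NEVER in (a, b):
            return NEVER                  # no advice code inserted at all
        return DYNAMIC if DYNAMIC in (a, b) else ALWAYS

pc = ("and", ("call", "set*"), ("cflow", ("call", "update")))
for shadow in ["setX", "getX"]:
    print(shadow, "->", match(pc, shadow))
# setX -> dynamic (advice guarded by a residual cflow check); getX -> never
```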
A Pattern Matching Compiler for Multiple Target Languages
Abstract
Many processes can be seen as transformations of tree-like data structures. In compiler construction, for example, we continuously manipulate trees and perform tree transformations. This paper introduces a pattern matching compiler (Tom): a set of primitives which add pattern matching facilities to imperative languages such as C, Java, or Eiffel. We show that this tool is extremely non-intrusive, lightweight and useful to implement tree transformations. It is also flexible enough to allow the reuse of existing data structures.
Pierre-Etienne Moreau, Christophe Ringeissen, Marian Vittek
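
The core operation such a tool compiles into host-language code is first-order matching of a pattern with variables against a term tree. A minimal Python sketch, with an invented term encoding:

```python
# Match a pattern (variables are strings starting with "?") against a term.
def match(pattern, term, subst=None):
    subst = {} if subst is None else subst
    if isinstance(pattern, str) and pattern.startswith("?"):   # variable
        bound = subst.setdefault(pattern, term)
        return subst if bound == term else None    # handles nonlinear patterns
    if isinstance(pattern, tuple) and isinstance(term, tuple) \
            and pattern[0] == term[0] and len(pattern) == len(term):
        for p, t in zip(pattern[1:], term[1:]):    # same constructor: match children
            if match(p, t, subst) is None:
                return None
        return subst
    return subst if pattern == term else None

# Matching the left-hand side of a rewrite rule plus(x, zero) -> x:
print(match(("plus", "?x", ("zero",)), ("plus", ("succ", ("zero",)), ("zero",))))
# {'?x': ('succ', ('zero',))}
```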
A New One-Pass Transformation into Monadic Normal Form
Abstract
We present a translation from the call-by-value λ-calculus to monadic normal forms that includes short-cut boolean evaluation. The translation is higher-order, operates in one pass, duplicates no code, generates no chains of thunks, and is properly tail recursive. It makes a crucial use of symbolic computation at translation time.
Olivier Danvy
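
A Python rendering of the one-pass, higher-order style of translation the abstract describes, in the tradition of A-normalization: the translation threads a meta-level continuation, emits each let binding exactly once, and duplicates no code. The term encoding is invented; this is a sketch of the technique's family, not the paper's exact translation.

```python
# One-pass transformation of call-by-value lambda terms into let-normal form.
import itertools
_fresh = (f"t{i}" for i in itertools.count())

def normalize(m, k):
    tag = m[0]
    if tag in ("var", "lit"):
        return k(m)
    if tag == "lam":                       # normalize the body independently
        return k(("lam", m[1], normalize(m[2], lambda v: v)))
    if tag == "app":                       # name both subterms, then continue
        return atomize(m[1], lambda f:
               atomize(m[2], lambda a:
               bind(("app", f, a), k)))

def atomize(m, k):
    return normalize(m, lambda n: k(n) if n[0] != "app" else bind(n, k))

def bind(computation, k):
    t = next(_fresh)                       # emit: let t = computation in k(t)
    return ("let", t, computation, k(("var", t)))

term = ("app", ("app", ("var", "f"), ("var", "x")), ("var", "y"))
print(normalize(term, lambda v: v))
# ('let', 't0', ('app', ('var','f'), ('var','x')),
#  ('let', 't1', ('app', ('var','t0'), ('var','y')), ('var','t1')))
```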

Type Analysis

Run-Time Type Checking for Binary Programs
Abstract
Many important software systems are written in the C programming language. Unfortunately, the C language does not provide strong safety guarantees, and many common programming mistakes introduce type errors that are not caught by the compiler. These errors only manifest themselves at run time through unexpected program behavior, and it is often hard to isolate and identify their causes. This paper presents the Hobbes run-time type checker for compiled C programs. Our tool interprets compiled binaries, tracks type information for all memory and register locations, and reports warnings when a variety of type errors occur. Because the Hobbes type checker does not rely on source code, it is effective in many situations where similar tools are not, such as when full source code is not available or when C source is linked with program fragments written in assembly or other languages.
Michael Burrows, Stephen N. Freund, Janet L. Wiener
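
The essence of the approach can be sketched as shadow memory: a side table records the type last stored at each location, and every read is checked against the type the reader expects. A toy Python model, not Hobbes itself:

```python
# Shadow memory in miniature: track types of stores, check them on loads.
memory, shadow = {}, {}

def store(addr, value, ty):
    memory[addr] = value
    shadow[addr] = ty                      # remember what lives at this address

def load(addr, expected_ty):
    actual = shadow.get(addr, "uninitialized")
    if actual != expected_ty:              # type error at run time: warn
        print(f"warning: read {expected_ty} at {addr:#x}, found {actual}")
    return memory.get(addr)

store(0x1000, 3.14, "double")
load(0x1000, "int*")    # warning: read int* at 0x1000, found double
```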
Precision in Practice: A Type-Preserving Java Compiler
Abstract
Popular mobile code architectures (Java and .NET) include verifiers to check for memory safety and other security properties. Since their formats are relatively high level, supporting a wide range of source language features is awkward. Further compilation and optimization, necessary for efficiency, must be trusted. We describe the design and implementation of a fully type-preserving compiler for Java and ML. Its strongly-typed intermediate language provides a low-level abstract machine model and a type system general enough to prove the safety of a variety of implementation techniques. We show that precise type preservation is within reach for real-world Java systems.
Christopher League, Zhong Shao, Valery Trifonov
The MAGICA Type Inference Engine for MATLAB®
Abstract
MAGICA (MAthematica system for General-purpose Inferring and Compile-time Analyses) is an extensible inference engine that can determine the types (value range, intrinsic type, and array shape) of expressions in a MATLAB program. Written as a Mathematica application, it is designed as an add-on module that any MATLAB compiler infrastructure can use to obtain high-quality type inferences.
Pramod G. Joisha, Prithviraj Banerjee
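
As a toy illustration of one of the three inferred properties, here is a Python sketch of shape propagation through expressions; the rules are simplified stand-ins, not MAGICA's.

```python
# Propagating array shapes through MATLAB-like expressions.
def infer_shape(op, s1, s2):
    if op == "elementwise":                # shapes must agree, or one is scalar
        if s1 == (1, 1): return s2
        if s2 == (1, 1): return s1
        if s1 == s2: return s1
    if op == "matmul":                     # (m,k) * (k,n) -> (m,n)
        (m, k), (k2, n) = s1, s2
        if k == k2: return (m, n)
    raise TypeError(f"shape mismatch: {op} {s1} {s2}")

print(infer_shape("matmul", (3, 4), (4, 2)))        # (3, 2)
print(infer_shape("elementwise", (3, 2), (1, 1)))   # (3, 2)
```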

CC Invited Talk

Dimensions of Precision in Reference Analysis of Object-Oriented Programming Languages
Abstract
There has been approximately a ten year history of reference analyses for object-oriented programming languages. Approaches vary as to how different analyses account for program execution flow, how they capture calling context, and how they model objects, reference variables and the possible calling structure of the program. A taxonomy of analysis dimensions that affect precision (and cost) will be presented and illustrated by examples of existing reference analysis techniques.
Barbara G. Ryder

Java

Polyglot: An Extensible Compiler Framework for Java
Abstract
Polyglot is an extensible compiler framework that supports the easy creation of compilers for languages similar to Java, while avoiding code duplication. The Polyglot framework is useful for domain-specific languages, exploration of language design, and for simplified versions of Java for pedagogical use. We have used Polyglot to implement several major and minor modifications to Java; the cost of implementing language extensions scales well with the degree to which the language differs from Java. This paper focuses on the design choices in Polyglot that are important for making the framework usable and highly extensible. Polyglot source code is available.
Nathaniel Nystrom, Michael R. Clarkson, Andrew C. Myers
Scaling Java Points-to Analysis Using Spark
Abstract
Most points-to analysis research has been done on different systems by different groups, making it difficult to compare results and to understand the interactions between the individual factors each group studied. Furthermore, points-to analysis for Java has been studied much less thoroughly than for C, and the tradeoffs appear very different. We introduce Spark, a flexible framework for experimenting with points-to analyses for Java. Spark supports equality- and subset-based analyses, variations in field sensitivity, respect for declared types, variations in call graph construction, off-line simplification, and several solving algorithms. Spark is composed of building blocks on which new analyses can be based.
We demonstrate Spark in a substantial study of factors affecting precision and efficiency of subset-based points-to analyses, including interactions between these factors. Our results show that Spark is not only flexible and modular, but also offers superior time/space performance when compared to other points-to analysis implementations.
Ondřej Lhoták, Laurie Hendren
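
A minimal Python sketch of the subset-based style of analysis Spark supports: allocation sites seed points-to sets, assignments induce inclusion constraints, and the solver iterates to a fixed point. Field sensitivity, call-graph construction, and Spark's actual solving algorithms are all elided.

```python
# Subset-based (inclusion) points-to analysis, solved naively to a fixpoint.
def solve(allocs, assigns):
    """allocs: [(p, site)] for p = new site; assigns: [(p, q)] for p = q."""
    pts = {}
    for p, site in allocs:
        pts.setdefault(p, set()).add(site)     # seed with allocation sites
    changed = True
    while changed:                             # propagate until nothing changes
        changed = False
        for p, q in assigns:                   # constraint: pts(q) subset of pts(p)
            new = pts.get(q, set()) - pts.get(p, set())
            if new:
                pts.setdefault(p, set()).update(new)
                changed = True
    return pts

print(solve(allocs=[("a", "o1"), ("b", "o2")],
            assigns=[("c", "a"), ("c", "b"), ("d", "c")]))
# {'a': {'o1'}, 'b': {'o2'}, 'c': {'o1','o2'}, 'd': {'o1','o2'}}
```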
Effective Inline-Threaded Interpretation of Java Bytecode Using Preparation Sequences
Abstract
Inline-threaded interpretation is a recent technique that improves performance by eliminating dispatch overhead within basic blocks for interpreters written in C [11]. The dynamic class loading, lazy class initialization, and multi-threading features of Java reduce the effectiveness of a straightforward implementation of this technique within Java interpreters. In this paper, we introduce preparation sequences, a new technique that solves the particular challenge of effectively inline-threading Java. We have implemented our technique in the SableVM Java virtual machine, and our experimental results show that using our technique, inline-threaded interpretation of Java, on a set of benchmarks, achieves a speedup ranging from 1.20 to 2.41 over switch-based interpretation, and a speedup ranging from 1.15 to 2.14 over direct-threaded interpretation.
Etienne Gagnon, Laurie Hendren
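
Inline-threading itself works at the level of copied machine code, which Python cannot express; the sketch below only models the preparation idea, translating bytecode once into a list of handler references so the execution loop performs no opcode dispatch. The bytecode format and handlers are invented for the sketch.

```python
# Preparation pass: pre-resolve each opcode to its handler, then execute
# handlers back to back with no per-instruction opcode lookup.
def run(bytecode, handlers):
    prepared = [handlers[op] for op, _ in bytecode]    # one-time preparation
    args = [arg for _, arg in bytecode]
    stack, pc = [], 0
    while pc < len(prepared):
        pc = prepared[pc](stack, args[pc], pc)         # handler returns next pc
    return stack

H = {
    "push": lambda st, a, pc: (st.append(a), pc + 1)[1],
    "add":  lambda st, a, pc: (st.append(st.pop() + st.pop()), pc + 1)[1],
}
print(run([("push", 2), ("push", 40), ("add", None)], H))   # [42]
```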
Integrating Generations with Advanced Reference Counting Garbage Collectors
Abstract
We study the incorporation of generations into a modern reference counting collector. We start with the two on-the-fly collectors suggested by Levanoni and Petrank: a reference counting collector and a tracing (mark and sweep) collector. We then propose three designs for combining them so that the reference counting collector collects the young generation, the old generation, or both. Our designs maintain the good properties of the Levanoni-Petrank collector: in particular, they are suitable for a multithreaded environment and a multiprocessor platform, and they have an efficient write barrier with no synchronization operations. To the best of our knowledge, the use of generations with reference counting has not been tried before.
We have implemented these algorithms with the Jikes JVM and compared them against the concurrent reference counting collector supplied with the Jikes package. As expected, the best combination is the one that lets the tracing collector work on the young generation (where most objects die) and the reference counting collector work on the old generation (where many objects survive). Matching the expected survival rate with the nature of the collector yields a large improvement in throughput while keeping pause times at around a couple of milliseconds.
Hezi Azatchi, Erez Petrank
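
A toy model of the kind of update-coalescing write barrier the Levanoni-Petrank collector uses, on which these designs build: the first update of each slot per cycle logs the old target without synchronization, and reference counts are adjusted only at collection time. Generations, concurrency, and cycle collection are all elided; the object encoding is invented.

```python
# Update-coalescing reference counting: log each slot's old value once per
# cycle; at collection, decrement old targets and increment current ones.
class Obj:
    def __init__(self, name):
        self.name, self.rc, self.slots = name, 0, {}

log, dirty = [], set()

def write(obj, slot, new):
    if (obj, slot) not in dirty:           # first write this cycle: log old value
        dirty.add((obj, slot))
        log.append((obj, slot, obj.slots.get(slot)))
    obj.slots[slot] = new                  # later writes to the slot are free

def collect():
    for obj, slot, old in log:
        if old: old.rc -= 1                # target the slot used to reference
        cur = obj.slots.get(slot)
        if cur: cur.rc += 1                # target the slot references now
    log.clear(); dirty.clear()

a, b, c = Obj("a"), Obj("b"), Obj("c")
write(a, "f", b); write(a, "f", c)         # b's brief reference is coalesced away
collect()
print(b.rc, c.rc)                          # 0 1
```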

Pot Pourri

The Interprocedural Express-Lane Transformation
Abstract
The express-lane transformation isolates and duplicates frequently executed program paths, aiming for better data-flow facts along the duplicated paths. An express-lane p is a copy of a frequently executed program path such that p has only one entry point at its beginning; p may have branches back to the original code, but the original code never branches into p. Classical data-flow analysis is likely to find sharper data-flow facts along an express-lane, because there are no join points.
This paper describes several variants of interprocedural express-lane transformations; these duplicate hot interprocedural paths, i.e., paths that may cross procedure boundaries. The paper also reports results from an experimental study of the effects of the express-lane transformation on interprocedural range analysis.
David Melski, Thomas Reps
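
A small Python sketch of path duplication with the single-entry property the abstract defines: clones chain along the hot path, may branch back to the original code, and the original code never jumps into the middle of the clone. The CFG encoding is invented, and the interprocedural aspects are elided.

```python
# Duplicate a hot path p so that p has one entry point at its beginning.
def express_lane(cfg, path, entry_pred):
    clone = {n: n + "'" for n in path}
    new_cfg = {n: list(s) for n, s in cfg.items()}
    for i, n in enumerate(path):
        succs = []
        for s in cfg[n]:
            on_lane = i + 1 < len(path) and s == path[i + 1]
            succs.append(clone[s] if on_lane else s)   # off-path: back to original
        new_cfg[clone[n]] = succs
    # single entry: redirect only the chosen predecessor edge to the clone head
    new_cfg[entry_pred] = [clone[path[0]] if s == path[0] else s
                           for s in cfg[entry_pred]]
    return new_cfg

cfg = {"entry": ["A"], "A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(express_lane(cfg, ["A", "B", "D"], "entry"))
# the clone A'->B'->D' has no join points, so data-flow facts stay sharp
```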
Automatic Detection of Uninitialized Variables
Abstract
One of the most common programming errors is the use of a variable before its definition. The undefined value may produce incorrect results, memory violations, unpredictable behavior, and program failure. Two approaches can be used to detect this kind of error: compile-time analysis and run-time checking. However, compile-time analysis is far from perfect because of complicated data and control flow as well as arrays with non-linear or indirect subscripts. On the other hand, dynamic checking, although supported by hardware and compiler techniques, is costly due to heavy code instrumentation, while information available at compile time is not taken into account.
This paper presents a combination of an efficient compile-time analysis and source code instrumentation for run-time checking. All kinds of variables are checked by PIPS, a Fortran research compiler for program analysis, transformation, parallelization, and verification. Uninitialized array elements are detected using imported array regions, an efficient interprocedural array data-flow analysis. If exact array regions cannot be computed and compile-time information is insufficient, array elements are initialized to a special value and each use is accompanied by a value test asserting the legality of the access. In comparison to dynamic instrumentation, our method greatly reduces the number of variables that must be initialized and checked: code instrumentation is only needed for some array sections, not for whole arrays, and tests are generated as early as possible. In addition, programs can be proved free of used-before-set errors statically at compile time or, on the contrary, shown to contain real undefinedness errors. Experiments on SPEC95 CFP show encouraging results on analysis cost and run-time overheads.
Thi Viet Nga Nguyen, François Irigoin, Corinne Ancourt, Fabien Coelho
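
The run-time half of the scheme can be sketched in a few lines: elements the analysis cannot prove initialized are set to a special value, and each use is guarded by a test. A Python toy standing in for the Fortran instrumentation (in PIPS the compile-time half removes most of these checks):

```python
# Sentinel-based detection of used-before-set array elements.
_UNDEF = object()                    # stands in for the special bit pattern

def fresh_array(n):
    return [_UNDEF] * n              # every element starts "undefined"

def checked_read(arr, i, name):
    if arr[i] is _UNDEF:             # the guarding value test
        raise RuntimeError(f"{name}({i}) used before set")
    return arr[i]

a = fresh_array(10)
a[3] = 7.0
checked_read(a, 3, "a")              # fine
checked_read(a, 4, "a")              # RuntimeError: a(4) used before set
```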
Generalised Regular Parsers
Abstract
Aycock and Horspool have given an algorithm which improves the efficiency of GLR parsers. A grammar is ‘reduced’ so that there is no recursion apart from non-hidden left recursion, an FA recogniser is then constructed, and a stack is used when the recursive parts of the original grammar are required. Aycock and Horspool then give an algorithm which performs all possible traversals of the resulting PDA on a given input string. This mirrors the approach taken by Tomita to perform all traversals of an LR(0) FA. However, Aycock and Horspool’s algorithm does not terminate in the case where the grammar contains hidden left recursion. In this paper we give a different method for constructing an FA which recognises the language generated by the grammar, provided that the only recursion in the grammar arises from left or right recursion. Using this FA allows us to reduce the number of places where the stack is required. We also give a different algorithm for constructing all traversals of the final PDA which is correct in all cases, including grammars with hidden left recursion. Thus we can apply our algorithm to all context-free grammars.
Adrian Johnstone, Elizabeth Scott
Rapid and Robust Compiler Construction Using Template-Based Metacompilation
Abstract
We have developed Tm, a template-based metacompiler. Given a set of data-structure definitions and a template, Tm generates files that instantiate the template for the given data structures. With this process, Tm is able to generate program code to manipulate these data structures. Since it uses templates, the generated code is not restricted to a specific programming language: any sufficiently powerful programming language can be targeted.
Tm has been used for a wide variety of tasks and languages. However, it was designed to support compiler construction, and most applications have been in that area.
In this paper we outline Tm, and describe our experiences with using it to construct a static compiler for Java. As we will show, it has significantly accelerated implementation of the compiler. Almost 75% of its source code is generated by Tm, allowing us to rapidly implement a much more robust and sophisticated compiler than would have been possible otherwise.
C. van Reeuwijk
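
Template instantiation in miniature: data-structure definitions plus a template yield generated code, independent of any particular target language. Both the definition format and the template syntax below are invented for the sketch; they are not Tm's.

```python
# Instantiate one class template per data-structure definition.
types = {
    "Num": [("value", "int")],
    "Add": [("left", "Expr"), ("right", "Expr")],
}
template = """class {name}(Expr):
    def __init__(self, {params}):
{assigns}
"""

for name, fields in types.items():
    params = ", ".join(f for f, _ in fields)
    assigns = "\n".join(f"        self.{f} = {f}  # {t}" for f, t in fields)
    print(template.format(name=name, params=params, assigns=assigns))
```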

ETAPS Invited Talk

The Verifying Compiler: A Grand Challenge for Computing Research
Abstract
I propose a set of criteria which distinguish a grand challenge in science or engineering from the many other kinds of short-term or long-term research problems that engage the interest of scientists and engineers. As an example drawn from Computer Science, I revive an old challenge: the construction and application of a verifying compiler that guarantees correctness of a program before running it.
Tony Hoare

Optimization

Address Register Assignment for Reducing Code Size
Abstract
In DSP processors, minimizing the amount of address calculations is critical for reducing code size and improving performance since studies of programs have shown that instructions that manipulate address registers constitute a significant portion of the overall instruction count (up to 55%). This work presents a compiler-based optimization strategy to reduce the code size in embedded systems. Our strategy maximizes the use of indirect addressing modes with post-increment and post-decrement capabilities available in DSP processors. These modes can be exploited by ensuring that successive references to variables access consecutive memory locations. To achieve this spatial locality, our approach uses both access pattern modification (program code restructuring) and memory storage reordering (data layout restructuring).
M. Kandemir, M. J. Irwin, G. Chen, J. Ramanujam
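
The payoff of such layouts can be quantified with a small model: a step of ±1 between consecutive variable accesses rides a free post-increment or post-decrement, while any larger step costs an explicit address-register instruction. A hedged Python sketch with an invented encoding:

```python
# Count the explicit address computations a given memory layout costs.
def addr_ops(access_seq, offset):
    """offset: variable -> memory slot assigned by the layout."""
    ops, ar = 0, None                      # ar models the address register
    for v in access_seq:
        if ar is not None and abs(offset[v] - ar) > 1:
            ops += 1                       # step > 1: explicit address arithmetic
        ar = offset[v]                     # steps of +/-1 are free post-inc/dec
    return ops

seq = ["a", "b", "c", "b", "a"]
print(addr_ops(seq, {"a": 0, "b": 1, "c": 2}))   # 0: every step is +/-1
print(addr_ops(seq, {"a": 0, "b": 2, "c": 1}))   # 2 explicit updates
```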
Offset Assignment Showdown: Evaluation of DSP Address Code Optimization Algorithms
Abstract
Offset assignment is a highly effective DSP address code optimization technique that has been implemented in a number of ANSI C compilers. In this paper we concentrate on a special class of offset assignment problems called “simple offset assignment” (SOA). A number of SOA algorithms have been proposed recently, but experimental results and direct comparisons are still sparse. This makes the practical selection of a suitable SOA algorithm for implementation in a compiler very difficult. This paper aims at closing this gap by providing a comprehensive benchmark suite and empirical evaluation based on real-life application programs. Our results for the first time permit a detailed assessment of all major SOA algorithms. In addition, we propose a new and superior combination of SOA heuristics.
Rainer Leupers
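
The flavor of a classic SOA heuristic (in the style of Liao's) in Python: build the access graph from the access sequence, pick edges in decreasing weight while keeping the chosen edges a union of paths, and pay one address instruction for each adjacent access pair left uncovered. A simplified sketch, not any of the benchmarked algorithms as implemented.

```python
# Greedy simple offset assignment: maximum-weight path cover of the access graph.
from collections import Counter

def soa(access_seq):
    w = Counter()
    for a, b in zip(access_seq, access_seq[1:]):
        if a != b:
            w[frozenset((a, b))] += 1        # edge weight = adjacency count
    degree, comp = Counter(), {v: v for v in set(access_seq)}
    def find(v):                             # union-find: forbid cycles
        while comp[v] != v:
            v = comp[v]
        return v
    cover = 0
    for e in sorted(w, key=w.get, reverse=True):
        a, b = tuple(e)
        if degree[a] < 2 and degree[b] < 2 and find(a) != find(b):
            degree[a] += 1; degree[b] += 1   # max degree 2: edges form paths
            comp[find(a)] = find(b)
            cover += w[e]                    # these adjacencies become free
    return sum(w.values()) - cover           # explicit address ops remaining

print(soa(["a", "b", "a", "c", "d", "c", "b"]))   # -> 1
```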
Integrating High-Level Optimizations in a Production Compiler: Design and Implementation Experience
Abstract
The High-Level Optimizer (HLO) is a key part of the compiler technology that enabled Itanium™ and Itanium™ 2 processors to deliver leading floating-point performance at their introduction. In this paper, we discuss our design and implementation experience in integrating diverse optimizations in the HLO module. In particular, we describe decisions made in the design of HLO targeting the Itanium processor family, and we provide empirical data to validate these design decisions. Since HLO was implemented in a production compiler, we made certain engineering trade-offs; we discuss these trade-offs and outline key lessons learned from our experience.
Somnath Ghosh, Abhay Kanhere, Rakesh Krishnaiyer, Dattatraya Kulkarni, Wei Li, Chu-Cheow Lim, John Ng
Improving Data Locality by Chunking
Abstract
Cache memories were invented to decouple fast processors from slow memories. However, this decoupling is only partial, and many researchers have attempted to improve cache use by program optimization. Potential benefits are significant, since both energy dissipation and performance depend strongly on the traffic between memory levels. But modeling this traffic is difficult; this observation has led to the use of heuristic methods for steering program transformations. In this paper, we propose another approach: we simplify the cache model and organize the target program in such a way that an asymptotic evaluation of the memory traffic is possible. This information is used by our optimization algorithm to find the best reordering of the program operations, at least in an asymptotic sense. Our method optimizes both temporal and spatial locality. It can be applied to any static control program with arbitrary dependences. The optimizer has been partially implemented and applied to non-trivial programs. We present experimental evidence that the number of cache misses is drastically reduced, with corresponding performance improvements.
Cédric Bastoul, Paul Feautrier
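
Chunking in its simplest observable form is a blocked loop order that walks cache-sized tiles instead of whole rows. A generic tiling example in Python, illustrating the kind of reordering the optimizer searches for rather than the paper's algorithm:

```python
# Blocked (tiled) transpose: each tile of A and B fits in cache, so the
# column-major-accessed operand stops thrashing between memory levels.
def blocked_transpose(A, n, tile=64):
    B = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for i in range(ii, min(ii + tile, n)):      # walk one tile at a time
                for j in range(jj, min(jj + tile, n)):
                    B[j][i] = A[i][j]
    return B
```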
Backmatter
Metadata
Title
Compiler Construction
Edited by
Görel Hedin
Copyright year
2003
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-36579-2
Print ISBN
978-3-540-00904-7
DOI
https://doi.org/10.1007/3-540-36579-6