Skip to main content

2007 | Buch

Static Analysis

14th International Symposium, SAS 2007, Kongens Lyngby, Denmark, August 22-24, 2007. Proceedings

herausgegeben von: Hanne Riis Nielson, Gilberto Filé

Verlag: Springer Berlin Heidelberg

Buchreihe : Lecture Notes in Computer Science

insite
SUCHEN

Über dieses Buch

The aim of static analysis is to develop principles, techniques and tools for validatingpropertiesofprograms,fordesigningsemantics-basedtransformations of programs and for obtaining high-performance implementations of high-level programming languages. Over the years the series of static analysis symposia has served as the primary venue for presentation and discussion of theoretical, practical and innovative advances in the area. This volume contains the papers accepted for presentation at the 14th Int- national Static Analysis Symposium (SAS 2007). The meeting was held August, 22–24, 2007, at the Technical University of Denmark (DTU) in Kongens L- gby, Denmark. In response to the call for papers, 85 submissions were received. Each submission was reviewed by at least 3 experts and, based on these reports, 26 papers were selected after a week of intense electronic discussion using the EasyChair conference system. In addition to these 26 papers, this volume also containscontributionsbythetwoinvitedspeakers:FrankTip(IBMT.J.Watson Research Center, USA) and Alan Mycroft (Cambridge University, UK). On the behalf of the Program Committee, the Program Chairs would like to thank all the authors who submitted their work to the conference and also all the external referees who have been indispensable for the selection process. Special thanks go to TerkelTolstrup and J¨ org Bauer,who helped in handing the submitted papers and in organizing the structure of this volume. We would also like to thank the members of the Organizing Committee at DTU for their great work. Finally we want to thank the PhD school ITMAN at DTU for ?nancial support.

Inhaltsverzeichnis

Frontmatter

Invited Papers

Refactoring Using Type Constraints
Abstract
Type constraints express subtype-relationships between the types of program expressions that are required for type-correctness, and were originally proposed as a convenient framework for solving type checking and type inference problems. In this paper, we show how type constraints can be used as the basis for practical refactoring tools. In our approach, a set of type constraints is derived from a type-correct program P. The main insight behind our work is the fact that P constitutes just one solution to this constraint system, and that alternative solutions may exist that correspond to refactored versions of P. We show how a number of refactorings for manipulating types and class hierarchies can be expressed naturally using type constraints. Several refactorings in the standard distribution of Eclipse are based on our results.
Frank Tip
Programming Language Design and Analysis Motivated by Hardware Evolution
(Invited Presentation)
Abstract
Silicon chip design has passed a threshold whereby exponentially increasing transistor density (Moore’s Law) no longer translates into increased processing power for single-processor architectures. Moore’s Law now expresses itself as an exponentially increasing number of processing cores per fixed-size chip.
We survey this process and its implications on programming language design and static analysis. Particular aspects addressed include the reduced reliability of ever-smaller components, the problems of physical distribution of programs and the growing problems of providing shared memory.
Alan Mycroft

Contributed Papers

A Compilation Model for Aspect-Oriented Polymorphically Typed Functional Languages
Abstract
Introducing aspect orientation to a polymorphically typed functional language strengthens the importance of type-scoped advices; i.e., advices with their effects harnessed by type constraints. As types are typically treated as compile time entities, it is highly desirable to be able to perform static weaving to determine at compile time the chaining of type-scoped advices to their associated join points. In this paper, we describe a compilation model, as well as its implementation, that supports static type inference and static weaving of programs in an aspect-oriented polymorphically typed lazy functional language, AspectFun. We present a type-directed weaving scheme that coherently weaves type-scoped advices into the base program at compile time. We state the correctness of the static weaving with respect to the operational semantics of AspectFun. We also demonstrate how control-flow based pointcuts (such as cflowbelow) are compiled away, and highlight several type-directed optimization strategies that can improve the efficiency of woven code.
Kung Chen, Shu-Chun Weng, Meng Wang, Siau-Cheng Khoo, Chung-Hsin Chen
Lattice Automata: A Representation for Languages on Infinite Alphabets, and Some Applications to Verification
Abstract
This paper proposes a new abstract domain for languages on infinite alphabets, which acts as a functor taking an abstract domain for a concrete alphabet and lift it to an abstract domain for words on this alphabet. The abstract representation is based on lattice automata, which are finite automata labeled by elements of an atomic lattice. We define a normal form, standard language operations and a widening operator for these automata. We apply this abstract lattice for the verification of symbolic communicating machines, and we discuss its usefulness for interprocedural analysis.
Tristan Le Gall, Bertrand Jeannet
Compositional Verification and 3-Valued Abstractions Join Forces
Abstract
Two of the most promising approaches to fighting the state explosion problem are abstraction and compositional verification. In this work we join their forces to obtain a novel fully automatic compositional technique that can determine the truth value of the full μ-calculus with respect to a given system.
Given a system \(\ensuremath{M} = \ensuremath{M}_1 \parallel \ensuremath{M}_2\), we view each component \(\ensuremath{M}_i\) as an abstraction \({\ensuremath{M}_i}\uparrow\) of the global system. The abstract component \({\ensuremath{M}_i}\uparrow \) is defined using a 3-valued semantics so that whenever a μ-calculus formula ϕ has a definite value (true or false) on \({\ensuremath{M}_i}\uparrow \), the same value holds also for \(\ensuremath{M}\). Thus, ϕ can be checked on either \({\ensuremath{M}_1}\uparrow \) or \({\ensuremath{M}_2}\uparrow \) (or both), and if any of them returns a definite result, then this result holds also for \(\ensuremath{M}\). If both checks result in an indefinite value, the composition of the components needs to be considered. However, instead of constructing the composition of \({\ensuremath{M}_1}\uparrow \) and \({\ensuremath{M}_2}\uparrow \), our approach identifies and composes only the parts of the components in which their composition is necessary in order to conclude the truth value of ϕ. It ignores the parts which can be handled separately. The resulting model is often significantly smaller than the full system.
We explain how our compositional approach can be combined with abstraction, in order to further reduce the size of the checked components. The result is an incremental compositional abstraction-refinement framework, which resembles automatic Assume-Guarantee reasoning.
Sharon Shoham, Orna Grumberg
Formalised Inductive Reasoning in the Logic of Bunched Implications
Abstract
We present a framework for inductive definitions in the logic of bunched implications, BI, and formulate two sequent calculus proof systems for inductive reasoning in this framework. The first proof system adopts a traditional approach to inductive proof, extending the usual sequent calculus for predicate BI with explicit induction rules for the inductively defined predicates. The second system allows an alternative mode of reasoning with inductive definitions by cyclic proof. In this system, the induction rules are replaced by simple case-split rules, and the proof structures are cyclic graphs formed by identifying some sequent occurrences in a derivation tree. Because such proof structures are not sound in general, we demand that cyclic proofs must additionally satisfy a global trace condition that ensures soundness. We illustrate our inductive definition framework and proof systems with simple examples which indicate that, in our setting, cyclic proof may enjoy certain advantages over the traditional induction approach.
James Brotherston
Optimal Abstraction on Real-Valued Programs
Abstract
In this paper, we show that it is possible to abstract program fragments using real variables using formulas in the theory of real closed fields. This abstraction is compositional and modular. We first propose an exact abstraction for programs without loops. Given an abstract domain (in a wide class including intervals and octagons), we then show how to obtain an optimal abstraction of program fragments with respect to that domain. This abstraction allows computing optimal fixed points inside that abstract domain, without the need for a widening operator.
David Monniaux
Taming the Wrapping of Integer Arithmetic
Abstract
Variables in programs are usually confined to a fixed number of bits and results that require more bits are truncated. Due to the use of 32-bit and 64-bit variables, inadvertent overflows are rare. However, a sound static analysis must reason about overflowing calculations and conversions between unsigned and signed integers; the latter remaining a common source of subtle programming errors. Rather than polluting an analysis with the low-level details of modelling two’s complement wrapping behaviour, this paper presents a computationally light-weight solution based on polyhedral analysis which eliminates the need to check for wrapping when evaluating most (particularly linear) assignments.
Axel Simon, Andy King
Under-Approximations of Computations in Real Numbers Based on Generalized Affine Arithmetic
Abstract
We build a new, implicitly relational abstract domain which gives accurate under-approximations of the set of real values that program variables can take. This statement is demonstrated both on a theoretical basis and on non-trivial numerical examples. It is, we believe, the first non-trivial under-approximating numerical domain in the static analysis literature.
Eric Goubault, Sylvie Putot
A Framework for End-to-End Verification and Evaluation of Register Allocators
Abstract
This paper presents a framework for designing, verifying, and evaluating register allocation algorithms. The proposed framework has three main components. The first component is MIRA, a language for describing programs prior to register allocation. The second component is FORD, a language that describes the results produced by the register allocator. The third component is a type checker for the output of a register allocator which helps to find bugs. To illustrate the effectiveness of the framework, we present RALF, a tool that allows a register allocator to be integrated into the gcc compiler for the StrongARM architecture. RALF simplifies the development of register allocators by sheltering the programmer from the internal complexity of gcc. MIRA and FORD’s features are sufficient to implement most of the register allocators currently in use and are independent of any particular register allocation algorithm or compiler. To demonstrate the generality of our framework, we have used RALF to evaluate eight different register allocators, including iterated register coalescing, linear scan, a chordal based allocator, and two integer linear programming approaches.
V. Krishna Nandivada, Fernando Magno Quintão Pereira, Jens Palsberg
A New Algorithm for Identifying Loops in Decompilation
Abstract
Loop identification is an essential step of control flow analysis in decompilation. The Classical algorithm for identifying loops is Tarjan’s interval-finding algorithm, which is restricted to reducible graphs. Havlak presents one extension of Tarjan’s algorithm to deal with irreducible graphs, which constructs a loop-nesting forest for an arbitrary flow graph. There’s evidence showing that the running time of this algorithm is quadratic in the worst-case, and not almost linear as claimed. Ramalingam presents an improved algorithm with low time complexity on arbitrary graphs, but it performs not quite well on “real” control flow graphs (CFG). We present a novel algorithm for identifying loops in arbitrary CFGs. Based on a more detailed exploration on properties of loops and depth-first search (DFS), this algorithm traverses a CFG only once based on DFS and collects all information needed on the fly. It runs in approximately linear time and does not use any complicated data structures such as Interval/Derived Sequence of Graphs (DSG) or UNION-FIND sets. To perform complexity analysis of the algorithm, we introduce a new concept called unstructuredness coefficient to describe the unstructuredness of CFGs, and we find that the unstructuredness coefficients of these executables are usually small (<1.5). Such “low-unstructuredness” property distinguishes these CFGs from general single-root connected directed graphs, and it offers an explanation why those algorithms existed perform not quite well on real-world cases. The new algorithm has been applied to 11526 CFGs in 6 typical binary executables on both Linux and Window platforms. Experimental result has validated our theoretical analysis and it shows that our algorithm runs 2-5 times faster than the Havlak-Tarjan algorithm, and 2-8 times faster than the Ramalingam-Havlak-Tarjan algorithm.
Tao Wei, Jian Mao, Wei Zou, Yu Chen
Accelerated Data-Flow Analysis
Abstract
Acceleration in symbolic verification consists in computing the exact effect of some control-flow loops in order to speed up the iterative fix-point computation of reachable states. Even if no termination guarantee is provided in theory, successful results were obtained in practice by different tools implementing this framework. In this paper, the acceleration framework is extended to data-flow analysis. Compared to a classical widening/narrowing-based abstract interpretation, the loss of precision is controlled here by the choice of the abstract domain and does not depend on the way the abstract value is computed. Our approach is geared towards precision, but we don’t loose efficiency on the way. Indeed, we provide a cubic-time acceleration-based algorithm for solving interval constraints with full multiplication.
Jérôme Leroux, Grégoire Sutre
Abstract Error Projection
Abstract
In this paper, we extend model-checking technology with the notion of an error projection. Given a program abstraction, an error projection divides the program into two parts: the part outside the error projection is guaranteed to be correct, while the part inside the error projection can have bugs. Subsequent automated or manual verification effort need only be concentrated on the part inside the error projection. We present novel algorithms for computing error projections using weighted pushdown systems that are sound and complete for the class of Boolean programs and discuss additional applications for these algorithms.
Akash Lal, Nicholas Kidd, Thomas Reps, Tayssir Touili
Precise Thread-Modular Verification
Abstract
Thread-modular verification is a promising approach for the verification of concurrent programs. Its high efficiency is achieved by abstracting the interaction between threads. The resulting polynomial complexity (in the number of threads) has its price: many interesting concurrent programs cannot be handled due to the imprecision of the abstraction. We propose a new abstraction algorithm for thread-modular verification that offers both high degree precision and polynomial complexity. Our algorithm is based on a new abstraction domain that combines Cartesian abstraction with exception sets, which allow one to handle particular thread interactions precisely. Our experimental results demonstrate the practical applicability of the algorithm.
Alexander Malkis, Andreas Podelski, Andrey Rybalchenko
Modular Safety Checking for Fine-Grained Concurrency
Abstract
Concurrent programs are difficult to verify because the proof must consider the interactions between the threads. Fine-grained concurrency and heap allocated data structures exacerbate this problem, because threads interfere more often and in richer ways. In this paper we provide a thread-modular safety checker for a class of pointer-manipulating fine-grained concurrent algorithms. Our checker uses ownership to avoid interference whenever possible, and rely/guarantee (assume/guarantee) to deal with interference when it genuinely exists.
Cristiano Calcagno, Matthew Parkinson, Viktor Vafeiadis
Static Analysis of Dynamic Communication Systems by Partner Abstraction
Abstract
Prominent examples of dynamic communication systems include traffic control systems and ad hoc networks. They are hard to verify due to inherent unboundedness. Unbounded creation and destruction of objects and a dynamically evolving communication topology are characteristic features.
Partner graph grammars are presented as an adequate specification formalism for dynamic communication systems. They are based on the single pushout approach to algebraic graph transformation and specifically tailored to dynamic communication systems. We propose a new verification technique based on abstract interpretation of partner graph grammars. It uses a novel two-layered abstraction, partner abstraction, that keeps precise information about objects and their communication partners. We identify statically checkable cases for which the abstract interpretation is even complete. In particular, applicability of transformation rules is preserved precisely. The analysis has been implemented in the hiralysis tool. It is evaluated on a complex case study, car platooning, for which many interesting properties can be proven automatically.
Jörg Bauer, Reinhard Wilhelm
Exploiting Pointer and Location Equivalence to Optimize Pointer Analysis
Abstract
Pointer information is a prerequisite for most program analyses, and inclusion-based, i.e. Andersen-style, pointer analysis is widely used to compute such information. However, current inclusion-based analyses can have prohibitive costs in time and space, especially for programs with millions of lines of code. We present a suite of offline optimizations that exploit pointer and location equivalence to shrink the input to the subsequent pointer analysis without affecting precision, dramatically reducing both analysis time and memory consumption. Using a suite of six open-source C programs ranging in size from 169K to 2.17M LOC, we demonstrate that our techniques on average improve analysis time by 1.3–2.7× and reduce memory consumption by 3.2–6.9× over the best current techniques.
Ben Hardekopf, Calvin Lin
Hierarchical Pointer Analysis for Distributed Programs
Abstract
We present a new pointer analysis for use in shared memory programs running on hierarchical parallel machines. The analysis is motivated by the partitioned global address space languages, in which programmers have control over data layout and threads and can directly read and write to memory associated with other threads. Titanium, UPC, Co-Array Fortran, X10, Chapel, and Fortress are all examples of such languages. The novelty of our analysis comes from the hierarchical machine model used, which captures the increasingly hierarchical nature of modern parallel machines. For example, the analysis can distinguish between pointers that can reference values within a thread, within a shared memory multiprocessor, or within a network of processors. The analysis is presented with a formal type system and operational semantics, articulating the various ways in which pointers can be used within a hierarchical machine model. The hierarchical analysis has several applications, including race detection, sequential consistency enforcement, and software caching. We present results of an implementation of the analysis, applying it to data race detection, and show that the hierarchical analysis is very effective at reducing the number of false races detected.
Amir Kamil, Katherine Yelick
Semantics-Based Transformation of Arithmetic Expressions
Abstract
Floating-point arithmetic is an important source of errors in programs because of the loss of precision arising during a computation. Unfortunately, this arithmetic is not intuitive (e.g. many elementary operations are not associative, inversible, etc.) making the debugging phase very difficult and empiric.
This article introduces a new kind of program transformation in order to automatically improve the accuracy of floating-point computations. We use P. Cousot and R. Cousot’s framework for semantics program transformation and we propose an offline transformation. This technique was implemented, and the first experimental results are presented.
Matthieu Martel
A Fast Implementation of the Octagon Abstract Domain on Graphics Hardware
Abstract
We propose an efficient implementation of the Octagon Abstract Domain (OAD) on Graphics Processing Unit (GPU) by exploiting stream processing to speed-up OAD computations. OAD is a relational numerical abstract domain which approximates invariants as conjunctions of constraints of the form ±x±y < = c, where x and y are program variables and c is a constant which can be an integer, rational or real. Since OAD computations are based on matrices, and basic matrix operators, they can be mapped easily on Graphics Hardware using texture and pixel shader in the form of a kernel that implements matrix operators. The main advantage of our implementation is that we can achieve sensible speed up by using a single GPU, for each OAD operation. This can be the basis for an efficient abstract program analyzer based on a mixed CPU-GPU architecture.
Francesco Banterle, Roberto Giacobazzi
Fixpoint-Guided Abstraction Refinements
Abstract
In this paper, we present an abstract fixpoint checking algorithm with automatic refinement by backward completion in Moore closed abstract domains. We study the properties of our algorithm and prove it to be more precise than the counterexample guided abstract refinement algorithm (CEGAR). Contrary to several works in the literature, our algorithm does not require the abstract domains to be partitions of the state space. We also show that our automatic refinement technique is compatible with so-called acceleration techniques. Furthermore, the use of Boolean closed domains does not improve the precision of our algorithm. The algorithm is illustrated by proving properties of programs with nested loops.
Patrick Cousot, Pierre Ganty, Jean-François Raskin
Guided Static Analysis
Abstract
In static analysis, the semantics of the program is expressed as a set of equations. The equations are solved iteratively over some abstract domain. If the abstract domain is distributive and satisfies the ascending-chain condition, an iterative technique yields the most precise solution for the equations. However, if the above properties are not satisfied, the solution obtained is typically imprecise. Moreover, due to the properties of widening operators, the precision loss is sensitive to the order in which the state-space is explored.
In this paper, we introduce guided static analysis, a framework for controlling the exploration of the state-space of a program. The framework guides the state-space exploration by applying standard static-analysis techniques to a sequence of modified versions of the analyzed program. As such, the framework does not require any modifications to existing analysis techniques, and thus can be easily integrated into existing static-analysis tools.
We present two instantiations of the framework, which improve the precision of widening in (i) loops with multiple phases and (ii) loops in which the transformation performed on each iteration is chosen non-deterministically.
Denis Gopan, Thomas Reps
Program Analysis Using Symbolic Ranges
Abstract
Interval analysis seeks static lower and upper bounds on the values of program variables. These bounds are useful, especially for inferring invariants to prove buffer overflow checks. In practice, however, intervals by themselves are often inadequate as invariants due to the lack of relational information among program variables.
In this paper, we present a technique for deriving symbolic bounds on variable values. We study a restricted class of polyhedra whose constraints are stratified with respect to some variable ordering provided by the user, or chosen heuristically. We define a notion of normalization for such constraints and demonstrate polynomial time domain operations on the resulting domain of symbolic range constraints. The abstract domain is intended to complement widely used domains such as intervals and octagons for use in buffer overflow analysis. Finally, we study the impact of our analysis on commercial software using an overflow analyzer for the C language.
Sriram Sankaranarayanan, Franjo Ivančić, Aarti Gupta
Shape Analysis with Structural Invariant Checkers
Abstract
Developer-supplied data structure specifications are important to shape analyses, as they tell the analysis what information should be tracked in order to obtain the desired shape invariants. We observe that data structure checking code (eg, used in testing or dynamic analysis) provides shape information that can also be used in static analysis. In this paper, we propose a lightweight, automatic shape analysis based on these developer-supplied structural invariant checkers. In particular, we set up a parametric abstract domain, which is instantiated with such checker specifications to summarize memory regions using both notions of complete and partial checker evaluations. The analysis then automatically derives a strategy for canonicalizing or weakening shape invariants.
Bor-Yuh Evan Chang, Xavier Rival, George C. Necula
Footprint Analysis: A Shape Analysis That Discovers Preconditions
Abstract
Existing shape analysis algorithms infer descriptions of data structures at program points, starting from a given precondition. We describe an analysis that does not require any preconditions. It works by attempting to infer a description of only the cells that might be accessed, following the footprint idea in separation logic. The analysis allows us to establish a true Hoare triple for a piece of code, independently of the context in which it occurs and without a whole-program analysis. We present experimental results for a range of typical list-processing algorithms, as well as for code fragments from a Windows device driver.
Cristiano Calcagno, Dino Distefano, Peter W. O’Hearn, Hongseok Yang
Arithmetic Strengthening for Shape Analysis
Abstract
Shape analyses are often imprecise in their numerical reasoning, whereas numerical static analyses are often largely unaware of the shape of a program’s heap. In this paper we propose a lazy method of combining a shape analysis based on separation logic with an arbitrary arithmetic analysis. When potentially spurious counterexamples are reported by our shape analysis, the method constructs a purely arithmetic program whose traces over-approximate the set of counterexample traces. It then uses this arithmetic program together with the arithmetic analysis to construct a refinement for the shape analysis. Our method is aimed at proving properties that require comprehensive reasoning about heaps together with more targeted arithmetic reasoning. Given a sufficient precondition, our technique can automatically prove memory safety of programs whose error-free operation depends on a combination of shape, size, and integer invariants. We have implemented our algorithm and tested it on a number of common list routines using a variety of arithmetic analysis tools for refinement.
Stephen Magill, Josh Berdine, Edmund Clarke, Byron Cook
Astrée: From Research to Industry
Abstract
Airbus has started introducing abstract interpretation based static analysers into the verification process of some of its avionics software products. Industrial constraints require any such tool to be extremely precise, which can only be achieved after a twofold specialisation process: first, it must be designed to verify a class of properties for a family of programs efficiently; second, it must be parametric enough for the user to be able to fine tune the analysis of any particular program of the family. This implies a close cooperation between the tool-providers and the end-users. Astrée is such a static analyser: it produces only a small number of false alarms when attempting to prove the absence of run-time errors in control/command programs written in C, and provides the user with enough options and directives to help reduce this number down to zero. Its specialisation process has been reported in several scientific papers, such as [1] and [2]. Through the description of analyses performed with Astrée on industrial programs, we give an overview of the false alarm reduction process from an engineering point of view, and sketch a possible customer-supplier relationship model for the emerging market for static analysers.
David Delmas, Jean Souyris
Magic-Sets Transformation for the Analysis of Java Bytecode
Abstract
Denotational static analysis of Java bytecode has a nice and clean compositional definition and an efficient implementation with binary decision diagrams. But it models only the functionalie input/output behaviour of a program P, not enough if one needs P’s internal behaviours ie from the input to some internal program points. We overcome this limitation with a technique used up to now for logic programs only. It adds new magic blocks of code to P, whose functional behaviours are the internal behaviours of P. We prove this transformation correct with an operational semantics. We define an equivalent denotational semantics, whose denotations for the magic blocks are hence the internal behaviours of P. We implement our transformation and instantiate it with abstract domains modelling sharing of two variables and non-cyclicity of variables. We get a static analyser for full Java bytecode that is faster and scales better than another operational pair-sharing analyser and a constraint-based pointer analyser.
Étienne Payet, Fausto Spoto
Backmatter
Metadaten
Titel
Static Analysis
herausgegeben von
Hanne Riis Nielson
Gilberto Filé
Copyright-Jahr
2007
Verlag
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-74061-2
Print ISBN
978-3-540-74060-5
DOI
https://doi.org/10.1007/978-3-540-74061-2

Premium Partner