
2005 | Book

Programming Languages and Systems

Third Asian Symposium, APLAS 2005, Tsukuba, Japan, November 2-5, 2005. Proceedings


Table of Contents

Frontmatter

Invited Talk

Session 1

The Essence of Dataflow Programming
Abstract
We propose a novel, comonadic approach to dataflow (stream-based) computation. This is based on the observation that both general and causal stream functions can be characterized as coKleisli arrows of comonads and on the intuition that comonads in general must be a good means to structure context-dependent computation. In particular, we develop a generic comonadic interpreter of languages for context-dependent computation and instantiate it for stream-based computation. We also discuss distributive laws of a comonad over a monad as a means to structure combinations of effectful and context-dependent computation. We apply the latter to analyse clocked dataflow (partial-stream-based) computation.
Tarmo Uustalu, Varmo Vene
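The coKleisli view of causal stream functions described in the abstract can be made concrete with a small sketch. The paper works in Haskell; the following Python version is only an analogy, and all names (`extract`, `extend`, `run_causal`) are invented here. A "history" (the non-empty list of inputs seen so far) plays the role of the comonad's values, and a causal stream function is a coKleisli arrow: a map from histories to single outputs.

```python
# Illustrative sketch (not the paper's Haskell formulation): causal stream
# functions as coKleisli arrows of a "history" comonad.

def extract(history):
    """Counit: the current (most recent) value of the history."""
    return history[-1]

def extend(f, history):
    """coKleisli extension: apply f to every prefix of the history."""
    return [f(history[:i + 1]) for i in range(len(history))]

def run_causal(f, stream):
    """Interpret the coKleisli arrow f as a causal stream function."""
    return extend(f, stream)

# A causal stream function: running sum of the input stream.
running_sum = lambda h: sum(h)

print(run_causal(running_sum, [1, 2, 3, 4]))  # [1, 3, 6, 10]
```

Running `running_sum` over `[1, 2, 3, 4]` applies it to every prefix, which is exactly how a coKleisli arrow is read back as a stream function: the output at time i may depend on all inputs up to time i, but on nothing later.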
Data Refinement with Low-Level Pointer Operations
Abstract
We present a method for proving data refinement in the presence of low-level pointer operations, such as memory allocation and deallocation, and pointer arithmetic. Surprisingly, none of the existing methods for data refinement, including those specifically designed for pointers, are sound in the presence of low-level pointer operations. The reason is that the low-level pointer operations allow an additional potential for obtaining the information about the implementation details of the module: using memory allocation and pointer comparison, a client of a module can find out which cells are internally used by the module, even without dereferencing any pointers. The unsoundness of the existing methods comes from the failure of handling this potential. In the paper, we propose a novel method for proving data refinement, called power simulation, and show that power simulation is sound even with low-level pointer operations.
Ivana Mijajlović, Hongseok Yang
A Simple Semantics for Polymorphic Recursion
Abstract
Polymorphic recursion is a useful extension of Hindley-Milner typing and has been incorporated in the functional programming language Haskell. It allows the expression of efficient algorithms that take advantage of non-uniform data structures and provides key support for generic programming. However, polymorphic recursion is, perhaps, not as broadly understood as it could be and this, in part, motivates the denotational semantics presented here. The semantics reported here also contributes an essential building block to any semantics of Haskell: a model for first-order polymorphic recursion. Furthermore, Haskell-style type classes may be described within this semantic framework in a straightforward and intuitively appealing manner.
William L. Harrison
Symbolic Execution with Separation Logic
Abstract
We describe a sound method for automatically proving Hoare triples for loop-free code in Separation Logic, for certain preconditions and postconditions (symbolic heaps). The method uses a form of symbolic execution, a decidable proof theory for symbolic heaps, and extraction of frame axioms from incomplete proofs. This is a precursor to the use of the logic in automatic specification checking, program analysis, and model checking.
Josh Berdine, Cristiano Calcagno, Peter W. O’Hearn

Session 2

An Abstract Interpretation Perspective on Linear vs. Branching Time
Abstract
It is known that the branching time language ACTL and the linear time language ∀LTL of universally quantified formulae of LTL have incomparable expressive powers, i.e., Sem(ACTL) and Sem(∀LTL) are incomparable sets. Within a standard abstract interpretation framework, ACTL can be viewed as an abstract interpretation LTL^∀ of LTL, where the universal path quantifier ∀ abstracts each linear temporal operator of LTL to a corresponding branching state temporal operator of ACTL. In abstract interpretation terms, it turns out that the universal path quantifier abstraction of LTL is incomplete. In this paper we reason on a generic abstraction α over a domain A of a generic linear time language L. This approach induces both a language αL of α-abstracted formulae of L and an abstract language L^α whose operators are the best correct abstractions in A of the linear operators of L. When the abstraction α is complete for the operators in L, it turns out that αL and L^α have the same expressive power, so that trace-based model checking of αL can be reduced with no loss of precision to A-based model checking of L^α. This abstract interpretation-based approach makes it possible to compare temporal languages at different levels of abstraction and to view the standard linear vs. branching time comparison as a particular instance.
Francesco Ranzato, Francesco Tapparo
The Parallel Implementation of the Astrée Static Analyzer
Abstract
The Astrée static analyzer is a specialized tool that can prove the absence of runtime errors, including arithmetic overflows, in large critical programs. Keeping analysis times reasonable for industrial use is one of the design objectives. In this paper, we discuss the parallel implementation of the analysis.
David Monniaux
Using Datalog with Binary Decision Diagrams for Program Analysis
Abstract
Many problems in program analysis can be expressed naturally and concisely in a declarative language like Datalog. This makes it easy to specify new analyses or extend or compose existing analyses. However, previous implementations of declarative languages perform poorly compared with traditional implementations. This paper describes bddbddb, a BDD-Based Deductive DataBase, which implements the declarative language Datalog with stratified negation, totally-ordered finite domains and comparison operators. bddbddb uses binary decision diagrams (BDDs) to efficiently represent large relations. BDD operations take time proportional to the size of the data structure, not the number of tuples in a relation, which leads to fast execution times. bddbddb is an effective tool for implementing a large class of program analyses. We show that a context-insensitive points-to analysis implemented with bddbddb is about twice as fast as a carefully hand-tuned version. The use of BDDs also allows us to solve heretofore unsolved problems, like context-sensitive pointer analysis for large programs.
John Whaley, Dzintars Avots, Michael Carbin, Monica S. Lam
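As a rough illustration of the evaluation model that bddbddb automates (not its actual implementation, which represents relations as BDDs rather than as explicit tuple sets), here is a naive fixpoint solver for a single Datalog-style points-to rule in Python. The relation and function names are invented for this example.

```python
# Hypothetical illustration: the rule
#   pointsTo(v, h) :- assign(v, w), pointsTo(w, h)
# solved as a least fixpoint over relations. bddbddb encodes such
# relations as BDDs; plain Python sets are used here for clarity.

def solve_points_to(new_stmts, assign_stmts):
    # Base facts: "v = new Obj()" contributes pointsTo(v, h).
    points_to = set(new_stmts)
    changed = True
    while changed:                      # iterate until a fixpoint is reached
        changed = False
        for (v, w) in assign_stmts:     # each "v = w" statement
            for (x, h) in list(points_to):
                if x == w and (v, h) not in points_to:
                    points_to.add((v, h))
                    changed = True
    return points_to

facts = solve_points_to(new_stmts={("a", "h1")},
                        assign_stmts={("b", "a"), ("c", "b")})
print(sorted(facts))  # [('a', 'h1'), ('b', 'h1'), ('c', 'h1')]
```

The point of the BDD representation is that the inner loops above, whose cost grows with the number of tuples, are replaced by relational operations whose cost depends on the size of the BDD, which is often small even for very large relations.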
Loop Invariants on Demand
Abstract
This paper describes a sound technique that combines the precision of theorem proving with the loop-invariant inference of abstract interpretation. The loop-invariant computations are invoked on demand when the need for a stronger loop invariant arises, which allows a gradual increase in the level of precision used by the abstract interpreter. The technique generates loop invariants that are specific to a subset of a program’s executions, achieving a dynamic and automatic form of value-based trace partitioning. Finally, the technique can be incorporated into a lemmas-on-demand theorem prover, where the loop-invariant inference happens after the generation of verification conditions.
K. Rustan M. Leino, Francesco Logozzo
Type Systems for XML
Abstract
XML is a standard data format that is nowadays used everywhere. A notable feature of XML is its user-definable schemas. Schemas describe structural constraints on XML documents, thus defining “types” of XML. However, in current languages and systems for processing XML, those types are used only for dynamically validating data, not for statically verifying programs.
The goal of this work is to establish methods for the design and implementation of type systems for XML processing. This task, however, is not a simple transfer of existing knowledge in programming languages, since XML types are based on regular tree expressions and therefore have much richer structure than the standard types treated in past research. More concretely, difficulties arise in finding suitable definitions and algorithms for (1) typing concepts already standard in functional programming, e.g., subtyping and parametric polymorphism, (2) XML-specific structures, e.g., (in addition to regular tree expressions) attribute and shuffle expressions, and (3) language constructs for XML processing, e.g., pattern matching and its extensions. In this talk, I will overview our efforts dealing with these issues, emphasizing the principles consistently used throughout: “definition by semantics” and “implementation based on finite tree automata.”
This work has been done jointly with Jérôme Vouillon, Benjamin Pierce, Makoto Murata, Tadahiro Suda, Alain Frisch, and Giuseppe Castagna.
Haruo Hosoya

Invited Talk

Session 3

Reflection Analysis for Java
Abstract
Reflection has always been a thorn in the side of Java static analysis tools. Without a full treatment of reflection, static analysis tools are both incomplete because some parts of the program may not be included in the application call graph, and unsound because the static analysis does not take into account reflective features of Java that allow writes to object fields and method invocations. However, accurately analyzing reflection has always been difficult, leading to most static analysis tools treating reflection in an unsound manner or just ignoring it entirely. This is unsatisfactory as many modern Java applications make significant use of reflection.
In this paper we propose a static analysis algorithm that uses points-to information to approximate the targets of reflective calls as part of call graph construction. Because reflective calls may rely on input to the application, in addition to performing reflection resolution, our algorithm also discovers all places in the program where user-provided specifications are necessary to fully resolve reflective targets. As an alternative to user-provided specifications, we also propose a reflection resolution approach based on type cast information that reduces the need for user input, but typically results in a less precise call graph.
We have implemented the reflection resolution algorithms described in this paper and applied them to a set of six large, widely-used benchmark applications consisting of more than 600,000 lines of code combined. Experiments show that our technique is effective for resolving most reflective calls without any user input. Certain reflective calls, however, cannot be resolved precisely at compile time. Relying on a user-provided specification to obtain a conservative call graph results in graphs that contain 1.43 to 6.58 times more methods than the original. In one case, a conservative call graph has 7,047 more methods than a call graph that does not interpret reflective calls. In contrast, ignoring reflection leads to missing substantial portions of the application call graph.
Benjamin Livshits, John Whaley, Monica S. Lam
Lightweight Family Polymorphism
Abstract
Family polymorphism has been proposed for object-oriented languages as a solution to supporting reusable yet type-safe mutually recursive classes. A key idea of family polymorphism is the notion of families, which are used to group mutually recursive classes. In the original proposal, due to the design decision that families are represented by objects, dependent types had to be introduced, resulting in a rather complex type system. In this paper, we propose a simpler solution of lightweight family polymorphism, based on the idea that families are represented by classes rather than objects. This change makes the type system significantly simpler without losing much of the expressiveness of the language. Moreover, “family-polymorphic” methods now take the form of parametric methods; thus it is easy to apply Java-style type inference. To show rigorously that our approach is safe, we formalize the set of language features on top of Featherweight Java and prove that the type system is sound. An algorithm for type inference for family-polymorphic method invocations is also formalized and proved correct.
Atsushi Igarashi, Chieri Saito, Mirko Viroli
A Portable and Customizable Profiling Framework for Java Based on Bytecode Instruction Counting
Abstract
Prevailing profilers for Java, which rely on standard, native-code profiling interfaces, are not portable, give imprecise results due to serious measurement perturbation, and cause excessive overhead. In contrast, program transformations make it possible to generate reproducible profiles in a fully portable way with significantly less overhead. This paper presents a profiling framework that instruments Java programs at the bytecode level to build context-sensitive execution profiles at runtime. The profiling framework includes an exact profiler as well as a sampling profiler. User-defined profiling agents can also be written in pure Java, in order to customize the runtime processing of profiling data.
Walter Binder
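The idea of building profiles by counting executed instructions rather than sampling timers can be mimicked in a few lines of Python. The framework itself rewrites JVM bytecode; this stand-in uses Python's trace hook and is only an analogy, with all names invented here.

```python
# Analogy only (the paper instruments JVM bytecode): count executed lines
# per function via a trace hook -- a portable, reproducible measure that
# does not depend on hardware timers.
import sys
from collections import Counter

counts = Counter()

def tracer(frame, event, arg):
    if event == "line":                        # one "instruction-like" event
        counts[frame.f_code.co_name] += 1
    return tracer                              # keep tracing inside the frame

def work(n):
    total = 0
    for i in range(n):
        total += i
    return total

sys.settrace(tracer)
work(3)
sys.settrace(None)
print(counts["work"] > 0)  # True: deterministic counts, run after run
```

Unlike a sampling profiler, running this twice on the same input yields identical counts, which is the reproducibility property the abstract emphasizes.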
Race Conditions in Message Sequence Charts
Abstract
Message Sequence Charts (MSCs) are a graphical language for the description of scenarios in terms of message exchanges between communicating components in a distributed environment. The language has been standardised by the ITU and given a formal semantics by means of a process algebra. In this paper, we review a design anomaly, called a race condition, in an MSC specification and argue that the current solution for correcting race conditions is too weak when implementation is considered. As a solution, we provide an algorithm on partial orders. The result is a strengthened partial order, which is race-free and remains race-free in the implementation.
Chien-An Chen, Sara Kalvala, Jane Sinclair
Integrating Physical Systems in the Static Analysis of Embedded Control Software
Abstract
Abstract interpretation is a theory of effective abstraction and/or approximation of discrete mathematical structures as found in the semantics of programming languages, modelling program executions, hence program properties, at various levels of abstraction [3,7,8,10,12].
Patrick Cousot

Invited Talk

Session 4

Calculating Polynomial Runtime Properties
Abstract
Affine size-change analysis has been used for termination analysis of eager functional programming languages. The same style of analysis is also capable of compactly recording and calculating other properties of programs, including their runtime, maximum stack depth, and (relative) path time costs. In this paper we show how precise (not just big-\(\mathcal{O}\)) polynomial bounds on such costs may be calculated for programs, by a characterization as a problem in quantifier elimination. The technique is decidable, and complete for a class of size-change terminating programs with limited-degree polynomial costs. An extension to the technique allows the calculation of some classes of exponential-cost programs. We demonstrate the new technique by calculating costs in numbers of function (or procedure) calls for a simple functional definition language, but it can also be applied to imperative languages. The technique is automated within the REDUCE computer algebra system.
Hugh Anderson, Siau-Cheng Khoo, Stefan Andrei, Beatrice Luca
Resource Bound Certification for a Tail-Recursive Virtual Machine
Abstract
We define a method to statically bound the size of values computed during the execution of a program as a function of the size of its parameters. More precisely, we consider bytecode programs that should be executed on a simple stack machine with support for algebraic data types, pattern-matching and tail-recursion. Our size verification method is expressed as a static analysis, performed at the level of the bytecode, that relies on machine-checkable certificates. We follow here the usual assumption that code and certificates may be forged and should be checked before execution.
Our approach extends a system of static analyses based on the notion of quasi-interpretations that has already been used to enforce resource bounds on first-order functional programs. This paper makes two additional contributions. First, we are able to check optimized programs, containing instructions for unconditional jumps and tail-recursive calls, removing restrictions on the structure of the bytecode that were imposed in previous work. Second, we propose a direct algorithm that depends only on solving a set of arithmetical constraints.
Silvano Dal Zilio, Régis Gascon
A Path Sensitive Type System for Resource Usage Verification of C Like Languages
Abstract
In this paper, we present a path-sensitive type system for resource usage verification. Path sensitivity is essential to model resource usage in C programs correctly and accurately. So far, most methods that analyze this kind of property path-sensitively have been proposed as whole-program analyses or are unsound. Our main contributions are as follows. First, we formalize a sound analysis for path-sensitive resource usage properties in C-like languages. To the best of our knowledge, it is the first sound and modular analysis for this problem. We provide a complete proof of the soundness of the type system and algorithm. Second, our analysis is modular, and we provide an inference algorithm to generate function summaries automatically. We believe that our approach suggests new insights into the design of modular analyses.
Hyun-Goo Kang, Youil Kim, Taisook Han, Hwansoo Han
Termination Analysis of Higher-Order Functional Programs
Abstract
Size-change termination (SCT) automatically identifies termination of first-order functional programs. The SCT principle: a program terminates if every infinite control flow sequence would cause an infinite descent in a well-founded data value (POPL 2001).
More recent work (RTA 2004) developed a termination analysis of the pure untyped λ-calculus using a similar approach, but an entirely different notion of size was needed to compare higher-order values. Again this is a powerful analysis, even proving termination of certain λ-expressions containing the fixpoint combinator Y. However the language analysed is tiny, not even containing constants.
These techniques are unified and extended significantly, to yield a termination analyser for higher-order, call-by-value programs as in ML’s purely functional core or similar functional languages. Our analyser has been proven correct, and implemented for a substantial subset of OCaml.
Damien Sereni, Neil D. Jones
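The first-order SCT test that this line of work builds on is small enough to sketch. The following Python fragment is a simplified rendition for a single recursive function, with size-change graphs encoded as sets of arcs `(source parameter, target parameter, strict?)`; it is not the paper's higher-order analysis.

```python
# Simplified size-change termination (SCT) test, assuming one function
# and size-change graphs given as frozensets of arcs (src, dst, strict).

def compose(g1, g2):
    """Compose two size-change graphs: an arc x->z exists if arcs x->y
    and y->z do; it is strict if either component arc is strict."""
    return frozenset(
        (x, z, s1 or s2)
        for (x, y, s1) in g1
        for (y2, z, s2) in g2
        if y == y2
    )

def sct_terminates(graphs):
    """Close the graph set under composition; the program passes the SCT
    test if every idempotent graph has a strictly decreasing self-arc."""
    closure = set(graphs)
    while True:
        new = {compose(g, h) for g in closure for h in closure} - closure
        if not new:
            break
        closure |= new
    return all(
        any(x == y and strict for (x, y, strict) in g)
        for g in closure
        if compose(g, g) == g
    )

# f(x) calls f(x - 1): parameter 0 strictly decreases, so f terminates.
dec = frozenset({(0, 0, True)})
# g(x) calls g(x): no decrease, so the test rejects.
same = frozenset({(0, 0, False)})
print(sct_terminates({dec}), sct_terminates({same}))  # True False
```

The higher-order extension described in the abstract replaces the notion of parameter size with an order on closures and environments, but the closure-and-check shape of the test is the same.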

Session 5

Heterogeneous Fixed Points with Application to Points-To Analysis
Abstract
Many situations can be modeled as solutions of systems of simultaneous equations. If the functions of these equations monotonically increase in all bound variables, then the existence of extremal fixed point solutions for the equations is guaranteed. Among all solutions, these fixed points uniformly take least or greatest values for all bound variables. Hence, we call them homogeneous fixed points. However, there are systems of equations whose functions monotonically increase in some variables and decrease in others. The existence of solutions of such equations cannot be guaranteed using classical fixed point theory. In this paper, we define general conditions to guarantee the existence and computability of fixed point solutions of such equations. In contrast to homogeneous fixed points, these fixed points take least values for some variables and greatest values for others. Hence, we call them heterogeneous fixed points. We illustrate heterogeneous fixed point theory through points-to analysis.
Aditya Kanade, Uday Khedker, Amitabha Sanyal
Register Allocation Via Coloring of Chordal Graphs
Abstract
We present a simple algorithm for register allocation which is competitive with the iterated register coalescing algorithm of George and Appel. We base our algorithm on the observation that 95% of the methods in the Java 1.5 library have chordal interference graphs when compiled with the JoeQ compiler. A greedy algorithm can optimally color a chordal graph in time linear in the number of edges, and we can easily add powerful heuristics for spilling and coalescing. Our experiments show that the new algorithm produces better results than iterated register coalescing for settings with few registers and comparable results for settings with many registers.
Fernando Magno Quintão Pereira, Jens Palsberg
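The two-step pipeline the abstract describes, an elimination order followed by greedy coloring, can be sketched as follows. This toy Python version (maximum cardinality search plus greedy assignment) ignores spilling and coalescing, which the paper handles with additional heuristics.

```python
# Greedy coloring of a chordal interference graph. For a chordal graph,
# maximum cardinality search yields an order in which greedy coloring
# uses the minimum number of colors (registers).

def max_cardinality_search(graph):
    order, weight = [], {v: 0 for v in graph}
    while weight:
        v = max(weight, key=weight.get)   # most already-ordered neighbours
        order.append(v)
        del weight[v]
        for u in graph[v]:
            if u in weight:
                weight[u] += 1
    return order

def greedy_color(graph, order):
    color = {}
    for v in order:
        used = {color[u] for u in graph[v] if u in color}
        color[v] = next(c for c in range(len(graph)) if c not in used)
    return color

# Toy chordal interference graph (vertices = virtual registers).
g = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
coloring = greedy_color(g, max_cardinality_search(g))
print(max(coloring.values()) + 1)  # 3 registers suffice
```

The triangle a-b-c forces three colors, and the greedy pass finds exactly three; on a chordal graph this optimality is guaranteed, which is what lets the algorithm skip the iterated build-color-spill cycle of graph-coloring allocators.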
Transformation to Dynamic Single Assignment Using a Simple Data Flow Analysis
Abstract
This paper presents a novel method to construct a dynamic single assignment (DSA) form of array-intensive, pointer-free C programs (or programs in any other procedural language). A program in DSA form does not perform any destructive update of scalars and array elements, i.e., each element is written at most once. As DSA makes the dependencies between variable references explicit, it facilitates complex analyses and optimizations of programs. Existing transformations into DSA perform a complex data flow analysis with exponential analysis time and work only for a limited set of input programs. Our method removes irregularities from the data flow by adding copy assignments to the program, so that it can then use simple data flow analyses. The DSA transformation presented scales very well with growing program size and overcomes a number of important limitations of existing methods. We have implemented the method, and it is being used in the context of memory optimization and verification of those optimizations.
Peter Vanbroekhoven, Gerda Janssens, Maurice Bruynooghe, Francky Catthoor
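For intuition only, here is what single assignment means on straight-line scalar code. The paper's contribution is handling loops and individual array elements; this hypothetical Python renamer does not attempt either.

```python
# Minimal single-assignment renaming for straight-line scalar code:
# every destination gets a fresh version, and uses refer to the latest
# version, so each name is written at most once.

def to_single_assignment(stmts):
    version = {}
    out = []
    for dst, expr in stmts:            # stmt: dst = expr (expr = used vars)
        new_expr = [f"{v}_{version.get(v, 0)}" for v in expr]
        version[dst] = version.get(dst, 0) + 1
        out.append((f"{dst}_{version[dst]}", new_expr))
    return out

prog = [("x", ["a"]), ("x", ["x"]), ("y", ["x"])]
print(to_single_assignment(prog))
# [('x_1', ['a_0']), ('x_2', ['x_1']), ('y_1', ['x_2'])]
```

After renaming, the chain x_1 -> x_2 -> y_1 is explicit in the names themselves, which is the property that makes dependence analysis on DSA form simple.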
Abstract Dependences for Alarm Diagnosis
Abstract
We propose a framework for dependence analyses, adapted, among other uses, to the understanding of static analyzer outputs. Static analyzers like Astrée are sound but not complete; hence, they may yield false alarms, that is, report being unable to prove part of the properties of interest. Helping the user in the alarm inspection task is a major challenge for current static analyzers. Semantic slicing, i.e., the computation of precise abstract invariants for a set of erroneous traces, provides a useful characterization of a possible error context. We propose to enhance semantic slicing with information about abstract dependences. Abstract dependences should be more informative than mere dependences: first, we propose to restrict attention to the dependences that can be observed in a slice; second, we define dependences among abstract properties, so as to isolate abnormal behaviors as sources of errors. Last, stronger notions of slicing should make it possible to restrict slices to such dependences.
Xavier Rival

Session 6

A Typed, Compositional Logic for a Stack-Based Abstract Machine
Abstract
We define a compositional program logic in the style of Floyd and Hoare for a simple, typed, stack-based abstract machine with unstructured control flow, global variables and mutually recursive procedure calls. Notable features of the logic include a careful treatment of auxiliary variables and quantification and the use of substructural typing to permit local, modular reasoning about program fragments. Semantic soundness is established using an interpretation of types and assertions defined by orthogonality with respect to sets of contexts.
Nick Benton
A New Occurrence Counting Analysis for BioAmbients
Abstract
This paper concerns the application of formal methods to biological systems, modelled specifically in BioAmbients [30], a variant of the Mobile Ambients (MA) [7] calculus designed to capture basic biological concepts more faithfully. We propose a new static analysis for BioAmbients which computes approximate information about the run-time behaviour of a system. The analysis is derived following the abstract interpretation approach and introduces two main novelties with respect to the analyses in the literature [25,24,26,27]: (i) it records information about the number of occurrences of objects; (ii) it maintains more detailed information about the possible contents of ambients at any time. In this way, the analysis gives substantially more precise results and captures both the quantitative and causal aspects which are important for reasoning about the temporal and spatial structure of biological systems. The interest of the analysis is demonstrated on a few simple examples which point out the limitations of the existing analyses for BioAmbients.
Roberta Gori, Francesca Levi
A Parametric Model for the Analysis of Mobile Ambients
Abstract
In this paper we propose a new parametric abstract finite model of Mobile Ambients able to express several properties of processes. The model can be used for the analysis of these properties by means of model-checking techniques. The precision of the model can be increased by adjusting certain numeric parameters, thereby progressively eliminating false counterexamples from the analysis.
Dino Distefano
On the Rôle of Abstract Non-interference in Language-Based Security
Abstract
In this paper, we illustrate the rôle of the notion of Abstract Non-Interference in language-based security, by explaining how it models both the weakening of attackers’ observational capability, and the declassification of private information. Namely, we show that in abstract non-interference we model both attackers that can only observe properties of public data, and private properties that can or cannot flow. Moreover, we deepen the understanding of abstract non-interference by comparing it, by means of examples, with some of the most interesting approaches to the weakening of non-interference, such as the PER model, robust declassification, delimited release and relaxed non-interference.
Isabella Mastroeni
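A toy model of the weakened attacker can make the idea concrete. In this hypothetical Python sketch (not the paper's formalization), the attacker observes only an abstraction of the public output, here its parity, and non-interference is checked relative to that observer.

```python
# Toy abstract non-interference check: a program may leak its secret to
# an attacker who sees the exact public output, yet be secure against an
# attacker who only observes an abstract property of it.

def program(high, low):
    return low + 2 * high          # exact output depends on the secret

parity = lambda n: n % 2

def secure_for(observer, highs, lows):
    """Non-interference w.r.t. the observer's abstraction: for each public
    input, the observed output must not vary with the secret input."""
    return all(
        len({observer(program(h, l)) for h in highs}) == 1
        for l in lows
    )

highs, lows = range(5), range(5)
print(secure_for(parity, highs, lows))       # True: parity hides the secret
print(secure_for(lambda n: n, highs, lows))  # False: exact output leaks it
```

Varying the observer function is exactly the knob the abstract describes: a coarser abstraction models a weaker attacker, and the properties that do flow through it characterize what is declassified.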
A Next-Generation Platform for Analyzing Executables
Abstract
In recent years, there has been a growing need for tools that an analyst can use to understand the workings of COTS components, plugins, mobile code, and DLLs, as well as memory snapshots of worms and virus-infected code. Static analysis provides techniques that can help with such problems; however, there are several obstacles that must be overcome:
– For many kinds of potentially malicious programs, symbol-table and debugging information is entirely absent. Even if it is present, it cannot be relied upon.
– To understand memory-access operations, it is necessary to determine the set of addresses accessed by each operation. This is difficult because
  • While some memory operations use explicit memory addresses in the instruction (easy), others use indirect addressing via address expressions (difficult).
  • Arithmetic on addresses is pervasive. For instance, even when the value of a local variable is loaded from its slot in an activation record, address arithmetic is performed.
  • There is no notion of type at the hardware level, so address values cannot be distinguished from integer values.
  • Memory accesses do not have to be aligned, so word-sized address values could potentially be cobbled together from misaligned reads and writes.
We have developed static-analysis algorithms to recover information about the contents of memory locations and how they are manipulated by an executable. By combining these analyses with facilities provided by the IDAPro and CodeSurfer toolkits, we have created CodeSurfer/x86, a prototype tool for browsing, inspecting, and analyzing x86 executables. From an x86 executable, CodeSurfer/x86 recovers intermediate representations that are similar to what would be created by a compiler for a program written in a high-level language. CodeSurfer/x86 also supports a scripting language, as well as several kinds of sophisticated pattern-matching capabilities. These facilities provide a platform for the development of additional tools for analyzing the security properties of executables.
T. Reps, G. Balakrishnan, J. Lim, T. Teitelbaum
Backmatter
Metadata
Title
Programming Languages and Systems
Edited by
Kwangkeun Yi
Copyright Year
2005
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-32247-4
Print ISBN
978-3-540-29735-2
DOI
https://doi.org/10.1007/11575467