Fairness: A Formal-Methods Perspective

Abstract

Sensitive decisions of large-scale societal impact are increasingly being delegated to opaque software—a trend that is unlikely to slow down in the near future. The issue of fairness and bias of decision-making algorithms has thus become a multifaceted, interdisciplinary concern, attracting the attention of computer scientists, law scholars, policy makers, journalists, and many others. In this expository paper, I will outline some of the research questions we have been studying about fairness through the lens of formal methods.

Aws Albarghouthi

The MISRA C Coding Standard and its Role in the Development and Analysis of Safety- and Security-Critical Embedded Software

Abstract

The MISRA project started in 1990 with the mission of providing world-leading best practice guidelines for the safe and secure application of both embedded control systems and standalone software. MISRA C is a coding standard defining a subset of the C language, initially targeted at the automotive sector, but now adopted across all industry sectors that develop C software in safety- and/or security-critical contexts. In this paper, we introduce MISRA C, its role in the development of critical software, especially in embedded systems, its relevance to industry safety standards, as well as the challenges of working with a general-purpose programming language standard that is written in natural language with a slow evolution over the last 40+ years. We also outline the role of static analysis in the automatic checking of compliance with respect to MISRA C, and the role of the MISRA C language subset in enabling a wider application of formal methods to industrial software written in C.

Roberto Bagnara, Abramo Bagnara, Patricia M. Hill

Numerical Invariants via Abstract Machines

Abstract

This paper presents an overview of a line of recent work on generating non-linear numerical invariants for loops and recursive procedures. The method is compositional in the sense that it operates by breaking the program into parts, analyzing each part independently, and then combining the results. The fundamental challenge is to devise an effective method for analyzing the behavior of a loop given the results of analyzing its body. The key idea is to separate the problem into two: first we approximate the loop dynamics by an abstract machine, and then symbolically compute the reachability relation of the abstract machine.

Zachary Kincaid

Deductive Verification in Decidable Fragments with Ivy

Abstract

This paper surveys the work to date on Ivy, a language and a tool for the formal specification and verification of distributed systems. Ivy supports deductive verification using automated provers, model checking, automated testing, manual theorem proving and generation of executable code. In order to achieve greater verification productivity, a key design goal for Ivy is to allow the engineer to apply automated provers in the realm in which their performance is relatively predictable, stable and transparent. In particular, Ivy focuses on the use of decidable fragments of first-order logic. We consider the rationale for Ivy’s design, the various capabilities of the tool, as well as case studies and applications.

Kenneth L. McMillan, Oded Padon

Experience Developing and Deploying Concurrency Analysis at Facebook

Abstract

This paper tells the story of the development of RacerD, a static program analysis for detecting data races that is in production at Facebook. The technical details of RacerD are described in a separate paper; we concentrate here on how the project unfolded from a human point of view. The paper describes, in this specific case, the benefits of feedback between science and engineering, the tension encountered between principle and compromise, and how being flexible and adaptable in the presence of a changing engineering context can lead to surprising results which far exceed initial expectations. I hope to give the reader an impression of what it is like to develop advanced static analyses in industry, how it is both different from and similar to developing analyses for the purpose of advancing science.

Peter O’Hearn

New Applications of Software Synthesis: Verification of Configuration Files and Firewall Repair

Abstract

The main goal of software synthesis is to automatically derive code from a given specification. The specification can be either explicitly written, or specified through a couple of representative examples illustrating the user’s intent. However, sometimes there is no specification and we need to infer the specification from a given environment. This paper presents two such efforts.

We first show, using verification for configuration files, how to learn a specification when the given examples are actually a set of configuration files. Software failures resulting from configuration errors have become commonplace as modern software systems grow increasingly large and more complex. The lack of language constructs in configuration files, such as types and grammars, has directed the focus of configuration file verification towards building post-failure error diagnosis tools. We describe a framework which analyzes data sets of correct configuration files and derives rules for building a language model from the given data set. The resulting language model can be used to verify new configuration files and detect errors in them.

We next describe a systematic effort that can automatically repair firewalls, using the programming by example approach. Firewalls are widely employed to manage and control enterprise networks. Because enterprise-scale firewalls contain hundreds or thousands of policies, ensuring the correctness of firewalls – whether the policies in the firewalls meet the specifications of their administrators – is an important but challenging problem. In our approach, after an administrator observes undesired behavior in a firewall, she may provide input/output examples that comply with the intended behavior. Based on the given examples, we automatically synthesize new firewall rules for the existing firewall. This new firewall correctly handles packets specified by the examples, while maintaining the rest of the behavior of the original firewall.

Ruzica Piskac

Interactive Verification of Distributed Protocols Using Decidable Logic

Abstract

Distributed systems are becoming more and more pervasive in our lives, making their correctness crucial. Unfortunately, distributed systems are notoriously hard to get right and verify. Due to the infinite state space (e.g., unbounded number of nodes and messages) and the complexity of the protocols used, verification of such systems is both undecidable and hard in practice.

Sharon Shoham

Abstract Interpretation of Stateful Networks

Abstract

Modern networks achieve robustness and scalability by maintaining states on their nodes. These nodes are referred to as middleboxes and are essential for network functionality. However, the presence of middleboxes drastically complicates the task of network verification. Previous work showed that the problem is undecidable in general and EXPSPACE-complete when abstracting away the order of packet arrival.

We describe a new algorithm for conservatively checking isolation properties of stateful networks. The asymptotic complexity of the algorithm is polynomial in the size of the network, albeit being exponential in the maximal number of queries of the local state that a middlebox can do, which is often small.

Our algorithm is sound, i.e., it can never miss a violation of safety but may fail to verify some properties. The algorithm performs on-the-fly abstract interpretation by (1) abstracting away the order of packet processing and the number of times each packet arrives, (2) abstracting away correlations between states of different middleboxes and channel contents, and (3) representing middlebox states by their effect on each packet separately, rather than taking into account the entire state space. We show that the abstractions do not lose precision when middleboxes may reset in any state. This is encouraging since many real middleboxes reset, e.g., after some session timeout is reached or due to hardware failure.

Kalev Alpernas, Roman Manevich, Aurojit Panda, Mooly Sagiv, Scott Shenker, Sharon Shoham, Yaron Velner

Block-Size Independence for GPU Programs

Abstract

Optimizing GPU programs by tuning execution parameters is essential to realizing the full performance potential of GPU hardware. However, many of these optimizations do not ensure correctness and subtle errors can enter while optimizing a GPU program. Further, lack of formal models and the presence of non-trivial transformations prevent verification of optimizations.

In this work, we verify transformations involved in tuning the execution parameter, block-size. First, we present a formal programming and execution model for GPUs, and then formalize block-size independence of GPU programs, which ensures tuning block-size preserves program semantics. Next, we present an inter-procedural analysis to verify block-size independence for synchronization-free GPU programs. Finally, we evaluate the analysis on the Nvidia CUDA SDK samples, where 35 global kernels are verified to be block-size independent.
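To make the notion concrete, the following is a minimal Python simulation of a one-dimensional grid launch. It is only an illustration under simplifying assumptions, not the paper's formal model or its analysis: threads are run sequentially, kernels are synchronization-free, and the names `launch`, `scale_kernel` and `block_local_kernel` are hypothetical. In this toy setting a kernel is block-size independent if its output is the same for every way of splitting the same number of threads into blocks.

```python
def launch(kernel, n_threads, block_size, data):
    """Sequentially simulate a 1-D launch of n_threads threads in blocks of block_size."""
    assert n_threads % block_size == 0
    out = list(data)
    for block_idx in range(n_threads // block_size):
        for thread_idx in range(block_size):
            kernel(block_idx, block_size, thread_idx, data, out)
    return out


def scale_kernel(block_idx, block_dim, thread_idx, data, out):
    # Depends only on the global thread id, so the block size does not matter.
    gid = block_idx * block_dim + thread_idx
    out[gid] = 2 * data[gid]


def block_local_kernel(block_idx, block_dim, thread_idx, data, out):
    # Uses the block-local thread index directly, so the result changes
    # when the same threads are grouped into blocks of a different size.
    gid = block_idx * block_dim + thread_idx
    out[gid] = data[gid] + thread_idx


data = list(range(8))
same = launch(scale_kernel, 8, 2, data) == launch(scale_kernel, 8, 4, data)
differs = launch(block_local_kernel, 8, 2, data) != launch(block_local_kernel, 8, 4, data)
```

Comparing two concrete block sizes can only refute independence; establishing it for all block sizes is what a static analysis such as the one described above is for.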

Rajeev Alur, Joseph Devietti, Nimit Singhania

Extending Constraint-Only Representation of Polyhedra with Boolean Constraints

Abstract

We propose a new relational abstract domain for analysing programs with numeric and Boolean variables. The main idea is to represent an abstract state as a set of linear constraints over numeric variables, with every constraint being enabled by a formula over Boolean variables. This allows us, unlike in some existing approaches, to avoid duplicating linear constraints shared by multiple Boolean formulas. To perform domain operations, we adapt algorithms from constraint-only representation of convex polyhedra, most importantly Fourier-Motzkin elimination and projection-based convex hull. We made a prototype implementation of the new domain in our abstract interpreter for Horn clauses. Our initial experiments are, in our opinion, promising and show directions for future improvement.

Alexey Bakhirkin, David Monniaux

An Efficient Abstract Domain for Not Necessarily Closed Polyhedra

Abstract

We present a construction of the abstract domain of NNC (not necessarily topologically closed) polyhedra based on a recently introduced variant of the double description representation and conversion procedure. We describe the implementation of the operators needed to interface the new abstract domain with commonly available static analysis tools, highlighting the efficiency gains enabled by the new representation. We also reconsider the widening operator for NNC polyhedra, proposing a more appropriate specification based on the semantics of the domain elements, rather than their low level representation details. Finally, we provide an experimental evaluation comparing the efficiency of the new abstract domain with respect to more classical implementations.

Anna Becchi, Enea Zaffanella

Modular Software Fault Isolation as Abstract Interpretation

Abstract

Software Fault Isolation (SFI) consists in transforming untrusted code so that it runs within a specific address space (called the sandbox) and verifying at load-time that the binary code does indeed stay inside the sandbox. Security is guaranteed solely by the SFI verifier whose correctness therefore becomes crucial. Existing verifiers enforce a very rigid, almost syntactic policy where every memory access and every control-flow transfer must be preceded by a sandboxing instruction sequence, and where calls outside the sandbox must implement a sophisticated protocol based on a shadow stack. We propose to define SFI as a defensive semantics, with the purpose of deriving semantically sound verifiers that admit flexible and efficient implementations of SFI. We derive an executable analyser, that works on a per-function basis, which ensures that the defensive semantics does not go wrong, and hence that the code is well isolated. Experiments show that our analyser exhibits the desired flexibility: it validates correctly sandboxed code, it catches code breaking the SFI policy, and it can validate programs where redundant instrumentations are optimised away.

Frédéric Besson, Thomas Jensen, Julien Lepiller

Closing the Performance Gap Between Doubles and Rationals for Octagons

Abstract

Octagons have enduring appeal because their domain operations are simple, readily mapping to for-loops which apply max, min and sum to the entries of a Difference Bound Matrix (DBM). In the quest for efficiency, arithmetic is often realised with double-precision floating-point, albeit at the cost of the certainty provided by arbitrary-precision rationals. In this paper we show how Compact DBMs (CoDBMs), which have recently been proposed as a memory refinement for DBMs, enable arithmetic calculation to be short-circuited in various domain operations. We also show how comparisons can be avoided by changing the tables which underpin CoDBMs. From the perspective of implementation, the optimisations are attractive because they too are conceptually simple, following the ethos of Octagons. Yet they can halve the running time on rationals, putting CoDBMs on rationals on a par with DBMs on doubles.
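The remark that DBM operations map to simple for-loops can be illustrated with the standard shortest-path (Floyd-Warshall) closure of a difference bound matrix. This is a minimal sketch over a plain DBM with exact rationals, not the CoDBM representation of the paper; the octagon-specific strengthening step is omitted, and the convention that `dbm[i][j]` bounds `v_i - v_j` is an assumption made for the example.

```python
from fractions import Fraction
import math

INF = math.inf  # stands for "no constraint"


def close(dbm):
    """Shortest-path closure of a DBM where dbm[i][j] means v_i - v_j <= dbm[i][j]."""
    n = len(dbm)
    m = [row[:] for row in dbm]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Sum of two bounds, then keep the smaller (tighter) one.
                via_k = m[i][k] + m[k][j]
                if via_k < m[i][j]:
                    m[i][j] = via_k
    return m


# Variables x, y, z at indices 0, 1, 2 with x - y <= 1/3 and y - z <= 1/6.
zero = Fraction(0)
dbm = [
    [zero, Fraction(1, 3), INF],
    [INF, zero, Fraction(1, 6)],
    [INF, INF, zero],
]
closed = close(dbm)
```

After closure, `closed[0][2]` holds the derived bound on `x - z`, computed exactly because the entries are `Fraction` values rather than doubles.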

Aziem Chawdhary, Andy King

Verifying Properties of Differentiable Programs

Abstract

There is growing demand for formal verification methods in the scientific and high performance computing communities. For scientific applications, it is not only necessary to verify the absence of violations such as out of bounds access or race conditions, but also to ensure that the results satisfy certain mathematical properties. In this work, we explore the limits of automated bounded verification in the verification of these programs by applying the symbolic execution tool CIVL to some numerical algorithms that are frequently used in scientific programs, namely a conjugate gradient solver, a finite difference stencil, and a mesh quality metric. These algorithms implement differentiable functions, allowing us to use the automatic differentiation tools Tapenade and ADIC in the creation of their specifications.

Jan Hückelheim, Ziqing Luo, Sri Hari Krishna Narayanan, Stephen Siegel, Paul D. Hovland

A Reduced Product of Absolute and Relative Error Bounds for Floating-Point Analysis

Abstract

Rigorous estimation of bounds on errors in finite precision computation has become a key point of many formal verification tools. The primary interest of the use of such tools is generally to obtain worst-case bounds on the absolute errors. However, the natural bound on the elementary error committed by each floating-point arithmetic operation is a bound on the relative error, which suggests that relative error bounds could also play a role in the process of computing tight error estimations. In this work, we introduce a very simple interval-based abstraction, combining absolute and relative error propagations. We demonstrate with a prototype implementation how this simple product allows us in many cases to improve absolute error bounds, and often even to compare favorably with state-of-the-art tools that rely on much more costly relational abstractions or optimization-based estimations.
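One way to picture how absolute and relative error bounds can refine each other is the reduction step sketched below. It is a hypothetical illustration, not the authors' domain: the class `ErrBound` and the function `reduce_bounds` are invented names, and the bound arithmetic is done in ordinary floats without outward rounding, so the sketch is not itself rigorous. For an exact value known to lie in `[lo, hi]`, a relative bound implies an absolute one via the largest magnitude in the range, and an absolute bound implies a relative one via the smallest magnitude, provided the range excludes zero.

```python
import math
from dataclasses import dataclass


@dataclass
class ErrBound:
    lo: float        # lower bound on the exact value
    hi: float        # upper bound on the exact value
    abs_err: float   # bound on |computed - exact|
    rel_err: float   # bound on |computed - exact| / |exact|


def reduce_bounds(v):
    """Tighten each error bound using the other one and the value range."""
    max_abs = max(abs(v.lo), abs(v.hi))
    contains_zero = v.lo <= 0.0 <= v.hi
    abs_err, rel_err = v.abs_err, v.rel_err
    if not math.isinf(v.rel_err):
        # |err| <= rel_err * |exact| <= rel_err * max_abs
        abs_err = min(abs_err, v.rel_err * max_abs)
    if not contains_zero and not math.isinf(v.abs_err):
        # |err| / |exact| <= abs_err / min_abs
        min_abs = min(abs(v.lo), abs(v.hi))
        rel_err = min(rel_err, v.abs_err / min_abs)
    return ErrBound(v.lo, v.hi, abs_err, rel_err)


# A value in [1, 2] with one rounding: relative bound known, absolute unknown.
a = reduce_bounds(ErrBound(1.0, 2.0, math.inf, 2.0 ** -53))
# A value in [4, 8] with a known absolute bound but no relative bound.
b = reduce_bounds(ErrBound(4.0, 8.0, 2.0 ** -10, math.inf))
```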

Maxime Jacquemin, Sylvie Putot, Franck Védrine

Modular Static Analysis of String Manipulations in C Programs

Abstract

We present a modular analysis able to tackle out-of-bounds accesses in C strings. This analyzer is modular in the sense that it infers and tabulates (for reuse) input/output relations, automatically partitioned according to the shape of the input state. We show how the inter-procedural iterator discovers and generalizes contracts in order to improve their reusability for further analysis. This analyzer was implemented and was able to successfully analyze and infer relational contracts for functions such as strcpy and strcat.

Matthieu Journault, Antoine Miné, Abdelraouf Ouadjaout

Verifying Bounded Subset-Closed Hyperproperties

Abstract

Hyperproperties are quickly becoming very popular in the context of systems security, due to their expressive power. They differ from classic trace properties since they are represented by sets of sets of executions instead of sets of executions. This allows us, for instance, to capture information flow security specifications, which cannot be expressed as trace properties, namely as predicates over single executions. In this work, we reason about how it is possible to move standard abstract interpretation-based static analysis methods, designed for trace properties, towards the verification of hyperproperties. In particular, we focus on the verification of bounded subset-closed hyperproperties which are easier to verify than generic hyperproperties. It turns out that a lot of interesting specifications (e.g., Non-Interference) lie in this category.
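As a small illustration of why Non-Interference is a property of sets of executions rather than of single executions, the sketch below checks it by brute force on a finite set of runs. This is plain enumeration, not the abstract interpretation-based method of the paper, and the names `executions`, `non_interferent`, `safe` and `leaky` are hypothetical. An execution is modelled as a triple (high input, low input, low output); a violation is always witnessed by a pair of executions, and every subset of a non-interferent set is again non-interferent.

```python
from itertools import combinations, product


def executions(program, highs, lows):
    """All runs of program over the given secret (high) and public (low) inputs."""
    return {(h, l, program(h, l)) for h, l in product(highs, lows)}


def non_interferent(execs):
    """Any two runs that agree on the low input must agree on the low output."""
    return all(
        o1 == o2
        for (h1, l1, o1), (h2, l2, o2) in combinations(execs, 2)
        if l1 == l2
    )


def safe(h, l):
    return 2 * l            # output ignores the secret


def leaky(h, l):
    return l + (h % 2)      # output reveals the parity of the secret


safe_runs = executions(safe, range(4), range(3))
leaky_runs = executions(leaky, range(4), range(3))
safe_ok = non_interferent(safe_runs)
leaky_ok = non_interferent(leaky_runs)
```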

Isabella Mastroeni, Michele Pasqua

Process-Local Static Analysis of Synchronous Processes

Abstract

We develop a modular approach to statically analyse imperative processes communicating by synchronous message passing. The approach is modular in that it only needs to analyze one process at a time, but will in general have to do so repeatedly. The approach combines lattice-valued regular expressions to capture network communication with a dedicated shuffle operator for composing individual process analysis results. We present both a soundness proof and a prototype implementation of the approach for a synchronous subset of the Go programming language. Overall our approach tackles the combinatorial explosion of concurrent programs by suitable static analysis approximations, thereby lifting traditional sequential analysis techniques to a concurrent setting.

Jan Midtgaard, Flemming Nielson, Hanne Riis Nielson

The Impact of Program Transformations on Static Program Analysis

Abstract

Semantics-preserving program transformations, such as those carried out by an optimizing compiler, can affect the results of static program analyses. In the best cases, a transformation increases precision or allows a simpler analysis to replace a complex one. In other cases, transformations have the opposite effect, reducing precision. This work constructs a theoretical framework to analyze this intriguing phenomenon. The framework provides a simple, uniform explanation for precision changes, linking them to bisimulation relations that justify the correctness of a transformation. It offers a mechanism for recovering lost precision through the systematic construction of a new, bisimulating analysis. Furthermore, it is shown that program analyses defined over a class of composite domains can be factored into a program transformation followed by simpler, equally precise analyses of the target program.

Kedar S. Namjoshi, Zvonimir Pavlinovic

Efficiently Learning Safety Proofs from Appearance as well as Behaviours

Abstract

Proving safety of programs relies principally on discovering invariants that are inductive and adequate. Obtaining such invariants, therefore, has been studied widely from diverse perspectives, including even mining them from the input program’s source in a guess-and-check manner [13]. However, guessing candidates based on syntactical constructions of the source code has its limitations. For one, a required invariant may not manifest on the syntactic surface of the program. Secondly, a poor guess may give rise to a series of expensive checks. Furthermore, unlike conjunctions, refining disjunctive invariant candidates is not straightforward and may frequently cause the proof search to diverge. This paper attempts to overcome these limitations by learning from both the appearance and the behaviours of a program. We present an algorithm that (i) infers useful invariants by observing a program’s syntactic source as well as its semantics, and (ii) looks for conditional invariants, in the form of implications, that are guided by counterexamples to inductiveness. Our experiments demonstrate its benefits on several benchmarks taken from SV-COMP and the literature.

Sumanth Prabhu, Kumar Madhukar, R. Venkatesh

Invertible Linear Transforms of Numerical Abstract Domains

Abstract

We study systematic changes of numerical domains in abstract interpretation through invertible linear transforms of the Euclidean vector space, namely, through invertible real square matrices. We provide a full generalization, including abstract transfer functions, of the parallelotopes abstract domain, which turns out to be an instantiation of an invertible linear transform to the interval abstraction. Given an invertible square matrix M and a numerical abstraction A, we show that for a linear program P (i.e., using linear assignments and linear tests only), the analysis using the linearly transformed domain M(A) can be obtained by analysing on the original domain A a linearly transformed program \(P^M\). We also investigate completeness of abstract domains for invertible linear transforms. In particular, we show that, perhaps counterintuitively, octagons are not complete for \(45^{\circ }\) rotations and, additionally, cannot be derived as a complete refinement of intervals for some family of invertible linear transforms.
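The idea of applying the interval abstraction in linearly transformed coordinates can be sketched in two dimensions as follows. This is only an illustration on finite point sets, with no transfer functions, and the helper names `box`, `apply`, `abstract` and `contains` are hypothetical. The matrix `M` below (a scaled 45-degree rotation combined with a reflection, determinant -2) is an arbitrary choice for the example; it turns interval constraints into bounds on `x + y` and `x - y`.

```python
def box(points):
    """Smallest axis-aligned box containing the points."""
    xs, ys = zip(*points)
    return (min(xs), max(xs)), (min(ys), max(ys))


def apply(m, p):
    """Multiply the 2x2 matrix m by the point p."""
    (a, b), (c, d) = m
    x, y = p
    return (a * x + b * y, c * x + d * y)


def abstract(m, points):
    """Interval abstraction of the points after transforming them by m."""
    return box([apply(m, p) for p in points])


def contains(m, bx, p):
    """Is p in the concretization of the box bx taken in m-coordinates?"""
    (xl, xh), (yl, yh) = bx
    u, v = apply(m, p)
    return xl <= u <= xh and yl <= v <= yh


IDENTITY = ((1, 0), (0, 1))
M = ((1, 1), (1, -1))  # invertible: determinant is -2

segment = [(0, 0), (1, 1), (2, 2)]     # points on the diagonal x = y
plain = abstract(IDENTITY, segment)    # ordinary intervals
rotated = abstract(M, segment)         # intervals on x + y and x - y

spurious_in_plain = contains(IDENTITY, plain, (0, 2))
spurious_in_rotated = contains(M, rotated, (0, 2))
```

The point (0, 2) is not on the segment; plain intervals cannot exclude it, while the transformed domain records that `x - y` is 0 and therefore does.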

Francesco Ranzato, Marco Zanella

Incremental Verification Using Trace Abstraction

Abstract

Despite the increasing effectiveness of model checking tools, automatically re-verifying a program whenever a new revision of it is created is often not feasible using existing tools. Incremental verification aims at facilitating this re-verification, by reusing partial results. In this paper, we propose a novel approach for incremental verification that is based on trace abstraction. Trace abstraction is an automata-based verification technique in which the program is proved correct using a sequence of automata. We present two algorithms that reuse this sequence across different revisions, one eagerly and one lazily. We demonstrate their effectiveness in an extensive experimental evaluation on a previously established benchmark set for incremental verification based on different revisions of device drivers from the Linux kernel. Our algorithm is able to achieve significant speedups on this set, compared to both stand-alone verification and previous approaches.

Bat-Chen Rothenberg, Daniel Dietsch, Matthias Heizmann

Volume-Based Merge Heuristics for Disjunctive Numeric Domains

Abstract

Static analysis of numeric programs allows proving important properties of programs such as a lack of buffer overflows, division by zero, or integer overflow. By using convex numeric abstractions, such as polyhedra, octagons, or intervals, representations of program states are concise and the analysis operations are efficient. Unfortunately, many sets of program states can only be very imprecisely represented with a single convex numeric abstraction. This means that many important properties cannot be proven using only these abstractions. One solution to this problem is to use powerset abstractions where a set of convex numeric abstractions represents the union rather than the hull of those state sets. This leads to a new challenge: when to merge elements of the powerset and when to keep them separate. We present a new methodology for determining when to merge based on counting and volume arguments. Unlike previous techniques, this heuristic directly represents losses in precision through hull computations. In this paper we develop these techniques and show their utility on a number of programs from the SV-COMP and WCET benchmark suites.
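The merge question can be pictured with boxes (products of intervals). The sketch below is a hypothetical criterion in the spirit of a volume argument, not the heuristic defined in the paper: it merges two boxes when the fraction of their hull that lies outside their union is at most `max_loss`, a threshold invented for this example. Boxes are lists of `(lo, hi)` pairs, and degenerate boxes of zero volume are not treated specially beyond avoiding division by zero.

```python
def volume(b):
    """Volume of a box; empty or degenerate dimensions give 0."""
    v = 1
    for lo, hi in b:
        v *= max(0, hi - lo)
    return v


def hull(a, b):
    """Smallest box containing both a and b."""
    return [(min(al, bl), max(ah, bh)) for (al, ah), (bl, bh) in zip(a, b)]


def meet(a, b):
    """Intersection of a and b (possibly empty, i.e. some hi < lo)."""
    return [(max(al, bl), min(ah, bh)) for (al, ah), (bl, bh) in zip(a, b)]


def should_merge(a, b, max_loss=0.1):
    """Merge when the hull adds at most max_loss of its volume in new states."""
    h = volume(hull(a, b))
    if h == 0:
        return True
    union = volume(a) + volume(b) - volume(meet(a, b))
    return (h - union) / h <= max_loss


left = [(0, 2), (0, 2)]
right = [(2, 4), (0, 2)]      # adjacent to left: the hull adds nothing
far = [(10, 12), (10, 12)]    # distant: the hull is mostly spurious states

merge_adjacent = should_merge(left, right)
merge_far = should_merge(left, far)
```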

Andrew Ruef, Kesha Hietala, Arlen Cox

Abstract Interpretation of CTL Properties

Abstract

CTL is a temporal logic commonly used to express program properties. Most of the existing approaches for proving CTL properties only support certain classes of programs, limit their scope to a subset of CTL, or do not directly support certain existential CTL formulas. This paper presents an abstract interpretation framework for proving CTL properties that does not suffer from these limitations. Our approach automatically infers sufficient preconditions, and thus provides useful information even when a program satisfies a property only for some inputs. We systematically derive a program semantics that precisely captures CTL properties by abstraction of the operational trace semantics of a program. We then leverage existing abstract domains based on piecewise-defined functions to derive decidable abstractions that are suitable for static program analysis. To handle existential CTL properties, we augment these abstract domains with under-approximating operators. We implemented our approach in a prototype static analyzer. Our experimental evaluation demonstrates that the analysis is effective, even for CTL formulas with non-trivial nesting of universal and existential path quantifiers, and performs well on a wide variety of benchmarks.

Caterina Urban, Samuel Ueltschi, Peter Müller

Inductive Termination Proofs with Transition Invariants and Their Relationship to the Size-Change Abstraction

Abstract

Transition invariants are a popular technique for automated termination analysis. A transition invariant is a covering of the transitive closure of the transition relation of a program by a finite number of well-founded relations. The covering is usually established by an inductive proof using transition predicate abstraction. Such inductive termination proofs have the structure of a finite automaton. These automata, which we call transition automata, offer a rich structure that has not been exploited in previous publications. We establish a new connection between transition automata and the size-change abstraction, which is another widespread technique for automated termination analysis. In particular, we are able to transfer recent results on automated complexity analysis with the size-change abstraction to transition invariants.

Florian Zuleger

Static Analysis

25th International Symposium, SAS 2018, Freiburg, Germany, August 29–31, 2018, Proceedings

