main-content

## Über dieses Buch

This book constitutes the proceedings of the 12th International Symposium on NASA Formal Methods, NFM 2020, held in Moffett Field, CA, USA, in May 2020.*

The 20 full and 5 short papers presented in this volume were carefully reviewed and selected from 62 submissions. The papers are organized in the following topical sections: learning and formal synthesis; formal methods for DNNs; high assurance systems; requirement specification and testing; validation and solvers; solvers and program analysis; verification and times systems; autonomy and other applications; and hybrid and cyber-physical systems.

*The conference was held virtually due to the COVID-19 pandemic.

## Inhaltsverzeichnis

Open Access

### Correction to: NASA Formal Methods

Ritchie Lee, Susmit Jha, Anastasia Mavridou, Dimitra Giannakopoulou

### From Passive to Active: Learning Timed Automata Efficiently

Abstract
Model-based testing is a promising technique for quality assurance. In practice, however, a model is not always present. Hence, model learning techniques attain increasing interest. Still, many learning approaches can only learn relatively simple types of models and advanced properties like time are ignored in many cases. In this paper we present an active model learning technique for timed automata. For this, we build upon an existing passive learning technique for real-timed systems. Our goal is to efficiently learn a timed system while simultaneously minimizing the set of training data. For evaluation we compared our active to the passive learning technique based on 43 timed systems with up to 20 locations and multiple clock variables. The results of $$18\,060$$ experiments show that we require only 100 timed traces to adequately learn a timed system. The new approach is up to 755 times faster.
Bernhard K. Aichernig, Andrea Pferscher, Martin Tappler

### Generating Correct-by-Construction Distributed Implementations from Formal Maude Designs

Abstract
Developing a reliable distributed system meeting desired performance requirements is a hard and labor-intensive task. Formal specification and analysis of a system design can yield correct designs as well as reliable performance predictions. In this paper we present a correct-by-construction automatic transformation mapping such a verified formal specification of a system design in Maude to a distributed implementation satisfying the same safety and liveness properties. Two case studies applying this transformation to state-of-the-art distributed transaction systems show that high-quality implementations with acceptable performance and meeting performance predictions can be automatically generated. In this way, formal models of distributed systems analyzed within the same formal framework for both logical and performance properties are automatically transformed into correct-by-construction implementations for which similar performance trends can be shown.
Si Liu, Atul Sandur, José Meseguer, Peter Csaba Ölveczky, Qi Wang

### Parameter Synthesis and Robustness Analysis of Rule-Based Models

Abstract
We introduce the Quantitative Biochemical Space Language, a rule-based language for a compact modelling of probabilistic behaviour of complex parameter-dependent biological systems. Application of rules is governed by an associated parametrised rate function, expressing partially known information about the behaviour of the modelled system. The parameter values influence the behaviour of the model. We propose a formal verification-based method for the synthesis of parameter values (parameter synthesis) which ensure the behaviour of the modelled system satisfies a given PCTL property. In addition, we demonstrate how this method can be used for robustness analysis.
Matej Troják, David Šafránek, Lukrécia Mertová, Luboš Brim

### PaRoT: A Practical Framework for Robust Deep Neural Network Training

Abstract
Deep Neural Networks (DNNs) are finding important applications in safety-critical systems such as Autonomous Vehicles (AVs), where perceiving the environment correctly and robustly is necessary for safe operation. Raising unique challenges for assurance due to their black-box nature, DNNs pose a fundamental problem for regulatory acceptance of these types of systems. Robust training—training to minimize excessive sensitivity to small changes in input—has emerged as one promising technique to address this challenge. However, existing robust training tools are inconvenient to use or apply to existing codebases and models: they typically only support a small subset of model elements and require users to extensively rewrite the training code. In this paper we introduce a novel framework, PaRoT, developed on the popular TensorFlow platform, that greatly reduces the barrier to entry. Our framework enables robust training to be performed on existing DNNs without rewrites to the model. We demonstrate that our framework’s performance is comparable to prior art, and exemplify its ease of use on off-the-shelf, trained models and its testing capabilities on a real-world industrial application: a traffic light detection network.
Edward W. Ayers, Francisco Eiras, Majd Hawasly, Iain Whiteside

### Simplifying Neural Networks Using Formal Verification

Abstract
Deep neural network (DNN) verification is an emerging field, with diverse verification engines quickly becoming available. Demonstrating the effectiveness of these engines on real-world DNNs is an important step towards their wider adoption. We present a tool that can leverage existing verification engines in performing a novel application: neural network simplification, through the reduction of the size of a DNN without harming its accuracy. We report on the work-flow of the simplification process, and demonstrate its potential significance and applicability on a family of real-world DNNs for aircraft collision avoidance, whose sizes we were able to reduce by as much as 10%.
Sumathi Gokulanathan, Alexander Feldsher, Adi Malca, Clark Barrett, Guy Katz

### Neural Simplex Architecture

Abstract
We present the Neural Simplex Architecture (NSA), a new approach to runtime assurance that provides safety guarantees for neural controllers (obtained e.g. using reinforcement learning) of autonomous and other complex systems without unduly sacrificing performance. NSA is inspired by the Simplex control architecture of Sha et al., but with some significant differences. In the traditional approach, the advanced controller (AC) is treated as a black box; when the decision module switches control to the baseline controller (BC), the BC remains in control forever. There is relatively little work on switching control back to the AC, and there are no techniques for correcting the AC’s behavior after it generates a potentially unsafe control input that causes a failover to the BC. Our NSA addresses both of these limitations. NSA not only provides safety assurances in the presence of a possibly unsafe neural controller, but can also improve the safety of such a controller in an online setting via retraining, without overly degrading its performance. To demonstrate NSA’s benefits, we have conducted several significant case studies in the continuous control domain. These include a target-seeking ground rover navigating an obstacle field, and a neural controller for an artificial pancreas system.
Dung T. Phan, Radu Grosu, Nils Jansen, Nicola Paoletti, Scott A. Smolka, Scott D. Stoller

### Strengthening Deterministic Policies for POMDPs

Abstract
The synthesis problem for partially observable Markov decision processes (POMDPs) is to compute a policy that satisfies a given specification. Such policies have to take the full execution history of a POMDP into account, rendering the problem undecidable in general. A common approach is to use a limited amount of memory and randomize over potential choices. Yet, this problem is still NP-hard and often computationally intractable in practice. A restricted problem is to use neither history nor randomization, yielding policies that are called stationary and deterministic. Previous approaches to compute such policies employ mixed-integer linear programming (MILP). We provide a novel MILP encoding that supports sophisticated specifications in the form of temporal logic constraints. It is able to handle an arbitrary number of such specifications. Yet, randomization and memory are often mandatory to achieve satisfactory policies. First, we extend our encoding to deliver a restricted class of randomized policies. Second, based on the results of the original MILP, we employ a preprocessing of the POMDP to encompass memory-based decisions. The advantages of our approach over state-of-the-art POMDP solvers lie (1) in the flexibility to strengthen simple deterministic policies without losing computational tractability and (2) in the ability to enforce the provable satisfaction of arbitrarily many specifications. The latter point allows to take trade-offs between performance and safety aspects of typical POMDP examples into account. We show the effectiveness of our method on a broad range of benchmarks.
Leonore Winterer, Ralf Wimmer, Nils Jansen, Bernd Becker

### Benchmarking Software Model Checkers on Automotive Code

Abstract
This paper reports on our experiences with verifying automotive C code by state-of-the-art open source software model checkers. The embedded C code is automatically generated from Simulink open-loop controller models. Its diverse features (decision logic, floating-point and pointer arithmetic, rate limiters and state-flow systems) and the extensive use of floating-point variables make verifying the code highly challenging. Our study reveals large discrepancies in coverage—which is at most only 20% of all requirements—and tool strength compared to results from the main annual software verification competition. A hand-crafted, simple extension of the verifier CBMC with k-induction delivers results on 63% of the requirements while the proprietary BTC EmbeddedValidator covers 80% and obtains bounded verification results for most of the remaining requirements.
Lukas Westhofen, Philipp Berger, Joost-Pieter Katoen

### Automated Requirements-Based Testing of Black-Box Reactive Systems

Abstract
We present a new approach to conformance testing of black-box reactive systems. We consider system specifications written as linear temporal logic formulas to generate tests as sequences of input/output pairs: inputs are extracted from the Büchi automata corresponding to the specifications, and outputs are obtained by feeding the inputs to the systems. Conformance is checked by comparing input/output sequences with automata traces to detect violations of the specifications. We consider several criteria for extracting tests and for stopping generation, and we compare them experimentally using both indicators of coverage and error-detection. The results show that our methodology can generate test suites with good system coverage and error-detection capability.
Massimo Narizzano, Luca Pulina, Armando Tacchella, Simone Vuotto

### Formal Verification of Parallel Prefix Sum

Abstract
With the advent of dedicated hardware for multicore programming, parallel algorithms have become omnipresent. For example, various algorithms have been proposed for the parallel computation of a prefix sum in the literature. As the prefix sum is a basic building block for many other multicore algorithms, such as sorting, its correctness is of utmost importance. This means, the algorithm should be functionally correct, and the implementation should be thread and memory safe.
In this paper, we use deductive program verification based on permission-based separation logic, as supported by VerCors, to show correctness of the two most frequently used parallel in-place prefix sum algorithms for an arbitrary array size. Interestingly, the correctness proof for the second algorithm reuses the auxiliary lemmas that we needed to create the first proof. To the best of our knowledge, this paper is the first tool-supported verification of functional correctness of the two parallel in-place prefix sum algorithms which does not make any assumption about the size of the input array.
Mohsen Safari, Wytse Oortwijn, Sebastiaan Joosten, Marieke Huisman

### Specification Quality Metrics Based on Mutation and Inductive Incremental Model Checking

Abstract
When using formal verification on Simulink or SCADE models, an important question about their certification is how well the specified properties cover the entire model. A method using unsatisfiable cores and inductive model checking called IVC (Inductive Validity Cores) has been recently proposed within modern SMT-based model checkers such as JKind. The IVC algorithm determines a minimal set of model elements necessary to establish a proof and gives back the traceability to the design elements (lines of code) necessary for the proof. These metrics are interesting but are rather coarse grain for certification purposes.
In this paper, we propose to use mutation combined with incremental inductive model checking to give more precision and quality to the traceability process and look inside the lines of code. Our algorithm, based on the result of IVC, mutates the source code to determine which parts inside a line of code have an impact on the properties (killed mutants) and which parts have no impact on the properties (survived mutants). Furthermore, using the incremental feature present in modern SMT-solvers, we observe that mutation can scale up to industrial models. We demonstrate the metrics first on a simple example, then on a complex industrial program and on the JKind benchmark.
Vassil Todorov, Safouan Taha, Frédéric Boulanger

### A Validation Methodology for OCaml-to-PVS Translation

Abstract
We present a methodology, called OPEV, to validate the translation between OCaml and PVS, which supports non-executable semantics. This validation occurs by generating large-scale tests for OCaml implementations, generating test lemmas for PVS, and generating proofs that automatically discharge these lemmas. OPEV incorporates an intermediate type system that captures a large subset of OCaml types, employing a variety of rules to generate test cases for each type. To prove the PVS lemmas, we developed automatic proof strategies and discharged the test lemmas using PVS Proof-Lite, a powerful proof scripting utility of the PVS verification system. We demonstrated our approach on two case studies that include two hundred and fifty-nine functions selected from the Sail and Lem libraries. For each function, we generated thousands of test lemmas, all of which are automatically discharged. The methodology contributes to a reliable translation between OCaml and PVS.
Xiaoxin An, Amer Tahat, Binoy Ravindran

### On the Usefulness of Clause Strengthening in Parallel SAT Solving

Abstract
In the context of parallel SATisfiability solving, this paper presents an implementation and evaluation of a clause strengthening algorithm. The developed component can be easily combined with (virtually) any CDCL-like SAT solver. Our implementation is integrated as a part of Painless, a generic and modular framework for building parallel SAT solvers.
Vincent Vallade, Ludovic Le Frioux, Souheib Baarir, Julien Sopena, Fabrice Kordon

Open Access

### Verifying a Solver for Linear Mixed Integer Arithmetic in Isabelle/HOL

Abstract
We implement a decision procedure for linear mixed integer arithmetic and formally verify its soundness in Isabelle/HOL. We further integrate this procedure into one application, namely into CeTA, a formally verified certifier to check untrusted termination proofs. This checking involves assertions of unsatisfiability of linear integer inequalities; previously, only a sufficient criterion for such checks was supported. To verify the soundness of the decision procedure, we first formalize the proof that every satisfiable set of linear integer inequalities also has a small solution, and give explicit upper bounds. To this end we mechanize several important theorems on linear programming, including statements on integrality and bounds. The procedure itself is then implemented as a branch-and-bound algorithm, and is available in several languages via Isabelle’s code generator. It internally relies upon an adapted version of an existing verified incremental simplex algorithm.
Ralph Bottesch, Max W. Haslbeck, Alban Reynaud, René Thiemann

### Constraint Caching Revisited

Abstract
Satisfiability Modulo Theories (SMT) solvers play a major role in the success of symbolic execution as program analysis technique. However, often they are still the main performance bottleneck. One approach to improve SMT performance is to use caching. The key question we consider here is whether caching strategies are still worthwhile given the performance improvements in SMT solvers. Two main caching strategies exist: either simple sat/unsat results are stored, or entire solutions (=models) are stored for later reuse. We implement both approaches in the Green framework and compare them with the popular Z3 constraint solver. We focus on linear integer arithmetic constraints; this is typical for symbolic execution, and both caching strategies and constraint solvers work well in this domain. We use both classic symbolic and concolic execution to see whether caching behaves differently in these settings. We consider only time consumption; memory use is typically negligible. Our results suggest that (1) the key to caching performance is factoring constraints into independent parts, and this by itself is often sufficient, (2) Z3’s incremental mode often outperforms caching; and (3) reusing models fares better for concolic than for classic symbolic execution.
Jan Taljaard, Jaco Geldenhuys, Willem Visser

### Per-Location Simulation

Abstract
Simulation/bisimulation is one of the most widely used frameworks for proving program equivalence/semantic preservation. In this paper, we propose a new per-location simulation (PLS) relation that is simple and suitable for proving that a compiled program semantically preserves its original program under a CFG-based language with a real-world, C/C++ like, weak memory model. To the best of our knowledge, PLS is the first simulation framework weaker than the CompCert [26]/CompCertTSO [47] one that is used for proving compiler correctness. With a combination of PLS, the compiler proof-framework Morpheus [34], and a language semantics with a weak memory model, we are able to prove that programs are semantically preserved through a transformation. All the definitions and proofs have been implemented in Isabelle/HOL.
Liyi Li, Elsa L. Gunter

### Sampling Distributed Schedulers for Resilient Space Communication

Abstract
We consider routing in delay-tolerant networks like satellite constellations with known but intermittent contacts, random message loss, and resource-constrained nodes. Using a Markov decision process model, we seek a forwarding strategy that maximises the probability of delivering a message given a bound on the network-wide number of message copies. Standard probabilistic model checking would compute strategies that use global information, which are not implementable since nodes can only act on local data. In this paper, we propose notions of distributed schedulers and good-for-distributed-scheduling models to formally describe an implementable and practically desirable class of strategies. The schedulers consist of one sub-scheduler per node whose input is limited to local information; good models additionally render the ordering of independent steps irrelevant. We adapt the lightweight scheduler sampling technique in statistical model checking to work for distributed schedulers and evaluate the approach, implemented in the Modest Toolset, on a realistic satellite constellation and contact plan.
Pedro R. D’Argenio, Juan A. Fraire, Arnd Hartmanns

### Model Checking Timed Hyperproperties in Discrete-Time Systems

Abstract
Many important timed requirements of computing systems cannot be described by the behavior of individual execution traces. Examples include countermeasures to deal with side-channel timing attacks and service-level agreements, which are examples of timed hyperproperties. In this paper, we propose the temporal logic HyperMTL, that extends MTL by allowing explicit and simultaneous quantification over multiple timed traces in the point-wise semantics. We demonstrate the application of HyperMTL in expressing important properties in information-flow security and cyber-physical systems. We also introduce a model checking algorithm for a nontrivial fragment of HyperMTL by reducing the problem to model checking untimed hyperproperties.
Borzoo Bonakdarpour, Pavithra Prabhakar, César Sánchez

### Verifying Band Convergence for Sampled Control Systems

Abstract
We present a method to verify transient and settling time properties, called band convergence properties, of digitally controlled continuous systems, wherein we consider a linear dynamical system model for a plant and a PID controller. We consider the discrete-time sampled behavior of the closed loop system, and verify band convergence for the discrete-time behavior. The basic idea is to look for a box-shaped invariant for the system which is adequate to ensure that the system stays within the given band. We first give a technique to handle a general discrete-time system, but with determinate matrix entries. We then give a technique to handle discrete-time systems with matrices that lie in a range which over-approximate the matrix exponentials (which arise when we consider the discrete-time version of a continuous system), using the notion of an abstract discrete-time system. We have implemented the verification approach, and evaluate its efficacy on some popular Simulink models.
P. Ezudheen, Zahra Rahimi Afzal, Pavithra Prabhakar, Deepak D’Souza, Meenakshi D’Souza

### Heterogeneous Verification of an Autonomous Curiosity Rover

Abstract
The Curiosity rover is one of the most complex systems successfully deployed in a planetary exploration mission to date. It was sent by NASA to explore the surface of Mars and to identify potential signs of life. Even though it has limited autonomy on-board, most of its decisions are made by the ground control team. This hinders the speed at which the Curiosity reacts to its environment, due to the communication delays between Earth and Mars. Depending on the orbital position of both planets, it can take 4–24 min for a message to be transmitted between Earth and Mars. If the Curiosity were controlled autonomously, it would be able to perform its activities much faster and more flexibly. However, one of the major barriers to increased use of autonomy in such scenarios is the lack of assurances that the autonomous behaviour will work as expected. In this paper, we use a Robot Operating System (ROS) model of the Curiosity that is simulated in Gazebo and add an autonomous agent that is responsible for high-level decision-making. Then, we use a mixture of formal and non-formal techniques to verify the distinct system components (ROS nodes). This use of heterogeneous verification techniques is essential to provide guarantees about the nodes at different abstraction levels, and allows us to bring together relevant verification evidence to provide overall assurance.
Rafael C. Cardoso, Marie Farrell, Matt Luckcuck, Angelo Ferrando, Michael Fisher

### Run-Time Assurance for Learning-Enabled Systems

Abstract
There has been much publicity surrounding the use of machine learning technologies in self-driving cars and the challenges this presents for guaranteeing safety. These technologies are also being investigated for use in manned and unmanned aircraft. However, systems that include “learning-enabled components” (LECs) and their software implementations are not amenable to verification and certification using current methods. We have produced a demonstration of a run-time assurance architecture based on a neural network aircraft taxiing application that shows how several advanced technologies could be used to ensure safe operation. The demonstration system includes a safety architecture based on the ASTM F3269-17 standard for bounded behavior of complex systems, diverse run-time monitors of system safety, and formal synthesis of critical high-assurance components. The enhanced system demonstrates the ability of the run-time assurance architecture to maintain system safety in the presence of defects in the underlying LEC.
Darren Cofer, Isaac Amundson, Ramachandra Sattigeri, Arjun Passi, Christopher Boggs, Eric Smith, Limei Gilham, Taejoon Byun, Sanjai Rayadurgam

### hpnmg: A C++ Tool for Model Checking Hybrid Petri Nets with General Transitions

Abstract
hpnmg is a tool for model checking Hybrid Petri nets with an arbitrary but finite number of general transition firings against specifications formulated in STL. The tool efficiently implements and combines algorithms for state space creation, transformation to a geometric representation, model checking a potentially nested STL formula and integrating over the resulting satisfaction set to yield the probability that the specification holds at a specific time.
Jannik Hüls, Henner Niehaus, Anne Remke

### A Transformation of Hybrid Petri Nets with Stochastic Firings into a Subclass of Stochastic Hybrid Automata

Abstract
We present a transformation of Hybrid Petri nets extended with stochastic firings (HPnGs) into a subclass of Stochastic Hybrid Automata (SHA), thereby making HPnGs amenable to techniques from that domain. While (non-stochastic) Hybrid Petri nets have previously been transformed into Hybrid Automata, we consider also stochastic aspects and transform HPnGs into Singular Automata, which are Hybrid Automata restricted to piecewise constant derivatives for continuous variables, extended by random clocks. We implemented our transformation and show its usefulness by comparing results for time-bounded reachability for HPnGs extended with non-determinism on the one hand, and for the transformed SHAs using the ProHVer tool on the other hand.
Carina Pilch, Maurice Krause, Anne Remke, Erika Ábrahám

### Constraining Counterexamples in Hybrid System Falsification: Penalty-Based Approaches

Abstract
Falsification of hybrid systems is attracting ever-growing attention in quality assurance of Cyber-Physical Systems (CPS) as a practical alternative to exhaustive formal verification. In falsification, one searches for a falsifying input that drives a given black-box model to output an undesired signal. In this paper, we identify input constraints—such as the constraint “the throttle and brake pedals should not be pressed simultaneously” for an automotive powertrain model—as a key factor for the practical value of falsification methods. We propose three approaches for systematically addressing input constraints in optimization-based falsification, two among which come from the lexicographic method studied in the context of constrained multi-objective optimization. Our experiments show the approaches’ effectiveness.
Zhenya Zhang, Paolo Arcaini, Ichiro Hasuo

### Falsification of Cyber-Physical Systems with Constrained Signal Spaces

Abstract
Falsification has garnered much interest recently as a way to validate complex CPS designs with respect to a specification expressed via temporal logics. Using their quantitative semantics, the falsification problem can be formulated as a robustness minimization problem.
Benoît Barbot, Nicolas Basset, Thao Dang, Alexandre Donzé, James Kapinski, Tomoya Yamaguchi

### Backmatter

Weitere Informationen