Open Access 2025 | Original Paper | Book Chapter

Formal Verification of WTO-based Dataflow Solvers

Authors: Roméo La Spina, Delphine Demange, Sandrine Blazy

Published in: Programming Languages and Systems

Publisher: Springer Nature Switzerland


Abstract

This chapter investigates the formal verification of dataflow solvers based on weak topological ordering (WTO), a technique proposed by Bourdoncle. It begins with an overview of dataflow analysis, tracing its origins and establishing its role in modern compiler construction. The chapter then turns to the formalization of WTOs and their use in dataflow solvers, highlighting two distinct strategies: the iterative and the recursive approach. The iterative strategy stabilizes components through repeated iterations, while the recursive strategy stabilizes components by recursively processing their subcomponents. Both methods are formally verified within the Coq proof assistant, ensuring their correctness and enabling their integration into the CompCert verified C compiler. The chapter also presents an empirical evaluation of these solvers, comparing their performance and precision with the widely used algorithm of Kildall. The results show that WTO-based solvers offer comparable performance and precision, making them a viable alternative for compiler optimizations. Finally, the chapter discusses related work and future improvements, giving a comprehensive overview of the current state of dataflow analysis techniques and of possible advances.

1 Introduction

Dataflow analysis for program analysis dates back to the end of the 70's, with Allen et al.'s work, at the time when the first program optimizations for PL/I compilers were developed. Dataflow analysis is now a well-established building block of compilers, on which they rely to analyze, in a sound and static way, facts or information about the values computed by programs at run-time.
In the dataflow analysis framework, program properties are expressed as a solution of a system of mutually recursive (in)equations, specifying how the program abstract state is modified by each program construct. Given such an equation system, a solution is then effectively computed using a dataflow solver, which computes a fixpoint of the equations. Depending on the mathematical structure underlying the information of interest, the fixpoint is not always unique.
As of today, in the compiler construction literature and state of the art, the algorithm proposed by Kildall [13] and variants thereof constitute the de facto standard for implementing dataflow solvers. It belongs to the general family of iterative dataflow solvers, where, starting from initially empty information, dataflow equations are applied iteratively until a sound, over-approximating fixpoint solution is reached for all equations. The algorithm uses a worklist to keep track of the dataflow equations that remain to be solved, and a fixpoint is considered reached when that worklist is empty. This algorithm is simple to implement and easy to understand. Its theoretical foundations have been extensively studied, to the point that it has been formally verified in the Coq proof assistant and implemented in the CompCert verified C compiler [19]. In practice, however, one must be careful about the particular choice of the data structure used to implement the worklist: even though the correctness of the algorithm does not depend on it, the convergence speed crucially depends on the order in which the equations are picked from the worklist.
Intrinsically, Kildall's algorithm treats dependencies between dataflow equations dynamically: the worklist evolves along the iterations on the equations, and applying an equation can trigger the (future) re-application of some other equations. Choosing the right implementation of the worklist, and pre-computing partial information on the dependencies between the equations, thus makes it possible to circumvent this dynamic treatment of dependencies, and is key to ensuring satisfying convergence speeds. Typically, CompCert relies on a reverse postorder numbering of the nodes of the control-flow graph to reflect potential dependencies between the dataflow equations of program instructions. The worklist is implemented with a heap, from which one can pick the next node, and thus the corresponding equations to apply, in an appropriate way, based on its numbering.
Many other iterative dataflow solving algorithms and techniques exist. In particular, a number of them are designed to process the dataflow equations in a statically determined order: the solver iteratively applies the equations following this static order. In this paper, we study one such technique, proposed by Bourdoncle [5]. This technique originates from the abstract interpretation community, where abstract domains and transfer functions are usually considered computationally expensive, and where sophisticated widening techniques are employed to guarantee the convergence of the algorithm. The proposed algorithmic methods implement chaotic iteration strategies that are efficient both in terms of precision and performance.
At the heart of the solving methods proposed by Bourdoncle is the notion of a weak topological ordering (WTO) over the nodes of the graph, pre-computed based on the structure of the program. For a given static ordering, Bourdoncle proposes two flavors of solver algorithms, each with a different strategy to iteratively apply the equations: the iterative strategy and the recursive strategy. Since their inception, Bourdoncle’s solving methods have been implemented in the vast majority of static analyzers based on abstract interpretation.
From a formal proof perspective, however, those solvers have received little attention. The solver algorithms are only described in prose, leaving developers with few hints on how to concretely implement such a solver. The only correctness proof [5] we are aware of also remains at an informal level, and hardly helps in gaining intuition about the key principles underlying the correctness of the algorithms. The only formally verified artifact that we know of that uses this technique is a (prototypical) static analyzer [4], a precursor of the Verasco C analyzer [12]. There, the fixpoint computation is untrusted and validated a posteriori to be a post-fixpoint. But post-mortem fixpoint validation does not help in understanding the sophisticated mechanics of the solvers.
In this work, we study the foundations of the fixpoint solving technique proposed by Bourdoncle [5], and conduct a formal correctness proof of both the iterative and recursive strategies. We apply our formalization work to the context of verified compilation, and implement both solvers within the CompCert compiler. We demonstrate the practicality of our verified algorithms with an empirical validation of their efficiency and precision, on a collection of realistic dataflow analyses used for C program optimization and compilation. More precisely, we make the following contributions:
  • We formalize the notion of weak topological ordering over the nodes of a graph, as well as two dataflow solver algorithms, following the informal descriptions given in the literature. Each solver implements a different solving strategy over an input WTO. The first algorithm implements the iterative strategy, the second implements the recursive strategy.
  • We formally prove the correctness of both dataflow algorithms: they compute a post-fixpoint of a set of dataflow equations. Our proofs are generic in the WTO used by each of the strategies, and support both forward and backward dataflow analyses.
  • Our formalization is done within the Coq proof assistant, and integrates into the reference formally verified C compiler, CompCert. Our algorithms are implemented in Coq, and then extracted to OCaml.
  • We evaluate empirically the two formally verified solvers on the whole range of dataflow analyses provided by CompCert, using CompCert’s benchmark suite. We evaluate their performance in terms of efficiency and precision.
We present our contribution with a pen-and-paper formalism, and avoid referring to Coq definitions as much as possible. Our Coq formal development and experimental validation setup are available online [15]. The full description of our experimental results is published as a companion experience report [16].
In this paper, we elide discussions regarding the termination of the algorithms, and simply assume that they terminate. Our Coq formalization relies on fuel-bounded, mutually recursive functions, which are thus terminating.
The paper is structured as follows. Section 2 provides some technical background on dataflow analysis. In Section 3, we formalize the notion of a WTO, illustrating our definitions with examples, and we describe algorithmic ways to compute such an ordering. In Section 4, we explain the general architecture of a WTO-based dataflow solver, common to the two strategies, and the high-level structure of our correctness proof. Section 5 and Section 6 describe the algorithms for each of the solvers respectively, and the main technical lemmas we prove for the final correctness of the algorithms. In Section 7, we briefly comment on our Coq formal development and its integration within CompCert, and summarize our experimental results. Section 8 discusses related work. Section 9 concludes and discusses future improvements for this work.

2 Background on Dataflow Analysis

2.1 Dataflow Solvers

To perform dataflow analysis, programs are usually represented by their control-flow graph (CFG). A CFG \(G = (N,n_e,E)\) is a graph where N is a set of nodes, E is a set of edges, and \(n_e \in N\) is a distinguished entry node. The nodes correspond to program points and the edges describe the control flow between program points. We say that a node \(s \in N\) is a successor of another node \(n \in N\) if \((n,s) \in E\). The set of the successors of n in G is defined as the set \(\textsf {succ}_G(n) = \{ s\in N \mid (n,s)\in E \}\). Similarly, the set of the predecessors of s in G is defined as \(\textsf {pred}_G(s) = \{ n\in N \mid (n,s)\in E \}\).
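For concreteness, a CFG and its successor and predecessor functions might be represented as follows in OCaml (a minimal sketch; the type, field, and function names are ours, for illustration only):

```ocaml
type node = int

(* A CFG G = (N, n_e, E): a node set, an entry node, and an edge set. *)
type cfg = {
  nodes : node list;          (* N   *)
  entry : node;               (* n_e *)
  edges : (node * node) list; (* E   *)
}

(* succ_G(n) = { s | (n, s) ∈ E } *)
let succ (g : cfg) (n : node) : node list =
  List.filter_map (fun (u, s) -> if u = n then Some s else None) g.edges

(* pred_G(s) = { n | (n, s) ∈ E } *)
let pred (g : cfg) (s : node) : node list =
  List.filter_map (fun (n, v) -> if v = s then Some n else None) g.edges
```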
The goal of dataflow analysis is to compute the most precise information possible at each program point, i.e., each node of the CFG. Information is usually represented by abstract values in a semi-lattice \((L,\sqsubseteq ,\sqcup ,\bot )\). Moreover, we are given a transfer function \(T:N\times L\rightarrow L\), indexed by a node in the graph, which computes the information at the output of the node from the information at its input. Namely, if \(x\in L\) is the information at the input of n, then \(T(n, x)\) is the information at the output.
A solution of the dataflow analysis problem can be represented as a mapping A from each node in the graph to a value in L. To be correct, the abstract information given by the solution should properly reflect the program data flow: for every successor s of a node n, we need \(T(n,A[n]) \sqsubseteq A[s]\), i.e., the information at the output of n should be included in the value computed for s. Hence, the dataflow analysis problem consists in solving an (in)equation system on the nodes of the graph, and one must find a post-fixpoint of that system [21].
The general way to solve such a system is to solve the equations iteratively. Basically, one can retrieve information from the predecessors p of a node s and update A[s] accordingly, so as to solve the equations of the form \(T(p, A[p]) \sqsubseteq A[s]\). However, updating A[s] may break the validity of the other equations in which A[s] occurs, namely those where s is itself a predecessor, hence requiring repeated iteration passes over the nodes of the graph to reach a fixpoint.
If one iterates naively, without minding the structure of the CFG, the convergence is likely to be very slow: a large number of iterations will be needed before reaching the fixpoint. To make the iterations more efficient, several strategies exist. Kildall [13] gave a well-known historical algorithm based on worklists, which is very commonly used in practice to perform dataflow analysis. However, other iteration strategies have been developed. Bourdoncle [5, 6] elaborated a method which relies on a pre-computation of the iteration order. This iteration order is represented by a weak topological ordering (WTO) of the nodes. Given a WTO, Bourdoncle describes two iteration strategies: the iterative strategy and the recursive one. In this paper, we give a formalization for the notion of WTO (see Section 3) and for Bourdoncle's iteration strategies (see Section 4).

2.2 CompCert Interface

In the formally verified compiler CompCert, optimizations rely on dataflow analyses, and the solvers are formally specified with the Coq proof assistant. They are mostly specified as described in Section 2.1. However, the abstract structures used in CompCert have slightly weaker properties than a usual semi-lattice; in this paper, we call such structures pseudo-semi-lattices. The difference with a semi-lattice is that the join operator \(\sqcup \) does not necessarily return the least upper bound, but only an upper bound.
Definition 1
(Pseudo-semi-lattice). A pseudo-semi-lattice is a quadruple \((L,\sqsubseteq ,\sqcup ,\bot )\) where \((L,\sqsubseteq )\) is a partially ordered set, \(\sqcup \) is an upper bound operator and \(\bot \) is a least element.
A dataflow solver in CompCert is a function \( fixpoint \) that takes as input a CFG \(G = (N,n_e,E)\), a pseudo-semi-lattice abstract domain \(\mathbb {L} = (L,\sqsubseteq ,\sqcup ,\bot )\), a transfer function \(T: N\times L \rightarrow L\), and an entry abstract value \(v_e\in L\). It computes a solution A of the dataflow equation system as a mapping \(A: N \mapsto L\) associating an abstract value to each node of G. CompCert's interface for dataflow solvers also includes a correctness criterion, stating that the returned solution is a post-fixpoint of the dataflow equation system.
Definition
(Correctness criterion). Let \(G = (N,n_e,E)\) be a CFG. Assume \( fixpoint (G,\mathbb {L},T,v_e) = A\). The function \( fixpoint \) is said to be correct when, for all \(n\in N\) reachable from \(n_e\) in G and \(s \in \textsf {succ}_G(n)\), we have \(T(n,A[n]) \sqsubseteq A[s]\).
All of CompCert's analyses rely on this dataflow solver interface, and implement it with Kildall's historical algorithm [13]. In this work, we develop two new implementations of this interface, namely the iterative and recursive strategies of Bourdoncle. The fact that all the algorithms implement the same interface saves us from modifying the rest of CompCert's Coq development when replacing Kildall's solver with one of Bourdoncle's in the compilation chain. It also makes it easy to compare the approaches experimentally, in terms of efficiency and of precision of the computed solution.
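To make this interface concrete, here is a hedged OCaml rendering of Definition 1 and of the \( fixpoint \) interface (the identifiers are illustrative and do not match CompCert's actual Coq names):

```ocaml
(* A pseudo-semi-lattice (L, ⊑, ⊔, ⊥): lub returns *an* upper bound of its
   arguments, not necessarily the least one; bot is a least element. *)
module type PSEUDO_SEMI_LATTICE = sig
  type t
  val le  : t -> t -> bool   (* the partial order ⊑ *)
  val lub : t -> t -> t      (* the upper-bound operator ⊔ *)
  val bot : t                (* the least element ⊥ *)
end

(* The solver interface: from a CFG, a transfer function T and an entry
   value v_e, compute a mapping A from nodes to abstract values.
   Correctness criterion: for every reachable node n and every successor s
   of n, le (transfer n (a n)) (a s) holds for the returned a. *)
module type SOLVER = sig
  type node = int
  type cfg
  module L : PSEUDO_SEMI_LATTICE
  val fixpoint : cfg -> (node -> L.t -> L.t) -> L.t -> (node -> L.t)
end
```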

3 Weak Topological Ordering

The two dataflow solvers proposed by Bourdoncle [5] that we formalize in this paper rely on a pre-computed data structure. This data structure guides the iteration strategy of the solvers, i.e., the order in which the nodes of the CFG are processed in order to reach a fixpoint of the dataflow equations. This data structure is called a weak topological ordering. The usual intuition is that it represents, more or less faithfully, the loop structure of the CFG.
This section formally defines weak topological orderings. We illustrate our definitions with concrete examples, showing that a WTO does not necessarily reflect the loop nesting structure of the CFG. We comment on three methods to compute such an ordering from a graph. We finally describe our implementation, which we validate a posteriori using a formally verified validator.

3.1 Weak Topological Order

The notion of a weak topological ordering (WTO) is informally explained by Bourdoncle [5]. Here, we present our formal definition of a WTO. Let \(G = (N,n_e,E)\) be a graph. A WTO is a data structure containing nodes of N, arranged in a hierarchical way, called a hierarchical ordering. A WTO is then defined as a hierarchical ordering satisfying some well-formedness properties.
Definition 2
(Hierarchical Ordering). A hierarchical ordering \(\mathcal {H}\) is a list of node nestings containing no duplicate node. Each node nesting in the list is defined inductively as either a vertex \(\langle n\rangle \), made of a single node \(n \in N\), or a component \(h@l\), made of a head node \(h \in N\) together with a list l of node nestings.
We emphasize that a hierarchical ordering is a list of such node nestings, rather than a node nesting itself: as will become clearer along the paper, it is sometimes better to avoid (and sometimes simply impossible) gathering all node nestings in a single top-level node nesting. The fact that a component is headed by a bare node, rather than by an arbitrary node nesting, disallows having two nested components opened at the same node, in accordance with Bourdoncle's definition of a hierarchical ordering [5].
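In OCaml, the corresponding datatype could be declared as follows (a sketch; the constructor names are ours). Note that a hierarchical ordering is a list of nestings, not a single nesting:

```ocaml
type node = int

(* A node nesting: either a vertex ⟨n⟩, or a component h@l with an
   explicit head node h and a list l of sub-nestings. *)
type nesting =
  | Vertex of node                    (* ⟨n⟩ *)
  | Component of node * nesting list  (* h@l *)

(* A hierarchical ordering: a duplicate-free list of node nestings. *)
type hierarchical_ordering = nesting list
```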
Not all hierarchical orderings are valid WTOs. We now turn to explaining and formalizing the required properties. The main one is a coherence criterion on the structure of the components of the hierarchical ordering, captured through the notions of parent head and feedback edge.
In a hierarchical ordering, a parent head of a node is simply the head of any component enclosing that node.
Definition 3
(Parent head). Let e be a node nesting. We say that v is a parent head of u in e if v is the head of a component of e containing u. Formally:
  • Node h is a parent head of h in \(h@l\)
  • If u belongs to l, then h is a parent head of u in \(h@l\)
  • If v is a parent head of u in some node nesting of l, then v is a parent head of u in \(h@l\)
In a hierarchical ordering \(\mathcal {H}\), v is a parent head of u if v is a parent head of u in some node nesting of \(\mathcal {H}\).
The notion of feedback edge is formally more involved, although it remains relatively intuitive. It is defined relative to a hierarchical ordering, and as such, it should not be confused with the notion of a back-edge in a CFG: a hierarchical ordering has a priori no relation to dominance. Rather, feedback edges are defined with respect to the order in which the nodes appear in the hierarchical ordering. To define them, we rely on a partial order on the nodes of the graph, induced by a hierarchical ordering \(\mathcal {H}\) and written \(\preceq _\mathcal {H}\).
Definition 4
(Feedback edge). Let \(G = (N,n_e,E)\) be a graph, and let \(\mathcal {H}\) be a hierarchical ordering on G. An edge \((u,v)\in E\) is a feedback edge iff \(v\preceq _{\mathcal {H}} u\).
We now turn to formally defining the partial order \(\preceq _{\mathcal {H}}\). It suffices to define \(x \preceq _\mathcal {H}y\) to hold if and only if node x appears before y in the flattened version of \(\mathcal {H}\). The flattening of a hierarchical ordering is the (unique) list containing exactly the same nodes and preserving their occurrence order.
Definition 5
(Flattening of a hierarchical ordering). Let \(\mathcal {H}\) be a hierarchical ordering. Its flattening, written \(L(\mathcal {H})\), is defined by the following mutually recursive definitions, where concat and map are the usual operations on lists: \(L(\mathcal {H}) = \textsf{concat}(\textsf{map}(\mathit{flatten}, \mathcal {H}))\), with \(\mathit{flatten}(\langle n\rangle ) = [n]\) and \(\mathit{flatten}(h@l) = h :: \textsf{concat}(\textsf{map}(\mathit{flatten}, l))\).
Definition 6
(Induced order of a hierarchical ordering). Let \(\mathcal {H}\) be a hierarchical ordering, and \(L(\mathcal {H})\) be its flattening. The induced order of \(\mathcal {H}\), written \(\preceq _\mathcal {H}\), is defined as the list order \(\preceq _{L(\mathcal {H})}\) of \(L(\mathcal {H})\), where for any list l of nodes, \(x \preceq _{l} y\) holds if and only if x and y both occur in l, and the first occurrence of x in l is at a position lower than or equal to that of y.
The proof that \(\preceq _\mathcal {H}\) is reflexive, antisymmetric and transitive is straightforward. \(\preceq _\mathcal {H}\) is a total order on the nodes in \(L(\mathcal {H})\).
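With the nesting datatype sketched earlier, Definitions 5 and 6 translate into a few lines of OCaml (a sketch; index_of is a hypothetical helper of ours):

```ocaml
type node = int
type nesting = Vertex of node | Component of node * nesting list

(* Flattening L(H) (Definition 5): a component contributes its head first,
   then its sub-nestings, preserving occurrence order. *)
let rec flatten_nesting : nesting -> node list = function
  | Vertex n -> [n]
  | Component (h, l) -> h :: List.concat_map flatten_nesting l

let flatten (ho : nesting list) : node list =
  List.concat_map flatten_nesting ho

(* Position of a node in a list, if any. *)
let index_of (n : node) (l : node list) : int option =
  let rec go i = function
    | [] -> None
    | m :: l -> if m = n then Some i else go (i + 1) l
  in
  go 0 l

(* Induced order (Definition 6): x ⪯_H y iff x occurs no later than y in L(H). *)
let order (ho : nesting list) (x : node) (y : node) : bool =
  match index_of x (flatten ho), index_of y (flatten ho) with
  | Some i, Some j -> i <= j
  | _, _ -> false
```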
Example 1
Let \(G = (N, 1, E)\) be the graph defined by \(N=\{1,4,5,7,10,15\}\) and \(E = \{(1,4), (4,5), (5,7), (7,1), (7,10), (10,15)\}\).
A first hierarchical ordering is \(\mathcal {H}_1 = [1@[\langle 4\rangle ,\langle 5\rangle ,\langle 7\rangle ],\langle 10\rangle ,\langle 15\rangle ]\). The induced order of \(\mathcal {H}_1\) is [1, 4, 5, 7, 10, 15]; thus, for \(\mathcal {H}_1\), (7, 1) is the unique feedback edge. Another hierarchical ordering is \(\mathcal {H}_2 = [\langle 5\rangle , 7@[\langle 1\rangle ,15@[\langle 10\rangle ]], \langle 4\rangle ]\). The induced order of \(\mathcal {H}_2\) is [5, 7, 1, 15, 10, 4], for which (4, 5) and (10, 15) are feedback edges.
In Example 1, while \(\mathcal {H}_2\) is a hierarchical ordering, it does not constitute a valid basis to run an iterative dataflow solving algorithm. An important structural aspect is wrong: the feedback edge (4, 5) has its source and target nodes split apart in two disjoint nestings. This is what forbids \(\mathcal {H}_2\) from being a WTO.
Definition 7
(Weak Topological Ordering (WTO)). Let \(G = (N,n_e, E)\) be a graph. A weak topological ordering is a hierarchical ordering \(\mathcal {W}\) on G satisfying the two following properties:
  • (Node completeness) All nodes of G reachable from \(n_e\) belong to \(L(\mathcal {W})\)
  • (Feedback-to-parent-head) For every feedback edge \((u,v)\in E\) of \(\mathcal {W}\), v is a parent head of u.
In Example 1, both \(\mathcal {H}_1\) and \(\mathcal {H}_2\) are node-complete, but only \(\mathcal {H}_1\) satisfies the feedback-to-parent-head property; \(\mathcal {H}_2\) does not.
Example 2
We illustrate our definitions with a slightly more involved example.
The hierarchical ordering \(\mathcal {W}= [\langle 3\rangle ,\langle 2\rangle ,1@[\langle 4\rangle ,5@[\langle 7\rangle ],\langle 6\rangle ], \langle 8\rangle ]\) is a WTO, with induced order [3, 2, 1, 4, 5, 7, 6, 8]. All nodes are reachable from the entry node 3 of the graph, and they are all included in \(L(\mathcal {W})\). Edges (7, 5) and (6, 1) are the feedback edges of \(\mathcal {W}\). Since 5 is a parent head of 7 and 1 is a parent head of 6, \(\mathcal {W}\) is indeed a WTO. This example WTO is relatively intuitive, as it matches the loop structure of the graph: components correspond to the loops, and feedback edges point to loop headers.
But it is important to notice that this structural matching is not required for a hierarchical ordering to be a WTO. For instance, the (degenerate) hierarchical ordering \([1@[2@[3@[4@[5@[6@[7@[8@[\;]]]]]]]]]\) is also a WTO.
Our specification of a WTO (Definition 7) is sufficient to prove the correctness of both dataflow solver algorithms that we present in Section 4; it adapts well to both forward dataflow solving and backward solving on a reversed CFG. In our formalization, to accommodate backward analyses, we additionally make sure that the WTO contains all the exit nodes of the CFG.
Yet, from a performance perspective, the particular choice of a WTO is crucial. For instance, the degenerate WTO \([1@[2@[3@[\ldots ,7@[8@[]]]]]]\) from Example 2 would imply a high number of iterations over large collections of nodes before reaching a fixpoint. Intuitively, WTO-based dataflow solvers perform better when the WTO is faithful to the CFG structure: typically, feedback edges link the end of each loop body to the header of the corresponding loop. Indeed, reflecting the structure of the CFG is likely to reflect the underlying dependencies between dataflow equations. It is, however, difficult to make this observation more than an intuition, generally speaking.

3.2 Building a WTO

Previous examples and discussions show that a single graph admits, in general, several WTOs. Accordingly, several algorithms exist to construct a WTO.
Enumeration Method. This is probably the simplest algorithm. It consists in listing all the nodes of the graph in an arbitrary order, which then serves as the induced order of the hierarchical ordering under construction. Following this order, the list of node nestings is defined recursively, listing the nodes as single-node vertices and opening a component as soon as necessary, i.e., when the encountered node is the target of a feedback edge; all components are closed at the very end. For a node list l without duplicates, the algorithm is a simple recursive function \(\mathcal {W}(G, l)\): \(\mathcal {W}(G, []) = []\), and \(\mathcal {W}(G, n :: l) = [n@\mathcal {W}(G, l)]\) if some node of \(n :: l\) is a predecessor of n in G, and \(\mathcal {W}(G, n :: l) = \langle n\rangle :: \mathcal {W}(G, l)\) otherwise.
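A direct OCaml transcription of this recursion, reusing the nesting datatype and a pred function as in the earlier sketches, might read:

```ocaml
type node = int
type nesting = Vertex of node | Component of node * nesting list

(* Enumeration method: process the nodes of l in order. Open a component
   at n as soon as n still has a predecessor among the unprocessed nodes
   (n included), i.e. n is the target of a feedback edge; every opened
   component then extends to the end of the list. *)
let rec enum_wto (pred : node -> node list) (l : node list) : nesting list =
  match l with
  | [] -> []
  | n :: rest ->
      if List.exists (fun p -> p = n || List.mem p rest) (pred n)
      then [Component (n, enum_wto pred rest)]
      else Vertex n :: enum_wto pred rest
```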
Example 3
In Example 2, when listing the nodes by increasing number, the WTO computed by the enumeration method is \([1@[2@[\langle 3\rangle ,\langle 4\rangle , 5@[\langle 6\rangle ,\langle 7\rangle ,\langle 8\rangle ]]]]\).
In this work, we implemented the above algorithm and proved its correctness with respect to Definition 7: the algorithm does compute a hierarchical ordering that is a WTO. Our correctness proof formalizes the following reasoning. First, as all the nodes of the graph are listed, regardless of their reachability from the entry node, the resulting ordering is node-complete. Moreover, when a feedback edge (u, v) is encountered, i.e., when \(v \le u\), then v is indeed a parent head of u, since the just-opened component with head v contains all the nodes that remain to be processed in the list, including u.
To us, the Enumeration method is essentially a proof of concept for our formal specification of a WTO. The computed WTO does not a priori reflect the structure of the CFG, and results in a considerably slow convergence of WTO-based solvers. Bourdoncle describes a specific enumeration method where the nodes are discovered by a depth-first search (DFS) of the graph instead of an arbitrary order [5]. We instead implemented another method, presented next, that yields clearly better WTOs than the Enumeration method.
Strongly Connected Components Method. This is probably the method most frequently employed to build a WTO. The algorithm [5] decomposes the graph into strongly connected components and subcomponents, using Tarjan's algorithm [25]. It extends Tarjan's algorithm to construct the WTO on the fly during the SCC discovery, by recursively applying the algorithm to each discovered SCC, once the root of the SCC and all the edges pointing to that root are virtually disconnected. The SCC (sub-)components correspond to the WTO (sub-)components. The algorithm starts the exploration of the graph from its entry node, and discovers the (sub-)components in a topological order, as in Tarjan's algorithm. The feedback edges in the resulting WTO are the edges pointing to the root of an SCC.
Example 4
The SCCs of the graph from Example 2 are, in topological order, \(\{3\}\), \(\{2\}\), \(\{1,4,5,7,6\}\), which has one subcomponent \(\{5,7\}\), and finally \(\{8\}\). The discovered feedback edges are (7, 5) and (6, 1). Consequently, the WTO computed using the SCC method is \([\langle 3\rangle ,\langle 2\rangle , 1@[\langle 4\rangle ,5@[\langle 7\rangle ],\langle 6\rangle ],\langle 8\rangle ]\).
The SCC method has time complexity \(O(\delta \left| E\right| )\), where \(\delta \) is the maximum depth of component nesting, and provides a WTO which generally reflects the program structure well. In our development, we reused an implementation of the SCC method provided by the OCamlGraph library [9]. An informal proof of the correctness of this method is presented in Bourdoncle's manuscript [6]. Although formalizing that proof is certainly feasible, relying on previous work mechanizing the correctness proof of Tarjan's algorithm [8], such a proof is orthogonal to, and out of the scope of, the present work, which focuses primarily on the correctness of the dataflow solver algorithms.
Rather than formally verifying the SCC method for building a WTO, we implement a formally verified validator in Coq, which checks, a posteriori, that the untrusted computed WTO effectively meets its specification. This does not lower the formal guarantees of the final implementation: the validator is formally proved in Coq to accept only hierarchical orderings that are WTOs with respect to Definition 7. Furthermore, in practice, our validator never rejects a WTO computed with the SCC method.
Our validator takes as inputs a WTO candidate \(\mathcal {W}\) and the CFG G for which \(\mathcal {W}\) is supposed to be a WTO. In a first step, from \(\mathcal {W}\), we build a map \(\mathcal {O}(\mathcal {W})\) associating to each node its index in the flattening \(L(\mathcal {W})\). During the computation of the map, we check for potential duplicates; once the map is computed, we hence know that \(\mathcal {W}\) is a hierarchical ordering for G. In a second step, we validate the node-completeness property of Definition 7: we check that the entry node is in \(L(\mathcal {W})\), and that \(\mathcal {W}\) is closed by the successor relation of the graph G. It follows that all the nodes reachable from the entry node are in \(\mathcal {W}\). Finally, we validate the feedback-to-parent-head property of Definition 7. Here, we rely on the map \(\mathcal {O}(\mathcal {W})\) to determine whether an edge (u, v) is a feedback edge for \(\mathcal {W}\), by comparing the indices \(\mathcal {O}(\mathcal {W})[u]\) and \(\mathcal {O}(\mathcal {W})[v]\). We then check, straightforwardly, for each such feedback edge (n, s), that s is a parent head of n, by looking in the enclosing components.
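The first validation step and the feedback-edge test can be sketched as follows (a simplification of our Coq validator; the successor-closure and parent-head checks are omitted):

```ocaml
module IntMap = Map.Make (Int)

type node = int
type nesting = Vertex of node | Component of node * nesting list

let rec flatten_nesting = function
  | Vertex n -> [n]
  | Component (h, l) -> h :: List.concat_map flatten_nesting l

(* Step 1: build O(W), mapping each node to its index in L(W);
   return None if a duplicate node is found. *)
let index_map (w : nesting list) : int IntMap.t option =
  let rec build i m = function
    | [] -> Some m
    | n :: l ->
        if IntMap.mem n m then None
        else build (i + 1) (IntMap.add n i m) l
  in
  build 0 IntMap.empty (List.concat_map flatten_nesting w)

(* Final step (test only): (u, v) is a feedback edge iff O(W)[v] <= O(W)[u]. *)
let is_feedback (o : int IntMap.t) ((u, v) : node * node) : bool =
  IntMap.find v o <= IntMap.find u o
```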
The overall complexity of the validator is \(O(\left| E\right| \log \left| N\right| + \left| N\right| (\log \left| N\right| + feedback (G)))\), where \( feedback (G)\) is the number of feedback edges in G. For realistic CFGs, we have \(\left| E\right| = O(\left| N\right| )\): instructions almost never have more than 2 successors. We can therefore broadly approximate the complexity of the validator as \(O(\left| N\right| (\log \left| N\right| + feedback (G)))\). This is confirmed experimentally: the ratio between the validation time for a CFG with n nodes and \(n(\log n + feedback (G))\) asymptotically tends to a constant.

4 General Framework of WTO-based Dataflow Solvers

Equipped with a WTO, we now describe the two dataflow solver algorithms that we have formalized in Coq. For the sake of presentation, we focus on describing the algorithms in pseudo-code, and on highlighting the important parts of their correctness proofs.
The first algorithm implements the so-called iterative strategy, while the second implements the recursive strategy [5]. While significantly different from one another, both algorithms fit a common, global framework, which is reflected in the general architecture of our correctness proofs. We thus start by describing this global framework, as well as its key ingredients and intuitions.
WTO-based dataflow solvers follow CompCert’s interface for \( fixpoint \) described in Section 2.2, with the exception that they take a WTO \(\mathcal {W}\) as an extra input. The resulting formal correctness theorem for WTO-based solvers becomes:
Theorem 1
(Fixpoint partial correctness). Let \(G = (N, n_e, E)\) be a CFG. Assume \( fixpoint (G,\mathbb {L}, T, v_e, \mathcal {W}) = A\). For all \(n\in N\) reachable from \(n_e\) in G and all \(s \in \textsf {succ}_G(n)\), we have \(T(n,A[n]) \sqsubseteq A[s]\).
Notice that we only consider a partial correctness result, and assume the algorithm terminates. Formally proving termination is outside the scope of our contribution. In our Coq formalization, we resort to a fuel argument to ensure termination: the algorithm takes an extra parameter (a natural number) arbitrarily bounding the number of mutually recursive calls, and returns an optional result A that is defined only when enough fuel is provided.

4.1 Common Algorithmic Part

The pseudo-code for \( fixpoint \) is given in Algorithm 1. The resulting mapping \(A: N \mapsto L\), initialized with \(v_e\) for the entry node \(n_e\), is updated along the algorithm. It represents the solution computed so far for the equation system described by T. The algorithm traverses \(\mathcal {W}\), visiting each node nesting in order and updating A accordingly. When visiting a vertex \(\langle n\rangle \), we simply process the corresponding node n with function AnalyzeNode: in the current context, the abstract information for that node is updated by joining the current value with the abstract value coming from each of its predecessors. Visiting a component \(h@l\) consists in stabilizing that component, using the \(\textsc {Stabilize}\) procedure. Essentially, when \(\textsc {Stabilize}\) returns, the dataflow equations should hold for all the CFG nodes inside \(h@l\). The two algorithms differ in the way they stabilize a component: the iterative strategy stabilizes a top-level node nesting without taking into account the structure of its subcomponents, while the recursive strategy stabilizes subcomponents recursively. For the moment, we keep the definition of \(\textsc {Stabilize}\) abstract.
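To complement the pseudo-code, here is a hedged OCaml sketch of this common skeleton, with \(\textsc {Stabilize}\) abstracted as a parameter (the state representation and all names are ours; our actual Coq implementation differs, notably in its fuel-based termination argument):

```ocaml
type node = int
type nesting = Vertex of node | Component of node * nesting list

(* Common skeleton of Algorithm 1. The solution A is a hashtable from nodes
   to abstract values; analyze_node n joins, into the current value of n,
   the facts flowing from all predecessors of n; stabilize is the
   strategy-specific procedure (iterative or recursive). *)
let fixpoint ~pred ~lub ~bot ~transfer ~stabilize
    ~(entry : node) ~(v_e : 'v) (wto : nesting list) : (node, 'v) Hashtbl.t =
  let a : (node, 'v) Hashtbl.t = Hashtbl.create 64 in
  let get n = try Hashtbl.find a n with Not_found -> bot in
  Hashtbl.replace a entry v_e;
  let analyze_node n =
    List.fold_left (fun acc p -> lub acc (transfer p (get p))) (get n) (pred n)
  in
  List.iter
    (function
      | Vertex n -> Hashtbl.replace a n (analyze_node n)   (* AnalyzeNode *)
      | Component (h, l) -> stabilize analyze_node a h l)  (* Stabilize   *)
    wto;
  a
```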
Handling Backward Analyses. The algorithm described here uses a CFG with a single entry node. Supporting backward analyses, on a reversed CFG, requires some adaptations: we cannot assume that the CFG has a single exit point. To keep the presentation lighter, we abstract from these technical discrepancies between a forward and a backward dataflow solver algorithm. Essentially, the algorithm for forward analyses, as well as the main proof arguments, can largely be reused as is to implement a backward dataflow solver.

4.2 Common Proof Architecture

Both dataflow solver algorithms follow a similar high-level proof architecture, which we describe now. Since the node nestings are visited one after the other in the main loop of the algorithm, the dataflow equations should hold for the nodes belonging to the previously processed node nestings. We introduce a ghost variable S in our algorithm to keep track of the set of previously processed node nestings, and use it to formulate the main correctness invariant of the algorithm. This invariant states that the dataflow equations hold at all CFG nodes in (a node nesting of) S: \(\forall e\in S, \forall n\in e, \forall p\in \textsf {pred}(n), T(p,A[p]) \sqsubseteq A[n]\). Here \(n \in e\) means, by abuse of notation, that n is in \( flatten (e)\); we say that e contains the node n, or that n belongs to e.
For a set of node nestings S, we write \( Nodes (S)\) for the set of nodes belonging to some nesting in S. Formally, \( Nodes (S) = \{ n\in N \mid \exists e\in S, n \in e \}\). We additionally define the predicate \( Valid (A,n) \triangleq \forall p\in \textsf {pred}(n), T(p,A[p])\sqsubseteq A[n]\), expressing that, for the current value of A, the dataflow equations hold at node n. Using these notations, the main correctness invariant can be rephrased as \(\forall s\in Nodes (S), Valid (A,s)\). For both strategy algorithms, we prove that this invariant is preserved through the main loop of Algorithm 1.
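For intuition, the \( Valid \) predicate is directly executable; a minimal OCaml sketch, reusing the pred and transfer parameters of the earlier skeleton:

```ocaml
type node = int

(* Valid(A, n): every dataflow equation into n holds for the current A,
   i.e. T(p, A[p]) ⊑ A[n] for every predecessor p of n. *)
let valid ~pred ~le ~transfer (get : node -> 'v) (n : node) : bool =
  List.for_all (fun p -> le (transfer p (get p)) (get n)) (pred n)
```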
Lemma 1
(Correctness invariant of S). After any iteration of the top-level loop of Algorithm 1, we have \(\forall s\in Nodes (S), Valid (A,s)\).
Naturally, the proof of the above lemma is specific to each strategy algorithm, which defines its own version of the \(\textsc {Stabilize}\) procedure (see Sections 5 and 6). But this lemma is enough to prove the above correctness theorem. Indeed, at the end of the top-level loop, all the node nestings of \(\mathcal {W}\) are in S. By definition of a WTO, the invariant thus holds on all the nodes of G reachable from the entry node \(n_e\). Furthermore, since the WTO is closed by the successor relation, the invariant, phrased on predecessors, is equivalent to the correctness theorem, phrased on successors.
An important point to note in Algorithm 1 is that the nestings of \(\mathcal {W}\) cannot be processed in an arbitrary order: they must be processed in the order in which they appear in the WTO. Indeed, in a WTO, a feedback edge can only link two nodes in the same top-level nesting, thanks to the feedback-to-parent-head property. Thus, the predecessors of a node n either appear before n, or are in the same node nesting as n. As a result, if the nestings are processed in the correct order, stabilized nodes can only have stabilized nodes as predecessors. This guarantees that the equations at stabilized nodes are preserved when stabilizing new components. If, however, the nestings are processed in the wrong order, some stabilized node n might have predecessors that are not yet stabilized, even though they appear before n in the WTO. The equations at already stabilized nodes might then be broken by further processing, which leads to an incorrect result. We formalize this observation with the following lemma, which we prove as an invariant of the top-level loop of the algorithm. The proof essentially relies on the definition of a WTO.
Lemma 2
(Predecessor-closure of S). At any point of the execution of Algorithm 1, the set \( Nodes (S)\) is closed by the predecessor relation: for all \(n \in Nodes (S)\), and all \(p \in \textsf {pred}(n)\), we have \(p \in Nodes (S)\).
At the core of both dataflow algorithms is the function AnalyzeNode. We prove a basic lemma stating that the abstract value it returns is an upper bound of the dataflow facts flowing out of all the predecessors of the node. The proof is straightforward.
Lemma 3
(Correctness of AnalyzeNode). Let \(n, p \in N\) be two nodes of the CFG. If \(p\in \textsf {pred}(n)\), then \(T(p,A[p]) \sqsubseteq \) AnalyzeNode(n).

5 WTO-Iterative Dataflow Solver

5.1 The Algorithm

What distinguishes the iterative strategy from the recursive one is the implementation of \(\textsc {Stabilize}(h,l)\). The iterative strategy proceeds as follows. An iteration consists in retrieving information from the predecessors of each node n in \(h :: L(l)\). Iterations are performed until a fixpoint is reached on these nodes. To detect the stabilization of the component after an iteration (i.e., to detect that the fixpoint has been reached), Bourdoncle has shown that it suffices to check that the equations hold at h and at the heads of the subcomponents of l [5]. However, we must ensure that each node of the component is analyzed at least once before declaring the component stabilized. Indeed, non-head nodes inside the component might have predecessors in previously processed nestings, and we must analyze these nodes to retrieve the final information computed at their predecessors; otherwise, the correctness of the analysis is not guaranteed.
\(\textsc {Stabilize}(h,l)\) can then be implemented as follows. At the first iteration, for each node n in \(h :: L(l)\), retrieve information from all its predecessors. For the subsequent iterations, process each node n in \(h :: L(l)\) as follows. If \(n=h\) or n is the head of a subcomponent of l, retrieve information from the predecessors of n; if the equations hold at n, do nothing, otherwise the rest of the component has to be entirely analyzed again, as at the first iteration. If n is not a head and all the previous checks at heads succeeded, do nothing. The process stops once all the checks at heads have succeeded during the last iteration.
The pseudo-code of Bourdoncle’s iterative strategy is given in Algorithm 2. The auxiliary function AnalyzeNode returns information retrieved from the predecessors of a node. The first iteration logic is implemented by the AnalyzeNesting function. This function is called at the beginning of \(\textsc {Stabilize}\) and when a check fails in the subsequent iterations: the remaining nodes of the component have to be re-analyzed. AnalyzeNesting updates the abstract values of the nodes in a nesting with the information retrieved.
The mutually recursive functions CheckComponent, CheckNesting and CheckNestings implement the subsequent iterations. They check for stabilization at heads, and re-analyze the remaining nestings with AnalyzeNesting once a check fails. The results of the checks are propagated through the boolean m: while it is \(\textsf {false}\), the abstract values at heads have not been modified (the checks succeed). CheckNesting checks an individual nesting: the check succeeds immediately for a vertex (it is not a head), while the check for a component is done with CheckComponent. CheckComponent checks the head of a component and recursively checks its sub-nestings with CheckNestings; if the check fails, the sub-nestings are re-analyzed with AnalyzeNesting. CheckNestings checks a list of nestings; if a check fails at some point (\(m = \textsf {true}\)), the remaining nestings are re-analyzed with AnalyzeNesting.
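The following OCaml sketch captures our reading of this check-and-re-analyze structure; here, analyze n recomputes the value at n from its predecessors and returns true iff the stored value changed (in our Coq version, a fuel parameter additionally bounds the iteration, which this sketch omits):

```ocaml
type node = int
type nesting = Vertex of node | Component of node * nesting list

(* First-iteration logic: analyze every node of a nesting once. *)
let rec analyze_nesting (analyze : node -> bool) = function
  | Vertex n -> ignore (analyze n)
  | Component (h, l) ->
      ignore (analyze h);
      List.iter (analyze_nesting analyze) l

(* Check phase: returns true iff some head check failed (the boolean m of
   the text); past a failure, the remaining nestings are fully re-analyzed. *)
let rec check_component analyze h l =
  if analyze h then begin            (* head not stable: re-analyze l *)
    List.iter (analyze_nesting analyze) l; true
  end else check_nestings analyze l

and check_nesting analyze = function
  | Vertex _ -> false                (* vertices are never heads *)
  | Component (h, l) -> check_component analyze h l

and check_nestings analyze = function
  | [] -> false
  | e :: rest ->
      if check_nesting analyze e then begin
        List.iter (analyze_nesting analyze) rest; true
      end else check_nestings analyze rest

(* Stabilize(h, l): a first full pass, then check passes until stable. *)
let stabilize analyze h l =
  analyze_nesting analyze (Component (h, l));
  while check_component analyze h l do () done
```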

5.2 Proof of Correctness

The correctness proof of Algorithm 2 follows the general architecture described earlier. We now describe the salient points of the proof of Lemma 1, which we split in two main steps. First, we prove that stabilizing a given node nesting does not break the validity of the equations for the nodes of previously stabilized nestings (Lemma 4). Then, we prove that stabilizing a given nesting e indeed ensures the validity of the equations for the nodes of e (Lemma 6).
Notations. In the following, we write \(S_0,A_0 \xrightarrow {F(x)} S_1,A_1\) for the property "if \(S = S_0\) and \(A = A_0\) before a call to F(x), then \(S = S_1\) and \(A = A_1\) after the call". For example, if \(S=S_0\) and \(A=A_0\) before a call to Stabilize(h, l), and \(S=S_1\), \(A=A_1\) when the function returns, then we have \(S_0,A_0 \xrightarrow {\textsc {Stabilize}(h,l)} S_1,A_1\).
We now present the main technical lemmas, and sketch their proofs through the key auxiliary lemmas involved therein.
Stabilize preserves validity. As explained above, we first prove that, for the nodes already processed by the procedure, and hence belonging to S, the validity of their dataflow equations is preserved.
Lemma 4
(Stabilize preserves validity). 
Suppose \(S_0,A_0~\xrightarrow {\textsc {Stabilize}(h,l)}~S_1,A_1\) and that for all \(n \in h@l\), \(n \not \in Nodes (S_0)\). Then for all \(s\in Nodes (S_{0})\), if \( Valid (A_0,s)\) then \( Valid (A_1,s)\).
To prove Lemma 4, we rely on an auxiliary lemma (Lemma 5), which states that AnalyzeNesting preserves the validity of the equations for nodes that have no predecessor in the node nesting argument. We prove a similar result for CheckComponent, which we elide here to keep the presentation light.
Lemma 5
Assume that \(S_0,A_0 \xrightarrow {\textsc {AnalyzeNesting}(e)} S_1,A_1\). Let \(s \notin e\) be a node with no predecessor in e. If \( Valid (A_0,s)\), then \( Valid (A_1,s)\).
Stabilize ensures validity. For the nodes analyzed through the procedure, we show that their dataflow equations become valid at the end of the process.
Lemma 6
(Stabilize ensures validity). Suppose \(S_0,A_0 \xrightarrow {\textsc {Stabilize}(h,l)} S_1,A_1\). Then for all s in \(h@l\), we have \( Valid (A_1,s)\).
To prove Lemma 6, we split the reasoning in three steps. First, we show in Lemma 7 below that, after analyzing the nodes of a nesting with AnalyzeNesting, the equations hold at all the analyzed nodes that are not heads of subcomponents. Indeed, a non-head node n has no predecessor after it in the WTO (by the feedback-to-parent-head property of a WTO), so the values at the predecessors of n do not change after n is analyzed. However, there might be new information to retrieve at heads: they might have predecessors analyzed after them.
At this point in the proof, we need to distinguish the nodes that are component heads from those that are not. To do so, we rely on the notion of parent head (Definition 3): in a node nesting e, s is the head of a subcomponent of e if and only if s is a parent head of itself in e. In particular, if s is a parent head of some node n, then s is a parent head of itself.
Lemma 7
Suppose \(S_0,A_0 \xrightarrow {\textsc {AnalyzeNesting}(e)} S_1,A_1\). Then for all \(s \in e\), if s is not a parent head of itself in e, then \( Valid (A_1,s)\).
Next, we prove in Lemma 8 below that, provided the equations hold on all the non-head nodes of the component, they still hold after checking the component with CheckComponent. This holds for reasons similar to those of the first step.
Lemma 8
Suppose \(S_0,A_0 \xrightarrow {\textsc {CheckComponent}(h,l)} S_1,A_1\). Let s in \(h@l\). If s is not a parent head of itself in \(h@l\) and \( Valid (A_0,s)\), then \( Valid (A_1,s)\).
Finally, we prove with Lemma 9 that, once CheckComponent returns \(\textsf {false}\), the equations hold on all the nodes of the component, including heads, provided that they hold on the non-head nodes before the final check. Indeed, if the checks at heads (line 26) succeed, then the equations hold at the heads.
Lemma 9
Suppose \(S_0,A_0 \xrightarrow {\textsc {CheckComponent}(h,l)} S_1,A_1\). Assume further that for all s in \(h@l\), if s is not a parent head of itself in \(h@l\) then \( Valid (A_0,s)\). If CheckComponent returns \(\textsf {false}\), then, for all \(s \in h@l\), we have \( Valid (A_1,s)\).

6 WTO-Recursive Dataflow Solver

6.1 The Algorithm

The recursive strategy implements \(\textsc {Stabilize}(h,l)\) differently from the iterative strategy: here, stabilizing a component recursively stabilizes the encountered subcomponents. Bourdoncle indicates that, with this strategy, checking that the equations hold at the head of the component suffices to detect the stabilization of the whole component [5]. This simplifies the algorithm: the heads of subcomponents do not need to be checked for stability, and the need for the functions CheckNesting, CheckNestings and CheckComponent vanishes.
The pseudo-code of the recursive strategy is given in Algorithm 3. We now explain how \(\textsc {Stabilize}(h,l)\) proceeds. Here again, we distinguish the first iteration on the component from the subsequent ones. For the first iteration, visit each node nesting in \(\langle h\rangle :: l\). For a vertex \(\langle n\rangle \), simply analyze node n. For a component \(h'@l'\), stabilize it recursively by calling \(\textsc {Stabilize}(h',l')\). For the subsequent iterations on \(h@l\), retrieve information from the predecessors of h and join it with the current value available for h. If the result is not greater than A[h], then stop; otherwise, analyze each nesting of l as at the first iteration.
In Algorithm 3, we provide an extra boolean parameter \(\textsf {stop}\) to \(\textsc {Stabilize}\), which guarantees that the first iteration is always done and that the nodes of the component are analyzed at least once. The initial call to \(\textsc {Stabilize}\) is made with \(\textsf {stop}=\textsf {false}\), while the recursive calls are made with \(\textsf {stop}=\textsf {true}\). The function only returns when \(\textsf {stop}\) is \(\textsf {true}\).
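A hedged OCaml sketch of this recursion follows (our reading of Algorithm 3; the Coq version additionally threads a fuel parameter to ensure termination, and we take the "recursive calls with stop = true" to be the self-calls at the end of an iteration, subcomponents being restarted with stop = false):

```ocaml
type node = int
type nesting = Vertex of node | Component of node * nesting list

(* analyze n updates the value at n from its predecessors and returns true
   iff the stored value grew. Subcomponents are stabilized recursively with
   stop = false, forcing their first full iteration; the self-call at the
   end of an iteration passes stop = true. *)
let rec analyze_nesting (analyze : node -> bool) = function
  | Vertex n -> ignore (analyze n)
  | Component (h, l) -> stabilize analyze h l ~stop:false

and stabilize analyze h l ~stop =
  let changed = analyze h in      (* join predecessors of the head into A[h] *)
  if stop && not changed then ()  (* stable: the new value is not greater    *)
  else begin
    List.iter (analyze_nesting analyze) l;
    stabilize analyze h l ~stop:true
  end
```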

6.2 Proof of Correctness

The correctness proof of Algorithm 3 follows the general proof architecture explained in Section 4. We split the proof of the preservation of the correctness invariant (Lemma 1) in a way similar to the proof for the iterative dataflow algorithm. We first prove that Stabilize preserves the validity of the equations for the nodes in the already stabilized node nestings (Lemma 10). We then prove that Stabilize effectively ensures the validity of the equations for the nodes of the component under stabilization (Lemma 11).
Lemma 10
(Stabilize preserves validity). 
Suppose \(S_0,A_0~\xrightarrow {\textsc {Stabilize}(h,l,\textsf {stop})}~S_1,A_1\) and for all \(n \in h@l\), \(n \not \in Nodes (S_0)\). Then for all \(s\in Nodes (S_{0})\), if \( Valid (A_0,s)\) then \( Valid (A_1,s)\).
Proving Lemma 10 for Algorithm 3 is much simpler than for Algorithm 2, since there are no auxiliary functions implementing the stabilization check to consider. Hence, we do not expand here on auxiliary lemmas.
Next, we prove with Lemma 11 that \(\textsc {Stabilize}\) does indeed establish the validity of the equations for the nodes it analyzes recursively.
Lemma 11
(Stabilize ensures validity). Suppose \(S_0,A_0 \xrightarrow {\textsc {Stabilize}(h,l,\textsf {false})} S_1,A_1\). Then for all \(s\in h@l\), \( Valid (A_1,s)\).
To prove Lemma 11, we proceed differently than previously. In particular, the proof has to be done by mutual recursion between AnalyzeNesting and Stabilize. In our implementation, we add a natural number p, the fuel, as a parameter to both functions, to ensure termination. This fuel also allows us to reason by strong induction on p in the proofs. Consequently, we make the fuel parameter explicit in the following lemmas, and write \(\textsc {AnalyzeNesting}_p(e)\) and \(\textsc {Stabilize}_p(h,l,\textsf {stop})\) for the calls with fuel p.
We organize the proof of Lemma 11 using three auxiliary properties, each of them indexed with a current level of fuel p:
  • \(\mathcal {C}_{Ana}(p)\) states that after analyzing a node nesting e with \(\textsc {AnalyzeNesting}_p(e)\), the equations hold on all the nodes in e, including heads.
  • \(\mathcal {C}_{Stab}(\textsf {true}, p)\) states that after calling \(\textsc {Stabilize}_p(h,l,\textsf {true})\) to stabilize a component, the equations hold on all the nodes in \(h@l\), provided they already held on all the nodes in l.
  • \(\mathcal {C}_{Stab}(\textsf {false}, p)\) states that after calling \(\textsc {Stabilize}_p(h,l,\textsf {false})\) to stabilize a component, the equations hold on all the nodes of \(h@l\), without any extra prerequisite.
We structure the proof along the following induction scheme. First, assuming \(\mathcal {C}_{Ana}(k)\) for all \(k < p\), we prove that \(\mathcal {C}_{Stab}(\textsf {true}, p)\) holds. Moreover, for any p, \(\mathcal {C}_{Stab}(\textsf {true}, p)\) implies \(\mathcal {C}_{Stab}(\textsf {false}, p+1)\): when executing a call \(\textsc {Stabilize}_{p+1}(h,l,\textsf {false})\), we call \(\textsc {Stabilize}_p(h,l,\textsf {true})\) after having analyzed at least once all the nodes in l, thus solving their equations. As a result, assuming \(\mathcal {C}_{Ana}(k)\) for all \(k < p\) (a fortiori for all \(k < p-1\)), we can deduce \(\mathcal {C}_{Stab}(\textsf {false}, p)\). We conclude the proof of \(\mathcal {C}_{Ana}(p)\) by strong induction on p.
Another difference with the proof of Algorithm 2 is that, thanks to the recursive stabilization of subcomponents, we are able to prove that, after one iteration on a component \(h@l\), the equations remain valid on all the nodes of l, rather than just on the non-head nodes of l. Indeed, the equations at the head h are the only ones that might not hold: the equations at the heads of the subcomponents, and thus at all nodes in subcomponents, have been solved recursively.
Lemma 12
\(\forall p, \left( \forall k < p, \mathcal {C}_{Ana}(k)\right) \Rightarrow \mathcal {C}_{Stab}(\textsf {true}, p)\).
Let \(p\in \mathbb {N}\), and suppose that for all \(k < p\), if \(S_0,A_0 \xrightarrow {\textsc {AnalyzeNesting}_k(e)} S_1,A_1\), then for all \(s\in e\), \( Valid (A_1,s)\). Suppose also that \(S_0,A_0 \xrightarrow {\textsc {Stabilize}_p(h,l,\textsf {true})} S_1,A_1\), and that for all \(s\in l\), we have \( Valid (A_0,s)\). Then for all \(s\in h@l\), we have \( Valid (A_1,s)\).
Lemma 13
\(\forall p, \left( \forall k < p, \mathcal {C}_{Ana}(k)\right) \Rightarrow \mathcal {C}_{Stab}(\textsf {false}, p)\).
Let \(p\in \mathbb {N}\) and suppose that, for all \(k < p\), if \(S_0,A_0 \xrightarrow {\textsc {AnalyzeNesting}_k(e)} S_1,A_1\), then for all \(s\in e\), \( Valid (A_1,s)\). Suppose also that \(S_0,A_0 \xrightarrow {\textsc {Stabilize}_p(h,l,\textsf {false})} S_1,A_1\). Then for all \(s\in h@l\), we have \( Valid (A_1,s)\).
Lemma 14
(AnalyzeNesting ensures validity). \(\forall p, \mathcal {C}_{Ana}(p)\).
Let \(p\in \mathbb {N}\). If \(S_0,A_0 \xrightarrow {\textsc {AnalyzeNesting}_p(e)} S_1,A_1\), then for all \(s\in e\), \( Valid (A_1,s)\).

7 Implementation and Experimental Evaluation

We now evaluate our verified dataflow solvers experimentally. Our first objective is to confirm that our verified and implemented algorithms have acceptable performance, both in terms of precision and efficiency. Our baseline is the generic dataflow solver from CompCert's distribution, implementing Kildall's algorithm [13]. We do not expect to outperform CompCert's dataflow solvers: CompCert's implementation of Kildall's algorithm relies on well-chosen data structures, in particular to implement worklists. The latter are based on heap structures from CompCert's library, whose performance has already been carefully tuned, and for which trade-offs regarding compilation time and optimization power have already been worked out.
Our second objective is to better understand the practical differences between the solvers, by delineating their respective strengths and weaknesses observed in practice, in the realistic setting of a compiler toolchain like CompCert.
Below we provide a summary of our experimental results and conclusions. Our companion experience report [16] further details experimental data and results, drawn from three representative dataflow analyses: (1) ValueAnalysis, a forward analysis starting from the CFG entry point, (2) Liveness analysis, a backward analysis starting from an over-approximated set of exit points, and (3) Allocation, a backward analysis where all nodes are considered as exit points.

7.1 Code Size of our Development

Our complete development includes Coq and OCaml code. The Coq files contain specifications, programs and proofs. The Coq programs correspond to both dataflow solvers, including their forward and backward versions, as well as to a formally verified validator for our WTO computation. The WTO computation itself is hand-written in OCaml. It implements Tarjan's SCC algorithm, mainly reusing the OCamlGraph [9] library, with a minor adaptation to fit our context: to handle backward analyses, we compute a valid set of entry nodes of the reversed CFG, from which we begin the discovery of the SCCs.
Table 1 gives the size of our development, splitting the figures according to whether they correspond to Coq definitions, Coq proofs, or OCaml definitions. The development consists of around 3.1k lines of code for definitions and 4.4k for proofs. The part related to WTOs (specification, computation, validation, and proofs) is about 1.8k lines of Coq and OCaml code. The fixpoint solvers' specification, implementation and correctness proofs form the largest part of our development: 3.4k lines of Coq for the iterative strategy and 2.5k lines for the recursive one. We also indicate in Table 1 the size of the underlying CompCert code base: 111k lines of Coq definitions and 66k lines of proofs, along with 53k lines of OCaml, for a total of about 230k lines.
Table 1.
Size of our development (LoC), compared with CompCert's code base

7.2 Benchmark Programs

We evaluate the solvers using example C programs from CompCert's test suite, which includes Spass, a prover for first-order logic, a ray tracer, a compression library, and a set of cryptographic primitives. Their sizes range from a few lines of code to several thousand, including programs with many nested loops.
CompCert's test suite consists of realistic hand-written C programs. As such, they rarely include very large functions, but are instead made of a large collection of relatively small functions. For Spass, the largest function has 1245 RTL instructions, only 5 functions have more than 500 instructions, and the average size of a function is 16 instructions. Hence, in order to evaluate and compare the solvers as the size of the analyzed functions grows, we also include 10 random C programs generated with Csmith [27]. The largest generated function has 3916 RTL instructions, and the average function size is 118 instructions.

7.3 Dataflow Solvers

We conduct experiments on the three following solvers, each of them having two variants, for forward and backward analyses. We use the following abbreviations: K is the Kildall solver as implemented in CompCert, BIter is our WTO-iterative solver, and BRec is our WTO-recursive solver. The WTO used by our solvers is always computed using the SCC method (the enumeration method described in Section 3.2 is clearly inefficient in practice). As explained in Section 1, the numbering of the CFG nodes makes a difference for the solvers: Kildall's implementation in CompCert is highly sensitive to the numbering of nodes, and converges faster when nodes are numbered according to a reverse postorder (RPO) traversal of the CFG. Our experiments always use an RPO numbering of the CFG, which also leads to better SCC-based WTOs.

7.4 Global Compilation Time and Optimization Opportunities

Changing the dataflow algorithm of the compiler can globally influence compilation times. We measure the whole compilation time, for each C source file, for three distinct compilation chains with respective solvers K, BIter and BRec. Results are given in Fig. 1. Each point compares the compilation time measures for a single file; its (x, y) coordinates correspond to the time measures for the two compared algorithms. Total compilation time with BRec is higher than with K by about 15%, and BIter is slightly more efficient than BRec, by about 2%. These compilation times for BRec and BIter both include the WTO validation time. Removing it from the compilation time with BRec reduces the overhead against K down to approximately 5%.
Fig. 1.
Total compilation time, by C file (in seconds, log/log scale). Left: K vs. BRec. Right: BIter vs. BRec.
Using a different solver algorithm can in theory influence the quality of compiled programs: more precise analysis results could lead to further optimized programs. In our case, however, the resulting assembly programs are syntactically identical for the three solvers, and their running times are hence the same. Note that obtaining identical assembly programs does not necessarily imply that the dataflow solvers have equal precision. It only means that program optimizations are currently insensitive to potential variations in the precision of the underlying analyses.
These first, high-level experiments show that BIter and BRec achieve acceptable performance in practice, both in time and precision. To gain a better understanding, we now compare the solvers with more fine-grained experiments.

7.5 Efficiency of the Solvers

We now study the solvers themselves, independently of program optimizations.
Analysis Time. We assess and compare the analysis time of the solvers, for each analysis, function by function. We include in the analysis time both the fixpoint computation time and the preprocessing time required before computing the fixpoint: the WTO computation and validation time for the BIter and BRec solvers, or the RPO numbering for the K solver. For long analysis times, BRec performs better, while K is better for short analysis times. Since analysis time is linearly correlated with the size of the CFG (in number of nodes), this means that, as the size of the function increases, the gain of iterating more efficiently outweighs the cost of computing and validating the WTO: solvers BRec and BIter scale better with function size. In conclusion, more efficient iterations do improve the fixpoint computation, but this improvement is partly offset by the overhead of computing the WTO.
Analysis Costs. We next consider computation costs of the dataflow solvers relative to transfer functions and join operators, rather than wall-clock time: we count the number of applications of T and \(\sqcup \) when computing fixpoints during the various analyses. Abstracting over the time spent in the particular implementations of T and \(\sqcup \) gives insights into the computation costs inherent to the fixpoint algorithm, independently of a particular analysis.
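In practice, such counts can be obtained by wrapping the two operators with counters before handing them to a solver; the following sketch uses our own illustrative names and types.

  (* Hedged sketch: instrument a transfer function and a join operator
     with counters, to measure solver-inherent costs. *)
  let count_transfer = ref 0
  and count_join = ref 0

  let instrument_transfer (t : int -> 'a -> 'a) : int -> 'a -> 'a =
    fun node fact -> incr count_transfer; t node fact

  let instrument_join (join : 'a -> 'a -> 'a) : 'a -> 'a -> 'a =
    fun a b -> incr count_join; join a b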
Our full results appear in our experiment report [16]. In a nutshell, BRec applies the transfer function T more often than K does, but applies drastically fewer \(\sqcup \) operations. Indeed, when analyzing a node, BRec retrieves the dataflow facts of its predecessors and applies the transfer function to each of them: T is applied once per predecessor each time a node is analyzed. K instead applies the transfer function of the node once, and then propagates the output dataflow fact to all of its successors. Conversely, BRec only applies \(\sqcup \) for component heads and for nodes with several predecessors, while K applies it when analyzing every node, to join the output dataflow fact with the fact previously computed for that node.
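The two disciplines can be contrasted schematically as follows; this is our own rendering (integer node indices, mutable fact arrays and operator names are assumptions), not the actual solver code.

  (* "Pull" step, in the style of BRec: the new fact at [n] joins the
     transferred facts of all its predecessors -- one T per predecessor,
     and joins only when there are several of them. *)
  let pull_step transfer join bot preds (facts : 'a array) n : 'a =
    List.fold_left (fun acc p -> join acc (transfer p facts.(p))) bot (preds n)

  (* "Push" step, in the style of K: apply T once at [n], then join the
     output into each successor's stored fact; successors whose fact
     changed would be re-inserted into the worklist. *)
  let push_step transfer join succs (facts : 'a array) n : unit =
    let out = transfer n facts.(n) in
    List.iter (fun s -> facts.(s) <- join facts.(s) out) (succs n)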
Finally, we compare the costs of BIter against those of BRec in terms of join operators. Both have similar costs for forward dataflow analyses, but differ for backward analyses, where BIter is almost always better than BRec. This is due to the fact that SCCs are discovered from a set of exit nodes instead of an entry node, as explained in Section 4. As a result, WTO component heads often correspond to the ends of loop bodies rather than to loop headers. In such a situation, BRec misses information from yet unprocessed nodes more often than BIter does: BRec may perform a large number of iterations on a subcomponent while missing information coming from nodes located after it, whereas BIter iterates on the whole outermost component, thus converging faster. This phenomenon is amplified for analyses like Allocation, which consider all nodes of the CFG as exit points.

7.6 Precision of the Solvers

We evaluate the precision of the solvers by comparing the abstract values they return, for each analysis and for each function. The comparison uses the partial order of the analysis pseudo-semi-lattice domain. Differences can occur because CompCert uses pseudo-semi-lattices instead of semi-lattices: there is no guarantee that \(\sqcup \) returns the least upper bound, so over-approximations may happen, depending on the processing order of the nodes. We observe a few differences in precision. For domains where \(\sqcup \) effectively computes the least upper bound, such as for Liveness or Allocation, we observe no difference across the solvers. However, the three solvers do not always compute the same result for Value Analysis, whose pseudo-semi-lattice is not a semi-lattice: for 6.1‰ of the 5766 analyzed functions, one of the three algorithms outputs, at some node of the CFG, at least one abstract value that differs from the other two.
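Concretely, the pairwise comparison amounts to checking the domain’s partial order pointwise over the per-node result maps; in the sketch below (with an array encoding and a parameter le, both our assumptions), two results are incomparable when neither direction holds, which can only happen with pseudo-semi-lattices.

  (* Sketch: r1 is at least as precise as r2 when r1.(i) is below
     r2.(i) for every node i; requires OCaml >= 4.11 for for_all2. *)
  let at_least_as_precise (le : 'a -> 'a -> bool) (r1 : 'a array) (r2 : 'a array) : bool =
    Array.for_all2 le r1 r2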

8 Related Work

This work lies at the intersection of two research areas. The first focuses on dataflow analysis algorithms, which are usually designed for specific settings: for instance, the abstract structures almost never have infinite ascending chains. This is the case in CompCert: all its optimizations rely on abstract domains whose ascending chains have bounded length. The second area concerns generic fixpoint solvers for static analysis by abstract interpretation, where Bourdoncle’s method is often the reference approach: it supports widening, which speeds up the convergence of the analysis and guarantees its termination. Widening is a necessity there, since such solvers potentially have to deal with lattices with very long, or even infinite, ascending chains. These solvers are usually described in a theoretical setting and are not specifically designed for dataflow analysis.

8.1 Dataflow Analysis

The main and historical approach to dataflow analysis is based on worklists [13, 21]. A formalization and correctness proof in Coq is available for Kildall’s method in the existing dataflow analysis framework of CompCert [20].
To simplify dataflow analysis on reducible graphs, one can decompose them into intervals of nodes [2, 3], which describe the main blocks of the underlying program; it then suffices to compute the abstract information at the head of each interval. Bourdoncle [5] describes a method to compute a WTO from an interval decomposition of a reducible CFG. However, optimization passes often make the CFG irreducible, so it is difficult to use such an approach on optimized programs. As a result, we chose not to implement this method in our work.
The experimental evaluation by Cooper et al. [10] of worklist algorithms addresses questions similar to our evaluation of Bourdoncle’s method: both assess the impact of the order in which a dataflow solver processes the nodes on the efficiency of the solver. Their main conclusion is that the efficiency of these solvers strongly depends on the data structure chosen for the worklist. We can draw a parallel with the fact that the efficiency of Bourdoncle’s method strongly depends on how the WTO is computed. In both cases, the efficiency of dataflow analysis algorithms mostly hinges on the choice of the iteration order.
Bourdoncle’s method, and more generally abstract interpretation techniques, bring an interesting alternative to the historical dataflow analysis algorithms. This approach has already been partially explored by Sherman [24] to improve the Soot Java optimization framework [26]. Soot was initially based on Kildall’s algorithm and did not support widening; as a result, it was difficult to adapt it to analyses whose lattices have long ascending chains, and Soot was not commonly used for program verification, or for complex program optimizations such as array bound check elimination [22]. In her work, Sherman redesigned the dataflow analysis interface of Soot; the new interface allows the implementation of abstract interpretation-based methods such as Bourdoncle’s. Note that her work only considers forward dataflow analyses.

8.2 Generic Solvers for Static Analysis

Bourdoncle’s iterations have many applications in the field of static analysis. In a recent work, Lemerre [18] used Bourdoncle’s recursive strategy to compute an SSA abstraction of a program. His algorithm is formalized on a theoretical toy language and implemented as a Frama-C plugin which translates C to an SSA intermediate representation of LLVM. Blazy et al. [4] developed a formally verified value analysis on an intermediate representation of CompCert. This analysis is based on abstract interpretation, and their fixpoint solver implements Bourdoncle’s method. Indeed, they perform an analysis on the lattice of bounded intervals; since this lattice contains infinite ascending chains, the analysis requires widening, so they could not reuse the Kildall framework of CompCert. However, their fixpoint solver is only validated a posteriori, whereas our paper presents a partial proof of correctness in Coq of the fixpoint computation itself (termination is not proved, but guaranteed by the introduction of fuel).
Local solvers are often used for static program analysis. They compute the information at a given node only by exploring its dependencies. The problem they address is thus different from ours: the interface we consider here requires an analysis result on all the nodes of the graph. Some program optimization techniques also use local solvers, e.g. for liveness queries [17], but this is not the case for CompCert’s general dataflow framework.
A historical example of a local solver is the top-down solver (TD) [7]. The idea is to simulate chaotic iterations from a node while building the graph of its dependencies. The simulations are done recursively: when a dependency p of a node n is encountered, a sub-simulation is performed on p, and the computed abstract values are updated each time a sub-simulation completes. The sub-simulation on p has to be redone if the abstract value at n has changed in the meantime. The RLD algorithm (for Recursive Local Dependencies) [11, 23] reuses the same idea but handles the dependencies slightly more precisely: when the value at a node n changes, TD recursively removes all the nodes that depend on n from the list of solved nodes and has to solve them again, whereas RLD only removes the successors of n. RLD has been formalized in Coq by Hofmann [11].
These algorithms indeed follow a “top-down” approach: they start from the node to analyze (the top) and go down as they recursively analyze its dependencies. “Bottom-up” algorithms instead compute a fixpoint on a pre-determined set of nodes containing the node to analyze; worklist algorithms are examples of “bottom-up” algorithms.
Kim et al. recently presented a variant of Bourdoncle’s method that parallelizes the fixpoint computation [14]. This variant is based on the notion of weak partial order (WPO), a generalization of Bourdoncle’s WTO. While a WTO is a total order on the reachable nodes of the graph, a WPO is a partial order: non-comparable nodes can thus be processed in parallel, since they are not interdependent.

9 Conclusion and Perspectives

This work provides a formalization of Bourdoncle’s WTO-based method for solving dataflow problems. It is supported by an implementation verified within the Coq proof assistant, and fits into a realistic context: compiler optimizations in CompCert. We achieve performance comparable to that of Kildall’s historical dataflow algorithm, while keeping the same level of precision. Finally, our formal proof allows for a better understanding of the correctness arguments of the solvers.
Our dataflow solver could be improved in several directions. First, we could implement and verify the SCC algorithm directly in Coq to compute the WTO, thus avoiding the non-negligible overhead of the validator; we could adapt the formally verified implementation of Tarjan’s SCC algorithm by Chen et al. [8] to the context of WTO computation. Second, we could refine our implementation by distinguishing upper-bound and widening operators. Depending on whether a node is a widening point, identified by the component heads of the WTO, applying a widening operator rather than a plain upper-bound operator could make the analyses converge faster in practice, with no consequence on our formal correctness proof as long as we keep a fuel-bounded recursion technique, as sketched below.
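A minimal sketch of this refinement, under our own naming assumptions: widening is applied only at component heads, and the fuel bound makes termination trivial, leaving only the soundness of a returned result to the Coq proof.

  (* Combine an incoming fact with the stored one: widen at widening
     points (WTO component heads), plain upper bound elsewhere. *)
  let combine ~is_head join widen old_fact incoming =
    if is_head then widen old_fact incoming   (* enforces convergence *)
    else join old_fact incoming

  (* Fuel-bounded iteration of one solving step: termination holds by
     construction; running out of fuel reports failure explicitly. *)
  let rec iterate fuel (step : 'st -> 'st) (stable : 'st -> 'st -> bool) st =
    if fuel = 0 then None
    else
      let st' = step st in
      if stable st st' then Some st' else iterate (fuel - 1) step stable st'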
Another avenue for research, from a formal proof perspective, would be to specify widening operators more precisely, in order to study the termination of the solver algorithms and other related convergence notions.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://​creativecommons.​org/​licenses/​by/​4.​0/​), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Footnotes
1. A back-edge in a graph is defined as an edge whose target dominates its source [1].
2. Introducing this parameter simplifies the implementation and the correctness proof.
References
1.
Aho, A.V., Lam, M.S., Sethi, R., Ullman, J.D.: Compilers: Principles, Techniques, and Tools (2nd Edition). Addison-Wesley Longman Publishing Co., Inc., USA (Jul 2006)
4.
Blazy, S., Laporte, V., Maroneze, A., Pichardie, D.: Formal verification of a C value analysis based on abstract interpretation. In: Fahndrich, M., Logozzo, F. (eds.) SAS - 20th Static Analysis Symposium. Lecture Notes in Computer Science, vol. 7935, pp. 324–344. Springer, Seattle, United States (Jun 2013). https://doi.org/10.1007/978-3-642-38856-9_18
5.
Bourdoncle, F.: Efficient chaotic iteration strategies with widenings. In: Bjørner, D., Broy, M., Pottosin, I.V. (eds.) Formal Methods in Programming and Their Applications, pp. 128–141. Springer Berlin Heidelberg, Berlin, Heidelberg (1993). https://doi.org/10.1007/BFb0039704
6.
Bourdoncle, F.J.: Sémantiques des langages impératifs d'ordre supérieur et interprétation abstraite. Ph.D. thesis, École polytechnique, Palaiseau (1992)
7.
Charlier, B.L., Van Hentenryck, P.: A universal top-down fixpoint algorithm. Tech. rep., Brown University, USA (1992)
8.
Chen, R., Cohen, C., Lévy, J., Merz, S., Théry, L.: Formal proofs of Tarjan's strongly connected components algorithm in Why3, Coq and Isabelle. In: Harrison, J., O'Leary, J., Tolmach, A. (eds.) 10th International Conference on Interactive Theorem Proving, ITP 2019, September 9-12, 2019, Portland, OR, USA. LIPIcs, vol. 141, pp. 13:1–13:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019). https://doi.org/10.4230/LIPICS.ITP.2019.13
9.
Conchon, S., Filliâtre, J.C., Signoles, J.: Designing a generic graph library using ML functors (Apr 2007)
12.
Jourdan, J.H., Laporte, V., Blazy, S., Leroy, X., Pichardie, D.: A formally-verified C static analyzer. In: Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '15, pp. 247–259. Association for Computing Machinery, New York, NY, USA (2015). https://doi.org/10.1145/2676726.2676966
13.
Kildall, G.A.: A unified approach to global program optimization. In: Proceedings of the 1st Annual ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, POPL '73, pp. 194–206. Association for Computing Machinery, New York, NY, USA (1973). https://doi.org/10.1145/512927.512945
22.
Qian, F., Hendren, L., Verbrugge, C.: A Comprehensive Approach to Array Bounds Check Elimination for Java. In: Goos, G., Hartmanis, J., Van Leeuwen, J., Horspool, R.N. (eds.) Compiler Construction. Lecture Notes in Computer Science, vol. 2304, pp. 325–341. Springer Berlin Heidelberg, Berlin, Heidelberg (2002). https://doi.org/10.1007/3-540-45937-5_23
26.
Vallée-Rai, R., Co, P., Gagnon, E., Hendren, L., Lam, P., Sundaresan, V.: Soot: A Java bytecode optimization framework. In: CASCON First Decade High Impact Papers, CASCON '10, pp. 214–224. ACM Press, Toronto, Ontario, Canada (2010). https://doi.org/10.1145/1925805.1925818
DOI: https://doi.org/10.1007/978-3-031-91121-7_4