6.1 Phase-\(\text {PDR}^\forall \) for Inferring Universally Quantified Characterizations
We now describe our procedure for solving the CHCs system of Sect.
5.1. It either (i) returns universally quantified phase characterizations that induce a safe inductive phase invariant, (ii) returns an abstract counterexample trace demonstrating that this is not possible, or (iii) diverges.
EPR. Our procedure handles transition systems expressed using the extended
Effectively
PRopositional fragment (EPR) of first order logic [
51,
52], and infers universally quantified phase characterizations. Satisfiability of (extended) EPR formulas is decidable, enjoys the finite-model property, and supported by solvers such as Z3 [
46] and iProver [
41].
Phase-\({{\varvec{PDR}}}^\forall \) . Our procedure is based on
\(\text {PDR}^\forall \) [
40], a variant of PDR [
10,
21] that infers universally quantified inductive invariants. PDR computes a sequence of
frames \(\mathcal {F}_0, \ldots , \mathcal {F}_n\) such that
\(\mathcal {F}_i\) overapproximates the set of states reachable in
i steps. In our case, each frame
\(\mathcal {F}_i\) is a mapping from a phase
\(q\) to characterizations. The details of the algorithm are standard for PDR; we describe the gist of the procedure in the extended version [
24]. We only stress the following: Counterexamples to safety take into account the safety property as well as disabled transitions. Search for predecessors is performed by going backwards on automaton edges, blocking counterexamples from preceding phases to prove an obligation in the current phase. Generalization is performed w.r.t. all incoming edges. As in
\(\text {PDR}^\forall \), proof obligations are constructed via diagrams [
12]; in our setting these include the interpretation for the view quantifiers (see [
24] for details).
Edge Covering Check in EPR. In our setting, Eqs. (
1), (
2) and (
4) fall in EPR, but not Eq. (
3). Thus, we restrict edge labeling so that each edge is labeled with a
\({\textit{TR}}\) of an
action
, together with an alternation-free precondition. It then suffices to check implications between the preconditions and the entire
\({\textit{TR}}\) (see the extended version [
24]). Such edge labeling is sufficiently expressive for all our examples. Alternatively, sound but incomplete bounded quantifier instantiation [
23] could be used, potentially allowing more complex decompositions of
\({\textit{TR}}\).
Absence of Inductive Phase Characterizations. What happens when the user gets the automaton wrong? One case is when there does not exist an inductive phase invariant with universal phase characterizations over the given structure. When this occurs, our tool can return an
abstract counterexample trace—a sequence of program transitions and transitions of the automaton (inspired by [
40,
49])—which constitutes a proof of that fact (see the extended version [
24]). The counterexample trace can assist the user in debugging the automaton or the program and modifying them. For instance, missing edges occurred frequently when we wrote the automata of Sect.
6, and we used the generated counterexample traces to correct them.
Another type of failure is when an inductive phase invariant exists but the automaton does not direct the search well towards it. In this case the user may decide to terminate the analysis and articulate a different intuition via a different phase structure. In standard inference procedures, the only way to affect the search is by modifying the transition system; instead, phase structures equip the user with an ability to guide the search.
6.2 Evaluation
We evaluate our approach for user-guided invariant inference by comparing Phase-
\(\text {PDR}^\forall \) to standard
\(\text {PDR}^\forall \). We implemented
\(\text {PDR}^\forall \) and Phase-
\(\text {PDR}^\forall \) in
mypyvy [
2], a new system for invariant inference inspired by Ivy [
45], over Z3 [
46]. We study:
1.
Can Phase-\(\text {PDR}^\forall \) converge to a proof when \(\text {PDR}^\forall \) does not (in reasonable time)?
2.
Is Phase-\(\text {PDR}^\forall \) faster than \(\text {PDR}^\forall \)?
3.
Which aspects of Phase-\(\text {PDR}^\forall \) contribute to its performance benefits?
Protocols. We applied
\(\text {PDR}^\forall \) and Phase-
\(\text {PDR}^\forall \) to the most challenging examples admitting universally-quantified invariants, which previous works verified using deductive techniques. The protocols we analyzed are listed below and in Table
1. The full models appear in [
1]. The KV-R protocol analyzed is taken from one of the two realistic systems studied by the IronFleet paper [
33] using deductive verification.
Phase Structures. The phase structures we used appear in [
1]. In all our examples, it was straightforward to translate the existing high-level intuition of important and relevant distinctions between phases in the protocol into the phase structures we report. For example, it took us less than an hour to finalize an automaton for KV-R. We emphasize that phase structures do not include phase characterizations; the user need not supply them, nor has to understand the inference procedure. Our exposition of the phase structures below refers to an intuitive meaning of each phase, but this is not part of the phase structure provided to the tool.
Table 1.
Running times in seconds of \(\text {PDR}^\forall \) and Phase-\(\text {PDR}^\forall \), presented as the mean and standard deviation (in parentheses) over 16 different Z3 random seeds. “\({}^*\)” indicates that some runs did not converge after 1 h and were not included in the summary statistics. “> 1 h” means that no runs of the algorithm converged in 1 h. #p refers to the number of phases and #v to the number of view quantifiers in the phase structure. #r refers to the number of relations and |a| to the maximal arity. The remaining columns describe the inductive invariant/phase invariant obtained in inference. |f| is the maximal frame reached. #c, #q are the mean number of clauses and quantifiers (excluding view quantifiers) per phase, ranging across the different phases.
Lock service (single lock) | 2.21 (00.03) | 0.67 (0.01) | 4 | 1 | 5 | 1 | 11 | 9 | 15 | 6 | 3–4 | 3–4 |
Lock service (multiple locks) | 2.73 (00.02) | 1.06 (0.01) | 4 | 1 | 5 | 2 | 11 | 9 | 24 | 6 | 4 | 3–4 |
Consensus | 60.54 (2.95) | 1355 (570)\(^*\) | 3 | 1 | 7 | 2 | 9 | 6 | 15 | 12 | 5–6 | 10–14 |
KV (basic) | 1.79 (0.02) | 1.59 (0.02) | 2 | 1 | 3 | 3 | 5 | 7 | 27 | 5 | 4 | 9–10 |
Ring leader | 152.44 (39.41) | 2.53 (0.04) | 2 | 2 | 4 | 3 | 6–7 | 6 | 11 | 5 | 1–2 | 0–1 |
KV-R | 2070 (370)\(^*\) | 372.5 (35.9) | 2 | 1 | 7 | 5 | 12–15 | 24 | 156 | 11–13 | 5–11 | 15–67 |
Cache coherence | >1 h | 90.1 (0.82) | 10 | 1 | 11 | 2 | n/a | n/a | n/a | 13 | 10–15 | 12–27 |
(1) Achieving Convergence Through Phases. In this section we consider the effect of phases on inference for examples on which standard \(\text {PDR}^\forall \) does not converge in 1 hr.
Examples. Sharded key-value store with retransmissions (KV-R): see Sect.
3 and Example
1. This protocol has not been modeled in decidable logic before.
Cache Coherence. This example implements the classic MESI protocol for maintaining cache coherence in a shared-memory multiprocessor [
36], modeled in decidable logic for the first time. Cores perform reads and writes to memory, and caches snoop on each other’s requests using a shared bus and maintain the invariant that there is at most one writer of a particular cache line. For simplicity, we consider only a single cache line, and yet the example is still challenging for
\(\text {PDR}^\forall \). Standard explanations of this protocol in the literature already use automata to describe this invariant, and we directly exploit this structure in our phase automaton.
Phase Structure: There are 10 phases in total, grouped into three parts corresponding to the modified, exclusive, and shared states in the classical description. Within each group, there are additional phases for when a request is being processed by the bus. For example, in the shared group, there are phases for handling reads by cores without a copy of the cache line, writes by such cores, and also writes by cores that
do have a copy. Overall, the phase structure is directly derived from textbook descriptions, taking into account that use of the shared bus is not atomic.
Results and Discussion. Measurements for these examples appear in Table
1. Standard
\(\text {PDR}^\forall \) fails to converge in less than an hour on 13 out of 16 seeds for KV-R and all 16 seeds for the cache. In contrast, Phase-
\(\text {PDR}^\forall \) converges to a proof in a few minutes in all cases. These results demonstrate that phase structures can effectively guide the search and obtain an invariant quickly where standard inductive invariant inference does not.
(2) Enhancing Performance Through Phases. In this section we consider the use of phase structures to improve the speed of convergence to a proof.
Examples. Distributed lock service, adapted from [
61], allows clients to acquire and release locks by sending requests to a central server, which guarantees that only one client holds each lock at a time.
Phase structure: for each lock, the phases follow the 4 steps by which a client completes a cycle of acquire and release. We also consider a simpler variant with only a single lock, reducing the arity of all relations and removing the need for an automaton view. Its
phase structure is the same, only for a single lock.
Simple quorum-based consensus, based on the example in [
60]. In this protocol, nodes propose themselves and then receive votes from other nodes. When a quorum of votes for a node is obtained, it becomes the leader and decides on a value. Safety requires that decided values are unique. The
phase structure distinguishes between the phases before any node is elected leader, once a node is elected, and when values are decided. Note that the automaton structure is unquantified.
Leader election in a ring [
13,
51], in which nodes are organized in a directional ring topology with unique IDs, and the safety property is that an elected leader is a node with the highest ID.
Phase structure: for a view of two nodes
\(n_1,n_2\), in the first phase, messages with the ID of
\(n_1\) are yet to advance in the ring past
\(n_2\), while in the second phase, a message advertising
\(n_1\) has advanced past
\(n_2\). The inferred characterizations include another quantifier on nodes, constraining interference (see Sect.
7).
Sharded key-value store (KV) is a simplified version of KV-R above, without message drops and the retransmission mechanism. The phase structure is exactly as in KV-R, omitting transitions related to sequence numbers and acknowledgment. This protocol has not been modeled in decidable logic before.
Results and Discussion. We compare the performance of standard
\(\text {PDR}^\forall \) and Phase-
\(\text {PDR}^\forall \) on the above examples, with results shown in Table
1. For each example, we ran the two algorithms on 16 different Z3 random seeds. Measurements were performed on a 3.4GHz AMD Ryzen Threadripper 1950X with 16 physical cores, running Linux 4.15.0, using Z3 version 4.7.1. By disabling hyperthreading and frequency scaling and pinning tasks to dedicated cores, variability across runs of a single seed was negligible.
In all but one example, Phase-
\(\text {PDR}^\forall \) improves performance, sometimes drastically; for example, performance for leader election in a ring is improved by a factor of 60. Phase-
\(\text {PDR}^\forall \) also improves the
robustness of inference [
27] on this example, as the standard deviation falls from 39 in
\(\text {PDR}^\forall \) to 0.04 in Phase-
\(\text {PDR}^\forall \).
The only example in which a phase structure actually diminishes inference effectiveness is simple consensus. We attribute this to an automaton structure that does not capture the essence of the correctness argument very well, overlooking votes and quorums. This demonstrates that a phase structure might guide the search towards counterproductive directions if the user guidance is “misleading”. This suggests that better resiliency of interactive inference framework could be achieved by combining phase-based inference with standard inductive invariant-based reasoning. We are not aware of a single “good” automaton for this example. The correctness argument of this example is better captured by the conjunction of two automata (one for votes and one for accumulating a quorum) with different views, but the problem of inferring phase invariants for mutually-dependent automata is a subject for future work.
(3) Anatomy of the Benefit of Phases. We now demonstrate that each of the beneficial aspects of phases discussed in Sect.
5.2 is important for the benefits reported above.
Phase Decomposition. Is there a benefit from a phase structure even without disabled transitions? An example to a positive answer to this question is leader election in a ring, which demonstrates a huge performance benefit even without disabled transitions.
Disabled Transitions. Is there a substantial gain from exploiting disabled transitions? We compare Phase-
\(\text {PDR}^\forall \) on the structure with disabled transitions and a structure obtained by (artificially) adding self loops labeled with the originally impossible transitions, on the example of lock service with multiple locks (Sect.
6.2), seeing that it demonstrates a performance benefit using Phase-
\(\text {PDR}^\forall \) and showcases several disabled transitions in each phase. The result is that without disabled transitions, the mean running time of Phase-
\(\text {PDR}^\forall \) on this example jumps from 2.73 s to 6.24 s. This demonstrates the utility of the additional safety properties encompassed in disabled transitions.
Phase-Awareness. Is it important to treat phases explicitly in the inference algorithm, as we do in Phase-
\(\text {PDR}^\forall \) (Sect.
6.1)? We compare our result on convergence of KV-R with an alternative in which standard
\(\text {PDR}^\forall \) is applied to an encoding of the phase decomposition and disabled transition by
ghost state: each phase is modeled by a relation over possible view assignments, and the model is augmented with update code mimicking phase changes; the additional safety properties derived from disabled transitions are provided; and the view and the appropriate modification of the safety property are introduced. This translation expresses all information present in the phase structure, but does not explicitly guide the inference algorithm to use this information. The result is that with this ghost-based modeling the phase-oblivious
\(\text {PDR}^\forall \) does not converge in 1 h on KV-R in any of the 16 runs, whereas it converges when Phase-
\(\text {PDR}^\forall \) explicitly directs the search using the phase structure.