Skip to main content
Erschienen in:
Buchtitelbild

Open Access 2019 | OriginalPaper | Buchkapitel

cvc4sy: Smart and Fast Term Enumeration for Syntax-Guided Synthesis

verfasst von : Andrew Reynolds, Haniel Barbosa, Andres Nötzli, Clark Barrett, Cesare Tinelli

Erschienen in: Computer Aided Verification

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

We present cvc4sy, a syntax-guided synthesis (SyGuS) solver based on three bounded term enumeration strategies. The first encodes term enumeration as an extension of the quantifier-free theory of algebraic datatypes. The second is based on a highly optimized brute-force algorithm. The third combines elements of the others. Our implementation of the strategies within the satisfiability modulo theories (SMT) solver cvc4 and a heuristic to choose between them leads to significant improvements over state-of-the-art SyGuS solvers.

1 Introduction

Syntax-guided synthesis (SyGuS) [3] is a recent paradigm for program synthesis, successfully used for applications in formal verification and programming languages. Most SyGuS solvers perform counterexample-guided inductive synthesis (CEGIS) [16]: a refinement loop in which a learner proposes solutions, and a verifier, generally a satisfiability modulo theories (SMT) solver [8, 9], checks them and provides counterexamples for failures. Generally, the learner enumerates some set of terms, while pruning spurious ones [17]. The simplicity and efficacy of enumerative SyGuS have made it the de facto approach for SyGuS, although alternatives exist for restricted fragments [4, 14].
In previous work [14], we have shown how the SMT solver cvc4  [5] can itself act as an efficient synthesizer. This tool paper focuses on recent advances in the enumerative subsolver of cvc4, culminating in the current SyGuS solver cvc4sy. Figure 1 shows its main components. The term enumerator is parameterized by an enumeration strategy chosen before solving: cvc4sy_S, whose constraint-based (smart) enumeration allows for numerous optimizations (Sect. 2); cvc4sy_F, based on a new approach for (fast) enumerative synthesis (Sect. 3), which has significant advantages with respect to the enumerative solver cvc4sy_S and other state-of-the-art approaches; and cvc4sy_H, based on a hybrid approach combining smart and fast enumeration (Sect. 4). All strategies are fully integrated in cvc4, meaning they support inputs in many background theories, including arithmetic, bit-vectors, strings, and floating point. We evaluate these approaches on a large set of benchmarks (Sect. 5).
The Problem. A syntax-guided synthesis problem for a function f in a background theory T consists of a set of semantic restrictions, or specification, for f given by a (second-order) T-formula of the form https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq1_HTML.gif , and a set of syntactic restrictions on the solutions for f, typically expressed as a context-free grammar. An enumerative approach to this problem combines a term enumerator and a solution verifier for solving synthesis conjectures. The role of the term enumerator is to output a stream of terms \(t_1, t_2, \ldots \) over some tuple \(\bar{x}\) of variables representing the inputs of f, where each \(t_i[\bar{x}]\) is a candidate solution. The role of the solution verifier is to check for each \(t_i\) whether it is a solution for f by determining if the negated conjecture https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq6_HTML.gif is unsatisfiable.
Bounded term generation considers terms based on an ordering such as term size (the number of non-nullary symbols in a term). For each \(k=0,1,2,\ldots \), the term enumerator outputs a finite set \(S_k\) of terms, each of size at most \(k\). Bounded term generation in cvc4sy is complete in the sense that, for any k, if f has a solution of size at most \(k\), then at least one of the terms in \(S_k\) is a solution for f. The effectiveness of an approach for (complete) bounded term generation can be evaluated based on two criteria: (i) the number of terms it generates and (ii) the rate at which it generates them.
We follow two approaches for enumerative SyGuS in cvc4sy, each optimized for one of the criteria above: a smart approach and a fast one. The first aims to generate reasonably quickly the smallest set of terms while maintaining completeness, while the second aims to generate terms as quickly as possible.
Technical Preliminaries. As we showed in previous work [14], syntactic restrictions can be conveniently represented as a set of (algebraic) datatypes, for which some SMT solvers have dedicated decision procedures [7, 13]. For instance, given a function https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq12_HTML.gif and the context-free grammar R below specifying what integer (\(I\)) and Boolean (\(B\)) terms can appear in candidate solutions for f:
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/MediaObjects/487309_1_En_5_Equ1_HTML.png
(1)
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/MediaObjects/487309_1_En_5_Equ2_HTML.png
(2)
our SyGuS solver generates the following mutually recursive datatypes:
$$\begin{aligned} \mathcal {I}= & {} \mathsf {0} \ \mid \ \mathsf {1} \ \mid \ \mathsf {x} \ \mid \ \mathsf {y} \ \mid \ \mathsf {plus}( \mathcal {I}, \mathcal {I}) \ \mid \ \mathsf {minus}( \mathcal {I}, \mathcal {I}) \ \mid \ \mathsf {ite}( \mathcal {B}, \mathcal {I}, \mathcal {I}) \end{aligned}$$
(3)
$$\begin{aligned} \mathcal {B}= & {} \mathsf {geq}( \mathcal {I}, \mathcal {I}) \ \mid \ \mathsf {eq}( \mathcal {I}, \mathcal {I}) \ \mid \ \mathsf {not}( \mathcal {B}) \ \mid \ \mathsf {and}( \mathcal {B}, \mathcal {B}) \end{aligned}$$
(4)
Each datatype constructor corresponds to a production rule of R, e.g. \(\mathsf {plus}\) corresponds to the rule \(I::= I+ I\). A datatype term such as \(\mathsf {plus}( \mathsf {x}, \mathsf {y} )\) represents the arithmetic term \(x+y\). We will use these datatypes as a running example.
For a datatype term t, we write \(\mathsf {is}_{\mathsf {C}}( t )\) to denote the discriminator predicate that is satisfied exactly when t is interpreted as a datatype whose top constructor is \(\mathsf {C}\). We write \(\mathsf {sel}^{\tau }_{n}( t )\) to denote a shared selector [15] applied to t, interpreted as the \(n^{ th }\) child of t with type \(\tau \) if one exists, and interpreted as an arbitrary element of \(\tau \) otherwise. A term consisting of zero or more consecutive nested applications of shared selectors applied to a term t is a shared selector chain (for t).

2 Smart Enumerative SyGuS

Our smart enumerative SyGuS approach cvc4sy_S, is based on finding solutions for an evolving set of constraints in an extension of the quantifier-free fragment of algebraic datatypes. These constraints are constructed to rule out many redundant solutions while not overconstraining the problem, potentially missing actual solutions.
In detail, candidate solutions for the function \(f: \tau _1 \rightarrow \tau _2\) to be synthesized are constructed by maintaining a set of constraints F, initially empty, for a first-order variable d ranging over the datatype representing \(\tau _2\). For example, consider again the function f with the syntactic restrictions expressed by the datatypes in Eqs. 3 and 4. If the term generator finds a model for F, it provides to the solution verifier the integer term which corresponds to the value of d in the model; for example, it provides \(x+1\) when d is interpreted as \(\mathsf {plus}( \mathsf {x}, \mathsf {1})\). In turn, if the solution verifier finds that \(x+1\) is not a solution, it provides the blocking constraint https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq30_HTML.gif , i.e., the datatype constraint that rules out the current value for d, which is then added to F. This is a syntactic constraint on future candidate solutions from the term generator. Its atoms are discriminators applied to shared selector chains.
cvc4sy_S uses a number of optimization techniques in addition to the basic loop above, which we describe in the remainder of this section. These techniques produce blocking constraints via the lemmas-on-demand paradigm [6] that eagerly rule out spurious candidates, prior to the solution verification step. Additionally, whenever possible, it strengthens blocking constraints via novel generalization techniques, with the effect of ruling out larger classes of candidates.
Blocking via Theory Rewriting with Structural Generalization. As we describe in previous work [14], the enumerative solver of cvc4 uses its rewriter as an oracle for discovering when candidate solutions are redundant. The motivation is that for any two equivalent terms t and s, only one of them needs to be checked with the solution verifier, since either both t and s are solutions to the synthesis conjecture or neither is. Given a term t, we write https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq31_HTML.gif to denote its rewritten form. Note that it is possible for equivalent terms not to have the same rewritten form. This is a consequence of the trade-offs in the implementation of cvc4 ’s rewriter, which must balance efficiency and completeness.
As an example, suppose that the term enumerator previously generated \(x+y\) and that d’s current value is the datatype term representing \(y+x\), where, however, https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq34_HTML.gif . We first generate a blocking constraint template \(R[z]\) of the form https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq36_HTML.gif , where z is a fresh variable. This template is subsequently instantiated with https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq37_HTML.gif for any shared selector chain u of type \(\mathcal {I}\) that currently (or later) appears in F, starting with d itself. This has the effect of ruling out all candidate solutions that have \(y+x\) as a subterm, which is justified by the fact that each such term is equivalent to one in which all occurrences of \(y+x\) are replaced by \(x+y\).
We employ a refinement of this technique, which we call theory rewriting with structural generalization, which searches for and then blocks only the minimal skeleton of the term under test that is sufficient for determining its rewritten form. For example, consider the if-then-else term https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq42_HTML.gif , This term is equivalent to x, regardless of the value of predicate https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq43_HTML.gif . This can be confirmed by the rewriter by computing that https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq44_HTML.gif where w is a fresh Boolean variable. Then, instead of generating a constraint that blocks only (the datatype value corresponding to) t, we generate a stronger constraint that does not depend on the subterm https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq45_HTML.gif . In other words, this blocking constraint rules out all candidate solutions that contain the subterm https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq46_HTML.gif , for any term w. We compute these generalizations using a recursive algorithm that iteratively replaces each subterm of the current candidate with a fresh variable, and checks whether its rewritten form remains the same.
Blocking via CEGIS with Structural Generalization. Synthesis solvers based on CEGIS maintain a list of refinement points that witness the infeasibility of previous candidate solutions. That is, given a synthesis conjecture https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq47_HTML.gif , the solver maintains a growing list \(\bar{p}_1, \ldots , \bar{p}_n\) of values for \(\bar{x}\) that witness the infeasibility of previous candidates \(u_1, \ldots , u_n\) for f. Then, when a new candidate u is generated, we first check whether \(\varphi [ u,\, \bar{p}_i ]\) is false for some https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq52_HTML.gif . When a candidate u fails to satisfy \(\varphi [ u, \bar{p}_i ]\), cvc4sy_S further applies a form of generalization analogous to the structural generalization described above. We call this CEGIS with structural generalization, where the goal is to find the minimal skeleton of u that also fails to satisfy some refinement point.
For example, suppose f is the function to synthesize, \(\varphi \) includes the constraint https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq55_HTML.gif , and \(p_1 = (3,3)\) is a refinement point. Then, the candidate term https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq57_HTML.gif will be discarded, because https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq58_HTML.gif . Notice, however, that any candidate https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq59_HTML.gif is falsified by \(p_1\), regardless of what w is, since https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq61_HTML.gif is equivalent to https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq62_HTML.gif . This indicates that we can block all \(\mathsf {ite}\) candidate terms with condition https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq64_HTML.gif and true branch x. We can express this constraint in cvc4sy_S by dropping the disjuncts that relate to the false branch of the \(\mathsf {ite}\) term. This form of blocking is particularly useful when synthesizing multiple functions \(( f_1,\) \(\ldots , f_n )\), since it is often the case that a candidate for a single \(f_i\) is already sufficient to falsify the specification, regardless of what the candidates for the other functions are.
Evaluation Unfolding. This technique uses evaluation functions to encode the relationship between the datatype terms assigned to d and their analogs in the theory T. For example, the evaluation function for the datatype \(\mathcal {I}\) defined in (3) is a function https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq70_HTML.gif defined axiomatically so that \(\mathsf {E}_{\mathcal {I}}(d,m,n)\) denotes the result of evaluating d by interpreting any occurrences of \(\mathsf {x}\) and \(\mathsf {y}\) in d respectively as m and n and interpreting the other constructors as the corresponding arithmetic/Boolean operators, e.g. \(\mathsf {E}_{\mathcal {I}}( \mathsf {minus}( \mathsf {x}, \mathsf {y} ), 5, 3 )\) is interpreted as 2. When a refinement point \(\bar{c}\) is generated, we add a constraint requiring that the evaluation of d at \(\bar{c}\) must satisfy the specification. For example, for conjecture https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq77_HTML.gif , and refinement point https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq78_HTML.gif , we add the constraint https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq79_HTML.gif . Then, when a literal \(\mathsf {is}_{\mathsf {C}}( t )\) is asserted for a term t of type \(\mathcal {I}\), we can add a constraint corresponding to the one-step unfolding of the evaluation of t. Specifically, when \(\mathsf {is}_{\mathsf {ite}}( d )\) is asserted, we generate the constraint indicating that the evaluation of d on point (2, 1) indeed behaves like an \(\mathsf {ite}\) term when d has top symbol \(\mathsf {ite}\). Our implementation adds these constraints for all terms t whose top symbols correspond to \(\mathsf {ite}\) or Boolean connectives. For terms t whose top symbol is any of the other operators, we add constraints corresponding to their total evaluation of t when the value of t is fully determined, for example, https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq86_HTML.gif . Notice this constraint with \(t=d\) along with the refinement constraint https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq88_HTML.gif suffices to show that d cannot be \(\mathsf {plus}( \mathsf {x}, \mathsf {y} )\).

3 Fast Enumerative SyGuS

The techniques in the previous section prune the search space so that often, only a small subset of the entire possible set of terms is considered for a given term size bound. The main bottleneck, however, is managing the large number of blocking constraints generated. Moreover, the benefits of this approach are limited when the grammar or specification does not admit opportunities for generalization.
For this reason, we have also developed cvc4sy_F, which, in the spirit of other SyGuS solvers (notably ESolver  [17]), relies on a principled brute-force approach for term generation. In contrast to other solvers, however, which are built as layers on top of the core SMT reasoner, cvc4sy_F is fully integrated as a subsolver of cvc4, so communication with other components has almost no overhead. This technique, fast enumerative synthesis, does not use constraint solving to generate new terms. As a result, the majority of optimizations from Sect. 2 are incompatible with it.
Algorithm. To generate terms up to a given size k, we maintain a set \(S^k_\tau \) of terms of type \(\tau \) and size k for each datatype \(\tau \) corresponding to a non-terminal symbol of our input grammar R. First, we compute for each such \(\tau \) the set \(\mathcal {C}_\tau \) of its constructor classes, an equivalence relation over the constructors of \(\tau \) that groups them by their type. For example, the constructor classes for \(\mathcal {I}\) are \(\{ \mathsf {x}, \mathsf {y}, \mathsf {0}, \mathsf {1} \}\), \(\{ \mathsf {plus}, \mathsf {minus} \}\) and \(\{ \mathsf {ite} \}\). Then, we use the following procedure for generating all terms of size \(k\) for type \(\tau \): The recursive procedure FastEnum(\(\tau \), \(k\)) populates the set \(S^{k}_{\tau }\) of all terms of type \(\tau \) with size \(k\). These sets are cached globally. We incorporate an optimization that only adds terms \(\mathsf {C}( t_1, \ldots , t_n )\) to \(S^{k}_{\tau }\) whose corresponding terms in the theory T are unique up to rewriting. This mimics the effect of blocking via theory rewriting as described in Sect. 2. For example, \(\mathsf {plus}( \mathsf {y}, \mathsf {x} )\) is not added to \(S^{1}_{\mathcal {I}}\) if that set already contains \(\mathsf {plus}( \mathsf {x}, \mathsf {y} )\), noting that https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq126_HTML.gif . By construction of \(S^{k}_{\tau }\) for https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq128_HTML.gif , this has the cascading effect of excluding all terms having \(y+x\) as a subterm.
We observe that theory rewriting with structural generalization cannot be easily incorporated into this scheme since it requires the use of a constraint solver, something that the above algorithm seeks to avoid.

4 Hybrid Approach: Variable-Agnostic Enumerative SyGuS

We follow a third approach, in solver cvc4sy_H, that combines elements of the previous approaches. The idea is to use the (smart) approach from Sect. 2 to generate terms, but then generate multiple candidate solutions from each term using a fast subprocedure we call a concretizer. We implement an instance of this scheme, which we call variable-agnostic term generation, that produces only terms that are unique modulo alpha-equivalence. In our running example, when a term t such as \(x+1\) is produced, the concretizer produces all terms generated by the grammar R that are alpha-equivalent to t, namely, \(\{ x+1, y+1 \}\) in this case. The advantage of this approach is that cvc4sy_H can block any term whose variables are not canonically ordered; that is, assuming for instance that https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq132_HTML.gif , it may block terms like \(1-y\) and \(y+y\), noting they are alpha-equivalent to \(1-x\) and \(x+x\), respectively. To implement this blocking scheme, we introduce unary Boolean predicates \(pre_{x}\) and \(post_{x}\) for each variable x in our grammar, where \(pre_{x}\) (resp., \(post_{x}\)) holds for t if and only if variable x occurs in a depth-first left-to-right traversal of our candidate term before (resp., after) traversing to the position indicated by the selector chain t. We encode the semantics of these predicates based on the arguments of constructors in our signature, e.g. https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq141_HTML.gif . We then assert that \(pre_{x}\) and \(pre_{y}\) are false for our top-level variable d, and require https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq144_HTML.gif for all z, stating that x must come before y in the traversal of any generated term.
This technique is useful for grammars with many variables, such as grammars in invariant synthesis problems, where the number of terms of small size is prohibitively large. Blocking based on theory rewriting (with generalization) from Sect. 2 is compatible with this technique and is used in cvc4sy_H. However, the other optimizations are disabled, since they prune solutions in a way that is not agnostic to variables.

5 Evaluation

We evaluated the above techniques in cvc4sy on four benchmark sets: invariant synthesis benchmarks from the verification of Lustre [11] models; a set from work on synthesizing invertibility conditions for bit-vector operators [12] (IC-BV); a set of bit-vector invariant synthesis problems [2] (CegisT); and the SyGuS-COMP 2018 [1] benchmarks from five tracks: assorted problems (General), conditional linear arithmetic problems (CLIA), invariant synthesis problems (INV), and programming-by-examples problems [10] with a set over bit-vectors (PBE-BV) and another over strings (PBE-Str). We also considered separately the CrCi subset from General, which corresponds to cryptographic circuit synthesis. We ran our experiments on a cluster equipped with Intel E5-2637 v4 CPUs running Ubuntu 16.04, providing one core, 1800 s, and 8 GB RAM for each job. Results are summarized in Table 1 and Fig. 2. We denote the strategies from Sects. 2, 3, and 4 by s, f and h, respectively (smart, fast, and hybrid); disabling the optimizations from Sect. 2 is marked by “-” and the suffixes r (rewriting), rg (rewriting with structural generalization), cg (CEGIS with structural generalization), and eu (evaluation unfolding). We also evaluated two meta-strategies of cvc4sy: a and a+si. The auto strategy a picks a strategy based on the properties of the problem: f for PBE problems and for problems without the Boolean type or the \(\mathsf {ite}\) operator in their grammar and s otherwise. Strategy a+si uses the single-invocation solver [14] on problems that are amenable to quantifier elimination and a otherwise. We use the state-of-the-art SyGuS solver EUSolver  [4] (EUS) as a baseline, but only for SyGuS-COMP benchmarks due to limitations in its parser.
Table 1.
Summary of number of problems solved per benchmark set. Best results are in bold.
Set
#
a+si
a
s
s-cg
s-eu
s-rg
s-r
f
f-r
h
h-rg
h-r
EUS
General
413
293
237
228
229
232
230
220
237
226
221
225
213
290
Gen-CrCi
214
159
159
159
159
143
159
159
155
132
130
137
125
152
CLIA
88
86
20
20
19
19
19
18
20
16
16
16
16
85
INV
127
109
109
109
109
109
109
109
110
109
109
109
109
68
PBE-BV
753
751
751
721
721
721
721
628
751
717
721
721
628
745
PBE-Str
109
105
105
104
104
104
87
75
105
103
102
87
75
74
Subtotal
1704
1503
1381
1341
1341
1328
1325
1209
1378
1303
1299
1295
1166
1414
IC-BV
160
135
135
135
132
130
130
133
138
132
128
126
127
 
CegisT
79
56
43
43
43
43
42
41
42
42
42
42
41
 
Lustre
485
255
255
255
255
218
211
221
231
213
248
244
234
 
Total
2428
1949
1814
1774
1771
1719
1708
1604
1789
1690
1717
1707
1568
 
Overall, strategy s excels on more challenging benchmark sets such as Lustre and Gen-Crci, while strategy f excels on the majority of the others. The gains for f are especially significant on PBE problems, where it outperforms both s and EUS by several orders of magnitude. Such gains are significant given that cvc4 won this track at SyGuS-COMP 2018 by employing s alone, and a variant of EUS won it in 2017. This result can be explained as a consequence of two factors. First, the string and bit-vector grammars contain many operators with the same type, making the constructor class optimization of the f algorithm very effective. Second, although not described in this paper, all solvers in our evaluation use divide-and-conquer algorithms for PBE problems [4], which are not compatible with the optimizations cg and eu. The most important optimization for all cvc4sy strategies and with all benchmark sets is r. The optimization eu is especially effective when grammars contain \(\mathsf {ite}\) and Boolean connectives, such as those in the Lustre set and in some subsets of General, on which we can see the biggest gains of s with respect to s-eu; cg is more helpful for IC-BV, with a few harder benchmarks only solved due to this technique.
The first scatter plot in Fig. 2 shows the advantage of h over s on Lustre, a benchmark set containing invariant synthesis problems with dozens of variables. We remark this configuration excels at quickly finding small solutions for problems with many variables, although solves fewer problems overall. The second scatter plot shows that while s takes significantly longer on easy problems, it outperforms f in the long run. The last two plots show that f significantly outperforms the state of the art on PBE benchmarks.
For all benchmark sets, the auto strategy a chooses the best enumerative strategy of cvc4sy with only a few exceptions, and hence it is the default configuration of cvc4sy. Due to specialized synthesis techniques [4, 14], both a+si and EUS outperform the purely enumerative strategies of cvc4. This is reflected in the cactus plot on the commonly supported benchmark sets, where a and f solve more benchmarks than EUS for lower times but then EUS solves more benchmarks in the end. For a+si, the cactus plot shows that it outperforms EUS significantly. Nevertheless, we remark that a+si is able to solve only 393 (16%) of the overall benchmarks using only single invocation techniques. Hence, we conclude that both smart and fast enumerative strategies are critical subcomponents in our approach to syntax-guided synthesis.

Acknowledgments

This work was partially supported by the National Science Foundation under award 1656926 and by the Defense Advanced Research Projects Agency under award FA8650-18-2-7854.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Literatur
3.
Zurück zum Zitat Alur, R., et al.: Syntax-guided synthesis. In: Irlbeck, M., Peled, D.A., Pretschner, A., (eds.) Dependable Software Systems Engineering. NATO Science for Peace and Security Series, D: Information and Communication Security, vol. 40, pp. 1–25. IOS Press (2015) Alur, R., et al.: Syntax-guided synthesis. In: Irlbeck, M., Peled, D.A., Pretschner, A., (eds.) Dependable Software Systems Engineering. NATO Science for Peace and Security Series, D: Information and Communication Security, vol. 40, pp. 1–25. IOS Press (2015)
7.
Zurück zum Zitat Barrett, C., Shikanian, I., Tinelli, C.: An abstract decision procedure for satisfiability in the theory of recursive data types. Electr. Notes Theor. Comput. Sci. 174(8), 23–37 (2007)CrossRef Barrett, C., Shikanian, I., Tinelli, C.: An abstract decision procedure for satisfiability in the theory of recursive data types. Electr. Notes Theor. Comput. Sci. 174(8), 23–37 (2007)CrossRef
9.
Zurück zum Zitat Barrett, C.W., Sebastiani, R., Seshia, S.A., Tinelli, C.: Satisfiability modulo theories. In: Biere, A., Heule, M., van Maaren, H., Walsh, T. (eds.) Handbook of Satisfiability. Frontiers in Artificial Intelligence and Applications, vol. 185, pp. 825–885. IOS Press (2009) Barrett, C.W., Sebastiani, R., Seshia, S.A., Tinelli, C.: Satisfiability modulo theories. In: Biere, A., Heule, M., van Maaren, H., Walsh, T. (eds.) Handbook of Satisfiability. Frontiers in Artificial Intelligence and Applications, vol. 185, pp. 825–885. IOS Press (2009)
11.
Zurück zum Zitat Halbwachs, N., Caspi, P., Raymond, P., Pilaud, D.: The synchronous data flow programming language LUSTRE. Proc. IEEE 79(9), 1305–1320 (1991)CrossRef Halbwachs, N., Caspi, P., Raymond, P., Pilaud, D.: The synchronous data flow programming language LUSTRE. Proc. IEEE 79(9), 1305–1320 (1991)CrossRef
16.
Zurück zum Zitat Solar-Lezama, A., Tancau, L., Bodík, R., Seshia, S.A., Saraswat, V.A.: Combinatorial sketching for finite programs, pp. 404–415. ACM (2006) Solar-Lezama, A., Tancau, L., Bodík, R., Seshia, S.A., Saraswat, V.A.: Combinatorial sketching for finite programs, pp. 404–415. ACM (2006)
17.
Zurück zum Zitat Udupa, A., Raghavan, A., Deshmukh, J.V., Mador-Haim, S., Martin, M.M.K., Alur, R.: TRANSIT: specifying protocols with concolic snippets. In: ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2013, Seattle, 16–19 June 2013, pp. 287–296 (2013) Udupa, A., Raghavan, A., Deshmukh, J.V., Mador-Haim, S., Martin, M.M.K., Alur, R.: TRANSIT: specifying protocols with concolic snippets. In: ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2013, Seattle, 16–19 June 2013, pp. 287–296 (2013)
Metadaten
Titel
cvc4sy: Smart and Fast Term Enumeration for Syntax-Guided Synthesis
verfasst von
Andrew Reynolds
Haniel Barbosa
Andres Nötzli
Clark Barrett
Cesare Tinelli
Copyright-Jahr
2019
DOI
https://doi.org/10.1007/978-3-030-25543-5_5

Premium Partner