nach oben

Erschienen in:

Open Access 2019 | OriginalPaper | Buchkapitel

cvc4sy: Smart and Fast Term Enumeration for Syntax-Guided Synthesis

verfasst von : Andrew Reynolds, Haniel Barbosa, Andres Nötzli, Clark Barrett, Cesare Tinelli

Erschienen in: Computer Aided Verification

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Patentsuche

Aus

Abstract

We present cvc4sy, a syntax-guided synthesis (SyGuS) solver based on three bounded term enumeration strategies. The first encodes term enumeration as an extension of the quantifier-free theory of algebraic datatypes. The second is based on a highly optimized brute-force algorithm. The third combines elements of the others. Our implementation of the strategies within the satisfiability modulo theories (SMT) solver cvc4 and a heuristic to choose between them leads to significant improvements over state-of-the-art SyGuS solvers.

1 Introduction

Syntax-guided synthesis (SyGuS) [3] is a recent paradigm for program synthesis, successfully used for applications in formal verification and programming languages. Most SyGuS solvers perform counterexample-guided inductive synthesis (CEGIS) [16]: a refinement loop in which a learner proposes solutions, and a verifier, generally a satisfiability modulo theories (SMT) solver [8, 9], checks them and provides counterexamples for failures. Generally, the learner enumerates some set of terms, while pruning spurious ones [17]. The simplicity and efficacy of enumerative SyGuS have made it the de facto approach for SyGuS, although alternatives exist for restricted fragments [4, 14].

In previous work [14], we have shown how the SMT solver cvc4 [5] can itself act as an efficient synthesizer. This tool paper focuses on recent advances in the enumerative subsolver of cvc4, culminating in the current SyGuS solver cvc4sy. Figure 1 shows its main components. The term enumerator is parameterized by an enumeration strategy chosen before solving: cvc4sy_S, whose constraint-based (smart) enumeration allows for numerous optimizations (Sect. 2); cvc4sy_F, based on a new approach for (fast) enumerative synthesis (Sect. 3), which has significant advantages with respect to the enumerative solver cvc4sy_S and other state-of-the-art approaches; and cvc4sy_H, based on a hybrid approach combining smart and fast enumeration (Sect. 4). All strategies are fully integrated in cvc4, meaning they support inputs in many background theories, including arithmetic, bit-vectors, strings, and floating point. We evaluate these approaches on a large set of benchmarks (Sect. 5).

The Problem. A syntax-guided synthesis problem for a function f in a background theory T consists of a set of semantic restrictions, or specification, for f given by a (second-order) T-formula of the form

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq1_HTML.gif

, and a set of syntactic restrictions on the solutions for f, typically expressed as a context-free grammar. An enumerative approach to this problem combines a term enumerator and a solution verifier for solving synthesis conjectures. The role of the term enumerator is to output a stream of terms $t_1, t_2, \ldots $ over some tuple $\bar{x}$ of variables representing the inputs of f, where each $t_i[\bar{x}]$ is a candidate solution. The role of the solution verifier is to check for each $t_i$ whether it is a solution for f by determining if the negated conjecture

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq6_HTML.gif

is unsatisfiable.

Bounded term generation considers terms based on an ordering such as term size (the number of non-nullary symbols in a term). For each $k=0,1,2,\ldots $, the term enumerator outputs a finite set $S_k$ of terms, each of size at most $k$. Bounded term generation in cvc4sy is complete in the sense that, for any k, if f has a solution of size at most $k$, then at least one of the terms in $S_k$ is a solution for f. The effectiveness of an approach for (complete) bounded term generation can be evaluated based on two criteria: (i) the number of terms it generates and (ii) the rate at which it generates them.

We follow two approaches for enumerative SyGuS in cvc4sy, each optimized for one of the criteria above: a smart approach and a fast one. The first aims to generate reasonably quickly the smallest set of terms while maintaining completeness, while the second aims to generate terms as quickly as possible.

Technical Preliminaries. As we showed in previous work [14], syntactic restrictions can be conveniently represented as a set of (algebraic) datatypes, for which some SMT solvers have dedicated decision procedures [7, 13]. For instance, given a function

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq12_HTML.gif

and the context-free grammar R below specifying what integer ($I$) and Boolean ($B$) terms can appear in candidate solutions for f:

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/MediaObjects/487309_1_En_5_Equ1_HTML.png

(1)

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/MediaObjects/487309_1_En_5_Equ2_HTML.png

(2)

our SyGuS solver generates the following mutually recursive datatypes:

$$\begin{aligned} \mathcal {I}= & {} \mathsf {0} \ \mid \ \mathsf {1} \ \mid \ \mathsf {x} \ \mid \ \mathsf {y} \ \mid \ \mathsf {plus}( \mathcal {I}, \mathcal {I}) \ \mid \ \mathsf {minus}( \mathcal {I}, \mathcal {I}) \ \mid \ \mathsf {ite}( \mathcal {B}, \mathcal {I}, \mathcal {I}) \end{aligned}$$

(3)

$$\begin{aligned} \mathcal {B}= & {} \mathsf {geq}( \mathcal {I}, \mathcal {I}) \ \mid \ \mathsf {eq}( \mathcal {I}, \mathcal {I}) \ \mid \ \mathsf {not}( \mathcal {B}) \ \mid \ \mathsf {and}( \mathcal {B}, \mathcal {B}) \end{aligned}$$

(4)

Each datatype constructor corresponds to a production rule of R, e.g. $\mathsf {plus}$ corresponds to the rule $I::= I+ I$. A datatype term such as $\mathsf {plus}( \mathsf {x}, \mathsf {y} )$ represents the arithmetic term $x+y$. We will use these datatypes as a running example.

For a datatype term t, we write $\mathsf {is}_{\mathsf {C}}( t )$ to denote the discriminator predicate that is satisfied exactly when t is interpreted as a datatype whose top constructor is $\mathsf {C}$. We write $\mathsf {sel}^{\tau }_{n}( t )$ to denote a shared selector [15] applied to t, interpreted as the $n^{ th }$ child of t with type $\tau $ if one exists, and interpreted as an arbitrary element of $\tau $ otherwise. A term consisting of zero or more consecutive nested applications of shared selectors applied to a term t is a shared selector chain (for t).

2 Smart Enumerative SyGuS

Our smart enumerative SyGuS approach cvc4sy_S, is based on finding solutions for an evolving set of constraints in an extension of the quantifier-free fragment of algebraic datatypes. These constraints are constructed to rule out many redundant solutions while not overconstraining the problem, potentially missing actual solutions.

In detail, candidate solutions for the function $f: \tau _1 \rightarrow \tau _2$ to be synthesized are constructed by maintaining a set of constraints F, initially empty, for a first-order variable d ranging over the datatype representing $\tau _2$. For example, consider again the function f with the syntactic restrictions expressed by the datatypes in Eqs. 3 and 4. If the term generator finds a model for F, it provides to the solution verifier the integer term which corresponds to the value of d in the model; for example, it provides $x+1$ when d is interpreted as $\mathsf {plus}( \mathsf {x}, \mathsf {1})$. In turn, if the solution verifier finds that $x+1$ is not a solution, it provides the blocking constraint

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq30_HTML.gif

, i.e., the datatype constraint that rules out the current value for d, which is then added to F. This is a syntactic constraint on future candidate solutions from the term generator. Its atoms are discriminators applied to shared selector chains.

cvc4sy_S uses a number of optimization techniques in addition to the basic loop above, which we describe in the remainder of this section. These techniques produce blocking constraints via the lemmas-on-demand paradigm [6] that eagerly rule out spurious candidates, prior to the solution verification step. Additionally, whenever possible, it strengthens blocking constraints via novel generalization techniques, with the effect of ruling out larger classes of candidates.

Blocking via Theory Rewriting with Structural Generalization. As we describe in previous work [14], the enumerative solver of cvc4 uses its rewriter as an oracle for discovering when candidate solutions are redundant. The motivation is that for any two equivalent terms t and s, only one of them needs to be checked with the solution verifier, since either both t and s are solutions to the synthesis conjecture or neither is. Given a term t, we write

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq31_HTML.gif

to denote its rewritten form. Note that it is possible for equivalent terms not to have the same rewritten form. This is a consequence of the trade-offs in the implementation of cvc4 ’s rewriter, which must balance efficiency and completeness.

As an example, suppose that the term enumerator previously generated $x+y$ and that d’s current value is the datatype term representing $y+x$, where, however,

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq34_HTML.gif

. We first generate a blocking constraint template $R[z]$ of the form

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq36_HTML.gif

, where z is a fresh variable. This template is subsequently instantiated with

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq37_HTML.gif

for any shared selector chain u of type $\mathcal {I}$ that currently (or later) appears in F, starting with d itself. This has the effect of ruling out all candidate solutions that have $y+x$ as a subterm, which is justified by the fact that each such term is equivalent to one in which all occurrences of $y+x$ are replaced by $x+y$.

We employ a refinement of this technique, which we call theory rewriting with structural generalization, which searches for and then blocks only the minimal skeleton of the term under test that is sufficient for determining its rewritten form. For example, consider the if-then-else term

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq42_HTML.gif

, This term is equivalent to x, regardless of the value of predicate

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq43_HTML.gif

. This can be confirmed by the rewriter by computing that

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq44_HTML.gif

where w is a fresh Boolean variable. Then, instead of generating a constraint that blocks only (the datatype value corresponding to) t, we generate a stronger constraint that does not depend on the subterm

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq45_HTML.gif

. In other words, this blocking constraint rules out all candidate solutions that contain the subterm

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq46_HTML.gif

, for any term w. We compute these generalizations using a recursive algorithm that iteratively replaces each subterm of the current candidate with a fresh variable, and checks whether its rewritten form remains the same.

Blocking via CEGIS with Structural Generalization. Synthesis solvers based on CEGIS maintain a list of refinement points that witness the infeasibility of previous candidate solutions. That is, given a synthesis conjecture

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq47_HTML.gif

, the solver maintains a growing list $\bar{p}_1, \ldots , \bar{p}_n$ of values for $\bar{x}$ that witness the infeasibility of previous candidates $u_1, \ldots , u_n$ for f. Then, when a new candidate u is generated, we first check whether $\varphi [ u,\, \bar{p}_i ]$ is false for some

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq52_HTML.gif

. When a candidate u fails to satisfy $\varphi [ u, \bar{p}_i ]$, cvc4sy_S further applies a form of generalization analogous to the structural generalization described above. We call this CEGIS with structural generalization, where the goal is to find the minimal skeleton of u that also fails to satisfy some refinement point.

For example, suppose f is the function to synthesize, $\varphi $ includes the constraint

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq55_HTML.gif

, and $p_1 = (3,3)$ is a refinement point. Then, the candidate term

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq57_HTML.gif

will be discarded, because

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq58_HTML.gif

. Notice, however, that any candidate

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq59_HTML.gif

is falsified by $p_1$, regardless of what w is, since

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq61_HTML.gif

is equivalent to

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq62_HTML.gif

. This indicates that we can block all $\mathsf {ite}$ candidate terms with condition

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq64_HTML.gif

and true branch x. We can express this constraint in cvc4sy_S by dropping the disjuncts that relate to the false branch of the $\mathsf {ite}$ term. This form of blocking is particularly useful when synthesizing multiple functions $( f_1,$ $\ldots , f_n )$, since it is often the case that a candidate for a single $f_i$ is already sufficient to falsify the specification, regardless of what the candidates for the other functions are.

Evaluation Unfolding. This technique uses evaluation functions to encode the relationship between the datatype terms assigned to d and their analogs in the theory T. For example, the evaluation function for the datatype $\mathcal {I}$ defined in (3) is a function

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq70_HTML.gif

defined axiomatically so that $\mathsf {E}_{\mathcal {I}}(d,m,n)$ denotes the result of evaluating d by interpreting any occurrences of $\mathsf {x}$ and $\mathsf {y}$ in d respectively as m and n and interpreting the other constructors as the corresponding arithmetic/Boolean operators, e.g. $\mathsf {E}_{\mathcal {I}}( \mathsf {minus}( \mathsf {x}, \mathsf {y} ), 5, 3 )$ is interpreted as 2. When a refinement point $\bar{c}$ is generated, we add a constraint requiring that the evaluation of d at $\bar{c}$ must satisfy the specification. For example, for conjecture

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq77_HTML.gif

, and refinement point

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq78_HTML.gif

, we add the constraint

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq79_HTML.gif

. Then, when a literal $\mathsf {is}_{\mathsf {C}}( t )$ is asserted for a term t of type $\mathcal {I}$, we can add a constraint corresponding to the one-step unfolding of the evaluation of t. Specifically, when $\mathsf {is}_{\mathsf {ite}}( d )$ is asserted, we generate the constraint indicating that the evaluation of d on point (2, 1) indeed behaves like an $\mathsf {ite}$ term when d has top symbol $\mathsf {ite}$. Our implementation adds these constraints for all terms t whose top symbols correspond to $\mathsf {ite}$ or Boolean connectives. For terms t whose top symbol is any of the other operators, we add constraints corresponding to their total evaluation of t when the value of t is fully determined, for example,

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq86_HTML.gif

. Notice this constraint with $t=d$ along with the refinement constraint

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq88_HTML.gif

suffices to show that d cannot be $\mathsf {plus}( \mathsf {x}, \mathsf {y} )$.

3 Fast Enumerative SyGuS

The techniques in the previous section prune the search space so that often, only a small subset of the entire possible set of terms is considered for a given term size bound. The main bottleneck, however, is managing the large number of blocking constraints generated. Moreover, the benefits of this approach are limited when the grammar or specification does not admit opportunities for generalization.

For this reason, we have also developed cvc4sy_F, which, in the spirit of other SyGuS solvers (notably ESolver [17]), relies on a principled brute-force approach for term generation. In contrast to other solvers, however, which are built as layers on top of the core SMT reasoner, cvc4sy_F is fully integrated as a subsolver of cvc4, so communication with other components has almost no overhead. This technique, fast enumerative synthesis, does not use constraint solving to generate new terms. As a result, the majority of optimizations from Sect. 2 are incompatible with it.

Algorithm. To generate terms up to a given size k, we maintain a set $S^k_\tau $ of terms of type $\tau $ and size k for each datatype $\tau $ corresponding to a non-terminal symbol of our input grammar R. First, we compute for each such $\tau $ the set $\mathcal {C}_\tau $ of its constructor classes, an equivalence relation over the constructors of $\tau $ that groups them by their type. For example, the constructor classes for $\mathcal {I}$ are $\{ \mathsf {x}, \mathsf {y}, \mathsf {0}, \mathsf {1} \}$, $\{ \mathsf {plus}, \mathsf {minus} \}$ and $\{ \mathsf {ite} \}$. Then, we use the following procedure for generating all terms of size $k$ for type $\tau $: The recursive procedure FastEnum($\tau $, $k$) populates the set $S^{k}_{\tau }$ of all terms of type $\tau $ with size $k$. These sets are cached globally. We incorporate an optimization that only adds terms $\mathsf {C}( t_1, \ldots , t_n )$ to $S^{k}_{\tau }$ whose corresponding terms in the theory T are unique up to rewriting. This mimics the effect of blocking via theory rewriting as described in Sect. 2. For example, $\mathsf {plus}( \mathsf {y}, \mathsf {x} )$ is not added to $S^{1}_{\mathcal {I}}$ if that set already contains $\mathsf {plus}( \mathsf {x}, \mathsf {y} )$, noting that

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq126_HTML.gif

. By construction of $S^{k}_{\tau }$ for

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq128_HTML.gif

, this has the cascading effect of excluding all terms having $y+x$ as a subterm.

We observe that theory rewriting with structural generalization cannot be easily incorporated into this scheme since it requires the use of a constraint solver, something that the above algorithm seeks to avoid.

4 Hybrid Approach: Variable-Agnostic Enumerative SyGuS

We follow a third approach, in solver cvc4sy_H, that combines elements of the previous approaches. The idea is to use the (smart) approach from Sect. 2 to generate terms, but then generate multiple candidate solutions from each term using a fast subprocedure we call a concretizer. We implement an instance of this scheme, which we call variable-agnostic term generation, that produces only terms that are unique modulo alpha-equivalence. In our running example, when a term t such as $x+1$ is produced, the concretizer produces all terms generated by the grammar R that are alpha-equivalent to t, namely, $\{ x+1, y+1 \}$ in this case. The advantage of this approach is that cvc4sy_H can block any term whose variables are not canonically ordered; that is, assuming for instance that

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq132_HTML.gif

, it may block terms like $1-y$ and $y+y$, noting they are alpha-equivalent to $1-x$ and $x+x$, respectively. To implement this blocking scheme, we introduce unary Boolean predicates $pre_{x}$ and $post_{x}$ for each variable x in our grammar, where $pre_{x}$ (resp., $post_{x}$) holds for t if and only if variable x occurs in a depth-first left-to-right traversal of our candidate term before (resp., after) traversing to the position indicated by the selector chain t. We encode the semantics of these predicates based on the arguments of constructors in our signature, e.g.

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq141_HTML.gif

. We then assert that $pre_{x}$ and $pre_{y}$ are false for our top-level variable d, and require

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-25543-5_5/487309_1_En_5_IEq144_HTML.gif

for all z, stating that x must come before y in the traversal of any generated term.

This technique is useful for grammars with many variables, such as grammars in invariant synthesis problems, where the number of terms of small size is prohibitively large. Blocking based on theory rewriting (with generalization) from Sect. 2 is compatible with this technique and is used in cvc4sy_H. However, the other optimizations are disabled, since they prune solutions in a way that is not agnostic to variables.

5 Evaluation

We evaluated the above techniques in cvc4sy on four benchmark sets: invariant synthesis benchmarks from the verification of Lustre [11] models; a set from work on synthesizing invertibility conditions for bit-vector operators [12] (IC-BV); a set of bit-vector invariant synthesis problems [2] (CegisT); and the SyGuS-COMP 2018 [1] benchmarks from five tracks: assorted problems (General), conditional linear arithmetic problems (CLIA), invariant synthesis problems (INV), and programming-by-examples problems [10] with a set over bit-vectors (PBE-BV) and another over strings (PBE-Str). We also considered separately the CrCi subset from General, which corresponds to cryptographic circuit synthesis. We ran our experiments on a cluster equipped with Intel E5-2637 v4 CPUs running Ubuntu 16.04, providing one core, 1800 s, and 8 GB RAM for each job. Results are summarized in Table 1 and Fig. 2. We denote the strategies from Sects. 2, 3, and 4 by s, f and h, respectively (smart, fast, and hybrid); disabling the optimizations from Sect. 2 is marked by “-” and the suffixes r (rewriting), rg (rewriting with structural generalization), cg (CEGIS with structural generalization), and eu (evaluation unfolding). We also evaluated two meta-strategies of cvc4sy: a and a+si. The auto strategy a picks a strategy based on the properties of the problem: f for PBE problems and for problems without the Boolean type or the $\mathsf {ite}$ operator in their grammar and s otherwise. Strategy a+si uses the single-invocation solver [14] on problems that are amenable to quantifier elimination and a otherwise. We use the state-of-the-art SyGuS solver EUSolver [4] (EUS) as a baseline, but only for SyGuS-COMP benchmarks due to limitations in its parser.

Table 1.

Summary of number of problems solved per benchmark set. Best results are in bold.

Set	#	a+si	a	s	s-cg	s-eu	s-rg	s-r	f	f-r	h	h-rg	h-r	EUS
General	413	293	237	228	229	232	230	220	237	226	221	225	213	290
Gen-CrCi	214	159	159	159	159	143	159	159	155	132	130	137	125	152
CLIA	88	86	20	20	19	19	19	18	20	16	16	16	16	85
INV	127	109	109	109	109	109	109	109	110	109	109	109	109	68
PBE-BV	753	751	751	721	721	721	721	628	751	717	721	721	628	745
PBE-Str	109	105	105	104	104	104	87	75	105	103	102	87	75	74
Subtotal	1704	1503	1381	1341	1341	1328	1325	1209	1378	1303	1299	1295	1166	1414
IC-BV	160	135	135	135	132	130	130	133	138	132	128	126	127
CegisT	79	56	43	43	43	43	42	41	42	42	42	42	41
Lustre	485	255	255	255	255	218	211	221	231	213	248	244	234
Total	2428	1949	1814	1774	1771	1719	1708	1604	1789	1690	1717	1707	1568

Overall, strategy s excels on more challenging benchmark sets such as Lustre and Gen-Crci, while strategy f excels on the majority of the others. The gains for f are especially significant on PBE problems, where it outperforms both s and EUS by several orders of magnitude. Such gains are significant given that cvc4 won this track at SyGuS-COMP 2018 by employing s alone, and a variant of EUS won it in 2017. This result can be explained as a consequence of two factors. First, the string and bit-vector grammars contain many operators with the same type, making the constructor class optimization of the f algorithm very effective. Second, although not described in this paper, all solvers in our evaluation use divide-and-conquer algorithms for PBE problems [4], which are not compatible with the optimizations cg and eu. The most important optimization for all cvc4sy strategies and with all benchmark sets is r. The optimization eu is especially effective when grammars contain $\mathsf {ite}$ and Boolean connectives, such as those in the Lustre set and in some subsets of General, on which we can see the biggest gains of s with respect to s-eu; cg is more helpful for IC-BV, with a few harder benchmarks only solved due to this technique.

The first scatter plot in Fig. 2 shows the advantage of h over s on Lustre, a benchmark set containing invariant synthesis problems with dozens of variables. We remark this configuration excels at quickly finding small solutions for problems with many variables, although solves fewer problems overall. The second scatter plot shows that while s takes significantly longer on easy problems, it outperforms f in the long run. The last two plots show that f significantly outperforms the state of the art on PBE benchmarks.

For all benchmark sets, the auto strategy a chooses the best enumerative strategy of cvc4sy with only a few exceptions, and hence it is the default configuration of cvc4sy. Due to specialized synthesis techniques [4, 14], both a+si and EUS outperform the purely enumerative strategies of cvc4. This is reflected in the cactus plot on the commonly supported benchmark sets, where a and f solve more benchmarks than EUS for lower times but then EUS solves more benchmarks in the end. For a+si, the cactus plot shows that it outperforms EUS significantly. Nevertheless, we remark that a+si is able to solve only 393 (16%) of the overall benchmarks using only single invocation techniques. Hence, we conclude that both smart and fast enumerative strategies are critical subcomponents in our approach to syntax-guided synthesis.

Acknowledgments

This work was partially supported by the National Science Foundation under award 1656926 and by the Defense Advanced Research Projects Agency under award FA8650-18-2-7854.

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Vorheriges Kapitel Q3B: An Efficient BDD-based SMT Solver for Quantified Bit-Vectors

Nächstes Kapitel Incremental Determinization for Quantifier Elimination and Functional Synthesis

SyGuS-COMP 2018 (2018). http://sygus.seas.upenn.edu/SyGuS-COMP2018.html

Abate, A., David, C., Kesseli, P., Kroening, D., Polgreen, E.: Counterexample guided inductive synthesis modulo theories. In: Chockler, H., Weissenbacher, G. (eds.) CAV 2018. LNCS, vol. 10981, pp. 270–288. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-96145-3_15CrossRef

Alur, R., et al.: Syntax-guided synthesis. In: Irlbeck, M., Peled, D.A., Pretschner, A., (eds.) Dependable Software Systems Engineering. NATO Science for Peace and Security Series, D: Information and Communication Security, vol. 40, pp. 1–25. IOS Press (2015)

Alur, R., Radhakrishna, A., Udupa, A.: Scaling enumerative program synthesis via divide and conquer. In: Legay, A., Margaria, T. (eds.) TACAS 2017. LNCS, vol. 10205, pp. 319–336. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54577-5_18CrossRef

Barrett, C., et al.: CVC4. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 171–177. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22110-1_14CrossRef

Barrett, C., Nieuwenhuis, R., Oliveras, A., Tinelli, C.: Splitting on demand in SAT modulo theories. In: Hermann, M., Voronkov, A. (eds.) LPAR 2006. LNCS (LNAI), vol. 4246, pp. 512–526. Springer, Heidelberg (2006). https://doi.org/10.1007/11916277_35CrossRef

Barrett, C., Shikanian, I., Tinelli, C.: An abstract decision procedure for satisfiability in the theory of recursive data types. Electr. Notes Theor. Comput. Sci. 174(8), 23–37 (2007)CrossRef

Barrett, C., Tinelli, C.: Satisfiability Modulo Theories. Handbook of Model Checking, pp. 305–343. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-10575-8_11CrossRef

Barrett, C.W., Sebastiani, R., Seshia, S.A., Tinelli, C.: Satisfiability modulo theories. In: Biere, A., Heule, M., van Maaren, H., Walsh, T. (eds.) Handbook of Satisfiability. Frontiers in Artificial Intelligence and Applications, vol. 185, pp. 825–885. IOS Press (2009)

10.

Gulwani, S.: Programming by examples: applications, algorithms, and ambiguity resolution. In: Olivetti, N., Tiwari, A. (eds.) IJCAR 2016. LNCS (LNAI), vol. 9706, pp. 9–14. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-40229-1_2CrossRef

11.

Halbwachs, N., Caspi, P., Raymond, P., Pilaud, D.: The synchronous data flow programming language LUSTRE. Proc. IEEE 79(9), 1305–1320 (1991)CrossRef

12.

Niemetz, A., Preiner, M., Reynolds, A., Barrett, C., Tinelli, C.: Solving quantified bit-vectors using invertibility conditions. In: Chockler, H., Weissenbacher, G. (eds.) CAV 2018, Part II. LNCS, vol. 10982, pp. 236–255. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-96142-2_16CrossRef

13.

Reynolds, A., Blanchette, J.C.: A decision procedure for (co)datatypes in SMT solvers. In: Felty, A.P., Middeldorp, A. (eds.) CADE 2015. LNCS (LNAI), vol. 9195, pp. 197–213. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21401-6_13CrossRef

14.

Reynolds, A., Deters, M., Kuncak, V., Tinelli, C., Barrett, C.: Counterexample-guided quantifier instantiation for synthesis in SMT. In: Kroening, D., Păsăreanu, C.S. (eds.) CAV 2015, Part II. LNCS, vol. 9207, pp. 198–216. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21668-3_12CrossRef

15.

Reynolds, A., Viswanathan, A., Barbosa, H., Tinelli, C., Barrett, C.: Datatypes with shared selectors. In: Galmiche, D., Schulz, S., Sebastiani, R. (eds.) IJCAR 2018. LNCS (LNAI), vol. 10900, pp. 591–608. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-94205-6_39CrossRef

16.

Solar-Lezama, A., Tancau, L., Bodík, R., Seshia, S.A., Saraswat, V.A.: Combinatorial sketching for finite programs, pp. 404–415. ACM (2006)

17.

Udupa, A., Raghavan, A., Deshmukh, J.V., Mador-Haim, S., Martin, M.M.K., Alur, R.: TRANSIT: specifying protocols with concolic snippets. In: ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2013, Seattle, 16–19 June 2013, pp. 287–296 (2013)

Titel: cvc4sy: Smart and Fast Term Enumeration for Syntax-Guided Synthesis
verfasst von: Andrew Reynolds
Haniel Barbosa
Andres Nötzli
Clark Barrett
Cesare Tinelli
Verlag: Springer International Publishing
Buch: Computer Aided Verification
Print ISBN: 978-3-030-25542-8

Electronic ISBN: 978-3-030-25543-5

Copyright-Jahr: 2019
DOI: https://doi.org/10.1007/978-3-030-25543-5_5

Springer Professional