Open Access 2025 | Original Paper | Book Chapter

SemML: Enhancing Automata-Theoretic LTL Synthesis with Machine Learning

Authors: Jan Křetínský, Tobias Meggendorfer, Maximilian Prokop, Ashkan Zarkhah

Published in: Tools and Algorithms for the Construction and Analysis of Systems

Publisher: Springer Nature Switzerland


Abstract

The chapter delves into the synthesis of finite systems from logical specifications, a central topic in theoretical computer science. It examines the use of Linear Temporal Logic (LTL) synthesis as an alternative to manual implementation and highlights the challenges arising from the 2-EXPTIME-completeness of LTL synthesis. The text discusses the dramatic improvements in automata construction, in particular the development of deterministic automata that are orders of magnitude smaller, which has revived the classical automata-theoretic approach. A central focus is the semantic labelling of automaton states, which provides additional information that can guide the on-the-fly exploration of the automaton/game and potentially find a winning strategy before the entire state space is constructed. The chapter introduces SemML, a tool that integrates machine-learning heuristics into the synthesis pipeline, thereby significantly improving the performance of LTL synthesis. SemML's machine-learning approach guides the on-the-fly exploration of the parity game, using semantic labellings to make informed decisions about which parts of the game to explore first. The tool is shown to outperform existing state-of-the-art tools such as Strix both in speed and in the number of solved instances. The chapter also covers the technical improvements made in SemML, including state merging, dual perspectives, and caching, which contribute to its superior performance. The experimental evaluation of SemML demonstrates its effectiveness on both synthetic and SYNTCOMP benchmarks and underlines its potential for real-world applications in reactive synthesis.
Notes
This research was funded in part by the DFG project 427755713 GOPro and the MUNI Award in Science and Humanities (MUNI/I/1757/2021) of the Grant Agency of Masaryk University.

1 Introduction

Synthesis of finite systems from their logical specifications has been one of the central topics of theoretical computer science since the times of Church [7] and Büchi [5], being closely linked to developments in automata theory [39]. Indeed, the logical formula would be translated to an automaton, in fact a game over this automaton played by the environment and system players, where a strategy of the latter corresponds to an implementation of the specified system.
Since Pnueli’s suggestion to use Linear Temporal Logic (LTL) [30] for describing relevant properties of reactive systems, LTL synthesis [31] has become an appealing alternative to manual implementation followed by LTL model checking. Indeed, the tedious and error-prone implementation and debugging could be circumvented by automated construction of systems or their controllers, which are then correct “by construction”. Nevertheless, the 2-EXPTIME-completeness of LTL synthesis, stemming from the doubly exponentially sized parity automata for the LTL formulae, has been challenging the practical applicability of the whole concept. Fortunately, this has also led to numerous advances, such as identification of subclasses of properties for which the problem becomes easier, e.g. [2, 3, 29], or methods avoiding the notoriously expensive step of determinizing the automata, e.g. [24, 40] or employing antichain-based methods, e.g. [4, 6].
The breakthrough of directly constructing deterministic automata orders of magnitude smaller [13, 18] has brought the classical automata-theoretic approach back on the stage, and indeed established it as the most efficient approach available. This is witnessed by Strix [26], a synthesis tool based on the translations of the Rabinizer/Owl tools [22, 23], winning the LTL tracks of the main synthesis competition SYNTCOMP [16].
Semantic Labelling and Previous Work. The dramatic improvements in size came with an interesting side-effect. In contrast to the determinization constructions of Safra [35] and others [28, 36], the new constructions follow the logical structure of the formula. Consequently, the states of the generated automaton/game are labelled by this additional information. It consists of the formula describing the property yet to be satisfied, i.e. monitoring the progress of satisfaction of the original formula, and formulae capturing the progress of all its subformulae. For example, an input formula \(\lnot a \vee \textbf{G}{\textbf{F}}(a\wedge {\textbf{X}}b)\) labels the initial state of the automaton together with the sub-goal \(a\wedge {\textbf{X}}b\) to be satisfied infinitely often, see Fig. 1. After reading a, the successor state is labelled by the remaining goal \(\textbf{G}{\textbf{F}}(a\wedge {\textbf{X}}b)\) as well as the progressing sub-goal b left to be satisfied. Under b, the automaton then moves to a state whose first component remains the goal \(\textbf{G}{\textbf{F}}(a\wedge {\textbf{X}}b)\) forever, while the second component, now fully satisfied, signals that one repetition of the sub-goal has been successfully completed.
Fig. 1.
An example of a part of an automaton with semantic labelling
While this labelling was left unused for years, it clearly offers additional information. Indeed, for instance, seeing \(\lnot a \vee \textbf{G}{\textbf{F}}(a\wedge {\textbf{X}}b)\) as a goal, it seems easier to choose \(\lnot a\) in order to satisfy it than to take care of the infinitely repeating sub-goal. Similarly, if progress is made in satisfying \(a\wedge {\textbf{X}}b\) by choosing an a, it seems wasteful not to follow with a b, although the overall goal of \(\textbf{G}{\textbf{F}}(a\wedge {\textbf{X}}b)\) remains unaffected either way. Such guidance can be used to explore the automaton/game on-the-fly and possibly find a winning strategy before the whole state space is constructed. In contrast, from the traditional perspective of solving games on graphs, one can either solve the whole game, or possibly explore it on-the-fly "blindly", since there is no observable difference between taking a transition, say, to the left or right. Note that the guidance need not be reliable; correctness is still guaranteed by solving (a part of) the game, which hints at a possible use of machine learning.
In [19], two attempts have been made to explore the automaton/game in a profitable order using this additional information. Firstly, the first component is subject to a naïve heuristic called trueness, estimating how easy a formula is to satisfy (by considering every formula as a Boolean combination, ignoring the temporal structure, and counting the percentage of satisfying assignments); transitions with higher trueness are then explored first. Secondly, reinforcement learning has been used, with rewards related to satisfying sub-goals in the second component. While the former is also implemented in Strix, both are ad-hoc heuristics with limitations. In [20], a machine-learning approach has been suggested, which learns from solved games for other formulae, estimating which transitions are more often "winning" than others. This allows for superior precision, also dealing natively with more convoluted choices where the hand-written heuristic struggles. However, only this oracle was implemented, not a (competitive) synthesis procedure.
Our Contribution. In the present tool paper, we show how we incorporate this approach into the whole synthesis pipeline, closing the gaps explicitly left open. Besides, we report on our tool efficiently implementing the approach and, even in a preliminary version, winning this year's edition of the realizability track\(^{1}\) of the SYNTCOMP competition. In more detail, our contribution is as follows:
  • We implement a machine-learning heuristic guiding the on-the-fly exploration. In contrast to [20], which uses an SVM, we evaluate various models and choose the most adequate option. Besides, we adapt it to the state-of-the-art semantic labelling.
  • We incorporate it into our synthesis pipeline, which improves over the approach of Strix in several (traditionally algorithmic) aspects.
  • We report on the performance of our tool SemML (short for Semantic-labelling-based Machine Learning) and analyze why it performs better than Strix. It is worth noting that SemML is faster on the SYNTCOMP benchmark set while being trained only on synthetic data. On this synthetic data, it is even an order of magnitude faster.
Note that other lines of research that use LTL synthesis solvers as a black box, e.g. LTL modulo theories synthesis [34] or portfolio solvers such as NeuroSynt [8], directly profit from these improvements.
Further Related Work.
Besides Strix, the closest to our work is, on the one hand, Spot (with ltlsynt) [33], following the same automata-theoretic approach but constructing the whole automaton; and on the other hand, purely machine-learning approaches such as the deep-learning-based ones of [9, 37], implemented in NeuroSynt [8], which guesses circuits using ML and falls back to Strix to achieve completeness. Before the automata-theoretic approach, further winning approaches included bounded synthesis, e.g. [11, 14], or even earlier Safraless implementations [17]. As mentioned, all of these are significantly outperformed by Strix in SYNTCOMP.

2 Tool Description

In this section, we provide an overview of our tool SemML. We formally state the problem it solves, describe how to use the tool, and outline its high-level approach.

2.1 Problem Description: LTL Synthesis and Realizability

The problem of LTL reactive synthesis is defined as follows. We are given an LTL [30] formula \(\phi \) over a set of atomic propositions \(\textsf{AP}\) together with a partition of \(\textsf{AP}\) into environment and system propositions \(\textsf{AP}= \textsf{AP}_{\textsf{ENV}} \mathbin {\cup }\textsf{AP}_{\textsf{SYS}}\). The environment and system generate an infinite word as follows. In each step i, the environment chooses \(e_i\subseteq \textsf{AP}_{\textsf{ENV}}\) and then\(^{2}\) the system chooses \(s_i\subseteq \textsf{AP}_{\textsf{SYS}}\), generating a sequence \(e_1,s_1,e_2,s_2,\ldots \). The combined word over \(\textsf{AP}\) is then \(e_1\cup s_1,e_2\cup s_2,\ldots \) and the system wins if this word satisfies \(\phi \). The central question is whether the system has a winning strategy, i.e. a way to choose \(s_i\) based on the current prefix so that the combined word always satisfies the given formula.\(^{3}\) In that case, the instance is called realizable, and unrealizable otherwise. For example, the formula \(\phi = {\textbf{G}}(r \Leftrightarrow {\textbf{X}}g)\) with \(\textsf{AP}_{\textsf{ENV}} = \{r\}\) and \(\textsf{AP}_{\textsf{SYS}} = \{g\}\) prescribes that whenever the environment sends a request, the system should grant the request in the next step, and only then. This formula is realizable, and a winning strategy is to remember whether the environment sent a request in the previous step.
Deciding whether a formula is realizable or not is called LTL realizability. In synthesis in the narrower sense, we want to output such a strategy for the winning player, i.e. a procedure that at every step outputs the next choice, typically in the form of a finite state machine, e.g. a Mealy machine or an AIGER circuit [1]. While our tool can output the strategy in a straightforward way, we refrain from discussing it further, as it is neither the focus of the tool nor of our advancements.
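To make the running example concrete, the winning strategy for \(\phi = {\textbf{G}}(r \Leftrightarrow {\textbf{X}}g)\) needs only one bit of memory. The following Python sketch (purely illustrative, not part of SemML's code base) simulates it against an arbitrary environment word under Mealy semantics:

```python
def system_strategy(env_word):
    """Winning Mealy strategy for G(r <-> X g) with AP_ENV = {r}, AP_SYS = {g}:
    grant g in step i+1 exactly if the environment requested r in step i.
    Illustrative sketch only, not part of the SemML code base."""
    prev_request = False                        # one bit of memory; g_1 is unconstrained
    for e_i in env_word:                        # step i: environment moves first
        s_i = {"g"} if prev_request else set()  # system answer depends only on the past prefix
        prev_request = "r" in e_i               # remember the current request for step i+1
        yield e_i | s_i                         # combined letter e_i ∪ s_i of the generated word

# grants follow requests with a one-step delay on every finite prefix
print(list(system_strategy([{"r"}, set(), {"r"}, {"r"}])))
```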

2.2 Functionality

Inputs/Outputs SemML accepts the standard format TLSF [15] (used in SYNTCOMP), converted to LTL using syfco, as well as explicit input, i.e. an LTL formula together with a partition of the atomic propositions. It supports both realizability and synthesis (with the strategy encoded as AIGER circuit).
Usage To streamline interaction, SemML is invoked through a Python wrapper. For TLSF input, use \(\texttt{main.py --tlsf <path\,to\,tlsf\,file>}\), and for explicit input \(\texttt{main.py --ins=<ins> --outs=<outs> -f=<formula>}\), where ins and outs are the atomic propositions owned by the environment and system, respectively. The tool then solves the synthesis problem and outputs the witness strategy. If desired, append \(\texttt{--realizability}\) to only solve the realizability problem. The tool then simply outputs REALIZABLE or UNREALIZABLE.
Fig. 2.
High-level architecture of SemML.

2.3 High-Level Architecture

In line with the automata-theoretic approach, SemML employs on-the-fly construction of the corresponding parity game. In SemML, this process comprises three main components, namely frontier exploration, partial game solving, and backtracking, as outlined in Fig. 2. In a nutshell, starting from an empty game, SemML explores a "minimal" frontier. Here, we employ a sophisticated machine-learning (ML) guidance that, based on the semantic labelling, decides which parts of the game to explore first. Then, our parity game solver tries to find a solution, interpreting unexplored states as losing for either player. If the solver finds a solution, we are done. If not, we need to explore more of the game. We then use a backtracking heuristic to identify candidate states and, starting from there, go back to frontier exploration. In this setup, the main purpose of the new ML component is to tackle hard cases, where the automaton is too large to be constructed in its entirety, by trying to identify a small part where one of the players already wins.
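The interplay of the three components can be summarized by the following schematic Python sketch; the callables stand in for the components described in Sec. 3 and do not reflect SemML's actual interface:

```python
def solve_on_the_fly(explore_frontier, solve_partial_game, backtrack_candidates):
    """Schematic sketch of SemML's main loop (cf. Fig. 2). The three components
    are passed in as callables, since their actual implementations are described
    in Sec. 3; this is not the tool's real interface. solve_partial_game treats
    unexplored states as losing and returns "SYS", "ENV", or None (undecided)."""
    explore_frontier(None)                  # build an initial frontier from the empty game
    while True:
        winner = solve_partial_game()       # solve the currently explored partial game
        if winner is not None:
            return winner                   # winner found, possibly on a small partial game
        suspects = backtrack_candidates()   # states where a "wrong" successor may have been chosen
        explore_frontier(suspects)          # explore further from there, then solve again
```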

3 Advancements in Detail

In this section we describe the major advancements of SemML. Recall that our technical goal is to employ ML to guide on-the-fly synthesis towards “easily winning” regions and thus improve scalability. To describe our approach and contributions, we first discuss the state of the art. Here, we consider the tool Strix, which is currently the only implementation of this approach competitive on the standard SYNTCOMP benchmarks, and [20], which provides the first ML-based approach to exploiting the semantic labelling. In particular, we discuss their individual shortcomings and incompatibilities. Then, we outline how we solve these issues, and describe our solution approach in detail.

3.1 State of the Art and its Shortcomings

Strix alternates between exploring the parity game and trying to solve the explored part. To decide which states to explore, Strix uses a global double-ended priority queue (one end for each player) to track every state that can be explored further. Strix simply works on both ends of that queue while checking for solvability at fixed intervals until a winner is identified. This comes with two major problems. Firstly, the states are ordered by (a variant of) the rather naïve trueness [19] of formulae, which roughly corresponds to the percentage of propositionally satisfying assignments, completely disregarding the temporal structure. This is particularly problematic for formulae which comprise a lot of temporal behaviour (such as \({\textbf{X}}{\textbf{X}}\phi \)). Secondly, the exploration is not "demand-driven", meaning it can fail because one successor of an important state has not been explored (and thus is considered losing), but during the subsequent exploration phase, that successor is not touched either, because other states have more extreme scores (or many states have the same score).
The approach of [20] uses the semantic labelling to predict winning choices locally through a simple learning approach with hand-crafted features. These are then used as the starting point for a parity game solver, ensuring that potential imprecisions of the ML model are detected and fixed. However, (i) they do not provide an implementation competitive w.r.t. the actual runtime, and (ii) adapting their approach to Strix (or any other tool) is difficult for a variety of reasons. Firstly, Strix is fundamentally designed to work with a "global" ranking, i.e. picking states to explore from one priority queue, while [20] gives "local" recommendations, specific to a concrete state. Note that combining the two approaches by first picking a state similarly to Strix and then following [20] in that state does not solve Strix's lack of demand-driven guidance. Secondly, [20] employs complex features and evaluating them is too time-consuming. The timing constraints on evaluating the guidance heuristic are quite strict, since we need to be able to give hundreds of recommendations per second to remain competitive. Finally, [20] uses an outdated automaton construction (via LDBA [12, 38]), while Strix uses a newer, practically more efficient variant. This is particularly relevant, as the automaton construction usually is the biggest bottleneck of LTL synthesis.
Summary. In order to reap the benefits of ML for on-the-fly synthesis, we thus first have to design a novel exploration approach (and adapt all subsequent machinery) capable of processing local advice in the spirit of [20]. While this is a pre-requisite for using ML guidance, it is also interesting in its own right, as it allows for a more targeted, deep exploration instead of exploring multiple equally promising directions simultaneously. Then, at the same time, our guidance must be much more efficient than [20], so that it does not add too much overhead, which would negate any performance gained by giving good recommendations. Finally, it also needs to be designed for a modern automaton construction.
We proceed to explain how we tackled these problems. We introduce our locally guided approach to on-the-fly LTL synthesis, describe how we use machine learning to guide the exploration, and finally outline general engineering improvements.

3.2 Locally Guided Exploration

Recall that the overall approach is to alternate between exploring parts of the parity game and checking whether there exists a winning strategy in the current part of the game already, as depicted in Fig. 2. For the exploration, our aim is to follow “good” choices for one player that work well against all options of its opponent. As such, we run the exploration for both “perspectives” separately, and regularly switch between them (further motivation and details in Sec. 3.4). In the following, we take the perspective of the system (aiming to prove realizability); the dual part for the environment is analogous.
As hinted in Sec. 2.3, our exploration approach comprises two parts, namely frontier exploration and backtracking. The components of frontier exploration and its interplay with both backtracking and the game solver are depicted in Fig. 3. During frontier exploration, the core idea is to only explore a necessary minimum so that a strategy of the system can be at all properly defined. In particular, we want to reach a point where every known system state has at least one of its successors explored and every known environment state has all of its successors explored. If that is the case, we call the (partial) parity game closed. Clearly, to obtain a closed game, we repeatedly need to explore states. To this end, we maintain a queue of automaton states (the current frontier) which we still need to explore. After taking a state from the queue, we compute the immediate automaton successors, using an adapted implementation of Owl, and split each automaton transition (under a subset of \(\textsf{AP}\)) into the two moves of the players, under a subset of \(\textsf{AP}_{\textsf{ENV}}\) and \(\textsf{AP}_{\textsf{SYS}}\), respectively. While we hardly have a choice for environment states (we need to explore all successors in the game), in system states we can select which successor to explore. Thus, in these states we ask our exploration heuristic for advice, and add all newly reached states to the queue. For this local guidance, we employ the new ML-based approach, which we later explain in depth. For an example, see Fig. 3 (bottom). From left to right, we obtain a state \(q_0\) from the queue and construct its successors \(q_1\) and \(q_2\) in the automaton. Splitting it into the game introduces the two system states \(s_1\) and \(s_2\). In both states, the exploration heuristic recommends going towards \(q_2\), and we only add that state to the frontier queue.
Overall, we repeat this process until the current game is closed, and then attempt to solve it. If we cannot determine a winner, then in at least one of the system states a “wrong” successor was chosen (the system cannot win the current partial game). Thus, subsequently, we ask the backtracking oracle which states might have been “wrong”. Concretely, we heuristically choose a subset of all non-fully explored states with the highest trueness. For each of these, we explore their next best successors according to our heuristics. Now, the game might not be closed, and thus we switch back to frontier exploration. We repeat this process until a winner is found (which always happens, as eventually the entire game is explored).
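As an illustration, the closedness condition can be checked as follows on a toy dict-based representation of the partial game (a sketch from the system's perspective; SemML's internal data structures differ):

```python
def is_closed(game):
    """Closedness check from the system's perspective (sketch): every explored
    system state needs at least one explored successor, every explored
    environment state needs all successors explored. `game` maps each explored
    state to its owner and successor list; toy representation, not SemML's."""
    for info in game.values():
        explored = [s for s in info["succ"] if s in game]
        if info["owner"] == "sys" and info["succ"] and not explored:
            return False
        if info["owner"] == "env" and len(explored) < len(info["succ"]):
            return False
    return True

# toy game in the spirit of Fig. 3: both system states have the explored successor q2
toy = {
    "q0": {"owner": "env", "succ": ["s1", "s2"]},
    "s1": {"owner": "sys", "succ": ["q1", "q2"]},
    "s2": {"owner": "sys", "succ": ["q1", "q2"]},
    "q2": {"owner": "env", "succ": []},
}
print(is_closed(toy))   # True; exploring q1 is not required to define a system strategy
```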
Fig. 3.
Illustration of the exploration process together with an example for each step.

3.3 Exploration Guidance Through Machine Learning

In this section, we describe our ML approach used to guide the frontier exploration. Recall that for a given state the exploration heuristic is supposed to give a ranking preferring “good” edges which lead to a winning strategy: Initially, we follow the highest ranked choice, then, when backtracking in this state, the second one, and so on. Thus, we would like this heuristic to prefer edges that can be part of a winning strategy. Additionally, as we also want to obtain small games, among the possibly winning edges we would like to prefer edges leading to smaller strategies (hence exploring a smaller part of the game). Note that one can also employ handcrafted heuristics instead, e.g. the score computed by Strix (which we also implement and evaluate).
Similar to [20], we employ a supervised learning approach. As usual for ML, we start by describing the dataset used to train our models. We then discuss the overall architecture of the model(s), how we obtain the ground truth, and how we extract features. Finally, we discuss the training method. Of course, our approach is not the only possible way to tackle this problem. Yet, while designing it, we discovered several subtle pitfalls and tried alternative approaches which proved to be suboptimal. We provide further details within this section.
Data Our explicit aim is to exploit structure in real-world formulae. Here, existing datasets such as the SYNTCOMP set seem a natural choice. However, we want to evaluate our approach on the entire SYNTCOMP set (in order to faithfully replicate the SYNTCOMP evaluation). Thus, “showing” any part of it during learning could introduce an unfair advantage. This leaves us with hardly any realistic data sets.
This problem has already been observed by [37]. As a solution, they note that in practice, specifications often follow specific patterns and combinations thereof [10]. Thus, randomly combining such patterns should yield numerous formulae that resemble some structure one might expect in practice. To this end, [37] identified over 150 assumptions and over 1500 guarantees which intuitively limit the behaviour of the environment and system, respectively. From this set of "building blocks", they generate formulae by sampling assumptions and guarantees and assembling them in the form of "conjunction of assumptions implies conjunction of guarantees", which adds some comprehensible structure. Formulae of this kind can be interpreted as "if the environment adheres to one behaviour profile, the system should adhere to another".
We extend this idea a bit further by sampling several options for system and environment, which diversifies the formulae while maintaining comprehensibility. In particular, we sample multiple sets of assumptions and guarantees and assemble a formula in the form of "DNF of assumptions implies DNF of guarantees". Intuitively, these formulae mean "if the environment follows one of these behaviour profiles, the system should adhere to one of its own behaviour profiles". In particular, this gives the system a choice of which behaviour profile to adhere to, which in turn might depend on the profile the environment chooses.
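The generation scheme can be sketched as follows; the pattern pools are tiny stand-ins for the over 150 assumptions and over 1500 guarantees of [37], and the concrete formula syntax is only illustrative:

```python
import random

# Tiny stand-in pools for the assumption/guarantee patterns of [10, 37];
# the real pools contain over 150 assumptions and over 1500 guarantees.
ASSUMPTIONS = ["G F req", "G (req -> F ack)", "F G !err"]
GUARANTEES = ["G (req -> X grant)", "G F grant", "G (!grant | !err)"]

def sample_dnf(pool, num_disjuncts=2, num_conjuncts=2):
    """Sample a DNF over a pool of patterns: a disjunction of conjunctions."""
    disjuncts = ["(" + " & ".join(random.sample(pool, num_conjuncts)) + ")"
                 for _ in range(num_disjuncts)]
    return "(" + " | ".join(disjuncts) + ")"

def sample_formula():
    """Assemble a formula as 'DNF of assumptions implies DNF of guarantees'."""
    return sample_dnf(ASSUMPTIONS) + " -> " + sample_dnf(GUARANTEES)

print(sample_formula())
```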
We filter our generated data into two groups depending on the size of the corresponding automaton. The training and validation group consists of 1000 formulae where the automaton has at most 500 states. We introduce this limitation to keep the ground truth and feature computation feasible (described later). While 1000 formulae may seem like a small data set, note that we learn from the local decisions in each state of the 1000 associated parity games, which give several million data points in total. For evaluation, we also identified 200 formulae for which the automaton size is not known, except that it is larger than 20,000 states.
While this yields a decent data foundation for our venture, the synthetic data definitely is quite different from SYNTCOMP. Thus, for practical purposes, one should consider including SYNTCOMP and other data during learning. Since this is orthogonal to the evaluation in this paper (including any part of SYNTCOMP in our training data would introduce unwanted biases), we deliberately do not include this.
Model Architecture Similar to [20], we rank outgoing edges through all pairwise comparisons. Formally, we employ a pair classifier \(p: E\times E \rightarrow \mathbb {R}\) where the sign denotes whether the first or second edge is preferred and the magnitude denotes the confidence in that prediction. In a state, every pair of edges is compared and each edge is ranked according to the sum of confidences in its favour. However, since this scales quadratically in the number of outgoing edges, we approximate the above for states with more than 16 edges. For these, a number of pivot edges is chosen, and every other edge is compared against them to obtain a "first guess" at a ranking. From that guess, the best 8 edges are selected for a second round, which is a full round of pairwise comparisons. The final ranking comprises the ranking among the top 8, followed by the remaining edges according to the first ranking.
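The following Python sketch captures this ranking scheme; the pair classifier is passed in as a function, and details such as the number of pivots are assumptions made purely for illustration:

```python
import itertools
import random

def rank_edges(edges, compare, max_full=16, num_pivots=4, top_k=8):
    """Rank outgoing edges by summed pairwise confidences (sketch of the scheme
    described above). `compare(e1, e2)` is the pair classifier: positive if e1
    is preferred, with magnitude as confidence. For more than `max_full` edges,
    a pivot-based pre-ranking selects the top 8 for a full second round. The
    number of pivots is an assumption made for illustration."""
    def full_ranking(es):
        score = {e: 0.0 for e in es}
        for e1, e2 in itertools.combinations(es, 2):
            c = compare(e1, e2)
            score[e1] += c
            score[e2] -= c
        return sorted(es, key=lambda e: -score[e])

    if len(edges) <= max_full:
        return full_ranking(edges)
    pivots = random.sample(edges, num_pivots)
    guess = sorted(edges, key=lambda e: -sum(compare(e, p) for p in pivots))
    return full_ranking(guess[:top_k]) + guess[top_k:]

# toy usage: a classifier that simply prefers numerically larger "edges"
print(rank_edges(list(range(5)), lambda a, b: float(a - b)))   # [4, 3, 2, 1, 0]
```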
Moreover, similar to [20], we pre-classify states into groups with conceptual differences and train a separate model for each group. Intuitively, we distinguish (i) whether a state is owned by the system or the environment, and (ii) whether the long-term goals are trivially structured, e.g. a single liveness condition, which simplifies some decisions; leading to 4 models in total. We discuss the concrete implementation of the pair classifier in the “Training” section below.
Fig. 4.
A simple game to illustrate two challenges for the ground truth. For simplicity, all states are controlled by one player. Clearly, all states are winning.
Ground Truth For the supervised learning of our models, we need meaningful labels that denote the quality of an edge so that we can determine the better one of any given pair. But which edges are “good”? At first, this may seem obvious – simply take all edges which are part of a winning strategy. This however is problematic for multiple reasons, as already outlined in [20].
First, parity games do not allow for maximally permissive strategies. This means that there simply is no one "local truth"; whether an edge is good or bad may depend on decisions in other states, as we exemplify in Fig. 4. While the edges leading to the goal from \(v_1\) and \(v_2\) are always winning choices, edge \(\overline{v_1v_2}\) is only winning if the goal-edge is chosen in \(v_2\) (and vice-versa for \(\overline{v_2v_1}\)). Consequently, different solution approaches may yield different sets of winning edges, and just considering one of them would bias the model to behave like that concrete solution method rather than to "understand" semantically labelled parity games in general. Relating to the previous example, even though \(\overline{v_1v_2}\) and \(\overline{v_2v_1}\) are symmetric, a solver may mark only one of them as winning, but never both. As such, using the output of one solver would make our model believe that one of the two is better, imitating that solver's bias.
Secondly, even if multiple edges are indeed winning, this does not inform us about the “complexity” required to win after playing one of them. For example, consider two edges where one leads to a trivially winning sink within two steps and the other leads to a large and complicated, but ultimately also winning region. Qualitatively, both edges are equivalent, but we prefer the former as it yields smaller solutions and requires fewer correct decisions in the future. We also provide an illustration in Fig. 4. There, both choices in u are winning, however we prefer moving to \(v_1\) over moving to w, as we can win “faster”.
Thus, in order to determine the quality of any given edge of the game, we analyze the game tree after playing said edge. Constructing the entire game tree and applying min-max is practically infeasible already for rather small instances. Therefore, we apply an improved version of the decayed Monte Carlo tree search suggested in [20]. In particular, we only deeply expand the tree for critical paths (the ones where either player fancies their chances) and thus can identify longer shortest winning paths. Conveniently, we thereby no longer require the “optimal stalling” strategy that [20] uses for the opponent, as the opponent prefers longer losing paths over shorter ones by default due to the decay. In the end, we effectively get a score between \(-1\) and 1 for each edge in the game which indicates the “quality” of this edge. Intuitively, an edge directly leading to tt gets a 1. An edge leading to a region where the system can win but may require a lot of steps to do so yields a small, positive value, while edges after which the environment can quickly force a losing cycle yield a score close to \(-1\).
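To illustrate the intuition behind the decayed scores (a drastic simplification of the decayed Monte Carlo tree search actually used), consider a plain decayed min-max on a small finite game tree:

```python
def decayed_value(state, game, owner, decay=0.9):
    """Decayed min-max on a finite game tree: a drastically simplified stand-in
    for the decayed Monte Carlo tree search of [20] used for the ground truth.
    'win'/'lose' leaves score +1/-1, and every step multiplies by `decay`, so
    quick wins score higher than distant ones, and long losses score higher
    than quick ones. Toy representation only."""
    succ = game[state]
    if succ == "win":
        return 1.0
    if succ == "lose":
        return -1.0
    values = [decay * decayed_value(s, game, owner, decay) for s in succ]
    return max(values) if owner[state] == "sys" else min(values)

# toy game in the spirit of Fig. 4: from u, both choices win, but v1 wins faster
game = {"u": ["v1", "w"], "v1": "win", "w": ["v2"], "v2": "win"}
owner = {"u": "sys", "w": "sys", "v2": "sys"}
print([decayed_value(s, game, owner) for s in game["u"]])   # [1.0, 0.9]
```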
Feature Extraction The feature extraction transforms a transition in the game into a vector of numbers so that it can be processed by an ML model. The features are based on all the information that is available at the time of deciding which edge to explore further. In particular, this includes the semantic labelling of the transition’s source and target, the colour/priority, and also labelling associated to its sibling transitions.
We deliberately aimed at manually designing a (large) set of features derived from the semantics and then prune it via feature selection. While automatic feature extraction is a powerful tool, in our use case the feature extraction needs to be extremely efficient, since we need to call it hundreds of times per second to remain even remotely viable. (Recall that we need to extract the features for every edge in the game and already the games obtained from reasonably simple formulae can easily reach thousands of states and significantly more edges.)
State Features We first introduce twelve “formula features”, which transform a single LTL formula into a number. Intuitively, they can be thought of as proxies for higher level concepts. These concepts include formula-complexity (syntactic properties such as height and size of the syntax tree), formula-sat-difficulty (how “difficult” is it to satisfy the formula, capturing variants of trueness [19]), or formula-controllability (how much influence does a player have on the truth value of the formula with only their variables). Formula features are then aggregated for a state (which comprises several formulae) in one of two ways. Either, we select a single formula of the labelling and yield the value of the base feature on the selected formula as the state’s overall value (e.g. selecting the formula that maximizes the value of another base feature). Or, we apply the base feature to all formulae of the semantic labelling and aggregate the results in several ways, exploiting the non-trivial structure of the state labelling. Intuitively, this captures the respective concept (e.g. controllability) over the entire state.
Edge Features Edge features are obtained from state features by either taking the state feature of the edge’s successor or the change of the feature along the edge, i.e. the difference of the feature in the successor and predecessor. Further, we can compare that value against the value of all other edges of the same state by normalizing the feature to the [0, 1]-interval, so that a normalized value of 1 denotes that it is the highest among its sibling edges. This may help learning relative comparisons, but loses all information on the absolute value of the feature. As confirmed in our final models, a mixture of normalized and non-normalized features seems to be desirable. Aside from features derived from states, we also consider features based on the edge priority as suggested by [20]. However, we include the parity information in an “ML-friendly” way, for example by mapping it to a linear scale.
In total, we obtain well above 150k different features for edges. These include, for example, the change in the syntax tree height of the most controllable formula or the aggregated trueness, normalized across all successors. For more details on the features we refer to the appendix of [21].
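As a small illustration of the edge-feature derivation, the following sketch computes, for a single state feature, the successor value, the change along the edge, and the value normalized across sibling edges (hypothetical helper, not SemML's actual API):

```python
def edge_features(state_feature, source, edges):
    """Derive edge features from a state feature (sketch): for each edge take
    (i) the successor's value, (ii) the change along the edge, and (iii) the
    successor value normalized to [0, 1] across all sibling edges.
    `state_feature` maps states to numbers, `edges` maps edge ids of `source`
    to successor states. Hypothetical helper, not SemML's actual API."""
    src_val = state_feature[source]
    succ_vals = {e: state_feature[t] for e, t in edges.items()}
    lo, hi = min(succ_vals.values()), max(succ_vals.values())
    span = (hi - lo) or 1.0                      # avoid division by zero if all equal
    return {e: {"succ": v,                       # absolute value in the successor
                "delta": v - src_val,            # change along the edge
                "norm": (v - lo) / span}         # 1.0 for the best sibling edge
            for e, v in succ_vals.items()}

# toy usage with a trueness-like state feature
trueness = {"q0": 0.25, "q1": 0.75, "q2": 0.5}
print(edge_features(trueness, "q0", {"a": "q1", "b": "q2"}))
```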
Training Applying ground truth and feature methods to the generated formulae yields a dataset for supervised learning with way over a million samples for every state class. We bootstrap these down to roughly 100k samples per state class in order to make the training take reasonable amounts of time. The following procedure was done for every state class individually.
Feature Elimination As using our entire set of thousands of features is absolutely impractical in several regards, especially with our performance constraints in mind, we perform multiple stages of feature elimination. First, we randomly select between 50 and 100 features per major category and add some hand-picked features. This leaves us with an algorithmically manageable, yet practically infeasible, number of about 700 features. For further reductions, we perform a variant of recursive feature elimination, adjusted to our pairwise setting (see [21] for details). As different features might be more or less important for different model types and state classes, we ran a separate feature elimination for each of these. For details on what kind of features remained after the elimination, we refer to [21].
Models As for model types, we evaluated (kernel-)SVMs, neural networks, random forests, and gradient boosted trees. The input for each model is the concatenation of the two feature vectors of the respective edges that we want to compare pairwise. For tree models, we additionally concatenated the pointwise differences of the features in order to allow them to compare the same feature of both edges in one decision node. For every model type, we performed several smaller runs to obtain suitable hyper-parameters for the large scale feature elimination.
Ultimately, gradient boosted trees proved to be the best choice for implementing our pair classifier in all four state classes. Together with random forests, they clearly outperformed the non-tree methods such as SVMs or neural networks. However, in contrast to random forests, they required fewer features (3-10, depending on the state class) to do so. Further details on this experiment can be found in [21].
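As a minimal sketch of such a pair classifier, the snippet below trains a gradient boosted tree model on purely synthetic, illustrative data; the input layout (two feature vectors plus their pointwise differences) follows the description above, and the model size matches the final models mentioned in Remark 2:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Sketch of the pair classifier: the input for a pair of edges with feature
# vectors f1, f2 is [f1, f2, f1 - f2] (differences appended for tree models,
# as described above). The data here is purely synthetic and illustrative;
# the model size (15 trees of depth 2) matches the final models of Remark 2.
rng = np.random.default_rng(0)
n, d = 1000, 5
f1, f2 = rng.random((n, d)), rng.random((n, d))
X = np.hstack([f1, f2, f1 - f2])
y = (f1[:, 0] > f2[:, 0]).astype(int)            # toy ground truth: feature 0 decides

clf = GradientBoostingClassifier(n_estimators=15, max_depth=2).fit(X, y)

def pair_confidence(edge1_features, edge2_features):
    """Signed confidence that edge 1 is preferable over edge 2."""
    x = np.hstack([edge1_features, edge2_features, edge1_features - edge2_features])
    return float(clf.decision_function(x.reshape(1, -1))[0])

print(pair_confidence(np.array([0.9, 0, 0, 0, 0]), np.array([0.1, 0, 0, 0, 0])))
```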

3.4 Engineering

To conclude, we provide details on our implementation and engineering improvements. First and foremost, as a major practical improvement, SemML is implemented in pure Java (built on top of Owl). In contrast, Strix is developed as a hybrid between Rust and Java, with native compilation of Java through GraalVM and a complex interplay between the two code bases. This adds, among other things, complexity due to working with two languages, subtle performance overheads when crossing language boundaries, and setup difficulties due to requiring a rather particular set of tools. As such, SemML is significantly easier to use, maintain, and extend. For easy incorporation of third-party tools such as syfco, we also include a Python wrapper. For learning, we use the Python library sklearn [27], and store our models in the established PMML format.
Aside from structural improvements and pure-Java implementation, we also added several engineering changes compared to the approaches of Strix and [20], of which we list a few notable ones.
  • State Merging In parity games, states are fully determined by their set of edges, i.e. if two system states transition to the same successor for every system assignment while emitting the same priority, they are equivalent and can be merged (see the sketch after this list). This equivalence check is very efficient due to our internal representation of states. Interestingly, by applying this reduction we observed a decrease of game sizes by up to two orders of magnitude.
  • Dual Perspectives As mentioned in Sec. 3.2, our approach alternates between the perspectives of both players. While this is not required for correctness, it helps in practice, as, for example, we can quickly find a small winning region for the environment even if the system player explores in a different direction.
  • Exploration Scheduling Usually, we switch perspectives whenever the game is closed and the solver is consulted. However, when following a "bad" edge leads the algorithm off track, we might spend a lot of time trying to close the game. Instead of insisting on continuing this process to the end, we also switch to the other perspective after too many states have been explored without closing the game.
  • Result Sharing We re-use information discovered from one “perspective” for the other part, where appropriate, for example already constructed parts of the automaton. Moreover, if one side finds a set of states to be winning for them, the exploration of the other side directly treats them as losing.
  • Caching We trade memory for time by caching all computed features.
  • BDD We implemented complement edges and several further engineering improvements in the underlying pure-Java BDD library JBDD [25].
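The state merging mentioned above can be sketched as follows on a toy game representation (a single pass; in practice the reduction is applied until no further merges occur, and SemML's internal representation differs):

```python
def merge_equivalent_states(game):
    """State merging (sketch): states with identical owner and identical
    (assignment, priority, successor) edge sets are collapsed into one
    representative. Single pass; in practice this is repeated until no further
    merges occur. `game` maps state -> (owner, frozenset of edges); toy
    representation, not SemML's internal one."""
    signature_to_rep, rename = {}, {}
    for state, signature in game.items():
        rename[state] = signature_to_rep.setdefault(signature, state)
    merged = {}
    for state, (owner, edges) in game.items():
        if rename[state] != state:
            continue                                     # drop non-representatives
        merged[state] = (owner, frozenset(
            (assignment, priority, rename[succ]) for assignment, priority, succ in edges))
    return merged, rename

# toy example: s1 and s2 have identical edges and are merged into one state
toy = {
    "s1": ("sys", frozenset({("a", 1, "t")})),
    "s2": ("sys", frozenset({("a", 1, "t")})),
    "t": ("env", frozenset()),
}
merged, rename = merge_equivalent_states(toy)
print(sorted(merged), rename)   # ['s1', 't'] {'s1': 's1', 's2': 's1', 't': 't'}
```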

4 Experimental Evaluation

In this section, we evaluate the two central research questions of interest, namely:
  • RQ1 Can SemML solve LTL realizability more efficiently than state-of-the-art tools, in particular Strix?
  • RQ2 How much of the improvement is due to algorithmic and engineering changes, and how much to the ML-guided exploration?
We first introduce considered tools, metrics of interest, and our benchmark sets. Then, we present our results and discuss each research question separately.
Tools We consider our tool SemML and the state-of-the-art tool Strix\(^{4}\). To further distinguish algorithmic and engineering improvements from those due to the ML-based exploration heuristic, we also consider the variant SemML without ML, which uses the exploration score of Strix as guidance instead.
Note that both SemML and Strix internally construct a strategy even when “only” solving LTL realizability. The problem of exporting the (already constructed) strategy into a particular format, e.g. AIGER circuits, is completely orthogonal. Thus, we explicitly focus on the time to find the solution (i.e. let the tools run in their “realizability” configuration).
Remark 1
When only focusing on the number of solved instances, the portfolio solver NeuroSynt [8] is a more powerful tool than Strix. This is to be expected, as it runs multiple approaches (including Strix) in parallel. However, to clearly compare our specific approach to the state-of-the-art, we deem a direct comparison of SemML to Strix (the state-of-the-art of “single-approach”-solvers) more relevant. A more detailed discussion and comparison of SemML with NeuroSynt can be found in the appendix of [21].
Metrics Primarily, we are interested in which tool can solve more benchmarks within a given time constraint (the main metric used for SYNTCOMP). Additionally, for the instances solved by multiple tools, we are interested in which tool solves them faster. For that matter, we compute the ratios of (wallclock) time for all samples that both tools were able to solve and aggregate them by computing the geometric mean.\(^{5}\) We exclude simple instances to account for constant time overheads caused by, e.g., JVM startup and loading of ML models. We treat an instance as simple if both compared tools solve it faster than a given threshold (usually 5s; later we also consider larger values to focus on the most complicated instances).
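For concreteness, the aggregation can be computed as in the following sketch (instance names and timings are illustrative):

```python
from math import prod

def runtime_ratio(times_a, times_b, threshold=5.0):
    """Geometric mean of runtime ratios time_a / time_b over all instances that
    both tools solved and that are not 'simple' (i.e. both solved it faster
    than `threshold` seconds). A result > 1 means tool B is faster on average.
    Sketch; `times_a`/`times_b` map instance -> seconds, or None if unsolved."""
    ratios = []
    for instance in times_a.keys() & times_b.keys():
        ta, tb = times_a[instance], times_b[instance]
        if ta is None or tb is None:            # consider instances solved by both only
            continue
        if ta < threshold and tb < threshold:   # exclude simple instances
            continue
        ratios.append(ta / tb)
    return prod(ratios) ** (1 / len(ratios)), len(ratios)

# toy check of footnote 5: a 0.5x and a 2x speed-up average out to exactly 1x
print(runtime_ratio({"i1": 10.0, "i2": 40.0}, {"i1": 20.0, "i2": 20.0}))
```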
Benchmarks We consider two classes of benchmarks. First, our evaluation set as described in Sec. 3.3, called Synthetic. Here, we expect the ML-based guidance to shine, as the inputs are of a similar structure as the training data (just much larger). Additionally, we consider the entire set of SYNTCOMP 2024.
We highlight some peculiarities of the SYNTCOMP data set. First, as we observe in our experiments, the vast majority of instances are trivial, i.e. solved within a few seconds. As such, a small constant time overhead has a large (relative) impact. Second, SYNTCOMP mainly comprises parametrized families of formulae, where incrementing the parameter often more than doubles the size of the state space. As such, for many families a lot of their instances are simple, very few are interesting-but-solvable, and then many more are completely out of reach. This results in a rather small set of benchmarks where a notable difference can be expected, and solving one more sample of a family already marks significant improvement. Finally, there are subtle biases and asymmetries that may limit the possible improvement of exploration guidance. On the one hand, several of the realizable families are arbiters (or variants thereof), where, by design, (nearly) the entire state space needs to be explored. In particular, any attempt of guidance is useless and any effort spent on it costs overall performance. On the other hand, many unrealizable families are constructed by taking a realizable family and introducing a contradiction at a parametrized depth of the execution. This class may be more suitable for employing targeted guidance, as one only needs to find that single contradiction in the state space to prove unrealizability. However, these unrealizable families tend to be dominant in numbers while not providing much diversity. Thus, if a heuristic by chance adapts well or badly to a single family or a particular kind of contradiction, this effect alone can dominate the overall evaluation. As such, the results on SYNTCOMP should be interpreted carefully, especially when comparing guidance heuristics.
Experimental Setup Our experiments were conducted on an AMD Epyc 7443 24-Core CPU and 188GB of RAM. Each invocation was limited to 30 minutes and 60GB memory, mimicking SYNTCOMP conditions.
Table 1.
Comparison of SemML to Strix on SYNTCOMP and synthetic data. We show how many instances are solved by both, by only one, or none of the tools. Note that 815 instances of SYNTCOMP are solved by both tools in under 10 seconds.
Table 2.
Average runtime ratios between Strix and SemML for all instances where at least one tool required more than n seconds (where n is in the top row) and both tools found a solution. We also give the number of instances that satisfy these two criteria. A ratio \(>1\) indicates that SemML is faster on average.

4.1 RQ1: Comparing SemML to Strix

Our main results are summarized in Tables 1 and 2 and displayed in Fig. 5. On SYNTCOMP, we solve \(49-27=22 \) instances more than Strix, which, considering the difficulty scale of SYNTCOMP, marks a major achievement. Further, we observe a significant speed-up compared to Strix, especially as the instances get larger. For the ratio over the entire SYNTCOMP set, we mention that there are about 600 instances that Strix solves (nearly) instantly, whereas SemML requires about a second to start the JVM and load the ML parameters.
Investigating the unique solves of both tools in more detail, we observe that on many realizable families, SemML is able to solve one more instance than Strix within the timeout, sometimes even within a minute. In particular, there are three families (amba_gr, amba_decomposed_lock, collector_v3) where two or even more extra instances were solved, marking a major improvement. The instances only solved by Strix turned out to be mostly variants of the ltl2dba families. Here, we conjecture that our guidance takes a "bad turn", never reaches closure, and thus never reconsiders its steps, while the less guided, broader exploration of Strix has a higher chance of exploring the right parts.
For unrealizable formulae, most of our unique solves were of the discussed form of injecting a fault into some arbiter, and our targeted exploration was able to localize these faults deep in the state space. Strix's unique solves are again mostly unrealizable variants of ltl2dba and detector_unreal formulae.
On the Synthetic dataset, SemML outperforms Strix by an order of magnitude, solving all instances Strix is able to solve, and even 53 additional ones. On the 30 instances that both tools solved, SemML is 8.56 times faster, and this factor increases to 13.44 when considering the most challenging instances.

4.2 RQ2: Effects of Machine Learning and Algorithmic Changes

We proceed to investigate the impact of our algorithmic and engineering changes and our ML heuristic individually. For that matter, we first compare Strix to SemML without ML, and then compare SemML without ML to SemML.
Fig. 5.
Scatter plots of the runtimes of SemML, Strix, and SemML without ML on SYNTCOMP. A point (x, y) denotes that tool X and tool Y needed x and y seconds, respectively. If a point is above/below the diagonal, tool X is faster/slower. Plots are on logarithmic scale, dashed diagonals indicate that one tool is twice as fast. Timeouts are pushed to the orthogonal dashed line. The axes start at 1 second.
Strix vs. SemML without ML Especially on SYNTCOMP, a significant part of the improvements is caused by our algorithmic changes. For example, the game for collectorv1_14 has over 19M states in Strix, while SemML keeps it at a manageable 60k states due to merging of game states. This allowed SemML to solve collectorv1_15 as well, whereas Strix could not. Similarly, we also observed the concrete impact of the BDD improvements as well as of the demand-guided exploration. The former is more prominent for realizable instances, while the latter shows significant impact mostly (but not exclusively) for unrealizable instances. Concretely, 24 of the 30 unrealizable instances uniquely solved by SemML without ML are fault-injected arbiters. Here, our deep, targeted exploration was able to localize the fault, even when only following the exploration score of Strix. Strix's broader exploration could not reach these regions despite following the same score, likely because it tried to follow multiple "leads" at once, due to its global view. On the Synthetic set, the effects are even more pronounced. Here, already SemML without ML solves all instances that Strix solves and 37 more, with a speed-up of 9.1.
SemML without ML vs. SemML Despite our algorithmic changes already yielding significant improvements, ML still adds performance on top. For SYNTCOMP, our guidance is able to identify solutions much quicker on several families (e.g. 01-13.tlsf or collector_v3), resulting in a speed-up factor of 1.25 (with a lower threshold of 30 seconds to focus on complicated instances) and 2 more unique solves on realizable samples. On unrealizable formulae, SemML solves 7 further instances (mainly from full_arbiter_unreal), but fails on 8 instances (mainly from round_robin_arbiter_unreal). We conjecture that this is due to bad generalization, and due to the fact that exploration towards a deep fault comes with more opportunities for an ML model to "mess up", whereas the score used by SemML without ML is stable. Overall however, SemML is competitive with SemML without ML on SYNTCOMP, which is positive, considering the discussed structure of SYNTCOMP and, especially, that our model was not even trained on similar inputs.
Turning our attention to the Synthetic set (in line with the training data), we see that SemML has a significant 20 unique solves compared to 4 for SemML without ML, and a speed-up of 1.57 on complicated instances (threshold of 30 seconds). Based on this, we conjecture that SYNTCOMP likely is "out-of-distribution" for our model and, consequently, training on both synthetic as well as SYNTCOMP samples would add even more improvements for real-world formulae.
Remark 2
To conclude, we stress that our ML models are deliberately kept small, since we need to be able to evaluate them extremely quickly. As such, we only considered quite simple features and small models. Concretely, the final models for all state classes consist of 15 trees of depth 2. We also evaluated larger models with more complex and meaningful features. These performed significantly better in terms of pure accuracy on the pair classification, but added so much overhead that ultimately fewer instances were solved.

5 Conclusion and Future Work

We presented our tool SemML, which combines algorithmic and engineering improvements with tailored machine-learning heuristics to arrive at a highly performant tool for reactive synthesis. Our experimental evaluation confirms the impact of both kinds of improvements. In particular, SemML significantly outperforms the state-of-the-art tool Strix, which has dominated the reactive synthesis competition since its first appearance.
For future work, we identify several avenues. In terms of algorithmic improvements, both the backtracking heuristic and our parity game solver can be significantly improved by a tighter integration with the exploration heuristic. Further, we intend to implement, as well as develop new, state-of-the-art methods for extracting and representing solutions efficiently, to make practical use of the solutions we are now able to identify. For users of the tool, we want to provide a variant of SemML that is also trained on SYNTCOMP formulae, which would be unfair for academic evaluation but useful for actual industrial synthesis. We also want to investigate hierarchical oracles, i.e. include more accurate (but slower) oracles, which are only consulted when the basic oracle has low confidence.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Footnotes
1
For realizability, the task is to determine whether a system satisfying the specification exists; its implementation, however, is only required in the synthesis track, where our tool did not participate. Competitively small representations of computed strategies require numerous (known) techniques, which are orthogonal to the advancements our tool is bringing into the area of machine-learning-aided solving of LTL games.
 
2
The convention that the environment chooses first and the system, observing the environment’s choice, goes second is called Mealy semantics.
 
3
Formally, a function \(f:(2^{\textsf{AP}_{\textsf{ENV}}})^*\rightarrow 2^{\textsf{AP}_{\textsf{SYS}}}\) so that \(e_1f(e_1)e_2f(e_1e_2)\ldots \models \phi \).
 
4
With its best-performing configuration \(\texttt{--exploration=minmax --lookahead=0}\).
 
5
As is customary, the geometric mean is preferable for runtime ratios: For example, for 0.5x and 2x speed-ups it yields 1x instead of 1.25x with an arithmetic mean.
 
References
4.
Bohy, A., Bruyère, V., Filiot, E., Jin, N., Raskin, J.: Acacia+, a tool for LTL synthesis. In: Madhusudan, P., Seshia, S.A. (eds.) Computer Aided Verification - 24th International Conference, CAV 2012, Berkeley, CA, USA, July 7-13, 2012, Proceedings. Lecture Notes in Computer Science, vol. 7358, pp. 652–657. Springer (2012). https://doi.org/10.1007/978-3-642-31424-7_45
5.
Büchi, J.: On a decision method in restricted second-order arithmetic. In: Nagel, E., Suppes, P., Tarski, A. (eds.) Proceedings of the First International Congress on Logic, Methodology, and Philosophy of Science 1960 (1962)
6.
Cadilhac, M., Pérez, G.A.: Acacia-bonsai: A modern implementation of downset-based LTL realizability. In: Sankaranarayanan, S., Sharygina, N. (eds.) Tools and Algorithms for the Construction and Analysis of Systems - 29th International Conference, TACAS 2023, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2023, Paris, France, April 22-27, 2023, Proceedings, Part II. Lecture Notes in Computer Science, vol. 13994, pp. 192–207. Springer (2023). https://doi.org/10.1007/978-3-031-30820-8_14
7.
Zurück zum Zitat Church, A.: Application of recursive arithmetic to the problem of circuit synthesis. Journal of Symbolic Logic (1963) Church, A.: Application of recursive arithmetic to the problem of circuit synthesis. Journal of Symbolic Logic (1963)
8.
Zurück zum Zitat Cosler, M., Hahn, C., Omar, A., Schmitt, F.: Neurosynt: A neuro-symbolic portfolio solver for reactive synthesis. In: Finkbeiner, B., Kovács, L. (eds.) Tools and Algorithms for the Construction and Analysis of Systems - 30th International Conference, TACAS 2024, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2024, Luxembourg City, Luxembourg, April 6-11, 2024, Proceedings, Part III. Lecture Notes in Computer Science, vol. 14572, pp. 45–67. Springer (2024). https://doi.org/10.1007/978-3-031-57256-2_3 Cosler, M., Hahn, C., Omar, A., Schmitt, F.: Neurosynt: A neuro-symbolic portfolio solver for reactive synthesis. In: Finkbeiner, B., Kovács, L. (eds.) Tools and Algorithms for the Construction and Analysis of Systems - 30th International Conference, TACAS 2024, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2024, Luxembourg City, Luxembourg, April 6-11, 2024, Proceedings, Part III. Lecture Notes in Computer Science, vol. 14572, pp. 45–67. Springer (2024). https://​doi.​org/​10.​1007/​978-3-031-57256-2_​3
9.
Zurück zum Zitat Cosler, M., Schmitt, F., Hahn, C., Finkbeiner, B.: Iterative circuit repair against formal specifications. In: The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net (2023), https://openreview.net/forum?id=SEcSahl0Ql Cosler, M., Schmitt, F., Hahn, C., Finkbeiner, B.: Iterative circuit repair against formal specifications. In: The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net (2023), https://​openreview.​net/​forum?​id=​SEcSahl0Ql
10.
Zurück zum Zitat Dwyer, M.B., Avrunin, G.S., Corbett, J.C.: Property specification patterns for finite-state verification. In: Proceedings of the Second Workshop on Formal Methods in Software Practice. p. 7–15. FMSP ’98, Association for Computing Machinery, New York, NY, USA (1998). https://doi.org/10.1145/298595.298598 Dwyer, M.B., Avrunin, G.S., Corbett, J.C.: Property specification patterns for finite-state verification. In: Proceedings of the Second Workshop on Formal Methods in Software Practice. p. 7–15. FMSP ’98, Association for Computing Machinery, New York, NY, USA (1998). https://​doi.​org/​10.​1145/​298595.​298598
11.
Zurück zum Zitat Ehlers, R.: Unbeast: Symbolic bounded synthesis. In: Abdulla, P.A., Leino, K.R.M. (eds.) Tools and Algorithms for the Construction and Analysis of Systems - 17th International Conference, TACAS 2011, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2011, Saarbrücken, Germany, March 26-April 3, 2011. Proceedings. Lecture Notes in Computer Science, vol. 6605, pp. 272–275. Springer (2011). https://doi.org/10.1007/978-3-642-19835-9_25 Ehlers, R.: Unbeast: Symbolic bounded synthesis. In: Abdulla, P.A., Leino, K.R.M. (eds.) Tools and Algorithms for the Construction and Analysis of Systems - 17th International Conference, TACAS 2011, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2011, Saarbrücken, Germany, March 26-April 3, 2011. Proceedings. Lecture Notes in Computer Science, vol. 6605, pp. 272–275. Springer (2011). https://​doi.​org/​10.​1007/​978-3-642-19835-9_​25
14.
Zurück zum Zitat Faymonville, P., Finkbeiner, B., Tentrup, L.: Bosy: An experimentation framework for bounded synthesis. In: Majumdar, R., Kuncak, V. (eds.) Computer Aided Verification - 29th International Conference, CAV 2017, Heidelberg, Germany, July 24-28, 2017, Proceedings, Part II. Lecture Notes in Computer Science, vol. 10427, pp. 325–332. Springer (2017). https://doi.org/10.1007/978-3-319-63390-9_17 Faymonville, P., Finkbeiner, B., Tentrup, L.: Bosy: An experimentation framework for bounded synthesis. In: Majumdar, R., Kuncak, V. (eds.) Computer Aided Verification - 29th International Conference, CAV 2017, Heidelberg, Germany, July 24-28, 2017, Proceedings, Part II. Lecture Notes in Computer Science, vol. 10427, pp. 325–332. Springer (2017). https://​doi.​org/​10.​1007/​978-3-319-63390-9_​17
15.
Zurück zum Zitat Jacobs, S., Klein, F., Schirmer, S.: A high-level LTL synthesis format: TLSF v1.1. In: Piskac, R., Dimitrova, R. (eds.) Proceedings Fifth Workshop on Synthesis, SYNT@CAV 2016, Toronto, Canada, July 17-18, 2016. EPTCS, vol. 229, pp. 112–132 (2016). https://doi.org/10.4204/EPTCS.229.10 Jacobs, S., Klein, F., Schirmer, S.: A high-level LTL synthesis format: TLSF v1.1. In: Piskac, R., Dimitrova, R. (eds.) Proceedings Fifth Workshop on Synthesis, SYNT@CAV 2016, Toronto, Canada, July 17-18, 2016. EPTCS, vol. 229, pp. 112–132 (2016). https://​doi.​org/​10.​4204/​EPTCS.​229.​10
16.
Zurück zum Zitat Jacobs, S., Pérez, G.A., Abraham, R., Bruyère, V., Cadilhac, M., Colange, M., Delfosse, C., van Dijk, T., Duret-Lutz, A., Faymonville, P., Finkbeiner, B., Khalimov, A., Klein, F., Luttenberger, M., Meyer, K.J., Michaud, T., Pommellet, A., Renkin, F., Schlehuber-Caissier, P., Sakr, M., Sickert, S., Staquet, G., Tamines, C., Tentrup, L., Walker, A.: The reactive synthesis competition (SYNTCOMP): 2018-2021. CoRR abs/2206.00251 (2022). https://doi.org/10.48550/arXiv.2206.00251 Jacobs, S., Pérez, G.A., Abraham, R., Bruyère, V., Cadilhac, M., Colange, M., Delfosse, C., van Dijk, T., Duret-Lutz, A., Faymonville, P., Finkbeiner, B., Khalimov, A., Klein, F., Luttenberger, M., Meyer, K.J., Michaud, T., Pommellet, A., Renkin, F., Schlehuber-Caissier, P., Sakr, M., Sickert, S., Staquet, G., Tamines, C., Tentrup, L., Walker, A.: The reactive synthesis competition (SYNTCOMP): 2018-2021. CoRR abs/2206.00251 (2022). https://​doi.​org/​10.​48550/​arXiv.​2206.​00251
17.
Zurück zum Zitat Jobstmann, B., Bloem, R.: Optimizations for LTL synthesis. In: Formal Methods in Computer-Aided Design, 6th International Conference, FMCAD 2006, San Jose, California, USA, November 12-16, 2006, Proceedings. pp. 117–124. IEEE Computer Society (2006). https://doi.org/10.1109/FMCAD.2006.22 Jobstmann, B., Bloem, R.: Optimizations for LTL synthesis. In: Formal Methods in Computer-Aided Design, 6th International Conference, FMCAD 2006, San Jose, California, USA, November 12-16, 2006, Proceedings. pp. 117–124. IEEE Computer Society (2006). https://​doi.​org/​10.​1109/​FMCAD.​2006.​22
18.
Zurück zum Zitat Kretínský, J., Esparza, J.: Deterministic automata for the (f, g)-fragment of LTL. In: Madhusudan, P., Seshia, S.A. (eds.) Computer Aided Verification - 24th International Conference, CAV 2012, Berkeley, CA, USA, July 7-13, 2012 Proceedings. Lecture Notes in Computer Science, vol. 7358, pp. 7–22. Springer (2012). https://doi.org/10.1007/978-3-642-31424-7_7 Kretínský, J., Esparza, J.: Deterministic automata for the (f, g)-fragment of LTL. In: Madhusudan, P., Seshia, S.A. (eds.) Computer Aided Verification - 24th International Conference, CAV 2012, Berkeley, CA, USA, July 7-13, 2012 Proceedings. Lecture Notes in Computer Science, vol. 7358, pp. 7–22. Springer (2012). https://​doi.​org/​10.​1007/​978-3-642-31424-7_​7
19.
Zurück zum Zitat Kretínský, J., Manta, A., Meggendorfer, T.: Semantic labelling and learning for parity game solving in LTL synthesis. In: Chen, Y., Cheng, C., Esparza, J. (eds.) Automated Technology for Verification and Analysis - 17th International Symposium, ATVA 2019, Taipei, Taiwan, October 28-31, 2019, Proceedings. Lecture Notes in Computer Science, vol. 11781, pp. 404–422. Springer (2019). https://doi.org/10.1007/978-3-030-31784-3_24 Kretínský, J., Manta, A., Meggendorfer, T.: Semantic labelling and learning for parity game solving in LTL synthesis. In: Chen, Y., Cheng, C., Esparza, J. (eds.) Automated Technology for Verification and Analysis - 17th International Symposium, ATVA 2019, Taipei, Taiwan, October 28-31, 2019, Proceedings. Lecture Notes in Computer Science, vol. 11781, pp. 404–422. Springer (2019). https://​doi.​org/​10.​1007/​978-3-030-31784-3_​24
20.
Zurück zum Zitat Kretínský, J., Meggendorfer, T., Prokop, M., Rieder, S.: Guessing winning policies in LTL synthesis by semantic learning. In: Enea, C., Lal, A. (eds.) Computer Aided Verification - 35th International Conference, CAV 2023, Paris, France, July 17-22, 2023, Proceedings, Part I. Lecture Notes in Computer Science, vol. 13964, pp. 390–414. Springer (2023). https://doi.org/10.1007/978-3-031-37706-8_20 Kretínský, J., Meggendorfer, T., Prokop, M., Rieder, S.: Guessing winning policies in LTL synthesis by semantic learning. In: Enea, C., Lal, A. (eds.) Computer Aided Verification - 35th International Conference, CAV 2023, Paris, France, July 17-22, 2023, Proceedings, Part I. Lecture Notes in Computer Science, vol. 13964, pp. 390–414. Springer (2023). https://​doi.​org/​10.​1007/​978-3-031-37706-8_​20
22.
Zurück zum Zitat Kretínský, J., Meggendorfer, T., Sickert, S.: Owl: A library for \(\omega \)-words, automata, and LTL. In: Lahiri, S.K., Wang, C. (eds.) Automated Technology for Verification and Analysis - 16th International Symposium, ATVA 2018, Los Angeles, CA, USA, October 7-10, 2018, Proceedings. Lecture Notes in Computer Science, vol. 11138, pp. 543–550. Springer (2018). https://doi.org/10.1007/978-3-030-01090-4_34 Kretínský, J., Meggendorfer, T., Sickert, S.: Owl: A library for \(\omega \)-words, automata, and LTL. In: Lahiri, S.K., Wang, C. (eds.) Automated Technology for Verification and Analysis - 16th International Symposium, ATVA 2018, Los Angeles, CA, USA, October 7-10, 2018, Proceedings. Lecture Notes in Computer Science, vol. 11138, pp. 543–550. Springer (2018). https://​doi.​org/​10.​1007/​978-3-030-01090-4_​34
23.
Zurück zum Zitat Kretínský, J., Meggendorfer, T., Sickert, S., Ziegler, C.: Rabinizer 4: From LTL to your favourite deterministic automaton. In: Chockler, H., Weissenbacher, G. (eds.) Computer Aided Verification - 30th International Conference, CAV 2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK, July 14-17, 2018, Proceedings, Part I. Lecture Notes in Computer Science, vol. 10981, pp. 567–577. Springer (2018). https://doi.org/10.1007/978-3-319-96145-3_30 Kretínský, J., Meggendorfer, T., Sickert, S., Ziegler, C.: Rabinizer 4: From LTL to your favourite deterministic automaton. In: Chockler, H., Weissenbacher, G. (eds.) Computer Aided Verification - 30th International Conference, CAV 2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK, July 14-17, 2018, Proceedings, Part I. Lecture Notes in Computer Science, vol. 10981, pp. 567–577. Springer (2018). https://​doi.​org/​10.​1007/​978-3-319-96145-3_​30
24.
Zurück zum Zitat Kupferman, O., Piterman, N., Vardi, M.Y.: Safraless compositional synthesis. In: Ball, T., Jones, R.B. (eds.) Computer Aided Verification, 18th International Conference, CAV 2006, Seattle, WA, USA, August 17-20, 2006, Proceedings. Lecture Notes in Computer Science, vol. 4144, pp. 31–44. Springer (2006). https://doi.org/10.1007/11817963_6 Kupferman, O., Piterman, N., Vardi, M.Y.: Safraless compositional synthesis. In: Ball, T., Jones, R.B. (eds.) Computer Aided Verification, 18th International Conference, CAV 2006, Seattle, WA, USA, August 17-20, 2006, Proceedings. Lecture Notes in Computer Science, vol. 4144, pp. 31–44. Springer (2006). https://​doi.​org/​10.​1007/​11817963_​6
26.
Zurück zum Zitat Meyer, P.J., Sickert, S., Luttenberger, M.: Strix: Explicit reactive synthesis strikes back! In: Chockler, H., Weissenbacher, G. (eds.) Computer Aided Verification - 30th International Conference, CAV 2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK, July 14-17, 2018, Proceedings, Part I. Lecture Notes in Computer Science, vol. 10981, pp. 578–586. Springer (2018). https://doi.org/10.1007/978-3-319-96145-3_31 Meyer, P.J., Sickert, S., Luttenberger, M.: Strix: Explicit reactive synthesis strikes back! In: Chockler, H., Weissenbacher, G. (eds.) Computer Aided Verification - 30th International Conference, CAV 2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK, July 14-17, 2018, Proceedings, Part I. Lecture Notes in Computer Science, vol. 10981, pp. 578–586. Springer (2018). https://​doi.​org/​10.​1007/​978-3-319-96145-3_​31
27.
Zurück zum Zitat Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830 (2011) Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830 (2011)
28.
Zurück zum Zitat Piterman, N.: From nondeterministic buchi and streett automata to deterministic parity automata. In: 21th IEEE Symposium on Logic in Computer Science (LICS 2006), 12-15 August 2006, Seattle, WA, USA, Proceedings. pp. 255–264. IEEE Computer Society (2006). https://doi.org/10.1109/LICS.2006.28 Piterman, N.: From nondeterministic buchi and streett automata to deterministic parity automata. In: 21th IEEE Symposium on Logic in Computer Science (LICS 2006), 12-15 August 2006, Seattle, WA, USA, Proceedings. pp. 255–264. IEEE Computer Society (2006). https://​doi.​org/​10.​1109/​LICS.​2006.​28
29.
Zurück zum Zitat Piterman, N., Pnueli, A., Sa’ar, Y.: Synthesis of reactive(1) designs. In: Emerson, E.A., Namjoshi, K.S. (eds.) Verification, Model Checking, and Abstract Interpretation, 7th International Conference, VMCAI 2006, Charleston, SC, USA, January 8-10, 2006, Proceedings. Lecture Notes in Computer Science, vol. 3855, pp. 364–380. Springer (2006). https://doi.org/10.1007/11609773_24 Piterman, N., Pnueli, A., Sa’ar, Y.: Synthesis of reactive(1) designs. In: Emerson, E.A., Namjoshi, K.S. (eds.) Verification, Model Checking, and Abstract Interpretation, 7th International Conference, VMCAI 2006, Charleston, SC, USA, January 8-10, 2006, Proceedings. Lecture Notes in Computer Science, vol. 3855, pp. 364–380. Springer (2006). https://​doi.​org/​10.​1007/​11609773_​24
31.
Zurück zum Zitat Pnueli, A., Rosner, R.: On the synthesis of an asynchronous reactive module. In: Ausiello, G., Dezani-Ciancaglini, M., Rocca, S.R.D. (eds.) Automata, Languages and Programming, 16th International Colloquium, ICALP89, Stresa, Italy, July 11-15, 1989, Proceedings. Lecture Notes in Computer Science, vol. 372, pp. 652–671. Springer (1989). https://doi.org/10.1007/BFb0035790 Pnueli, A., Rosner, R.: On the synthesis of an asynchronous reactive module. In: Ausiello, G., Dezani-Ciancaglini, M., Rocca, S.R.D. (eds.) Automata, Languages and Programming, 16th International Colloquium, ICALP89, Stresa, Italy, July 11-15, 1989, Proceedings. Lecture Notes in Computer Science, vol. 372, pp. 652–671. Springer (1989). https://​doi.​org/​10.​1007/​BFb0035790
34.
Zurück zum Zitat Rodríguez, A., Sánchez, C.: Boolean abstractions for realizability modulo theories. In: Enea, C., Lal, A. (eds.) Computer Aided Verification. pp. 305–328. Springer Nature Switzerland, Cham (2023) Rodríguez, A., Sánchez, C.: Boolean abstractions for realizability modulo theories. In: Enea, C., Lal, A. (eds.) Computer Aided Verification. pp. 305–328. Springer Nature Switzerland, Cham (2023)
36.
Zurück zum Zitat Schewe, S.: Tighter bounds for the determinisation of büchi automata. In: de Alfaro, L. (ed.) Foundations of Software Science and Computational Structures, 12th International Conference, FOSSACS 2009, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009, York, UK, March 22-29, 2009. Proceedings. Lecture Notes in Computer Science, vol. 5504, pp. 167–181. Springer (2009).https://doi.org/10.1007/978-3-642-00596-1_13 Schewe, S.: Tighter bounds for the determinisation of büchi automata. In: de Alfaro, L. (ed.) Foundations of Software Science and Computational Structures, 12th International Conference, FOSSACS 2009, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009, York, UK, March 22-29, 2009. Proceedings. Lecture Notes in Computer Science, vol. 5504, pp. 167–181. Springer (2009).https://​doi.​org/​10.​1007/​978-3-642-00596-1_​13
38.
Zurück zum Zitat Sickert, S., Esparza, J., Jaax, S., Křetínský, J.: Limit-deterministic büchi automata for linear temporal logic. In: Chaudhuri, S., Farzan, A. (eds.) Computer Aided Verification - 28th International Conference, CAV 2016, Toronto, ON, Canada, July 17-23, 2016, Proceedings, Part II. Lecture Notes in Computer Science, vol. 9780, pp. 312–332. Springer (2016). https://doi.org/10.1007/978-3-319-41540-6_17 Sickert, S., Esparza, J., Jaax, S., Křetínský, J.: Limit-deterministic büchi automata for linear temporal logic. In: Chaudhuri, S., Farzan, A. (eds.) Computer Aided Verification - 28th International Conference, CAV 2016, Toronto, ON, Canada, July 17-23, 2016, Proceedings, Part II. Lecture Notes in Computer Science, vol. 9780, pp. 312–332. Springer (2016). https://​doi.​org/​10.​1007/​978-3-319-41540-6_​17
39.
Zurück zum Zitat Thomas, W.: On the synthesis of strategies in infinite games. In: Mayr, E.W., Puech, C. (eds.) STACS 95, 12th Annual Symposium on Theoretical Aspects of Computer Science, Munich, Germany, March 2-4, 1995, Proceedings. Lecture Notes in Computer Science, vol. 900, pp. 1–13. Springer (1995). https://doi.org/10.1007/3-540-59042-0_57 Thomas, W.: On the synthesis of strategies in infinite games. In: Mayr, E.W., Puech, C. (eds.) STACS 95, 12th Annual Symposium on Theoretical Aspects of Computer Science, Munich, Germany, March 2-4, 1995, Proceedings. Lecture Notes in Computer Science, vol. 900, pp. 1–13. Springer (1995). https://​doi.​org/​10.​1007/​3-540-59042-0_​57
Metadata
Title
SemML: Enhancing Automata-Theoretic LTL Synthesis with Machine Learning
Authors
Jan Křetínský
Tobias Meggendorfer
Maximilian Prokop
Ashkan Zarkhah
Copyright Year
2025
DOI
https://doi.org/10.1007/978-3-031-90643-5_12