Open Access 2025 | Original Paper | Book Chapter

Compositional Learning for Synchronous Parallel Automata

Authors: Mahboubeh Samadi, Aryan Bastany, Hossein Hojjat

Published in: Fundamental Approaches to Software Engineering

Publisher: Springer Nature Switzerland


Abstract

This chapter addresses the advanced technique of active automata learning, which plays a key role in extracting concise, abstract models from black-box systems. The focus is on the synchronous parallel composition of Mealy machines, a type of finite-state machine that produces outputs based on the current state and input. The chapter introduces a synchronous compositional learning algorithm that extends existing methods by learning systems composed of synchronous components without prior knowledge of their structure or synchronizing actions. The algorithm automatically detects and learns the individual components and their interactions while guaranteeing output determinism and scalability. The chapter provides a thorough theoretical foundation, including proofs of termination and correctness, and evaluates the algorithm's performance on both synthetic and realistic benchmarks. The results demonstrate the algorithm's advantage in handling large models, making it a valuable contribution to the field of automata learning. The detailed evaluation and experimental results highlight the practical applications and benefits of the proposed algorithm, making the chapter essential reading for anyone interested in advanced automata learning techniques.

1 Introduction

Active automata learning [1] is a promising approach for extracting a model, in the form of an automaton, from a black-box system. This technique can be quite handy when engineers need to understand how a legacy or sophisticated system operates. The extracted model is concise, abstract, and quite helpful in practice. Active automata learning has recently gained much attention: it has been used for learning sophisticated industrial systems such as communication and security protocols [2], biometric passports [3], smart cards [4], and network protocols [5].
Fig. 1:
(a) Two Mealy machines with alphabets \(\{a,c\}\) and \(\{b,c\}\) (b) Synchronous parallel composition of two Mealy machines synchronizing on c
In automata learning, a learner interacts with a teacher who knows the System Under Learning (SUL) and poses queries to learn the system model. The original algorithm L\(^*\) [1] was proposed to learn a model of a system as a deterministic finite automaton, and extensions of L\(^*\) were introduced in [6–10] to learn richer models. Using L\(^*\) to learn large industrial models may lead to a huge number of queries, which grows quadratically with the number of model states [11]. Compositional learning, where a system is learned through learning its components, is a promising approach to resolve this problem [12]. Compositional learning algorithms are introduced in [11–14], of which [12] and [14] are the closest to our work. In [12], a synchronous parallel composition algorithm is introduced where the alphabets of the components are known to the learner beforehand and the components interact synchronously. In [14], a compositional algorithm is introduced as an extension of L\(^*\) where the individual components are learned without any knowledge of the number of components or their individual alphabets. However, in [14], components are not synchronized and there is no communication among them.
In this paper, we introduce a synchronous compositional learning algorithm as an extension of the algorithm proposed in [14], in which systems comprising synchronous components are learned compositionally. In our approach, individual components are learned without any knowledge of the number of components, their individual alphabets, or the synchronizing alphabet. We learn such information automatically and on-the-fly, and learn the composite model by applying L\(^*\) to the individual components. We learn a system model in the form of a Mealy machine [15]. A Mealy machine is a finite-state machine in which the transitions produce an output value based on the current state and the current input symbol. Following the model learning literature [1, 14], we assume that the system model under learning is deterministic [16]: there is exactly one output and target state for a source state and a given input. In other words, the next state is uniquely determined by the given state, input, and output. We show by an example that the synchronous parallel composition of two deterministic Mealy machines may lead to a pseudo-nondeterministic Mealy machine [17], in which there is more than one output for a given state and input.
Example. Consider two deterministic Mealy machines \(M_1\) and \(M_2\) (Fig. 1(a)) with input alphabets \(\{a,c\}\) and \(\{b,c\}\). These machines synchronize on the shared action c, where c has the two different outputs 0 and 1. Fig. 1(b) shows the synchronous parallel composition of \(M_1\) and \(M_2\), where composing \(s_0\xrightarrow {c/0}s_0\) and \(s'_1\xrightarrow {c/1}s'_2\) results in \((s_0,s'_1)\xrightarrow {c/?}(s_0,s'_2)\), and ? denotes the non-deterministic output, which can be either 0 or 1.
In this paper, we assume that each synchronizing action has a unique output wherever it is used across the whole system. This assumption avoids output non-determinism: the synchronous parallel composition of two deterministic Mealy machines remains deterministic. Output determinism is also assumed in [18–20] and is compatible with many benchmarks in the railway [21, 22] and automotive [23, 24] domains. We also investigated the outputs of actions in some benchmarks (details in Section 5) and found that on average around 46 percent of the actions have unique outputs. However, the assumption that synchronizing actions have unique outputs everywhere may be stronger than necessary in some cases (the output value needs to be the same only at the point of communication, not throughout the whole system). Without this assumption, the proposed algorithm cannot detect the synchronizing actions and the individual components, but it still learns the parallel composition of the automata correctly. We leave investigating the general case as future work.
In the proposed approach, we learn the SUL as separate components. If the alphabets of the components are not disjoint, their intersection denotes the synchronizing actions. Initially, we start with a partition comprising only singleton sets and assume that all actions can be synchronizing actions. If we find an action with two different outputs, we remove it from the set of synchronizing actions. The total behavior of the system is obtained by constructing the synchronous parallel composition of the components. We process the counter-example CE returned by the teacher and iteratively update the set of synchronizing actions. For each pair of actions involved in CE, we merge their corresponding input sets if neither belongs to the set of synchronizing actions. Otherwise, we add the synchronizing action to the sets containing the non-synchronizing actions of CE.
Fig. 2:
(a) Initial model: two parallel Mealy machines, (b) Synchronous parallel composition of (a), (c) Partition input alphabet to \(\{\{a\},\{b\},\{c\}\}\) and learn each component individually, (d) Use a counter-example ca to update components, (e) Use a counter-example cb to update components
Running Example. Consider the SUL of Fig. 2(a) with two synchronous components and the alphabets \(\{a,c\}\) and \(\{b,c\}\). Fig. 2(b) shows the synchronous parallel composition of the two components. Initially, we assume that the set of synchronizing actions is \({ sync}=\{a,b,c\}\). We start by partitioning the alphabet into singleton sets, i.e., \(\{a\}\), \(\{b\}\), \(\{c\}\), where each set corresponds to one component. The parallel composition of the three learned Mealy machines of Fig. 2(c) does not comply with the original system of Fig. 2(b). Hence, the teacher returns a counter-example. For example, \({ CE}=ca\) is a counter-example as this string generates the output 01 in Fig. 2(b) but the output sequence in Fig. 2(c) is 00. As the action a has two outputs 0 and 1, we remove it from the set of synchronizing actions, i.e., \({ sync}=\{b,c\}\). The two actions c and a are both involved in the minimal counter-example. A counter-example is minimal if none of its prefixes is a counter-example; in this way, a minimal counter-example has no extra actions and consists only of a sequence of actions that leads to generating different outputs. Hence, we conclude that c and a must belong to one component, and we add the synchronizing action c to the set including a, i.e., \(\{a,c\}\), \(\{b\}\) (details about the partition and merge methods can be found in Section 3). Then, we restart the learning process, which leads to the Mealy machines in Fig. 2(d). As the learning process continues, the teacher returns a counter-example \({ CE}=cb\), which generates the output 01 in Fig. 2(b) but the output sequence 00 in Fig. 2(d). As the action b has two outputs 0 and 1, we remove it from the set of synchronizing actions, which results in \({ sync}=\{c\}\). We conclude that b and c must belong to one component as the counter-example includes both actions b and c. However, we do not merge the input sets \(\{a,c\}\) and \(\{b\}\), as a and b may belong to two distinct components synchronized on c. Instead, we add c to the set \(\{b\}\), which results in \(\{a,c\}\) and \(\{b,c\}\) (Fig. 2(e)). The synchronous parallel composition of the components in Fig. 2(e) is equal to Fig. 2(b). Hence, the teacher does not return a counter-example, and the original system's components are learned.
In the following, we summarize our main contributions:
  • We introduce a compositional algorithm for parallel synchronous automata. The learner knows neither the number of automata, nor their individual alphabets, nor the synchronizing alphabet (Section 3).
  • We provide a theoretical foundation for the proposed algorithm, proving that it terminates and that the components are learned correctly (Section 4).
  • We evaluate the performance of the proposed algorithm on two sets of benchmarks, synthetic and realistic. For the synthetic systems, we generate random components that synchronize according to patterns reflecting real-world network topologies. For the realistic system, we use a benchmark based on an industrial automotive system (Section 5).

2 Preliminaries and Background

In this section, we first give some preliminaries about the notion of a Mealy machine as the underlying system model, the parallel composition and decomposition of Mealy machines, and the concept of (in)dependent actions. Then, we recall the basic concepts of active automata learning and the L\(^*\) algorithm.

2.1 Mealy Machines

Mealy machines are generalizations of finite automata in which the transitions produce outputs:
Definition 1
(Mealy Machine). A Mealy machine is a six-tuple \((S, s_0, I,O, \delta , \lambda )\):
  • S is a finite set of internal states, \(s_0\in S\) is the initial state,
  • I is a non-empty set of actions, representing the input alphabet,
  • O is a non-empty set of outputs,
  • \(\delta : S \times I \rightarrow S\) is a state transition function,
  • \(\lambda : S \times I \rightarrow O\) is an output function.
A Mealy machine starts in the initial state \(s_0\) and accepts a sequence of actions, producing an equally-sized sequence of outputs. For \(s,s' \in S\), \(i\in I\), and \(o\in O\), we write \(\delta (s, i) = s'\) to denote that the next state of s on input i is \(s'\), and \(\lambda (s, i) = o\) to denote that the output upon receiving input i in state s is o.
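To make the definition concrete, the following is a minimal Python sketch of a Mealy machine; the two-state transition tables are invented for illustration only, since Fig. 1 itself is not reproduced here.

```python
# A Mealy machine per Definition 1: delta and lambda as dictionaries.
from dataclasses import dataclass

@dataclass
class Mealy:
    init: str
    delta: dict  # (state, input) -> next state
    lam: dict    # (state, input) -> output

    def run(self, word):
        """Feed an input sequence from the initial state; return the output sequence."""
        s, outs = self.init, []
        for a in word:
            outs.append(self.lam[(s, a)])
            s = self.delta[(s, a)]
        return outs

# Hypothetical two-state machine over {a, c} (transition details assumed).
M1 = Mealy(
    init="s0",
    delta={("s0", "a"): "s1", ("s0", "c"): "s0",
           ("s1", "a"): "s0", ("s1", "c"): "s1"},
    lam={("s0", "a"): 0, ("s0", "c"): 0,
         ("s1", "a"): 1, ("s1", "c"): 0},
)
print(M1.run("aca"))  # [0, 0, 1]
```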

2.2 (De)Composing Mealy Machines

In this section, we first define the synchronizing actions and the synchronous parallel composition of Mealy machines, and then explain how a Mealy machine can be decomposed into smaller Mealy machines. In a parallel composition of Mealy machines, we assume that whenever two machines share a common action symbol, this symbol is a synchronizing action and the Mealy machines must synchronize on it. This is essentially similar to CSP-style multi-way communication [25], where multiple machines synchronize on a common action name. To prevent non-deterministic behavior, as explained in Section 1, we strictly require that each synchronizing action has a unique output.
Definition 2
(Synchronizing Action). For n Mealy machines \(M_i=(S_i,s_{0_i},I_i,O_i,\delta _i,\lambda _i)\), with \(i\in \{1,\ldots , n\}\), an action u is a synchronizing action iff:
  • There exists \(M_j\) and \(M_k\) such that \(u\in I_j\cap I_k\) and for every \(s\in S_j\) and \(s'\in S_k\), \(\lambda _j(s,u)=\lambda _k(s',u)\).
  • In any \(M_j\), there are no states \(s,s'\in S_j\) such that \(\lambda _j(s,u)\ne \lambda _j(s',u)\).
We assume any action shared among some Mealy machines is a synchronizing action. An action that is not shared between any Mealy machines is a local action.
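As a sketch, the check of Definition 2 can be phrased directly over transition tables; the representation is an assumption here, namely that each machine is given as a dict mapping (state, input) to (next state, output).

```python
# u is synchronizing iff it is shared by at least two machines and every
# occurrence of u, in any machine, produces the same output.

def outputs_of(machine, u):
    return {out for (s, a), (nxt, out) in machine.items() if a == u}

def is_synchronizing(machines, u):
    sharing = [m for m in machines if any(a == u for (s, a) in m)]
    if len(sharing) < 2:
        return False  # u occurs in at most one machine: a local action
    outs = set().union(*(outputs_of(m, u) for m in sharing))
    return len(outs) == 1  # unique output everywhere
```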
Definition 3
(Synchronous Parallel Composition). Given n Mealy machines \(M_i=(S_i,s_{0_i},I_i,O_i,\delta _i,\lambda _i)\), with \(i\in \{1,\ldots , n\}\), their synchronous parallel composition, denoted by \(\parallel _{i=1}^n M_i\), is defined as \((S,s_{0},I,O ,\delta ,\lambda )\) where \(S=S_1\times \ldots \times S_n\), \(s_0 = (s_{0_1},\ldots ,s_{0_n})\), \(I=\bigcup _{i\in \{1,\ldots , n\}} I_i\), \(O=\bigcup _{i\in \{1,\ldots , n\}} O_i\), \(\lambda ((s_1,\ldots ,s_j,\ldots ,s_n),u)=\lambda _j(s_j,u)\) for any j with \(u\in I_j\) (well-defined by the unique-output assumption), and \(\delta \) is defined as follows:
  • For each \(j\in \{1,\ldots ,n\}\), the local action u can be performed independently if \(u\in I_j\) and there does not exist \(k\in \{1,\ldots ,n\}\), such that \(j\ne k\) and \(u\in I_k\). In this case, \(\delta \) is defined as
    $$\begin{aligned} \delta ((s_1,\ldots ,s_j,\ldots ,s_n),u)= (s_1,\ldots ,\delta _j(s_j,u),\ldots ,s_n) \end{aligned}$$
  • The synchronizing action u can be performed by the Mealy machines that include u. For each \(j\in \{1,\ldots ,n\}\), we define \(\delta '_j(s_j,u)\) to denote the next state of \(s_j\) after performing the synchronizing action u. If \(u\in I_j\) then \( \delta '_j(s_j,u)=\delta _j(s_j, u)\), and if \(u\not \in I_j\) then \(\delta '_j(s_j,u)=s_j\). In this case, \(\delta \) is defined as
    $$\begin{aligned} \delta ((s_1,\ldots ,s_j,\ldots ,s_n),u)= (\delta '_1(s_1,u),\ldots ,\delta '_j(s_j,u),\ldots ,\delta '_n(s_n,u)) \end{aligned}$$
As the output of a synchronizing action is the same in all Mealy machines, the last condition implies that the output of an action in the composed Mealy machine is the same as its output in the individual Mealy machine.
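A sketch of the two-machine case of Definition 3 follows, again assuming each machine is a dict mapping (state, input) to (next state, output) together with its input alphabet; the assertion makes the output-determinism assumption explicit, and the general n-machine case folds this construction.

```python
# Synchronous parallel composition of two complete Mealy machines (Definition 3, n = 2).

def sync_parallel(m1, i1, init1, m2, i2, init2):
    comp, init = {}, (init1, init2)
    frontier, seen = [init], {init}
    while frontier:
        s1, s2 = frontier.pop()
        for u in i1 | i2:
            if u in i1 and u in i2:            # synchronizing action: both move
                (n1, o1), (n2, o2) = m1[(s1, u)], m2[(s2, u)]
                assert o1 == o2, "output-determinism assumption violated"
                nxt, out = (n1, n2), o1
            elif u in i1:                      # local to machine 1
                n1, out = m1[(s1, u)]
                nxt = (n1, s2)
            else:                              # local to machine 2
                n2, out = m2[(s2, u)]
                nxt = (s1, n2)
            comp[((s1, s2), u)] = (nxt, out)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return comp, init
```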
Next, we define how a Mealy machine can be decomposed into smaller Mealy machines, and then define (in)dependent actions and simple Mealy machines.
Definition 4
(Projection of a Mealy Machine [14]). The projection of a Mealy machine \(M = (S, s_0, I,O, \delta , \lambda )\) on a set of inputs \(I' \subseteq I\), denoted by \(P(M, I')\), is a Mealy machine \((S, s_0, I',O', \delta ', \lambda ')\), where
  • \(\delta '(s, a)=\delta (s, a)\) for \(a\in I', {s\in S}\),
  • \(\lambda '(s, a) = \lambda (s, a)\) for \(a\in I', {s\in S},\) and
  • \(O' = \{o\in O\mid \exists \, a\in I', \exists \, s \in S\cdot \lambda (s, a) = o\}\).
Example. Consider two Mealy machines in Fig. 2(a) where alphabets are \(\Sigma _1=\{a,c\}\) and \(\Sigma _2=\{b,c\}\). The shared action c is a synchronizing action, and a and b are the local actions. The synchronizing action c has a unique output 0 in both Mealy machines. The synchronous parallel composition of these two Mealy machines is denoted in Fig. 2(b). The projection of a Mealy machine in Fig. 2(b) on the input alphabets \(\{a\}\), \(\{b\}\), and \(\{c\}\) are represented in Fig. 2(c).
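Under the same dictionary representation as in the sketches above, the projection of Definition 4 is a one-line filter:

```python
# P(M, I'): keep only transitions labeled by actions in the sub-alphabet.

def project(machine, sub_alphabet):
    return {(s, a): v for (s, a), v in machine.items() if a in sub_alphabet}
```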
Definition 5
(Independent Cover [14]). Consider a Mealy machine M with a set of inputs I. The subsets \(I_1,\ldots , I_n \subseteq I\) form an independent cover of I when, for any sequence of inputs \({u'}\), \(\lambda _{P(M,I_1)\parallel \ldots \parallel P(M,I_n)}({u'}) =\lambda _{M}({u'})\).
Intuitively, an independent cover represents individual components where some components are synchronized on shared actions.
Definition 6
(Independent Actions). Consider a set of inputs I, and \(\{I_1,\ldots ,I_i,\ldots ,I_j,\ldots ,I_n\}\) an independent cover of I (\(i\ne j\)). Two actions u and v where at least one of them is non-synchronizing are independent if \(u\in I_i\) and \(v\in I_j\).
Definition 7
(Dependent Actions). Consider a set of inputs I, and \(\{I_1,\ldots ,I_i,\ldots ,I_n\}\) an independent cover of I. Two actions u and v where at least one of them is a non-synchronizing action, are dependent if \(u\in I_i\) and \(v\in I_i\).
Definition 8
(Simple Mealy Machine). A Mealy machine M is simple if it consists of dependent actions, and no Mealy machines exist like \(M_1\) and \(M_2\) such that \(M=M_1\parallel M_2\).
Example. The cover \(\{\{a\},\{b\},\{c\}\}\) in Fig. 2(c) is not an independent cover, as \(\lambda _{M}(aca)=001\) but \(\lambda _{P(M,\{a\})\parallel P(M,\{b\})\parallel P(M,\{c\})}({aca})=000\). The two actions a and b in Fig. 2(a) are independent as they belong to two different subsets of \(\{\{a,c\},\{b,c\}\}\), and a and c are dependent as they belong to the same set of \(\{\{a,c\},\{b,c\}\}\). Each of the two Mealy machines in Fig. 2(a) is a simple Mealy machine, as it is not the synchronous parallel composition of other machines and its actions are dependent on each other.

2.3 Model Learning

Active model learning [1] is a technique for learning the behavior of a black-box system by formulating a hypothesis \(\mathscr {H}\) about the behavior of an SUL as a finite state machine. The learner can pose two types of queries to a teacher who knows the SUL: (1) membership queries, where the learner asks for the SUL's output on a sequence of actions, and (2) equivalence queries (EQ), where the learner proposes a hypothesis \(\mathscr {H}\) about the “language” of the SUL and asks whether \(\mathscr {H}\) and the SUL are equivalent. The learner organizes the information received from the teacher in an observation table:
Definition 9
(Observation Table [1]). An observation table is a triple (S, E, T), where \(S \subseteq I^*\) is a prefix-closed set of input strings, \(E\subseteq I^+\) is a suffix-closed set of input strings, and T can be seen as a table. The rows of T are labeled by elements from \(S\cup (S\cdot I)\), where \(S\cdot I\) represents the set of all extensions of prefixes obtained by appending a single symbol from I to each element in S; the columns are labeled by elements from E; and T(s, e) is the SUL's output suffix for the input sequence \(s\cdot e\), the string formed by concatenating the prefix s with the suffix e.
The \(L^*\) algorithm starts with S containing only the empty word \(\epsilon \) and E equal to the input alphabet I. To construct a hypothesis \(\mathscr {H}\), the observation table must have two properties, namely closedness and consistency. An observation table is closed iff for all \(w\in S\cdot I\) there is a \(w' \in S\) such that \(T(w, e) = T(w', e)\) for all \(e \in E\). An observation table is consistent iff for all \(s_1, s_2 \in S\) with \(T(s_1, e) = T(s_2, e)\) for all \(e\in E\), it holds that \(T(s_1\cdot \alpha , e) = T(s_2\cdot \alpha , e)\) for all \(\alpha \in I\) and \(e \in E\).
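As a sketch of these two checks, assume the table T is a dict mapping (prefix, suffix) pairs of strings to outputs, with S, E, and I given as sets of strings (so concatenation is string concatenation):

```python
def row(T, w, E):
    """The row of prefix w: its outputs for every suffix, in a fixed order."""
    return tuple(T[(w, e)] for e in sorted(E))

def is_closed(T, S, E, I):
    rows_S = {row(T, s, E) for s in S}
    return all(row(T, s + a, E) in rows_S for s in S for a in I)

def is_consistent(T, S, E, I):
    for s1 in S:
        for s2 in S:
            if s1 != s2 and row(T, s1, E) == row(T, s2, E):
                if any(row(T, s1 + a, E) != row(T, s2 + a, E) for a in I):
                    return False
    return True
```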
Queries are posed until these two properties hold, and a hypothesis \(\mathscr {H}\) is formulated. Then, the teacher returns either a counter-example (CE) denoting the non-conformance, or ‘yes’ if \(\mathscr {H}\) is equivalent to the SUL. If a counter-example is found, it is incorporated into the observation table by adding its prefixes/suffixes, and the updated table is again checked for closedness and consistency. The L\(^*\) algorithm is guaranteed to eventually produce a hypothesis \(\mathscr {H}\) equivalent to the SUL.
As we aim to learn the system in terms of its synchronous components separately, we need a projection operator that removes all transitions not labeled by the projected alphabet. The proposed algorithm learns a black-box system with respect to its projection on the actions of each component.
Definition 10
(L\(^*\)with Projected Actions [14]). Given a Mealy machine \(M = (S, s_0, I,O, \delta , \lambda )\) and \(I' \subseteq I\), \(L^*(M, I')\) returns \(P(M, I')\) by running algorithm L\(^*\) with the projected action \(I'\) on M.

3 Synchronous Compositional Learning Algorithm

In this section, we introduce an extension of the CL\(^*\) algorithm [14], named the Synchronous Compositional Learning Algorithm (SCL\(^*\)). In [14], components have no communication and are not synchronized with each other. In this paper, we compositionally learn an unknown system \(M = \parallel _{i=1}^n M_i\) consisting of n parallel synchronous components, where neither each component's input alphabet nor the synchronizing actions are known beforehand; we learn such information on-the-fly.
We initially assume that each component has an input alphabet and that the set of synchronizing actions \( sync\) equals the union of the input alphabets of all components. The main idea behind learning the synchronizing actions in SCL\(^*\) is to observe the outputs of actions. By Definition 2, the outputs of synchronizing actions are unique; hence, if the algorithm discovers different outputs for an action u, it concludes that u cannot be a synchronizing action and removes u from \( sync\). The set of the input alphabets of components \(I^F = \{I_1,\dots , I_n\}\) is a cover of the total system's input alphabet. The goal is to find a cover \(I^F\) where each \(I_i\in I^F\) corresponds to a component i. When \(I_i\) and \(I_j\) have a non-empty intersection, they synchronize on the shared actions. To reach such a cover, we start with an initial cover of singleton sets containing one action each. Assume that the teacher returns a counter-example \(u_1\ldots u_m\) in response to an equivalence query. For each pair of actions \(u_\ell \in I_i\) and \(u_{r}\in I_j\) such that \(u_\ell \) does not belong to \( sync\), if \(I_1,\ldots ,I_n\) does not form an independent cover according to Definition 5, i.e., \(\lambda _{P(M,I_1)\parallel \ldots \parallel P(M,I_n)}(u_{\ell }u_{r}) \ne \lambda _{M}(u_{\ell }u_{r})\), we conclude that \(u_{\ell }\) and \(u_{r}\) are dependent and must belong to one input set. We merge \(I_i\) and \(I_j\) if \(u_{r}\) also does not belong to \( sync\). If \(u_{r}\) belongs to \( sync\), we add \(u_{r}\) to \(I_i\), which includes \(u_{\ell }\). The reason is that if \(u_{r}\) is actually a synchronizing action, then the components i and j must synchronize on \(u_{r}\); hence, \(u_{r}\) must belong to both input sets \(I_i\) and \(I_j\). As the algorithm continues, if we find that \(u_{r}\) is not actually a synchronizing action, we merge \(I_i\) and \(I_j\). We will prove (Proposition 2) that the counter-example \(u_1\ldots u_m\) contains at least one non-synchronizing action, as the Mealy machine is deterministic and the outputs of synchronizing actions are unique; therefore, a sequence of synchronizing actions always generates the same sequence of outputs in the SUL and the learned model.
After updating \(I^F\), we learn the system with the projected alphabet for each \(I_k\in I^F\) and compute the product of the obtained components via synchronous parallel composition. This process is iterated until the result is equivalent to the original system and the teacher returns no counter-example.
Algorithm 1
Synchronous Compositional Learning Algorithm (\( SCL^*\))
In the following, we present the SCL\(^*\) algorithm in detail; it learns a system consisting of synchronous components, each of which can be modeled by a simple Mealy machine. To this end, we first introduce some definitions used in the algorithm.
Definition 11
(LearnSyncInParts). The LearnSyncInParts function gets \(M = (S, s_0, I,O, \delta , \lambda )\) and the cover \(I^F =\{I_1,\dots , I_n\}\) of I and returns the synchronous parallel composition of the learned components.
$$\begin{aligned} { LearnSyncInParts}(M,I^F)=L^*(M,I_1)\parallel \dots \parallel L^*(M,I_n) \end{aligned}$$
Definition 12
(DependentSets). The function \( DependentSets\) gets a counter-example CE, an input cover \(I^F\), and a synchronizing set \( sync\), and returns the input sets from \(I^F\) that contain at least one non-synchronizing action of CE:
$${ DependentSets}({ CE},I^F,{ sync})= \{I_j\mid I_j\in I^F,\ \exists i\cdot { CE}[i]\not \in { sync}\wedge { CE}[i]\in I_j \},$$
where \({ CE}[i]\) refers to the \(i^{th}\) symbol of CE.
Example. If \(I^F=\{\{a,b\},\{c,d\},\{e,f\}\}\), \({ sync}=\{b,d\}\), and \({ CE}=bce\), then \({ DependentSets}({ CE},I^F,{ sync})=\{\{c,d\},\{e,f\}\}\).
Definition 13
(Merge). The function \({ Merge}\) gets two covers \(I^{F}_1\) and \(I^{F}_2\), and updates \(I^{F}_1\) according to the sets in \(I^{F}_2\). To this end, it removes the sets of \(I^{F}_2\) from \(I^{F}_1\) and adds their union to \(I^{F}_1\): \({ Merge}(I^{F}_1,I^{F}_2)= (I^{F}_1\setminus \ I^{F}_2)\cup \{\bigcup _{I_j\in I^{F}_2} I_j\}\)
Example. If \(I^{F}_1=\{\{a\},\{b\},\{c,d\}\}\), \(I^{F}_2=\{\{b\},\{c,d\}\}\) then \({ Merge}(I^{F}_1,I^{F}_2)=\{\{a\},\{b,c,d\}\}\)
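Both helper functions have direct set-level sketches in Python; frozensets are used so that a cover can itself be a Python set, which is an implementation choice, not part of the paper's definitions. The printed results reproduce the two worked examples above.

```python
def dependent_sets(ce, cover, sync):
    """Input sets containing at least one non-synchronizing action of CE (Def. 12)."""
    return {I_j for I_j in cover if any(u not in sync and u in I_j for u in ce)}

def merge(cover1, cover2):
    """Replace the sets of cover2 inside cover1 by their union (Def. 13)."""
    return (cover1 - cover2) | {frozenset().union(*cover2)}

cover = {frozenset("ab"), frozenset("cd"), frozenset("ef")}
print(dependent_sets("bce", cover, sync={"b", "d"}))
# {frozenset({'c','d'}), frozenset({'e','f'})} -- matching the example above
print(merge({frozenset("a"), frozenset("b"), frozenset("cd")},
            {frozenset("b"), frozenset("cd")}))
# {frozenset({'a'}), frozenset({'b','c','d'})}
```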
Algorithm 1 shows the pseudo-code for learning a system compositionally, including components that synchronize on some actions. The algorithm is called with the singleton cover \(I^F\) of the alphabet I and the original system M, i.e., if the input alphabet is \(I = \{u_1, u_2, \dots , u_n\}\), then the initial cover of the alphabet is \(I^F = \{\{u_1\}, \{u_2\},\dots , \{u_n\}\}\). Initially, on line 1, we set the synchronizing actions, i.e., sync, to the union of all input alphabets in \(I^F\). Then, the LearnSyncInParts method, on line 3, learns each of the components given the corresponding alphabet set using the algorithm \(L^*\) and returns the synchronous composition of the learned components. Based on the learned components, we update the hypothetical synchronizing actions sync. By our assumption, if a synchronizing action u has different outputs in \(\mathscr {H}\), then u cannot be a synchronizing action, and hence we remove u from sync, on lines 4-6. Otherwise, we store the corresponding output of each synchronizing action in the mapping OutSync, on line 7. This mapping maintains the set of tuples consisting of actions and their corresponding unique outputs. On line 8, we ask the teacher whether the hypothesis \(\mathscr {H}\) is equivalent to M. If the teacher returns “yes” for this equivalence query, the algorithm terminates and returns \(\mathscr {H}\). Otherwise, the counter-example is processed by the procedure ProcessCE in Algorithm 2, and another iteration of the loop is performed until the correct hypothesis \(\mathscr {H}\) is learned, on lines 10-12. The \( sync\) set may also contain non-synchronizing actions with unique outputs; on line 13, such actions are removed from \( sync\) if they are not shared among some sets of \(I^F\).
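The following high-level sketch mirrors this walkthrough; LearnSyncInParts, the output extractor, the equivalence query, and ProcessCE are passed in as hypothetical callables, since they are treated as subroutines here.

```python
def scl_star(M, alphabet, learn_sync_in_parts, outputs_of, equivalence_query, process_ce):
    cover = {frozenset({u}) for u in alphabet}   # singleton initial cover
    sync = set(alphabet)                         # line 1: all actions may synchronize
    out_sync = {}                                # OutSync: action -> its unique output
    while True:
        H = learn_sync_in_parts(M, cover)        # line 3
        for u in list(sync):                     # lines 4-7
            outs = outputs_of(H, u)
            if len(outs) > 1:                    # two outputs: u is not synchronizing
                sync.discard(u)
                out_sync.pop(u, None)
            elif outs:
                out_sync[u] = next(iter(outs))
        ce = equivalence_query(H, M)             # line 8
        if ce is None:
            # line 13: keep only actions actually shared among cover sets
            shared = {u for I in cover for J in cover if I != J for u in I & J}
            return H, cover, sync & shared
        cover, sync = process_ce(ce, cover, sync, out_sync, M)  # lines 10-12
```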
Algorithm 2 is a procedure that updates the global variables \(I^F\) and sync based on the counter-example CE. The function Output(CE, M) returns the corresponding output of the counter-example, on line 2. If the counter-example contains an action u whose output differs from the corresponding output in the mapping OutSync, then u is not a synchronizing action. Hence, we remove u from sync and \( OutSync\), on lines 4-6. Then, we must update the input sets of \(I^F\), as u may appear in different input sets as a synchronizing action. To this end, we find such sets and merge them on lines 7-8 (Proposition 1 explains the necessity of merging such input sets and proves its correctness). After updating the synchronizing actions, we must update \(I^F\) based on the counter-example CE. The DependentSets method, on line 9, extracts the sets of \(I^F\) that are dependent on each other based on the non-synchronizing actions of CE. These sets are merged, together with the synchronizing actions participating in CE, on lines 10-11.
Algorithm 2
Process Counter-Example
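A sketch of this procedure follows, reusing the dependent_sets and merge sketches from Definitions 12 and 13 and a hypothetical global helper output(ce, M) that returns the SUL's output sequence for CE. Since the pseudo-code itself is not reproduced here, one detail is an assumption: sets that become subsumed by the merged set are dropped from the cover, which is what makes the sketch reproduce the running example of Section 1.

```python
def process_ce(ce, cover, sync, out_sync, M):
    outs = output(ce, M)                          # line 2
    for u, o in zip(ce, outs):                    # lines 3-8
        if u in out_sync and out_sync[u] != o:    # u cannot be synchronizing
            sync.discard(u)
            del out_sync[u]
            holders = {I for I in cover if u in I}
            if len(holders) > 1:                  # lines 7-8, Proposition 1
                cover = merge(cover, holders)
    dep = dependent_sets(ce, cover, sync)         # line 9
    if dep:                                       # lines 10-11: merge the dependent
        ce_sync = {u for u in ce if u in sync}    # sets plus CE's sync actions
        merged = frozenset().union(*dep) | ce_sync
        cover = {I for I in cover - dep if not I <= merged} | {merged}
    return cover, sync
```

On the running example, CE = ca first removes a from sync and yields the cover \(\{\{a,c\},\{b\}\}\); CE = cb then removes b and yields \(\{\{a,c\},\{b,c\}\}\), as in Fig. 2(e).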
Proposition 1
Let \(I^F\) be an input cover, and \( sync\) be the synchronizing actions. If two input sets \(I_i,I_j\in I^F\) have a shared action u, and u is detected as a non-synchronizing action, then \(I_i\) and \(I_j\) must be merged.
Proof
According to Algorithm 1, initially all actions belong to \( sync\), and actions with different outputs are removed from \( sync\). Assume that in Algorithm 2, lines 4-6, \(I^F\) includes two input sets \(I_i=\{a_1,\ldots ,a_n,u\}\) and \(I_j=\{b_1,\ldots , b_m,u\}\) while u is detected as a non-synchronizing action. By Definition 7, each set consists of dependent actions; hence, the actions \(a_1,\ldots ,a_n\) are dependent on the non-synchronizing action u, and so are \(b_1,\ldots ,b_m\). As u is a non-synchronizing action, by transitivity of the dependence relation, \(a_1,\ldots ,a_n\) are also dependent on \(b_1,\ldots ,b_m\). As a result, we remove \(I_i\) and \(I_j\) from \(I^F\) and instead add their union \(\{a_1,\ldots ,a_n,b_1,\ldots ,b_m,u\}\) to \(I^F\).    \(\square \)

3.1 Minimum Counter-Examples

We aim to find a counter-example of minimal length, none of whose prefixes is itself a counter-example. If the teacher returns a non-minimal counter-example, some input sets may be merged incorrectly [14]. We extend the algorithm proposed in [14] to minimize the returned counter-example with respect to the synchronizing actions. We iteratively take a counter-example CE and return the smallest prefix of CE that is also a counter-example. To this end, we select the input sets \(I^M\) containing the synchronizing and non-synchronizing actions participating in CE. Then, we iteratively take a subset of \(I^M\), merge its members, and produce a set A. The projection of CE on A is returned as a minimal counter-example if the SUL and the hypothesis model produce different outputs on it.
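A simplified sketch of this minimization follows, assuming hypothetical callables out_sul and out_hyp that return output sequences, and taking the projection of CE on an action set A to mean keeping the symbols of CE that lie in A:

```python
from itertools import combinations

def minimize_ce(ce, involved_sets, out_sul, out_hyp):
    """Return a shortest disagreeing word derived from CE, or CE itself."""
    for r in range(1, len(involved_sets) + 1):
        for subset in combinations(involved_sets, r):
            A = set().union(*subset)            # merge the chosen input sets
            proj = [u for u in ce if u in A]    # project CE on the merged set
            for k in range(1, len(proj) + 1):   # shortest disagreeing prefix
                if out_sul(proj[:k]) != out_hyp(proj[:k]):
                    return proj[:k]
    return ce
```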
In the next section, we prove that the SCL\(^*\) algorithm terminates and the synchronous components are learned correctly.

4 Properties of the SCL\(^*\) Algorithm

To prove the correctness and termination of the SCL\(^*\) algorithm, we first prove that if the algorithm terminates, then the shared actions among input sets in \(I^F\) belong to \( sync\). Then, we prove that a counter-example has at least one non-synchronizing action.
Lemma 1
Let \(I^F\) be an input cover and \( sync\) be the synchronizing actions. If the algorithm SCL\(^*\) terminates in a round where \(I^F\) includes \(I_i, I_j\) as two input sets with a common action u, i.e., \(u\in I_i\cap I_j\), then u belongs to the set of synchronizing actions, i.e., \(u\in sync\).
Proof
We prove this by contradiction. Assume that u exists in two input sets \(I_i\) and \(I_j\) while it does not belong to \( sync\). By Proposition 1, \(I_i\) and \(I_j\) must be merged, as u is found to be a non-synchronizing action. However, this contradicts the assumption that the algorithm terminates in a round where \(I^F\) includes the two distinct input sets \(I_i\) and \(I_j\).    \(\square \)
Proposition 2
Given an input cover \(I^F\) and the synchronizing actions \( sync\), if the teacher returns a counter-example CE, then CE has at least one action like u such that \(u\not \in { sync}\).
Proof
We prove this by contradiction. Assume that \(I^F=\{\{u\},\{v\}\}\) and all actions of CE, i.e., u and v, belong to \( sync\). By Definition 5, the teacher returns a counter-example if \(\lambda _{P(M,\{u\})\parallel P(M,\{v\})}(uv) = \lambda _M(uv)\) does not hold. Two cases are possible: (1) \(\lambda _{M}(uv)\ne \lambda _{P(M,{\{u\}})\parallel P(M,{\{v\}})}({uv})\), or (2) \(\lambda _{P(M,{\{u\}})\parallel P(M,{\{v\}})}({uv})\) is not defined. The first case contradicts the assumption that u and v are synchronizing actions, as the outputs of synchronizing actions must be unique in the system. In the second case, \(\lambda _{M}(uv)\) is defined in M but \(\lambda _{P(M,{\{u\}})\parallel P(M,{\{v\}})}({uv})\) is not. As the Mealy machine is deterministic, this can only happen if an action in CE has different outputs, which is a contradiction.    \(\square \)
The next proposition from [14] indicates that the actions participating in CE belong to different input sets.
Proposition 3
Let \(I^F\) be an input cover and CE be a counter-example. If CE has at least two actions u and v, then u and v do not belong to the same input set of \(I^F\).
As the actions of CE belong to different input sets, we prove that different input sets of \(I^F\) must be merged if and only if they include non-synchronizing actions participating in CE.
Proposition 4
Let \(I^F\) be an input cover, \( sync\) be the set of synchronizing actions, and CE be a counter-example. Two input sets \(I_i,I_j\in I^F\), where \(|I_i|>1\), \(|I_j|>1\), must be merged iff CE has at least two actions \(u,v\not \in sync\) such that \(u\in I_i\), \(v\in I_j\).
Proof
Proof by contradiction, using Proposition 2.    \(\square \)
Corollary 1
If the counter-example includes a synchronizing action u and a non-synchronizing action v, the synchronizing action \(u\in sync\) must be added to an input set of \(I^F\) including v.
Now, we prove that the size of an input cover \(I^F\) is decreased, and the algorithm SCL\(^*\) will eventually terminate.
Lemma 2
At each round of the algorithm \({ SCL}^*\), if the counter-example CE has at least two non-synchronizing actions, then the size of \(I^F\) is decreased by at least one.
Proof
Proof by using Propositions 1, 3, and 4.    \(\square \)
Theorem 1
The Synchronous Compositional Learning Algorithm terminates.
Proof
There are three cases with respect to the synchronizing actions: (1) If there is no synchronizing action among components, the proposed algorithm reduces to the algorithm CL\(^*\) [14]. As CL\(^*\) eventually terminates, our algorithm also terminates. (2) If all actions are synchronizing, then by Proposition 2, the teacher does not return a counter-example. As a result, the SUL is equivalent to the learned model, and the algorithm terminates. (3) If some actions are synchronizing, then some sets in the input cover \(I^F=\{I_1,\ldots ,I_n\}\) intersect on synchronizing actions. By Lemma 2, if the returned counter-example has at least two non-synchronizing actions, then the size of \(I^F\) decreases. Furthermore, by Proposition 1, if a shared action u is found to be non-synchronizing, then the sets including u are removed from \(I^F\) and their union is added to \(I^F\), so the size of \(I^F\) also decreases. If after k rounds the size of the input cover \(I^F\) no longer decreases, then the counter-example does not include at least two non-synchronizing actions; in this case, only synchronizing actions are added to other input sets as the algorithm continues. The alphabet and the set of synchronizing actions are finite; hence, the algorithm terminates after finitely many rounds.    \(\square \)
In the following, we prove that if the algorithm terminates, then \( sync\) only includes synchronizing actions.
Lemma 3
Let \(I^F\) be an input cover and \( sync\) be the synchronizing actions. If the algorithm \({ SCL}^*\) terminates, then \( sync\) includes only the synchronizing actions.
Proof
Proof by using Lemma 1 and Proposition 4.    \(\square \)
Now, we prove that the algorithm eventually learns the compositional model including the synchronous components correctly.
Theorem 2
Given an input cover \(I^F\), the SCL\(^*\) Algorithm learns the compositional model including the synchronous components correctly.
Proof
Proof by contradiction using Theorem 1, Lemma 2 and Lemma 3.    \(\square \)

5 Evaluation and Experimental Results

In this section, we first check how practical our assumption is on some real benchmarks. To this end, we performed a statistical analysis on a number of benchmarks\(^1\) and checked what percentage of actions have the same output throughout the model. Figure 3 shows that on average 46 percent of actions have unique outputs in these benchmarks, so our assumption is practical.
Fig. 3:
Percentage of actions with unique outputs in some benchmarks
We also evaluate the performance of our algorithm on real and synthetic benchmarks. To this end, we implemented the SCL\(^*\) algorithm\(^2\) on top of the LearnLib framework [26], a library for automata learning. We compare the SCL\(^*\) algorithm with L\(^*\) by evaluating the number of resets and input symbols. The number of resets denotes how many times a finite state machine must be reset and brought back to a known state during the execution of queries [14]. This parameter is an important factor in learning practical systems, as resets are time-consuming [27]. The number of input symbols is the total number of input symbols posed across all queries. This parameter reflects the total cost of learning a model, as the queries used in model learning have many different lengths; the total number of input symbols is therefore a more accurate metric for comparing learning algorithms than the number of queries [28].
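As an illustration of how these two metrics can be collected, one might wrap the SUL in a counting adapter; the reset/step interface below is hypothetical (LearnLib's actual interfaces differ), and the sketch shows only the bookkeeping.

```python
class CountingSUL:
    """Wraps an SUL object exposing reset() and step(symbol); counts both metrics."""
    def __init__(self, sul):
        self.sul = sul
        self.resets = 0   # how many times the SUL returns to its initial state
        self.symbols = 0  # total input symbols posed across all queries

    def reset(self):
        self.resets += 1
        self.sul.reset()

    def step(self, symbol):
        self.symbols += 1
        return self.sul.step(symbol)
```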
We need a set of benchmarks to evaluate the performance of the SCL\(^*\) algorithm with respect to the two metrics. We use realistic and synthetic benchmarks. For the realistic benchmarks, we choose the Body Comfort System (BCS) [23], an automotive software product line (SPL) of a Volkswagen Golf model. For the synthetic benchmarks, we generate components randomly with different numbers of states and various alphabets. We compose three to nine components for both benchmarks using synchronous parallel composition (Definition 3). The compositional model has a minimum of 100 and a maximum of 30000 states. We do not evaluate the metrics for fewer than 100 states, since the advantage of compositional learning is not significant for small systems; in addition, most models of the realistic benchmark do not have more than 30000 states. We repeat each experiment three times and run the experiments on a quad-core machine (Intel i5-1135G7, 2.42 GHz) with 16 GB RAM.
Fig. 4:
Total number of resets in SCL\(^*\) and L\(^*\) (realistic benchmarks)

5.1 Realistic Benchmarks

We evaluate the performance of the SCL\(^*\) algorithm on a realistic system called the Body Comfort System (BCS). This system is an automotive software product line of a Volkswagen Golf model. The model has 27 components with three types of connectors: input, output, and internal. The internal connectors denote interactions among components, and the input/output connectors constitute the communication with the environment [29]. As Mealy machines are the underlying model of SCL\(^*\), we use the finite state machines of these components as constructed in [30] and consider the internal connectors as the synchronizing actions.
Fig. 5:
Total number of input symbols in SCL\(^*\) and L\(^*\) (realistic benchmarks)
Fig. 4 and Fig. 5 show how the number of states and components affects the number of resets and input symbols. The plots denote the executed tests, where red plots correspond to the execution of L\(^*\) and green plots to the execution of SCL\(^*\). The right-hand side of both figures shows the effect of compositional learning when the number of parallel components increases while the number of states remains fixed. The performance of the two algorithms is almost the same when the number of states is small; as the number of states increases, the compositional learning algorithm SCL\(^*\) scales better.

5.2 Synthetic Benchmarks

We developed a test case generator, inspired by [12], to generate the synthetic benchmark. This tool generates components in the form of Mealy machines, where each Mealy machine has a random number of states between two and five. The components communicate synchronously according to four topologies (Fig. 6) that reflect real-world network topologies [31]. We randomly assign alphabets to the components such that each component has at least one non-synchronizing action in addition to its synchronizing actions.
Fig. 6:
Synchronization topologies in a synthetic benchmark
In Fig. 6, each topology denotes how components can synchronize, where the vertices represent the components and the edges represent the shared synchronizing actions. In the point-to-point topology, each synchronizing action is shared between exactly two components, and not all components necessarily synchronize. In the mesh topology, all components synchronize with each other, while in the star topology, all components synchronize with a single component. In the ring topology, components synchronize in a circular arrangement, each with its two neighbors on the circle. For concreteness, a sketch of the four topologies as edge sets follows; the exact generator used in our tool may differ, so this is illustrative only.
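```python
# Edge (i, j) means components i and j share a synchronizing action.

def topology_edges(kind, n):
    if kind == "point-to-point":   # disjoint pairs; some components may stay local
        return [(i, i + 1) for i in range(0, n - 1, 2)]
    if kind == "ring":             # each component syncs with its two neighbors
        return [(i, (i + 1) % n) for i in range(n)]
    if kind == "star":             # every component syncs with a single hub
        return [(0, i) for i in range(1, n)]
    if kind == "mesh":             # all pairs synchronize
        return [(i, j) for i in range(n) for j in range(i + 1, n)]
    raise ValueError(f"unknown topology: {kind}")
```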
Fig. 7 and Fig. 8 indicate how the number of states and components affects the number of resets and input symbols. As with the realistic benchmarks, the colored plots denote the executed tests. We use different colors for the various topologies: green for point-to-point, blue for ring, orange for star, and red for mesh. In addition, dashed lines correspond to the execution of L\(^*\) and solid lines to the execution of SCL\(^*\).
Fig. 7:
Total number of resets in SCL\(^*\) and L\(^*\) (synthetic benchmarks)
The results show that SCL\(^*\) is more scalable than L\(^*\) as the number of states increases. For the mesh topology, we cannot evaluate the algorithms on a system with more than seven components (27000 states) due to memory exhaustion. As with the realistic benchmarks, the right-hand side of both figures shows the effect of compositional learning when the number of parallel components increases while the number of states remains fixed.
Fig. 8:
Total number of input symbols in SCL\(^*\) and L\(^*\) (synthetic benchmarks)
6 Related Work

This work builds on the rich literature of active automata learning, which aims to find the model of a black-box system by posing queries and iteratively building a hypothesis. These approaches extend the L\(^*\) algorithm to learn richer models of a system, either monolithically or compositionally. For instance, in [10, 32, 33] the underlying system is learned monolithically. In [10], an extension of L\(^*\) to pomset automata is introduced, which learns concurrent computations. A monolithic approach is introduced in [32] to learn asynchronously communicating finite state machines. In [33], an extension of L\(^*\) for learning finite automata is introduced that assumes an incomplete teacher who does not know the answers to all queries; such queries are answered using an SMT solver.
The compositional approaches, which are closest to our work, learn a parallel system by learning each component individually. In [14], an extension of L\(^*\) is introduced where individual components are learned without any knowledge of the number of components or their individual alphabets. In [14], the components cannot synchronize, while our algorithm also considers synchronization among components. The algorithm of [13] learns the parallel interleaving of Moore machines with decomposable outputs; each component can be learned by the L\(^*\) algorithm, and the outputs of the components are explicitly specified. In our work, there are dependencies between components in that some components must synchronize, and we identify the components and assign outputs to them on-the-fly. In [11], an algorithm to learn synchronous parallel labeled transition systems is introduced; however, it assumes a priori knowledge of the dependencies among actions, whereas we do not know the synchronizing actions and learn them on-the-fly. In [12], a synchronous compositional learning algorithm is introduced that knows the alphabets of the components, whereas we have no knowledge of the component alphabets or the synchronizing actions.

7 Conclusion

In this paper, we introduced a synchronous compositional learning algorithm as an extension of the CL\(^*\) algorithm presented in [14]. In this work, components can synchronize on shared actions. We assumed that each synchronizing action has a unique output in the system to maintain the output determinism of the compositional model. The proposed algorithm learns the models of the individual components. We proved that our algorithm eventually terminates and learns the models of the individual components correctly. We used two sets of benchmarks to evaluate the performance of our algorithm: a synthetic benchmark, where random components are generated and synchronized based on patterns of real-world network topologies, and a realistic benchmark of an automotive product line model. We showed that as the number of states and components in a system increases, the proposed algorithm is more scalable than L\(^*\). In the future, we aim to relax the restriction on the outputs of synchronizing actions, so that a synchronizing action may have different outputs across the system.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
References
19.
22. U. S. C. Braunstein, J. Peleska, et al., A SysML test model and test suite for the ETCS ceiling speed monitor, Work Package 4 OETCS/WP4/CSM 01/00, University of Bremen (2014).
23. S. Lity, R. Lachmann, M. Lochau, I. Schaefer, Delta-oriented software product line test models - the Body Comfort System case study, Technical Report 2012-07 (2012).
29. S. Lity, R. Lachmann, M. Lochau, I. Schaefer, Delta-oriented test case prioritization for integration testing of software product lines, in: Proceedings of the 19th International Conference on Software Product Line, ACM (2015). https://doi.org/10.1145/2791060.2791073
Metadata
Title: Compositional Learning for Synchronous Parallel Automata
Authors: Mahboubeh Samadi, Aryan Bastany, Hossein Hojjat
Copyright Year: 2025
DOI: https://doi.org/10.1007/978-3-031-90900-9_6