Zum Inhalt
Erschienen in:

Open Access 2025 | OriginalPaper | Buchkapitel

Learning Real-Time One-Counter Automata Using Polynomially Many Queries

verfasst von : Prince Mathew, Vincent Penelle, A. V. Sreejith

Erschienen in: Tools and Algorithms for the Construction and Analysis of Systems

Verlag: Springer Nature Switzerland

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Das Kapitel vertieft sich in das komplizierte Problem der Identifizierung von Modellen aus vorgegebenen Datensätzen, eine Herausforderung, die in Bereichen wie formaler Verifikation und maschinellem Lernen weit verbreitet ist. Es wird das Konzept deterministischer Echtzeit-Ein-Zähler-Automaten (DROCA) eingeführt, bei denen es sich um endliche Zustandsmaschinen handelt, die mit einem nicht negativen Integerzähler angereichert werden und die die Erkennung bestimmter nicht regelmäßiger kontextfreier Sprachen ermöglichen. Die Autoren stellen einen neuen Ansatz für aktives Lernen von DROCA vor, indem sie einen SAT-Löser einsetzen, um das NP-Hard-Problem zu lösen, einen minimalen deterministischen endlichen Automaten (DFA) zu finden. Diese Methode reduziert die Anzahl der erforderlichen Abfragen erheblich und macht sie effizienter als bestehende Techniken. Das Kapitel behandelt auch die Äquivalenz von gegensynchronisiertem DROCA und stellt Algorithmen zur Überprüfung von Äquivalenz und Gegensynchronisation zur Verfügung. Experimentelle Ergebnisse zeigen die Überlegenheit der vorgeschlagenen Methode gegenüber bestehenden Algorithmen und unterstreichen ihre praktische Anwendbarkeit und Effizienz. Das Kapitel schließt mit Vorschlägen für zukünftige Forschung, einschließlich der Verbesserung minimaler Trennmethoden zur Identifizierung von DFA und der Erweiterung des Algorithmus, um allgemeinere Modelle zu erlernen.
Hinweise
This document contains numerous links to enhance its usability. Terms and concepts defined within the document are directly linked to their definitions as hyperlinks.

1 Introduction

The problem of identifying a model from a given dataset is an area of interest in various fields of computer science, like formal verification and machine learning. However, inferring the right model from labelled samples is challenging. For instance, finding a minimal separating dfa  – a dfa that accepts a given set of positive samples and rejects a given set of negative samples – is known to be \(\textsf{NP}\)-complete [9]. Angluin [2] introduced an active learning framework called \(\textsc {L}^*\), involving a learner and a teacher. The learner can ask membership and equivalence queries to the teacher. Angluin showed that dfa can be learnt in polynomial time with respect to the size of the minimal dfa.
In this paper, we are interested in active learning of a deterministic real-time one-counter automaton (droca). These are finite-state machines equipped with a non-negative integer counter that can be incremented or decremented on reading an input symbol. The counter adds expressive power, enabling a droca to recognise certain non-regular context-free languages (e.g., \(\{a^nb^n\mid n>0\}\)). However, this added power also introduces significant challenges in learning.
Our contribution. We introduce the notion of counter-synchronous drocas  – two drocas are counter-synchronous if, for any word, the counter value reached on reading that word is the same on both machines. Given two drocas with \(\textsf{K}\) states, we give an \(\mathcal {O}(\textsf{K}^6)\) time1 algorithm that solves the following two problems (see Theorem 3): (1) check if they are counter-synchronous, and (2) check whether they are equivalent, if they are counter-synchronous. For visibly one-counter automata (droca where the input alphabet determines the counter-actions), we have devised an even faster \(\mathcal {O}(\alpha (\textsf{K}^3)\textsf{K}^3)\) algorithm for checking equivalence (see Theorem 4).
The main result of this paper is a novel approach for active learning of drocas. However, the active learning framework differs from that introduced by Angluin in a few crucial aspects (see Section 4). Similar to the work by Bruyère et al. [4], we use an additional query called counter value query. This allows the learner to ask for the counter value reached on reading a word in the droca. Furthermore, the learner has access to a minimal synchronous-equivalence query on the droca. The teacher returns true for an equivalence query if the learnt droca is counter-synchronous and equivalent to the teacher’s droca. Otherwise, it returns a minimal word that violates this property.
In this framework, we give an algorithm that learns a minimal counter-synchronous droca. A key innovation in our approach is the use of a SAT solver for solving the \(\textsf{NP}\)-hard problem of finding a minimal separating dfa from a set of positive and negative samples. The solver, in conjunction with a modified version of \(\textsc {L}^*\), learns a characteristic dfa (see Definition 6). Subsequently, we use this characteristic dfa to construct a minimal counter-synchronous droca. Our algorithm requires only a polynomial number of queries to the SAT solver and the teacher. Consequently, our algorithm is in \(\mathsf {P^{NP}}\).
Justification for using a SAT solver. We argue that unless \(\mathsf {P=NP}\), learning a minimal visibly one-counter automaton (voca) cannot be done in polynomial time. This follows from the fact that minimisation of voca can be reduced to learning a minimal voca. Furthermore, It was pointed out by Michaliszyn and Otop [15] that minimising voca is \(\textsf{NP}\)-complete. This follows from the result by Gauwin et al.[8] that minimising vpda is \(\textsf{NP}\)-complete.
Comparison with existing methods. From a complexity theoretical perspective, drocas can be learned with polynomial space and exponential time with a straightforward brute-force approach. This method entails enumerating all conceivable drocas, starting with a one-state droca, and submitting equivalence queries for each. This approach, without a doubt, entails an exponential number of equivalence queries. All existing algorithms for learning drocas, including the algorithm by Fahmy and Roos [7], the algorithm by Bruyère et al. [4], and the algorithm for learning vocas by Neider and Löding [17] require exponential time and an exponential number of queries with respect to the number of states of a minimal droca recognising the language. All these algorithms share the idea of learning the initial portion of an infinite behavioural graph and then seek to identify a repetitive structure in it. However, in the worst-case scenario, this repetitive structure becomes apparent only after learning an exponentially large portion of the graph. In this case, the learnt droca will be exponentially large. Consequently, learning this exponential-sized behaviour necessitates exponentially many queries. Moreover, the equivalence queries also run on these exponentially large drocas, making it even more infeasible. Bruyère et al. [4] were the first to pursue a practical learning application of learning drocas. However, due to the difficulty in checking the equivalence of drocas, they had to use a weaker form of equivalence query that checks for counter-examples up to some random counter value. Their equivalence check might say two non-equivalent drocas are equivalent if the minimal counter-example is large.
Our approach (MinOCA) differs fundamentally from these existing methods by eliminating the need to observe the automaton’s behaviour up to exponentially large counter values. We propose an algorithm for learning drocas using only a polynomial number of queries. This sets it apart from existing techniques that require exponentially many queries for learning. One significant bottleneck in learning drocas is the equivalence test by the teacher. Given two drocas with number of states less than some \(\textsf{K}\in \mathbb {N}\), the equivalence check takes \(\mathcal {O}(\textsf{K}^{26})\) time2. This is impractical for real-world applications. To mitigate this, we use the synchronous-equivalence check that runs in \(\mathcal {O}(\textsf{K}^6)\) time. We obtain significantly smaller counter-examples while using this equivalence check. Our equivalence queries are also on models whose size is less than or equal to a minimal counter-synchronised droca. Furthermore, unlike existing techniques that learn exponentially large drocas, our algorithm always learns an equivalent counter-synchronous droca with the minimal number of states.
Experiments. We evaluate an implementation of our algorithm and compare the results obtained with the existing technique by Bruyère et al. [4]. Experiments were conducted on randomly generated drocas with number of states ranging from 2 to 15 and the input alphabet size varying from 2 to 5. The results indicate that the proposed method outperforms the existing one [4].
The remainder of this paper is organised as follows: Section 2 gives the definitions and preliminaries. Section 3 gives equivalence results of counter-synchronised drocas. Section 4 details our learning algorithm for drocas, Section 5 covers the implementation details and presents our experimental results. Finally, Section 6 summarises our work and suggests future research directions.

2 Definitions and Preliminaries

For any finite set S, |S| denotes its cardinality. Non-negative numbers are denoted by \(\mathbb {N}\), and [ij] denote the interval \(\{i, i+1, \ldots , j\} \subseteq \mathbb {N}\). For any \(d\in \mathbb {N}\), the sign of d (denoted by \(sign(d)\)) is defined as \(sign(d)=0\) if \(d=0\) and is 1 otherwise. Let \(w=a_0a_1a_2\ldots a_n \in \varSigma ^*\). For \(j,k\in [0,n]\), with \(j<k\), we use w[j] to denote the letter \(a_j\) and \(w_{[j \cdots k]}\) to denote the factor \(a_j a_{j+1}\cdots a_{k}\).
Definition 1
A deterministic real-time one-counter automaton (droca) \(\mathcal {A} = (Q,\varSigma , q_0,\delta _0,\delta _1,F)\), where Q is a finite nonempty set of states, \(\varSigma \) is the input alphabet, \(q_0\in Q\) is the initial state, \(\delta _0: Q \times \varSigma \rightarrow Q \times \{0,+1\}\) and \(\delta _1: Q \times \varSigma \rightarrow Q \times \{0,+1,-1\}\) are the transition functions, and \(F\subseteq Q\) is the set of final states.
We use \(|\mathcal {A} |\) to denote the size of \(\mathcal {A} \), which we consider to be |Q|. A configuration \(\texttt {c}\) of a droca is a pair \((q, n)\in Q \times \mathbb {N}\), where \(q\in Q\) denotes the current state and \(n\in \mathbb {N}\) is the counter value. The configuration \(c_0=(q_0,0)\) is called the initial configuration of \(\mathcal {A} \). Let \(p,q \in Q, n\in \mathbb {N}, e \in \{-1,0,+1\}\) and \(a \in \varSigma \). A transition between two configurations (pn) and \((q, n+e)\) on the symbol a is defined, if \(\delta _{sign(n)}({p},a) = ({q},e)\). We use \((p,n)\xrightarrow {a}(q,n+e)\) to denote this. A run on a word \(w=a_1\dots a_n\) from a configuration \((p_1,m_1)\) is the sequence of transitions \((p_1,m_1) \xrightarrow {a_1} (p_2,m_2) \xrightarrow {a_2} (p_3,m_3) \xrightarrow {a_3} \dots \xrightarrow {a_{n-1}} (p_n,m_n)\) where \(p_i \in Q\) and \(m_i\)s are counter values. In this case, we denote the run by \((p_1,m_1) \xrightarrow {w} (p_n,m_n)\). Note that the counter values always stay non-negative, implying a decrement is not permitted from a configuration with zero counter value.
Let \(q \in Q,\ n\in \mathbb {N}\), and \(w \in \varSigma ^*\) with \((q_0,0)\xrightarrow {w}(q,n)\). We use \({\texttt {ce}}_\mathcal {A} (w)= n\) to denote the counter value reached on reading w from the initial configuration. We call \({\texttt {ce}}_\mathcal {A} (w)\) the counter-effect of w. We denote by \(height_{\mathcal {A}}(w)\) the maximal counter-effect of the prefixes of w in \(\mathcal {A}\). We drop the subscript \(\mathcal {A} \) when the \(\textsc {droca} \) under consideration is evident. \(\mathcal {A} \) is deterministic (resp. complete) if, for any word w, there is at most (resp. at least) one run on w starting from any configuration. We say that a word w is accepted by \(\mathcal {A} \) if and only if \((q_0,0) \xrightarrow {w} (q_f,m)\) for some \(q_f \in F\) and \(m \in \mathbb {N}\). The language of \(\mathcal {A} \), denoted by \(\mathcal {L}(\mathcal {A})\), is the set of all words accepted by \(\mathcal {A} \). For convenience, we use \(\mathcal {A} (w)=1\) if \(w\in \mathcal {L}(\mathcal {A})\) and \(\mathcal {A} (w)=0\) otherwise. Two drocas \(\mathcal {A} \) and \(\mathcal {B} \) are equivalent if \(\mathcal {L}(\mathcal {A})=\mathcal {L}(\mathcal {B})\).
There are no \(\epsilon \)-transitions in a droca. We consider only complete drocas in this paper. Every incomplete droca can be made complete without changing its language by directing all the undefined transitions to a new non-final sink state.
Definition 2
We say that two drocas \(\mathcal {A}\) and \(\mathcal {B}\) are counter-synchronised if for all \(w\in \varSigma ^*\), \({\texttt {ce}}_{\mathcal {A}}(w)=\) \({\texttt {ce}}_{\mathcal {B}}(w)\).

3 Equivalence of Counter-Synchronised DROCAs

The equivalence of drocas was shown to be in \(\textsf{NL}\) by Böhm and Göller [3]. They show that there exists a \(\mathcal {O}(\textsf{K}^{26})\) length word that distinguishes two non-equivalent drocas of size \(\textsf{K}\). However, this is not suitable for practical applications.
We give an efficient \(\mathcal {O}(\textsf{K}^6)\) algorithm to check the equivalence of counter-synchronised drocas. If two drocas are not counter-synchronised, then our algorithm outputs the smallest word for which they reach different counter values. If they are counter-synchronised and not equivalent, then our algorithm outputs the smallest word that is accepted by one and rejected by the other. In both these cases, the length of the counter-example is \(\mathcal {O}(\textsf{K}^5)\).
In Theorem 4, we can treat \(\alpha \) as a constant since \(\alpha (j) \le 4\) for all \(j<2^{1000}\) (see [1]). Hopcroft and Karp [11] showed that equivalence checking of two dfas of size n can be done in \(\mathcal {O}(\alpha (n)n)\) time.
Theorem 3
Let \(\mathcal {A}\) and \(\mathcal {B}\) be drocas where \(|\mathcal {A} |,|\mathcal {B} |\le \textsf{K}\). If \(\mathcal {A}\) is not counter-synchronised and equivalent to \(\mathcal {B}\), then there is a minimal word w where \(|w|\le 2\textsf{K}^5\), \(height_{\mathcal {A}}(w),\ height_{\mathcal {B}}(w)\le \textsf{K}^4\), and either \({\texttt {ce}}_{\mathcal {A}}(w)\ne \) \({\texttt {ce}}_{\mathcal {B}}(w)\) or \(\mathcal {A} (w)\ne \mathcal {B} (w)\). There is an \(\mathcal {O}(\textsf{K}^6)\) time algorithm to output this word if it exists.
sketch
Let \(\mathcal {A}\) and \(\mathcal {B}\) be counter-synchronous but not equivalent. The equivalence of \(\mathcal {A}\) and \(\mathcal {B}\) reduces to reachability in their product droca. From [5], the height of a minimal word that reaches a configuration is \(\textsf{K}^4\) for a droca of size \(\textsf{K}^2\).
Let \(\mathcal {A}\) and \(\mathcal {B}\) be not counter-synchronous and w be the minimal word where \({\texttt {ce}}_{\mathcal {A}}(w)\ne \) \({\texttt {ce}}_{\mathcal {B}}(w)\). Let \(w=w_1a\) for some \(w_1\) and \(a\in \varSigma \). Since w is minimal, \({\texttt {ce}}_{\mathcal {A}}(w^\prime )=\) \({\texttt {ce}}_{\mathcal {B}}(w^\prime )\) for all prefixes of \(w'\) of \(w_1\). The synchronous run of \(w_1\) can be seen as the run of a droca with \(\textsf{K}^2\) states. From [5], \(height_{\mathcal {A}}(w_1) \le \textsf{K}^4\).
In both cases, it suffices to search for words of height less than \(\textsf{K}^4\). We construct dfas of size \(\textsf{K}^5\) corresponding to the configuration graph of \(\mathcal {A}\) and \(\mathcal {B}\) up to counter value \(\textsf{K}^4\). We conclude by doing a dfa equivalence check.    \(\square \)
A visibly one-counter automaton (voca) is a (droca) where the input alphabet is \(\varSigma = (\varSigma _{+}, \varSigma _{-},\varSigma _{0})\). The counter value is incremented (resp. decremented, not changed) on reading a symbol from \(\varSigma _{+}\) (resp. \(\varSigma _{-}\), \(\varSigma _0\)). Note that vocas over the same alphabet are counter-synchronous by definition as the word itself defines the counter value reached. For vocas, we give a faster \(\mathcal {O}(\alpha (\textsf{K}^3)\textsf{K}^3)\) time algorithm for equivalence checking using ideas from [14].
Fig. 1:
The figure shows the synchronous run of a word on two vocas such that it reaches a final state in one voca and a non-final state in the other. The dashed line denotes the part of the synchronous run that can be removed.
Theorem 4
Let \(\mathcal {A}\) and \(\mathcal {B}\) be vocas where \(|\mathcal {A} |,|\mathcal {B} |\le \textsf{K}\). If \(\mathcal {A} \) and \(\mathcal {B} \) are not equivalent, then there is a word w where \(\mathcal {A} (w)\ne \mathcal {B} (w)\), \(|w|\le 4\textsf{K}(\textsf{K}+\textsf{K}^2)\), and \(height_{\mathcal {A}}(w)\le 2(\textsf{K}+\textsf{K}^2)\). An \(\mathcal {O}(\alpha (\textsf{K}^3)\textsf{K}^3)\) algorithm can find this w if it exists.
Proof
Let \(\mathcal {A} =(P,\varSigma , p_{\iota },\delta ^1_0,\delta ^1_1,F_1)\) and \(\mathcal {B} =(Q,\varSigma , q_{\iota },\delta ^2_0,\delta ^2_1,F_2)\) be two non-equivalent vocas. Let \(P=\{p_1,\ldots , p_{|\mathcal {A} |}\}\) and \(Q=\{q_1,\ldots , q_{|\mathcal {B} |}\}\). For any word w, \({\texttt {ce}}_{\mathcal {A}}(w)={\texttt {ce}}_{\mathcal {B}}(w)\). Let \({\texttt {ce}}(w)\) denote this value. Similarly, let height(w) denote \(height_{\mathcal {A}}(w) = height_{\mathcal {B}}(w)\). For states \(p_i \in P\) and \(q_j \in Q\), we define the row vector \(\boldsymbol{\textbf{x}}_{(p_i,q_j)} \in \{0,1\}^{|\mathcal {A} |+|\mathcal {B} |}\) as: \(\boldsymbol{\textbf{x}}_{(p_i,q_j)}[k] = 1\) if and only if \(k=i\) or \(k= |\mathcal {A} |+j\). We also define the row vector \(\boldsymbol{\mathbf {\eta }}\in \{-1,0,1\}^{|\mathcal {A} |+|\mathcal {B} |}\) where \(\boldsymbol{\mathbf {\eta }}[k] = 1\) if \(k \le |\mathcal {A} |\) and \(p_k \in F_1\), \(\boldsymbol{\mathbf {\eta }}[k] = -1\), if \(k > |\mathcal {A} |\) and \(q_{k-|\mathcal {A} |} \in F_2\), and \(\boldsymbol{\mathbf {\eta }}[k]=0\) otherwise. Therefore, \(\boldsymbol{\textbf{x}}_{(p, q)}{\boldsymbol{\mathbf {\eta }}}^\top \ne 0\), if exactly one among p and q is a final state.
We consider the synchronous run of \(\mathcal {A} \) and \(\mathcal {B} \). A configuration pair of this synchronous run is \((\boldsymbol{\textbf{x}}_{(p,q)}, n)\) where \(p \in P, q \in Q\), and \(n \in \mathbb {N}\). The initial configuration pair is \(c_{\iota } = (\boldsymbol{\textbf{x}}_{(p_{\iota },q_{\iota })}, 0)\). Given two configuration pairs \(c_1=(\boldsymbol{\textbf{x}}_{(p,q)},n)\) and \(c_2=(\boldsymbol{\textbf{x}}_{(p',q')},m)\), \(c_1 \xrightarrow {u} c_2\) denotes \((p,n) \xrightarrow {u} (p',m)\) and \((q,n) \xrightarrow {u} (q',m)\). The transition matrix on u from \(c_1\) is \(\mathbb {M}\in \{0,1\}^{(|\mathcal {A} |+|\mathcal {B} |)^2}\) where \(\mathbb {M}[i,j]=1\) if and only if \((p_i,n) \xrightarrow {u} (p_j,m)\) or \((q_{i-|\mathcal {A} |},n)\xrightarrow {u}(q_{j-|\mathcal {A} |},m)\). Hence, \(\boldsymbol{\textbf{x}}_{(p,q)} \mathbb {M}= \boldsymbol{\textbf{x}}_{(p',q')}\). Claims 1 to 3 follow.
Claim 1
For any \(p,p' \in P\), \(q,q' \in Q\) and word w, the transition matrix of w from \((\boldsymbol{\textbf{x}}_{(p,q)},n)\) is the same as the transition matrix of w from \((\boldsymbol{\textbf{x}}_{(p',q')},n)\).
Claim 2
Let w be such that \({\texttt {ce}}(w') > 0\) for all prefixes \(w'\) of w. Then, the transition matrix of w from \((\boldsymbol{\textbf{x}}_{(p,q)},n)\) is the same as the transition matrix of w from \((\boldsymbol{\textbf{x}}_{(p',q')},m)\) for any \(p,p' \in P\) and \(q,q' \in Q\) and \(m,n >0\).
Claim 3
Any set of \(|\mathcal {A} |^2+|\mathcal {B} |^2+1\) transition matrices is linearly dependent.
Let w be a minimal word such that \(\mathcal {A} (w)\ne \mathcal {B} (w)\). Let \(c_{\ell }=(\boldsymbol{\textbf{x}}_{\ell },n_{\ell })\) be such that \(c_{\iota } \xrightarrow {w} c_{\ell }\). Hence, \(\boldsymbol{\textbf{x}}_{\ell } {\boldsymbol{\mathbf {\eta }}}^\top \ne 0\). Note that there is no configuration pair that repeats during the run of w. Otherwise, we can remove the run between the repetitions to get a shorter word. This will contradict the minimality of w.
Claim 4
\(height(w)\le n_\ell +|\mathcal {A} |^2+|\mathcal {B} |^2\).
Proof
Assume for contradiction that \(height(w)=m\) and \(m> n_\ell + |\mathcal {A} |^2+|\mathcal {B} |^2\). There is a factorisation of \(w = w_1w_2\) such that \(c_{\iota }\xrightarrow {w_1} (\vec{z},m) \xrightarrow {w_2}c_{\ell }\). For an \(i \in [n_{\ell },m]\), let \((\boldsymbol{\textbf{y}}_i,i)\) (resp. \((\boldsymbol{\textbf{y}}_i',i)\)) be the configuration pair where the counter value i is encountered for the last (resp. first) time during the run of \(w_1\) (resp. \(w_2\)). Let \(\mathbb {A}_i\) and \(\mathbb {B}_i\) denote the transition matrices where \(\boldsymbol{\textbf{y}}_i \mathbb {A}_i= \boldsymbol{\textbf{y}}_i'\) and \(\boldsymbol{\textbf{y}}_i' \mathbb {B}_i= \boldsymbol{\textbf{x}}_{\ell }\). Let \(w = x_iy_iz_i\) where \(c_{\iota }\xrightarrow {x_i} (\boldsymbol{\textbf{y}}_i,i) \xrightarrow {y_i} (\boldsymbol{\textbf{y}}_i',i) \xrightarrow {z_i}c_{\ell }\). See Figure 1. Consider the matrices \(\mathbb {A}_{m-1},\mathbb {A}_{m-2},\ldots , \mathbb {A}_{n_\ell }\) in order. From Claim 3, there exists \(t \in [n_\ell , m-1]\) where \(\mathbb {A}_t\) is linearly dependent on \(\mathbb {A}_{m-1},\cdots ,\mathbb {A}_{t+1}\). Since \(\boldsymbol{\textbf{y}}_t \mathbb {A}_t \mathbb {B}_t {\boldsymbol{\mathbf {\eta }}}^\top \ne 0\), \(\boldsymbol{\textbf{y}}_t (r_{t+1}\mathbb {A}_{t+1}+\cdots +r_{m-1}\mathbb {A}_{m-1} )\mathbb {B}_t {\boldsymbol{\mathbf {\eta }}}^\top \ne 0\) for integers \(r_{t+1},\ldots ,r_{m-1}\). Hence there is a \(j > t\) where \(\boldsymbol{\textbf{y}}_t \mathbb {A}_j \mathbb {B}_t {\boldsymbol{\mathbf {\eta }}}^\top \ne 0\). We show \(x_ty_jz_t\) is accepted exactly by one of \(\mathcal {A} \) and \(\mathcal {B} \) contradicting minimality of w. It suffices to show \(c_{\iota } \xrightarrow {x_t} (\boldsymbol{\textbf{y}}_t,t) \xrightarrow {y_j} (\boldsymbol{\textbf{y}}_t \mathbb {A}_j,t) \xrightarrow {z_t} (\boldsymbol{\textbf{y}}_t \mathbb {A}_j \mathbb {B}_t,n_{\ell })\). From Claim 2, the transition matrix of \(y_j\) from \((\boldsymbol{\textbf{y}}_t,t)\) is \(\mathbb {A}_j\) and from Claim 1, the transition matrix of \(z_t\) from \((\boldsymbol{\textbf{y}}_t\mathbb {A}_j, t)\) is \(\mathbb {B}_t\).    \(\square \) \(_{Claim: 4}\)
Claim 5
\(n_\ell \le |\mathcal {A} |+|\mathcal {B} |\).
Proof
Assume for contradiction that \(n_\ell >|\mathcal {A} |+|\mathcal {B} |\). For \(i\in [1,n_\ell ]\), let \((\boldsymbol{\textbf{z}}_i,i)\) denote the configuration pair where counter value i is encountered the last time during the run of w. Let \(w=x_iz_i\) where \(c_{\iota }\xrightarrow {x_i}(\boldsymbol{\textbf{z}}_i,i)\xrightarrow {z_i}c_{\ell }\) and \(\mathbb {C}_i\) be such that \(\boldsymbol{\textbf{z}}_{i} \mathbb {C}_i= \boldsymbol{\textbf{x}}_{\ell }\). From the fundamental theorem of vector spaces, there exists \(t\le (|\mathcal {A} |+|\mathcal {B} |)+1\) such that \(\boldsymbol{\textbf{z}}_t\) is a linear combination of \(\boldsymbol{\textbf{z}}_1, \ldots , \boldsymbol{\textbf{z}}_{t-1}\). Since \(\boldsymbol{\textbf{z}}_t \mathbb {C}_t {\boldsymbol{\mathbf {\eta }}}^\top \ne 0\), there exists \(j<t\) such that \(\boldsymbol{\textbf{z}}_{j} \mathbb {C}_t {\boldsymbol{\mathbf {\eta }}}^\top \ne 0\). From Claim 2, the transition matrix of \(z_t\) from \((\boldsymbol{\textbf{z}}_j,j)\) is \(\mathbb {C}_t\). The word \(x_jz_t\) contradicts the minimality of w.    \(\square \) \(_{Claim: 5}\)
Let \(\textsf{K}\in \mathbb {N}\) such that \(|\mathcal {A} |,|\mathcal {B} |<\textsf{K}\). From Claim 4 and Claim 5, we get that \(height(w)\le 2(\textsf{K}+\textsf{K}^2)\). Similar to that in Theorem 3, we reduce the problem of finding the distinguishing word to equivalence check of dfas of size \(\mathcal {O}(\textsf{K}^3)\).    \(\square \)

4 Learning DROCAs

In this section, we give an \(\textsc {L}^*\)-like algorithm for active learning of drocas using polynomially many queries with the help of a SAT solver. In this framework, we have a learner and a teacher. The learner aims to construct a droca that recognises the same language as the teacher’s droca (call it \(\mathcal {A} \)). The teacher answers the following types of queries by the learner.
  • membership queries MQ\(_\mathcal {A} \): the learner provides a word \(w\in \varSigma ^*\). The teacher returns 1 if \(w\in \mathcal {L}(\mathcal {A})\), and 0 if \(w\not \in \mathcal {L}(\mathcal {A})\).
  • counter value queries CV\(_\mathcal {A} \): the learner asks the counter value reached on reading word w. The teacher returns the counter value. i.e., \({\texttt {ce}}_{\mathcal {A}}(w)\).
  • minimal synchronous-equivalence queries MSQ\(_\mathcal {A} \): the learner asks whether a droca \(\mathcal {C} \) is equivalent and counter-synchronous to \(\mathcal {A} \). The teacher returns yes if \(\mathcal {C} \) and \(\mathcal {A} \) are counter-synchronous and equivalent. Otherwise, the teacher provides a minimal counter-example \(z \in \varSigma ^*\) such that \(\mathcal {C} (z) \ne \mathcal {A} (z)\) or \({\texttt {ce}}_{\mathcal {A}}(z) \ne {\texttt {ce}}_{\mathcal {C}}(z)\).
The requirement for the teacher to return a counter-example z such that the counter values reached on reading z in the teachers and learners drocas are different can be removed. But in that case, we have to use the equivalence check for drocas (not necessarily counter-synchronous). The number of queries needed will be polynomial in the length of the minimal counter-example that distinguishes two drocas. Including this additional condition allows the use of the counter-synchronous equivalence check (see Theorem 3), which is much faster. The assumption of the teacher returning a minimal counter-example can be removed if we use partial-equivalence queries. A partial-equivalence query takes two drocas and a limit as inputs and determines whether the two drocas are equivalent for all words whose length does not exceed the limit. If they are not, then it returns a distinguishing word whose length is less than the limit. Neider and Löding [17] and Bruyère et al. [4] used partial-equivalence queries to learn the behaviour of drocas up to a certain counter value. One can simulate the teacher returning a minimal counter-example by a polynomial number of partial-equivalence queries. Similarly, a partial-equivalence query can be replaced with an equivalence query that returns a minimal counter-example.
Table 1:
An observation table of the droca given in Figure 2a recognising the language \(\{a^nb^na \mid n>0\}\). Here, \({\mathcal {P}}=\{\epsilon , a, ab, aba\}\) and \({\mathcal {S}}=\{\epsilon ,a\}\).
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90643-5_14/MediaObjects/648515_1_En_14_Tab1_HTML.png
Dummy

4.1 Observation Table

Our algorithm maintains an observation table C over the input alphabet \(\varSigma =\{\sigma _1,\sigma _2,\ldots , \sigma _k\}\) for some \(k\in \mathbb {N}\). \(C=({\mathcal {P}},{\mathcal {S}},Memb, \texttt {ce}\upharpoonright _{{\mathcal {P}}\cup {\mathcal {P}}\varSigma }, {Actions})\), where \({\mathcal {P}}\subseteq \varSigma ^*\) is a nonempty prefix-closed set of strings, \({\mathcal {S}}\subseteq \varSigma ^*\) is a nonempty suffix-closed set of strings, \(Memb: ({\mathcal {P}}\cup {\mathcal {P}}\varSigma ) {\mathcal {S}}\rightarrow \{0, 1\}\) is a function that indicates whether words belong to the language, \(\texttt {ce}\upharpoonright _{{\mathcal {P}}\cup {\mathcal {P}}\varSigma }: {\mathcal {P}}\cup {\mathcal {P}}\varSigma \rightarrow \mathbb {N}\) is the function \({\texttt {ce}}\) with domain restricted to the set \({\mathcal {P}}\cup {\mathcal {P}}\varSigma \), and \({Actions}: ({\mathcal {P}}\cup {\mathcal {P}}\varSigma ) {\mathcal {S}}\rightarrow \{0,1\}\times \{0,1,-1\}^k\) is a function representing the sign of the counter value reached and the counter-actions on every letter after reading a word. Given \(w\in ({\mathcal {P}}\cup {\mathcal {P}}\varSigma ) {\mathcal {S}}\), \(Memb(w)\) is equal to 0 (resp. 1) if \(\mathcal {A} (w)\) is equal to 0 (resp. 1), and \({Actions}(w)= (sign({\texttt {ce}}(w)), {\texttt {ce}}(w \sigma _1)-{\texttt {ce}}(w), \ldots , {\texttt {ce}}(w \sigma _k)-{\texttt {ce}}(w))\). Given \(w_1,w_2\in ({\mathcal {P}}\cup {\mathcal {P}}\varSigma ) {\mathcal {S}}\), we say that \({Actions}(w_1)\) is not similar to \({Actions}(w_2)\) if and only if \(sign({\texttt {ce}}(w_1))= sign({\texttt {ce}}(w_2))\) and \({Actions}(w_1)\ne {Actions}(w_2)\). We use \({Actions}(w_1)\not \sim {Actions}(w_2)\) to denote this. The observation table initially has \({\mathcal {P}}={\mathcal {S}}=\{\epsilon \}\) and is augmented as the algorithm runs.
An observation table can be viewed as a two-dimensional array with rows labelled with elements of \({\mathcal {P}}\cup {\mathcal {P}}\varSigma \), columns labelled by elements of S and an additional column labelled \({\texttt {ce}}\). The column \({\texttt {ce}}\) contains the counter value reached on reading the word labelling a row (see Table 1). For any \(p \in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s\in {\mathcal {S}}\), the entry in row p and column s is equal to \((Memb(p s), {Actions}(p s))\) and \(cell(p)\) denotes the finite function \(f_p\) from \({\mathcal {S}}\) to \(\{0,1\} \times \{0,1\}\times \{0,1,-1\}^k\) defined by \(f_p(s)= (Memb(p s), {Actions}(p s))\). We use row(p) to denote \((\texttt {ce}(p), cell(p))\). For \(p,p^\prime \in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \), we say row(p) is equal to \(row(p^\prime )\) (denoted by \(row(p)= row(p^\prime )\)), if \(cell(p)=cell(p^\prime )\) and \(\texttt {ce}(p)=\texttt {ce}(p^\prime )\).
Now, we introduce the notion of d-closed and d-consistent observation tables for any \(d\in \mathbb {N}\). This is similar to the notion of closed and consistency used by Angluin [2], but it also takes into account the counter values.
Definition 5
Let \(({\mathcal {P}},{\mathcal {S}},Memb,\texttt {ce}\upharpoonright _{{\mathcal {P}}\cup {\mathcal {P}}\varSigma },Actions)\) be an observation table and \(d\in \mathbb {N}\).
1.
The observation table is not d-closed if there exist \(p\in {\mathcal {P}}\) and \(a\in \varSigma \) such that \(\texttt {ce}(p a)\le d\) and for all \(p^\prime \in {\mathcal {P}}\), \(row(p a) \ne row(p^\prime )\).
 
2.
The observation table is not d-consistent if there exist \(p,q\in {\mathcal {P}}\) and \(a\in \varSigma \) such that \(\texttt {ce}(p)=\texttt {ce}(q) \le d\), \(row(p)= row(q)\), and \(row(p a)\ne row(q a)\).
 
Consider the observation table given in Table 1. This table is 1-closed but not 2-closed. This is because of the presence of the words aaba and bb in \({\mathcal {P}}\varSigma \). The given table is trivially 1-consistent as there are no equal rows in \({\mathcal {P}}\).

4.2 Constructing a DROCA from an Observation Table

We introduce the notion of a characteristic dfa. Given an alphabet \(\varSigma \), we define the modified alphabet \(\widetilde{\varSigma } = \bigcup _{a\in \varSigma }\{ a^0, a^1\}\). For a droca \(\mathcal {A}\), we define a function \({\texttt {Enc}}_{\mathcal {A}}: \varSigma ^* \rightarrow \widetilde{\varSigma }^*\) as follows: For \(w\in \varSigma ^+\), \({\texttt {Enc}}_{\mathcal {A}}(w)= \tilde{w}\), such that for all \(i\in [0,|w|-1]\), \(\tilde{w}[i] = w[i]^{sign({\texttt {ce}}_{\mathcal {A}}({w_{[0\cdots i-1]}}))}\). Also \({\texttt {Enc}}_{\mathcal {A}}(\epsilon )= \epsilon \).
Definition 6
Let \(\mathcal {A} = (Q, \varSigma , q_0, \delta _0, \delta _1, F)\) be a droca. The characteristic dfa \(\mathcal {D} _{\mathcal {A}}\) of \(\mathcal {A} \) over the modified alphabet \(\widetilde{\varSigma }\) is \(\mathcal {D} _{\mathcal {A}}= (Q, \widetilde{\varSigma }, q_0, \delta , F)\) where, for all \(q\in Q\) and \(a\in \varSigma \), \(\delta (q, a^0) =p\) (resp. \(\delta (q, a^1)=p\)) if and only if \(\delta _0(q, a)= (p,c)\) (resp. \(\delta _1(q, a)=(p,c)\)) for some \(p\in Q\) and \(c\in \{0,1,-1\}\).
Figure 2b shows the characteristic dfa over the modified alphabet \(\widetilde{\varSigma }\) corresponding to the droca given in Figure 2a. Note that in a characteristic dfa, the counter information is not present. If we have access to the counter-actions, then we can construct a droca from this characteristic dfa.
Fig. 2:
A droca recognising the language \(\{a^nb^na \mid n>0\}\) is given in Figure 2a. The characteristic dfa corresponding to this droca is shown in Figure 2b.
Constructing a Characteristic DFA Using a SAT Solver: Let \(C=({\mathcal {P}},{\mathcal {S}},Memb,\texttt {ce}\upharpoonright _{{\mathcal {P}}\cup {\mathcal {P}}\varSigma },{Actions})\) be an observation table. We give C as input to a procedure \(\texttt {ConstructAutomaton}\) and obtain a dfa \(\mathcal {D} _C\) over the modified alphabet \(\widetilde{\varSigma }\) as output. The dfa \(\mathcal {D} _C\) satisfies the following two properties:
1.
for all \(p\in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s\in {\mathcal {S}}\), \({\texttt {Enc}}_{\mathcal {A}}(ps)\in \mathcal {L}(\mathcal {D} _C)\) if and only if \(Memb(ps)=1\).
 
2.
for any \(p_1,p_2\in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s_1,s_2\in {\mathcal {S}}\), if the run on \({\texttt {Enc}}_{\mathcal {A}}(p_1s_1)\) and \({\texttt {Enc}}_{\mathcal {A}}(p_2s_2)\) reaches the same state in \(\mathcal {D} _C\) then \({Actions}(p_1s_1)\) is similar to \({Actions}(p_2s_2)\).
 
We outline the operations performed by the function \(\texttt {ConstructAutomaton}\). First, we create two sets of words, Pos and Neg. For any \(p\in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s\in {\mathcal {S}}\), we add \({\texttt {Enc}}_{\mathcal {A}}(ps)\) to Pos (resp. Neg) if and only if \(Memb(ps)=1\) (resp. 0). The dfa \(\mathcal {D} _C\) will accept all words in Pos and reject all words in Neg. This ensures condition 1. Observe that not all words over \(\widetilde{\varSigma }\) correspond to encodings of words over \(\varSigma \). To ensure condition 2, we add words over a larger alphabet \(\widetilde{\varSigma } \cup Operations\), where \(Operations=\{Actions(w)\mid w=ps \text { for some } p\in {\mathcal {P}}\text { and }s\in {\mathcal {S}}\}\). For any \(p\in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s\in {\mathcal {S}}\), we add \({\texttt {Enc}}_{\mathcal {A}}(p s)\cdot {Actions}(p s)\) to Pos and for all \(op\in Operations\) where \({Actions}(p s)\not \sim op\), we add \({\texttt {Enc}}_{\mathcal {A}}(p s)\cdot op\) to Neg. We find the minimal separating dfa for the sets Pos and Neg. The transitions labelled by the letters from Operations are removed from this minimal separating dfa to obtain \(\mathcal {D} _C\).
Given an observation table, the sets Pos and Neg can be constructed in polynomial time. Every state of \(\mathcal {A} \) can add two elements to the set Operations – one corresponding to the counter-actions on reading letters from counter value zero and the other for reading letters from positive counter value. Hence, the cardinality of this set is at most \(2|\mathcal {A} |\). In our implementation, we use the algorithm by Dell’Erba et al. [6] that uses a SAT solver to find a minimal separating dfa. However, any algorithm that finds a minimal separating dfa suffices [10, 12, 16]. In the next subsection, we observe that \(\mathcal {D} _C\) is a characteristic dfa.
Constructing a DROCA from a Characteristic DFA: Let the input alphabet be \(\varSigma = \{\sigma _1,\ldots ,\sigma _k\}\) and \(C=({\mathcal {P}}, {\mathcal {S}}, Memb,\texttt {ce}\upharpoonright _{{\mathcal {P}}\cup {\mathcal {P}}\varSigma },{Actions})\) be an observation table over \(\varSigma \). The following lemma states that we can construct a droca \(\mathcal {B} _C\) from C such that it agrees with the observation table C. The idea is to use the observation table to assign counter-actions to transitions of \(\mathcal {D} _C\).
Lemma 7
Given an observation table C, we can construct a droca \(\mathcal {B} _C\) with \(|\mathcal {B} _C|\le |\mathcal {A} |\) such that for all \(p\in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s\in {\mathcal {S}}\), \(\mathcal {B} _C(ps)=Memb(ps)\) and \({\texttt {ce}}_{\mathcal {B} _C}(ps)={\texttt {ce}}_{\mathcal {A}}(ps)\).
Proof
Let the function \(\texttt {ConstructAutomaton}\) for input C return a dfa \(\mathcal {D} _C= (Q, \widetilde{\varSigma }, q_0, \delta , F)\). For all \(p\in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s\in {\mathcal {S}}\cup {\mathcal {S}}\varSigma \), we can find \({\texttt {ce}}(ps)\) from C. We define the droca \(\mathcal {B} _C= (Q, \varSigma , q_0, \delta _0, \delta _1, F)\) where \(\delta _0\) and \(\delta _1\) are specified as follows. For all \(q\in Q\), \(a\in \varSigma \), \(\delta _0(q, a)= (\delta (q, a^0),c)\) for some \(c\in \{0,1\}\), if there exists \(p\in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s\in {\mathcal {S}}\) such that \(\mathcal {D} _C\) on reading \({{\texttt {Enc}}_{\mathcal {A}}(ps)}\) reaches the state q with \({\texttt {ce}}(ps)=0\) and \({\texttt {ce}}(psa)=c\). Similarly, for all \(q\in Q\), \(a\in \varSigma \), \(\delta _1(q, a)= (\delta (q, a^1),c)\) for some \(c\in \{0,1,-1\}\), if there exists \(p\in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s\in {\mathcal {S}}\) such that \(\mathcal {D} _C\) on reading \({{\texttt {Enc}}_{\mathcal {A}}(ps)}\) reaches the state q with \({\texttt {ce}}(ps)>0\) and \({\texttt {ce}}(psa)=c\). If there are any transitions that do not have a counter-action, it means that there is no word in the observation table that took that transition. One can assign any arbitrary counter-action for these transitions.
We know that for any word \(w, w^\prime \in \varSigma ^*\), if \(\mathcal {D} _C\) on reading w and \(w^\prime \) reaches the same state, then \({Actions}(w)\) is similar to \({Actions}(w^\prime )\). In our construction, we are assigning counter-actions to transitions from a state in \(\mathcal {B} _C\) based on the counter-actions of words that reach that state in \(\mathcal {D} _C\). Since all words that reach a state have similar counter-actions, this assignment will be consistent. By construction, for all \(p\in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s\in {\mathcal {S}}\), \({\texttt {ce}}_{\mathcal {B} _C}(ps)= {\texttt {ce}}_{\mathcal {A}}(ps)\). This also ensures that \({\texttt {Enc}}_{\mathcal {A}}(ps)= {\texttt {Enc}}_{\mathcal {B} _C}(ps)\). Note that, \(\mathcal {D} _C\) on reading \({\texttt {Enc}}_{\mathcal {A}}(ps)\) reaches a final state if and only if \(ps\in \mathcal {L}(\mathcal {A})\). Since \({\texttt {Enc}}_{\mathcal {A}}(ps)= {\texttt {Enc}}_{\mathcal {B} _C}(ps)\), we get that \(\mathcal {B} _C\) reaches a final state on reading ps if and only if \(ps\in \mathcal {L}(\mathcal {A})\). Therefore, for all \(p\in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s\in {\mathcal {S}}\), \(\mathcal {B} _C(ps)=1\) if and only if \(Memb(ps)=1\).    \(\square \)

4.3 MinOCA: The Learning Algorithm

Algorithm 1 learns a droca that is equivalent to an input droca \(\mathcal {A} \). Initially, it sets up an observation table using empty strings and incrementally refines this table to distinguish states of the unknown droca. The process iteratively increases an integer value d and uses membership and counter value queries to construct a d-closed and d-consistent observation table \(C=({\mathcal {P}}, {\mathcal {S}}, Memb,\texttt {ce}\upharpoonright _{{\mathcal {P}}\cup {\mathcal {P}}\varSigma },{Actions})\) (see lines 4 – 14 of Algorithm 1). This part resembles the \(\textsc {L}^*\) algorithm. Making the table d-closed adds new rows to \({\mathcal {P}}\), and making it d-consistent adds new columns to \({\mathcal {S}}\). We construct a droca from a d-closed and d-consistent observation table C using Lemma 7 and ask a minimal synchronous-equivalence query. If the teacher provides a counter-example, then all its prefixes are added to \({\mathcal {P}}\), and the value of d is updated to the height of the counter-example, if it is more than d. The table is then extended until it becomes d-closed and d-consistent. This process continues until the correct droca is learnt.
Algorithm 1
MinOCA: droca learning algorithm.
Analysis: The correctness of Algorithm 1 follows from the fact that whenever the teacher replies yes to a minimal synchronous-equivalence query, the learnt droca recognises the target language. Now, we show that the algorithm terminates after polynomially many queries.
Similar to [17], we define a refined Myhill-Nerode congruence \(\simeq \subseteq \varSigma ^*\times \varSigma ^*\) as follows: for \(u,v\in \varSigma ^*\), \(u\simeq v\) if and only if for all \(z\in \varSigma ^*\), \(\mathcal {A} (uz)=\mathcal {A} (vz)\) and \({\texttt {ce}}_{\mathcal {A}}(uz)={\texttt {ce}}_{\mathcal {A}}(vz)\). For all \(w\in \varSigma ^*\), let [w] denote the equivalence class of w under \(\simeq \). The behaviour graph of a droca is an infinite state automaton induced by this refined congruence \(\simeq \). Each equivalence class under this refined congruence corresponds to a state of the behaviour graph, and the transitions between them are defined based on the equivalence class of the resultant words. i.e., for \(u\in \varSigma ^*\) and \(a\in \varSigma \) there is a transition from the state corresponding to [u] to the state corresponding to [ua] on reading the symbol a. The state corresponding to an equivalence class [u] will be marked as a final state in the behaviour graph if and only if \(\mathcal {A} (u)=1\).
Let \(\simeq |_{C} \) denote the restriction of \(\simeq \) to the entries in the observation table C. i.e., for all \(p,p'\in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s,s'\in {\mathcal {S}}\), \(ps\simeq |_{C}\ p's'\) if and only if for all \(z\in \varSigma ^*\) where both psz and \(p's'z\) are in \(({\mathcal {P}}\cup {\mathcal {P}}\varSigma ){\mathcal {S}}\), \(\mathcal {A} (psz)=\mathcal {A} (p's'z)\) and \({\texttt {ce}}_{\mathcal {A}}(psz)={\texttt {ce}}_{\mathcal {A}}(p's'z)\). Given an observation table C, we can construct a partial behavioural graph \(BG_{C}\) induced by \(\simeq |_{C} \) in a similar fashion. Note that for all \(p\in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s\in {\mathcal {S}}\), the counter value corresponding to the equivalence class reached on reading ps and the membership of ps in \(BG_C\) is the same as the counter value reached and membership of ps in C.
Proposition 1
Given \(p,p'\in {\mathcal {P}}\), if \(p \simeq p^\prime \), then \(row(p)=row(p')\).
In the next proposition, C is an observation table and \(\mathcal {B} _C\) the \(\textsc {droca} \) learnt by the learner from C (in Line 15 of Algorithm 1).
Proposition 2
Let z be a counter-example, with \(height_{\mathcal {A}}(z)= d\), returned by MSQ\(_{\mathcal {A}}(\mathcal {B} _C)\). Let \(C'\) be the observation table obtained by adding the prefixes of z to \({\mathcal {P}}\) in C and making it d-closed and d-consistent. Then the number of distinct rows with counter value less than or equal to d in \(C'\) is more than that of C.
sketch
Let \(z = \) MSQ\(_{\mathcal {A}}(\mathcal {B} _C)\) be a counter-example with \(height_{\mathcal {A}}(z)= d\). Let \(C'\) be the observation table obtained by adding the prefixes of z to \({\mathcal {P}}\) in C and making it d-closed and d-consistent. Assume for contradiction that the number of distinct rows with counter value less than or equal to d in \(C'\) is the same as that of C. From Proposition 1, we get that \(BG_C\) and \(BG_{C'}\) are the same in this case. From Lemma 7 for all \(p\in {\mathcal {P}}\cup {\mathcal {P}}\varSigma \) and \(s\in {\mathcal {S}}\), the counter values reached and membership of ps are the same in \(\mathcal {B} _C\) and C. We also know that the counter value corresponding to the equivalence class reached on reading ps and membership of ps in \(BG_C\) is the same as the counter value reached and membership of ps in C. Since \(BG_{C}= BG_{C'}\), the membership and counter values reached by all prefixes of z in \(\mathcal {A} \) should also match \(\mathcal {B} _C\) contradicting the assumption that z is a counter-example.    \(\square \)
Proposition 3
For any \(d \in \mathbb {N}\), at most \(d \times |\mathcal {A} |\) many counter-examples of height less than or equal to d is returned by the minimal synchronous-equivalence query.
sketch
Fix a \(d' \in \mathbb {N}\). There are at most \(|\mathcal {A} |\) many configurations of \(\mathcal {A} \) with counter value \(d'\). We know that for \(p,p'\in {\mathcal {P}}\), if \(row(p) \ne row(p')\), then \(p \not \simeq p'\). Hence, there are at most \(|\mathcal {A} |\) distinct rows in the observation table with counter value \(d'\). Consequently, there are at most \(d \times |\mathcal {A} |\) distinct rows with counter values \(d' \le d\). The claim now follows from Proposition 2.    \(\square \)
Proposition 4
At most \(|\mathcal {A} |^5+1\) many minimal synchronous-equivalence queries are executed during the run of Algorithm 1.
Proof
Assume for contradiction that more than \(|\mathcal {A} |^5+1\) minimal synchronous-equivalence queries are executed. Hence, more than \(|\mathcal {A} |^5\) counter-examples are returned by these queries. From Proposition 3, it follows that at least one counter-example is of height greater than \(|\mathcal {A} |^4\). However, from Theorem 3, we know that the height of the minimal counter-example that distinguishes two drocas of size \(|\mathcal {A} |\) is at most \(|\mathcal {A} |^4\). This is a contradiction, since by Lemma 7, \(|\mathcal {B} _C| \le |\mathcal {A} |\) for any observation table C during the run of the algorithm.    \(\square \)
Theorem 8
Given a droca \(\mathcal {A} \), a minimal counter-synchronous droca recognising the same language can be learnt with at most \(|\mathcal {A} |^5+1\) queries to the SAT solver, \(|\mathcal {A} |^5+1\) minimal synchronous-equivalence queries and polynomially many membership and counter value queries.
We can adapt the learning algorithm for drocas for learning vocas. A minimal voca can be learnt using at most \(2|\mathcal {A} |^3+1\) equivalence queries from Theorem 4. For vocas, we do not require the counter value queries. Also, there are words that do not have a valid run. i.e., the counter goes below zero during the run. These words will not have a corresponding entry in the observation table and will be treated as don’t care. In this case, Algorithm 1 can be simplified further as we don’t have to create the set Operations to encode the counter actions, as the actions on the counter are already determined by the input alphabet.
Corollary 9
Given a voca \(\mathcal {A} \), a minimal voca recognising the same language can be learnt using at most \(\mathcal {O}(|\mathcal {A} |^3)\) queries to the SAT solver, \(\mathcal {O}(|\mathcal {A} |^3)\) minimal synchronous-equivalence queries and polynomially many membership queries.

5 Implementation

The proposed method, henceforth denoted as MinOCA, was implemented in Python3 and is tested against a set of randomly generated drocas. In this section, we discuss the implementation details and compare MinOCA with the method by Bruyère et al. [4], hereafter referred to as BPS.
Equivalence Query: Even though there is a polynomial time algorithm to check the equivalence of two drocas  [3], the polynomial is so large that it is not suitable for practical applications. In MinOCA, we construct a droca that is counter-synchronised with the droca to be learnt. The equivalence is checked by a breadth-first search on the configuration graph up to the counter value and length obtained from Theorem 3. The minimal synchronous-equivalence query either returns a word for which the constructed droca and droca to be learnt reach different counter values or returns a word that is accepted by one and rejected by the other. In our implementation, after each equivalence query, we increment the value of d and make the table is d-closed and d-consistent.
One major distinction between MinOCA and BPS is that the latter employs an approximate equivalence query, while the former uses an exact equivalence query. This implies that the droca returned by BPS after learning may not be correct. On the other hand, the droca returned by MinOCA is always correct.
Finding a Minimal-Separating DFA: We utilise the Python library DFAMiner by Dell’Erba et al. [6] that uses a SAT solver to find a minimal separating dfa from a given set of positive and negative samples. Various other techniques for computing a minimal separating automaton can be found in [12, 16, 18].
Random Generation of DROCAs: We follow a procedure similar to that by Bruyère et al. [4] to randomly generate drocas with a given number of states.
Let \(n\in \mathbb {N}\) be the number of states of the droca to be generated. The procedure GenerateDROCA used to generate random drocas is as follows. First, we initialise the set of states \(Q=\{q_1,q_2,\ldots , q_n\}\). For all \(q\in Q\), we add q to the set of final states F with probability 0.5. If \(Q=F\) or \(F=\emptyset \) after this step, then we restart the procedure. Otherwise, for all \(q\in Q\) and \(a\in \varSigma \), we assign \(\delta _0(q,a)=(p,c)\) (resp. \(\delta _1 (q,a)=(p,c)\)), with p a random state in Q and c a random counter operation in \(\{0,+1\}\) (resp. \(\{0,+1,-1\}\)). The constructed droca is \(\mathcal {A} =(Q, \varSigma , \{q_1\}, \delta _0,\delta _1, F)\). If the number of reachable states of \(\mathcal {A}\) from the initial configuration is not n, then we discard \(\mathcal {A} \) and restart the whole procedure. Otherwise, we output \(\mathcal {A} \).
Dataset for comparing MinOCA and BPS. A dataset containing a total of 5600 drocas was generated, with the input alphabet size varying from 2 to 5 and the number of states ranging from 2 to 15. For each combination of input alphabet size and number of states, 100 random drocas were generated. The notion of acceptance in BPS is by final state and zero counter value. MinOCA employs the widely accepted notion of acceptance with the final state only. For comparison, we generated random drocas where both acceptance conditions coincide. In order to achieve this, we use the procedure GenerateDROCA with an added condition that mandates that the final states can only be reached by transitions that read a symbol from counter value of zero and do not modify the counter value.
Fig. 3:
Number of successfully learnt languages by MinOCA and BPS.
Fig. 4:
The average length of the longest counter-example.
Fig. 5:
Average number of states in the learnt drocas.
Fig. 6:
Average number of equivalence queries used by MinOCA and BPS.
Experimental Results: We implemented MinOCA in Python and used the Java implementation of BPS. The computations were performed on an Apple M1 chip with 8GB of RAM, running macOS Sonoma Version 14.3. MinOCA spends most of the time in finding a minimal separating dfa. The scalability of our algorithm is hence dependent on the scalability of finding a minimal separating dfa.
A timeout of 5 minutes was allotted for both BPS and MinOCA for learning each droca. If the algorithm times out, we discard that sample and process the next one. The number of languages successfully learned by MinOCA and BPS for different input sizes is depicted in Figure 3. Notice that MinOCA outperforms BPS in terms of the number of successfully learnt drocas.
Figure 4 shows the average length of the longest counter-examples. MinOCA provides a smaller counter-example on average. Figure 5 shows the average number of states in the learnt drocas. The number of states of the automaton learnt by MinOCA is always less than or equal to the input size. Figure 6 shows the average number of equivalence queries used for successfully learning the input drocas. The number of equivalence queries is smaller for MinOCA in all cases.
Fig. 7:
The input droca recognises the language \(\{a^nb^na \mid n>0\}\).
Fig. 8:
The input droca recognises \(\{w\in \{a,b\}^*\mid w \text { does not contain a }b\}\).
In Figures 7 and 8, MinOCA learns a minimal counter-synchronised droca equivalent to the input. However, the droca learnt by BPS is equivalent but is not counter-synchronous with respect to the input. It is also not complete.

6 Conclusion

In this paper, we presented a novel approach for active learning of drocas. We showed that a drocas can be learnt using polynomially many queries with the help of a SAT solver. This is in contrast to the existing techniques that require exponentially many queries. Our algorithm learns a minimal counter-synchronous droca, which results in significantly smaller counter-examples on equivalence queries. Additionally, the development of a specialised equivalence-checking algorithm for counter-synchronous drocas, with a time complexity of \(\mathcal {O}(n^6)\), further contributes to the feasibility of our approach. For vocas, we optimised this equivalence-checking process to \(\mathcal {O}(n^3)\), enabling efficient learning. We evaluated our algorithm against randomly generated drocas. The results indicate that our method significantly outperforms the existing method by Bruyère et al. [4].
In future work, the proposed algorithm can be improved by finding better methods for identifying the minimal separating dfa. The algorithm can also be applied to learn vocas, and the scalability of this approach warrants further investigation. Additionally, extending these ideas to learn more general models, such as visibly pushdown automata, represents a valuable direction for further research. Another open problem is to determine whether active learning of drocas can be done in polynomial time. This problem is open, even in the case of vocas. However, learning a minimal voca cannot be done in polynomial time unless \(\mathsf {P=NP}\). Exploring the possibility of finding a polynomial-time algorithm to obtain a polynomial approximation is worthy of further study.

Acknowledgments

A.V. Sreejith would like to acknowledge the support by SERB for the project “Probabilistic Pushdown Automata” [MTR/2021/000788].
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://​creativecommons.​org/​licenses/​by/​4.​0/​), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Fußnoten
1
For all practical applications, one can consider \(\alpha \) as a constant (see Section 3).
 
2
This polynomial is derived from the algorithm given in [3].
 
3
The implementation of MinOCA, the datasets used, and the complete results generated can be found in the following link: https://​doi.​org/​10.​5281/​zenodo.​14604419 [13].
 
Literatur
3.
Zurück zum Zitat Stanislav Böhm and Stefan Göller. Language equivalence of deterministic real-time one-counter automata is nl-complete. In Filip Murlak and Piotr Sankowski, editors, Mathematical Foundations of Computer Science 2011 - 36th International Symposium, MFCS 2011, Warsaw, Poland, August 22-26, 2011. Proceedings, volume 6907 of Lecture Notes in Computer Science, pages 194–205. Springer, 2011. https://doi.org/10.1007/978-3-642-22993-0_20. Stanislav Böhm and Stefan Göller. Language equivalence of deterministic real-time one-counter automata is nl-complete. In Filip Murlak and Piotr Sankowski, editors, Mathematical Foundations of Computer Science 2011 - 36th International Symposium, MFCS 2011, Warsaw, Poland, August 22-26, 2011. Proceedings, volume 6907 of Lecture Notes in Computer Science, pages 194–205. Springer, 2011. https://​doi.​org/​10.​1007/​978-3-642-22993-0_​20.
4.
Zurück zum Zitat Véronique Bruyère, Guillermo A. Pérez, and Gaëtan Staquet. Learning realtime one-counter automata. In Dana Fisman and Grigore Rosu, editors, Tools and Algorithms for the Construction and Analysis of Systems - 28th International Conference, TACAS 2022, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022, Munich, Germany, April 2-7, 2022, Proceedings, Part I, volume 13243 of Lecture Notes in Computer Science, pages 244–262. Springer, 2022. https://doi.org/10.1007/978-3-030-99524-9_13 . Véronique Bruyère, Guillermo A. Pérez, and Gaëtan Staquet. Learning realtime one-counter automata. In Dana Fisman and Grigore Rosu, editors, Tools and Algorithms for the Construction and Analysis of Systems - 28th International Conference, TACAS 2022, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022, Munich, Germany, April 2-7, 2022, Proceedings, Part I, volume 13243 of Lecture Notes in Computer Science, pages 244–262. Springer, 2022. https://​doi.​org/​10.​1007/​978-3-030-99524-9_​13 .
6.
Zurück zum Zitat Daniele Dell’Erba, Yong Li, and Sven Schewe. Dfaminer: Mining minimal separating dfas from labelled samples. In André Platzer, Kristin Yvonne Rozier, Matteo Pradella, and Matteo Rossi, editors, Formal Methods - 26th International Symposium, FM 2024, Milan, Italy, September 9-13, 2024, Proceedings, Part II, volume 14934 of Lecture Notes in Computer Science, pages 48–66. Springer, 2024. https://doi.org/10.1007/978-3-031-71177-0_4 . Daniele Dell’Erba, Yong Li, and Sven Schewe. Dfaminer: Mining minimal separating dfas from labelled samples. In André Platzer, Kristin Yvonne Rozier, Matteo Pradella, and Matteo Rossi, editors, Formal Methods - 26th International Symposium, FM 2024, Milan, Italy, September 9-13, 2024, Proceedings, Part II, volume 14934 of Lecture Notes in Computer Science, pages 48–66. Springer, 2024. https://​doi.​org/​10.​1007/​978-3-031-71177-0_​4 .
7.
Zurück zum Zitat Amr F. Fahmy and Robert S. Roos. Efficient learning of real time one-counter automata. In Klaus P. Jantke, Takeshi Shinohara, and Thomas Zeugmann, editors, Algorithmic Learning Theory, 6th International Conference, ALT ’95, Fukuoka, Japan, October 18-20, 1995, Proceedings, volume 997 of Lecture Notes in Computer Science, pages 25–40. Springer, 1995. https://doi.org/10.1007/3-540-60454-5_26 . Amr F. Fahmy and Robert S. Roos. Efficient learning of real time one-counter automata. In Klaus P. Jantke, Takeshi Shinohara, and Thomas Zeugmann, editors, Algorithmic Learning Theory, 6th International Conference, ALT ’95, Fukuoka, Japan, October 18-20, 1995, Proceedings, volume 997 of Lecture Notes in Computer Science, pages 25–40. Springer, 1995. https://​doi.​org/​10.​1007/​3-540-60454-5_​26 .
10.
Zurück zum Zitat Marijn Heule and Sicco Verwer. Exact DFA identification using SAT solvers. In José M. Sempere and Pedro García, editors, Grammatical Inference: Theoretical Results and Applications, 10th International Colloquium, ICGI 2010, Valencia, Spain, September 13-16, 2010. Proceedings, volume 6339 of Lecture Notes in Computer Science, pages 66–79. Springer, 2010. https://doi.org/10.1007/978-3-642-15488-1_7 . Marijn Heule and Sicco Verwer. Exact DFA identification using SAT solvers. In José M. Sempere and Pedro García, editors, Grammatical Inference: Theoretical Results and Applications, 10th International Colloquium, ICGI 2010, Valencia, Spain, September 13-16, 2010. Proceedings, volume 6339 of Lecture Notes in Computer Science, pages 66–79. Springer, 2010. https://​doi.​org/​10.​1007/​978-3-642-15488-1_​7 .
11.
Zurück zum Zitat John E. Hopcroft and Richard M. Karp. A linear algorithm for testing equivalence of finite automata. Technical Report, University of California, pages 71–114, 1971. John E. Hopcroft and Richard M. Karp. A linear algorithm for testing equivalence of finite automata. Technical Report, University of California, pages 71–114, 1971.
12.
Zurück zum Zitat Martin Leucker and Daniel Neider. Learning minimal deterministic automata from inexperienced teachers. In Tiziana Margaria and Bernhard Steffen, editors, Leveraging Applications of Formal Methods, Verification and Validation. Technologies for Mastering Change - 5th International Symposium, ISoLA 2012, Heraklion, Crete, Greece, October 15-18, 2012, Proceedings, Part I, volume 7609 of Lecture Notes in Computer Science, pages 524–538. Springer, 2012. https://doi.org/10.1007/978-3-642-34026-0_39 . Martin Leucker and Daniel Neider. Learning minimal deterministic automata from inexperienced teachers. In Tiziana Margaria and Bernhard Steffen, editors, Leveraging Applications of Formal Methods, Verification and Validation. Technologies for Mastering Change - 5th International Symposium, ISoLA 2012, Heraklion, Crete, Greece, October 15-18, 2012, Proceedings, Part I, volume 7609 of Lecture Notes in Computer Science, pages 524–538. Springer, 2012. https://​doi.​org/​10.​1007/​978-3-642-34026-0_​39 .
14.
Zurück zum Zitat Prince Mathew, Vincent Penelle, Prakash Saivasan, and A. V. Sreejith. Weighted one-deterministic-counter automata. In Patricia Bouyer and Srikanth Srinivasan, editors, 43rd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2023, December 18-20, 2023, IIIT Hyderabad, Telangana, India, volume 284 of LIPIcs, pages 39:1–39:23. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2023. https://doi.org/10.4230/LIPICS.FSTTCS.2023.39 . Prince Mathew, Vincent Penelle, Prakash Saivasan, and A. V. Sreejith. Weighted one-deterministic-counter automata. In Patricia Bouyer and Srikanth Srinivasan, editors, 43rd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2023, December 18-20, 2023, IIIT Hyderabad, Telangana, India, volume 284 of LIPIcs, pages 39:1–39:23. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2023. https://​doi.​org/​10.​4230/​LIPICS.​FSTTCS.​2023.​39 .
15.
Zurück zum Zitat Jakub Michaliszyn and Jan Otop. Learning deterministic visibly pushdown automata under accessible stack. In Stefan Szeider, Robert Ganian, and Alexandra Silva, editors, 47th International Symposium on Mathematical Foundations of Computer Science, MFCS 2022, August 22-26, 2022, Vienna, Austria, volume 241 of LIPIcs, pages 74:1–74:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022. https://doi.org/10.4230/LIPICS.MFCS.2022.74 . Jakub Michaliszyn and Jan Otop. Learning deterministic visibly pushdown automata under accessible stack. In Stefan Szeider, Robert Ganian, and Alexandra Silva, editors, 47th International Symposium on Mathematical Foundations of Computer Science, MFCS 2022, August 22-26, 2022, Vienna, Austria, volume 241 of LIPIcs, pages 74:1–74:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022. https://​doi.​org/​10.​4230/​LIPICS.​MFCS.​2022.​74 .
16.
Zurück zum Zitat Daniel Neider. Computing minimal separating DFAs and regular invariants using SAT and SMT solvers. In Supratik Chakraborty and Madhavan Mukund, editors, Automated Technology for Verification and Analysis - 10th International Symposium, ATVA 2012, Thiruvananthapuram, India, October 3-6, 2012. Proceedings, volume 7561 of Lecture Notes in Computer Science, pages 354–369. Springer, 2012. https://doi.org/10.1007/978-3-642-33386-6_28 . Daniel Neider. Computing minimal separating DFAs and regular invariants using SAT and SMT solvers. In Supratik Chakraborty and Madhavan Mukund, editors, Automated Technology for Verification and Analysis - 10th International Symposium, ATVA 2012, Thiruvananthapuram, India, October 3-6, 2012. Proceedings, volume 7561 of Lecture Notes in Computer Science, pages 354–369. Springer, 2012. https://​doi.​org/​10.​1007/​978-3-642-33386-6_​28 .
17.
Zurück zum Zitat Daniel Neider and Christof Löding. Learning visibly one-counter automata in polynomial time. Technical Report, RWTH Aachen, AIB-2010-02, 2010. Daniel Neider and Christof Löding. Learning visibly one-counter automata in polynomial time. Technical Report, RWTH Aachen, AIB-2010-02, 2010.
Metadaten
Titel
Learning Real-Time One-Counter Automata Using Polynomially Many Queries
verfasst von
Prince Mathew
Vincent Penelle
A. V. Sreejith
Copyright-Jahr
2025
DOI
https://doi.org/10.1007/978-3-031-90643-5_14