
Open Access 02.05.2017

Rewriting Higher-Order Stack Trees

Author: Vincent Penelle

Published in: Theory of Computing Systems | Issue 2/2017



Abstract

Higher-order pushdown systems and ground tree rewriting systems can be seen as extensions of suffix word rewriting systems. Both classes generate infinite graphs with interesting logical properties. Indeed, the model-checking problem for monadic second order logic (respectively first order logic with a reachability predicate) is decidable on such graphs. We unify both models by introducing the notion of stack trees, trees whose nodes are labelled by higher-order stacks, and define the corresponding class of higher-order ground tree rewriting systems. We show that these graphs retain the decidability properties of ground tree rewriting graphs while generalising the pushdown hierarchy of graphs.
Notes
This work was partially supported by the French National Research Agency (ANR), through the excellence program Bézout (ANR-10-LABX-58).

1 Introduction

Since Rabin’s proof of the decidability of monadic second order logic (MSO) over the full infinite binary tree Δ2 [18], there has been an effort to characterise increasingly general classes of structures with decidable MSO theories. This can be achieved, for instance, by taking families of graph transformations which preserve the decidability of MSO, such as the unfolding or MSO-interpretations, and applying them to graphs with known decidable MSO theories, such as finite graphs or the graph Δ2.
This approach was followed in [8], where it is shown that the prefix (or suffix) rewriting graphs of recognisable word rewriting systems, which coincide (up to graph isomorphism) with the transition graphs of pushdown automata (contracting ε-transitions), can be obtained from Δ2 using inverse regular substitutions, a simple class of MSO-compatible transformations. They also coincide with those obtained by applying MSO interpretations to Δ2 [1]. Alternately unfolding and applying inverse regular mappings to these graphs yields a strict hierarchy of classes of trees and graphs with a decidable MSO theory [7, 9], coinciding with the transition graphs of higher-order pushdown automata and capturing the solutions of safe higher-order program schemes, whose MSO decidability had already been established in [15]. We will henceforth call this the pushdown hierarchy and, for simplicity, call the graphs at its n-th level n-pushdown graphs.
Also well-known are the automatic and tree-automatic structures (see for instance [2]), whose vertices are represented by words or trees and whose edges are characterised using finite automata running over tuples of vertices. The decidability of first-order logic (FO) over these graphs stems from the well-known closure properties of regular word and tree languages, but it can also be related to Rabin’s result since tree-automatic graphs are precisely the class of graphs obtained from Δ2 using finite-set interpretations [11], a generalisation of MSO interpretations mapping structures with a decidable weak-MSO theory (WMSO) to structures with a decidable FO theory. Applying finite-set interpretations to the whole pushdown hierarchy therefore yields an infinite hierarchy of graphs of decidable FO theory, which is proven in [11] to be strict.
Since prefix-recognisable graphs can be seen as word rewriting graphs, another variation is to consider similar rewriting systems over trees. This yields the class of ground tree rewriting graphs, which strictly contains that of real-time order-1 pushdown graphs. This class is orthogonal to the whole pushdown hierarchy since it contains at least one graph of undecidable MSO theory, for instance the infinite 2-dimensional grid. The transitive closures of ground tree rewriting systems can be represented using ground tree transducers, whose graphs were shown in [13] to have decidable FO[\(\xrightarrow {*}\)] theories by establishing their closure under iteration and then showing that any such graph is tree-automatic.
The purpose of this work is to propose a common extension to both higher-order stack operations and ground tree rewriting. We introduce a model of higher-order ground tree rewriting over trees labelled by higher-order stacks (henceforth called stack trees), which coincides, at order 1, with ordinary ground tree rewriting and, over unary trees, with the dynamics of higher-order pushdown automata. Following ideas from the works cited above, as well as the notion of recognisable sets and relations over higher-order stacks defined in [5], we introduce the class of ground (order n) stack tree rewriting systems, whose derivation relations are captured by ground stack tree transducers. Establishing that this class of relations is closed under iteration and can be finite-set interpreted in n-pushdown graphs yields the decidability of their FO[\(\xrightarrow {*}\)] theories. A preliminary version of this work has been presented in [17].
The remainder of this paper is organised as follows. Section 2 recalls some of the concepts used in the paper. Section 3 defines stack trees and stack tree rewriting systems. Section 4 explores a notion of recognisability for binary relations over stack trees. Section 5 proves the decidability of FO[\(\xrightarrow {*}\)] model checking over ground stack tree rewriting graphs. Finally, Section 7 presents some further perspectives.

2 Definitions and Notations

Relations
Given an arbitrary set Γ, a binary relation over Γ is a subset of Γ × Γ. A pair of elements (γ 1,γ 2) of Γ related by a relation R will be denoted by (γ 1,γ 2) ∈ R. Given two relations R 1 and R 2 over Γ, we denote by \(R_{1} \circ R_{2} = \{(\gamma_{1},\gamma_{2}) \mid \exists \gamma_{3}, (\gamma_{1},\gamma_{3}) \in R_{1} \wedge (\gamma_{3},\gamma_{2}) \in R_{2}\}\) their composition. Notice that this is not the usual order of composition for functions: when R 1 and R 2 are the graphs of functions f and g, the relation \(R_{1} \circ R_{2}\) corresponds to g ∘ f. This left-to-right convention will however be more consistent with the other notations used throughout this document.
Trees
Given an arbitrary set Γ, an ordered Γ-labelled tree t of arity at most \(d \in \mathbb {N}\) is a partial function from {1,…,d}* to Γ such that the domain of t, dom(t), is prefix-closed (if u is in dom(t), then every prefix of u is also in dom(t)) and left-closed (for all u ∈ {1,…,d}* and 2 ≤ j ≤ d, t(u j) is defined only if t(u i) is for every i < j). Node uj is called the j-th child of its parent node u. Additionally, the nodes of t are totally ordered by the natural length-lexicographic ordering over {1,…,d}*. By abuse of notation, given a symbol α ∈ Γ, we simply denote by α the tree {𝜖 ↦ α} reduced to a unique α-labelled node. The frontier of t is the set fr(t) = {u ∈dom(t)∣u1∉dom(t)}. Its elements are called the leaves of t. Trees will always be drawn in such a way that the left-to-right placement of leaves respects the lexicographic order. The set of finite trees labelled by Γ is denoted by \(\mathcal {T}(\Gamma )\). In this paper we only consider finite trees, i.e. trees with finite domains.
Given nodes u and v, we write \(u \sqsubseteq v\) if u is a prefix of v, i.e. if there exists w ∈ {1,⋯ ,d}* such that v = u w. We will say that u is an ancestor of v or is above v, and symmetrically that v is below u or is its descendant. We denote by \(v_{\restriction i}\) the prefix of v of length i, and by \(v^{\restriction i}\) its suffix of length i. For any u ∈dom(t), t(u) is called the label of node u in t. For any u ∈dom(t), we call \(\#_{t}(u)\) the arity of u, i.e. its number of children. When t is understood, we simply write #(u). Given trees t,s 1,…,s k and a k-tuple of positions \(\mathbf{u} = (u_{1},\ldots,u_{k}) \in \text{dom}(t)^{k}\), we denote by \(t[s_{1},\ldots,s_{k}]_{\mathbf{u}}\) the tree obtained by replacing the sub-tree at each position u i in t by s i , i.e. the tree in which any node v not below any u i is labelled t(v), and any node u i v with v ∈dom(s i ) is labelled s i (v).
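For instance, if t is the tree \(\{\epsilon \mapsto \alpha, 1 \mapsto \beta, 2 \mapsto \gamma\}\), then \(\text{fr}(t) = \{1,2\}\), and \(t[\delta]_{(1)} = \{\epsilon \mapsto \alpha, 1 \mapsto \delta, 2 \mapsto \gamma\}\) is the tree obtained by replacing the subtree rooted at node 1 by the single-node tree δ.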
Directed Graphs
A directed graph G with edge labels in Σ is a pair (V G ,E G ) where V G is a set of vertices and E G ⊆ (V G × Σ × V G ) is a set of edges. Given two vertices x and y, we write \(x \xrightarrow {a}{~}_{G} y\) if (x,a,y) ∈ E G , \(x \xrightarrow {~}{~}_{G} y\) if there exists a ∈ Σ such that \(x \xrightarrow {a}{~}_{G} y\), and \(x \xrightarrow {\Sigma ^{\prime }}{~}_{G} y\) if there exists a ∈ Σ′ such that \(x \xrightarrow {a}{~}_{G} y\). There is a directed path in G from x to y labelled by w = w 1 …w k ∈ Σ*, written \(x \xrightarrow {w}{~}_{G} y\), if there are vertices x 0,…,x k such that x = x 0, x k = y and for all i ∈{1,⋯ ,k}, \(x_{i-1} \xrightarrow {w_{i}}{~}_{G} x_{i}\). We additionally write \(x \xrightarrow {\ast }{~}_{G} y\) if there exists w such that \(x \xrightarrow {w}{~}_{G} y\), and \(x \xrightarrow {+}{~}_{G} y\) if there is such a path with |w|≥ 1. A directed graph G is connected if there exists an undirected path between any two vertices x and y, meaning that \((x, y) \in (\xrightarrow {~}{~}_{G} \cup \xrightarrow {~}{~}_{G}^{-1})^{*}\). We omit G from all these notations when it is clear from the context. A directed graph D is acyclic, or is a DAG, if there is no x such that \(x \xrightarrow {+} x\). The empty DAG consisting of a single vertex (and no edge, hence its name) is denoted by □. Given a DAG D, we denote by I D its set of vertices of in-degree 0, called input vertices, and by O D its set of vertices of out-degree 0, called output vertices. The DAG is said to be of in-degree |I D | and of out-degree |O D |. We henceforth only consider finite DAGs.
Rewriting Systems
Let Γ and Σ be finite alphabets. A Σ-labelled ground tree rewriting system (GTRS) is a finite set R of triples (ℓ,a,r) called rewrite rules, with ℓ and r finite Γ-labelled trees and a ∈ Σ a label. The rewriting graph of R is \(\mathcal {G}_{R} = (V,E)\), where \(V = \mathcal {T}(\Gamma )\) and \(E = \{ (c[\ell]_{i}, a, c[r]_{i}) \mid (\ell , a, r) \in R,\ c \in \mathcal {T}(\Gamma ),\ i\text { is a leaf of }c \}\). The rewriting relation associated to R is \(\xrightarrow {~}{~}_{R}\ =\ \xrightarrow {~}{~}_{\mathcal {G}_{R}}\), and its derivation relation is \(\xrightarrow {*}{~}_{R}\ =\ \xrightarrow {*}{~}_{\mathcal {G}_{R}}\). When restricted to words (or equivalently unary trees), such systems are usually called suffix (or prefix) word rewriting systems.
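For instance, consider the single rule (β,a,r) where r is the tree with a β-labelled root and two β-labelled children. Taking the context \(c = \{\epsilon \mapsto \alpha, 1 \mapsto \gamma\}\) and its leaf i = 1, the rewriting graph contains the a-labelled edge from \(c[\beta]_{1} = \{\epsilon \mapsto \alpha, 1 \mapsto \beta\}\) to \(c[r]_{1} = \{\epsilon \mapsto \alpha, 1 \mapsto \beta, 11 \mapsto \beta, 12 \mapsto \beta\}\).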

3 Higher-Order Stack Trees

3.1 Higher-Order Stacks

We briefly recall the notion of higher-order stacks (for details, see for instance [5]). In order to obtain a more straightforward extension from stacks to stack trees, we use a slightly tuned yet equivalent definition, whereby the hierarchy starts at order 0 and uses a different set of basic operations.
In the remainder, Γ will denote a fixed finite alphabet and n a positive integer. We first define stacks of order n (or n-stacks). Let S t a c k s 0(Γ) = Γ denote the set of 0-stacks. For n > 0, the set of n-stacks is S t a c k s n (Γ) = (S t a c k s n−1(Γ))+, the set of non-empty sequences of (n − 1)-stacks. When Γ is understood, we simply write S t a c k s n . For sS t a c k s n , we write s = [s 1s k ] n , with k > 0 and n > 0, for an n-stack of size |s| = k whose topmost (n − 1)-stack is s k . For example, [[[α β α]1]2[[α β α]1[β]1[α α]1]2]3 is a 3-stack of size 2, whose topmost 2-stack [[α β α]1[β]1[α α]1]2 contains three 1-stacks, etc.
For the sake of simplicity, we will denote as [α] n the n-stack containing only the symbol α, i.e. [[…[α]1…] n−1] n .
Basic Stack Operations
Given two letters α,β ∈ Γ, we define the operation rew α,β , and its associated binary relation over S t a c k s 0, \(r_{\text {rew}_{\alpha ,\beta }} = \{(\alpha ,\beta )\}\). For n ≥ 1, the operation copy n defines the relation
$$r_{\text{copy}_{n}} = \{([s_{1}{\ldots} s_{k}]{}_{n},[s_{1}{\ldots} s_{k}s_{k}]{}_{n})\mid [s_{1}{\ldots} s_{k}]{}_{n}\in Stacks_{n}\}.$$
We finally consider the operation \(\overline {\text {copy}}_{n}\) which is the inverse of copy n , in the sense that its associated relation is \(r_{\overline {\text {copy}}_{n}} = r_{\text {copy}_{n}}^{-1}\).
For any order-ℓ operation 𝜃, we extend the relation r 𝜃 to any order n > ℓ by considering
$$r_{\theta} = \{([s_{1} {\ldots} s_{k-1} s_{k}]{}_{n},[s_{1} {\ldots} s_{k-1} s^{\prime}_{k}]{}_{n} )\mid (s_{k},s^{\prime}_{k}) \in r_{\theta}\}.$$
The set O p s n of basic operations of order n is defined as: O p s 0 = {rew α,β α,β ∈ Γ}, and for n ≥ 1, \(Ops_{n} = Ops_{n-1} \cup \{\text {copy}_{n},\overline {\text {copy}}_{n}\}\).
We consider sequences of operations of O p s n . Given a sequence of operations \(\theta = \theta _{1}\cdots \theta _{k} \in Ops_{n}^{*}\), we define its associated relation \(r_{\theta } = r_{\theta _{1}} \circ {\cdots } \circ r_{\theta _{k}}\). We also define \(\overline {\theta } = \overline {\theta _{k}}\cdots \overline {\theta _{1}}\), where for every 𝜃 i ∈ O p s n , we take \(\overline {\theta _{i}}\) such that \(r_{\overline {\theta _{i}}} = r_{\theta _{i}}^{-1}\), i.e. \(\overline {\text {rew}_{\alpha ,\beta }} = \text {rew}_{\beta,\alpha }\), \(\overline {\text {copy}_{n}} = \overline {\text {copy}}_{n}\) and \(\overline {\overline {\text {copy}}_{n}} = \text {copy}_{n}\). Observe that we have \(r_{\overline {\theta }} = r_{\theta }^{-1}\). Finally, observe that all relations defined by sequences of operations are (partial) functions. So, by abuse of notation, given a stack s and a sequence of operations 𝜃, we may in the following denote by 𝜃(s) the unique stack s′ such that \((s,s^{\prime}) \in r_{\theta}\).
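For example, for \(\theta = \text{copy}_{2}\,\text{rew}_{\alpha,\beta}\) we have \(\theta([[\alpha\alpha]_{1}]_{2}) = [[\alpha\alpha]_{1}[\alpha\beta]_{1}]_{2}\): the topmost 1-stack is first duplicated, and the topmost symbol of the new copy is then rewritten. Conversely, \(\overline{\theta} = \text{rew}_{\beta,\alpha}\,\overline{\text{copy}}_{2}\) maps \([[\alpha\alpha]_{1}[\alpha\beta]_{1}]_{2}\) back to \([[\alpha\alpha]_{1}]_{2}\).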

3.2 Stack Trees

We introduce the set \(ST_{n}(\Gamma ) = \mathcal {T}(Stacks_{n-1}(\Gamma ))\) (or simply S T n when Γ is understood) of n-stack trees. Observe that an n-stack tree of degree 1 is isomorphic to an n-stack, and that \(ST_{1} = \mathcal {T}(\Gamma )\). Figure 1 shows an example of a 3-stack tree. The notion of stack trees therefore subsumes both higher-order stacks and ordinary trees.
Basic Stack Tree Operations
We now present the basic operations we consider over n-stack trees. As we want our model to extend stack operations, and in particular operations over unary stack trees to coincide with stack operations, we extend stack operations to stack trees by applying them to one of the leaves of the n-stack tree, and we define two new order-n operations which respectively create and destroy several leaves. We first define the relation associated with a basic stack tree operation. As there are in general several positions of a stack tree where a given stack tree operation can be applied, we define the relation induced by the localised application of an operation to a given leaf of a stack tree, the leaf being given by its index in the lexicographic order over the leaves.
Definition 1
We consider the set of basic operations over S T n , \(TOps_{n} = Ops_{n-1} \cup \left \{\text {copy}_{n}^{k},\overline {\text {copy}}_{n}^{k} \mid 1\leq k \leq d\right \}\). Given an integer i, the relation defined by the application localised in i of a basic operation 𝜃 is defined as follows:
  • \(r_{\theta }^{i} = \{(t,t^{\prime }) \mid \text {dom}(t) = \text {dom}(t^{\prime }) \wedge \forall v \neq u_{i}, t(v) = t^{\prime }(v) \wedge (t(u_{i}),t^{\prime }(u_{i})) \in r_{\theta }\}\), for 𝜃O p s n−1.
  • \(r_{\text {copy}_{n}^{k}}^{i} = \{(t,t^{\prime }) \mid t^{\prime } = t\cup \{u_{i} j \mapsto t(u_{i}) \mid 1 \leq j \leq k\}\}\).
  • \(r_{\overline {\text {copy}}_{n}^{k}}^{i} = \left (r_{\text {copy}_{n}^{k}}^{i}\right )^{-1}\).
For 𝜃T O p s n , the relation defined by 𝜃 is thus \(r_{\theta } = \bigcup _{i\in \mathbb {N}} r_{\theta }^{i}\), i.e. the union of all the possible localised applications of 𝜃.
In the previous definition, we consider that fr(t) = {u 1,⋯ ,u |fr(t)|}, the leaves being ordered with respect to the lexicographic order over {1,⋯ ,d}*. If i > |fr(t)|, there is no stack tree t′ such that \((t,t^{\prime }) \in r_{\theta }^{i}\). For simplicity, we will henceforth only consider the case where stack trees have arity at most 2 and k ≤ 2, but all results go through in the general case.
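For instance, for n = 2, let t be the stack tree with a root labelled \([\alpha]_{1}\) and two leaves labelled \([\alpha\beta]_{1}\) and \([\beta]_{1}\), in this order. Then \(r_{\text{copy}_{2}^{2}}^{1}\) relates t to the tree obtained by giving its first leaf two children both labelled \([\alpha\beta]_{1}\), while \(r_{\text{rew}_{\beta,\gamma}}^{2}\) relates t to the tree in which the second leaf is relabelled \([\gamma]_{1}\).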

3.3 Stack Tree Rewriting

As already mentioned, S T 1 is the set of trees labelled by Γ. In contrast with basic stack tree operations, a tree rewrite rule (ℓ,a,r) expresses the replacement of an arbitrarily large ground subtree ℓ of some tree s = c[ℓ] by r, yielding the tree c[r]. Contrary to the case of order-1 stacks (which are simply words), composing basic stack tree operations does not allow us to directly express such an operation, because there is no guarantee that two successive operations will be applied to the same part of a tree. We thus need a way to consider compositions of basic operations acting on a single sub-tree. In our notations, the effect of a ground tree rewrite rule can thus be seen as the localised application of a sequence of rew and \(\overline {\text {copy}}_{1}^{2}\) operations followed by a sequence of rew and \(\text {copy}_{1}^{2}\) operations. The relative positions where these operations must be applied can be represented as a pair of trees with edge labels in O p s 0.
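For instance, the effect of the order-1 rule (α,a,r), where r has a β-labelled root with two γ-labelled children, is obtained at an α-labelled leaf by applying \(\text{rew}_{\alpha,\beta}\) to that leaf, then \(\text{copy}_{1}^{2}\), and finally \(\text{rew}_{\beta,\gamma}\) to each of the two created leaves.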
From order 2 on, this is no longer possible. Indeed a localised sequence of operations may be used to perform introspection on the stack labelling a node without destroying it, by first performing a copy2 operation followed by a sequence of order-1 operations and a \(\overline {\text {copy}}_{2}\) operation. It is thus impossible to directly represent such a transformation using pairs of trees labelled by stack tree operations. We therefore adopt a presentation of compound operations as DAGs, which allows us to specify the relative application positions of successive basic operations. However, not every DAG represents a valid compound operation, so we first need to define a suitable subclass of DAGs and associated concatenation operation. An example of the model we aim to define can be found in Fig. 2.
Concatenation of DAGs
Given two DAGs D and D′ with \(O_{D} = \{b_{1},\ldots,b_{\ell}\}\) and \(I_{D^{\prime }} = \{a^{\prime }_{1}, \ldots , a^{\prime }_{k^{\prime }}\}\), and two indices i ∈ {1,⋯ ,ℓ} and j ∈ {1,⋯ ,k′} such that min(i,j) = 1, we denote by \(D \cdot_{i,j} D^{\prime}\) the unique DAG D″ obtained by merging the (i + m)-th output vertex of D with the (j + m)-th input vertex of D′ for all m ≥ 0 such that both \(b_{i+m}\) and \(a^{\prime }_{j+m}\) exist. Formally, letting \(d = \min(\ell - i, k^{\prime} - j) + 1\) denote the number of merged vertices, we have \(D^{\prime\prime} = \text{merge}_{f}(D \cup D^{\prime})\), where merge f (D) is the DAG whose set of vertices is f(V D ) and whose set of edges is {(f(x),a,f(x′)) ∣ (x,a,x′) ∈ E D }, and \(f(x) = b_{i + m}\) if \(x = a^{\prime }_{j+m}\) for some m ∈{0,…,d − 1}, and f(x) = x otherwise. We call D″ the (i,j)-concatenation of D and D′. Note that the (i,j)-concatenation of two connected DAGs remains connected.
Compound Operations
We represent compound operations as DAGs. We will refer in particular to the set of DAGs \(\mathcal {D}_{n} = \{D_{\theta } \mid \theta \in TOps_{n}\}\) associated with basic operations, which are depicted in Fig. 3. Compound operations are inductively defined below, as depicted in Fig. 4. Remark that, for the sake of simplicity, we label the edges representing a copy (respectively an anticopy) of arity 2 by 1 and 2 (respectively \(\bar {1}\) and \(\bar {2}\)). If we were considering trees of arity greater than 2, it would be necessary to distinguish the copies of different arities. Here, there is no ambiguity, as a 1-labelled edge always has the same source as a 2-labelled edge and vice versa, by definition.
Definition 2
A DAG D is a compound operation (or simply an operation) if one of the following holds:
1.
D = □;
 
2.
D = (D 11,1 D 𝜃 ) ⋅1,1 D 2, with \(|O_{D_{1}}| = |I_{D_{2}}| = 1\) and \(\theta \in Ops_{n-1} \cup \left \{\text {copy}_{n}^{1}, \overline {\text {copy}}_{n}^{1}\right \}\);
 
3.
\(D = ((D_{1} \cdot _{1,1} D_{\text {copy}_{n}^{2}}) \cdot _{2,1} D_{3}) \cdot _{1,1} D_{2}\), with \(|O_{D_{1}}| = |I_{D_{2}}| = |I_{D_{3}}| = 1\);
 
4.
\(D = (D_{1} \cdot _{1,1} (D_{2} \cdot _{1,2} D_{\overline {\text {copy}}_{n}^{2}})) \cdot _{1,1} D_{3}\) with \(|O_{D_{1}}| = |O_{D_{2}}| = |I_{D_{3}}| = 1\);
 
5.
\(D = ((((D_{1} \cdot _{1,1} D_{\text {copy}_{n}^{2}}) \cdot _{2,1} D_{3}) \cdot _{1,1} D_{2}) \cdot _{1,1} D_{\overline {\text {copy}}_{n}^{2}}) \cdot _{1,1} D_{4}\), with \(|O_{D_{1}}| =\) \(|I_{D_{2}}| = |O_{D_{2}}| = |I_{D_{3}}| = |O_{D_{3}}| = |I_{D_{4}}| = 1\);
 
where D 1,D 2,D 3 and D 4 are compound operations.
Additionally, the vertices of D are ordered inductively in such a way that every vertex of D i in the above definition is smaller than the vertices of D i+1, the order over □ being the trivial one. This induces in particular an order over the input vertices of D, and one over its output vertices.
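For instance, taking D 1 = D 2 = D 3 = □ in case 3 yields \(D = D_{\text{copy}_{n}^{2}}\) itself, and taking D 1 = D 2 = □ in case 2 yields \(D = D_{\theta}\); hence every DAG of \(\mathcal{D}_{n}\) is a compound operation.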
Definition 3
Given a compound operation D, we define \(r_{D}^{i}\) the relation associated to its localised application starting in the i th leaf of a stack tree t as follows:
1.
If D = □, then \(r_{D}^{i} = \text {id}\) (where id denotes the identity relation).
 
2.
If D = (D 11,1 D 𝜃 ) ⋅1,1 D 2 with \(\theta \in Ops_{n-1} \cup \left \{\text {copy}_{n}^{1},\overline {\text {copy}}_{n}^{1}\right \}\),
then \(r_{D}^{i} = r_{D_{1}}^{i} \circ r_{\theta }^{i} \circ r_{D_{2}}^{i}\).
 
3.
If \(D = ((D_{1} \cdot _{1,1} D_{\text {copy}_{n}^{2}}) \cdot _{2,1} D_{3}) \cdot _{1,1} D_{2}\), then \(r_{D}^{i} = r_{D_{1}}^{i} \circ r_{\text {copy}_{n}^{2}}^{i} \circ r_{D_{3}}^{i+1} \circ r_{D_{2}}^{i}\).
 
4.
If \(D = (D_{1} \cdot _{1,1} (D_{2} \cdot _{1,2} D_{\overline {\text {copy}}_{n}^{2}})) \cdot _{1,1} D_{3}\), then \(r_{D}^{i} = r_{D_{1}}^{i} \circ r_{D_{2}}^{i+1} \circ r_{\overline {\text {copy}}_{n}^{2}}^{i} \circ r_{D_{3}}^{i}\).
 
5.
If \(D = ((((D_{1} \cdot _{1,1} D_{\text {copy}_{n}^{2}}) \cdot _{2,1} D_{3}) \cdot _{1,1} D_{2}) \cdot _{1,1} D_{\overline {\text {copy}}_{n}^{2}}) \cdot _{1,1} D_{4}\),
then \(r_{D}^{i} = r_{D_{1}}^{i} \circ r_{\text {copy}_{n}^{2}}^{i} \circ r_{D_{3}}^{i+1} \circ r_{D_{2}}^{i} \circ r_{\overline {\text {copy}}_{n}^{2}}^{i} \circ r_{D_{4}}^{i}\).
 
The relation defined by D is \(r_{D} = \bigcup _{i} r_{D}^{i}\), i.e. the union of all its localised applications.
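For instance, taking D 1 = D 2 = □ and \(D_{3} = D_{\text{rew}_{\alpha,\beta}}\) in case 3 yields the operation \(D = D_{\text{copy}_{n}^{2}} \cdot _{2,1} D_{\text{rew}_{\alpha,\beta}}\), with \(r_{D}^{i} = r_{\text{copy}_{n}^{2}}^{i} \circ r_{\text{rew}_{\alpha,\beta}}^{i+1}\): the i-th leaf is duplicated, and the topmost symbol of the second copy, now the (i+1)-th leaf, is then rewritten from α to β.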
Remark 1
An operation may admit several different decompositions with respect to Definition 2. However, its application is well-defined, as one can show that this process is locally confluent. This can be proved through a case analysis of every possible decomposition of a DAG. The verification being easy but tedious to write out, we leave it to the reader.
Figure 2 shows an example of the application of a compound operation. Given a compound operation D, we define \(\overline {D}\) as the operation obtained by reversing the edges of D and swapping the input and output vertices. Formally, for 𝜃 ∈ O p s n−1(Γ), we define \(\overline {D_{\theta }} = D_{\overline {\theta }}\), \(\overline {D_{\text {copy}_{n}^{d}}} = D_{\overline {\text {copy}}_{n}^{d}}\), \(\overline {D_{\overline {\text {copy}}_{n}^{d}}} = D_{\text {copy}_{n}^{d}}\), and for all compound operations D 1,D 2, \(\overline {D_{1}\cdot _{i,j} D_{2}} = \overline {D_{2}} \cdot _{j,i} \overline {D_{1}}\). For every operation D, we have \(r_{\overline {D}} = r_{D}^{-1}\). Finally, given a k-tuple of operations D = (D 1,…,D k ) of respective in-degrees d 1,…,d k and a k-tuple of indices i = (i 1,…,i k ) with \(i_{j+1} \geq i_{j} + d_{j}\) for all j ∈ {1,⋯ ,k−1}, we denote by \(r_{\mathbf {D}}^{\mathbf {i}}= r_{D_{k}}^{i_{k}}\circ \cdots \circ r_{D_{1}}^{i_{1}}\) the relation defined by the parallel application of each D j at the position i j , and by r D the union of all these relations (we apply the rightmost DAGs first, so that these applications do not affect the numbering of the leaves at which the DAGs to their left are applied).
Since the (i,j)-concatenation of two operations as defined above is not necessarily a licit operation, we need to restrict ourselves to results which are well-formed according to Definition 2. Given D and D′, we let \(D \cdot D^{\prime} = \{D \cdot _{i,j} D^{\prime} \mid D \cdot _{i,j} D^{\prime} \text{ is an operation}\}\). Given n > 1, we define \(D^{n} = \bigcup _{i <n} D^{i} \cdot D^{n-i}\), and let \(D^{*} = \bigcup _{n\geq 0} D^{n}\) denote the set of iterations of D. These notations are naturally extended to sets of operations.
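For instance, for \(D = D_{\text{copy}_{n}^{2}}\), the set D ⋅ D contains both \(D \cdot _{1,1} D\) and \(D \cdot _{2,1} D\), which duplicate a leaf and then duplicate respectively its first or its second copy; both are compound operations by case 3 of Definition 2.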
Proposition 1
\(\mathcal {D}_{n}^{*}\) is precisely the set of all well-formed compound operations.
Proof
Recall that \(\mathcal {D}_{n}\) denotes the set of DAGs associated with basic operations. By definition of iteration, any DAG in \(\mathcal {D}_{n}^{*}\) is an operation. Conversely, by Definition 2, any operation can be decomposed into a concatenation of DAGs of \(\mathcal {D}_{n}\). □
Ground Stack Tree Rewriting Systems
By analogy with order-1 trees, given some finite alphabet of labels Σ, we call labelled ground stack tree rewriting system (GSTRS) a set R = {R a ∣ a ∈ Σ}, where each R a is a finite set of operations. We straightforwardly extend the notions of rewriting graph and derivation relation to these systems. Note that for n = 1, this class coincides with ordinary ground tree rewriting systems. Moreover, one can easily show that the rewriting graphs of ground stack tree rewriting systems over unary n-stack trees (i.e. systems whose operations contain only unary operations, and hence no edge labelled by 2 or \(\bar {2}\)) are isomorphic to the configuration graphs of order-n pushdown automata performing a finite sequence of operations at each transition.
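For instance, the GSTRS with \(R_{a} = \{D_{\text{copy}_{n}^{2}}\}\) as its only set of rules generates the graph in which \(t \xrightarrow {a} t^{\prime}\) whenever t′ is obtained from t by attaching to one of its leaves two children carrying that leaf’s label.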

4 Operation Automata

Our main goal is to prove the decidability of the \(\mathsf {FO}[\xrightarrow {*}]\)-theory of the graphs generated by GSTRS, and to do so, we need to be able to compute the reachability relation of a GSTRS. In the case of trees, this is achieved through ground tree transducers, which are pairs of tree automata [12, 14]. However, as for higher-order stacks, there is no natural notion of recognisable sets of stack trees which would allow us to directly define an equivalent of GTT for stack trees. In [5], Carayol defines a notion of recognisability over sets of operations and derives from it a notion of recognisability over higher-order stacks, namely as the image of a given stack by a recognisable set of operations. This definition ensures that the reachability set of a higher-order pushdown automaton is recognisable. Following this idea, we introduce a notion of recognisability for stack tree operations, through a notion of operation automaton, and derive from it a notion of recognisability over stack trees. We show that the recognisable sets of operations are closed under union and intersection. Then, we associate two notions of relation with an automaton, one leading to recognisable ground stack tree rewriting systems (RGTRS), and the other to ground stack tree transducers (GSTT), following the ideas of [12, 14]. However, this is not yet sufficient to describe the reachability relation of a GSTRS. Indeed, the standard construction on word automata for recognising the star of a language fails here: it would in general recognise more operations than the iteration of the initial set. To circumvent this difficulty, we introduce a notion of normalisation of automata, and show that every relation defined by an automaton can be defined by a normalised automaton. We then show that for every normalised and distinguished automaton (as for word automata, this means that no transition ends in an initial state and no transition starts in a final state), we can construct an automaton which recognises its iteration. We use this property to show that the reachability relations of GSTRS are captured by GSTT. Finally, let us point out that the normalisation property is also crucial in the proof of the decidability of the \(\mathsf {FO}[\xrightarrow {*}]\)-theory of a graph generated by a GSTRS, presented in the next section.
Definition 4
An automaton over \(\mathcal {D}_{n}^{*}\) is a tuple A = (Q,Γ,I,F,Δ), where
  • Q is a finite set of states,
  • Γ is a finite stack alphabet,
  • IQ is a set of initial states,
  • FQ is a set of final states,
  • \(\Delta \subseteq \left (Q\times \left (Ops_{n-1}\cup \left \{\text {copy}_{n}^{1},\overline {\text {copy}}_{n}^{1}\right \}\right ) \times Q \right )\)
    ∪((Q × Q) × Q) ∪(Q × (Q × Q)) is a set of transitions.
An operation D is accepted by A if there is a labelling of its vertices by states of Q such that all input vertices are labelled by initial states, all output vertices by final states, and this labelling is consistent with Δ, in the sense that for all x, y and z respectively labelled by states p, q and r, and for all \(\theta \in Ops_{n-1} \cup \left \{\text {copy}_{n}^{1},\overline {\text {copy}}_{n}^{1}\right \}\),
$$\begin{array}{@{}rcl@{}} x \xrightarrow{\theta} y & \implies& (p,\theta,q) \in \Delta,\\ x \xrightarrow{1} y \land x \xrightarrow{2} z & \implies& (p,(q,r)) \in \Delta,\\ x \xrightarrow{\bar{1}} z \land y \xrightarrow{\bar{2}} z & \implies& ((p,q),r) \in \Delta. \end{array} $$
Such a labelling is called an accepting labelling.
We denote by Op(A) the set of operations recognised by A. R e c denotes the class of sets of operations recognised by operation automata. We denote by \(r_{A} = \{(t,t^{\prime}) \mid \exists D \in \text{Op}(A), (t,t^{\prime}) \in r_{D}\}\) the union of the relations defined by the operations recognised by A.
A set of stack trees L is recognisable if there exists a recognisable set of operations R such that L = {t∣∃DR,(t 0,t) ∈ r D } where t 0 is a stack tree containing only one node labelled by [α] n−1, for a given α ∈ Γ.

4.1 Properties of Operation Automata

In this subsection, we show that Rec is closed under union and intersection, and contains all finite sets of operations.
Proposition 2
Given two automata A 1 and A 2, there exists an automaton A such that Op(A) = Op(A 1) ∩Op(A 2).
Proof
We construct an automaton which witnesses Proposition 2. We consider the product automaton of A 1 and A 2: we take \(Q= Q_{A_{1}} \times Q_{A_{2}}\), \(I= I_{A_{1}}\times I_{A_{2}}\), \(F= F_{A_{1}} \times F_{A_{2}}\), and
$$\begin{array}{@{}rcl@{}} \Delta =&& \left\{\left( (q_{1},q_{2}),\theta,\left( q^{\prime}_{1},q^{\prime}_{2}\right)\right) \mid \left( q_{1},\theta,q^{\prime}_{1}\right) \in \Delta_{A_{1}} \land \left( q_{2},\theta,q^{\prime}_{2}\right)\in \Delta_{A_{2}}\right\}\\ \cup{} &&\left\{\left( \left( (q_{1},q_{2}),\!\left( q^{\prime}_{1},q^{\prime}_{2}\right)\right),\left( q^{\prime\prime}_{1},q^{\prime\prime}_{2}\right)\right) \!\mid\! \left( \left( q_{1},q^{\prime}_{1}\right),q^{\prime\prime}_{1}\right)\!\in\! \Delta_{A_{1}} \!\land\! \left( \left( q_{2},q^{\prime}_{2}\right),q^{\prime\prime}_{2}\right)\!\in\! \Delta_{A_{2}}\right\}\\ \cup{} &&\left\{\left( (q_{1},q_{2}),\!\left( \left( q^{\prime}_{1},q^{\prime}_{2}\right),\left( q^{\prime\prime}_{1},q^{\prime\prime}_{2}\right)\right)\right) \!\mid\! \left( q_{1},\!\left( q^{\prime}_{1},q^{\prime\prime}_{1}\right)\right)\!\in\! \Delta_{A_{1}} \!\land\! \left( q_{2},\!\left( q^{\prime}_{2},q^{\prime\prime}_{2}\right)\right)\!\in\!\Delta_{A_{2}}\right\}. \end{array} $$
If an operation admits valid labellings in both A 1 and A 2, then the labelling which labels each vertex of the operation by the pair of states it receives in these two labellings is valid in A. Conversely, if an operation admits a valid labelling in A, then projecting it to the states of A 1 (respectively A 2) yields a valid labelling in A 1 (respectively A 2). □
Proposition 3
Given two automata A 1 and A 2, there exists an automaton A such that Op(A) = Op(A 1) ∪Op(A 2).
Proof
We take the disjoint union of A 1 and A 2, i.e. \(Q = Q_{A_{1}} \uplus Q_{A_{2}}\), \(I = I_{A_{1}} \uplus I_{A_{2}}\), \(F = F_{A_{1}} \uplus F_{A_{2}}\) and \(\Delta = \Delta _{A_{1}} \uplus \Delta _{A_{2}}\).
If an operation admits a valid labelling in A 1 (resp. A 2), this labelling is also valid in A. Conversely, if an operation admits a valid labelling in A, then, as A is a disjoint union of A 1 and A 2, it can only be labelled by states of A 1 or by states of A 2 (by definition, there is no transition between states of A 1 and states of A 2, and every operation is connected), and the labelling is therefore valid in A 1 or in A 2. □
Proposition 4
Given an operation D, there exists an automaton A such that Op(A) = {D}.
Proof
If D = (V,E), we take Q = V, I as the set of input vertices, F as the set of output vertices and
$$\begin{array}{@{}rcl@{}} \Delta =&& \{(q,\theta,q^{\prime}) \mid (q,\theta,q^{\prime}) \in E\}\\ \cup{} &&\{(q,(q^{\prime},q^{\prime\prime})) \mid (q,1,q^{\prime}) \in E \land (q,2,q^{\prime\prime}) \in E\}\\ \cup{} &&\{((q,q^{\prime}),q^{\prime\prime}) \mid (q,1,q^{\prime\prime}) \in E \land (q^{\prime},2,q^{\prime\prime}) \in E\}. \end{array} $$
The recognised connected part is D by construction. □

4.2 Recognisable Ground Stack Tree Rewriting Systems and Ground Stack Tree Transducers

It is possible to extend the notion of ground stack tree rewriting systems to recognisable sets of operations. Such a system is called recognisable.
Definition 5
A Σ-labelled recognisable ground stack tree rewriting system R is a set of |Σ| operation automata \((A_{a})_{a \in \Sigma}\). Given a ∈ Σ and two stack trees t, t′, we have \(t \xrightarrow [R]{a} t^{\prime }\) if there exists D ∈ Op(A a ) such that \((t,t^{\prime}) \in r_{D}\).
As in the case of ground tree rewriting systems, recognisable ground stack tree rewriting systems do not capture the iteration of the rewriting relation of a ground stack tree rewriting system: such a system rewrites a single subtree at each step, while one step of the iterated relation may rewrite several subtrees at once. For ground tree rewriting systems, this problem is solved by the introduction of ground tree transducers, which precisely allow any number of subtrees to be rewritten in parallel, each with a rule taken from a recognisable set of rules (but without control on the number of subtrees rewritten in this way). Following this idea, we define the relation defined by an automaton A as \(\mathcal {R}(A)= \{(t,t^{\prime })\mid \exists \mathbf {D} \in \text {Op}(A)^{*}, (t,t^{\prime }) \in r_{\mathbf {D}}\}\).
Definition 6
A Σ-labelled ground stack tree transducer Λ is a set of |Σ| operation automata A a . Given a ∈ Σ and two stack trees t,t , we have \(t \xrightarrow [\Lambda ]{a} t^{\prime }\) if \((t,t^{\prime })\in \mathcal {R}(A_{a})\).

4.3 Normalised Automata

Operations may perform “unnecessary” actions on a given stack tree, for instance duplicating a leaf with a \(\text {copy}_{n}^{2}\) operation and later destroying both copies with \(\overline {\text {copy}}_{n}^{2}\). Such operations which leave the input tree unchanged are referred to as bubbles. There are thus in general infinitely many operations representing the same relation over stack trees. It is therefore desirable to look for a canonical representative (a canonical operation) for each considered relation. The intuitive idea is to simplify operations by removing occurrences of successive mutually inverse basic operations. This process is a very classical tool in the literature on pushdown automata and related models, and was applied to higher-order stacks in [5]. Our notion of reduced operations is inspired by this work.
To define such a notion for our operations, there are two main hurdles to overcome. First, as already mentioned, a compound operation D can perform introspection on the label of a leaf without destroying it. If D can be applied to a given stack tree t, such a sequence of operations does not change the resulting stack tree s. It does however forbid the application of D to other stack trees by inspecting their node labels, hence removing this part of the computation would lead to an operation with a possibly strictly larger domain. To address this problem, and following [5], we use test operations ranging over regular sets of (n − 1)-stacks, which will allow us to handle non-destructive node-label introspection.
A second difficulty appears when an operation destroys a subtree and then reconstructs it identically, for instance a \(\overline {\text {copy}}_{n}^{2}\) operation followed by \(\text {copy}_{n}^{2}\). Trying to remove such a pattern would lead to a disconnected DAG, which does not describe a compound operation in our sense. We thus need to leave such occurrences intact. We can nevertheless bound the number of times a given position of the input stack tree is affected by the application of an operation by considering two phases: a destructive phase during which only \(\overline {\text {copy}}_{n}^{i}\) and order- (n − 1) basic operations (possibly including tests) are performed on the input stack tree, and a constructive phase only consisting of \(\text {copy}_{n}^{i}\) and order- (n − 1) basic operations. This idea is similar to the way ground tree rewriting is performed at order 1.
The goal of this section is to define a subset of operations called reduced operations, analogously to the notion of reduced instructions with tests in [5], and to prove that any recognisable relation can be defined by a recognisable set of reduced operations. As in [5], we will not have a unique reduced operation representing a given relation, due to the presence of tests, but reduced operations limit the number of times the same stack tree can be obtained during an application to a stack tree, which is exactly what we need in the proof of the next section. To that end, we first formally define tests, then define the notion of reduced operations with tests, and prove that any relation accepted by an automaton can be accepted by an automaton accepting only reduced operations with tests (with some technical restrictions), called a normalised automaton. A key point in this proof is the notion of loop-free sets of instructions presented in [5], and we explain why it can be applied to our case, as our set of basic operations differs from theirs.
Formally, a test T L over Stacks n is the restriction of the identity operation to a set \(L \in Rec(Stacks_{n})\). In other words, given s ∈ Stacks n , \((s,s) \in r_{T_{L}}\) if and only if s ∈ L. We denote by \(\mathcal {T}_{n}\) the set of test operations over Stacks n . We enrich our basic operations over S T n with \(\mathcal {T}_{n-1}\). We also extend compound operations with edges labelled by tests. We denote by \(\mathcal {D}_{n}^{\mathcal {T}}\) the set of basic operations with tests.
We now define a notion of reduced sequences of stack operations that will be sufficient for limiting the number of times a given stack tree appears during the application of a reduced operation. Informally, we say that a sequence over \(Ops_{n-1}\cup \mathcal {T}_{n-1}\) is reduced if and only if the only subsequences defining a relation that can leave a stack unchanged are single test operations.
Definition 7
We define the set of reduced operations over \(Ops_{n-1}\cup \mathcal {T}_{n-1}\) by
$$\text{Red} = \left\{ \theta_{1}\cdots\theta_{k} \mid \begin{array}{l} \forall i\leq j,\ r_{\theta_{i}\cdots\theta_{j}} \cap \text{id} \neq \emptyset \Rightarrow \left( i= j \wedge \theta_{i} \in \mathcal{T}_{n-1}\right) \wedge\\ \forall i\leq j,\ \left( \theta_{i} = \text{rew}_{a,b} \wedge \theta_{j} = \text{rew}_{c,d}\right) \Rightarrow\\ \quad \exists z,\ i <z < j \wedge \left( \theta_{z} = \text{copy}_{k} \vee \theta_{z} = \overline{\text{copy}}_{k}\right) \end{array} \right\}.$$
If a sequence 𝜃 = 𝜃 1⋯𝜃 k is reduced, then for every stack s such that there is a stack s′ with \((s,s^{\prime}) \in r_{\theta}\), the sequence s 1,⋯ ,s k−1 with \((s,s_{1}) \in r_{\theta _{1}}, (s_{1},s_{2}) \in r_{\theta _{2}}, \cdots ,(s_{k-1},s^{\prime })\in r_{\theta _{k}}\) is such that for any s i ,s j , if s i = s j , then j = i + 1 and \(\theta _{i} \in \mathcal {T}_{n-1}\). Notice that this definition also implies that there are no two consecutive test operations in a reduced sequence. Thus, during the application of a reduced sequence, any stack appears at most twice. Furthermore, we also require that there are no two rew operations which can apply successively to the same letter. This prevents useless rewriting (as \(\text{rew}_{a,b}\text{rew}_{b,c}\) is equivalent to \(\text{rew}_{a,c}\)).
For example, the sequence \(\text{rew}_{\alpha,\alpha}\text{copy}_{1}\text{rew}_{\alpha,\beta}\) is not reduced, as \(\text{rew}_{\alpha,\alpha}\) is not a test but \(([\alpha]{}_{1},[\alpha]{}_{1})\in r_{\text{rew}_{\alpha,\alpha}}\). Yet, \(\text{copy}_{1}\text{rew}_{\alpha,\beta}\) is reduced. The sequence \(\theta = \overline {\text {copy}}_{2}\,\text{rew}_{\alpha,\beta}\,\text{copy}_{1}\,T_{L}\,\overline {\text {copy}}_{1}\,\text{rew}_{\beta,\alpha}\,\text{copy}_{2}\) is not reduced, as ([[α]1[α]1]2,[[α]1[α]1]2) ∈ r 𝜃 . The sequence \(\theta = \overline {\text {copy}}_{2}\,\text{rew}_{\alpha,\beta}\,\text{copy}_{2}\) is reduced. \(T_{L_{1}}\text{rew}_{\alpha,\beta}T_{L_{2}}\text{copy}_{1}T_{L_{3}}\) is reduced: even though it contains three test operations, they are not consecutive, and \(\text{rew}_{\alpha,\beta}\text{copy}_{1}\) is reduced.
We can now define our notion of reduced stack tree operations. Informally, a stack tree operation is reduced if there is no \(\overline {\text {copy}}_{n}^{d}\) operation “below” a \(\text {copy}_{n}^{d}\) operation, and if any suboperation containing only stack operations is reduced.
Definition 8
We define the set of reduced sequences over \(Ops_{n-1}\cup \mathcal {T}_{n-1} \cup \left \{\bar {1},\bar {2},1,2, \overline {\text {copy}}_{n}^{1},\text {copy}_{n}^{1}\right \}\) by
$$\text{TRed} = \left\{\theta_{1}\cdots\theta_{k}\mid \begin{array}{ll} \forall i\leq j, \theta_{i}\cdots\theta_{j} \in (Ops_{n-1}\cup \mathcal{T}_{n-1})^{*} \Rightarrow \theta_{i}\cdots\theta_{j} \in \text{Red} \wedge \\ \forall i,j, \left( \theta_{i} \in \left\{\bar{1},\bar{2},\overline{\text{copy}}_{n}^{1}\right\} \wedge \theta_{j} \in \left\{1,2,\text{copy}_{n}^{1}\right\}\right) \Rightarrow i < j \end{array} \right\}. $$
An operation D is reduced if for any vertices x,y of D such that \(x \xrightarrow {w} y\), we have w ∈ TRed.
Intuitively, a reduced operation has a destructive part, which contains all the \(\overline {\text {copy}}_{n}^{d}\) operations and thus only “goes up” in the stack tree it is applied to, followed by a constructive part, which contains all the \(\text {copy}_{n}^{d}\) operations and thus “constructs a stack tree from the leaf it is applied to”, never removing a leaf. Moreover, any suboperation that contains no tree operation is reduced, preventing it from “looping” on a stack tree unless it is a single test operation.
We are now ready to present the main result of this section: a notion of normalisation for operation automata. An automaton A is said to be distinguished if there is no transition ending in an initial state or starting in a final state.
Definition 9
An automaton A with state set Q is said to be normalised if it accepts only reduced operations, it is distinguished, and furthermore Q admits two partitions:
  • into Q T and Q C such that all test transitions lead from Q C to Q T , while all others lead from Q T to Q C .
  • into Q c and Q d such that there is no transition from Q c to Q d , states of Q d are the target of no \(\text {copy}_{n}^{d}\) transition, states of Q c are the source of no \(\overline {\text {copy}}_{n}^{d}\) transition, and all transitions from Q d to Q c are \(\text {copy}_{n}^{d}\) transitions.
Theorem 1
For every automaton A, there exists a normalised automaton with tests A r such that \(\mathrm {r}_{A} = \mathrm {r}_{A_{r}}\).
The idea of the construction is to transform A through several steps, each modifying the set of accepted operations but not the recognised relation. The proof relies on the closure properties of regular sets of (n − 1)-stacks and on an analysis of the structure of A. This transformation can be performed without altering the accepted relation over stack trees. The end of this section is devoted to explaining this construction.
Before moving to the main proof, we first have to explain how to derive from [5] the fact that for every recognisable set of stack operations there is a recognisable set of reduced stack operations defining the same relation, as we will need this fact in our construction. In [5], the set O p s 1 contains operations of the form push α and pop α , while we have the operations rew α,β , copy1 and \(\overline {\text {copy}}_{1}\). Notice as well that [5] allows an empty stack, which we do not consider in our model. Yet it is possible to simulate our operations with sets of sequences of push α and pop α , and vice-versa. For example, \(\text {push}_{\alpha } \sim \bigcup _{\beta \in \Gamma } \text {copy}_{1} \text {rew}_{\beta,\alpha }\) (observe that the relations defined by the sequences in this union are pairwise disjoint), or \(\text {copy}_{1} \sim \bigcup _{\alpha \in \Gamma } \text {pop}_{\alpha } \text {push}_{\alpha } \text {push}_{\alpha }\). The remaining equivalences are similar and thus left to the reader.
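For instance, one can check in the same way that \(\text{pop}_{\alpha} \sim \bigcup_{\beta \in \Gamma} \text{rew}_{\alpha,\beta}\,\overline{\text{copy}}_{1}\) on stacks containing at least two symbols, in line with the absence of empty stacks in our model.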
Thus, as these are local transformations, any relation defined by a recognisable set for our notion is also defined by a recognisable set for the notion of [5], and vice-versa.
In [5], there is a notion of loop-free operations, which are operations without factors of the form \(\text {copy}_{i}\overline {\text {copy}}_{i}\), \(\overline {\text {copy}}_{i}\text {copy}_{i}\), push a pop a or pop a push a . Observe that a loop-free operation (without tests) cannot contain a suboperation 𝜃 such that \(r_{\theta} \cap \text{id} \neq \emptyset\), as an operation defining a test necessarily contains one of these factors. This notion is extended to operations with tests (an operation with tests is loop-free if the operation obtained by removing the tests is loop-free). Observe that a loop-free operation with tests is reduced in our sense (except that there may be several tests in a row, but one can easily obtain an equivalent operation reduced in our sense by merging these tests). Proposition 2.2 from [5] thus states that any set recognised by an automaton A over operations defines the same relation as a set recognised by an automaton A′ over \(Ops_{k} \cup \{T_{L_{q,q^{\prime }}}\mid q,q^{\prime } \in Q\}\) accepting only loop-free operations, where \(L_{q,q^{\prime }}\) is the set of stacks s such that there is an operation 𝜃 recognised by A between states q and q′ with (s,s) ∈ r 𝜃 . Furthermore, they show that the sets \(L_{q,q^{\prime }}\) are regular, so it is possible to compute A′.
Finally, observe that for any loop-free operation with tests in the sense of [5], we can find a finite set of reduced operations in our sense recognising the same relation, by applying the transformation described earlier, replacing each \(\text{rew}_{\alpha,\alpha}\) with a test operation and merging the possible \(T_{L} T_{L^{\prime }}\) factors introduced in this way into \(T_{L\cap L^{\prime }}\). Observe that a factor \(\text{rew}_{a,b}\, T_{L}\, \text{rew}_{c,d}\) cannot appear in this transformation. These transformations being local, we can compute the set corresponding to a loop-free operation. For the same reason, we can compute a reduced automaton with tests from a loop-free automaton with tests.
Thus, we deduce from [5] the following lemma:
Lemma 1
Given an automaton A over Ops n−1, there exists an automaton over \(Ops_{n-1} \cup \mathcal {T}_{n-1}\) accepting only reduced operations and recognising the same relation as A.
We now give the proof of Theorem 1.
Proof
The first thing to remark is that if there are no tree transitions, we have a higher-order stack automaton as in [5], and the two notions of normalised automaton coincide. The idea is thus to first prevent the presence of bubbles, i.e. suboperations containing a \(\text {copy}_{n}^{d}\) operation above a \(\overline {\text {copy}}_{n}^{d^{\prime }}\) operation, and afterwards to replace all sequences of stack operations with reduced sequences, using Lemma 1. Finally, we will impose that between any two non-test operations there is a test operation, and that there are no two consecutive test operations.
To forbid bubbles, we just have to prevent the automaton from recognising DAGs which contain
$$((D_{\text{copy}_{n}^{2}} \cdot_{1,1} F_{1}) \cdot_{2,1} F_{2}) \cdot_{1,1} D_{\overline{\text{copy}}_{n}^{2}}\text{, or }(D_{\text{copy}_{n}^{1}} \cdot_{1,1} F) \cdot_{1,1} D_{\overline{\text{copy}}_{n}^{1}} $$
as a sub-DAG, where F 1, F 2 and F have one input node and one output node. However, we do not want to modify the recognised relation. We proceed in two steps: first we allow the automaton to replace the bubbles with equivalent tests (after remarking that a bubble can only act as a test) in any recognised DAG (step 1), and then we ensure that no \(\overline {\text {copy}}_{n}^{i}\) transition occurs after the first \(\text {copy}_{n}^{j}\) transition (step 2).
Step 1:
We consider an automaton A = (Q,Γ,I,F,Δ). Given two states q 1,q 2, we denote by \(L_{A_{q_{1},q_{2}}}\) the set \(\{s \in Stacks_{n-1} \mid \exists D\in \mathcal {D}(A_{q_{1},q_{2}}), (t_{s},t_{s})\in r_{D}^{1}\}\), where \(A_{q_{1},q_{2}}\) is a copy of A in which we take q 1 as the unique initial state and q 2 as the unique final state, and t s is the stack tree with a unique node labelled by s. In other words, \(L_{A_{q_{1},q_{2}}}\) is the set of (n − 1)-stacks s such that the tree with one node labelled by s is left unchanged by some operation recognised by \(A_{q_{1},q_{2}}\). We define \(A_{1} = (Q,\Gamma,I,F,\Delta^{\prime})\) with
$$\begin{array}{@{}rcl@{}} \Delta^{\prime} &=& \Delta\\ && \cup\ \{(q,T_{L_{A_{r,r^{\prime}}} \cap L_{A_{s,s^{\prime}}}},q^{\prime}) \mid (q,(r,s)),\ ((r^{\prime},s^{\prime}),q^{\prime}) \in \Delta \}\\ && \cup\ \left\{(q,T_{L_{A_{r,r^{\prime}}}},q^{\prime}) \mid \left( q,\text{copy}_{n}^{1},r\right),\ \left( r^{\prime},\overline{\text{copy}}_{n}^{1},q^{\prime}\right) \in \Delta \right\}. \end{array}$$
The idea of the construction is depicted in Fig. 5. Intuitively, we add, between every suitable pair of states, a shortcut that replaces a suboperation starting with a \(\text {copy}_{n}^{d}\) operation, ending with a \(\overline {\text {copy}}_{n}^{d}\) operation and never going above the node it is applied to (i.e. a bubble) by a test operation which accepts the same (n − 1)-stacks (indeed, such a suboperation cannot modify the stack tree it is applied to). However, in this step we do not formally remove these bubbles (that will be done in the next step).
 
We give a lemma which will allow us to prove that adding these test operations does not modify the relation recognised by the automaton.
Lemma 2
Let \(C = (Q,\Gamma,I,F,\Delta_{C})\) be an automaton, and i 1,i 2,f 1,f 2 four states of Q (not necessarily distinct). For short, we denote \(L_{C_{i_{k},f_{k}}}\) by L C,k , for k ∈{1,2}. We define the two automata \(B_{1} = (\{q_{1},q_{2}\},\Gamma,\{q_{1}\},\{q_{2}\},\Delta_{1})\) and \(B_{2} = (Q \cup \{q_{1},q_{2}\},\Gamma,\{q_{1}\},\{q_{2}\},\Delta_{2})\), where q 1 and q 2 are two distinct states not in Q,
$$\Delta_{1} = \{(q_{1},T_{L_{C,1} \cap L_{C,2}},q_{2})\},\hspace{1cm}\Delta_{2} = \{(q_{1},(i_{1},i_{2})),((f_{1},f_{2}),q_{2})\} \cup \Delta_{C}.$$
The relation recognised by B 1 is included in the relation recognised by B 2.
We also define the two automata \(B_{3} = (\{q_{1},q_{2}\},\Gamma,\{q_{1}\},\{q_{2}\},\Delta_{3})\) and \(B_{4} = (Q \cup \{q_{1},q_{2}\},\Gamma,\{q_{1}\},\{q_{2}\},\Delta_{4})\), where
$$\Delta_{3} = \{(q_{1},T_{L_{C,1}},q_{2})\},\hspace{1cm}\Delta_{4} = \left\{\left( q_{1},\text{copy}_{n}^{1},i_{1}\right),\left( f_{1},\overline{\text{copy}}_{n}^{1},q_{2}\right)\right\} \cup \Delta_{C}.$$
The relation recognised by B 3 is included in the relation recognised by B 4.
Proof
The two inclusions having very similar proofs, we only detail the first one and leave the second to the reader.
Take \(s \in L_{C,1} \cap L_{C,2}\). By definition there are \(F_{1} \in \mathcal {D}(C_{i_{1},f_{1}})\) and \(F_{2} \in \mathcal {D}(C_{i_{2},f_{2}})\) such that \((t_{s},t_{s}) \in r_{F_{1}} \cap r_{F_{2}}\), where t s is the stack tree with a unique node labelled by s. By extension, we get that any stack tree t with the stack s as the label of one of its leaves is such that \((t,t) \in r_{F_{1}} \cap r_{F_{2}}\). Conversely, by definition \(r_{T_{L_{C,1}\cap L_{C,2}}}\) only accepts pairs of the form (t,t) where one of the leaves of t is labelled by a stack in L C,1 ∩ L C,2. We consider the DAG \(D = D_{\text {copy}_{n}^{2}} \cdot _{1,1} (F_{1} \cdot _{1,1} (F_{2} \cdot _{1,2} D_{\overline {\text {copy}}_{n}^{2}}))\). By definition of its components, we get that (t,t) ∈ r D . It is easy to see that D is recognised by B 2.
Thus, for any stack tree such that \((t,t) \in r_{T_{L_{C,1}\cap L_{C_{2}}}}\), we can find D recognised by B 2 such that (t,t) ∈ r D , which concludes the proof. □
The converse inclusions are false in general. Indeed, in B 2 (resp. B 4), some operations that modify nodes above the one they are applied to may be recognised (e.g. an operation starting with \(\text {copy}_{n}^{d}\overline {\text {copy}}_{n}^{d}\overline {\text {copy}}_{n}^{d^{\prime }}\)). As such an operation cannot accept a tree with a single node, it is not captured by the test operation introduced in B 1. This will not be a problem in our construction, as we simply need that adding the test operations does not modify the recognised relation.
We have the following corollary as a consequence of this lemma.
Corollary 1
A 1 and A 2 recognise the same relation.
Indeed, A 1 is obtained from A by adding, between pairs of states of A, sub-automata of the form B 1 whose endpoints play the roles of the initial and final states of a corresponding automaton of the form B 2. As the lemma shows that the relation recognised by an automaton of the form B 1 is included in the one recognised by the corresponding automaton of the form B 2, we do not add anything to the relation recognised by A 1. As A is furthermore included in A 1, we get that the two automata recognise the same relation.
Step 2:
Suppose that A 1 = (Q,Γ,I,F,Δ) is the automaton obtained after step 1. We now want to actually forbid bubbles. To do so, we split the control states of the automaton into two parts: we create two copies of Q:
  • Q d which are target of no \(\text {copy}_{n}^{d}\) transition,
  • Q c which are source of no \(\overline {\text {copy}}_{n}^{d}\) transition.
 
We construct \(A_{2} = (Q^{\prime},\Gamma,I^{\prime},F^{\prime},\Delta^{\prime})\) with Q′ = {q d ,q c ∣ q ∈ Q}, I′ = {q d ,q c ∣ q ∈ I}, F′ = {q d ,q c ∣ q ∈ F} and
$$\begin{array}{@{}rcl@{}} \Delta^{\prime} =\!\!\!&& \left\{\left( q_{d},\theta,q^{\prime}_{d}\right),\left( q_{c},\theta,q^{\prime}_{c}\right)\mid (q,\theta,q^{\prime})\in \Delta, \theta \in Ops_{n-1} \cup \mathcal{T}_{n-1} \cup \{\text{id}\}\right\}\\ &&\!\!\!\cup \left\{\left( \left( q_{d},q^{\prime}_{d}\right),q^{\prime\prime}_{d}\right) \mid ((q,q^{\prime}),q^{\prime\prime}) \in \Delta\right\}\\ &&\!\!\!\cup \left\{\left( q_{d},\overline{\text{copy}}_{n}^{1},q^{\prime}_{d}\right) \mid \left( q,\overline{\text{copy}}_{n}^{1},q^{\prime}\right)\in\Delta\right\}\\ &&\!\!\!\cup \left\{\left( q_{c},\left( q^{\prime}_{c},q^{\prime\prime}_{c}\right)\right),\left( q_{d},\left( q^{\prime}_{c},q^{\prime\prime}_{c}\right)\right)\mid (q,(q^{\prime},q^{\prime\prime}))\in \Delta\right\}\\ &&\!\!\!\cup \left\{\left( q_{c},\text{copy}_{n}^{1},q^{\prime}_{c}\right),\left( q_{d},\text{copy}_{n}^{1},q^{\prime}_{c}\right) \mid \left( q,\text{copy}_{n}^{1},q^{\prime}\right)\in\Delta\right\}. \end{array} $$
Lemma 3
A 1 and A 2 recognise the same relation.
Proof
A 2 recognises the operations recognised by A 1 which contain no bubble. Indeed, every labelling of such an operation in A 1 can be modified to be a labelling in A 2 by replacing q by q d or q c . Conversely, each operation recognised by A 2 is recognised by A 1.
Let us take D recognised by A 1 which contains at least one bubble. Suppose that D contains a bubble F and that \(D = D^{\prime}[F]_{x}\), where D′ is a DAG such that we obtain D by replacing the node x of D′ by F, and F contains exactly one bubble (in other words, F is one of the innermost bubbles of D).
Let us suppose \(F = D_{\text {copy}_{n}^{2}} \cdot _{1,1} (F_{1} \cdot _{1,1} (F_{2} \cdot _{1,2} D_{\overline {\text {copy}}_{n}^{2}}))\), with F 1 and F 2 containing no tree operation (as F is an innermost bubble). We call x k the input node of F k and y k its output node. As D is accepted by A 1, we can find an accepting labelling ρ and states r,r′,s,s′ such that ρ(x 1) = r, ρ(y 1) = r′, ρ(x 2) = s, ρ(y 2) = s′. Thus, from step 1, \(G = D^{\prime }[T_{L_{A_{r,r^{\prime }}} \cap L_{A_{s,s^{\prime }}}}]{}_{x}\) is recognised by A 1. Then \(r_{D} \subseteq r_{G}\), and G has one less bubble than D (as F was an innermost bubble).
The case where we consider \(F = D_{\text {copy}_{n}^{1}} \cdot _{1,1} (F_{1} \cdot _{1,1} D_{\overline {\text {copy}}_{n}^{1}})\), where F 1 does not contain any tree operation is similar, and thus omitted.
Iterating this process, we obtain an operation H without any bubble such that \(r_{D} \subseteq r_{H}\) and H is recognised by A 1. As it contains no bubble, it is also recognised by A 2.
Thus every relation recognised by an operation with bubbles is already included in a relation recognised by an operation without bubbles. Therefore A 1 and A 2 recognise the same relation. □
We call the destructive part the restriction A 2,d of A 2 to Q d and the constructive part its restriction A 2,c to Q c . Notice that the two parts are separated by constructive operations (i.e. \(\text {copy}_{n}^{d}\) operations): the automaton labels all the nodes before the first \(\text {copy}_{n}^{d}\) operation with states of Q d , and all the nodes after it with states of Q c .
Step 3:
We consider an automaton A 2 = (Q,Γ,I,F,Δ) obtained after the previous step. We now want every sequence of stack operations to be reduced. To do that, we will, as explained before, invoke the result of Lemma 1, but first we have to ensure that between two reduced sequences of stack operations, there is at least one tree operation, as the concatenation of two reduced sequences of stack operations is not necessarily reduced.
 
We define A 3 = (Q′,Γ,I′,F′,Δ′) with:
$$\begin{array}{@{}rcl@{}} Q^{\prime} &=& \{q_{s},q_{t} \mid q \in Q\},\\ I^{\prime} &= & \{q_{s},q_{t} \mid q \in I\},\\ F^{\prime} &= & \{q_{s},q_{t} \mid q \in F\},\\ \Delta^{\prime} &= & \left\{\left( q_{s},\theta,q^{\prime}_{s}\right),\left( q_{s},\theta,q^{\prime}_{t}\right) \mid (q,\theta,q^{\prime}) \in \Delta, \theta \in Ops_{n-1}\cup \mathcal{T}_{n-1}\right\}\\ &&\cup \left\{\left( q_{t},\left( q^{\prime}_{t},q^{\prime\prime}_{t}\right)\right),\left( q_{t},\left( q^{\prime}_{s},q^{\prime\prime}_{t}\right)\right),\left( q_{t},\left( q^{\prime}_{t},q^{\prime\prime}_{s}\right)\right),\left( q_{t},\left( q^{\prime}_{s},q^{\prime\prime}_{s}\right)\right) \!\mid\! (q,(q^{\prime},q^{\prime\prime})) \!\in\! \Delta\right\}\\ &&\cup\left \{\left( \left( q_{t},q^{\prime}_{t}\right),q^{\prime\prime}_{t}\right),\left( \left( q_{t},q^{\prime}_{t}\right),q^{\prime\prime}_{s}\right)\mid ((q,q^{\prime}),q^{\prime\prime}) \in \Delta\right\}\\ &&\cup \left\{\left( q_{t},\text{copy}_{n}^{1},q^{\prime}_{t}\right),\left( q_{t},\text{copy}_{n}^{1},q^{\prime}_{s}\right)\mid\left( q,\text{copy}_{n}^{1},q^{\prime}\right) \in \Delta\right\}\\ & &\cup \left\{\left( q_{t},\overline{\text{copy}}_{n}^{1},q^{\prime}_{t}\right),\left( q_{t},\overline{\text{copy}}_{n}^{1},q^{\prime}_{s}\right)\mid \left( q,\overline{\text{copy}}_{n}^{1},q^{\prime}\right)\in \Delta\right\}. \end{array} $$
Observe that in this automaton, states of Q s are the sources of all stack transitions and of no other transitions, while states of Q t are the sources of all tree transitions and of no other transitions.
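Under the same illustrative encoding as in the step-2 sketch, the step-3 splitting can be sketched as follows: states tagged 's' may only fire stack transitions, states tagged 't' may only fire tree transitions, and each target nondeterministically guesses its kind. This is a hedged illustration under those encoding assumptions, not the construction verbatim.

def split_stack_vs_tree(delta, is_stack_op):
    """Step-3 sketch: sources of stack transitions are 's' states, sources of
    tree transitions are 't' states; each target guesses whether its next
    transition will be a stack or a tree transition."""
    s = lambda q: (q, 's')
    t = lambda q: (q, 't')
    new_delta = set()
    for tr in delta:
        if tr[0] == 'op' and is_stack_op(tr[2]):        # theta in Ops_{n-1} u T_{n-1}
            _, q, theta, q2 = tr
            new_delta |= {('op', s(q), theta, k(q2)) for k in (s, t)}
        elif tr[0] == 'copy2':                          # (q,(q',q''))
            _, q, q2, q3 = tr
            new_delta |= {('copy2', t(q), k1(q2), k2(q3)) for k1 in (s, t) for k2 in (s, t)}
        elif tr[0] == 'bar-copy2':                      # ((q,q'),q'')
            _, q, q2, q3 = tr
            new_delta |= {('bar-copy2', t(q), t(q2), k(q3)) for k in (s, t)}
        else:                                           # copy_n^1 and bar-copy_n^1
            _, q, theta, q2 = tr
            new_delta |= {('op', t(q), theta, k(q2)) for k in (s, t)}
    return new_delta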
Lemma 4
A 2 and A 3 recognise the same relation.
Proof
If an operation is accepted by A 3 with the labelling ρ, we construct a labelling ρ′ by ρ′(x) = q if ρ(x) = q s or ρ(x) = q t . By construction, ρ(x) is initial (resp. final) if and only if ρ′(x) is initial (resp. final), and ρ′ respects the transitions of A 2. Thus the operation is accepted by A 2.
Conversely, if an operation is accepted by A 2 with the labelling ρ, we construct a labelling ρ′: if ρ(x) = q, we take ρ′(x) = q s if there exist y and \(\theta \in Ops_{n-1} \cup \mathcal {T}_{n-1}\) such that \(x \xrightarrow {\theta } y\), and ρ′(x) = q t otherwise. By construction of the automaton, ρ′ respects the transitions of A 3, and the states q s and q t are initial (resp. final) if and only if q is initial (resp. final). Thus the operation is accepted by A 3. □
Step 4:
We consider an automaton A 3 = (Q,Γ,I,F,Δ) obtained after the previous step.
We now only have to replace every maximal sequence of stack operations lying between two tree transitions (or between an extremity of the operation and a tree transition, or between both extremities) with a reduced one, and we know that such a sequence can only be labelled with q s states, except for its last node, which is labelled with a q t state (for a final node this is not compulsory, but it is always possible). For each pair of states \((q_{s},q^{\prime }_{t})\) we consider the subautomaton \(A_{q_{s},q^{\prime }_{t}}\) with q s as only initial state, \(q^{\prime }_{t}\) as only final state, and which keeps only stack transitions. From Lemma 1, we consider the normalised automaton over stack operations \(A^{\prime }_{q_{s},q^{\prime }_{t}}\) which accepts the same language. Without loss of generality, we assume these automata to be distinguished, and to have disjoint sets of non-initial and non-final states. We also consider that \(A^{\prime }_{q_{s},q^{\prime }_{t}}\) has q s as unique initial state and \(q^{\prime }_{t}\) as unique final state.
 
We thus construct A 4 = (Q′,Γ,I′,F′,Δ′) with
$$\begin{array}{@{}rcl@{}} Q^{\prime} &= & Q \cup \bigcup\limits_{q_{s},q^{\prime}_{t} \in Q} Q_{A^{\prime}_{q_{s},q^{\prime}_{t}}},\hspace{1cm} I^{\prime} = I,\hspace{1cm}F^{\prime} = F,\\ \Delta^{\prime} &= & \Delta \backslash\{(q,\theta,q^{\prime})\in \Delta \mid \theta \in Ops_{n-1}\cup\mathcal{T}_{n-1}\} \cup \bigcup\limits_{q_{s},q^{\prime}_{t} \in Q} \Delta_{A^{\prime}_{q_{s},q^{\prime}_{t}}}. \end{array} $$
Informally, A 4 is simply A 3 in which we removed all stack transitions and we added between q s and \(q^{\prime }_{t}\) the normalised automaton accepting the same relation as \(A_{q_{s},q^{\prime }_{t}}\). Figure 6 shows the idea of this transformation.
Lemma 5
A 3 and A 4 recognise the same relation.
Proof
As A 4 is obtained by normalising the parts of A 3 containing only operations of \(Ops_{n-1}\cup \mathcal {T}_{n-1}\), the relation defined by the operations of \((Ops_{n-1}\cup \mathcal {T}_{n-1})^{*}\) recognised between two states q s and \(q^{\prime }_{t}\) is equal for A 3 and A 4. For every pair (t,t′) in the relation defined by A 3, there exists an operation 𝜃 recognised by A 3 such that (t,t′) ∈ r 𝜃 . As the rest of the automaton is unchanged, it is possible to modify 𝜃 by replacing every maximal subDAG labelled with states lying between a q s and a \(q^{\prime }_{t}\) by an equivalent operation of \((Ops_{n-1}\cup \mathcal {T}_{n-1})^{*}\) recognised by \(A^{\prime }_{q_{s},q^{\prime }_{t}}\) (hence belonging to Red), and thus obtain an operation 𝜃′ recognised by A 4 such that \((t,t^{\prime }) \in r_{\theta ^{\prime }}\). It follows that (t,t′) is in the relation recognised by A 4. The other direction is proved the same way. □
Step 5:
We now split the set of control states into two parts Q T and Q C such that all test transitions lead from Q C to Q T , while all other transitions lead from Q T to Q C .
To that end, we add a test transition which accepts all stacks, in order to serve as a link between two consecutive non-test transitions. Given the automaton A 4 = (Q,Γ,I,F,Δ) obtained from the previous step, we define A 5 = (Q′,Γ,I′,F′,Δ′) with
$$Q^{\prime}= \{q_{T},q_{C}\mid q\in Q\},\ I^{\prime}= \{q_{C} \mid q\in I\},\ F^{\prime}= \{q_{T},q_{C} \mid q \in F\}, $$
$$\begin{array}{@{}rcl@{}} \Delta^{\prime} &=& \left\{\left( q_{T},\theta,q^{\prime}_{C}\right) \mid (q,\theta,q^{\prime})\in \Delta, \theta \in Ops_{n-1} \cup \left\{\text{copy}_{n}^{1},\overline{\text{copy}}_{n}^{1}\right\}\right\}\\ && \cup \left\{\left( \left( q_{T},q^{\prime}_{T}\right),q^{\prime\prime}_{C}\right) \mid ((q,q^{\prime}),q^{\prime\prime})\in \Delta\right\}\\ && \cup \left\{ \left( q_{T},\left( q^{\prime}_{C},q^{\prime\prime}_{C}\right)\right) \mid (q,(q^{\prime},q^{\prime\prime})) \in \Delta\right\} \\ && \cup \left\{\left( q_{C},T_{L},q^{\prime}_{T}\right) \mid (q,T_{L},q^{\prime})\in \Delta\right\} \cup \{(q_{C},T_{\top},q_{T}) \mid q\in Q\}. \end{array} $$
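A last sketch, under the same illustrative encoding as before but with test transitions now written ('test', q, L, q′), shows the step-5 alternation between C-states and T-states and the insertion of the universal test; it is only a hedged illustration of the displayed Δ′, with all names assumed.

def split_tests(states, delta):
    """Step-5 sketch: test transitions go from C-states to T-states, all other
    transitions go from T-states to C-states, and a universal test 'T_top'
    links (q, 'C') to (q, 'T') for every state q."""
    T = lambda q: (q, 'T')
    C = lambda q: (q, 'C')
    new_delta = {('test', C(q), 'T_top', T(q)) for q in states}
    for tr in delta:
        if tr[0] == 'test':                  # (q, T_L, q')
            _, q, L, q2 = tr
            new_delta.add(('test', C(q), L, T(q2)))
        elif tr[0] == 'copy2':               # (q, (q', q''))
            _, q, q2, q3 = tr
            new_delta.add(('copy2', T(q), C(q2), C(q3)))
        elif tr[0] == 'bar-copy2':           # ((q, q'), q'')
            _, q, q2, q3 = tr
            new_delta.add(('bar-copy2', T(q), T(q2), C(q3)))
        else:                                # Ops_{n-1}, copy_n^1, bar-copy_n^1
            _, q, theta, q2 = tr
            new_delta.add(('op', T(q), theta, C(q2)))
    return new_delta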
 
Lemma 6
A 4 and A 5 recognise the same relation.
Proof
From step 4 it is not possible to have two successive test transitions. Suppose an operation D is accepted by A 4. We construct D′ by adding \(D_{T_{\top }}\) between every two consecutive non-test operations. For example we replace \(D_{\theta } \cdot _{1,1} D_{\text {copy}_{n}^{2}}\) with \(D_{\theta } \cdot _{1,1} D_{T_{\top }} \cdot _{1,1} D_{\text {copy}_{n}^{2}}\). By construction of the automaton, if there is a valid labelling for D, we can construct a valid labelling for D′ by replacing q with q C before a test and with q T otherwise (by construction, every node is touched by a test operation, or is an input or output node); the extremities of each added \(D_{T_{\top }}\) are labelled with q C and q T , where q was the label of the node joining the two operations it was inserted between. As we only added \(D_{T_{\top }}\) operations to D without changing the order of the other operations, and \(D_{T_{\top }}\) has no effect at all, D and D′ define the same relation.
Suppose an operation D is accepted by A 5. We construct D′ by removing every \(D_{T_{\top }}\) operation whose two ends are labelled by q C and q T for the same q. From a labelling ρ of D, we construct a labelling ρ′ by replacing q T and q C with q; the node resulting from the removal of a \(D_{T_{\top }}\) is labelled by q if the nodes of that \(D_{T_{\top }}\) were labelled by q C and q T . By construction, such a \(D_{T_{\top }}\) cannot be labelled otherwise, as we supposed A 4 has no such tests. Thus, by construction, ρ′ is a valid labelling of D′. As we only removed operations which have no effect, D and D′ define the same relation. □
Finally, we suppose that an automaton obtained by these steps is distinguished, i.e. initial states are targets of no transition and final states are sources of no transition. If not, we can distinguish it by a classical transformation (as in the case of word automata). Observe that A 5 satisfies all the requirements of Theorem 1. Thus A 5 is a normalised automaton with tests which recognises the same relation as the initial automaton A. In subsequent constructions, we will consider the subsets of states Q T ,Q C ,Q d ,Q c as defined in steps 5 and 2, and Q u,v = Q u ∩ Q v with u ∈{T,C} and v ∈{d,c}. □

4.4 Iteration of Ground Stack Tree Rewriting Systems

With normalised automata, we can now show that the iteration of any relation defined by a ground stack tree rewriting system or a ground stack tree transducer is defined by a ground stack tree transducer.
Proposition 5
Given a distinguished automaton A, there exists an automaton A′ that recognises \(\bigcup _{k \geq 0}\text {Op}(A)^{k}\).
Proof
We construct A′. We take Q′ = Q A ⊎{q}, I′ = I A ∪{q}, F′ = F A ∪{q}. The set of transitions Δ of A′ contains the transitions of A together with copies of each transition ending in a state of F A , modified to end in a state belonging to I A :
$$\begin{array}{@{}rcl@{}} \Delta &=& \Delta_{A}\\ &&\cup \{(q_{1},\theta,q_{i}) \mid q_{i} \in I_{A},\exists q_{f} \in F_{A}, (q_{1},\theta,q_{f}) \in \Delta_{A}\}\\ &&\cup \{((q_{1},q_{2}),q_{i}) \mid q_{i} \in I_{A}, \exists q_{f} \in F_{A}, ((q_{1},q_{2}),q_{f}) \in \Delta_{A}\}\\ &&\cup \{(q_{1},(q_{2},q_{i})) \mid q_{i} \in I_{A}, \exists q_{f} \in F_{A}, (q_{1},(q_{2},q_{f}))\in \Delta_{A}\}\\ &&\cup \{(q_{1},(q_{i},q_{2})) \mid q_{i} \in I_{A}, \exists q_{f} \in F_{A}, (q_{1},(q_{f},q_{2}))\in \Delta_{A}\}\\ &&\cup \left\{\left( q_{1},\left( q_{i},q^{\prime}_{i}\right)\right) \mid q_{i},q^{\prime}_{i} \in I_{A}, \exists q_{f},q^{\prime}_{f} \in F_{A}, \left( q_{1},\left( q_{f},q^{\prime}_{f}\right)\right)\in \Delta_{A}\right\}. \end{array} $$
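The following minimal Python sketch, using the same illustrative transition encoding as in the sketches of the previous section, shows the idea of this construction: every transition ending in a final state is duplicated with its target redirected to each initial state, so an accepted operation may restart immediately; the fresh state playing the role of q is created with object(). Names and encoding are assumptions made for the sketch only.

def iterate(states, initial, final, delta):
    """Sketch of Proposition 5: build A' from A by adding redirected copies of
    every transition that ends in a final state, plus a fresh state q that is
    both initial and final (accepting the empty operation)."""
    q_new = object()                      # fresh state, initial and final
    new_delta = set(delta)
    for t in delta:
        if t[0] == 'op':                                  # (q1, theta, qf)
            _, q1, theta, qf = t
            if qf in final:
                new_delta |= {('op', q1, theta, qi) for qi in initial}
        elif t[0] == 'bar-copy2':                         # ((q1, q2), qf)
            _, q1, q2, qf = t
            if qf in final:
                new_delta |= {('bar-copy2', q1, q2, qi) for qi in initial}
        elif t[0] == 'copy2':                             # (q1, (q2, q3))
            _, q1, q2, q3 = t
            if q3 in final:
                new_delta |= {('copy2', q1, q2, qi) for qi in initial}
            if q2 in final:
                new_delta |= {('copy2', q1, qi, q3) for qi in initial}
            if q2 in final and q3 in final:
                new_delta |= {('copy2', q1, qi, qj) for qi in initial for qj in initial}
    return (states | {q_new}, initial | {q_new}, final | {q_new}, new_delta)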
We first prove, for every \(k \in \mathbb {N}\), that if D ∈Op(A) k , it has a valid labelling in A′: the operation □ has a valid labelling because q is initial and final, so the claim holds for Op(A) 0. If it holds for Op(A) k , we take an operation G in Op(A) k+1 and decompose it into D ∈ Op(A) ℓ and F ∈ Op(A) m , with ℓ,m ≤ k and ℓ + m = k + 1, such that G is obtained by concatenating D and F. The labelling which is the union of some valid labellings for D and F, and labels the identified nodes with the labelling of F (initial states), is valid in A′.
We now consider an operation D = (V,E) accepted by A′ with the labelling ρ. We first cut D into a collection of connected DAGs D 1,⋯ ,D k by cutting D at the nodes labelled by states of I A . We call D′ the disconnected DAG which is the union of all the D i . Formally, we define D′ = (V′,E′) with V′ = V ∪{(x,1) ∣ ρ(x) ∈ I A }, and E′ = {(x,a,y) ∣ (x,a,y) ∈ E ∧ ρ(y) ∉ I A }∪{(x,a,(y,1)) ∣ (x,a,y) ∈ E ∧ ρ(y) ∈ I A }. We also remove the DAGs which have only one node (i.e. those which were created by duplicating an input node of D). Observe that we can obtain D by concatenating D 1,⋯ ,D k together.
By definition of D′, each D i can be labelled by ρ ∪{(x,1) → ρ(x)} while respecting the transitions of A′, in such a way that all input nodes are labelled by I A , all output nodes are labelled by I A or F A , and no other node is labelled by I A or F A (otherwise it would have been cut, as we only cut at nodes labelled by I A , and F A can only label output nodes since A is distinguished). By definition of A′, we can modify the labels of the output nodes of D i labelled by a state of I A , relabelling them with states of F A , so that this modified labelling is a valid labelling of A. Thus all D i are accepted by A.
We have shown that D can be obtained by concatenating DAGs recognised by A. We now have to prove that the D i are operations.
We say that a DAG is well-formed if for every nodes x,y with an edge between x and y, the subDAG containing x, y, the successors of x, and the predecessors of y, and the edges between them, is a DAG representing a basic operation. From Definition 2, all operations are well-formed. Indeed an operation is inductively formed by concatenating basic operations between output and input nodes of already existing operations. Thus if a node is already the input node of a basic operation, by Definition 2, it will never be appended to the input node of another basic operation in the construction of an operation as it is not the output node of an operation (similarly for output nodes of basic operations). However, the converse is not true, as for example the DAG \(D_{\text {copy}_{n}^{2}}\cdot _{2,1}D_{\overline {\text {copy}}_{n}^{2}}\) is well-formed but not an operation.
As the D i are obtained by cutting the operation D so that for every node we either conserve all its predecessors (resp. successors) or drop them, they are well-formed. To conclude, we only have to prove that if a well-formed DAG is accepted by a normalised automaton, it is an operation.
Lemma 7
A connected well-formed DAG that can be labelled by a normalised automaton is an operation.
Proof
We consider a normalised automaton A = (Q,Γ,I,F,Δ), and we consider the set of constructive states Q c and of destructive states Q d as in Theorem 1. We prove that a connected well-formed DAG D that can be labelled by A by a labelling ρ is an operation by induction on the number of its nodes. We furthermore prove that if D has an input node x with ρ(x) ∈ Q c , then x is the only input node of D, and that if it has an output node y with ρ(y) ∈ Q d , then y is the only output node of D.
The only DAG with 1 node is □. It can be labelled by A (by any state), and by definition, it is an operation. Furthermore, its only node is the only input node and the only output node, thus we can label it by Q c or Q d and satisfy the induction hypothesis.
Suppose now that any connected DAG with at most n nodes that can be labelled by A satisfies the induction hypothesis. We take a connected well-formed DAG D with n + 1 nodes that can be labelled by A. We call ρ this labelling. We consider the leftmost input node x of D, and call q = ρ(x). As ρ is consistent with the transitions of A and D is well-formed, we have the following cases for the successors of x:
  • x has a unique successor y and we have the edge (x,𝜃,y) with \(\theta \in Ops_{n-1} \cup \left \{\text {copy}_{n}^{1},\overline {\text {copy}}_{n}^{1}\right \}\). As ρ is consistent with the transitions of A, we have a transition (q,𝜃,q′) in A and ρ(y) = q′. Thus, if we consider D′ obtained by removing x from D, we obtain a DAG with n nodes that can be labelled by A (by ρ restricted to it). By hypothesis of induction, D′ is an operation. Moreover, as D is well-formed, y has only x as a predecessor, and is thus the leftmost input node of D′. The subDAG of D obtained by keeping only x and y is D 𝜃 . Thus, we have D = D 𝜃 ⋅ 1,1 D′. From Definition 2, D is an operation. Furthermore, if ρ(x) ∈ Q c , then as there is no transition from Q c to Q d , ρ(y) ∈ Q c . Thus, by hypothesis of induction, y is the only input node of D′, and hence x is the only input node of D. If D has an output node z labelled by Q d , then as x is not an output node, z is an output node of D′. By hypothesis of induction it is its only output node. Thus D has a unique output node.
  • x has a unique successor y and we have the edge \((x,\bar {1},y)\). As ρ is consistent with the transitions of A, there exists a transition ((q,q′),q″) in A, ρ(y) = q″ and there is a node z such that \((z,\bar {2},y) \in E\) and ρ(z) = q′. Furthermore, as A is normalised, q,q′ ∈ Q d . As D is well-formed, y has no predecessors other than x and z, and z has no other successor than y. We consider the DAG D′ obtained by removing x and the edge \((z,\bar {2},y)\) from D. As we removed a basic operation, D′ is well-formed, it can be labelled by A (by restricting ρ to it), y is an input node, and z an output node. If D′ were connected, as z is labelled by Q d , it would be the only output node of D′, by hypothesis of induction. Thus, there would be a path from y to z, which, together with the edge from z to y, contradicts the fact that D is a DAG. Thus, D′ is a disconnected DAG. As D is a connected DAG, D′ is the disjoint union of two connected DAGs: D y having y as its leftmost input node (otherwise, x would not be the leftmost input node of D) and D z having z as an output node. These DAGs are connected, well-formed, can be labelled by A and have less than n vertices; therefore, by hypothesis of induction, they are operations. Furthermore, as z is labelled by Q d , it is the only output node of D z . If we restrict D to x, y and z, we obtain the DAG \(D_{\overline {\text {copy}}_{n}^{2}}\). Thus we have \(D = (D_{z} \cdot _{1,2} D_{\overline {\text {copy}}_{n}^{2}}) \cdot _{1,1} D_{y}\). Therefore, it is an operation. Furthermore, if D has an output node labelled by Q d , as z is the only output node of D z , it is an output node of D y . By hypothesis of induction, D y thus has a unique output node, and thus D has a unique output node. As there is no transition from Q c to Q d , D z cannot have any input node labelled by Q c . If D y has an input node labelled by Q c , by hypothesis of induction, it has a single input node, which is y. Thus, as y is not an input node in D, D has no input node labelled by Q c .
  • x has two successors y and z such that (x,1,y) ∈ E and (x,2,z) ∈ E. As ρ is consistent with the transitions of A, there exists a transition (q,(q′,q″)) in A, ρ(y) = q′ and ρ(z) = q″. Furthermore, as A is normalised, q′,q″ ∈ Q c . We consider D′ the DAG obtained by removing x from D. As D is well-formed, y and z have no other predecessor in D. Thus y and z are input nodes of D′, both labelled by a state of Q c . As D is well-formed and we obtained D′ by removing a basic operation, D′ is well-formed. Furthermore, it can be labelled by A (by restricting ρ to it). If D′ were connected, then by hypothesis of induction, it would have a single input node, which is not the case. Therefore, D′ is not connected. As D is connected, D′ is therefore the disjoint union of two connected DAGs: D y having y as an input node and D z having z as an input node. D y and D z are two connected well-formed DAGs with less than n nodes, that can be labelled by A and which have an input node labelled by Q c . By hypothesis of induction, they are operations with a single input node. If we restrict D to x, y and z, we obtain the DAG \(D_{\text {copy}_{n}^{2}}\). Thus we have \(D = (D_{\text {copy}_{n}^{2}} \cdot _{2,1} D_{z}) \cdot _{1,1} D_{y}\). By Definition 2, D is therefore an operation. Furthermore, D has a unique input node: x. As there is no transition from Q c to Q d and ρ is consistent with the transitions of A, no node of D other than x is labelled with Q d , and in particular, no output node is. Thus, the induction hypothesis is satisfied.
As we supposed that x is the leftmost input node, there is no y such that \((x,\bar {2},y) \in E\), otherwise, there would be a z such that \((z,\bar {1},y)\in E\), and thus we could find an input node at the left of x.
Finally, we get that any well-formed DAG labelled by A is an operation. □
Thus, every D i is an operation accepted by A, and therefore D ∈Op(A) k .
Finally, we get \(\text {Op}(A^{\prime }) = \bigcup _{k \geq 0}\text {Op}(A)^{k}\), and thus A′ recognises \(\bigcup _{k \geq 0}\text {Op}(A)^{k}\). □
Proposition 6
Given R a (finite or recognisable) ground stack tree rewriting system, there exists A an operation automaton such that for all stack trees t, t′, \(t\xrightarrow [R]{*} t^{\prime } \Leftrightarrow (t,t^{\prime }) \in \mathcal {R}(A)\).
Given Λ a ground stack tree transducer, there exists A an operation automaton such that for all stack trees t, t′, \(t\xrightarrow [\Lambda ]{*} t^{\prime } \Leftrightarrow (t,t^{\prime }) \in \mathcal {R}(A)\).
Proof
This proposition is a consequence of the closure properties defined earlier and the previous proposition. Indeed, every finite set of operations is recognised by an automaton, and recognisable sets of operations are closed under union. Thus, for every (finite or recognisable) ground stack tree rewriting system R, there exists A an operation automaton such that for all stack trees t,t′, \(t \xrightarrow [R]{} t^{\prime } \Leftrightarrow (t,t^{\prime }) \in r_{A}\). Similarly, for every ground stack tree transducer Λ, there exists A an operation automaton such that \(t\xrightarrow [\Lambda ]{}t^{\prime } \Leftrightarrow (t,t^{\prime })\in \mathcal {R}(A)\). By Theorem 1, there thus exists a normalised and distinguished automaton A r such that \(r_{A_{r}} = r_{A}\). By Proposition 5, there exists A′ an operation automaton such that \(\text {Op}(A^{\prime }) = \bigcup _{k \geq 0}\text {Op}(A_{r})^{k}\).
Let us now show that this implies \(\mathcal {R}(A^{\prime }) = \mathcal {R}(A)^{*} = r_{A}^{*}\). The last equality is straightforward, because, as \(\mathcal {R}(A)\) is the relation obtained by the disjoint application of operations recognised by A, we get \(r_{A} \subseteq \mathcal {R}(A)\) and as the iteration of the relation allows disjoint application of operations recognised by the automaton, we get \(\mathcal {R}(A) \subseteq r_{A}^{*}\).
Let us start by showing the proposition for ground stack tree rewriting systems. We show by induction on the length of the run in R that \(t \xrightarrow [R]{*} t^{\prime } \Rightarrow (t,t^{\prime }) \in \mathcal {R}(A^{\prime })\). If t = t′, the proposition is trivial, as by definition, every pair (t,t) is in the relation recognised by A′, since the empty DAG is recognised by A′. Suppose now that \(t \xrightarrow {\ell } t^{\prime }\), with ℓ ≥ 1. We consider t″ such that \(t \xrightarrow [R]{} t^{\prime \prime }\) and \(t^{\prime \prime } \xrightarrow [R]{\ell -1} t^{\prime }\). By hypothesis of induction, we have \((t^{\prime \prime },t^{\prime }) \in \mathcal {R}(A^{\prime })\), and thus by definition, there exist D = (D 1 ,⋯ ,D |D| ) ∈ Op(A′) and \(\mathbf {i} = (i_{1},\cdots ,i_{|\mathbf {D}|}) \in \mathbb {N}^{*}\) such that \((t^{\prime \prime },t^{\prime }) \in r_{\mathbf {D}}^{\mathbf {i}}\), as well as D ∈ Op(A r ) and \(j\in \mathbb {N}\) such that \((t,t^{\prime \prime }) \in r_{D}^{j}\). For the sake of simplicity, we suppose that i 1 < ⋯ < i |D| . We now have two possible cases:
  • D is applied separately from the elements of D. As D is in Op(A r ), it is also in Op(A′). Thus, \((t,t^{\prime }) \in r_{\mathbf {D^{\prime }}}^{\mathbf {i^{\prime }}}\) with D′ = (D 1 ,⋯ ,D k ,D,D k+1 ,⋯ ,D |D| ) and \(\mathbf {i^{\prime }} = (i_{1},\cdots ,i_{k},j,i^{\prime }_{k+1},\cdots ,i^{\prime }_{|\mathbf {D}|})\), with i k < j < i k+1 , and for every ℓ > k, \(i^{\prime }_{\ell } = i_{\ell } + |I_{D}| - |O_{D}|\) (to adapt the positions at which the DAGs are applied, as the number of leaves may have been modified by the application of D to t).
  • D is applied to leaves produced by D. Suppose that the operations \(D_{j_{1}},\cdots ,D_{j_{\ell }}\) are applied to leaves produced by D (where j 1,⋯ ,j are consecutive numbers). By definition of the application, there are integers k 1,⋯ ,k ,h 1,⋯ ,h such that \(r_{D}^{j} \circ r_{D_{j_{\ell }}}^{i_{j_{\ell }}} \circ {\cdots } \circ r_{D_{j_{1}}}^{i_{j_{1}}} = r_{D^{\prime }}^{\min (j,j_{1})}\), with \(D^{\prime } = (\cdots (D\cdot _{k_{\ell },h_{\ell }} D_{j_{\ell }}) {\cdots } D_{j_{2}}) \cdot _{k_{1},h_{1}} D_{j_{1}}\). It is easy to see that D is an operation. Informally, we append to D the operations applied after it. The \(\min (j,i_{j_{1}})\) is present to ensure we apply the obtained operation at the right place, i.e. j if the leftmost input vertex of D is a vertex of D, or \(i_{j_{1}}\) if it is a vertex of \(D_{j_{1}}\). By construction, D is in Op(A), thus it is in Op(A ), and thus the vector D obtained by replacing all the \(D_{j_{k}}\) with D is such that \(r_{\mathbf {D^{\prime }}}^{\mathbf {i^{\prime }}} = r_{D}^{j} \circ r_{\mathbf {D}}^{\mathbf {i}}\), where i is the vector obtained by replacing all the \(i_{j_{k}}\) with \(\min (j,i_{j_{1}})\) and adding |I D |−|O D | to the indices after \(i_{j_{l}}\).
Thus, in every case, we have (t,t′) ∈R(A′).
The other direction is obtained similarly, by showing that every vector of operations recognised by A′ can be decomposed into operations of A r applied at the positions defined by the vector i. As we have \(\text {Op}(A^{\prime }) = \bigcup _{k \geq 0}\text {Op}(A_{r})^{k}\), this decomposition is always possible.
Thus, R(A′) is the transitive closure of the relation defined by R.
For a ground stack tree transducer, the proof is similar, differing only in that the operation D is replaced with a vector of operations. The details are more tedious but the idea is unchanged. □
A consequence of this proposition is that the set of stack trees reachable in a GSTRS or a GSTT from a recognisable set of stack trees is recognisable.
Proposition 7
Given R a GSTRS or a GSTT and L a recognisable set of stack trees, the set \(L^{\prime } = \left \{t\mid \exists t_{0}\in L, t_{0} \xrightarrow [R]{*}t\right \}\) is a recognisable set of stack trees.
This property is obtained by appending the automaton recognising the transitive closure of the relation defined by the GSTRS (or the GSTT) to the automaton recognising the operations producing L from a given stack tree, as in the previous proof. It is then sufficient to observe that the resulting automaton recognises the set of operations producing L′ from a given stack tree.

5 Rewriting Graphs of Stack Trees

We have now the elements to prove our main technical result: the decidability of the \(\mathsf {FO}[\xrightarrow {*}]\)-theory of graphs generated by GSTRS. We first formally define the graphs we study. We will consider the restriction of these graphs to sets of vertices reachable from a recognisable set of vertices, as we want to have the decidability of the reachability predicate. Notice that although we will be able to express the reachability predicate, it will not be possible to express that a stack tree is in a recognisable set.
Given R a GSTRS, we define \(\mathcal {G}_{R} = (V,E)\) its associated graph with:
  • V = S T n (Γ),
  • E = {(t,a,t′) ∣ ∃𝜃 ∈ R a ,(t,t′) ∈ r 𝜃 }.
We consider the restriction of \(\mathcal {G}_{R}\) to the set of vertices reachable in R from a recognisable set of vertices L by replacing V with \(\{t \mid \exists t_{0} \in L, t_{0} \xrightarrow [R]{*} t\}\). The graph obtained that way is denoted \(\mathcal {G}_{R,L}\). We denote by GSTR n the set of the restrictions to reachable sets of vertices of graphs generated by finite GSTRS of order n, and RGSTR n those generated by recognisable GSTRS of order n.
Given Λ a GSTT, we define \(\mathcal {G}_{\Lambda } = (V,E)\) its associated graph with:
  • V = S T n (Γ).
  • \(E = \{(t,a,t^{\prime }) \mid (t,t^{\prime }) \in \mathcal {R}(\mathcal {A}_{a})\}\).
As previously, we consider the restriction of this graph to a reachable set of vertices in Λ from a recognisable set of vertices. The graph obtained that way is denoted \(\mathcal {G}_{\Lambda ,L}\). We denote by GSTT n the set of the restrictions to reachable sets of vertices of graphs generated by GSTT of order n.
From the previous section, the transitive closure of the relation defined by a GSTRS is defined by an automaton, as well as the transitive closure of the relation defined by a GSTT. Thus, the transitive closure of a graph of GSTR n is a graph of GSTT n , and this is also true for graphs of RGSTR n and of GSTT n .
Theorem 2
Graphs of GSTR n , RGSTR n and of GSTT n have a decidable \(\mathsf {FO}[\xrightarrow {*}]\) -theory.
We prove the theorem for the logic \(\mathsf {FO}[\xrightarrow {*}]\). Notice that we could, without modifying the proof much, add a predicate \(\xrightarrow {B^{*}}\) which is the restriction of the reachability predicate to a smaller alphabet B. We could as well fix an initial stack tree, thus allowing us to express whether a given stack tree belongs to a recognisable set (in the case of recognisable GSTRS or GSTT). Let us remark as well that it is impossible to add control states to GSTRS or to add a predicate stating that a stack tree is reachable by a word from a fixed regular language (as both are already undecidable on GTRS). It is likewise impossible to consider higher-order stacks with collapse links, as the graphs generated by these stacks do not have a decidable WMSO-theory, which is crucial in our proof.
To prove this theorem, we use a remark of Colcombet and Löding in [11] stating that a graph obtained by a finite set interpretation (FSI) from a graph with a decidable MSO-theory has a decidable FO-theory. We use that remark by exhibiting a finite set interpretation of a graph of GSTR n (resp. RGSTR n , GSTT n ), in which we added the reachability relation, in a graph defined by an n-pushdown automaton, namely the graph obtained by the application of n operations of treegraph to a given finite graph. Thus, Theorem 2 is a direct consequence of the following proposition.
In this proposition, we use the graph \(\Delta _{\Gamma \times \{\varepsilon ,11,21,22\}}^{n} = (V,E)\) with V = Stacks n (Γ ×{ε,11,21,22}) and E = {(x,𝜃,y) ∣ 𝜃 ∈ Ops n (Γ ×{ε,11,21,22}), y is obtained by applying 𝜃 to x}. This graph is the configuration graph of the n-pushdown automaton with one state q and the transition (q,𝜃,q) for every basic operation 𝜃. From [7], this graph has a decidable MSO-theory.
We recall that a finite set interpretation \(\mathcal {I}\) of a graph \(\mathcal {G}^{\prime } = (V^{\prime },E^{\prime })\) in a graph \(\mathcal {G} = (V,E)\) is a tuple of MSO-formulæ \((\delta ,\phi _{a_{1}}, \cdots ,\phi _{a_{k}})\) where δ is a formula with a single free order-2 variable such that X ∈ V′ if and only if \(\mathcal {G} \models \delta (X)\), and for every letter a i labelling an edge of \(\mathcal {G}^{\prime }\), \(\phi _{a_{i}}\) is a formula with two free order-2 variables such that \(X \xrightarrow [\mathcal {G}^{\prime }]{a_{i}} Y\) if and only if \(\mathcal {G} \models \phi _{a_{i}}(X,Y)\).
Proposition 8
Given R a recognisable GSTRS of order n, and L a recognisable set of n-stack trees, there exists a finite set interpretation \((\delta ,\phi _{a_{1}},\cdots ,\phi _{a_{k}},\phi _{\tau })\) of the graph \(\mathcal {G}_{R,L}\) in which we added the reachability relation of R, in \(\Delta _{\Gamma \times \{\varepsilon ,11,21,22\}}^{n}\).
Given Λ an order n GSTT and L a recognisable set of n-stack trees, there exists a finite set interpretation \((\delta ,\phi _{a_{1}},\cdots ,\phi _{a_{k}},\phi _{\tau })\) of the graph \(\mathcal {G}_{\Lambda ,L}\) in which we added the reachability relation of Λ, in \(\Delta _{\Gamma \times \{\varepsilon ,11,21,22\}}^{n}\).
To prove this proposition, we first define a coding of an n-stack tree of S T n (Γ) as a set of n-stacks of Stacks n (Γ ×{ε,11,21,22}), namely the set of paths from the root of the tree to its leaves, in which the position of each node along the path is encoded with the labels 11, 21, 22. We then define three MSO-formulæ over \(\Delta _{\Gamma \times \{\varepsilon ,11,21,22\}}^{n}\): δ(X), which holds if X is a valid coding of an n-stack tree, and, for every automaton A, ψ A (X,Y ) and ϕ A (X,Y ), the first holding if there is an operation D recognised by A such that (t,t′) ∈ r D , where t and t′ are the trees coded by X and Y, and the second holding if \((t,t^{\prime }) \in \mathcal {R}(A)\). These formulæ require the automaton A to be a normalised automaton produced by the proof of Theorem 1, as we will make precise in the proof. The finite set interpretation proving the proposition for the graph \(\mathcal {G}_{R,L}\) is
$$((\delta(X)\wedge \exists Y, \psi_{B}([\alpha_{1}]{}_{n},Y) \wedge \phi_{C}(Y,X)),\psi_{A_{a_{1}}}(X,Y),\ldots, \psi_{A_{a_{k}}}(X,Y),\phi_{C}(X,Y)), $$
where B is the automaton recognising the set of operations generating L from the n-stack tree with one node labelled with [α 1] n−1, and C is the automaton recognising the reachability relation of R.
The finite set interpretation proving the proposition for the graph \(\mathcal {G}_{\Lambda ,L}\) is
$$((\delta(X)\wedge \exists Y, \psi_{B}([\alpha_{1}]{}_{n},Y) \wedge \phi_{C}(Y,X)),\phi_{A_{a_{1}}}(X,Y),\ldots, \phi_{A_{a_{k}}}(X,Y),\phi_{C}(X,Y)), $$
where B is the automaton recognising the set of operations generating L from the n-stack tree with one node labelled with [α 1] n−1, and C is the automaton recognising the reachability relation of Λ.
Let us start by defining the coding of an n-stack tree of S T n (Γ) as a set of n-stacks of S t a c k s n (Γ ×{ε,11,21,22}). For the sake of simplicity, in this section, the pair (α,ε) will be denoted by α.
Definition 10
Given t an n-stack tree and u a position in dom(t), we define
$$\text{Code}(t,u) = [\mu_{t,u,1}(t(\varepsilon)) \mu_{t,u,2}(t(u_{\leq 1})) {\ldots} \mu_{t,u,|u|}(t(u_{\leq |u|-1}))t(u)]{}_{n}, $$
where for every (n − 1)-stack p, every n-stack tree t, every node u of t and every i ≤|u|, μ t,u,i (p) is the (n − 1)-stack p′ such that \((p,p^{\prime }) \in r_{\text {rew}_{\alpha }{(\alpha ,j u_{i})}}\), where α is the topmost element of p in Γ, j is the number of children of the node u 1 ⋯u i−1 of t, and u i is the i th letter of u. An n-stack tree t is thus coded by the finite set of n-stacks X t = {Code(t,u) ∣ u ∈ fr(t)}, i.e. the set of codes of its leaves.
Informally, Code(t,u) is the n-stack read along the path from the root of t to u, in which we add to the label of each node the number of its children and the index of the child leading to u.
Example 1
The coding of the n-stack tree t depicted in Fig. 1 is
$$\begin{array}{@{}rcl@{}} X_{t} &=& \{[[[\alpha\alpha]{}_{1}[\beta\alpha(\beta,21)]{}_{1}]{}_{2} [[ \alpha\alpha]{}_{1}[\alpha\alpha(\alpha,11)]{}_{1}]{}_{2}[[\alpha\beta]{}_{1}]{}_{2}]{}_{3},\\ &&\,\,\, [[[\alpha\alpha]{}_{1}[\beta\alpha(\beta,22)]{}_{1}]{}_{2} [[\alpha\alpha]{}_{1}[\alpha]{}_{1}[(\beta,21)]{}_{1}]{}_{2}[[\beta\alpha]{}_{1}][ \beta\alpha]{}_{1} [\beta]{}_{1}]{}_{2}]{}_{3} ,\\ &&\,\,\,[[[\alpha\alpha]{}_{1}[\beta\alpha(\beta,22)]{}_{1}]{}_{2} [[\alpha\alpha]{}_{1}[\alpha]{}_{1}[(\beta,22)]{}_{1}]{}_{2}[[\alpha\beta\beta]{}_{1}[ \alpha\beta]{}_{1}]{}_{2}]{}_{3}\}. \end{array} $$
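As an illustration of Definition 10, here is a minimal Python sketch of Code(t,u), assuming a stack tree is given as a dict from positions (tuples over {1,2}, the root being ()) to node labels, and flattening each (n − 1)-stack to a list whose last element is its topmost letter; this flattening and all names are simplifying assumptions, not the paper's definitions.

def children(tree, pos):
    """Number of children of the node at position pos."""
    return sum(1 for d in (1, 2) if pos + (d,) in tree)

def mu(tree, u, i):
    """mu_{t,u,i}: rewrite the topmost letter alpha of the label of the node
    u_1...u_{i-1} into (alpha, j u_i), where j is that node's number of
    children and u_i the i-th letter of u."""
    prefix = u[:i - 1]
    stack = list(tree[prefix])
    j, d = children(tree, prefix), u[i - 1]
    stack[-1] = (stack[-1], '%d%d' % (j, d))
    return stack

def code(tree, u):
    """Code(t,u): the n-stack read along the path from the root to leaf u."""
    return [mu(tree, u, i) for i in range(1, len(u) + 1)] + [list(tree[u])]

# The coding of the whole tree is then the set of codes of its leaves:
# X_t = { code(t, u) | u leaf of t }.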
We now detail the formulæ defining the finite set interpretations. After giving useful technical formulæ, we present the formulæ δ, ϕ A , and finally ψ A .

5.1 Notations and Technical Formulæ

We define technical formulæ over \(\Delta _{\Gamma \times \{\varepsilon ,11,21,22\}}^{n}\) which are useful to define the set of stacks used to represent stack trees as sets of vertices of \(\Delta _{\Gamma \times \{\varepsilon ,11,21,22\}}^{n}\). To avoid long notations, in this section, we will shortcut O p s n−1(Γ ×{ε}) by O p s n−1(Γ).
For α,β ∈ Γ ×{ε,11,21,22}, we define \(\psi _{\text {rew}_{\alpha ,\beta }}(x,y) = x \xrightarrow {\text {rew}_{\alpha ,\beta }} y\). For every k < n, we define \(\psi _{\text {copy}_{k}}(x,y) = x \xrightarrow {\text {copy}_{k}} y\) and \(\psi _{\overline {\text {copy}}_{k}}(x,y) = y \xrightarrow {\text {copy}_{k}} x\). Finally, for k ∈{1,2} and dk, we define \(\psi _{\text {copy}_{n}^{k},d}(x,y) = \exists z_{1},z_{2}, \bigvee _{\alpha \in \Gamma } \left (x \xrightarrow {\text {rew}_{\alpha }{(\alpha ,\mathit {kd})}} z_{1} \wedge z_{1} \xrightarrow {\text {copy}_{n}} z_{2} \wedge z_{2} \xrightarrow {\text {rew}_{(\alpha ,\mathit {kd})}{\alpha }} y\right )\). Finally, for \(T_{L} \in \mathcal {T}_{n-1}\), we define \(\psi _{T_{L}}(x,y) = (x = y) \wedge \psi ^{\prime }_{L}(x)\), where \(\psi ^{\prime }_{L}(x)\) is a MSO-formula such that \(\psi ^{\prime }_{L}(x)\) holds if and only if the topmost (n − 1)-stack of x is in L. From Proposition 2.6 of [5], such a formula exists for any regular set L of (n − 1)-stacks.
For 𝜃O p s n−1(Γ), ψ 𝜃 (x,y) is satisfied if y is obtained by applying 𝜃 to x. \(\psi _{\text {copy}_{n}^{k},d}(x,y)\) is satisfied if y is obtained by adding kd to the topmost letter of x, then copying its topmost (n − 1)-stack, and finally removing kd of its topmost letter. \(\psi _{\overline {\text {copy}}_{n}^{k},d}(x,y)\) is satisfied if \(\psi _{\text {copy}_{n}^{k},d}(y,x)\) is satisfied. These formulæ allow us to simulate the application of a stack tree operation to a stack tree over the elements of their coding.
We now give a formula checking that a given n-stack y is obtained from an n-stack x by using only the previous formulæ,
$$\begin{array}{@{}rcl@{}} &&\text{Reach}(x,y) = \forall X, \left( \left( x \in X \land \forall z,z^{\prime}, \left( z \in X \land \left( \bigvee\limits_{\theta \in Ops_{n-1}(\Gamma) \cup \mathcal{T}_{n-1}} \psi_{\theta}(z,z^{\prime})\right.\right.\right.\right.\\ &&\qquad\qquad\qquad\qquad\qquad\left.\left.\left.\left. \vee \bigvee\limits_{k \in \{1,2\}} \bigvee\limits_{d \leq k} \psi_{\text{copy}_{n}^{k},d}(z,z^{\prime})\!\right)\!\right) \Rightarrow z^{\prime} \in X\right) \Rightarrow y\! \in X\right). \end{array} $$
This formula is true if for every set of n-stacks X, if x is in X and X is closed under the relations defined by ψ 𝜃 and \(\psi _{\text {copy}_{n}^{k},d}\), then y is in X.
Lemma 8
For every n-stacks x = [x 1x m ] n and \(y = [y_{1} {\ldots } y_{m^{\prime }}]{}_{n}\) , Reach(x,y) is satisfied if and only if
  • For every i < m,y i = x i ,
  • For every i ∈ [m,m′ − 1], there is 𝜃 ∈ Ops n−1 (Γ), α ∈ Γ and h ∈{11,21,22} such that \((x_{m},y_{i}) \in r_{\text {rew}_{\alpha }{(\alpha ,h)}} \circ r_{\theta }\),
  • There exists 𝜃 ∈ Ops n−1(Γ) such that \((x_{m},y_{m^{\prime }}) \in r_{\theta }\)
Proof
Suppose that Reach(x,y) is satisfied. Then we can obtain y by applying to x a sequence of operations of Ops n−1 (Γ) or of the form \(\text {rew}_{\alpha }{(\alpha ,h)}\, \text {copy}_{n}\, \text {rew}_{(\alpha ,h)}{\alpha }\), with h ∈{11,21,22}. Thus y is of the form given in the lemma.
Suppose that y has the form given in the lemma. Then y can be obtained by applying to x a sequence of operations of Ops n−1 (Γ) or operations of the form \(\text {rew}_{\alpha }{(\alpha ,h)}\, \text {copy}_{n}\, \text {rew}_{(\alpha ,h)}{\alpha }\), with h ∈{11,21,22}, which proves that Reach(x,y) is satisfied. □
Corollary 2
For every n-stack x and α ∈ Γ, Reach([α] n ,x) is satisfied if and only if there exists an n-stack tree t and u one of its leaves such that x = Code(t,u).
Proof
Suppose there exists an n-stack tree t and u one of its leaves such that x = Code(t,u). Then
$$x = [ \mu_{t,u,1}(t(\varepsilon)), \mu_{t,u,2}(t(u_{\leq 1})),\cdots, \mu_{t,u,|u|}(t(u_{\leq |u|-1})), t(u)]{}_{n}$$
As for every j, t(u j ) is in S t a c k s n−1(Γ), there exists ρ j O p s n−1(Γ) such that ρ j ([α] n−1,t(u j )). Thus, by the previous lemma, Reach([α] n−1,x) is satisfied.
Conversely, suppose Reach([α] n−1,x) is satisfied. From Lemma 8, for every j < |x|, there exists 𝜃O p s n−1(Γ), β j ∈ Γ and h j ∈{11,21,22} such that
$$([\alpha]{}_{n-1}, x_{j}) \in r_{\text{rew}_{\beta_{j}}{(\beta_{j},h_{j})}} \circ r_{\theta} ,$$
and there exists 𝜃O p s n−1(Γ) such that ([α] n−1,x |x|) ∈ r 𝜃 .
For every j, we consider that h j = k j d j . We choose a tree domain U such that d 1d |x|−1U. We define t a tree of domain U such that for every j, every node d 1d j has k j children, \((t(d_{1}{\cdots } d_{j-1}),x_{j}) \in r_{\text {rew}_{\beta _{j}}{(\beta _{j},h_{j})}}\), and for every uU which is not a d 1d j , t(u) = [α] n−1. We thus have x = Code(t,d 1d |x|−1). □

5.2 The Formula δ

We now define δ(X) = OnlyLeaves(X) ∧TreeDom(X) ∧UniqueLabel(X) with
$$\text{OnlyLeaves}(X)= \forall x, x\in X \Rightarrow \text{Reach}([\alpha]{}_{n}, x), $$
$$\begin{array}{@{}rcl@{}} \text{TreeDom}(X) &=& \forall x,y,z ((x \in X \land \psi_{\text{copy}_{n}^{2},2}(y,z) \land \text{Reach}(z,x)) \\ &&\Rightarrow\exists r,z^{\prime} (r \in X \land \psi_{\text{copy}_{n}^{2},1}(y,z^{\prime}) \land \text{Reach}(z^{\prime},r))) \\ &&\wedge((x \in X \land \psi_{\text{copy}_{n}^{2},1}(y,z) \land \text{Reach}(z,x)) \\ &&\Rightarrow\exists r,z^{\prime} (r \in X \land \psi_{\text{copy}_{n}^{2},2}(y,z^{\prime}) \land \text{Reach}(z^{\prime},r))),\\ \text{UniqueLabel}(X)&=&\forall x,y, (x \neq y \land x \in X \land y \in X) \\ & &\Rightarrow(\exists z,z^{\prime},z^{\prime\prime}, \psi_{\text{copy}_{n}^{2},1}(z,z^{\prime}) \land \psi_{\text{copy}_{n}^{2},2}(z,z^{\prime\prime}) \land((\text{Reach}(z^{\prime},x) \\ &&\land\text{Reach}(z^{\prime\prime},y)) \vee (\text{Reach}(z^{\prime\prime},x) \land \text{Reach}(z^{\prime},y)))), \end{array} $$
where α is a fixed letter of Γ.
Formula OnlyLeaves ensures that every element x in X encodes a node in some stack tree. TreeDom ensures that the prefix closure of the set of words u such that there exists a tree t with Code(t,u) ∈ X is a tree domain, that the set of words k 0 ⋯k m−1 , where for every j, k j is the arity of the node u 1 ⋯u j of t, is included in this domain, and that there is no j for which k 1 ⋯k j−1 (k j + 1) is included in this domain (in other words, the arity announced by the k j is respected). Finally, UniqueLabel ensures that for any two elements x = Code(t,u) and y = Code(t′,v) of X, there exists an index j ∈{1,⋯ ,min(|u|,|v|)} such that for every i < j, u i = v i , t(u ≤i ) = t′(u ≤i ) and the node u ≤i has the same arity in t and t′, and u j ≠ v j ; i.e. for every pair of elements, the (n − 1)-stacks labelling their common ancestors are equal, and x and y cannot code the same leaf (because u ≠ v). Moreover, this prevents x from coding a node on the path from the root to y.
Lemma 9
For every X finite subset of Stacks n (Γ ×{ε,11,21,22}), δ(X) ⇔ ∃t ∈ S T n , X = X t .
Proof
We first show that for every n-stack tree t, δ(X t ) holds over \(\Delta _{\Gamma \times \{\varepsilon ,11,21,22\}}^{n}\). By definition, for every x ∈ X t , ∃u ∈ fr(t), x = Code(t,u), and then Reach([α] n ,x) holds (by Corollary 2). Thus OnlyLeaves holds.
Let us take xX t such that x = Code(t,u) with u = u 0u i 2u i+2u |u|. As t is a tree, u 0u i 2 ∈ d o m(t) and so is u 0u i 1. Then, there exists v ∈fr(t) such that ∀ji,v j = u j , v i+1 = 1, and Code(t,v) ∈ X t . Let us now take xX t such that x = Code(t,u) with u = u 0u i 1u i+2u |u| and the node u 0u i has 2 children in t, then u 0u i 2 is in d o m(t) and there exists vf r(t) such that ∀ji,v j = u j , v i+1 = 2 and Code(t,v) ∈ X t . Thus TreeDom holds.
Let x and y be two distinct elements of X t , with x = Code(t,u) and y = Code(t,v), and let i be the smallest index such that u i ≠ v i . Suppose that u i = 1 and v i = 2 (the other case is symmetric). We let z = Code(t,u 0 ⋯u i−1 ), and take z′ and z″ such that \(\psi _{\text {copy}_{n}^{2},1}(z,z^{\prime })\) and \(\psi _{\text {copy}_{n}^{2},2}(z,z^{\prime \prime })\). We then have Reach(z′,x) and Reach(z″,y), and thus UniqueLabel holds. Therefore, for every stack tree t, δ(X t ) holds.
Let us now show that for every XS t a c k s n (Γ ×{ε,11,21,22}) such that δ(X) holds, there exists tS T n , such that X = X t . As OnlyLeaves holds, for every x = [x 1x |x|] n X, there exists t x an n-stack tree and \(u^{x} = {u^{x}_{1}} \cdots u^{x}_{|u^{x}|}\) a node of t x such that x = Code(t x ,u x ). We define the set of words \(U = \{u \mid \exists x \in X, u \sqsubseteq u^{x}\}\). By definition, U is closed under prefix. As TreeDom holds, for all u, if u2 is in U, then u1 is in U as well. Therefore U is the domain of a tree. Moreover, if there is an x such that \(u1 \sqsubseteq u^{x}\) and the node u has 2 children in t x , then TreeDom ensures that there is y such that \(u2 \sqsubseteq u^{y}\) and thus u2 ∈ U. As UniqueLabel holds, for every x and y two distinct elements of X, there exists j such that for all k < j we have \({u^{x}_{k}}= {u^{y}_{k}}\), and \({u^{x}_{j}} \neq {u^{y}_{j}}\). Then, for all kj, we have x k = y k and thus the node \({u^{x}_{1}}{\cdots } {u^{x}_{k}}\) has as many children and the same label in t x and in t y . Thus, for every uU, we can define σ u such that for every x such that \(u \sqsubseteq u^{x}\), x |u| = σ u , and the number of children of each node is consistent with the coding.
Consider the tree t of domain U such that for all uU, t(u) = σ u . We have X = X t , which concludes the proof. □

5.3 The Formula ϕ A Associated with a Normalised Automaton Satisfying Theorem 1

First, let us recall that from Theorem 1 we can construct from any automaton an equivalent automaton satisfying the conditions of the theorem. In this subsection, we consider that A is such a normalised automaton.
Let us now explain ϕ A (X,Y ), which can be written as
$$\begin{array}{@{}rcl@{}} \phi_{A}(X,Y) &=& \exists Z_{q_{1}}{\cdots} Z_{q_{|Q|}}, \phi_{A}^{\prime}(X,Y,\mathbf{Z}), \text{ with} \\ \phi_{A}^{\prime}(X,Y,\mathbf{Z}) &=& \text{Init}(X,Y,\mathbf{Z}) \wedge \text{WellFormed}(\mathbf{Z}) \\ &&\wedge\text{Trans}(\mathbf{Z}) \wedge \text{RTrans}(\mathbf{Z}) \wedge \text{NoCycle}(\mathbf{Z}) \end{array} $$
We detail each of the subformulæ below:
$$\text{Init}(X,Y,\mathbf{Z})\,=\, \left( \bigcup_{q_{i} \in I} Z_{q_{i}}\!\right) \!\subseteq\! X \land \left( \bigcup_{q_{i} \in F} Z_{q_{i}}\!\right) \!\subseteq\! Y \land X \setminus\left( \bigcup_{q_{i} \in I} Z_{q_{i}}\!\right) \,=\, Y \setminus \left( \bigcup_{q_{i} \in F} Z_{q_{i}}\!\right). $$
This formula is here to ensure that only leaves of X are labelled by initial states, only leaves of Y are labelled by final states and outside of their labelled leaves, X and Y are equal (i.e. not modified).
$$\text{Trans}(\mathbf{Z}) \,=\, \forall s, \bigwedge\limits_{q\in Q} \left( (s\in Z_{q}) \!\Rightarrow\! \left( \bigvee\limits_{K \in \Delta \mid q \in \text{left}(K)} \exists t,t^{\prime}, \text{Trans}_{K}(s,t,t^{\prime},\mathbf{Z}) \vee \rho_{q} \right) \right), $$
where ρ q is true if and only if q is a final state, q ∈left(K) means that q appears in the left member of K (as in the cases below), and
$$\begin{array}{@{}rcl@{}} \text{Trans}_{(q,\theta,q^{\prime})}(s,t,t^{\prime},\mathbf{Z}) &=& (t=t^{\prime}) \wedge \psi_{\theta}(s,t) \wedge s\in Z_{q} \wedge t \in Z_{q^{\prime}}, \\ \text{Trans}_{(q,(q^{\prime},q^{\prime\prime}))}(s,t,t^{\prime},\mathbf{Z}) &=& \psi_{\text{copy}_{n}^{2},1}(s,t) \land \psi_{\text{copy}_{n}^{2},2}(s,t^{\prime}) \land s\in Z_{q}\\ &&\land t \in Z_{q^{\prime}} \land t^{\prime} \in Z_{q^{\prime\prime}}, \\ \text{Trans}_{((q,q^{\prime}),q^{\prime\prime})}(s,t,t^{\prime},\mathbf{Z}) &=& (t=t^{\prime}) \land \psi_{\overline{\text{copy}}_{n}^{2},1}(s,t) \land s \in Z_{q} \land t \in Z_{q^{\prime\prime}}, \\ \text{Trans}_{((q^{\prime},q),q^{\prime\prime})}(s,t,t^{\prime},\mathbf{Z}) &=& (t=t^{\prime}) \land \psi_{\overline{\text{copy}}_{n}^{2},2}(s,t) \land s \in Z_{q} \land t \in Z_{q^{\prime\prime}}. \end{array} $$
This formula ensures that the labelling respects the rules of the automaton in the downward direction, and that for every stack s labelled by q, if there is a rule starting by q, there is a labelled stack s which is the result of s by at least one of those rules. And also that it is possible for a stack labelled by a final state to have no successor labelled by a state of the automaton (and as the automaton is distinguished, it cannot have a successor).
$$\text{RTrans}(\mathbf{Z}) \,=\, \forall s, \bigwedge\limits_{q\in Q} \!\left( \!(s \in Z_{q}) \!\Rightarrow\! \left( \bigvee\limits_{K \in \Delta \mid q \in \text{right}(K)} \!\exists t,t^{\prime}, \text{RTrans}_{K}(s,t,t^{\prime},\mathbf{Z}) \!\vee\! \sigma_{q} \!\right) \right) $$
where σ q is true if and only if q is an initial state, q ∈right(K) means that q appears in the right member of K (as in the cases below), and
$$\begin{array}{@{}rcl@{}} \text{RTrans}_{(q^{\prime},\theta,q)}(s,t,t^{\prime},\mathbf{Z}) &=& (t=t^{\prime}) \wedge \psi_{\theta}(t,s) \wedge s\in Z_{q^{\prime}} \wedge t \in Z_{q}, \\ \text{RTrans}_{(q^{\prime},(q,q^{\prime\prime}))}(s,t,t^{\prime},\mathbf{Z}) &=& (t=t^{\prime}) \wedge \psi_{\text{copy}_{n}^{2},1}(t,s) \land s\in Z_{q^{\prime}} \land t \in Z_{q}, \\ \text{RTrans}_{(q^{\prime},(q^{\prime\prime},q))}(s,t,t^{\prime},\mathbf{Z}) &=& (t=t^{\prime}) \wedge \psi_{\text{copy}_{n}^{2},2}(t,s) \land s\in Z_{q} \land t \in Z_{q^{\prime}}, \\ \text{RTrans}_{((q^{\prime},q^{\prime\prime}),q)}(s,t,t^{\prime},\mathbf{Z}) &=& \psi_{\overline{\text{copy}}_{n}^{2},1}(t,s) \land \psi_{\overline{\text{copy}}_{n}^{2},2}(t^{\prime},s) \land s \in Z_{q}\\ &&\land t \in Z_{q^{\prime}} \land t \in Z_{q^{\prime\prime}} \end{array} $$
This formula ensures that the labelling respects the rules of the automaton in the upward direction, and that for every stack s labelled by q, if there is a rule ending by q, there is a labelled stack s such that s is the result of s by at least one of those rules. And also that it is possible for a stack labelled by an initial state to have no predecessor labelled by a state of the automaton (and as the automaton is distinguished, it cannot have a predecessor).
$$\begin{array}{@{}rcl@{}} \text{WellFormed}(\mathbf{Z}) \!\!\!\!&&= \forall s,t_{1},t_{2},t_{3},t_{4},\\ &&\bigwedge\limits_{\begin{array}{c}K \neq K^{\prime} \in \Delta \\q \in \text{left}(K) \cap \text{left}(K^{\prime})\end{array}} \neg (\text{Trans}_{K}(s,t_{1},t_{2},\mathbf{Z}) \wedge \text{Trans}_{K^{\prime}}(s,t_{3},t_{4},\mathbf{Z}))\wedge\\ && \bigwedge\limits_{\begin{array}{c}K \neq K^{\prime} \in \Delta \\q \in \text{right}(K) \cap \text{right}(K^{\prime})\end{array}} \neg (\text{RTrans}_{K}(s,t_{1},t_{2},\mathbf{Z}) \wedge \text{RTrans}_{K^{\prime}}(s,t_{3},t_{4},\mathbf{Z})) \\ &&\wedge \bigwedge\limits_{q,q^{\prime},q^{\prime\prime}} \forall s,t, [ \text{RTrans}_{(q,(q^{\prime},q^{\prime\prime}))}(t,s,s,\mathbf{Z}) \Rightarrow \exists t^{\prime}, \text{Trans}_{(q,(q^{\prime},q^{\prime\prime})}(s,t,t^{\prime},\mathbf{Z})]\\ &&\wedge [ \text{Trans}_{((q,q^{\prime\prime}),q^{\prime})}(s,t,t,\mathbf{Z}) \Rightarrow \exists t^{\prime}, \text{RTrans}_{((q,q^{\prime\prime}),q^{\prime})}(t,s,t^{\prime},\mathbf{Z})]\\ &&\wedge [ \text{Trans}_{((q^{\prime\prime},q),q^{\prime})}(s,t,t,\mathbf{Z}) \Rightarrow \exists t^{\prime}, \text{RTrans}_{((q^{\prime\prime},q),q^{\prime})}(t,t^{\prime},s,\mathbf{Z})] \end{array} $$
This formula ensures that you never use two different transitions to label stacks from a single stack labelled with a state q (in direct and reverse direction). This forces that whenever the automaton has a non-deterministic choice, it only takes one. It furthermore ensures that it is impossible to have two nodes linked by a \(\text {copy}_{n}^{2}\) or \(\overline {\text {copy}}_{n}^{2}\) transition labelled without having the third node linked with that transition labelled.
$$\begin{array}{@{}rcl@{}} \text{NoCycle}(\mathbf{Z}) &=& \forall s, \bigvee\limits_{q \in Q} (s \in Z_{q}) \\ &&\Rightarrow\forall X, \left( s \in X \wedge \forall s^{\prime},t,t^{\prime}, \left( s^{\prime} \in X \wedge \left( \bigvee\limits_{K} \text{Trans}_{K}(s^{\prime},t,t^{\prime},\mathbf{Z}) \right)\right)\right. \\ &&\Rightarrow \left.\left( t \in X \wedge t^{\prime} \in X \right) {\vphantom{\bigvee\limits_{K}}}\right) \Rightarrow \bigcup\limits_{q \in F} Z_{q} \cap X \neq \emptyset \\ &&\wedge \left( s \in X \wedge \forall s^{\prime},t,t^{\prime}, \left( s^{\prime} \in X \wedge \left( \bigvee\limits_{K} \text{RTrans}_{K}(s^{\prime},t,t^{\prime},\mathbf{Z})\right)\right) \right.\\ &&\Rightarrow \left.\left( t \in X \wedge t^{\prime} \in X\right) {\vphantom{\bigvee\limits_{K}}}\right) \Rightarrow \bigcup\limits_{q \in I} Z_{q} \cap X \neq \emptyset \end{array} $$
This formula ensures that for any labelled stack, we can find a path of labelled stacks related by the transitions of the automaton which starts in an initial state, and another which ends in a final state.
Proposition 9
Given s, t two stack trees, ϕ A (X s ,X t ) holds if and only if there exists a tuple of operations D = (D 1,⋯ ,D k ) recognised by A and a tuple of positions i = (i 1,⋯ ,i k ) such that \((s,t) \in r_{\mathbf {D}}^{\mathbf {i}}\).
Proof
First, suppose there exist such D and i. We construct a labelling of S t a c k s n (Γ ×{ε,11,21,22}) which satisfies ϕ A (X s ,X t ). We take an accepting labelling ρ of the D i by A. We will label the S t a c k s n according to this labelling. The idea is to cut the application of a D i to s basic operation by basic operation and label the codes of the leaves appearing during this process by the label of the output node(s) of the basic operation considered. For example, if we consider a suboperation D 𝜃 whose output node is y, and we obtain a tree t after applying D 𝜃 at the position i to a tree t , we label Code(t ,u i ) by ρ(y), where u i is the i th leaf of t . Notice that this does not depend on the order we apply the D j to s nor the order of the leaves we choose to apply the operations first.
We suppose that \((s,t) \in r_{\mathbf {D}}^{\mathbf {i}}\). Given a node x of an D j , we call ρ(x) its labelling.
We define the (D 1,i 1,s 1),⋯ ,(D k ,i k ,s k ) labelling of S t a c k s n (Γ ×{ε,11,21,22}) inductively. Let us precise that in this labelling, a single stack may have several labels.
  • The labelling is the empty labelling.
  • The (D 1,i 1,s 1),⋯ ,(D k ,i k ,s k ) labelling is the union of the (D 1,i 1,s 1) labelling and the (D 2,i 2,s 2),⋯ ,(D k ,i k ,s k ) labelling.
  • The (□,i,s) labelling is {(Code(s,u i ),ρ(x))}, where u i is the i t h leaf of s and x is the unique node of □.
  • The ((F 11,1 D 𝜃 ) ⋅1,1 F 2,i,s) labelling is the (F 1,i,s),(F 2,i,s ) labelling, with \((s,s^{\prime }) \in r_{F_{1}}^{i}\circ r_{\theta }^{i} \).
  • The \(((((F_{1} \cdot _{1,1} D_{\text {copy}_{n}^{2}}) \cdot _{2,1} F_{3}) \cdot _{1,1} F_{2}),i,s)\) labelling is the (F 1,i,s),(F 2,i,s ),(F 3,i + 1,s ) labelling, with \((s,s^{\prime }) \in r_{F_{1}}^{i}\circ r_{\text {copy}_{n}^{2}}^{i}\).
  • The \(\left (\left (F_{1} \cdot _{1,1} \left (F_{2} \cdot _{2,1} \overline {\text {copy}}_{n}^{2}\right )\right ) \cdot _{1,1} F_{3},i,s\right )\) labelling is the \((F_{1},i,s),(F_{2},i+|I_{F_{1}}|,s),\) (F 3,i,s ) labelling, with \((s,s^{\prime }) \in r_{F_{1}}^{i}\circ r_{F_{2}}^{i+1} \circ r_{\overline {\text {copy}}_{n}^{2}}^{i}\).
Observe that this process terminates, as the sum of the edges and the nodes of all the DAGs strictly diminishes at every step.
We take Z the (D 1,i 1,s),⋯ ,(D k ,i k ,s) labelling of S t a c k s n (Γ ×{ε,11,21,22}).
Lemma 10
The previously defined labelling Z satisfies \(\phi ^{\prime }_{A}(X_{s},X_{t},\mathbf {Z})\).
Proof
Let us first state a technical lemma which follows directly from the definition of the labelling:
Lemma 11
Given D a reduced operation, ρ D an accepting labelling of D, t a stack tree, \(i\in \mathbb {N}\) and 1 ≤ j ≤|I D |, one of the labels of Code(t,u i + j−1) (where u i is the i th leaf of t) in the (D,i,t) labelling is ρ D (x j ) (where x j is the j th input node of D).
Furthermore, if t is a stack tree such that \((t,t^{\prime }) \in r_{D}^{i}\) , and 1 ≤ j ≤|O D |, one of the labels of Code(t ,u i + j−1) is ρ D (y j ) (where y j is the j th output node of D).
Finally if q labels a stack s in the (D,i,t) labelling, then there is a node x of D such that ρ D (x) = q.
For the sake of simplicity, let us consider for this proof that D is a unique reduced operation D (if it is a tuple of reduced operations, the proof is the same for every operation).
Init First, let us prove that Init is satisfied. From the previous lemma, all nodes of X s are labelled with the labels of input nodes of D (or not labelled), thus they are labelled by initial states (as we considered an accepting labelling of D). Furthermore, as the automaton is distinguished, only these ones can be labelled by initial states. Similarly, the nodes of X t , and only them are labelled by final states (or not labelled).
Trans and RTrans We show by induction on the size of the operation that Trans and RTrans are satisfied. For the sake of this proof, we will consider that ρ q (resp. σ q ) are replaced by predicates that are true if and only if the nodes considered are output (resp. input). As Init is satisfied, for an operation accepted by the automaton, ρ q will only be true for output nodes and σ q for input nodes.
For D = □, the formulæ are true by vacuity.
Suppose now that the formulæ are true for any operation of at most n nodes. Consider D with n + 1 nodes and suppose D = D 11,1 D 𝜃 1,1 D 2. We call x the output node of D 1 and y the input node of D 2. We take an integer i and a stack tree t and consider the (D,i,t) labelling. By definition it is the (D 1,i,t)(D 2,i,t ) labelling with t is a stack tree such that \((t,t^{\prime }) \in r_{D_{1}}^{i} \circ r_{D_{\theta }}^{i}\). We call t a stack tree such that \((t,t^{\prime \prime }) \in r_{D_{1}}^{i}\). By definition, we thus have \((t^{\prime \prime },t^{\prime }) \in r_{D_{\theta }}^{i}\). By Lemma 11, we have s 1 = Code(t ,u i ) labelled by ρ(x) and s 2 = Code(t ,u i ) labelled by ρ(y). As furthermore, we have (s 1,s 2) ∈ r 𝜃 , Trans(ρ(x),𝜃,ρ(y))(s 1,s 2,s 2,Z) and RTrans(ρ(x),𝜃,ρ(y))(s 2,s 1,s 1) are true. Thus Trans and RTrans are true for the (D,i,t) labelling, as by hypothesis of induction they hold for all the nodes of the (D 1,i,t) and (D 2,i,t ) labellings, x is the only output node of D 1 and y the only input node of D 2.
The other cases for the decomposition of D are similar, and thus omitted. Thus, Trans and RTrans are satisfied for any (D,i,t) labelling.
WellFormed Let us now show that the labelling satisfies WellFormed. We prove it by induction. We furthermore show that if a stack s is labelled by an input node x, there is no K ∈ Δ with ρ(x) ∈ right(K) and no t,t′ such that RTrans_K(s,t,t′,Z) holds, and that if a stack s is labelled by an output node x, there is no K ∈ Δ with ρ(x) ∈ left(K) and no t,t′ such that Trans_K(s,t,t′,Z) holds.
If D = □ or \(D = D_{T_{L}}\), the result is immediate.
Suppose that \(D = (D_{1} \cdot_{1,1} (D_{\text{copy}_{n}^{2}} \cdot_{2,1} D_{3})) \cdot_{1,1} D_{2}\). By induction, WellFormed holds for the (D_1,i,t), (D_2,i,t′) and (D_3,i+1,t′) labellings with \((t,t^{\prime }) \in r_{D_{1}}^{i} \circ r_{\text{copy}_{n}^{2}}^{i}\), and we consider t″ such that \((t^{\prime \prime },t^{\prime }) \in r_{\text{copy}_{n}^{2}}^{i}\). We call x the only output node of D_1, y and z the only input nodes of D_2 and D_3, s_1 = Code(t″,u_i), s_2 = Code(t′,u_i) and s_3 = Code(t′,u_{i+1}). First, observe that we have Trans_{(ρ(x),(ρ(y),ρ(z)))}(s_1,s_2,s_3,Z), RTrans_{(ρ(x),(ρ(y),ρ(z)))}(s_2,s_1,s_1,Z) and RTrans_{(ρ(x),(ρ(y),ρ(z)))}(s_3,s_1,s_1,Z), by definition of the labelling. By the induction hypothesis, there are no stacks s,s′ and no transition K with ρ(x) ∈ left(K) (resp. ρ(y) ∈ right(K), ρ(z) ∈ right(K)) such that Trans_K(s_1,s,s′,Z) (resp. RTrans_K(s_2,s,s′,Z), RTrans_K(s_3,s,s′,Z)) holds in the (D_1,i,t) labelling (resp. the (D_2,i,t′) or (D_3,i+1,t′) labellings). Suppose there are stacks s,s′,s″ labelled in the (D,i,t) labelling such that Trans_K(s,s′,s″,Z) holds and such that s,s′,s″ are not all labelled in the same D_i. As A is normalised, all stacks labelled in D_2 and D_3 are labelled with states of Q_c, and have at least one (n−1)-stack more than the stacks labelled with states of Q_c in D_1. Moreover, as s_2 is obtained as the left copy of s_1 and s_3 as the right copy of s_1, it is not possible to produce stacks labelled in D_2 from stacks labelled in D_3 without first producing s_1 (and vice versa), and as A is normalised, this is impossible. Thus, if all three of s,s′,s″ are labelled in Q_c, then s is labelled in D_1 and K is a \(\text{copy}_{n}^{d}\) transition. If s is labelled in Q_d, then the only possibility for K is also to be a \(\text{copy}_{n}^{d}\) transition. Suppose it is a \(\text{copy}_{n}^{1}\) transition; then s′ = s″. Suppose s′ is labelled in D_2 (the case of D_3 being symmetric). It is thus obtained from s by a \(\text{copy}_{n}^{1}\) operation. But it is also obtained from s_2, which has more (n−1)-stacks than s and is obtained from s_1 as the left copy of a \(\text{copy}_{n}^{2}\). Thus s_2 has at least as many (n−1)-stacks as s′, and to obtain s′ from s_2 we must first cancel that binary copy with a \(\overline{\text{copy}}_{n}^{2}\), which is impossible as A is normalised. Thus K is a \(\text{copy}_{n}^{2}\) transition, s is labelled in D_1, s′ in D_2 and s″ in D_3. There are thus θ_1, θ_2, θ_3 such that \((s,s_{1}) \in r_{\theta_{1}}\), \((s_{2},s^{\prime }) \in r_{\theta_{2}}\) and \((s_{3},s^{\prime \prime }) \in r_{\theta_{3}}\). Thus \(D_{\theta_{1}} \cdot_{1,1} ((D_{\text{copy}_{n}^{2}} \cdot_{2,1} D_{\theta_{3}}) \cdot_{1,1} D_{\theta_{2}})\) is equivalent to \(D_{\text{copy}_{n}^{2}}\). Thus, as A is normalised, θ_1, θ_2 and θ_3 can only be single tests or empty operations, but as there is no state that can be at the left (resp. right) of both a test and a non-test transition, we get that θ_1, θ_2 and θ_3 are empty operations, and thus that s = s_1, s′ = s_2, s″ = s_3 and K = (ρ(x),(ρ(y),ρ(z))). Similarly, if RTrans_K(s,s′,s″) holds for s,s′,s″ not all labelled in the same DAG, then K = (ρ(x),(ρ(y),ρ(z))), s = s_2 or s = s_3, and s′ = s″ = s_1. Observe as well that, as we added a full transition, we did not add any x,y such that there is a K with RTrans_K(x,y,y,Z) but no z such that Trans_K(y,z,x) or Trans_K(y,x,z) (or the converse). Thus WellFormed holds for the (D,i,t) labelling.
Finally, as the input nodes of D were input nodes of D_1 (resp. the output nodes of D were output nodes of D_2 and D_3), and we did not add any incoming edge (resp. outgoing edge) to them, the same reasoning applies to show that no RTrans_K (resp. Trans_K) holds with them as first variable, and thus the induction hypothesis is maintained.
For \(D = (D_{2} \cdot_{1,2}(D_{1} \cdot_{1,1} D_{\overline{\text{copy}}_{n}^{2}}))\cdot_{1,1} D_{3}\), the case is similar (by exchanging the roles of \(\text{copy}_{n}^{d}\) and \(\overline{\text{copy}}_{n}^{d}\), and so on).
Suppose that D = D_1 ·_{1,1} D_θ ·_{1,1} D_2, and call x the output node of D_1 and y the input node of D_2. We take t″ such that \((t,t^{\prime \prime }) \in r_{D_{1}}^{i}\) and t′ such that \((t^{\prime \prime },t^{\prime }) \in r_{\theta }^{i}\), and call s_1 = Code(t″,u_i) and s_2 = Code(t′,u_i). By definition of the labelling, Trans_{(ρ(x),θ,ρ(y))}(s_1,s_2,s_2,Z) and RTrans_{(ρ(x),θ,ρ(y))}(s_2,s_1,s_1,Z) hold. By the induction hypothesis, WellFormed holds for the (D_1,i,t) and (D_2,i,t′) labellings, and there are no K and no s,s′ labelled in the (D_1,i,t) labelling (resp. the (D_2,i,t′) labelling) such that Trans_K(s_1,s,s′,Z) holds (resp. RTrans_K(s_2,s,s′,Z)). Suppose there are stacks s,s′,s″ labelled in the (D,i,t) labelling such that Trans_K(s,s′,s″,Z) holds and s,s′,s″ are not all labelled in the same D_i. Suppose that s is labelled by z_2 in D_2 and s′ by z_1 in D_1. Then there is a path from z_1 to z_2 labelled by a word w which contains θ. But as from ρ(z_2) it is possible to take K, it is thus possible to label a DAG containing w followed by K, which is a non-single-test word that can leave s unchanged. As A is normalised, this is impossible. Thus, s is labelled by z_1 in D_1 and s′,s″ by z_2 and z_3 in D_2. Call θ′ the operation of K. There is a path from z_1 to z_2 labelled by an operation word w that contains θ, and such that \(r_{w} = r_{\theta^{\prime }}\). We thus have w = w_1 θ w_2 for some operation words w_1,w_2. Suppose θ′ is a \(\text{copy}_{n}^{d}\) operation. Then s′ has exactly one (n−1)-stack more than s and their topmost (n−1)-stacks are equal. As θ is a non-test and non-tree operation, it must be inverted in w to get s′ from s. Thus w contains a non-single-test operation containing the identity, which is impossible as A is normalised. If θ′ is a \(\overline{\text{copy}}_{n}^{d}\) operation, we get a similar contradiction. Thus θ′ is a stack operation. Suppose w contains tree operations. As A is normalised, one can only have \(\overline{\text{copy}}_{n}^{d}\) operations followed by the same number of \(\text{copy}_{n}^{d}\) operations, as θ′ does not modify the number of (n−1)-stacks. Thus ρ(z_1) is in Q_d and ρ(z_2) is in Q_c. But the only transitions between Q_d and Q_c are labelled with \(\text{copy}_{n}^{d}\) operations, so this is impossible. Thus w does not contain any tree operation and is thus in Red (as A is normalised). If θ ≠ θ′ and θ = copy_i (resp. \(\overline{\text{copy}}_{i}\)) or θ′ = copy_j (resp. \(\overline{\text{copy}}_{j}\)), we get a contradiction similar to the \(\text{copy}_{n}^{d}\) case. If θ = θ′ = copy_i, similarly to the above case for D, we get that w_1 and w_2 can only be empty operations, thus z_1 = x, z_2 = y, and K = (ρ(x),θ,ρ(y)). Suppose θ = rew_{α,β} and θ′ = rew_{γ,δ}. As w is in Red, it cannot contain non-single-test factors containing the identity, thus it only contains tests and rew_{α′,β′} operations. But in a word of Red, there cannot be two rew_{α′,β′} operations not separated by a copy_i or a \(\overline{\text{copy}}_{i}\). Moreover, as no state can be at the left or at the right of both a test and a non-test transition, we get that w = rew_{α,β} and is equivalent to rew_{γ,δ}. Thus z_1 = x, z_2 = y, and K = (ρ(x),θ,ρ(y)). Thus, in every case, if Trans_K(s,s′,s″) holds with s,s′,s″ not all labelled in the same D_i, then s = s_1, s′ = s″ = s_2 and K = (ρ(x),θ,ρ(y)). Similarly, if RTrans_K(s,s′,s″) holds with s,s′,s″ not all labelled in the same D_i, we get s = s_2, s′ = s″ = s_1 and K = (ρ(x),θ,ρ(y)).
Observe as well that, as we added a full transition, we did not add any x,y such that there is a K with RTrans_K(x,y,y,Z) but no z such that Trans_K(y,z,x) or Trans_K(y,x,z) (or the converse). Thus WellFormed holds for the (D,i,t) labelling. Finally, as the input nodes of D were input nodes of D_1 (resp. the output nodes of D were output nodes of D_2), and we did not add any incoming edge (resp. outgoing edge) to them, the same reasoning applies to show that no RTrans_K (resp. Trans_K) holds with them as first variable, and thus the induction hypothesis is maintained.
Thus, in all cases, WellFormed holds for the (D,i,t) labelling.
NoCycle Finally, let us show that the labelling satisfies NoCycle. Take s a stack labelled in the (D,i,t) labelling of \(Stacks_{n-1}\) with the label of a node x. We show by induction on the size of D that there exists a path of stacks related by basic operations from a stack s′ labelled with a state labelling an input node of D to s, and a path of stacks related by basic operations from s to a stack s″ labelled with a state labelling an output node of D.
If D = □, the result is immediate, both paths being empty.
Suppose D = D_1 ·_{1,1} D_θ ·_{1,1} D_2, and x is a node of D_1 (the case where x is a node of D_2 is symmetric and thus omitted). By the induction hypothesis, there exists s′ labelled by an input node of D_1 such that there is a path from s′ to s, and s″ labelled by the only output node of D_1 such that there is a path from s to s″. By the induction hypothesis, for s_1 labelled by the only input node of D_2, there exists s_2 labelled by an output node of D_2 such that there is a path from s_1 to s_2. Furthermore, by definition of the labelling, we have (s″,s_1) ∈ r_θ, and thus there is a path from s to s_2.
The other cases for the decomposition of D are similar and left to the reader.
Thus, as all its sub-formulæ are true, \(\phi _{A}^{\prime }(X_{s},X_{t},\mathbf {Z})\) is true with the described labelling Z. And thus ϕ A (X s ,X t ) is true. □
Suppose now that ϕ A (X s ,X t ) is satisfied. We take a labelling Z that satisfies the formula \(\phi _{A}^{\prime }(X_{s},X_{t},\mathbf {Z})\), minimal in the number of nodes labelled. We construct the following graph D:
$$\begin{array}{@{}rcl@{}} V_{D} =\!\!\!&& \{(x,q) \mid x \in Stacks_{n}(\Gamma \times \{\varepsilon,11,21,22\}) \land x \in Z_{q}\} \\ E_{D} =\!\!\!&& \{((x,q),\theta,(y,q^{\prime})) \mid (\exists \theta, (q,\theta,q^{\prime}) \in \Delta \land \psi_{\theta}(x,y))\}\\ && \cup \{((x,q),1,(y,q^{\prime})), ((x,q),2,(z,q^{\prime\prime})) \mid (q,(q^{\prime},q^{\prime\prime})) \in \Delta\\ && \land \psi_{\text{copy}_{n}^{2},1}(x,y) \land \psi_{\text{copy}_{n}^{2},2}(x,z)\}\\ && \cup \{((x,q),\bar{1},(z,q^{\prime\prime})), ((y,q^{\prime}),\bar{2},(z,q^{\prime\prime})) \mid ((q,q^{\prime}),q^{\prime\prime}) \in \Delta\\ && \land \psi_{\text{copy}_{n}^{2},1}(z,x) \land \psi_{\text{copy}_{n}^{2},2}(z,y)\}\\ && \cup \left\{\left( (x,q),\text{copy}_{n}^{1},(y,q^{\prime})\right) \mid \left( q,\text{copy}_{n}^{1},q^{\prime}\right) \in \Delta \land \psi_{\text{copy}_{n}^{1},1}(x,y)\right\}\\ && \cup \left\{\left( (x,q),\overline{\text{copy}}_{n}^{1},(y,q^{\prime})\right) \mid \left( q,\overline{\text{copy}}_{n}^{1},q^{\prime}\right) \in \Delta \land \psi_{\text{copy}_{n}^{1},1}(y,x) \right\} \end{array} $$
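To make the construction above concrete, here is a minimal sketch (not from the paper; all data structures and helper predicates are assumptions) of how V_D and E_D could be computed from a labelling. Z is assumed to be a dictionary mapping each state q to the set of encoded stacks in Z_q, delta_ops the set of transitions (q,θ,q′), delta_copy2 the set of transitions (q,(q′,q″)), and psi(θ,x,y), psi_copy1(x,y), psi_copy2(x,y) stand for the formulae ψ_θ and ψ_{copy_n^2,1}, ψ_{copy_n^2,2}; the remaining edge families (\(\overline{\text{copy}}_{n}^{2}\), \(\text{copy}_{n}^{1}\), \(\overline{\text{copy}}_{n}^{1}\)) would follow the same pattern.

def build_D(Z, delta_ops, delta_copy2, psi, psi_copy1, psi_copy2):
    """Sketch of the graph D built from the labelling Z (hypothetical types)."""
    vertices = {(x, q) for q, xs in Z.items() for x in xs}
    edges = set()
    # edges labelled by a basic operation theta, for transitions (q, theta, q')
    for (q, theta, q2) in delta_ops:
        for x in Z.get(q, ()):
            for y in Z.get(q2, ()):
                if psi(theta, x, y):
                    edges.add(((x, q), theta, (y, q2)))
    # edges labelled 1 and 2, for copy_n^2 transitions (q, (q1, q2))
    for (q, (q1, q2)) in delta_copy2:
        for x in Z.get(q, ()):
            for y in Z.get(q1, ()):
                for z in Z.get(q2, ()):
                    if psi_copy1(x, y) and psi_copy2(x, z):
                        edges.add(((x, q), 1, (y, q1)))
                        edges.add(((x, q), 2, (z, q2)))
    return vertices, edges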
Lemma 12
D is a disjoint union of connected well-formed DAGs D 1,⋯ ,D k .
We recall that a well-formed DAG is a DAG where, for all nodes x,y, if there is an edge between x and y, then the subDAG obtained by restricting the DAG to x, y, the successors of x and the predecessors of y is a DAG representing a basic operation.
Proof
First, we show that D is a DAG. If not, there exists (x,q) ∈ V such that \((x,q) \xrightarrow {w} (x,q)\), with w a non-empty sequence of operations.
Suppose that w is composed of \(Ops_{n-1} \cup \mathcal{T}_{n-1}\) operations. As WellFormed is satisfied, we thus have a cycle \((x,q) \xrightarrow{w_{1}} (x_{1},q_{1}) {\cdots} (x_{k-1},q_{k-1}) \xrightarrow{w_{k}} (x,q)\) without any other edge touching the (x_i,q_i). As NoCycle is satisfied, some q_i is initial and some q_j is final. As the automaton is distinguished, there is no transition from q_j and no transition ending in q_i; thus we have a contradiction, and such a cycle is impossible.
Thus, there is a tree operation in w. Suppose that the first tree operation appearing in w is a \(\overline{\text{copy}}_{n}^{d}\). As this operation decreases the number of (n−1)-stacks in x, and w starts from (x,q) and ends in (x,q), it has to increase the number of (n−1)-stacks again. This can only be done by a \(\text{copy}_{n}^{d^{\prime }}\) operation. As A is normalised, a \(\overline{\text{copy}}_{n}^{d}\) transition starts in Q_d and a \(\text{copy}_{n}^{d^{\prime }}\) transition ends in Q_c. Thus states of both Q_c and Q_d appear on the cycle, and in particular we have an edge from a state of Q_c to a state of Q_d, and thus a transition from Q_c to Q_d in A. As A is normalised, this is impossible. If w starts with a \(\text{copy}_{n}^{d}\) operation, we reach the same contradiction.
Thus, D is a DAG.
Now, let us suppose that there are nodes (x,q) and (y,q′) such that there is an edge \((x,q) \xrightarrow{\theta} (y,q^{\prime })\) and the restriction of D to (x,q), its successors, (y,q′) and its predecessors is not a basic operation. Suppose for example that there is a node (z,q″) distinct from (y,q′) such that \((x,q) \xrightarrow{\theta^{\prime }} (z,q^{\prime \prime })\) and θ and θ′ are operations in \(Ops_{n-1} \cup \mathcal{T}_{n-1} \cup \{\text{copy}_{n}^{1},\overline{\text{copy}}_{n}^{1}\}\). Then, by definition of D, we have ψ_θ(x,y), ψ_{θ′}(x,z), \(y \in Z_{q^{\prime }}\) and \(z \in Z_{q^{\prime \prime }}\). Thus \(\text{Trans}_{(q,\theta,q^{\prime })}(x,y,y,\mathbf{Z})\) and \(\text{Trans}_{(q,\theta^{\prime },q^{\prime \prime })}(x,z,z,\mathbf{Z})\) both hold. As WellFormed holds, this is impossible.
Suppose there is a node (z,q″) such that \((x,q) \xrightarrow{1} (z,q^{\prime \prime })\). By construction of D, there is a transition (q,(q″,q‴)) in the automaton, and thus, as WellFormed is satisfied, there exists a node (z′,q‴) such that \(\text{Trans}_{(q,(q^{\prime \prime },q^{\prime \prime \prime }))}(x,z,z^{\prime },\mathbf{Z})\) holds. But as we have \(\text{Trans}_{(q,\theta,q^{\prime })}(x,y,y,\mathbf{Z})\), this is impossible (by WellFormed).
The other cases for the neighbourhood of x and y which give a non-basic operation yield similar contradictions and are left to the reader.
Thus, the restriction of D to (x,q), its successors, (y,q′) and its predecessors is a basic operation. Therefore, D is well-formed. □
Lemma 13
Each D_i is recognised by A.
Proof
If a node (x,q) of D_i is an output node, then by definition there are no y and q′ such that there is a basic operation θ with (q,θ,q′) ∈ Δ_A, (x,y) ∈ r_θ, and y labelled by q′ (or (q,(q′,q″)) ∈ Δ or ((q,q′),q″) ∈ Δ, with the corresponding nodes). As Trans is satisfied, if q were not final, such y and q′ would exist. Thus q is final. Similarly (using RTrans instead of Trans), we get that if a node (x,q) is an input node then q is initial.
Furthermore, by construction of the edges, and because Trans is satisfied, by labelling each node (x,q) with q we obtain a labelling respecting the transitions of A.
Therefore D i is accepted by A. □
Thus, every D i is a well-formed DAG accepted by A. From Lemma 7, we get that D i is an operation.
Lemma 14
t is obtained by applying the D i to disjoint positions of s.
Proof
We show by induction on the size of D that \((s,t^{\prime })\in r_{D}^{j}\) if and only if \(X_{t^{\prime }} = X_{s} \cup \{x \mid (x,q) \in O_{D}\} \backslash \{x \mid (x,q) \in I_{D}\}\), where j is the number of the leaf coded by the leftmost (x,q) ∈ I D in s:
  • If D = □, it is true, as \(X_{t^{\prime }} = X_{s}\) and t′ = s.
  • If D = (F ·_{1,1} D_θ) ·_{1,1} G, by the induction hypothesis, we consider r such that \((s,r)\in r_{F}^{j}\); we then have X_r = X_s ∪ {y} ∖ {x ∣ (x,q) ∈ I_F}, where (y,q′) is the only output node of F. By construction, the input node of G, (z,q″), is such that ψ_θ(y,z), and thus we have \((r,r^{\prime })\in r_{\theta }^{j}\) such that \(X_{r^{\prime }} = X_{r} \backslash \{y\} \cup \{z\}\). By the induction hypothesis, we have \(X_{t^{\prime }} = X_{r^{\prime }} \cup \{x \mid (x,q) \in O_{G}\} \backslash \{z\}\), as \((s,t^{\prime })\in r_{F}^{j} \circ r_{\theta }^{j} \circ r_{G}^{j}\) and thus \((r^{\prime },t^{\prime })\in r_{G}^{j}\). Thus, \(X_{t^{\prime }} = X_{s} \cup \{x \mid (x,q) \in O_{G}\} \backslash \{x \mid (x,q) \in I_{F}\} = X_{s} \cup \{x \mid (x,q) \in O_{D}\} \backslash \{x \mid (x,q) \in I_{D}\}\).
    Conversely, suppose we have \(X_{t^{\prime }} = X_{s} \cup \{x \mid (x,q) \in O_{D}\} \backslash \{x \mid (x,q) \in I_{D}\}\). Thus \(X_{t^{\prime }} = ((X_{s} \cup \{y\} \backslash \{x \mid (x,q) \in I_{F}\}) \backslash \{y\} \cup \{z\}) \cup \{x \mid (x,q) \in O_{G}\} \backslash \{z\}\), with (y,q′) the only output node of F and (z,q″) the only input node of G. We set X_r = X_s ∪ {y} ∖ {x ∣ (x,q) ∈ I_F} and \(X_{r^{\prime }} = X_{r} \backslash \{y\} \cup \{z\}\). By the induction hypothesis, X_r and \(X_{r^{\prime }}\) code stack trees r and r′ such that \((s,r) \in r_{F}^{j}\) and \((r^{\prime },t^{\prime }) \in r_{G}^{j}\). As we have D_θ = {((y,q′),θ,(z,q″))}, we have ψ_θ(y,z) by construction, and thus \((r,r^{\prime }) \in r_{\theta }^{j}\). Thus \((s,t^{\prime }) \in r_{D}^{j}\).
The other cases are similar and are thus left to the reader. We then successively construct in this way t_1, t_2, and so on, such that \((s,t_{1})\in r_{D_{1}}^{i_{1}}\), \((t_{1},t_{2})\in r_{D_{2}}^{i_{2}}\), etc., where i_k is the number of the leaf coded by the leftmost (x,q) ∈ I_{D_k} in t_{k−1}. As the D_i are disjoint, and in particular their input nodes are disjoint, we get that the D_k are applied disjointly, as the leaf coded by the input node of a D_k is not coded by an input node of any other \(D_{k^{\prime }}\). Finally, we obtain t, as we have \(X_{t_{|\mathbf{D}|}} = X_{s} \backslash (\bigcup\limits_{k\leq |\mathbf{D}|} I_{D_{k}}) \cup \bigcup\limits_{k\leq |\mathbf{D}|} O_{D_{k}} = X_{t}\), and thus prove the lemma. □
We have proved both directions: for all n-stack trees s and t, there exists a tuple of operations D recognised by A and a tuple of positions i such that \((s,t)\in r_{\mathbf{D}}^{\mathbf{i}}\) if and only if ϕ_A(X_s,X_t) holds. □

5.4 The Formula ψ A Associated with a Normalised Automaton A

Given two stack trees s,t, the formula ψ A (X s ,X t ) holds if there exists an operation D recognised by A such that (s,t) ∈ r D . Thus, it differs from ϕ A only by the fact that D is a unique operation, as ϕ A (X s ,X t ) holds if there exists a tuple of operations D recognised by A such that (s,t) ∈ r D . The formula is defined as follows:
$$\psi_{A}(X,Y) = \exists Z_{q_{1}},\cdots,Z_{q_{|Q|}}, \phi^{\prime}_{A}(X,Y,\mathbf{Z}) \wedge \text{Connected}(\mathbf{Z}),\text{ with} $$
$$\begin{array}{@{}rcl@{}} \text{Connected}(\mathbf{Z}) &\,=\,& \forall x,y, x\!\in\! \bigcup\limits_{q} Z_{q} \wedge y \!\in\! \bigcup\limits_{q} Z_{q} \!\Rightarrow\! \forall W, \left( {\vphantom{\bigcup\limits_{K}}}x \in W \!\wedge\! \left( {\vphantom{\bigcup\limits_{K}}}\forall z,z^{\prime},z^{\prime\prime}, z \in W\right.\right.\\ &&\wedge \left( \bigcup\limits_{K} \text{Trans}_{K}(z,z^{\prime},z^{\prime\prime},\mathbf{Z}) \vee \text{RTrans}_{K} (z,z^{\prime},z^{\prime\prime},\mathbf{Z})\right)\\ &&\Rightarrow \left.\left.(z^{\prime} \in W \wedge z^{\prime\prime} \in W){\vphantom{\bigcup\limits_{K}}}\right){\vphantom{\bigcup\limits_{K}}}\right)\Rightarrow y \in W \end{array} $$
Informally, this formula states that between any two labelled nodes, one can find a path of labelled nodes related by transitions of the automaton.
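As an aside, the closure condition expressed by Connected can be read operationally: any set W that contains some labelled node and is closed under taking the second and third arguments of a satisfied Trans_K or RTrans_K must contain every labelled node. The following small sketch (purely illustrative; the explicit neighbour map is an assumption, not part of the construction) checks this by a breadth-first search, relying on the fact that RTrans edges mirror Trans edges, so that reachability from any one labelled node suffices.

from collections import deque

def is_connected(labelled_nodes, neighbours):
    # labelled_nodes: the nodes occurring in some Z_q.
    # neighbours[x]: the nodes z', z'' with Trans_K(x,z',z'',Z) or RTrans_K(x,z',z'',Z).
    nodes = set(labelled_nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen, queue = {start}, deque([start])
    while queue:
        z = queue.popleft()
        for w in neighbours.get(z, ()):
            if w in nodes and w not in seen:
                seen.add(w)
                queue.append(w)
    return nodes <= seen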
Proposition 10
Given s and t two stack trees, the formula ψ A (X s ,X t ) is satisfied if and only if there exists an operation D recognised by A such that (s,t) ∈ r D .
Proof
As ψ A is the conjunction of ϕ A and Connected, we have as in the previous section that ψ A (X s ,X t ) is satisfied if and only if there exists D a tuple of operations recognised by A and a tuple of positions i such that \((s,t)\in r_{\mathbf {D}}^{\mathbf {i}}\). To obtain our property, it is sufficient to show that ψ A is satisfied if and only if we moreover have that D is a singleton.
Suppose first that there exists D recognised by A such that \((s,t)\in r_{D}^{i}\). With the labelling defined in the previous subsection, Lemma 10 holds. It suffices to add the following lemma to conclude that ψ A (X s ,X t ) is satisfied:
Lemma 15
If D is a single operation D, the labelling Z defined in the previous subsection satisfies Connected(Z).
Proof
We show the result by induction on the size of D.
If D = □, then there is only one node labelled in the (D,i,t) labelling and thus Connected holds.
Suppose D = D_1 ·_{1,1} D_θ ·_{1,1} D_2 has n nodes and a labelling ρ consistent with A, and that the result holds for any operation with fewer than n nodes. Take s,s′ two stacks labelled in the (D,i,t) labelling, and suppose they have been labelled by the nodes x and y of D. If x and y are in the same subDAG, e.g. D_1, then by definition, as the (D,i,t) labelling is the (D_1,i,t),(D_2,i,t′) labelling with \((t,t^{\prime }) \in r_{D_{1}}^{i} \circ r_{D_{\theta }}^{i}\), s and s′ are in the (D_1,i,t) labelling, and thus by the induction hypothesis there is an undirected path between them. If x is in D_1 and y is in D_2, we consider z_1 the output node of D_1 and z_2 the input node of D_2, t′ such that \((t,t^{\prime }) \in r_{D_{1}}^{i} \circ r_{D_{\theta }}^{i}\) and t″ such that \((t,t^{\prime \prime }) \in r_{D_{1}}^{i}\). Observe that we have \((t^{\prime \prime },t^{\prime }) \in r_{D_{\theta }}^{i}\). By Lemma 11, Code(t″,u_i) is labelled by ρ(z_1) in the (D_1,i,t) labelling and Code(t′,u_i) is labelled by ρ(z_2) in the (D_2,i,t′) labelling. Thus, by the induction hypothesis, there is an undirected path of labelled nodes between s and Code(t″,u_i) and one between Code(t′,u_i) and s′. As furthermore there is a transition K = (ρ(z_1),θ,ρ(z_2)) in A, Trans_K(Code(t″,u_i),Code(t′,u_i),Code(t′,u_i),Z) holds, and thus there is an undirected path between s and s′. Thus Connected holds.
The other cases are similar and left to the reader. □
Suppose now that ψ_A(X_s,X_t) holds. As previously, we construct the graph D from a minimal labelling satisfying the formula. Lemmas 12, 13 and 14 hold. It is sufficient to modify the proof of Lemma 12 to show that D is furthermore a connected DAG. We choose two vertices (x,q) and (y,q′) in D. By definition of D, we have x ∈ Z_q and \(y \in Z_{q^{\prime }}\). As Connected(Z) holds, there is a sequence x = x_1,x_2,⋯,x_k = y such that for any i, x_i is labelled with a state q_i, with q_1 = q and q_k = q′, and there is a transition K and a node z_i such that Trans_K(x_i,x_{i+1},z_i,Z) or RTrans_K(x_i,x_{i+1},z_i,Z) holds. In the first case, by construction of D, we have an edge (x_i,θ,x_{i+1}) (or (x_i,1,x_{i+1}), or other similar cases depending on the transition K), and in the second case we have an edge (x_{i+1},θ,x_i). Thus, there is an undirected path between (x,q) and (y,q′). Therefore D is a connected DAG. □
The three formulæ defined in this section allow us to define the finite-set interpretations announced at the beginning of the section, which proves Theorem 2.

6 Example of a Language

We can see a rewriting graph as a language acceptor in a classical way, by defining some initial and final nodes and labelling the edges. We present here an example of a language recognised by a stack tree rewriting system. The recognised language is \(\bigcup\limits_{u\in \Sigma^{*}} u ⧢ u\), where ⧢ denotes the shuffle. We recall that the shuffle of two words u ⧢ v is the set of words inductively defined by u ⧢ v = u_1(u_2⋯u_{|u|} ⧢ v) ∪ v_1(u ⧢ v_2⋯v_{|v|}), and ε ⧢ ε = {ε}. Fix an alphabet Σ and two special symbols and . We consider \(ST_{2}(\Sigma \cup \{,\})\). We now define a rewriting system R, whose rules are given in Fig. 7. We consider, for every a ∈ Σ, R_a = {D_a}, and R_ε = {Dupl} ∪ {P_a ∣ a ∈ Σ}, where R_ε is a set of rules which can be taken without reading a letter of the recognised word.
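As a side note, the inductive definition of the shuffle recalled above translates directly into the following small sketch (not part of the paper's construction); it simply follows the two cases of the definition.

from functools import lru_cache

@lru_cache(maxsize=None)
def shuffle(u: str, v: str) -> frozenset:
    # u ⧢ v = u1·(u2…u|u| ⧢ v) ∪ v1·(u ⧢ v2…v|v|), with ε ⧢ ε = {ε}
    if not u:
        return frozenset({v})
    if not v:
        return frozenset({u})
    return frozenset({u[0] + w for w in shuffle(u[1:], v)}
                     | {v[0] + w for w in shuffle(u, v[1:])})

For instance, shuffle("ab", "ab") is {"aabb", "abab"}, and the language considered here is the union of shuffle(u, u) over all words u.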
To recognise a language with this system, we have to fix an initial set of stack trees and a final set of stack trees. We will have a unique initial tree and a recognisable set of final trees. They are depicted in Fig. 8.
A word w over R is accepted by this rewriting system if there is a path from the initial tree to a final tree labelled by w. The trace language recognised is
$$\bigcup\limits_{a_{1}{\cdots} a_{n}\in \Sigma} P_{a_{1}}{\cdots} P_{a_{n}} \cdot \mathit{Dupl} \cdot ((D_{a_{n}} {\cdots} D_{a_{1}}) \quad (D_{a_{n}} {\cdots} D_{a_{1}})). $$
Let us informally explain why. We start on the initial tree, which has only a leaf labelled by a stack whose topmost symbol is . So we cannot apply a D_a to it. If we apply a P_a to it, we remain in the same situation, but we have added an a to the stack labelling the unique node. So we can read a sequence \(P_{a_{1}}{\cdots} P_{a_{n}}\). From this situation, we can also apply a Dupl, which yields a tree with three nodes whose two leaves are labelled by [a_1⋯a_n]_1, if we first read \(P_{a_{1}}{\cdots} P_{a_{n}}\). From this new situation, we can only apply D_a rules. If the two leaves are labelled by [b_1⋯b_m]_1 and [c_1⋯c_ℓ]_1, we can apply \(D_{b_{m}}\) or \(D_{c_{\ell}}\), yielding the same tree in which we removed b_m or c_ℓ from the corresponding leaf. We can do this until a final tree remains. So, on each leaf, we will read \(D_{a_{n}} {\cdots} D_{a_{1}}\) in this order, but we have no constraint on the order in which we read these two sequences. So we can effectively read any word in \((D_{a_{n}} {\cdots} D_{a_{1}}) ⧢ (D_{a_{n}} {\cdots} D_{a_{1}})\). And this is the only way to reach a final tree.
To obtain the language announced at the start, we just have to remark that every P_a and Dupl operation is taken without reading a letter, and that D_a is taken while reading the letter a. Thus, denoting by λ(w) the word read along a computation of trace w, we get that if w is an accepting trace (and hence of the form described earlier), then λ(w) ∈ (a_n⋯a_1) ⧢ (a_n⋯a_1), and we indeed recognise \(\bigcup_{u\in \Sigma^{*}} u ⧢ u\).
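To complement the informal argument, membership in the recognised language can also be phrased directly on the input word: a word is accepted exactly when it splits into two identical scattered subwords. The following sketch (an illustration under that reading, not the rewriting system itself) decides this by backtracking over which of the two copies each letter is assigned to; the string pending records the part of the copy that is currently ahead.

from functools import lru_cache

def is_in_language(word: str) -> bool:
    # True iff word ∈ ⋃_u (u ⧢ u); exponential in the worst case, fine for small examples.
    n = len(word)
    if n % 2:
        return False

    @lru_cache(maxsize=None)
    def go(i: int, pending: str) -> bool:
        if i == n:
            return pending == ""
        c = word[i]
        # give word[i] to the copy that is ahead...
        if go(i + 1, pending + c):
            return True
        # ...or to the copy that is behind, which must then match the next pending letter
        return bool(pending) and pending[0] == c and go(i + 1, pending[1:])

    return go(0, "")

For example, is_in_language("abab") and is_in_language("aabb") hold, while is_in_language("abba") does not.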

7 Perspectives

There are several open questions arising from this work. The first one is the strictness of the hierarchy, and the question of finding simple examples of graphs separating each of its orders from the corresponding orders of the pushdown and tree-automatic hierarchies (the latter defined from the former in [11], and currently only considered there, to the knowledge of the author). A second interesting question concerns the trace languages of stack tree rewriting graphs. It is known that the trace languages of higher-order pushdown automata are the indexed languages [8], that the class of languages recognised by automatic structures is the class of context-sensitive languages [19], and that those recognised by tree-automatic structures form the class Etime [16]. However, there is to our knowledge no characterisation of the languages recognised by ground tree rewriting systems. Another interesting point is to find other representations of the classes of graphs GSTR_n, RGSTR_n and GSTT_n. At order 1, RGSTR_1 is the set of graphs defined by finite VRP (Vertex Replacement and Asynchronous product) systems of equations [10].4 We conjecture that RGSTR_n is the set of graphs defined by VRP systems of equations defined by a graph of order n − 1 of the prefix-recognisable hierarchy. Finally, the model of stack trees can readily be extended to trees labelled by trees. Future work will include the question of extending our notion of rewriting and Theorem 2 to this model.

Acknowledgements

The author would like to address his thanks to Antoine Meyer for the help he provided in the realisation of this work, and for fruitful discussions. The author also thanks the anonymous reviewer for his invaluable comments.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Footnotes
1. This hierarchy was extended to encompass unsafe schemes and collapsible automata, which are out of the scope of this paper. See [3, 4, 6] for recent results on the topic.
2. This unusual definition is necessary because ⋅ is not associative. For example, \((D_{\text{copy}_{n}^{2}} \cdot_{2,1} D_{\text{copy}_{n}^{2}}) \cdot_{1,1} D_{\text{copy}_{n}^{2}}\) is in \((D_{\text{copy}_{n}^{2}})^{2} \cdot D_{\text{copy}_{n}^{2}}\) but not in \(D_{\text{copy}_{n}^{2}} \cdot (D_{\text{copy}_{n}^{2}})^{2}\).
3. Regular sets of n-stacks are obtained by considering regular sets of sequences of operations of \(Ops_{n}\) applied to a given stack s_0. More details can be found in [5].
4. In Colcombet's PhD thesis, these VRP operations are known as VRA operations.
References
1. Barthelmann, K.: When Can an Equational Simple Graph Be Generated by Hyperedge Replacement? In: MFCS, LNCS, vol. 1450, pp. 543–552. Springer (1998)
2. Blumensath, A., Grädel, E.: Finite presentations of infinite structures: Automata and interpretations. Theory Comput. Syst. 37(6), 641–674 (2004)
3. Broadbent, C.H., Carayol, A., Hague, M., Serre, O.: A Saturation Method for Collapsible Pushdown Systems. In: ICALP (2), LNCS, vol. 7392, pp. 165–17. Springer (2012)
4. Broadbent, C.H., Carayol, A., Ong, C.H.L., Serre, O.: Recursion Schemes and Logical Reflection. In: LICS, pp. 120–129. IEEE (2010)
5. Carayol, A.: Regular Sets of Higher-Order Pushdown Stacks. In: MFCS, LNCS, vol. 3618, pp. 168–179. Springer (2005)
6. Carayol, A., Serre, O.: Collapsible Pushdown Automata and Labeled Recursion Schemes: Equivalence, Safety and Effective Selection. In: LICS, pp. 165–174. IEEE (2012)
7. Carayol, A., Wöhrle, S.: The Caucal Hierarchy of Infinite Graphs in Terms of Logic and Higher-Order Pushdown Automata. In: FSTTCS, LNCS, vol. 2914, pp. 112–123. Springer (2003)
8. Caucal, D.: On Infinite Transition Graphs Having a Decidable Monadic Theory. In: ICALP, LNCS, vol. 1099, pp. 194–205. Springer (1996)
9. Caucal, D.: On Infinite Terms Having a Decidable Monadic Theory. In: MFCS, LNCS, vol. 2420, pp. 165–176. Springer (2002)
10. Colcombet, T.: On Families of Graphs Having a Decidable First Order Theory with Reachability. In: ICALP, LNCS, vol. 2380, pp. 98–109. Springer (2002)
11. Colcombet, T., Löding, C.: Transforming structures by set interpretations. Logical Methods in Computer Science (LMCS) 3(2) (2007)
12. Dauchet, M., Heuillard, T., Lescanne, P., Tison, S.: Decidability of the confluence of finite ground term rewrite systems and of other related term rewrite systems. Inf. Comput. 88(2), 187–201 (1990)
13. Dauchet, M., Tison, S.: The Theory of Ground Rewrite Systems is Decidable. In: LICS, pp. 242–248. IEEE Computer Society (1990)
14. Dauchet, M., Tison, S., Heuillard, T., Lescanne, P.: Decidability of the Confluence of Ground Term Rewriting Systems. In: LICS, pp. 353–359. IEEE Computer Society (1987)
15. Knapik, T., Niwinski, D., Urzyczyn, P.: Higher-Order Pushdown Trees are Easy. In: FoSSaCS, LNCS, vol. 2303, pp. 205–222. Springer (2002)
17. Penelle, V.: Rewriting Higher-Order Stack Trees. In: CSR, LNCS, vol. 9139, pp. 364–397. Springer (2015)
18. Rabin, M.O.: Decidability of Second-Order theories and automata on infinite trees. Bull. Am. Math. Soc. 74, 1025–1029 (1968)
19. Rispal, C.: The synchronized graphs trace the Context-Sensitive languages. Electr. Notes Theor. Comput. Sci. 68(6), 55–70 (2002)