Proof of lemma 9
Let us first consider the case where there is no reset of predicates. Let
\({{\mathcal {R}}_{ state }}\) denote the finite set of predicate symbols used by
\({\mathcal {T}}\). Let
X and
Z denote the finite sets of variables occurring in
\(\psi \) and introduced by substitutions from
\(\Theta \), respectively. Let
L denote the maximal level of a predicate in
\({\mathcal {R}}\). For
\(1\le t\le L\), let
\(V_t\) denote the set of all variables
$$\begin{aligned} z_{R,{\bar{b}}} \end{aligned}$$
(16)
with
\(z\in Z\),
\(\lambda (R) = t\), and
\({\bar{b}}\) a sequence of variables from
\(X\cup V_L\cup \ldots \cup V_{t+1}\) whose length equals the arity of
R. We then will use the set
\(V = V_1\cup \ldots \cup V_L\) as our set of bound variables. According to the definition of the
\(V_t\), the set
V is finite. By induction on the length
N of
\(\pi \), we construct a formula in a particular normal form
\(\psi _N\) which is equivalent to
\(\pi (\psi )\). In this construction, we make sure that all variables bound by existential or universal quantifiers are always taken from
V. The normal form
g of formulas we rely on, consists of a finite conjunction of
generalized clauses
c which are built up according to the following abstract grammar
$$\begin{aligned} \begin{array}{lll} g &{}{::}=&{} \top \quad \mid \quad c\wedge g \\ c &{}{::}=&{} c_0\quad \mid \quad c\vee \exists {\bar{z}}.A'{\bar{a}}{\bar{z}}\wedge g'\quad \mid \quad c\vee f_{R,{\bar{b}}} \\ f_{R,{\bar{b}}} &{}{::}=&{}\lnot R{\bar{b}}\wedge \forall {\bar{z}}_R.\,\bigwedge _{n=1}^r(\lnot A_n{\bar{b}}^{\sigma _n}{\bar{z}}_n\vee c_n)\\ \end{array} \end{aligned}$$
In the second line,
\(c_0\) is an ordinary clause without occurrences of input predicates,
\(A'\) is an input predicate where all predicates occurring in
\(g'\) have levels less than
\(\lambda (A')\). In the third line,
R is a predicate,
\({\bar{b}}\) are sequences of arguments,
\({\bar{z}}_R\) is a sequence of FO variables whose length only depends on
R and contains all FO variables required by substitutions of
R,
\(A_n\) are input predicates of levels
\(\lambda (R)\), and all predicates occurring in any of the
\(c_n\) have levels less than
\(\lambda (R)\). A formula
\(f_{R,{\bar{b}}}\) is also called
negation tree with head
\(\lnot R{\bar{b}}\).
\(f_{R,{\bar{b}}}\) is called
non-trivial, if it contains occurrences of input predicates, i.e., the conjunction in the second conjunct is non-empty. Thereby, we maintain the invariant that on the toplevel, i.e., outside the scopes of all quantifiers, the heads of all negation trees are distinct.
This normal form can be obtained for \(\psi \) by constructing the conjunctive normal form of \(\psi \) and thereby, removing all duplicates of literals in clauses. Let \(\psi _0\) denote the resulting conjunction for \(\psi \). Now assume that we have already constructed the normal form \(\psi _{N-1}\). By using distributivity of \(\wedge \), the corresponding normal form for \(\psi _N = \theta _N(\psi _{N-1})\) can be readily computed. For maintaining the invariant on negation trees, we recall that \(\lnot R{\bar{b}}\vee \lnot R{\bar{b}}\wedge g \longleftrightarrow \lnot R{\bar{b}}\), i.e., newly created literals \(\lnot R{\bar{b}}\) in clauses kill already existing negation trees with this head.
Given the normal form \(\psi _N\) for \(\pi (\psi )\), we now successively apply SO quantifier elimination. We proceed from input predicates \(A_i\) of higher levels downwards towards the input predicates of smaller levels. For each clause, we thereby collect the lists of FO variables \({\bar{z}}_t\) required to be universally quantified and prove that these lists possibly introduced for level t, consist of variables from \(V_t\) only. In the end, we thus arrive at a formula \(\forall {{\textbf {z}}}_1\ldots {{\textbf {z}}}_L.\psi '\) where \({{\textbf {z}}}_t\) is the sequence of variables in \(V_t\), and \(\psi '\) is a quantifierfree conjunction of (plain) clauses with variables from \(X\cup V\) only. Since there is only a finite set of of such clauses, the theorem follows.
So, let \(\tau \) denote a permutation of \(\{1,\ldots ,N\}\) so that \(\lambda (A_{\tau (N)})\le \ldots \le \lambda (A_{\tau (1)})\) holds. In particular, \(\lambda (A_{\tau (1)})\) and \(\lambda (A_{\pi (N)})\) denote input predicates of highest and lowest occurring levels, respectively. Then we successively remove the occurrences of \(A_{\tau (1)},A_{\tau (2)},\ldots \) by maintaining the normal form of the resulting formulas — while introducing fresh bound variable names just from the set V. Let \(\psi '_{L+1} = \psi _N\). Now we proceed level by level. for \(L\ge t >0\), assume that \(\forall {{\textbf {z}}}_{t+1}\ldots {{\textbf {z}}}_{L}.\psi '_{t+1}\) has already been constructed such that \(\psi '_{t+1}\) is in our normal form where the heads of all non-trivial negation trees on top-level are distinct and have levels at most t. Let us consider a single generalized clause c of \(\psi '_{t+1}\). Let \(f_{R_1,{\bar{b}}_1},\ldots , f_{R_m{\bar{b}}_m}\) denote the sequence of top-level negation trees in \(\psi '_{t+1}\) of level t in c.
If this sequence is empty, then we set \({{\textbf {z}}}_{t} = \epsilon \) (the empty sequence) and \(\psi '_t\) as the formula obtained from \(\psi '_{t+1}\) by removing all occurrences of subformulas \(\exists {\bar{z}}_R.A'{\bar{b}}{\bar{z}}\wedge \psi '\) with \(\lambda (A') = t\).
Otherwise, let
\(A_{\tau (j_1)}\ldots A_{\tau (j_2)}\) denote the subsequence of input predicates of level
t occurring in
c. Then for each
\(j=1,\ldots ,m\), the negation tree
\(f_{R_j,{\bar{b}}_j}\) is of the form
$$\begin{aligned} \lnot R_j{\bar{b}}_j\wedge \forall {\bar{z}}_{R_j}.\bigwedge _{k=1}^{n_j} \lnot A'_k{\bar{b}}_j^{\sigma _k}{\bar{z}}_{jk}\vee c_{jk} \end{aligned}$$
(17)
for suitable input predicates
\(A'_k\) of level
t, subsequences
\({\bar{z}}_{jk}\) of
\({\bar{z}}_{R_j}\) and (generalized) clauses
\(c_{jk}\) containing predicates only of levels less than
t. Since the heads of the negation trees
\(f_{R_j,{\bar{b}}_j}\) are all distinct, we can rename the corresponding universally quantified variables to distinct sequences
\({\bar{z}}_{R_j,{\bar{b}}_j} = z_{(R_j,{\bar{b}}_j),1}\ldots z_{(R_j,{\bar{b}}_j),l_j}\) if
\(l_j\) is the length of the corresponding list of bound variables and let the sequences
\({\bar{z}}_{{\bar{b}}_j,jk}\) be appropriate subsequences of this list for the
\({\bar{z}}_{jk}\). By inductive hypothesis for variables in
\({\bar{b}}_j\) and the definition of the set
\(V_t\), all these variables are contained in
\(V_t\). Therefore,
c is equivalent to the formula
\(\forall {{\textbf {z}}}_t.c'\) where
\(c'\) coincides with
c up the negation trees of level
t, which now take the form
$$\begin{aligned} \lnot R_j{\bar{b}}_j\wedge \bigwedge _{k=1}^{n_j} \lnot A'_k{\bar{b}}_j^{\sigma _k}{\bar{z}}_{{\bar{b}}_j,jk}\vee c_{jk}[{\bar{z}}_{R_j,{\bar{b}}_j}/{\bar{z}}_{R_j}] \end{aligned}$$
(18)
(
\(j=1,\ldots ,m\)). Performing SO quantifier elimination of all input predicates
\(A_{\tau (j_1)},\ldots ,A_{\tau (j_2)}\) in a row removes all subformulas
\(\exists {\bar{z}}_{R'}.A'{\bar{a}}{\bar{z}}_{R'}\wedge \psi '\) where
\(\lambda (A') = t\) from
\(c'\), and additionally replaces the subformulas (
18) with formulas
$$\begin{aligned} \lnot R_j{\bar{b}}_j\wedge \bigwedge _{k=1}^{n_j} \bigvee _{i=1}^{n_j}({\bar{a}}_{ji}^{\sigma _{ji}}={\bar{b}}_j^{\sigma _k})\wedge g_{ji}[{\bar{b}}_j^{\sigma _k}{\bar{z}}_{{\bar{b}}_j,jk}/{\bar{y}}_{ji}^{\sigma _{ji}}{\bar{z}}_{ji}] \vee c_{jk}[{\bar{z}}_{R_j,{\bar{b}}_j}/{\bar{z}}_{R_j}] \end{aligned}$$
(19)
for suitable conjunctions of (generalized) clauses
\(g_{ji}\) in with constants from
X and free variables from
\({\bar{z}}_{R}\) and some list of parameters
\({\bar{y}}\) for which a subformula
\(\exists {\bar{z}}_{ji}.A'_k{\bar{a}}_{ji}^{\sigma _{ji}}{\bar{z}}_{ji}\wedge g_{ji}[{\bar{a}}_{ji}/{\bar{y}}]\) has occurred in
\(c'\). In particular, each predicate occurring in any of the
\(g_{ji}\) is of level less than
t. By distributivity, the resulting formula is equivalent to a conjunction
\(g'\) of (generalized) clauses
\(c''\) each of which is obtained from
\(c'\) by replacing each of the subformulas (
18) with
\(\lnot R_j{\bar{b}}_j\), or, for some
k and some subset
\(I\subseteq \{1,\ldots ,n_j\}\), with the clause
$$\begin{aligned} \bigvee _{i\in I}({\bar{a}}_{ji}^{\sigma _{ji}}={\bar{b}}_j^{\sigma _k})\vee \bigvee _{i\not \in I} c'_{ji}[{\bar{b}}_j^{\sigma _k}{\bar{z}}_{{\bar{b}}_j,ik}/{\bar{y}}_{ji}^{\sigma _{ji}}{\bar{z}}_{ji}] \vee c_{jk}[{\bar{z}}_{R_j,{\bar{b}}_j}/{\bar{z}}_{R_j}] \end{aligned}$$
(20)
where
\(c'_{ji}\) is any clause of
\(g_{ji}\). The clauses from
\(g'\) constructed in this way, may contain negation trees which agree in their heads. Consider two such negation trees both with head
\(\lnot R{\bar{b}}\) occurring in the same clause
\(c''\) of
\(g'\) on top-level. If both originate from
\(\psi \) or have been introduced by means of the same substitution, say,
\(\theta _j\), the same sequence of substitutions has been applied to both, and subsequently also to their negation trees. Accordingly, the two negation trees are equivalent, meaning that one of them can be removed from
\(c''\). Therefore, now assume that one literal
\(\lnot R{\bar{b}}\) has been introduced by substitution
\(\theta _j\) while the the other did already occur in
\(\psi _{j-1}\). That means that the sequence of substitutions applied to the later one and its negation tree, also is applied to the earlier one and its negation tree. In their disjunction (now introduced due to SO quantifier elimination), therefore, the negation tree of the earlier literal implies the negation tree of the later, and therefore can be omitted. Performing this normalization on the clauses of
\(g'\), we achieve that in the resulting conjunction
\(g''\) of (generalized) clauses
all heads of top-level negation trees are distinct. Then we set
\(\psi '_t\) to
\(g''\).
In order to compute an explicit bound on the number of possible FO variables occurring as arguments to predicates of level
\(t=L,\ldots ,0\), let us introduce the following structural parameters:
$$\begin{aligned} \begin{array}{lll} v &{} \quad \text {---} &{}\quad \text {the number of variables occurring in } \psi \\ L &{} \quad \text {---} &{} \quad \text {the number of levels of predicates } \\ r &{} \quad \text {---} &{} \quad \text {maximal arity of a predicate } \\ m &{} \quad \text {---} &{} \quad \text {maximal number predicates at some level } i \\ l &{} \quad \text {---} &{} \quad \text {maximal length of } {\bar{z}}\text {in subformulas } \exists {\bar{z}}.\,\psi \text {occurring in the substitutions from } \Theta \end{array} \end{aligned}$$
For
\(t=L,\ldots ,0\), we inductively determine a bound
\(B_t\) to the number of distinct FO variables possibly occurring as arguments of literals at level
t. Thereby, we set
\(B_L = v\), since the only literals at level
L occurring in
\(c'\) already must have occurred in
\(\psi \). Therefore, assume that
\(t < L\) and a bound
\(B_{t+1}\) has already been found. Given the number
\(B_{t+1}\), the number of negated literals of predicates at level
\(t+1\) can be bound by
\(m\cdot B_{t+1}^r\). For each of these literals, a fresh list of variables of length at most
l may be provided. Accordingly, we set
$$\begin{aligned} B_t = B_{t+1}+l\cdot m\cdot B_{t+1}^r \le (1+l\cdot m)\cdot B_{t+1}^r \end{aligned}$$
Altogether, this means that the total number
B of variables possibly occurring in literals of
\(c'\) at level at least 0 is bounded by
$$\begin{aligned} B\le \left\{ \begin{array}{ll} (1+l\cdot m)^{L}\cdot v &{}\text {if}\;r=1 \\ \left( 1+ l\cdot m \right) ^{\frac{r^{L}-1}{r-1}}\cdot v^{r^{L}} &{}\text {if}\;r>1 \end{array}\right. \end{aligned}$$
(21)
It remains to consider the case when resets occur either on the highest level
L or at level 1. We claim that with these kinds of resets, still the same set
V for bound variables suffices to construct a FO universally quantified formula for
\(\forall A_1\ldots A_N.\pi (\psi )\). Let us first consider resets of predicates at top-level
L. Each of these, either has no effect or replaces all occurrences of one predicate
R with
\(\lambda (R) = L\) with a quantifierfree formula with occurrences of state predicates of lower levels and no input predicates. The only FO variables occurring in literals
\(R{\bar{b}}\) or
\(\lnot R{\bar{b}}\) for predicates
R of level
L are the variables from
X occurring already in
\(\psi \). Accordingly, we modify the first phase of constructing
\(\psi _1,\ldots ,\psi _N\) as follows. We put the list
\({{\textbf {z}}}_L\) as universally prenex in front of all
\(\psi _j\) in order to have a reservoir of FO variables for all univerally quantified bound variables introduced at level
L. As soon as an update substitution
\(\theta _i\) of a predicate
R at level
L occurs, we rename in each clause the universally quantified variables from
\({\bar{z}}_R\) introduced for
\(\lnot R{\bar{b}}\) to
\({\bar{z}}_{R,{\bar{b}}}\) from the reservoir and
immediately perform SO quantifier elimination for
\(A_i\). Then we bring the resulting formula into the format
g to obtain
\(\psi _i\). With this preparation, a reset
\(\theta _j\) at level
L can be dealt with during the construction of the sequence
\(\psi _1,\ldots ,\psi _N\) just by replacing literals at level
L with appropriate clauses of literals with predicates of lower level all using variables from
X as arguments only. In particular, no new variables are introduced. For the second phase, it then remains to perform SO quantifier elimination just for the levels
\(L-1,\ldots ,1\).
Now additionally, consider resets of predicates at level 1. In principle, we proceed as before. The resets at level 1, however, may additionally introduce subformulas of the form
\(o_{{\bar{b}}}\) in clauses where
$$\begin{aligned} o_{{\bar{b}}} \;{::}=\; \forall {\bar{z}}_R.\,\bigwedge _{k=1}^n(\lnot A_k{\bar{b}}{\bar{z}}\vee c_k) \end{aligned}$$
(22)
where
\({\bar{b}}\) is a sequence of variables,
\(A_k\) are input variables of level 1, and
\(c_n\) contains predicates of level 0 only with arguments possibly from
X,
\({\bar{b}}\), and
\({\bar{z}}\). No uniqueless can be guaranteed for subformulas of the form
\(o_{{\bar{b}}}\). We note, however, that in the second phase when we perform SO quantifier elimination of SO predicates of level 1 for a (generalized) clause
c, a formula is encountered of the form
$$\begin{aligned} c_0\vee \bigvee _{l=1}^r\forall {\bar{z}}_R.\bigwedge _{j=1}^{m_l}(\bigvee _{i=1}^{n_{lj}}({\bar{b}}_l={\bar{a}}_{lji})\wedge g_{lji}\vee c_{lj}) \end{aligned}$$
(23)
where
\(c_0\) consists of literals using variables from
\(V'\), the lists of variables
\({\bar{a}}_{lji}\) are only from
\(V'\), and the subformulas
\(g_{lji}\) and
\(c_{lj}\) use state predicates of level 0 only where all occurring FO variables are from
\(V'\cup Z\). The number of non-equivalent such formulas
\(g_{lji}\) as well as
\(c_{lj}\) with this restriction onto the occurring FO variables, however, is finite. Therefore, there is a fixed finite prefix of FO variables, independent of the length of the sequence of substitutions
\(\pi \) to take all universal quantifications from (
23) into account.
Altogether, therefore, the number of FO variables in quantified clauses
\(\forall {\bar{z}}'.c'\) contained in
\(\pi (\psi )\) remains bounded – even when we allow resets both on the maximal level and on level 1. This completes the proof of lemma
9.