Skip to main content

About this book

This book constitutes the proceedings of the 15th International Conference on Language and Automata Theory and Applications, LATA 2021, held in Milan, Italy, in March 2021.

The 26 full papers presented in this volume were carefully reviewed and selected from 52 submissions. They were organized in topical sections named: algebraic structures; automata; complexity; learning; logics and languages; trees and graphs; and words and strings.

Table of Contents


Algebraic Structures


On Language Varieties Without Boolean Operations

Eilenberg’s variety theorem marked a milestone in the algebraic theory of regular languages by establishing a formal correspondence between properties of regular languages and properties of finite monoids recognizing them. Motivated by classes of languages accepted by quantum finite automata, we introduce basic varieties of regular languages, a weakening of Eilenberg’s original concept that does not require closure under any boolean operations, and prove a variety theorem for them. To do so, we investigate the algebraic recognition of languages by lattice bimodules, generalizing Klíma and Polák’s lattice algebras, and we utilize the duality between algebraic completely distributive lattices and posets.
Fabian Birkmann, Stefan Milius, Henning Urbat

Partially Directed Animals with a Bounded Number of Holes

We address the problem of the exhaustive generation of a particular class of polyominoes, corresponding to partially directed animals with a bounded number of holes. We apply an approach based on discrete dynamical systems to develop an algorithm that generates each polyomino in constant amortized time and space O(n). By implementing the algorithm in C++ we have obtained new sequences that do not appear in the On-Line Encyclopedia of Integer Sequences.
Valentina Dorigatti, Paolo Massazza

On the Computational Power of Programs over Monoid

The PLP conjecture for monoids states that for every monoid M, either M is universal (that is, for every language \(L \subseteq \varSigma ^*\) there is a program over M which accepts the language L) or it has the polynomial length property (that is, every program over the monoid M has an equivalent program of length \({\mathsf {poly}}(n)\)). The conjecture has been confirmed (Tesson-Therien (2001)) for the case of groups and several subclasses of aperiodic monoids such as the variety DA and the monoids divided by the monoid U. However, the case of the set of monoids divided by the monoid \(\mathsf {BA}_2\) is still open, which if resolved, confirms the conjecture for all aperiodic monoids.
In this paper, we make progress towards confirming the conjecture for the case when the monoid is \(\mathsf {BA}_2\). It is known (Tesson-Therien 2001) already that the monoid \(\mathsf {BA}_2\) is not universal.
Towards proving that the monoid \(\mathsf {BA}_2\) has polynomial length property, we show the following results: we define a program over a monoid M to be a non-nullable program if there is no input for which the yield of the program is the zero of the monoid. We prove the following:
  • If a program over \(\mathsf {BA}_2\) is non-nullable, then there is an equivalent program with length at most \({\mathsf {poly}}(n)\).
  • If a program over \(\mathsf {BA}_2\) is nullable, then it should be exponentially non-nullable - that is there should be at least \(2^{\varOmega (n)}\) many inputs which send the output of the program to 0 of \(\mathsf {BA}_2\). We show that for any program P over \(\mathsf {BA}_2\), if the zeroes of the program have a witness subprogram of polynomial length, then there is a program of length \({\mathsf {poly}}(n)\) equivalent to program P.
On the universality front, Tesson and Therien(2001) have already shown that PARITY cannot be computed by programs over \(\mathsf {BA}_2\). We strengthen this in two ways. Firstly, we show that programs over \(\mathsf {BA}_2\) cannot accept any subset of PARITY or \(\overline{\mathsf {PARITY}}\) of size \(n^{\omega (1)}\). Secondly, we generalize the model of programs to allow parity queries to the input instead of variables. We show that \(\mathsf {BA}_2\) cannot compute parity of n input bits even when parity queries are allowed of size \(k < \frac{n}{3}\). In contrast, we show that there are polynomial length programs over \(\mathsf {BA}_2\) to compute parity when queries are allowed as parity of \(\frac{n}{3}\) bits or higher.
Manasi S. Kulkarni, Jayalal Sarma, Janani Sundaresan



Location Based Automata for Expressions with Shuffle

We define the notion of location for regular expressions with shuffle by extending the notion of position in standard regular expressions. Locations allow for the definition of the sets \(\mathsf {Follow}\), \(\mathsf {First}\) and \(\mathsf {Last}\) with their usual semantics. From these, we construct an automaton for regular expressions with shuffle (\(\mathcal {A}_{\mathrm {POS}}\)), which generalises the standard position/Glushkov automaton. The sets mentioned above are also the foundation for other constructions, such as the Follow automaton, and automata based on pointed expressions. As a consequence, all these constructions can now be directly generalised to regular expressions with shuffle, as well as their known relationships. We also show that the partial derivative automaton (\(\mathcal {A}_{\mathrm {PD}}\)) is a (right) quotient of the new position automaton, \(\mathcal {A}_{\mathrm {POS}}\). In previous work an automaton construction based on positions was studied (\(\mathcal {A}_{\partial pos}\)), and here we relate \(\mathcal {A}_{\mathrm {POS}}\) and \(\mathcal {A}_{\partial pos}\). Finally, we extend the construction of the prefix automaton \(\mathcal {A}_{\mathrm {Pre}}\) to the shuffle operator and show that it is not a quotient of \(\mathcal {A}_{\mathrm {POS}}\).
Sabine Broda, António Machiavelo, Nelma Moreira, Rogério Reis

Succinct Representations for (Non)Deterministic Finite Automata

Deterministic finite automata are one of the simplest and most practical models of computation studied in automata theory. Their extension is the non-deterministic finite automata which also have plenty of applications. In this article, we study these models through the lens of succinct data structures where our ultimate goal is to encode these mathematical objects using information theoretically optimal number of bits along with supporting queries on them efficiently. Towards this goal, we first design a succinct data structure for representing any deterministic finite automaton \(\mathcal {D}\) having n states over a \(\sigma \)-letter alphabet \(\varSigma \) using \((\sigma -1) n\log n (1+o(1))\) bits, which can determine, given an input string x over \(\varSigma \), whether \(\mathcal {D}\) accepts x in optimal O(|x|) time. We also consider the case when there are \(N < \sigma n\) non-failure transitions, and obtain various time-space trade-offs in both the cases. When the input deterministic finite automaton \(\mathcal {A}\) is acyclic, not only we can improve the above space bound significantly to \((\sigma -1) (n-1)\log n+ O(n + \log ^2 \sigma )\) bits, we can also check if an input string x over \(\varSigma \) can be accepted by \(\mathcal {A}\) optimally in O(|x|) time. We also exhibit a succinct data structure for representing a non-deterministic finite automaton \(\mathcal {N}\) having n states over a \(\sigma \)-letter alphabet \(\varSigma \) using \(\sigma n^2+n\) bits of space, such that given an input string x, we can decide whether \(\mathcal {N}\) accepts x efficiently in \(O(n^2|x|)\) time. Finally, we also provide time and space efficient algorithms for performing several standard operations such as union, intersection and complement on the languages accepted by deterministic finite automata.
Sankardeep Chakraborty, Roberto Grossi, Kunihiko Sadakane, Srinivasa Rao Satti

Optimising Attractor Computation in Boolean Automata Networks

This paper details a method for optimising the size of Boolean automata networks in order to compute their attractors under the parallel update schedule. This method relies on the formalism of modules introduced recently that allows for (de)composing such networks. We discuss the practicality of this method by exploring examples. We also propose results that nail the complexity of most parts of the process, while the complexity of one part of the problem is left open.
Kévin Perrot, Pacôme Perrotin, Sylvain Sené

On the Transformation of Two-Way Deterministic Finite Automata to Unambiguous Finite Automata

The paper estimates the number of states in an unambiguous finite automaton (UFA) that is sufficient and in the worst case necessary to simulate an n-state two-way deterministic finite automaton (2DFA). It is proved that a 2DFA with n states can be transformed to a UFA with fewer than \(2^n \cdot n!\) states. On the other hand, for every n, there is a language recognized by an n-state 2DFA that requires a UFA with at least \(\varOmega ((4\sqrt{2})^n \cdot n^{-1/2})\) states. The latter result is proved by estimating the rank of a certain matrix.
Semyon Petrov, Alexander Okhotin



Deciding Non-emptiness of Hypergraph Languages Generated by Connection-preserving Fusion Grammars is NP-complete

Fusion grammars are a novel approach to the generation of hypergraph languages. A fusion grammar is a hypergraph grammar which provides a start hypergraph of small connected components. To get large connected hypergraphs, they can be copied multiple times and can be fused by the application of fusion rules. In this paper, we analyze the non-emptiness problem for connection-preserving fusion grammars and show that this is an NP complete problem. We show this by relating language generation by connection-preserving fusion grammars to some variant of integer linear programming.
Aaron Lye

On the Power of Nondeterministic Circuits and Co-Nondeterministic Circuits

Revealing the power of nondeterministic computation and co-nondeterministic computation is one of the central problems in computational complexity. In this paper, we consider the two computation and deterministic computation in Boolean circuits. We give the first separations on the power of deterministic circuits, nondeterministic circuits, and co-nondeterministic circuits in general circuits. More precisely, we prove the following facts.
  • There is an explicit Boolean function f such that the nondeterministic \(U_2\)-circuit complexity of f is at most \(2n + o(n)\) and the deterministic and co-nondeterministic \(U_2\)-circuit complexity of f is \(3n - o(n)\).
  • There is an explicit Boolean function f such that the co-nondeterministic \(U_2\)-circuit complexity of f is at most \(2n + o(n)\) and the deterministic and nondeterministic \(U_2\)-circuit complexity of f is \(3n - o(n)\).
Hiroki Morizumi

On Hardest Languages for One-Dimensional Cellular Automata

Since the famous construction of “the hardest context-free language” by Greibach (1973), the existence of hardest languages under homomorphic reductions has been investigated for quite a few language families. This paper shows that for one-way real-time cellular automata, also known as trellis automata, there is no hardest language, whereas for linear-time cellular automata, the hardest language is constructed.
Mikhail Mrykhin, Alexander Okhotin

Usefulness of Information and Unary Languages

In this paper we continue the research on usefulness of information examining the effect of supplementary information on the complexity of solving a problem (see Rovan and Sádovský [7] for an overview). We use deterministic finite automata for a formal setting. Given a problem (a regular language) \(L_{prob}\) we measure the complexity of its solution – a DFA \(A_{prob}\) such that \(L_{prob} = L(A_{prob})\) – using the state complexity. A supplementary information (advice) \(L_{adv}\) given by \(A_{adv}\) is useful if a simpler problem \(L_{new}\) given by \(A_{new}\) exists such that \(L_{prob} = L_{new} \cap L_{adv}\) and both \(L_{new}\) and \(L_{adv}\) are simpler than \(L_{prob}\). This is formalized via the notion of decomposability of finite automata (see [1] for DFA case and [7] for NFA case). We address the problem of decomposability of unary regular languages and give a characterization of \(\lambda \)-cyclic languages upon deterministic decomposability.
Giovanni Pighizzini, Branislav Rovan, Šimon Sádovský



Learnability and Positive Equivalence Relations

Prior work of Gavryushkin, Khoussainov, Jain and Stephan investigated what algebraic structures can be realised in worlds given by a positive (= recursively enumerable) equivalence relation which partitions the natural numbers into infinitely many equivalence classes. The present work investigates the infinite one-one numbered recursively enumerable (r.e.) families realised by such relations and asks how the choice of the equivalence relation impacts the learnability properties of these classes when studying learnability in the limit from positive examples, also known as learning from text. For all choices of such positive equivalence relations, for each of the following entries, there are one-one numbered r.e. families which satisfy it: (a) they are behaviourally correctly learnable but not vacillatorily learnable; (b) they are explanatorily learnable but not confidently learnable; (c) they are not behaviourally correctly learnable. Furthermore, there is a positive equivalence relation which enforces that (d) every vacillatorily learnable one-one numbered family of languages closed under this equivalence relation is already explanatorily learnable and cannot be confidently learnable.
David Belanger, Ziyuan Gao, Sanjay Jain, Wei Li, Frank Stephan

Learning Mealy Machines with One Timer

We present Mealy machines with a single timer (MM1Ts), a class of models that is both sufficiently expressive to describe the real-time behavior of many realistic applications, and can be learned efficiently. We show how learning algorithms for MM1Ts can be obtained via a reduction to the problem of learning Mealy machines. We describe an implementation of an MM1T learner on top of LearnLib, and compare its performance with recent algorithms proposed by Aichernig et al. and An et al. on several realistic benchmarks.
Frits Vaandrager, Roderick Bloem, Masoud Ebrahimi

Logics and Languages


Finite-Word Hyperlanguages

Formal languages are in the core of models of computation and their behavior. A rich family of models for many classes of languages have been widely studied. Hyperproperties lift conventional trace-based languages from a set of execution traces to a set of sets of executions. Hyperproperties have been shown to be a powerful formalism for expressing and reasoning about information-flow security policies and important properties of cyber-physical systems. Although there is an extensive body of work on formal-language representation of trace properties, we currently lack such a general characterization for hyperproperties. We introduce hyperlanguages over finite words and models for expressing them. Essentially, these models express multiple words by using assignments to quantified word variables. Relying on the standard models for regular languages, we propose hyperregular expressions and finite-word hyperautomata (NFH), for modeling the class of regular hyperlanguages. We demonstrate the ability of regular hyperlanguages to express hyperproperties for finite traces. We explore the closure properties and the complexity of the fundamental decision problems such as nonemptiness, universality, membership, and containment for various fragments of NFH.
Borzoo Bonakdarpour, Sarai Sheinvald

Temporal Logics with Language Parameters

We develop a generic framework to extend the logics LTL, CTL\(^+\) and CTL\(^*\) by automata-based connectives from formal language classes and analyse this framework with regard to regular languages, visibly pushdown languages and (deterministic) context-free languages. More precisely, we consider how the use of different automata classes changes the expressive power of the logics and provide algorithms for the satisfiability and model checking problems induced by the use of different automata. For the model checking problem, we treat not only finite Kripke transition systems, but also visibly pushdown systems and pushdown systems. We provide completeness or undecidability results in almost all cases and show that the extensions we consider can formulate properties not expressible in classical temporal logics or regular extensions thereof.
Jens Oliver Gutsfeld, Markus Müller-Olm, Christian Dielitz

Commutative Rational Term Rewriting

Term rewriting for rational terms, i.e. infinite terms with a finite number of different subterms, has been considered e.g. in Corradini & Gadducci (1998) and Aoto & Ketema (2012). In this paper, we consider rational term rewriting by a set of commutativity rules i.e. rules of the form \(f(x,y) \rightarrow f(y,x)\), based on the framework of Aoto & Ketema (2012). A rewrite step with a commutativity rule is specified via a regular set of redex positions, thus via a finite automaton. We present some finite automata constructions that correspond to (in particular) taking inverse rewrite steps, merging two branching rewrite steps, and merging two consecutive rewrite steps. As a corollary, we show that rational rewrite steps by the commutativity rules are closed under taking equivalence of the rewrite steps.
Mamoru Ishizuka, Takahito Aoto, Munehiro Iwami

Context-Free Grammars with Lookahead

We introduce context-free grammars with lookahead. The grammars are an extension of both context-free grammars and parsing expression grammars, hence we can handle the two grammars in a unified way. To accommodate lookahead, we use a language with lookahead, which is a set of string pairs. We considered the grammar as a system of equations and give the language with lookahead by the limit of iterations from the empty set. The language class is closed under union, intersection, complement, and a weak version of concatenation and Kleene star.
Takayuki Miyazaki, Yasuhiko Minamide

Tree-Like Unit Refutations in Horn Constraint Systems

In this paper, we examine the problem of finding unit refutations of Horn constraint systems (HCSs). Recall that a Horn constraint is a linear constraint in which every coefficient belongs to the set \(\{0,1,-1\}\) and in which at most one coefficient is positive. In the current work, we extend the notion of unit refutations from CNF formulas to systems of linear constraints. Recall that for CNF formulas a unit resolution refutation is one in which every resolution step uses a one-literal (unit) clause. The equivalent notion in linear systems requires every inference step to use a one-variable (absolute) constraint. We analyze two problems associated with unit refutations of Horn constraint systems. In the length-bounded tree-like unit refutation (TLUR\(_{D}\)) problem, we ask if a given Horn constraint system has a tree-like unit refutation using at most L inference steps. In the optimal tree-like unit refutation (TLUR\(_{Opt}\)) problem, we ask for a tree-like unit refutation with the fewest inference steps. We show that the former problem is NP-complete and the latter is NPO-complete. We also show that the TLUR\(_{D}\) problem does not admit a polynomial size kernel with respect to a natural output parameter under some well-accepted complexity theoretic assumptions.
K. Subramani, Piotr Wojciechowski

Trees and Graphs


Homomorphic Characterization of Tree Languages Based on Comma-Free Encoding

A classic result in language theory is Medvedev’s theorem for trees, stating that any regular tree language can be defined by the projection of a local tree language, i.e., of a language defined by its tiles of height 2, a.k.a. di-grams. The simple proof of the statement is based on a local alphabet whose size (linearly) depends on the number of states of the tree automaton recognizing the original language. We prove a new extended version of Medvedev’s theorem for trees, by using a k-locally testable tree language defined by tiles of height \(k\ge 2\) (k-grams). The size of the local alphabet is just the double of the original one, hence it is independent from the complexity of the tree automaton. This result relies on an encoding of the states traversed by a tree automaton, by means of binary comma-free codes carefully laid on tree paths. We thus generalize from words to trees our recent extended Medvedev’s theorem for word languages that was based on strictly locally testable word languages. By applying the result to the syntax trees of context-free grammars, we characterize them by k-locally testable, binary-labeled trees .
Stefano Crespi Reghizzi, Pierluigi San Pietro

Approximated Determinisation of Weighted Tree Automata

We introduce the notion of t-approximated determinisation and the t-twinning property of weighted tree automata (\(\mathrm {WTA}\)) over the tropical semiring. We provide an algorithm that accomplishes t-approximated determinisation of an input automaton \(\mathscr {A}\), whenever it terminates. Moreover, we prove that the t-twinning property of \(\mathscr {A}\) is a sufficient condition for the termination of our algorithm. Ultimately, we show decidability of the t-twinning property for \(\mathrm {WTA}\).
Frederic Dörband, Thomas Feller, Kevin Stier

Sequentiality of Group-Weighted Tree Automata

We introduce the notion of group-weighted tree automata over commutative groups and characterise sequentialisability of such automata. In particular, we introduce a fitting notion for tree distance and prove the equivalence between sequentialisability, the so-called Lipschitz property, and the so-called twinning property.
Frederic Dörband, Thomas Feller, Kevin Stier

An Algorithm for Single-Source Shortest Paths Enumeration in Parameterized Weighted Graphs

We consider weighted graphs with parameterized weights and we propose an algorithm that, given such a graph and a source node, builds a collection of trees, each one describing the shortest paths from the source to all the other nodes of the graph for a particular zone of the parameter space. Moreover, the union of these zones covers the full parameter space: given any valuation of the parameters, one of the trees gives the shortest paths from the source to all the other nodes of the graph when the weights are computed using this valuation.
Bastien Sérée, Loïg Jezequel, Didier Lime

Words and Strings


On Balanced Sequences and Their Asymptotic Critical Exponent

We study aperiodic balanced sequences over finite alphabets. A sequence \(\mathbf {v}\) of this type is fully characterised by a Sturmian sequence \(\mathbf {u}\) and two constant gap sequences \(\mathbf {y}\) and \(\mathbf {y}'\). We study the language of \(\mathbf {v}\), with focus on return words to its factors. We provide a uniform lower bound on the asymptotic critical exponent of all sequences \(\mathbf {v}\) arising by \(\mathbf {y}\) and \(\mathbf {y}'\). It is a counterpart to the upper bound on the least critical exponent of \(\mathbf {v}\) conjectured and partially proved recently in works of Baranwal, Rampersad, Shallit and Vandomme. We deduce a method computing the exact value of the asymptotic critical exponent of \(\mathbf {v}\) provided the associated Sturmian sequence \(\mathbf {u}\) has a quadratic slope. The method is used to compare the critical and the asymptotic critical exponent of balanced sequences over an alphabet of size \(d\le 10\) which are conjectured by Rampersad et al. to have the least critical exponent.
Francesco Dolce, L’ubomíra Dvořáková, Edita Pelantová

Completely Reachable Automata, Primitive Groups and the State Complexity of the Set of Synchronizing Words

We give a new characterization of primitive permutation groups tied to the notion of completely reachable automata. Also, we introduce sync-maximal permutation groups tied to the state complexity of the set of synchronizing words of certain associated automata and show that they are contained between the 2-homogeneous and the primitive groups. Lastly, we define k-reachable groups in analogy with synchronizing groups and motivated by our characterization of primitive permutation groups. But the results show that a k-reachable permutation group of degree n with \(6 \le k \le n - 6\) is either the alternating or the symmetric group.
Stefan Hoffmann

State Complexity of the Set of Synchronizing Words for Circular Automata and Automata over Binary Alphabets

Most slowly synchronizing automata over binary alphabets are circular, i.e., containing a letter permuting the states in a single cycle, and their set of synchronizing words has maximal state complexity, which also implies complete reachability. Here, we take a closer look at generalized circular and completely reachable automata. We derive that over a binary alphabet every completely reachable automaton must be circular, a consequence of a structural result stating that completely reachable automata over strictly less letters than states always contain permutational letters. We state sufficient conditions for the state complexity of the set of synchronizing words of a generalized circular automaton to be maximal. We apply our main criteria to the family \(\mathscr {K}_n\) of automata that was previously only conjectured to have this property.
Stefan Hoffmann

Cadences in Grammar-Compressed Strings

Cadences are structurally maximal arithmetic progressions of indices corresponding to equal characters in an underlying string.
This paper provides a detection algorithm for 3-cadences in binary strings which runs in linear time on uncompressed strings and in polynomial time on grammar-compressed strings.
Furthermore, this paper proves that several variants of the cadence detection problem are \(\mathcal {NP}\)-complete on grammar-compressed strings and that the equidistant subsequence matching problem with patterns of length three is \(\mathcal {NP}\)-complete on grammar-compressed ternary strings.
Julian Pape-Lange


Additional information