main-content

## Über dieses Buch

This book constitutes the proceedings of the 14th International Conference on Language and Automata Theory and Applications, LATA 2020, held in Milan, Italy, in March 2020. The 26 full papers presented in this volume were carefully reviewed and selected from 59 submissions. They were organized in topical sections named: algebraic structures; automata; complexity; grammars; languages; trees and graphs; and words and codes. The book also contains 6 invited papers in full-paper length.

## Inhaltsverzeichnis

### The New Complexity Landscape Around Circuit Minimization

We survey recent developments related to the Minimum Circuit Size Problem.

Eric Allender

### Containment and Equivalence of Weighted Automata: Probabilistic and Max-Plus Cases

This paper surveys some results regarding decision problems for probabilistic and max-plus automata, such as containment and equivalence. Probabilistic and max-plus automata are part of the general family of weighted automata, whose semantics are maps from words to real values. Given two weighted automata, the equivalence problem asks whether their semantics are the same, and the containment problem whether one is point-wise smaller than the other one. These problems have been studied intensively and this paper will review some techniques used to show (un)decidability and state a list of open questions that still remain.

Laure Daviaud

### Approaching Arithmetic Theories with Finite-State Automata

The automata-theoretic approach provides an elegant method for deciding linear arithmetic theories. This approach has recently been instrumental for settling long-standing open problems about the complexity of deciding the existential fragments of Büchi arithmetic and linear arithmetic over p-adic fields. In this article, which accompanies an invited talk, we give a high-level exposition of the NP upper bound for existential Büchi arithmetic, obtain some derived results, and further discuss some open problems.

Christoph Haase

### Recompression: Technique for Word Equations and Compressed Data

In this talk I will present the recompression technique on the running example of word equations. In word equation problem we are given an equation $$u = v$$, where both u and v are words of letters and variables, and ask for a substitution of variables by words that equalizes the sides of the equation. The recompression technique is based on employing simple compression rules (replacement of two letters ab by a new letter c, replacement of maximal repetitions of a by a new letter), and modifying the equations (replacing a variable X by bX or Xa) so that those operations are sound and complete. The simple analysis focuses on the size of the instance and not on the combinatorial properties of words that are used. The recompression-based algorithm for word equations runs in nondeterministic linear space.The approach turned out to be quite robust and can be applied to various generalized, simplified and related problems, in particular, to problems in the area of grammar compressed words. I will comment on some of those applications.

Artur Jeż

### How to Prove that a Language Is Regular or Star-Free?

This survey article presents some standard and less standard methods used to prove that a language is regular or star-free.

Jean-Éric Pin

### Deciding Classes of Regular Languages: The Covering Approach

We investigate the membership problem that one may associate to every class of languages $$\mathcal {C}$$. The problem takes a regular language as input and asks whether it belongs to $$\mathcal {C}$$. In practice, finding an algorithm provides a deep insight on the class $$\mathcal {C}$$. While this problem has a long history, many famous open questions in automata theory are tied to membership. Recently, a breakthrough was made on several of these open questions. This was achieved by considering a more general decision problem than membership: covering. In the paper, we investigate how the new ideas and techniques brought about by the introduction of this problem can be applied to get new insight on earlier results. In particular, we use them to give new proofs for two of the most famous membership results: Schützenberger’s theorem and Simon’s theorem.

Thomas Place

### Nonstandard Cayley Automatic Representations for Fundamental Groups of Torus Bundles over the Circle

We construct a new family of Cayley automatic representations of semidirect products $$\mathbb {Z}^n \rtimes _A \mathbb {Z}$$ for which none of the projections of the normal subgroup $$\mathbb {Z}^n$$ onto each of its cyclic components is finite automaton recognizable. For $$n=2$$ we describe a family of matrices from $$\mathrm {GL}(2,\mathbb {Z})$$ corresponding to these representations. We are motivated by a problem of characterization of all possible Cayley automatic representations of these groups.

Dmitry Berdinsky, Prohrak Kruengthomya

### Is Decidable in

We show that it is decidable whether or not a relation on the reals definable in the structure $$\langle \mathbb {R}, +,< ,\mathbb {Z}\rangle$$ can be defined in the structure $$\langle \mathbb {R}, +,<,1 \rangle$$. This result is achieved by obtaining a topological characterization of $$\langle \mathbb {R}, +,<,1 \rangle$$-definable relations in the family of $$\langle \mathbb {R}, +,< ,\mathbb {Z}\rangle$$-definable relations and then by following Muchnik’s approach of showing that this characterization can be expressed in the logic of $$\langle \mathbb {R}, +,<,1 \rangle$$.

Alexis Bès, Christian Choffrut

### Ordered Semiautomatic Rings with Applications to Geometry

The present work looks at semiautomatic rings with automatic addition and comparisons which are dense subrings of the real numbers and asks how these can be used to represent geometric objects such that certain operations and transformations are automatic. The underlying ring has always to be a countable dense subring of the real numbers and additions and comparisons and multiplications with constants need to be automatic. It is shown that the ring can be selected such that equilateral triangles can be represented and rotations by $$30^\circ$$ are possible, while the standard representation of the b-adic rationals does not allow this.

Ziyuan Gao, Sanjay Jain, Ji Qi, Philipp Schlicht, Frank Stephan, Jacob Tarr

### Boolean Monadic Recursive Schemes as a Logical Characterization of the Subsequential Functions

This paper defines boolean monadic recursive schemes (BMRSs), a restriction on recursive programs, and shows that when interpreted as transductions on strings they describe exactly the subsequential functions. We discuss how this new result furthers the study of the connections between logic, formal languages and functions, and automata.

### Expressiveness and Conciseness of Timed Automata for the Verification of Stochastic Models

Timed Automata are a well-known formalism for specifying timed behaviours. In this paper we are concerned with Timed Automata for the specification of timed behaviour of Continuous Time Markov Chains (CTMC), as used in the stochastic temporal logic CSL$$^{\!\text {TA}}$$. A timed path formula of CSL$$^{\!\text {TA}}$$ is specified by a Deterministic Timed Automaton (DTA) that features two kinds of transitions: synchronizing transitions (triggered by CTMC transitions) and autonomous transitions (triggered when a clock reaches a given threshold). Other definitions of CSL$$^{\!\text {TA}}$$ are based on DTAs that do not include autonomous transitions. This raises the natural question: do autonomous transitions enhance expressiveness and/or conciseness of DTAs? We prove that this is the case and we provide a syntactical characterization of DTAs for which autonomous transitions do not add expressive power, but allow one to define exponentially more concise DTAs.

### Windable Heads and Recognizing NL with Constant Randomness

Every language in NL has a k-head two-way nondeterministic finite automaton (2nfa($$k$$)) recognizing it. It is known how to build a constant-space verifier algorithm from a 2nfa($$k$$) for the same language with constant-randomness, but with error probability that can not be reduced further by repetition. We have defined the unpleasant characteristic of the heads that causes the high error as the property of being “windable”. With a tweak on the previous verification algorithm, the error is improved to , where $$k_{\text {W}}\le k$$ is the number of windable heads. Using this new algorithm, a subset of languages in NL that have a 2nfa($$k$$) recognizer with $$k_{\text {W}}\le 1$$ can be verified with arbitrarily reducible error using constant space and randomness.

Mehmet Utkan Gezer

### Alternating Finite Automata with Limited Universal Branching

We consider measures that limit universal parallelism in computations of an alternating finite automaton (AFA). Maximum pared tree width counts the largest number of universal branches in any computation and acceptance width counts the number of universal branches in the best accepting computation, i.e., in the accepting computation with least universal parallelism. We give algorithms to decide whether the maximum pared tree width or the acceptance width of an AFA are bounded by an integer k. For a constant k the algorithm for maximum pared tree width operates in polynomial time. An AFA with m states and acceptance width k can be converted to an NFA with $$(m+1)^k$$ states. We consider corresponding lower bounds for the transformation. The tree width of an AFA counts the number of all (existential and universal) branches of the computation. We give upper and lower bounds for converting an AFA of bounded tree width to a DFA.

Chris Keeler, Kai Salomaa

### Pebble-Intervals Automata and FO with Two Orders

We introduce a novel automata model, which we call pebble-intervals automata (PIA), and study its power and closure properties. PIAs are tailored for a decidable fragment of FO that is important for reasoning about structures that use data values from infinite domains: the two-variable fragment with one total preorder and its induced successor relation, one linear order, and an arbitrary number of unary relations. We prove that the string projection of every language of data words definable in the logic is accepted by a pebble-intervals automaton , and obtain as a corollary an automata-theoretic proof of the $$\textsc {ExpSpace}$$ upper bound for finite satisfiability due to Schwentick and Zeume.

Nadia Labai, Tomer Kotek, Magdalena Ortiz, Helmut Veith

### Limited Two-Way Deterministic Finite Automata with Advice

External assistance in the form of strings called advice is given to an automaton in order to make it a non-uniform model of computation. Automata with advice are then examined to better understand the limitations imposed by uniformity, which is a typical property shared by all feasible computational models. The main contribution of this paper is to introduce and investigate an extension of the model introduced by Küçük et al. [6]. The model is called circular deterministic finite automaton with advice tape (cdfat). In this model the input head is allowed to pass over input multiple times. The number of allowed passes over the input, which is typically a function of input length, is considered as a resource besides the advice amount. The results proved for the model include a hierarchy for cdfat with real-time heads, simulation of 1w/1w cdfat by 1w/rt cdfat, lower bounds of resources provided to a cdfat in order to make it powerful enough to recognize any language, utilizable advice limit regardless of the allowed pass limit, a relation between utilizable pass limit and advice limit, and some closure properties.

Ahmet Bilal Uçan

### On the Size of Depth-Two Threshold Circuits for the Inner Product Mod 2 Function

In this paper, we study the size of depth-two threshold circuits computing the inner product mod 2 function $$\mathsf{IP2}_n(x_1, \ldots , x_n, y_1, \ldots , y_n) := \sum _{i} x_i y_i$$ (mod 2). First, we reveal that $$\mathsf{IP2}_n$$ can be computed by a depth-two threshold circuit of size significantly smaller than a folklore construction of size $$O(2^n)$$. Namely, we give a construction of such a circuit (denoted by $$\mathsf{THR}\circ \mathsf{THR}$$ circuit) of size $$O(1.682^n)$$. We also give an upper bound of $$O(1.899^n)$$ for the case that the weights of the top threshold gate are polynomially bounded (denoted by $$\mathsf{MAJ}\circ \mathsf{THR}$$ circuit). Second, we give new lower bounds on the size of depth-two circuits of some special form; the top gate is an unbounded weight threshold gate and the bottom gates are symmetric gates (denoted by $$\mathsf{THR}\circ \mathsf{SYM}$$ circuit). We show that any such circuit computing $$\mathsf{IP2}_n$$ has size $$\varOmega ((1.5-\epsilon )^n)$$ for every constant $$\epsilon > 0$$. This improves the previous bound of $$\varOmega (\sqrt{2}^n/n)$$ based on the sign-rank method due to Forster et al. [JCSS ’02, FSTTCS ’01]. Our technique has a unique feature that the lower bound is obtained by giving an explicit feasible solution to (the dual of) a certain linear programming problem. In fact, the problem itself was presented by the author over a decade ago [MFCS ’05], and finding a good solution is an actual contribution of this work.

Kazuyuki Amano

### Complexity Issues of String to Graph Approximate Matching

The problem of matching a query string to a directed graph, whose vertices are labeled by strings, has application in different fields, from data mining to computational biology. Several variants of the problem have been considered, depending on the fact that the match is exact or approximate and, in this latter case, which edit operations are considered and where are allowed. In this paper we present results on the complexity of the approximate matching problem, where edit operations are symbol substitutions and are allowed only on the graph labels or both on the graph labels and the query string. We introduce a variant of the problem that asks whether there exists a path in a graph that represents a query string with any number of edit operations and we show that is NP-complete, even when labels have length one and in the case the alphabet is binary. Moreover, when it is parameterized by the length of the input string and graph labels have length one, we show that the problem is fixed-parameter tractable and it is unlikely to admit a polynomial kernel. The NP-completeness of this problem leads to the inapproximability (within any factor) of the approximate matching when edit operations are allowed only on the graph labels. Moreover, we show that the variants of approximate string matching to graph we consider are not fixed-parameter tractable, when the parameter is the number of edit operations, even for graphs that have distance one from a DAG. The reduction for this latter result allows us to prove the inapproximability of the variant where edit operations can be applied both on the query string and on graph labels.

Riccardo Dondi, Giancarlo Mauri, Italo Zoppis

### Complexity of Automatic Sequences

Automatic sequences can be defined by DFAs with output (DFAO) in two natural ways. We propose to consider the minimal size of a corresponding DFAO as the complexity measure of the automatic sequence, for both variants. This paper compares these complexity measures and investigates their properties like the relationships with kernel and morphic sequences. There exist automatic sequences for which the one complexity is exponentially greater than the other one, in both directions. For both complexity measures we investigate the effect of taking basic operations on sequences like removing or adding an element in front, and observe that these operations may increase the complexity by at most a quadratic factor.

Hans Zantema

### Context-Sensitive Fusion Grammars Are Universal

Context-sensitive fusion grammars are a special case of context-dependent fusion grammars where a rule has only a single positive context condition instead of finite sets of positive and negative context conditions. They generate hypergraph languages from start hypergraphs via successive applications of context-sensitive fusion rules and multiplications of connected components, as well as a filtering mechanism to extract terminal hypergraphs from derived hypergraphs in a certain way. The application of a context-sensitive fusion rule consumes two complementarily labeled hyperedges and identifies corresponding attachment vertices provided that the context condition holds. In this paper, we show that the Post correspondence problem can be formulated very intuitively by such a grammar. Furthermore, we prove that these grammars can generate all recursively enumerable string languages (up to representation of strings as graphs) and are universal in this respect.

Aaron Lye

### Cyclic Shift on Multi-component Grammars

Multi-component grammars, known in the literature as “multiple context-free grammars” and “linear context-free rewriting systems”, describe the structure of a string by defining the properties of k-tuples of its substrings, in the same way as ordinary formal grammars (Chomsky’s “context-free”) define properties of substrings. It is shown that, for every fixed k, the family of languages described by k-component grammars is closed under the cyclic shift operation. On the other hand, the subfamily defined by well-nested k-component grammars is not closed under the cyclic shift, yet their cyclic shifts are always defined by well-nested $$(k+1)$$-component grammars.

Alexander Okhotin, Alexey Sorokin

### The Automatic Baire Property and an Effective Property of -Rational Functions

We prove that $$\omega$$-regular languages accepted by Büchi or Muller automata satisfy an effective automata-theoretic version of the Baire property. Then we use this result to obtain a new effective property of rational functions over infinite words which are realized by finite state Büchi transducers: for each such function $$F: \mathsf {\Sigma }^\omega \rightarrow \mathsf {\Gamma }^\omega$$, one can construct a deterministic Büchi automaton $$\mathcal {A}$$ accepting a dense $$\mathbf{\Pi }^0_2$$-subset of $$\mathsf {\Sigma }^\omega$$ such that the restriction of F to $$L(\mathcal {A})$$ is continuous.

Olivier Finkel

### The Power of Programs over Monoids in

The model of programs over (finite) monoids, introduced by Barrington and Thérien, gives an interesting way to characterise the circuit complexity class and its subclasses and showcases deep connections with algebraic automata theory. In this article, we investigate the computational power of programs over monoids in , a small variety of finite aperiodic monoids. First, we give a fine hierarchy within the class of languages recognised by programs over monoids from , based on the length of programs but also some parametrisation of . Second, and most importantly, we make progress in understanding what regular languages can be recognised by programs over monoids in . We show that those programs actually can recognise all languages from a class of restricted dot-depth one languages, using a non-trivial trick, and conjecture that this class suffices to characterise the regular languages recognised by programs over monoids in .

Nathan Grosshans

### Geometrically Closed Positive Varieties of Star-Free Languages

A recently introduced operation of geometrical closure on formal languages is investigated. It is proved that the geometrical closure of a language from the positive variety $$\mathcal {V}_{3/2}$$, the level 3/2 of the Straubing-Thérien hierarchy of star-free languages, always falls into the variety $$\mathcal {R}_{LT}$$, which is a new variety consisting of specific R-trivial languages. As a consequence, each class of regular languages lying between $$\mathcal {R}_{LT}$$ and $$\mathcal {V}_{3/2}$$ is geometrically closed.

Ondřej Klíma, Peter Kostolányi

### Intersection and Union Hierarchies of Deterministic Context-Free Languages and Pumping Lemmas

We study the computational complexity of finite intersections and unions of deterministic context-free languages. Earlier, Wotschke (1978) demonstrated that intersections of $$(d+1)$$ deterministic context-free languages are in general more powerful than intersections of d deterministic context-free languages for any positive integer d based on the hierarchy separation of Liu and Weiner (1973). The argument of Liu and Weiner, however, works only on bounded languages of particular forms, and therefore Wotschke’s result cannot be extended to disprove any other language to be written in the form of an intersection of d deterministic context-free languages. To deal with the non-membership of a wide range of languages, we circumvent their proof argument and instead devise a new, practical technical tool: a pumping lemma for finite unions of deterministic context-free languages. Since the family of deterministic context-free languages is closed under complementation, this pumping lemma enables us to show a non-membership relation of languages made up with finite intersections of even non-bounded languages as well. We also refer to a relationship to Hibbard’s limited automata.

Tomoyuki Yamakami

### On the Weisfeiler-Leman Dimension of Fractional Packing

The k-dimensional Weisfeiler-Leman procedure ($$k\text {-}\mathrm {WL}$$) has proven to be immensely fruitful in the algorithmic study of Graph Isomorphism. More generally, it is of fundamental importance in understanding and exploiting symmetries in graphs in various settings. Two graphs are $$k\text {-}\mathrm {WL}$$-equivalent if dimention k does not suffice to distinguish them. $$1\text {-}\mathrm {WL}$$-equivalence is known as fractional isomorphism of graphs, and the $$k\text {-}\mathrm {WL}$$-equivalence relation becomes finer as k increases.We investigate to what extent standard graph parameters are preserved by $$k\text {-}\mathrm {WL}$$-equivalence, focusing on fractional graph packing numbers. The integral packing numbers are typically NP-hard to compute, and we discuss applicability of $$k\text {-}\mathrm {WL}$$-invariance for estimating the integrality gap of the LP relaxation provided by their fractional counterparts.

Vikraman Arvind, Frank Fuhlbrück, Johannes Köbler, Oleg Verbitsky

### Input Strictly Local Tree Transducers

We generalize the class of input strictly local string functions (Chandlee et al. 2014) to tree functions. We show they are characterized by a subclass of frontier-to-root, deterministic, linear tree transducers. We motivate this class from the study of natural language as it provides a way to distinguish local syntactic processes from non-local ones. We give examples illustrating this kind of analysis.

Jing Ji, Jeffrey Heinz

### Lyndon Words versus Inverse Lyndon Words: Queries on Suffixes and Bordered Words

The Lyndon factorization of a word has been extensively studied in different contexts and several variants of it have been proposed. In particular, the canonical inverse Lyndon factorization $$\mathrm {ICFL}$$, introduced in [5], maintains the main properties of the Lyndon factorization since it can be computed in linear time and it is uniquely determined. In this paper we investigate new properties of this factorization with the purpose of exploring its use in string queries.As a main result, we prove an upper bound on the length of the longest common extension (or longest common prefix) for two factors of a word w. This bound is at most the maximum length of two consecutive factors of $$\mathrm {ICFL}(w)$$. A tool used in the proof is a property that we state for factors with nonempty borders in $$\mathrm {ICFL}(w)$$: a nonempty border of a factor $$m_i$$ cannot be a prefix of the next factor $$m_{i+1}$$. Another interesting result relates sorting of global suffixes, i.e., suffixes of a word w, and sorting of local suffixes, i.e., suffixes of the factors in $$\mathrm {ICFL}(w)$$.Finally, given a word w and a factor x of w, we prove that their Lyndon factorizations share factors, except for the first and last term of the Lyndon factorization of x. This property suggests that, given two words sharing a common overlap, their Lyndon factorizations could be used to capture the common overlap of these two words.

Paola Bonizzoni, Clelia De Felice, Rocco Zaccagnino, Rosalba Zizza

### Reducing the Ambiguity of Parikh Matrices

The Parikh matrix mapping allows us to describe words using matrices. Although compact, this description comes with a level of ambiguity since a single matrix may describe multiple words. This work looks at how considering the Parikh matrices of various transformations of a given word can decrease that ambiguity. More specifically, for any word, we study the Parikh matrix of its Lyndon conjugate as well as that of its projection to a smaller alphabet. Our results demonstrate that ambiguity can often be reduced using these concepts, and we give conditions on when they succeed.

Jeffery Dick, Laura K. Hutchinson, Robert Mercaş, Daniel Reidenbach

### On Collapsing Prefix Normal Words

Prefix normal words are binary words in which each prefix has at least the same number of $$\mathsf {1}$$s as any factor of the same length. Firstly introduced in 2011, the problem of determining the index (amount of equivalence classes for a given word length) of the prefix normal equivalence relation is still open. In this paper, we investigate two aspects of the problem, namely prefix normal palindromes and so-called collapsing words (extending the notion of critical words). We prove characterizations for both the palindromes and the collapsing words and show their connection. Based on this, we show that still open problems regarding prefix normal words can be split into certain subproblems.

Pamela Fleischmann, Mitja Kulczynski, Dirk Nowotka, Danny Bøgsted Poulsen

### Simplified Parsing Expression Derivatives

This paper presents a new derivative parsing algorithm for parsing expression grammars; this new algorithm is both simpler and faster than the existing parsing expression derivative algorithm presented by Moss [12]. This new algorithm improves on the worst-case space and runtime bounds of the previous algorithm by a linear factor, as well as decreasing runtime by about half in practice.

Aaron Moss

### Complete Variable-Length Codes: An Excursion into Word Edit Operations

Given an alphabet A and a binary relation $$\tau \subseteq A^*\times A^*$$, a language $$X\subseteq A^*$$ is $$\tau$$-independent if $$\tau (X)\cap X\,=\,\emptyset$$; X is $$\tau$$-closed if $$\tau (X)\subseteq X$$. The language X is complete if any word over A is a factor of some concatenation of words in X. Given a family of languages $$\mathcal{F}$$ containing X, X is maximal in $$\mathcal{F}$$ if no other set of $$\mathcal{F}$$ can strictly contain X. A language $$X\subseteq A^*$$ is a variable-length code if any equation among the words of X is necessarily trivial. The study discusses the relationship between maximality and completeness in the case of $$\tau$$-independent or $$\tau$$-closed variable-length codes. We focus to the binary relations by which the images of words are computed by deleting, inserting, or substituting some characters.

Jean Néraud

### Backmatter

Weitere Informationen