Top

2013 | Book

Read chapter Read first chapter

Language and Automata Theory and Applications

7th International Conference, LATA 2013, Bilbao, Spain, April 2-5, 2013. Proceedings

Editors: Adrian-Horia Dediu, Carlos Martín-Vide, Bianca Truthe

Publisher: Springer Berlin Heidelberg

Book Series : Lecture Notes in Computer Science

Part of: Springer Professional "Wirtschaft+Technik" , Springer Professional "Technik" , Springer Professional "Wirtschaft"

About this book

This book constitutes the refereed proceedings of the 7th International Conference on Language and Automata Theory and Applications, LATA 2013, held in Bilbao, Spain in April 2013.

The 45 revised full papers presented together with 5 invited talks were carefully reviewed and selected from 97 initial submissions. The volume features contributions from both classical theory fields and application areas (bioinformatics, systems biology, language technology, artificial intelligence, etc.). Among the topics covered are algebraic language theory; algorithms for semi-structured data mining; algorithms on automata and words; automata and logic; automata for system analysis and program verification; automata, concurrency and Petri nets; automatic structures; cellular automata; combinatorics on words; computability; computational complexity; computational linguistics; data and image compression; decidability questions on words and languages; descriptional complexity; DNA and other models of bio-inspired computing; document engineering; foundations of finite state technology; foundations of XML; fuzzy and rough languages; grammars (Chomsky hierarchy, contextual, multidimensional, unification, categorial, etc.); grammars and automata architectures; grammatical inference and algorithmic learning; graphs and graph transformation; language varieties and semigroups; language-based cryptography; language-theoretic foundations of artificial intelligence and artificial life; parallel and regulated rewriting; parsing; pattern recognition; patterns and codes; power series; quantum, chemical and optical computing; semantics; string and combinatorial issues in computational biology and bioinformatics; string processing algorithms; symbolic dynamics; symbolic neural networks; term rewriting; transducers; trees, tree languages and tree automata; weighted automata.

Frontmatter

Invited Talks

Complexity Dichotomy for Counting Problems

I would like to report on some significant progress in the study of the exact complexity of counting problems. Specifically I will describe the classification program of counting complexity of locally specified problems. This classification program is advanced in three interrelated frameworks: Graph Homomorphisms, Counting CSP, and Holant Problems. In each formulation, complexity dichotomy theorems have been achieved which classify every problem in a given class to be either solvable in polynomial time or #P-hard.

Jin-Yi Cai

Algorithms for Analyzing and Verifying Infinite-State Recursive Probabilistic Systems

Until recent years, research on automated verification and model checking of probabilistic systems was largely confined to using

finite-state

Markov chains (MCs) and

finite-state

Markov Decision processes (MDPs) as the core underlying models. In recent years, researchers have begun to develop new algorithms for, and study the computational complexity of, analysis and verification problems for classes of finitely-presented infinite-state probabilistic systems that arise as probabilistic extensions to classic infinite-state automata-theoretic models.

Kousha Etessami

Recursion Schemes, Collapsible Pushdown Automata and Higher-Order Model Checking

This paper is about two models of computation that underpin recent developments in the algorithmic verification of higher-order computation. Recursion schemes are in essence the simply-typed lambda calculus with recursion, generated from first-order symbols. Collapsible pushdown automata are a generalisation of pushdown automata to higher-order stacks — which are iterations of stack of stacks — that contain symbols equipped with links. We study and compare the expressive power of the two models and the algorithmic properties of infinite structures such as trees and graphs generated by them. We conclude with a brief overview of recent applications to the model checking of higher-order functional programs. A central theme of the work is the fruitful interplay of ideas between the neighbouring fields of semantics and algorithmic verification.

Luke Ong

Discrete Linear Dynamical Systems

The theory of

dynamical systems

is concerned with describing and studying the evolution of systems over time, where a ‘system’ is represented as a vector of variables, and there is a fixed rule governing how the system evolves. Dynamical systems originate in the development of Newtonian mechanics, and have widespread applications in many areas of science and engineering. Systems that evolve in a piecewise continuous manner (typically via differential equations) are known as

continuous

dynamical systems, whereas systems exhibiting discrete transitions (commonly via difference equations) are known as

discrete

dynamical systems.

The theory of dynamical systems comprises a rich body of techniques and results, mainly geared towards the analysis of long-term qualitative behaviour of systems: existence and uniqueness of attractors, fixed points, or periodic points, sensitivity to initial conditions, etc. From a computer-science perspective, it is somewhat surprising to note that the literature is largely devoid of work on

decision problems

concerning dynamical systems, e.g., whether a fixed point or a particular region will actually be reached in finite time, whether a variable will assume negative values infinitely often, etc. Such questions, in turn, have numerous applications in a wide array of scientific areas, such as theoretical biology (analysis of L-systems, population dynamics), microeconomics (stability of supply-and-demand equilibria in cyclical markets), software verification (termination of linear programs), probabilistic model checking (reachability in Markov chains, stochastic logics), quantum computing (threshold problems for quantum automata), as well as combinatorics, term rewriting, formal languages, cellular automata, the study of generating functions, etc.

In this tutorial, I will first briefly introduce the main elements of the theory of (both continuous and discrete) dynamical systems, using several illustrative examples. I will then present various interesting decision problems, mainly in the context of discrete

linear

dynamical systems, for which there already are many open questions. Finally, I will survey some of the main known results and techniques, which draw on a wide array of mathematical tools, including linear algebra, algebraic and analytic number theory, real algebraic geometry, and model theory.

Joël Ouaknine

XML Schema Management: A Challenge for Automata Theory

XML is the standard format for data exchange over the internet and the Web contains millions of XML documents. Even though the structure of XML documents can be very flexible, it is desirable for many applications that documents come with a schema that describe their structure in a concise way. Whereas there exists a lot of software for the manipulation of XML documents and modern database management systems can deal with XML, the foundations of XML schema and document management systems are still not fully understood and they constitute an active research area.

Schema languages for XML, most prominently XML Schema and DTD, are closely related to concepts from Formal Language Theory, including context-free grammars, tree automata and regular expressions. However, the official standards of the

World Wide Web Consortium

(W3C) pose various restrictions that have not been much studied in Formal Language Theory before the advent of XML (and its precursor SGML). Altogether, the foundations of XML schema and document management systems raise a lot of new challenges to Automata Theory, in particular, to provide suitable algorithms and concepts.

The aim of this talk is to describe some of these challenges, to report on recent developments and to highlight some current directions of research. Topics will include the expressive power of schema languages, the repair of schemas that do not obey the restrictions posed by the W3C standards, the combination of schemas, schema conversion, schema minimisation, inference of schemas from a given set of documents, segmentation of schemas for distributed documents, and constraints that combine structural and data aspects.

Thomas Schwentick

Regular Papers

On the Complexity of Shortest Path Problems on Discounted Cost Graphs

Discounted Cost Register Automata (DCRA) associate costs with strings in a regular manner using the operation of discounted sum. The min-cost optimization problem for DCRAs corresponds to computing shortest paths in graphs with more general forms of discounting than the well-studied notion of future discounting. We present solutions to two classes of such shortest path problems: in presence of both past and future discounting, we show the decision problem is

-complete, but has a polynomial-time approximation scheme; in presence of two future discounting criteria that are composed in a prioritized manner, we show that the problem is solvable in

Nexptime

Rajeev Alur, Sampath Kannan, Kevin Tian, Yifei Yuan

Termination of Rule-Based Calculi for Uniform Semi-Unification

Uniform semi-unification is a generalization of unification; its efficient algorithms have been extensively studied in (Kapur et al., 1994) and (Oliart&Snyder, 2004). For (uniform) semi-unification, several variants of rule-based calculi are known. But, some of these calculi in the literature lack the termination property, i.e. not all derivations are terminating. We revisit symbolic semi-unification whose solvability coincides with that of uniform semi-unification. We give an abstract criterion of the strategy on which a general rule-based calculus for symbolic semi-unification terminates. Based on this, we give an alternative and robust correctness proof of a rule-based uniform semi-unification algorithm.

Takahito Aoto, Munehiro Iwami

Deciding WQO for Factorial Languages

A language is factorial if it is closed under taking factors (i.e. contiguous subwords). Every factorial language can be described by an antidictionary, i.e. a minimal set of forbidden factors. We show that the problem of deciding whether a factorial language given by a

finite

antidictionary is well-quasi-ordered under the factor containment relation can be solved in polynomial time.

Aistis Atminas, Vadim Lozin, Mikhail Moshkov

On the Construction of a Family of Automata That Are Generically Non-minimal

One way to generate an accessible deterministic finite automaton is to first generate a spanning tree and then complete it to an automaton. We introduce the ideas of a sequential automaton, that are automata with sequential trees as breadth-first spanning subtrees. We introduce the concept of elementary equivalent states and explore combinatorial properties of non-minimal sequential automata. We then show that minimality is negligible among sequential automata by calculating the probability that an automaton has two elementary equivalent states and showing that this probability approach 1 as the size of the automaton increases.

Parisa Babaali, Christopher Knaplund

Limited Non-determinism Hierarchy of Counter Automata

Automata with weights on edges, especially automata with counters, have been studied extensively in recent years, both because of to their interesting theory and due to their practical applications in data analysis. One of the most significant differences between weighted and classical automata concerns determinization: while every classical automaton can be determinized, this is not the case for weighted automata. Still, obtaining an equivalent automaton as close to a sequential (deterministic) one as possible is crucial in many practical applications, as unbounded non-determinism incurs large computational costs. There exist a few ways to limit the non-determinism of a counter automaton. For each word, one can require that only

runs are accepting (

-ambiguous automata), that there are only

possible runs at all (

-path automata), or one can restrict the automaton itself to be a disjoint sum of

sequential ones (

-sequential automata). Moreover, there are different types of automata with counters: distance automata that cannot reset, desert automata, and R-automata with many counters. In this paper, we establish a hierarchy for all these possibilities. First, we show that the parameter

induces a hierarchy in all cases. Then, we prove that

-path automata can be made 2

− 1

-sequential and that this bound is strict. Finally, we show an unambiguous automaton which is not finitely sequential at all.

Sebastian Bala, Dariusz Jackowski

Unambiguous Automata Denoting Finitely Sequential Functions

The min-plus automata with real weights are interesting both in theory and in practice, e.g. their variants are used as a data structure in speech recognition. In this paper we study automata which are finite unions of deterministic ones, called finitely sequential automata. Such automata allow fast detection of optimal paths in parallel while still allowing to express ambiguous functions. We provide a polynomial time algorithm which decides if the given min-plus unambiguous automaton, with rational weights, has a finitely sequential version and we show how to build such equivalent one if the answer is positive. To this end, we introduce the Fork Property which plays the same role as the negation of the Twin Property in case of determinisation. We show that an unambiguous automaton can be transformed into a finitely sequential one if and only if the Fork Property is not satisfied.

Sebastian Bala, Artur Koniński

Duplication-Loss Genome Alignment: Complexity and Algorithm

Recently, an Alignment approach for the comparison of two genomes, based on an evolutionary model restricted to Duplications and Losses, has been presented. An exact linear programming algorithm has been developed and successfully applied to the Transfer RNA (tRNA) repertoire in Bacteria, leading to interesting observation on tRNA shift of identity. Here, we explore a direct dynamic programming approach for the Duplication-Loss Alignment of two genomes, which proceeds in two steps: (1) (The Dynamic Programming step) Outputs a best candidate alignment between the two genomes and (2) (

Minimum Label Alignment

problem) Finds an evolutionary scenario of minimum duplication-loss cost that is in agreement with the alignment. We show that the

Minimum Label Alignment

is APX-hard, even if the number of occurrences of a gene inside a genome is bounded by 5. We then develop a heuristic which is a thousands of times faster than the linear programming algorithm and exhibits a high degree of accuracy on simulated datasets. The heuristic has been implemented in JAVA and is available on request.

Billel Benzaid, Riccardo Dondi, Nadia El-Mabrouk

Maximizing Entropy over Markov Processes

The channel capacity of a deterministic system with confidential data is an upper bound on the amount of bits of data an attacker can learn from the system. We encode all possible attacks to a system using a probabilistic specification, an Interval Markov Chain. Then the channel capacity computation reduces to finding a model of a specification with highest entropy.

Entropy maximization for probabilistic process specifications has not been studied before, even though it is well known in Bayesian inference for discrete distributions. We give a characterization of global entropy of a process as a reward function, a polynomial algorithm to verify the existence of an system maximizing entropy among those respecting a specification, a procedure for the maximization of reward functions over Interval Markov Chains and its application to synthesize an implementation maximizing entropy.

We show how to use Interval Markov Chains to model abstractions of deterministic systems with confidential data, and use the above results to compute their channel capacity. These results are a foundation for ongoing work on computing channel capacity for abstractions of programs derived from code.

Fabrizio Biondi, Axel Legay, Bo Friis Nielsen, Andrzej Wąsowski

MAT Learning of Universal Automata

A MAT learning algorithm is presented that infers the universal automaton (UA) for a regular target language, using a polynomial number of queries with respect to that automaton. The UA is one of several canonical characterizations for regular languages. Our learner is based on the concept of an observation table, which seems to be particularly fitting for this computational model, and the necessary notions and definitions are adapted from the literature to the case of UA.

Johanna Björklund, Henning Fernau, Anna Kasprzik

A Graph Polynomial Approach to Primitivity

Recently, Tittmann et al. introduced the

subgraph component polynomial

and showed that its power for distinguishing graphs is quite different from the power of other graph polynomials that appear in the literature such as the matching polynomial, the Tutte polynomial, the characteristic polynomial, the chromatic polynomial, etc. The subgraph component polynomial enumerates vertex induced subgraphs in a given undirected graph with respect to the number of components. We show the use of the subgraph component polynomial to count the number of primitive partial words of a given length over an alphabet of a fixed size, which leads to a method for enumerating such partial words.

Francine Blanchet-Sadri, Michelle Bodnar, Nathan Fox, Joe Hidakatsu

Suffix Trees for Partial Words and the Longest Common Compatible Prefix Problem

Suffix trees, introduced by Weiner in 1973, are powerful data structures used to solve many problems on words such as searching for patterns in biological sequences, data compression, text processing, etc. Although they have fallen out of favor in the past years to the more space-efficient suffix arrays, suffix trees are useful for modelling partial words by allowing paths to meet whilst keeping them acyclic through directedness (a partial word may have don’t care symbols or holes, which are compatible with any letter of the alphabet over which it is defined). We extend suffix trees to partial words by introducing a suffix directed acyclic graph, with compatibility links, that exhibits all the suffixes while preserving the longest common compatible prefix, lccp, between suffixes. We give an optimal

(

) time and space algorithm for constructing the suffix dag of a given partial word

with an arbitrary number of holes of length

over a fixed alphabet by modifying Weiner’s algorithm. Our algorithm also computes the lccp array between suffixes of

starting with holes and all other suffixes of

. It possesses the invariant that after the suffix at position

has been processed, the lccp between any suffix starting at or after position

with any suffix starting with a hole can be computed in constant time. As a result, with

(

) preprocessing time, finding the lccp of two given suffixes of

requires constant time.

Francine Blanchet-Sadri, Justin Lazarow

Dynamic Communicating Automata and Branching High-Level MSCs

We study dynamic communicating automata (DCA), an extension of classical communicating finite-state machines that allows for dynamic creation of processes. The behavior of a DCA can be described as a set of message sequence charts (MSCs). While DCA serve as a model of an implementation, we propose branching high-level MSCs (bHMSCs) on the specification side. Our focus is on the implementability problem: given a bHMSC, can one construct an equivalent DCA? As this problem is undecidable, we introduce the notion of executability, a decidable necessary criterion for implementability. We show that executability of bHMSCs is EXPTIME-complete. We then identify a class of bHMSCs for which executability effectively implies implementability.

Benedikt Bollig, Aiswarya Cyriac, Loïc Hélouët, Ahmet Kara, Thomas Schwentick

Visibly Pushdown Automata: Universality and Inclusion via Antichains

Visibly pushdown automata (VPAs), introduced by Alur and Madhusudan in 2004, are a useful formalism in various contexts, such as expressing and checking properties on control flows of programs, or on XML documents. In this context, we propose efficient antichain-based algorithms to check universality and inclusion of VPAs. Whereas the computation complexity is known to be ExpTime-complete for both problems, we show how antichains can avoid explicit determinization and save computations. The approach is extended to hedge automata. We implement the proposed algorithms in a prototype tool and conduct experiments on randomly generated VPAs. We show that, on numerous instances, our algorithms outperform other VPA tools.

Véronique Bruyère, Marc Ducobu, Olivier Gauwin

Two-Sided Derivatives for Regular Expressions and for Hairpin Expressions

The aim of this paper is to design the polynomial construction of a finite recognizer for hairpin completions of regular languages. This is achieved by considering completions as new expression operators and by applying derivation techniques to the associated extended expressions called hairpin expressions. More precisely, we extend partial derivation of regular expressions to two-sided partial derivation of hairpin expressions and we show how to deduce a recognizer for a hairpin expression from its two-sided derived term automaton, providing an alternative proof of the fact that hairpin completions of regular languages are linear context-free.

Jean-Marc Champarnaud, Jean-Philippe Dubernard, Hadrien Jeanne, Ludovic Mignot

How to Travel between Languages

We consider how to edit strings from a source language so that the edited strings belong to a target language, where the languages are given as deterministic finite automata. Non-streaming (or offline) transducers perform edits given the whole source string. We show that the class of deterministic one-pass transducers with registers along with increment and min operation suffices for computing optimal edit distance, whereas the same class of transducers without the min operation is not sufficient. Streaming (or online) transducers perform edits as the letters of the source string are received. We present a polynomial time algorithm for the

partial-repair problem

that given a bound

asks for the construction of a deterministic streaming transducer (if one exists) that ensures that the ‘maximum fraction’

of the strings of the source language are edited, within cost

, to the target language.

Krishnendu Chatterjee, Siddhesh Chaubal, Sasha Rubin

Execution Information Rate for Some Classes of Automata

We study the Shannon information rate of accepting runs of various forms of automata. The rate is therefore a complexity indicator for executions of the automata. Accepting runs of finite automata and reversal-bounded nondeterministic counter machines, as well as their restrictions and variations, are investigated and are shown, in many cases, with computable execution rates. We also conduct experiments on C programs showing that estimating information rates for their executions is feasible in many cases.

Cewei Cui, Zhe Dang, Thomas R. Fischer, Oscar H. Ibarra

Decidability and Complexity Results for Verification of Asynchronous Broadcast Networks

We study decidability and complexity of verification problems for networks in which nodes communicate via asynchronous broadcast messages. This type of communication is achieved by using a distributed model in which nodes have a local buffer. We consider here safety properties expressed as a coverability problem with an arbitrary initial configuration. This formulation naturally models the search of an initial topology that may lead to an error state in the protocol.

Giorgio Delzanno, Riccardo Traverso

The Buffered π-Calculus: A Model for Concurrent Languages

Message-passing based concurrent languages are widely used in developing large distributed and coordination systems. This paper presents the buffered

-calculus — a variant of the

-calculus where channel names are classified into buffered and unbuffered: communication along buffered channels is asynchronous, and remains synchronous along unbuffered channels. We show that the buffered

-calculus can be fully simulated in the polyadic

-calculus with respect to strong bisimulation. In contrast to the

-calculus which is hard to use in practice, the new language enables easy and clear modeling of practical concurrent languages. We encode two real-world concurrent languages in the buffered

-calculus: the (core) Go language and the Core Erlang. Both encodings are fully abstract with respect to weak bisimulations.

Xiaojie Deng, Yu Zhang, Yuxin Deng, Farong Zhong

Mix-Automatic Sequences

Mix-automatic sequences form a proper extension of the class of automatic sequences, and arise from a generalization of finite state automata where the input alphabet is state-dependent. In this paper we compare the class of mix-automatic sequences with the class of morphic sequences. For every polynomial

we construct a mix-automatic sequence whose subword complexity exceeds

. This stands in contrast to automatic and morphic sequences which are known to have at most quadratic subword complexity. We then adapt the notion of

-kernels to obtain a characterization of mix-automatic sequences, and employ this notion to construct morphic sequences that are not mix-automatic.

Jörg Endrullis, Clemens Grabmayer, Dimitri Hendriks

A Multivariate Analysis of Some DFA Problems

We initiate a multivariate analysis of two well-known NPhard decision problems on DFAs: the problem of finding a short synchronizing word and that of finding a DFA on few states consistent with a given sample of the intended language and its complement. For both problems, we study natural parameterizations and classify them with the tools provided by Parameterized Complexity. Somewhat surprisingly, in both cases, rather simple FPT algorithms can be shown to be optimal, mostly assuming the (Strong) Exponential Time Hypothesis.

Henning Fernau, Pinar Heggernes, Yngve Villanger

On the Size Complexity of Deterministic Frequency Automata

Austinat, Diekert, Hertrampf, and Petersen [2] proved that every language

that is (

)-recognizable by a deterministic frequency automaton such that

/2 can be recognized by a deterministic finite automaton as well. First, the size of deterministic frequency automata and of deterministic finite automata recognizing the same language is compared. Then approximations of a language are considered, where a language

′ is called an

approximation

of a language

′ differs from

in only a finite number of strings. We prove that if a deterministic frequency automaton has

states and (

)-recognizes a language

, where

/2, then there is a language

′ approximating

such that

′ can be recognized by a deterministic finite automaton with no more than

states.

Austinat

et al.

[2] also proved that every language

over a single-letter alphabet that is (1,

)-recognizable by a deterministic frequency automaton can be recognized by a deterministic finite automaton. For languages over a single-letter alphabet we show that if a deterministic frequency automaton has

states and (1,

)-recognizes a language

then there is a language

′ approximating

such that

′ can be recognized by a deterministic finite automaton with no more that

states. However, there are approximations such that our bound is much higher, i.e.,

Rūsiņš Freivalds, Thomas Zeugmann, Grant R. Pogosyan

On the Number of Unbordered Factors

We illustrate a general technique for enumerating factors of

-automatic sequences by proving a conjecture on the number

(

) of unbordered factors of the Thue-Morse sequence. We show that

(

) ≤

for

≥ 4 and that

(

) =

infinitely often. We also give examples of automatic sequences having exactly 2 unbordered factors of every length.

Daniel Goč, Hamoon Mousavi, Jeffrey Shallit

Primitive Words and Lyndon Words in Automatic and Linearly Recurrent Sequences

We investigate questions related to the presence of primitive words and Lyndon words in automatic and linearly recurrent sequences. We show that the Lyndon factorization of a

-automatic sequence is itself

-automatic. We also show that the function counting the number of primitive factors (resp., Lyndon factors) of length

in a

-automatic sequence is

-regular. Finally, we show that the number of Lyndon factors of a linearly recurrent sequence is bounded.

Daniel Goč, Kalle Saari, Jeffrey Shallit

Efficient Submatch Extraction for Practical Regular Expressions

capturing group

is a syntax used in modern regular expression implementations to specify a subexpression of a regular expression. Given a string that matches the regular expression,

submatch extraction

is the process of extracting the substrings corresponding to those subexpressions.

Greedy

and

reluctant

closures are variants on the standard closure operator that impact how submatches are extracted. The state of the art and practice in submatch extraction are automata based approaches and backtracking algorithms. In theory, the number of states in an automata-based approach can be exponential in

, the size of the regular expression, and the running time of backtracking algorithms can be exponential in ℓ, the length of the string. In this paper, we present an

(ℓ

) runtime automata based algorithm for extracting submatches from a string that matches a regular expression, where

> 0 is the number of capturing groups. The previous fastest automata based algorithm was

(

ℓ

). Both our approach and the previous fastest one require worst-case exponential compile time. But in practice, the worst case behavior rarely occurs, so achieving a practical speed-up against state-of-the-art methods is of significant interest. Our experimental results show that, for a large set of regular expressions used in practice, our algorithm is approximately twice as fast as Java’s backtracking based regular expression library and approximately twenty times faster than the RE2 regular expression engine.

Stuart Haber, William Horne, Pratyusa Manadhata, Miranda Mowbray, Prasad Rao

Determinacy and Subsumption for Single-Valued Bottom-Up Tree Transducers

This paper discusses the decidability of determinacy and subsumption for tree transducers. For two tree transducers

and

determines

if the output of

is identified by the output of

, that is, there is a partial function

such that

$[\![T_2]\!]=f\circ [\![T_1]\!]$

where

$[\![T_1]\!]$

and

$[\![T_2]\!]$

are tree transformation relations induced by

and

, respectively. Also,

subsumes

determines

and the partial function

such that

$[\![T_2]\!]=f \circ [\![T_1]\!]$

can be defined by a transducer in a designated class that

belongs to. In this paper, we show that determinacy is decidable for single-valued linear extended bottom-up tree transducers as the determiner class and single-valued bottom-up tree transducers as the determinee class. We also show that subsumption is decidable for these classes.

Kenji Hashimoto, Ryuta Sawada, Yasunori Ishihara, Hiroyuki Seki, Toru Fujiwara

Revealing vs. Concealing: More Simulation Games for Büchi Inclusion

We address the problem of deciding language inclusion between two non-deterministic Büchi automata. It is known to be PSPACE-complete and finding techniques that are efficient in practice is still a challenging problem. We introduce two new sequences of simulation relations, called multi-letter simulations, in which Verifier has to reproduce Refuter’s moves taking advantage of a forecast. We compare these with the multi-pebble games introduced by Etessami. We show that multi-letter simulations, despite being more restrictive than multi-pebble ones, have a greater potential for an incremental inclusion test, for their size grows generally slower. We evaluate this idea experimentally and show that incremental inclusion testing may outperform the most advanced Ramsey-based algorithms by two orders of magnitude.

Milka Hutagalung, Martin Lange, Etienne Lozes

On Bounded Languages and Reversal-Bounded Automata

Bounded context-free languages have been investigated for nearly fifty years, yet they continue to generate interest as seen from recent studies. Here, we present a number of results about bounded context-free languages. First we give a new (simpler) proof that every context-free language

$L \subseteq w_1^* w_2^* ... w_n^*$

can be accepted by a PDA with at most 2

− 3 reversals. We also introduce new collections of bounded context-free languages and present some of their interesting properties. Some of the properties are counter-intuitive and may point to some deeper facts about bounded CFL’s. We present some results about semilinear sets and also present a generalization of the well-known result that over a one-letter alphabet, the family of context-free and regular languages coincide.

Oscar H. Ibarra, Bala Ravikumar

Rewrite Closure and CF Hedge Automata

We introduce an extension of hedge automata called bidimensional context-free hedge automata. The class of unranked ordered tree languages they recognize is shown to be preserved by rewrite closure with inverse-monadic rules. We also extend the parameterized rewriting rules used for modeling the W3C XQuery Update Facility in previous works, by the possibility to insert a new parent node above a given node. We show that the rewrite closure of hedge automata languages with these extended rewriting systems are context-free hedge languages.

Florent Jacquemard, Michael Rusinowitch

Linear-Time Version of Holub’s Algorithm for Morphic Imprimitivity Testing

Stepan Holub (Discr. Math., 2009) gave the first polynomial algorithm deciding whether a given word is a nontrivial fixed point of a morphism. His algorithm works in quadratic time for large alphabets. We improve the algorithm to work in linear time. Our improvement starts with a careful choice of a subset of rules used in Holub’s algorithm that is necessary to grant correctness of the algorithm.Afterwards we show how to choose the order of applying the rules that allows to avoid unnecessary operations on sets. We obtain linear time using efficient data structures for implementation of the rules. Holub’s algorithm maintains connected components of a graph corresponding to specially marked positions in a word. This graph is of quadratic size for large alphabet. In our algorithm only a linear number of edges of this conceptual graph is processed.

Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń

From Regular Tree Expression to Position Tree Automaton

The word position automaton was introduced by Glushkov and McNaughton in the early 1960. This automaton is homogeneous and has (||E|| + 1) states for an expression of alphabetic width ||E||. In this paper this type of automata is extended to regular tree expressions and it is shown that the conversion of a regular tree expression of size |E| and alphabetic width ||E|| into its reduced tree automaton can be done in

(|E|·||E||) time. The time complexity of the algorithm proposed by Dietrich Kuske and Ingmar Meinecke is also proved in order to reach an

(||E||·|E|) time complexity for the problem of the construction of the equation automaton from a regular tree expression.

Éric Laugerotte, Nadia Ouali Sebti, Djelloul Ziadi

Convergence of Newton’s Method over Commutative Semirings

We give a lower bound on the speed at which Newton’s method (as defined in [5,6]) converges over arbitrary

-continuous commutative semirings. From this result, we deduce that Newton’s method converges within a finite number of iterations over any semiring which is “collapsed at some

∈ ℕ” (i.e.

+ 1 holds) in the sense of [1]. We apply these results to (1) obtain a generalization of Parikh’s theorem, (2) to compute the provenance of Datalog queries, and (3) to analyze weighted pushdown systems. We further show how to compute Newton’s method over any

-continuous semiring.

Michael Luttenberger, Maximilian Schlund

Counting Minimal Symmetric Difference NFAs

A result of Nicaud states that the number of distinct unary regular string languages recognized by minimal deterministic finite automata (DFAs) with

states is asymptotically equal to

− 1

. We consider the analogous question for symmetric difference automata (ℤ

-NFAs), and show that precisely 2

− 1

unary languages are recognized by

-state minimal ℤ

-NFAs.

Brink van der Merwe, Mark Farag, Jaco Geldenhuys

Interval Logics and ωB-Regular Languages

In the recent years, interval temporal logics are emerging as a workable alternative to more standard point-based ones. In this paper, we establish an original connection between these logics and

ωB

-regular languages. First, we provide a logical characterization of regular (resp.,

-regular) languages in the interval logic

$A\mspace{-0.3mu}B\bar{B}$

of Allen’s relations

meets

begun by

, and

begins

over finite linear orders (resp., ℕ). Then, we lift such a correspondence to

ωB

-regular languages by substituting

$A\mspace{-0.3mu}B\bar{B}\bar{A}$

for

$A\mspace{-0.3mu}B\bar{B}$

(

$A\mspace{-0.3mu}B\bar{B}\bar{A}$

is obtained from

$A\mspace{-0.3mu}B\bar{B}$

by adding a modality for Allen’s relation

met by

). In addition, we show that new classes of extended (

-)regular languages can be naturally defined in

$A\mspace{-0.3mu}B\bar{B}\bar{A}$

Angelo Montanari, Pietro Sala

Eliminating Stack Symbols in Push-Down Automata and Linear Indexed Grammars

This paper investigates two subjects in push-down automata (PDAs) and linear indexed grammars (LIGs), which are extended PDAs, focusing on eliminating the stack symbols. One of the subjects is concerned with

PI- (push-input-)

PDA and PI-LIG without

-transition rule, in which only input symbols are pushed down to the stack. It is shown that the class of languages of PI-LIGs is incomparable with that of PDAs, which is the class of context-free languages (CFLs). The other subject is a simple bottom-up parsing method for LIGs, in which the stack symbols are eliminated at the first step of the parsing. The paper shows several PI-LIGs, including PI-PDAs for fundamental context-free and context-sensitive languages, which are synthesized by a grammatical inference system LIG Learner.

Katsuhiko Nakamura, Keita Imada

Asynchronous PC Systems of Pushdown Automata

We introduce

asynchronous

variants of the PC systems of pushdown automata of Csuhaj-Varjú et. al. These are obtained by using a

response symbol

in addition to the usual

query symbols

. Our main result states that centralized asynchronous PC systems of pushdown automata of degree

that work in returning mode have exactly the same expressive power as

-head pushdown automata. This holds in the nondeterministic as well as in the deterministic case.

Friedrich Otto

Model Checking Metric Temporal Logic over Automata with One Counter

We study the decidability status of the model checking problem for Metric Temporal Logic over models with one counter variable whose value can increase and decrease. This includes 1-counter machines with zero tests, 1-dimensional vector addition systems with states, and weighted automata with weights in the integers. We show that model checking of non-deterministic models is undecidable, even if we restrict the intervals used in the logic to be of the form ( − ∞ ,0] and [0, ∞ ). On the positive side, we show that model checking of deterministic models is decidable.

Karin Quaas

Coinductive Proof Techniques for Language Equivalence

Language equivalence can be checked coinductively by establishing a bisimulation on suitable deterministic automata. We improve and extend this technique with

bisimulation-up-to

, which is an enhancement of the bisimulation proof method. First, we focus on the regular operations of union, concatenation and Kleene star, and illustrate our method with new proofs of classical results such as Arden’s rule. Then we extend our enhanced proof method to incorporate language complement and intersection. Finally we define a general format of behavioural differential equations, in which one can define operations on languages for which bisimulation-up-to is a sound proof technique.

Jurriaan Rot, Marcello Bonsangue, Jan Rutten

Ostrowski Numeration and the Local Period of Sturmian Words

We show that the local period at position

in a characteristic Sturmian word can be given in terms of the Ostrowski representation for

+ 1.

Luke Schaeffer

Boolean Algebras of Regular ω-Languages

We characterize up to isomorphism the Boolean algebras of regular

-languages and of regular aperiodic

-languages, and show decidability of classes of languages related to these characterizations.

Victor Selivanov, Anton Konovalov

Pumping, Shrinking and Pronouns: From Context Free to Indexed Grammars

Pumping-shrinking operation is a useful tool in the study of Formal Languages. A pump-shrink bound for Linear Indexed Grammars is derived, depending only on the the number of auxiliary symbols of the grammar. Observations are suggested on formal operations analogous to shrinking of subordinate clauses.

Eli Shamir

Online Matching of Multiple Regular Patterns with Gaps and Character Classes

Given a dictionary

of regular expressions and a text

, the online regular-pattern-matching problem is to single out, for each text position

[

], those expressions in

that have a match ending at

[

], while processing

only once. This problem is considered in the context of regular patterns over bounded-length gaps and keywords, where the gaps are specified by wildcards and character classes and the keywords are strings over the input alphabet. Our algorithm is based on constructing the Aho–Corasick pattern-matching automaton for the set of keywords, and representing as a bit vector the set of keywords that can precede a given keyword in a regular-pattern instance. For a dictionary

with

patterns and with

keywords in pattern

, the preprocessing takes time

$O(|D| + \sum_{i=1}^r k_i^2 \log k_i / w)$

, where

denotes the number of bits in a memory word. When only fixed-length wildcard gaps without character classes are allowed, the time spent by our matching algorithm for each text character

[

] is at most

((log

) (

+ 1)), where

= max {

, …,

} and

is the number of keyword occurrences in

matched at text position

[

Seppo Sippu, Eljas Soisalon-Soininen

Infiniteness and Boundedness in 0L, DT0L, and T0L Systems

We investigate the boundary between finiteness and infiniteness in three types of L systems: 0L, DT0L, and T0L. We establish necessary and sufficient conditions for 0L, DT0L, and T0L systems to be infinite, and characterize the boundedness of finite classes of such systems. First, we give a pumping lemma for these systems, proving that the language of a system is infinite iff the system is pumpable. Next, we show that the number of steps needed to derive any string in any finite 0L or DT0L system is bounded by a function depending only on the size of the alphabet, and not on the production rules or start string. This alphabet boundedness does not hold for finite T0L systems in general. Finally, we show that every infinite 0L system has an infinite D0L subsystem.

Tim Smith

Uniformisation of Two-Way Transducers

We show that every relation realised by a nondeterministic two-way transducer contains a function with equal domain which can be realised by a sequential two-way transducer. Our proof is built on three structural constructions with automata: a variant of Shepherdson’s method to convert a two-way automaton into an equivalent one-way automaton, which we call the folding of a two-way automaton; the construction of an unambiguous automaton for a rational language based on covering of automata; a simulation of an unambiguous automaton by a deterministic two-way one due to Hopcroft and Ullman. It follows a new proof for the fact (due to Engelfriet and Hoogeboom) that every functional two-way transducer can be converted into a sequential one, together with a clear estimation for the underlying complexity.

Rodrigo de Souza

A Conditional Superpolynomial Lower Bound for Extended Resolution

Extended resolution is a propositional proof system that simulates polynomially very powerful proof systems such as Frege systems and extended Frege systems. Feasible interpolation has been one of the most promising approaches for proving lower bounds for propositional proof systems and for bounded arithmetic. We show that an extended resolution refutation of an unsatisfiable CNF representing the clique-graph colouring principle admits feasible interpolation. It gives us a conditional result: If there is a superpolynomial lower bound on the non-monotone circuit size for this class of formulas then extended resolution has a superpolynomial lower bound.

Olga Tveretina

A Turing Machine Distance Hierarchy

We introduce a new so-called distance complexity measure for Turing machine computations which is sensitive to long-distance transfers of information on the worktape. An important special case of this measure can be interpreted as a kind of buffering complexity which counts the number of necessary block uploads into a virtual buffer on top of the worktape. Thus, the distance measure can be used for investigating the buffering aspects of Turing computations. In this paper, we start this study by proving a tight separation and hierarchy result. In particular, we show that a very small increase in the distance complexity bound (roughly from

(

) to

(

+ 1) +

constant

) brings provably more computational power to both deterministic and nondeterministic Turing machines. For this purpose, we formulate a very general diagonalization method for Blum-like complexity measures. We also obtain a hierarchy of the distance complexity classes.

Stanislav Žák, Jiří Šíma

Backmatter

Title: Language and Automata Theory and Applications
Editors: Adrian-Horia Dediu
Carlos Martín-Vide
Bianca Truthe
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-642-37064-9
Print ISBN: 978-3-642-37063-2
DOI: https://doi.org/10.1007/978-3-642-37064-9

Springer Professional

About this book

Table of Contents

Frontmatter

Invited Talks

Complexity Dichotomy for Counting Problems

Algorithms for Analyzing and Verifying Infinite-State Recursive Probabilistic Systems

Recursion Schemes, Collapsible Pushdown Automata and Higher-Order Model Checking

Discrete Linear Dynamical Systems

XML Schema Management: A Challenge for Automata Theory

Regular Papers

On the Complexity of Shortest Path Problems on Discounted Cost Graphs

Termination of Rule-Based Calculi for Uniform Semi-Unification

Deciding WQO for Factorial Languages

On the Construction of a Family of Automata That Are Generically Non-minimal

Limited Non-determinism Hierarchy of Counter Automata

Unambiguous Automata Denoting Finitely Sequential Functions

Duplication-Loss Genome Alignment: Complexity and Algorithm

Maximizing Entropy over Markov Processes

MAT Learning of Universal Automata

A Graph Polynomial Approach to Primitivity

Suffix Trees for Partial Words and the Longest Common Compatible Prefix Problem

Dynamic Communicating Automata and Branching High-Level MSCs

Visibly Pushdown Automata: Universality and Inclusion via Antichains

Two-Sided Derivatives for Regular Expressions and for Hairpin Expressions

How to Travel between Languages

Execution Information Rate for Some Classes of Automata

Decidability and Complexity Results for Verification of Asynchronous Broadcast Networks

The Buffered π-Calculus: A Model for Concurrent Languages

Mix-Automatic Sequences

A Multivariate Analysis of Some DFA Problems

On the Size Complexity of Deterministic Frequency Automata

On the Number of Unbordered Factors

Primitive Words and Lyndon Words in Automatic and Linearly Recurrent Sequences

Efficient Submatch Extraction for Practical Regular Expressions

Determinacy and Subsumption for Single-Valued Bottom-Up Tree Transducers

Revealing vs. Concealing: More Simulation Games for Büchi Inclusion

On Bounded Languages and Reversal-Bounded Automata

Rewrite Closure and CF Hedge Automata

Linear-Time Version of Holub’s Algorithm for Morphic Imprimitivity Testing

From Regular Tree Expression to Position Tree Automaton

Convergence of Newton’s Method over Commutative Semirings

Counting Minimal Symmetric Difference NFAs

Interval Logics and ωB-Regular Languages

Eliminating Stack Symbols in Push-Down Automata and Linear Indexed Grammars

Asynchronous PC Systems of Pushdown Automata

Model Checking Metric Temporal Logic over Automata with One Counter

Coinductive Proof Techniques for Language Equivalence

Ostrowski Numeration and the Local Period of Sturmian Words

Boolean Algebras of Regular ω-Languages

Pumping, Shrinking and Pronouns: From Context Free to Indexed Grammars

Online Matching of Multiple Regular Patterns with Gaps and Character Classes

Infiniteness and Boundedness in 0L, DT0L, and T0L Systems

Uniformisation of Two-Way Transducers

A Conditional Superpolynomial Lower Bound for Extended Resolution

A Turing Machine Distance Hierarchy

Backmatter

Premium Partner