main-content

## Über dieses Buch

The 13th International Conference on Implementation and Application of - tomata (CIAA 2008) was held at San Francisco State University, San Francisco, July 21–24, 2008. This volume of Lecture Notes in Computer Science contains the papers that were presented at CIAA 2008, as well as the abstracts of the poster papers that were displayed during the conference. The volume also includes the - per/extended abstract of the four invited talks presented by Markus Holzer, Kai Salomaa, Mihalis Yannakakis, and Hsu-Chun Yen. The 24 regular papers were selected from 40 submissions covering various topics in the theory, implementation, and applications of automata and related structures. Each submitted paper was reviewed by at least three ProgramC- mittee members, with the assistance of external referees. The authors of the papers and posters presented in this volume come from the following co- tries: Australia, Belgium, Canada, China, Columbia, Czech Republic, France, Germany, Hungary, Italy, Japan, The Netherlands, Poland, Portugal, Romania, Russia, Spain, Sweden, Taiwan, United Arab Emerates, and USA. We wish to thank all who made this conference possible: the authors for s- mittingpapers,theProgramCommitteemembersandexternalreferees(listedin the proceedings) for their excellent work, and the four invited speakers. Finally, we wish to express our sincere appreciation to the sponsors, local organizers, and the editors of the Lecture Notes in Computer Science seriesand Springer, in particular Alfred Hofmann, for their help in publishing this volume in a timely manner.

## Inhaltsverzeichnis

### Nondeterministic Finite Automata—Recent Results on the Descriptional and Computational Complexity

Abstract
Nondeterministic finite automata (NFAs) were introduced in [67], where their equivalence to deterministic finite automata was shown. Over the last 50 years, a vast literature documenting the importance of finite automata as an enormously valuable concept has been developed. In the present paper, we tour a fragment of this literature. Mostly, we discuss recent developments relevant to NFAs related problems like, for example, (i) simulation of and by several types of finite automata, (ii) minimization and approximation, (iii) size estimation of minimal NFAs, and (iv) state complexity of language operations. We thus come across descriptional and computational complexity issues of nondeterministic finite automata. We do not prove these results but we merely draw attention to the big picture and some of the main ideas involved.
Markus Holzer, Martin Kutrib

### Language Decompositions, Primality, and Trajectory-Based Operations

Abstract
We consider the decomposability of languages and the notion of primality with respect to catenation, as well as, more general operations. We survey recent results and discuss open problems.
Kai Salomaa

### Automata, Probability, and Recursion

Abstract
We discuss work on the modeling and analysis of systems with probabilistic and recursive features. Recursive Markov chains extend ordinary finite state Markov chains with the ability to invoke other Markov chains in a potentially recursive manner. The equivalent model of Probabilistic Pushdown Automata extends ordinary pushdown automata with probabilistic actions. Both of these are natural abstract models for probabilistic programs with procedures, and related systems. They generalize other classical well-studied stochastic models, e.g. Stochastic Context-free Grammars and (Multi-type) Branching Processes, that arise in a variety of areas. More generally, Recursive Markov Decision Processes and Recursive Stochastic Games can be used to model recursive systems that have both probabilistic and nonprobabilistic, controllable actions. In recent years there has been substantial work on the algorithmic analysis of these models, regarding basic questions of termination, reachability, and analysis of the properties of their executions. In this talk we will present some of the basic theory, algorithmic methods, results, and challenges.
Mihalis Yannakakis

### Concurrency, Synchronization, and Conflicts in Petri Nets

Abstract
Petri nets represent one of the most popular formalisms for specifying, modeling, and analyzing concurrent systems. In spite of their popularity, many interesting problems concerning Petri nets are either undecidable or of very high complexity. Lipton [7] and Rackoff [10] showed exponential space lower and upper bounds, respectively, for the boundedness problem. As for the containment and the equivalence problems, Rabin [1] and Hack [5], respectively, showed these two problems to be undecidable. The reachability problem is known to be decidable [8] and exponential-space-hard [7].
Hsu-Chun Yen

### Automated Compositional Reasoning of Intuitionistically Closed Regular Properties

Abstract
Analysis of infinitary safety properties with automated compositional reasoning through learning is discussed. We consider the class of intuitionistically closed regular languages and show that it forms a Heyting algebra and is finitely approximatable. Consequently, compositional proof rules can be verified automatically and learning algorithms for finitary regular languages suffice for generating the needed contextual assumptions. We also provide a semantic justification of an axiom to deduce circular compositional proof rules for such infinitary languages.
Yih-Kuen Tsay, Bow-Yaw Wang

### Antimirov and Mosses’s Rewrite System Revisited

Abstract
Antimirov and Mosses proposed a rewrite system for deciding the equivalence of two (extended) regular expressions. In this paper we present a functional approach to that method, prove its correctness, and give some experimental comparative results. Besides an improved version of Antimirov and Mosses’s algorithm, we present a version using partial derivatives. Our preliminary results lead to the conclusion that, indeed, these methods are feasible and, generally, faster than the classical methods.
Marco Almeida, Nelma Moreira, Rogério Reis

### Antichain-Based Universality and Inclusion Testing over Nondeterministic Finite Tree Automata

Abstract
We propose new antichain-based algorithms for checking universality and inclusion of nondeterministic tree automata (NTA). We have implemented these algorithms in a prototype tool and our experiments show that they provide a significant improvement over the traditional determinisation-based approaches. We use our antichain-based inclusion checking algorithm to build an abstract regular tree model checking framework based entirely on NTA. We show the significantly improved efficiency of this framework through a series of experiments with verifying various programs over dynamic linked tree-shaped data structures.
Ahmed Bouajjani, Peter Habermehl, Lukáš Holík, Tayssir Touili, Tomáš Vojnar

### Testing Whether a Binary and Prolongeable Regular Language L Is Geometrical or Not on the Minimal Deterministic Automaton of Pref(L)

Abstract
Our aim is to present an efficient algorithm that checks whether a binary and prolongeable regular language is geometrical or not, based on specific properties of its minimal deterministic automaton. Geometrical languages have been introduced in the framework of off-line temporal validation of real-time softwares. Actually, validation can be achieved through both a model based on regular languages and a model based on discrete geometry. Geometrical languages are intended to develop a link between these two models. The regular case is of practical interest regarding to implementation features, which motivates the design of an efficient geometricity test addressing the family of regular languages.
J. -M. Champarnaud, J. -Ph. Dubernard, H. Jeanne

### Hopcroft’s Minimization Technique: Queues or Stacks?

Abstract
We consider the absolute worst case time complexity for Hopcroft’s minimization algorithm applied to unary languages (or a modification of this algorithm for cover automata minimization). We show that in this setting the worst case is reached only for deterministic automata or cover automata following the structure of the de Bruijn words. We refine a previous result by showing that the Berstel/Carton example reported before is actually the absolute worst case time complexity in the case of unary languages for deterministic automata. We show that the same result is valid also when considering the setting of cover automata and an algorithm based on the Hopcroft’s method used for minimization of cover automata. We also show that a LIFO implementation for the splitting list is desirable for the case of unary languages in the setting of deterministic finite automata.
Andrei Păun, Mihaela Păun, Alfonso Rodríguez-Patón

### Learning Regular Languages Using Nondeterministic Finite Automata

Abstract
A new general method for inference of regular languages using nondeterministic automata as output has recently been developed and proved to converge. The aim of this paper is to describe and analyze the behavior of two implementations of that method and to compare it with two well known algorithms for the same task. A complete set of experiments has been carried out and the results of the new algorithms improve the existing ones both in recognition rates as in sizes of the output automata.
Pedro García, Manuel Vázquez de Parga, Gloria I. Álvarez, José Ruiz

### Multi-Return Macro Tree Transducers

Abstract
An extension of macro tree transducers is introduced with the capability of states to return multiple trees at the same time. Under call-by-value semantics, the new model is strictly more expressive than call-by-value macro tree transducers, and moreover, it has better closure properties under composition.
Kazuhiro Inaba, Haruo Hosoya, Sebastian Maneth

### Computing Convex Hulls by Automata Iteration

Abstract
This paper considers the problem of computing the real convex hull of a finite set of n-dimensional integer vectors. The starting point is a finite-automaton representation of the initial set of vectors. The proposed method consists in computing a sequence of automata representing approximations of the convex hull and using extrapolation techniques to compute the limit of this sequence. The convex hull can then be directly computed from this limit in the form of an automaton-based representation of the corresponding set of real vectors. The technique is quite general and has been implemented. Also, our result fits in a wider scheme whose objective is to improve the techniques for converting automata-based representation of constraints to formulas.
François Cantin, Axel Legay, Pierre Wolper

### A Translation from the HTML DTD into a Regular Hedge Grammar

Abstract
The PHP string analyzer developed by the second author approximates the string output of a program with a context-free grammar. By developing a procedure to decide inclusion between context-free and regular hedge languages, Minamide and Tozawa applied this analysis to checking the validity of dynamically generated XHTML documents. In this paper, we consider the problem of checking the validity of dynamically generated HTML documents instead of XHTML documents.
HTML is not specified by an XML schema language, but by an SGML DTD, and we can omit several kinds of tags in HTML documents. We formalize a subclass of SGML DTDs and develop a translation into regular hedge grammars. Thus we can validate dynamically generated HTML documents. We have implemented this translation and incorporated it in the PHP string analyzer. The experimental results show that the validation through this translation works well in practice.
Takuya Nishiyama, Yasuhiko Minamide

### Tree-Series-to-Tree-Series Transformations

Abstract
We investigate the tree-series-to-tree-series (ts-ts) transformation computed by tree series transducers. Unless the used semiring is complete, this transformation is, in general, not well-defined. In practice, many used semirings are not complete (like the probability semiring). We establish a syntactical condition that guarantees well-definedness of the ts-ts transformation in arbitrary commutative semirings. For positive (ie, zero-sum and zero-divisor free) semirings the condition actually characterizes the well-definedness, so that well-definedness is decidable in this scenario.
Andreas Maletti

### Automata-Theoretic Analysis of Bit-Split Languages for Packet Scanning

Abstract
Bit-splitting breaks the problem of monitoring traffic payloads to detect the occurrence of suspicious patterns into several parallel components, each of which searches for a particular bit pattern. We analyze bit-splitting as applied to Aho-Corasick style string matching. The problem can be viewed as the recovery of a special class of regular languages over product alphabets from a collection of homomorphic images. We use this characterization to prove correctness and to give space bounds. In particular we show that the NFA to DFA conversion of the Aho-Corasick type machine used for bit-splitting incurs only linear overhead.
Ryan Dixon, Ömer Eğecioğlu, Timothy Sherwood

### Pattern Matching in DCA Coded Text

Abstract
A new algorithm searching all occurrences of a regular expression pattern in a text is presented. It uses only the text that has been compressed by the text compression using antidictionaries without its decompression. The proposed algorithm runs in $${\mathcal O}(2^m\cdot||{\rm AD}||^2+n_c+r)$$ worst case time, where m is the length of the pattern, AD is the antidictionary, n C is the length of the coded text and r is the number of found matches.
Jan Lahoda, Bořivoj Melichar, Jan Žd’árek

### Five Determinisation Algorithms

Abstract
Determinisation of nondeterministic finite automata is a well-studied problem that plays an important role in compiler theory and system verification. In the latter field, one often encounters automata consisting of millions or even billions of states. On such input, the memory usage of analysis tools becomes the major bottleneck. In this paper we present several determinisation algorithms, all variants of the well-known subset construction, that aim to reduce memory usage and produce smaller output automata. One of them produces automata that are already minimal. We apply our algorithms to determinise automata that describe the possible sequences appearing after a fixed-length run of cellular automaton 110, and obtain a significant improvement in both memory and time efficiency.
Rob van Glabbeek, Bas Ploeger

### Persistent Computations of Turing Machines

Abstract
In this paper we formally define the notion of persistent Turing machines to model interactive computations. We compare the power of persistent computations with their classical counterparts.

### On Complexity of Two Dimensional Languages Generated by Transducers

Abstract
We consider two-dimensional languages, called here 2d transducer languages, generated by iterative applications of transducers (finite state automata with output). To each transducer a two-dimensional language consisting of blocks of symbols is associated: the bottom row of a block is an input string accepted by the transducer and, by iterative application of the transducer, each row of the block is an output of the transducer on the preceding row. We observe that this class of languages is a proper subclass of recognizable picture languages containing the class of all factorial local 2d languages. By taking the average growth rate of the number of blocks in the language as a measure of its complexity, also known as the entropy of the language, we show that every entropy value of a one-dimensional regular language can be obtained as an entropy value of a 2d transducer language.
Egor Dolzhenko, Nataša Jonoska

### Games for Temporal Logics on Trees

Abstract
We associate a temporal logic XTL$$({\mathcal L})$$ with each class $$\mathcal L$$ of (regular) tree languages and provide both an algebraic and a game-theoretic characterization of the expressive power of the logic XTL($$\mathcal L$$).
Z. Ésik, Sz. Iván

### A Run-Time Efficient Implementation of Compressed Pattern Matching Automata

Abstract
We present a run-time efficient implementation of automata for compressed pattern matching (CPM), where a text is given as a truncation-free collage system $$\langle{\mathcal D},{\mathcal S}\rangle$$ such that variable sequence $$\mathcal S$$ is encoded by any prefix code. We experimentally show that a combination of recursive-pairing compression and byte-oriented Huffman coding allows both a high compression ratio and a high speed CPM.
Tetsuya Matsumoto, Kazuhito Hagio, Masayuki Takeda

### Composed Bisimulation for Tree Automata

Abstract
We address the problem of reducing the size of (nondeterministic, bottom-up) tree automata using suitable, language-preserving equivalences on the states of the automata. In particular, we propose the so-called composed bisimulation as a new language preserving equivalence. Composed bisimulation is defined in terms of two different relations, namely upward and downward bisimulation. Moreover, we provide simple and efficient algorithms for computing composed bisimulation based on a reduction to the problem of computing bisimulations on transition systems. The proposal of composed bisimulation is motivated by an attempt to obtain an equivalence that can provide better reductions than what currently known bisimulation-based approaches can offer, but which is not significantly more difficult to compute (and hence stays below the computational requirements of simulation-based reductions). The experimental results we present in the paper show that our composed bisimulation meets such requirements, and hence provides users of tree automata with a finer way to resolve the trade-off between the available degree of reduction and its cost.
Parosh A. Abdulla, Ahmed Bouajjani, Lukáš Holík, Lisa Kaati, Tomáš Vojnar

### Hyper-Minimization in O(n 2)

Abstract
Two formal languages are f-equivalent if their symmetric difference L 1 ∆ L 2 is a finite set — that is, if they differ on only finitely many words. The study of f-equivalent languages, and particularly the DFAs that accept them, was recently introduced [1]. First, we restate the fundamental results in this new area of research. Second, our main result is a faster algorithm for the natural minimization problem: given a starting DFA D, find the smallest (by number of states) DFA D′ such that L(D) and L(D′) are f-equivalent. Finally, we present a technique that combines this hyper-minimization with the well-studied notion of a deterministic finite cover automaton [2–4], or DFCA, thereby extending the application of DFCAs from finite to infinite regular languages.

### Deterministic Pushdown Automata and Unary Languages

Abstract
The simulation of deterministic pushdown automata defined over a one letter alphabet by finite state automata is investigated from a descriptional complexity point of view. We show that each unary deterministic pushdown automaton of size s can be simulated by a deterministic finite automaton with a number of states which is exponential in s. We prove that this simulation is tight. Furthermore, its cost cannot be reduced even if it is performed by a two-way nondeterministic automaton. We also prove that there are unary languages for which deterministic pushdown automata cannot be exponentially more succinct than finite automata. In order to state this result, we investigate the conversion of deterministic pushdown automata into context-free grammars. We prove that in the unary case the number of variables in the resulting grammar is strictly lower than the number of variables needed in the case of nonunary alphabets.
Giovanni Pighizzini

### Finite Eilenberg Machines

Abstract
Eilenberg machines define a general computational model. They are well suited to the simulation of problems specified using finite state formalisms such as formal languages and automata theory. This paper introduces a subclass of them called finite Eilenberg machines. We give a formal description of complete and efficient algorithms which permit the simulation of such machines. We show that our finiteness condition ensures a correct behavior of the simulation. Interpretations of this condition are studied for the cases of non-deterministic finite automata (NFA) and transducers, leading to applications to computational linguistics. The given implementation provides a generic simulation procedure for any problem encoded as a composition of finite Eilenberg machines.
Benoît Razet

### The Number of Runs in Sturmian Words

Abstract
Denote by the class of standard Sturmian words. It is a class of highly compressible words extensively studied in combinatorics of words, including the well known Fibonacci words. The suffix automata for these words have a very particular structure. This implies a simple characterization (described in the paper by the Structural Lemma) of the periods of runs (maximal repetitions) in Sturmian words. Using this characterization we derive an explicit formula for the number ρ(w) of runs in words , with respect to their recurrences (directive sequences). We show that $$\frac{\rho(w)}{|w|}\le \frac{4}{5} \textrm{\ for each\ }\ w\in {\cal S},$$ and there is an infinite sequence of strictly growing words $$w_k\in {\cal S}$$ such that $$\lim_{k\rightarrow \infty}\ \frac{\rho(w_k)}{|w_k|}\ =\ \frac{4}{5}$$. The complete understanding of the function ρ for a large class of complicated words is a step towards better understanding of the structure of runs in words. We also show how to compute the number of runs in a standard Sturmian word in linear time with respect to the size of its compressed representation (recurrences describing the word). This is an example of a very fast computation on texts given implicitly in terms of a special grammar-based compressed representation (usually of logarithmic size with respect to the explicit text).
Paweł Baturo, Marcin Piątkowski, Wojciech Rytter

### 3-Way Composition of Weighted Finite-State Transducers

Abstract
Composition of weighted transducers is a fundamental algorithm used in many applications, including for computing complex edit-distances between automata, or string kernels in machine learning, or to combine different components of a speech recognition, speech synthesis, or information extraction system. We present a generalization of the composition of weighted transducers, 3-way composition, which is dramatically faster in practice than the standard composition algorithm when combining more than two transducers. The worst-case complexity of our algorithm for composing three transducers T 1, T 2, and T 3 resulting in T, is O(|T| Q min (d(T 1) d(T 3), d(T 2)) + |T| E ), where |·| Q denotes the number of states, |·| E the number of transitions, and d(·) the maximum out-degree. As in regular composition, the use of perfect hashing requires a pre-processing step with linear-time expected complexity in the size of the input transducers. In many cases, this approach significantly improves on the complexity of standard composition. Our algorithm also leads to a dramatically faster composition in practice. Furthermore, standard composition can be obtained as a special case of our algorithm. We report the results of several experiments demonstrating this improvement. These theoretical and empirical improvements significantly enhance performance in the applications already mentioned.
Cyril Allauzen, Mehryar Mohri

### Progressive Solutions to FSM Equations

Abstract
The equation solving problem is to derive the behavior of the unknown component X knowing the joint behavior of the other components (or the context) C and the specification of the overall system S. The component X can be derived by solving the Finite State Machine (FSM) equation C $$\lozenge$$ X ~ S, where $$\lozenge$$ is the parallel composition operator and ~ is the trace equivalence or the trace reduction relation. A solution X to an FSM equation is called progressive if for every external input sequence the composition C $$\lozenge$$ X does not fall into a livelock without an exit. In this paper, we formally define the notion of a progressive solution to a parallel FSM equation and present an algorithm that derives a largest progressive solution (if a progressive solution exists). In addition, we generalize the work to a system of FSM equations. Application examples are provided.
Khaled El-Fakih, Nina Yevtushenko

### Combination of Context-Free Grammars and Tree Automata for Unranked and Ranked Trees

Abstract
In this paper we will study the use of context-free grammars (CFGs) for the verification of tree structures. Because of the long history of the study of CFGs, parsing techniques for CFGs are well-established. The aim of this paper is to make use of those parsing techniques for the verification of tree structures.
Akio Fujiyoshi

### Approximate Periods with Levenshtein Distance

Abstract
We present a new algorithm deciding for strings t and w whether w is an approximate generator of t with Levenshtein distance at most k. The algorithm is based on finite state transducers.
Martin Šimůnek, Bořivoj Melichar

### Backmatter

Weitere Informationen