Sie können Operatoren mit Ihrer Suchanfrage kombinieren, um diese noch präziser einzugrenzen. Klicken Sie auf den Suchoperator, um eine Erklärung seiner Funktionsweise anzuzeigen.
Findet Dokumente, in denen beide Begriffe in beliebiger Reihenfolge innerhalb von maximal n Worten zueinander stehen. Empfehlung: Wählen Sie zwischen 15 und 30 als maximale Wortanzahl (z.B. NEAR(hybrid, antrieb, 20)).
Findet Dokumente, in denen der Begriff in Wortvarianten vorkommt, wobei diese VOR, HINTER oder VOR und HINTER dem Suchbegriff anschließen können (z.B., leichtbau*, *leichtbau, *leichtbau*).
Dieses Buch stellt die Protokolle der 26. Internationalen Konferenz zur Implementierung und Anwendung von Automaten CIAA 2022 dar, die im Juni / Juli 2022 in Rouen, Frankreich, stattfand. Die 16 regelmäßigen Vorträge, die zusammen mit 3 eingeladenen Vorträgen in diesem Buch präsentiert wurden, wurden sorgfältig überprüft und aus 26 Einreichungen ausgewählt. Die Themen der Vorträge decken verschiedene Bereiche der Anwendung, Implementierung und Theorie von Automaten und verwandten Strukturen ab.
Mit KI übersetzt
Über dieses Buch
This book constitutes the proceedings of the 26th International Conference on Implementation and Application of Automata, CIAA 2022, held in Rouen, France in June/ July 2022. The 16 regular papers presented together with 3 invited lectures in this book were carefully reviewed and selected from 26 submissions. The topics of the papers covering various fields in the application, implementation, and theory of automata and related structures.
We investigate the structure of the co-authorship graph for the Conference on Implementation and Application of Automata (CIAA) with techniques from network sciences. This allows us to answer a broad variety of questions on collaboration patterns. Our findings are in line with (statistical) properties of other co-authorship networks from biology, physics and mathematics as conducted earlier by pioneers of network sciences.
The notions of derivative and partial derivative of regular expressions revealed themselves to be very powerful and have been successfully extended to many other formal language classes and algebraic structures. Although the undisputed elegance of this formalism, its efficient practical use is still a challenging research topic. Here we give a brief historical overview and summarise some of these aspects.
Most regular-expression matching engines in practice are based on the Thompson construction and the Spencer matching algorithm. While these engines work fast and efficiently, a serious problem, the regular expression denial-of-service (ReDoS), has been reported recently. ReDoS is an algorithm complexity attack, which exploits the backtracking feature of the engine, and makes the service unresponsive indefinitely. Researchers suggested a few remedies to cope with the ReDoS problem, yet they are often ad-hoc or undesirable in practice. We instead propose a hybrid matching scheme that selects between the Thompson and the Spencer matching algorithms depending on the needed features. We also suggest to use the position construction for its intrinsic characteristics for fast matching. We evaluate the proposed approach using a benchmark dataset collected from various open-source projects, and compare the performance with the current approach. The experimental results show that a hybrid matcher reduces the ReDoS-vulnerability by 96% and 99.98% in full and partial matching, respectively. Moreover, 55% of the most problematic regular expressions become invulnerable to ReDoS by the position construction.
We propose a new unambiguous grammar formalism, referred to as ordered context-free grammars, which is identical to context-free grammars, apart from the property that it also places an order on parse trees. Since only a minor modification to ordered context-free grammars is required to obtain parsing expression grammars, the relationship between context-free grammars and parsing expression grammars becomes more evident. By preserving how ordered context-free grammars support left-recursion, parsing expression grammars is modified to support left recursion in ways much more natural than current approaches.
We study several classes of symbolic weighted formalisms: automata (swA), transducers (swT) and visibly pushdown extensions (swVPA, swVPT). They combine the respective extensions of their symbolic and weighted counterparts, allowing a quantitative evaluation of words over a large or infinite input alphabet.
We present properties of closure by composition, the computation of transducer-defined distances between nested words and languages, as well as a PTIME 1-best search algorithm for swVPA. These results are applied to solve in PTIME a variant of parsing over infinite alphabets. We illustrate this approach with a motivating use case in automated music transcription.
We present a distance metric on formal languages based on the accumulated weight of words in their symmetric difference. The contribution of an individual word to this weight decreases exponentially in its length, guaranteeing the distance between languages to be a real value between 0 and 1. We show that this distance is computable for regular languages. As an application, we show how the similarity measure derived from a modification of this metric can be used in automatic grading of particular standard exercises in formal language theory classes.
We introduce an automaton model for recognizing sets of hybrid trees, the hybrid tree automaton (HTA). Special cases of hybrid trees are constituent trees and dependency trees, as they occur in natural language processing. This includes the cases of discontinuous constituent trees and non-projective dependency trees. In general, a hybrid tree is a tree over a ranked alphabet in which symbols can additionally be equipped with an index, i.e., a natural number which indicates the position of that symbol in the yield of the hybrid tree. As a special case of HTA, we define constituent tree automata (CTA) which recognize sets of constituent trees. We show that the set of yields of a CTA-recognizable set of constituent trees is an LCFRS language, and vice versa.
The goal of this paper is to present a family of partial automata that achieve length \(\varTheta (3^{\frac{n}{3}})\) of the shortest carefully synchronizing words, but using \(\frac{2}{9}n + 2\) letters, thus substantially improving the result obtained in [19], which is \(\frac{1}{3}n + 1\) letters. Additionally, modifying our idea we obtain a family of automata over a three letter alphabet and a subexponential length of the shortest carefully synchronizing words and, as a corollary of that construction, a series of binary automata with a subexponential length of word reducing set of states to a particular subset.
The Parikh matrix mapping is a concept that provides information on the number of occurrences of certain (scattered) subwords in a word. Although Parikh matrices have been thoroughly studied, many of their basic properties remain open. In the present paper, we describe a toolkit that has been developed to support research in this field. Its functionality includes elementary and advanced operations related to Parikh matrices and the recently introduced variants of \(\mathbb {P}\)-Parikh matrices and \(\mathbb {L}\)-Parikh matrices.
Laura K. Hutchinson, Robert Mercaş, Daniel Reidenbach
We consider parsers of deterministic context-free languages and study the sizes of their syntax checking components. More precisely, we allow the input processing from left to right or, alternatively, from right to left, whatever is best for the given language. We establish an infinite sequence of deterministic context-free languages \(L_k\), for \(k\ge 1\), such that there is an exponential size trade-off between a deterministic pushdown automaton that reads its input from right to left and another one that reads its input from left to right. Concerning the constructibility of such a parser out of a given deterministic context-free language, it is shown that it is undecidable whether the reversal of a given deterministic context-free language is again deterministic context free. Furthermore, we study the expressive capacity of the family of languages whose reversals are deterministic context free. Finally, we turn to the family of deterministic context-free languages whose reversals are also deterministic context free and collect several of their closure properties.
Input-driven pushdown automata (\(\text {IDPDA}\)s) are pushdown automata where the next action on the pushdown store (push, pop, nothing) is solely governed by the input symbol. Nowadays such devices are usually defined such that every push operation pushes exactly one additional symbol on the pushdown store and, in addition, the devices work in real time so that stationary moves are not allowed. Here, we relax this strong definition and consider \(\text {IDPDA}\)s that may push more than one symbol in one step (\(\text {push-IDPDA}\)) or may perform stationary moves (\(\text {stat-IDPDA}\)). We study the computational power of the extended variants both in the deterministic and nondeterministic case, we investigate several decidability questions for the new automata classes, and we obtain useful interesting representations by inverse homomorphisms.
Martin Kutrib, Andreas Malcher, Matthias Wendlandt
The cut of two languages is a subset of their concatenation given by the leftmost maximal substring match. We study the state complexity of the cut operation assuming that both operands belong to some, possibly different, subclasses of convex languages, namely, right, left, two-sided, and all-sided ideal, prefix-, suffix-, factor-, and subword-closed, and -free languages. For all considered pairs of classes, we get the exact state complexity of cut. We show that it is m whenever the first language is a right ideal, and it is \(m+n-1\) or \(m+n-2\) if the first language is prefix-closed or prefix-free. In the other cases, the state complexity of cut is between \(mn-2n-m+4\) and \(mn-n+m\), the latter being the known state complexity of cut on regular languages. All our witnesses are described over a fixed alphabet of size at most three, except for three cases when the witness languages are described over an alphabet of size m or \(m-1\).
The separating words problem seeks to determine the asymptotic growth of the minimum number of states of a deterministic finite automaton that accepts x but rejects y, where x and y are given strings. We study three natural variants of this problem which impose additional constraints on the start and/or end states: \(\forall \)-separation requires different end states for every common start state; \(\forall ^2\)-separation requires different end states for every pair of start states; and \(\forall ^201\)-separation requires fixed different end states for every pair of start states.
For distinct strings of the same length, we establish exact bounds on the number of states for \(\forall ^2\)- and \(\forall ^201\)-separation, as well as a logarithmic lower bound and a linear upper bound for \(\forall \)-separation.
Graph-walking automata (GWA) analyze an input graph by moving between its nodes, following the edges. This paper investigates the effect of node-replacement graph homomorphisms on recognizability by these automata. The family of graph languages recognized by GWA is closed under inverse homomorphisms. The main result of this paper is that, for n-state automata operating on graphs with k labels of edge end-points, the inverse homomorphic images require GWA with \(kn+O(1)\) states in the worst case. The second result is that already for tree-walking automata, the family they recognize is not closed under injective homomorphisms; here the proof is based on a homomorphic characterization of regular tree languages.
Site-directed deletion is a biologically inspired operation that removes a contiguous substring from the host string guided by a template string. The template string must match the prefix and suffix of a substring. When this occurs the middle section of the substring not contained in the prefix or suffix is removed. We consider the nondeterministic state complexity of the site-directed deletion operation. For regular languages recognized by nondeterministic finite automata with N and M states, respectively, we establish a new upper bound of \(2NM + N\) and a new worst case lower bound of 2NM. The upper bound improves a previously established upper bound, and no non-trivial lower bound was previously known for the nondeterministic state complexity of site-directed deletion.
The erasure of each bit of information by a computing device has an intrinsic energy cost. Although any Turing machine can be rewritten to be thermodynamically reversible without changing the recognized language, finite automata that are restricted to scan their input once in “real-time” fashion can only recognize the members of a proper subset of the class of regular languages in this reversible manner. We use a general quantum finite automaton model to study the thermodynamic cost per step associated with the recognition of different regular languages. We show that zero-error quantum finite automata have no energy cost advantage over their classical deterministic counterparts, and prove an upper bound for the cost that holds for all regular languages. We also demonstrate languages for which “error can be traded for energy”, i.e. whose zero-error recognition is associated with provably bigger energy cost per step when compared to their bounded-error recognition by real-time finite-memory quantum devices.
Öykü Yılmaz, Fırat Kıyak, Meriç Üngör, A. C. Cem Say
We study the class of languages that have membership proofs which can be verified by real-time finite-state machines using only a constant number of random bits, regardless of the size of their inputs. Since any further restriction on the verifiers would preclude the verification of nonregular languages, this is the tightest computational budget which allows the checking of externally provided proofs to have meaningful use. We show that all languages that can be recognized by two-head one-way deterministic finite automata have such membership proofs. For any \(k>0\), there exist languages that cannot be recognized by any k-head one-way nondeterministic finite automaton, but that are nonetheless real-time verifiable in this sense. The set of nonpalindromes, which cannot be recognized by any one-way multihead deterministic finite automaton, is also demonstrated to be verifiable within these restrictions.
Özdeniz Dolu, Nevzat Ersoy, M. Utkan Gezer, A. C. Cem Say
For general input automata, there exist regular constraint languages such that asking if a given input automaton admits a synchronizing word in the constraint language is PSPACE-complete or NP-complete. Here, we investigate this problem for the following classes of input automata: monotonic automata, solvable automata and automata with simple idempotents over a binary alphabet. The latter class contains, for example, the Černý family of automata, an infinite family of n-state automata whose shortest synchronizing words have length \((n-1)^2\). Solvable automata generalize both commutative automata and weakly acyclic automata. We find that for monotonic automata, the problem is solvable in nondeterministic logarithmic space, for every regular constraint language. We identify a subclass of solvable automata for which the problem is in NP. This subclass strictly contains the weakly acyclic automata. In the course of our investigation we derive a sharp linear upper bound for the length of a shortest synchronizing word in a given constraint language, improving previous quadratic bounds for weakly acyclic and commutative automata. We also give structural characterizations of solvable automata and their recognized languages that imply that they recognize languages for which partial commutation closures give regular languages. Lastly, we show that for input automata with simple idempotents over a binary alphabet and with a constraint language given by a partial automaton with up to three states, the constrained synchronization problem is in P.
Weighted context-free grammar (WCFG) is a quantitative extension of context-free grammar (CFG). It is known that unambiguous weighted automata (WA), finitely-ambiguous WA, polynomially-ambiguous WA and general WA over the tropical semiring have different expressive powers. We prove that there exists a similar ambiguity hierarchy of WCFG over the tropical semiring, using an extended Ogden’s lemma. Furthermore, we show that the hierarchy we proved is different from the known ambiguity hierarchy of unweighted CFG.
Informationen zur Barrierefreiheit für dieses Buch folgen in Kürze. Wir arbeiten daran, sie so schnell wie möglich verfügbar zu machen. Vielen Dank für Ihre Geduld.