
1996 | ReviewPaper | Book chapter

A faster algorithm for approximate string matching

Authors: Ricardo Baeza-Yates, Gonzalo Navarro

Published in: Combinatorial Pattern Matching

Publisher: Springer Berlin Heidelberg


We present a new algorithm for on-line approximate string matching. The algorithm is based on the simulation of a non-deterministic finite automaton built from the pattern and using the text as input. This simulation uses bit operations on a RAM machine with word length O(log n), where n is the maximum size of the text. The running time achieved is O(n) for small patterns (i.e., of length m = O(√(log n))), independently of the maximum number of errors allowed, k. This algorithm is then used to design two general algorithms. One of them partitions the problem into subproblems, while the other partitions the automaton into sub-automata. These algorithms are combined to obtain a hybrid algorithm which on average is O(n) for moderate k/m ratios, O(√(mk/log n) · n) for medium ratios, and O((m−k)kn/log n) for large ratios. We show experimentally that this hybrid algorithm is faster than previous ones for moderate pattern sizes and error ratios, which is the case in text searching.
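
To make the idea of simulating the pattern automaton with bit operations concrete, here is a minimal Python sketch. It is not the algorithm of this chapter: it uses the simpler row-wise simulation in the style of Wu and Manber's Shift-And with errors, keeping one bit vector per error level, and Python's unbounded integers stand in for machine words. The function name `approx_search` and its parameters are illustrative assumptions.

```python
def approx_search(pattern: str, text: str, k: int) -> list[int]:
    """Return 0-based end positions in `text` where some substring matches
    `pattern` with at most `k` errors (insertions, deletions, substitutions).

    Illustrative row-wise NFA simulation (Wu-Manber style), not the
    chapter's own encoding of the automaton.
    """
    m = len(pattern)
    if m == 0 or k < 0:
        raise ValueError("pattern must be non-empty and k non-negative")

    full = (1 << m) - 1      # keeps every row within m bits
    accept = 1 << (m - 1)    # bit of the final automaton state

    # char_mask[c] has bit i set iff pattern[i] == c.
    char_mask = {}
    for i, ch in enumerate(pattern):
        char_mask[ch] = char_mask.get(ch, 0) | (1 << i)

    # rows[d]: bit i is set iff pattern[:i+1] matches some suffix of the
    # text read so far with at most d errors. Before any text is read,
    # the first d pattern characters can be skipped at a cost of d errors.
    rows = [((1 << d) - 1) & full for d in range(k + 1)]

    ends = []
    for pos, ch in enumerate(text):
        b = char_mask.get(ch, 0)
        prev_old = rows[0]                      # row d-1 before this character
        rows[0] = ((rows[0] << 1) | 1) & b      # exact Shift-And step
        for d in range(1, k + 1):
            cur_old = rows[d]
            rows[d] = (
                (((cur_old << 1) | 1) & b)          # match
                | prev_old                          # extra character in the text
                | ((prev_old | rows[d - 1]) << 1)   # substitution | skipped pattern char
                | 1                                 # first state reachable with one error
            ) & full
            prev_old = cur_old
        if rows[k] & accept:
            ends.append(pos)
    return ends
```

For example, `approx_search("abc", "xxabdxx", 1)` returns `[3, 4]`: the substring `ab` ending at index 3 is one deletion away from `abc`, and `abd` ending at index 4 is one substitution away. Each text character costs O(k) word operations here, whereas the abstract states a cost independent of k for small patterns, so the sketch only conveys the general bit-parallel principle.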

Metadata
Title
A faster algorithm for approximate string matching
Authors
Ricardo Baeza-Yates
Gonzalo Navarro
Copyright year
1996
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/3-540-61258-0_1
