2002 | OriginalPaper | Buchkapitel
A Metric Index for Approximate String Matching
verfasst von : Edgar Chávez, Gonzalo Navarro
Erschienen in: LATIN 2002: Theoretical Informatics
Verlag: Springer Berlin Heidelberg
Enthalten in: Professional Book Archive
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
We present a radically new indexing approach for approximate string matching. The scheme uses the metric properties of the edit distance and can be applied to any other metric between strings. We build a metric space where the sites are the nodes of the suffix tree of the text, and the approximate query is seen as a proximity query on that metric space. This permits us finding the R occurrences of a pattern of length m in a text of length n in average time O(mlog2n+m2+R), using O(n log n) space and O(n log2n) index construction time. This complexity improves by far over all other previous methods. We also show a simpler scheme needing O(n) space.