2009 | OriginalPaper | Book chapter
Indexing Variable Length Substrings for Exact and Approximate Matching
Authors: Gonzalo Navarro, Leena Salmela
Published in: String Processing and Information Retrieval
Publisher: Springer Berlin Heidelberg
We introduce two new index structures based on the q-gram index. The new structures index substrings of variable length instead of q-grams of fixed length. For both of the new indexes, we present a method based on the suffix tree to efficiently choose the indexed substrings so that each of them occurs almost equally frequently in the text. Our experiments show that the resulting indexes are up to 40% faster than the q-gram index when they use the same space.
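For readers unfamiliar with the baseline, the following is a minimal sketch of a classical fixed-length q-gram index, the structure the abstract compares against. It is not the authors' variable-length index or their suffix-tree selection method; the function names `build_qgram_index` and `find_exact` and the dictionary-of-lists layout are assumptions made for illustration only.

```python
from collections import defaultdict


def build_qgram_index(text, q):
    """Map each q-gram of `text` to the ascending list of its start positions.

    Hypothetical helper: a plain dictionary-based layout, chosen for clarity.
    """
    index = defaultdict(list)
    for i in range(len(text) - q + 1):
        index[text[i:i + q]].append(i)
    return index


def find_exact(text, index, q, pattern):
    """Return all start positions of `pattern` in `text`.

    The first q-gram of the pattern selects candidate positions from the
    index; each candidate is then verified against the text.
    """
    if len(pattern) < q:
        raise ValueError("pattern must be at least q characters long")
    candidates = index.get(pattern[:q], [])
    return [i for i in candidates if text.startswith(pattern, i)]


text = "abracadabra"
index = build_qgram_index(text, 3)
print(find_exact(text, index, 3, "abra"))  # [0, 7]
```

In a fixed-length index like this one, frequent q-grams accumulate long position lists while rare ones stay short. The abstract's stated goal of choosing substrings that occur almost equally often can be read as evening out those list lengths.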