2007 | OriginalPaper | Buchkapitel
Cache-Oblivious Index for Approximate String Matching
verfasst von : Wing-Kai Hon, Tak-Wah Lam, Rahul Shah, Siu-Lung Tam, Jeffrey Scott Vitter
Erschienen in: Combinatorial Pattern Matching
Verlag: Springer Berlin Heidelberg
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
This paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text
T
of length
n
and a positive integer
k
, we want to construct an index of
T
such that for any input pattern
P
, we can find all its
k
-error matches in
T
efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-memory solutions, which are also cache-oblivious. Our first index occupies
O
((
n
log
k
n
)/
B
) disk pages and finds all
k
-error matches with
$O((|P|+occ)/B + \log^k n \log \log_{\scriptscriptstyle B} n)$
I/Os, where
B
denotes the number of words in a disk page. To the best of our knowledge, this index is the first external-memory data structure that does not require
$\Omega(|P| + occ + \mbox{poly}(\log n))$
I/Os. The second index reduces the space to
O
((
n
log
n
)/
B
) disk pages, and the I/O complexity is
O
((|
P
| +
occ
)/
B
+ log
k
(
k
+ 1)
n
loglog
n
).