2006 | OriginalPaper | Book chapter
Reducing the Space Requirement of LZ-Index
Authors: Diego Arroyuelo, Gonzalo Navarro, Kunihiko Sadakane
Published in: Combinatorial Pattern Matching
Publisher: Springer Berlin Heidelberg
The LZ-index is a compressed full-text self-index able to represent a text $T_{1\ldots u}$, over an alphabet of size $\sigma = O(\textrm{polylog}(u))$ and with $k$-th order empirical entropy $H_k(T)$, using $4uH_k(T) + o(u\log\sigma)$ bits for any $k = o(\log_\sigma u)$. It can report all the $\mathit{occ}$ occurrences of a pattern $P_{1\ldots m}$ in $T$ in $O(m^3\log\sigma + (m + \mathit{occ})\log u)$ worst-case time. Its main drawback is the factor 4 in its space complexity, which makes it larger than other state-of-the-art alternatives. In this paper we present two different approaches to reduce the space requirement of the LZ-index. In both cases we achieve $(2 + \epsilon)uH_k(T) + o(u\log\sigma)$ bits of space, for any constant $\epsilon > 0$, and we simultaneously improve the search time to $O(m^2\log m + (m + \mathit{occ})\log u)$. Both indexes support displaying any subtext of length $\ell$ in optimal $O(\ell/\log_\sigma u)$ time. In addition, we show how the space can be squeezed to $(1 + \epsilon)uH_k(T) + o(u\log\sigma)$ to obtain a structure with $O(m^2)$ average search time for $m \geqslant 2\log_\sigma{u}$.
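To make the space bounds more concrete, the following minimal Python sketch computes the $k$-th order empirical entropy $H_k(T)$ using its standard definition (the weighted zeroth-order entropy of the symbols that follow each length-$k$ context) and then evaluates only the leading term of a bound such as $4uH_k(T)$ or $(2 + \epsilon)uH_k(T)$. The function names `empirical_entropy` and `leading_term_bits` are hypothetical and do not come from the paper; the $o(u\log\sigma)$ term is ignored, so the numbers are only indicative.

```python
from collections import Counter, defaultdict
from math import log2


def empirical_entropy(text: str, k: int) -> float:
    """Return the k-th order empirical entropy H_k(text) in bits per symbol."""
    u = len(text)
    if u == 0:
        return 0.0
    # Group each symbol by the k symbols that precede it (its context).
    followers = defaultdict(Counter)
    for i in range(k, u):
        followers[text[i - k:i]][text[i]] += 1
    total_bits = 0.0
    for counts in followers.values():
        n = sum(counts.values())
        # n times the zeroth-order entropy of the symbols seen after this context.
        total_bits += sum(c * log2(n / c) for c in counts.values())
    return total_bits / u


def leading_term_bits(text: str, k: int, factor: float) -> float:
    """Leading term factor * u * H_k(text); the o(u log sigma) term is ignored."""
    return factor * len(text) * empirical_entropy(text, k)


if __name__ == "__main__":
    sample = "abab"
    for k in (0, 1):
        print(k, empirical_entropy(sample, k))
    # Compare the original factor 4 with a reduced factor 2 + epsilon (epsilon = 0.5 here).
    print(leading_term_bits(sample, 0, 4.0), leading_term_bits(sample, 0, 2.5))
```

For the toy string `abab`, $H_0$ is 1 bit per symbol, while $H_1$ is 0 because each symbol is fully determined by its predecessor; this illustrates why bounds stated in terms of $H_k$ for larger $k$ can be much smaller than those stated in terms of $H_0$.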