2010 | Original Paper | Book Chapter
Join Directly on Heavy-Weight Compressed Data in Column-Oriented Database
Written by: Gan Liang, Li RunHeng, Jia Yan, Jin Xin
Published in: Web-Age Information Management
Publisher: Springer Berlin Heidelberg
Operating directly on compressed data can decrease CPU costs. Many light-weight compression schemes, such as run-length encoding and bit-vector encoding, gain this benefit easily. Heavy-weight Lempel-Ziv (LZ) compression, however, has no method for operating directly on compressed data. We propose a join algorithm, LZ join, which joins two relations R and S directly on compressed data during decoding. R is the probe table and is encoded with LZ; S is the build table. When R probes S, LZ join decreases the join cost by using cached results: the previous join results of the IDs in R's LZ dictionary window, which are reused when the decoder finds the same ID sequence of R in the window. LZ join combines the decoding and join phases into one, which reduces the memory needed to decode the whole of R and the CPU overhead of probing for IDs whose results are already cached. Our analysis and experiments show that LZ join is better in some cases, and the higher the compression ratio, the greater the benefit.
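The idea of reusing cached join results while decoding can be illustrated with a minimal sketch. This is not the authors' implementation: the token format (`("lit", id)` for a literal ID and `("copy", offset, length)` for an LZ77-style back-reference), the function names, and the unbounded window are all assumptions made for illustration. A hash join is assumed for the build and probe phases.

```python
from collections import defaultdict


def build_hash(s_rows):
    """Build phase: map each join key of S to the list of its payloads."""
    table = defaultdict(list)
    for key, payload in s_rows:
        table[key].append(payload)
    return table


def lz_join(tokens, s_rows):
    """Join LZ-encoded R with S while decoding (illustrative sketch).

    tokens: hypothetical LZ77-style stream over R's ID column,
            either ("lit", id) or ("copy", offset, length).
    Returns the join output and the number of hash probes performed.
    """
    table = build_hash(s_rows)
    ids = []      # decoded IDs; stands in for the dictionary window (unbounded here)
    cached = []   # cached[i] holds the S matches already found for ids[i]
    probes = 0
    for tok in tokens:
        if tok[0] == "lit":
            rid = tok[1]
            probes += 1                      # only literals probe the hash table
            ids.append(rid)
            cached.append(table.get(rid, []))
        else:
            _, offset, length = tok
            for _ in range(length):
                src = len(ids) - offset      # position referenced in the window
                ids.append(ids[src])
                cached.append(cached[src])   # reuse the earlier join result
    out = [(rid, p) for rid, matches in zip(ids, cached) for p in matches]
    return out, probes


# Usage: R decodes to 1, 2, 3, 1, 2, 3, 4; the repeated run is a back-reference.
tokens = [("lit", 1), ("lit", 2), ("lit", 3), ("copy", 3, 3), ("lit", 4)]
s_rows = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
result, probes = lz_join(tokens, s_rows)
```

In this sketch, seven IDs of R are joined with only four hash probes, because the three IDs covered by the back-reference take their results from the cache. The longer and more frequent the back-references (that is, the higher the compression ratio), the fewer probes are needed, which is consistent with the trend the abstract reports.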