Skip to main content
Top
Published in: Knowledge and Information Systems 2/2016

01-08-2016 | Regular Paper

Performance evaluation of word-aligned compression methods for bitmap indices

Authors: Gheorghi Guzun, Guadalupe Canahuate

Published in: Knowledge and Information Systems | Issue 2/2016

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Bitmap indices are a widely used scheme for large read-only repositories in data warehouses and scientific databases. This binary representation allows the use of bit-wise operations for fast query processing and is typically compressed using run-length encoding techniques. Most bitmap compression techniques are aligned using a fixed encoding length (32 or 64 bits) to avoid explicit decompression during query time. They have been proposed to extend or enhance word-aligned hybrid (WAH) compression. This paper presents a comparative study of four bitmap compression techniques: WAH, PLWAH, CONCISE, and EWAH. Experiments are targeted to identify the conditions under which each method should be applied and quantify the overhead incurred during query processing. Performance in terms of compression ratio and query time is evaluated over synthetic-generated bitmap indices, and results are validated over bitmap indices generated from real data sets. Different query optimizations are explored, query time estimation formulas are defined, and the conditions under which one method should be preferred over another are formalized.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Antoshenkov G (1995) Byte-aligned bitmap compression. In: DCC ’95: proceedings of the conference on data compression, Washington, DC, USA. IEEE Computer Society, p 476 Antoshenkov G (1995) Byte-aligned bitmap compression. In: DCC ’95: proceedings of the conference on data compression, Washington, DC, USA. IEEE Computer Society, p 476
2.
go back to reference Apaydin T, Tosun AC, Ferhatosmanoglu H (2008) Analysis of basic data reordering techniques. In: International conference on scientific and statistical database management, pp 517–524 Apaydin T, Tosun AC, Ferhatosmanoglu H (2008) Analysis of basic data reordering techniques. In: International conference on scientific and statistical database management, pp 517–524
3.
go back to reference Colantonio A, Di Pietro R (2010) Concise: compressed ‘n’ composable integer set. Inf Process Lett 110(16):644–650CrossRefMATH Colantonio A, Di Pietro R (2010) Concise: compressed ‘n’ composable integer set. Inf Process Lett 110(16):644–650CrossRefMATH
4.
go back to reference Deliege F, Pederson T (2010) Position list word aligned hybrid: optimizing space and performance for compressed bitmaps. In: Proceedings of the 2010 international conference on extending database technology (EDBT’10), pp 228–239 Deliege F, Pederson T (2010) Position list word aligned hybrid: optimizing space and performance for compressed bitmaps. In: Proceedings of the 2010 international conference on extending database technology (EDBT’10), pp 228–239
5.
go back to reference Fabian Corrales DC, Sawin J (2011) Variable length compression for bitmap indices. In: ACM international conference on database and expert systems applications, pp 381–395 Fabian Corrales DC, Sawin J (2011) Variable length compression for bitmap indices. In: ACM international conference on database and expert systems applications, pp 381–395
6.
go back to reference Fusco F, Stoecklin MP, Vlachos M (2010) Net-fli: on-the-fly compression, archiving and indexing of streaming network traffic. Proc VLDB Endow 3(2):1382–1393CrossRef Fusco F, Stoecklin MP, Vlachos M (2010) Net-fli: on-the-fly compression, archiving and indexing of streaming network traffic. Proc VLDB Endow 3(2):1382–1393CrossRef
7.
go back to reference Guzun G, Canahuate G, Chiu D, Sawin J (2014) A tunable compression framework for bitmap indices. In: 2014 IEEE 30th international conference on data engineering (ICDE). IEEE, pp 484–495 Guzun G, Canahuate G, Chiu D, Sawin J (2014) A tunable compression framework for bitmap indices. In: 2014 IEEE 30th international conference on data engineering (ICDE). IEEE, pp 484–495
8.
go back to reference Kaser O, Lemire D, Aouiche K (2008) Histogram-aware sorting for enhanced word-aligned compression in bitmap indexes. In: ACM 11th international workshop on data warehousing and OLAP, pp 1–8 Kaser O, Lemire D, Aouiche K (2008) Histogram-aware sorting for enhanced word-aligned compression in bitmap indexes. In: ACM 11th international workshop on data warehousing and OLAP, pp 1–8
9.
go back to reference Lemire D, Kaser O, Aouiche K (2010) Sorting improves word-aligned bitmap indexes. Data Knowl Eng 69:3–28CrossRef Lemire D, Kaser O, Aouiche K (2010) Sorting improves word-aligned bitmap indexes. Data Knowl Eng 69:3–28CrossRef
10.
go back to reference Lemire D, Kaser O, Gutarra E (2012) Reordering rows for better compression: beyond the lexicographic order. ACM Trans Database Syst 37(3):20:1–20:29CrossRef Lemire D, Kaser O, Gutarra E (2012) Reordering rows for better compression: beyond the lexicographic order. ACM Trans Database Syst 37(3):20:1–20:29CrossRef
11.
go back to reference Lloyd SP (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129–137 Lloyd SP (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129–137
12.
go back to reference Malik HH, Kender JR (2007) Optimizing frequency queries for data mining applications. In: International conference on data mining, pp 595–600 Malik HH, Kender JR (2007) Optimizing frequency queries for data mining applications. In: International conference on data mining, pp 595–600
13.
go back to reference Pinar A, Heath MT (1999) Improving performance of sparse matrix-vector multiplication. In: Proceedings of supercomputing Pinar A, Heath MT (1999) Improving performance of sparse matrix-vector multiplication. In: Proceedings of supercomputing
14.
go back to reference Pinar A, Tao T, Ferhatosmanoglu H (2005) Compressing bitmap indices by data reorganization. In: Proceedings of the 2005 international conference on data engineering (ICDE’05), pp 310–321 Pinar A, Tao T, Ferhatosmanoglu H (2005) Compressing bitmap indices by data reorganization. In: Proceedings of the 2005 international conference on data engineering (ICDE’05), pp 310–321
15.
go back to reference van Schaik SJ, de Moor O (2011) A memory efficient reachability data structure through bit vector compression. In: ACM SIGMOD international conference on management of data, pp 913–924 van Schaik SJ, de Moor O (2011) A memory efficient reachability data structure through bit vector compression. In: ACM SIGMOD international conference on management of data, pp 913–924
16.
go back to reference Wu K, Otoo E, Shoshani A (2004) On the performance of bitmap indices for high cardinality attributes. In: Proceedings of the thirtieth international conference on very large data bases, vol 30, pp 24–35 Wu K, Otoo E, Shoshani A (2004) On the performance of bitmap indices for high cardinality attributes. In: Proceedings of the thirtieth international conference on very large data bases, vol 30, pp 24–35
17.
go back to reference Wu K, Otoo EJ, Shoshani A (2002) Compressing bitmap indexes for faster search operations. In: Proceedings of the 2002 international conference on scientific and statistical database management conference (SSDBM’02), pp 99–108 Wu K, Otoo EJ, Shoshani A (2002) Compressing bitmap indexes for faster search operations. In: Proceedings of the 2002 international conference on scientific and statistical database management conference (SSDBM’02), pp 99–108
18.
go back to reference Wu K, Otoo EJ, Shoshani A (2001) A performance comparison of bitmap indexes. In: CIKM 2001, pp 559–561 Wu K, Otoo EJ, Shoshani A (2001) A performance comparison of bitmap indexes. In: CIKM 2001, pp 559–561
19.
go back to reference Wu K, Otoo EJ, Shoshani A (2006) Optimizing bitmap indices with efficient compression. ACM Trans Database Syst 31(1):1–38CrossRef Wu K, Otoo EJ, Shoshani A (2006) Optimizing bitmap indices with efficient compression. ACM Trans Database Syst 31(1):1–38CrossRef
20.
go back to reference Wu K, Otoo EJ, Shoshani A, Nordberg H (2001b) Notes on design and implementation of compressed bit vectors. Technical Report LBNL/PUB-3161, Lawrence Berkeley National Laboratory Wu K, Otoo EJ, Shoshani A, Nordberg H (2001b) Notes on design and implementation of compressed bit vectors. Technical Report LBNL/PUB-3161, Lawrence Berkeley National Laboratory
21.
go back to reference Zipf G (1949) Human behaviour and the principle of least-effort. Addison-Wesley, Cambridge Zipf G (1949) Human behaviour and the principle of least-effort. Addison-Wesley, Cambridge
Metadata
Title
Performance evaluation of word-aligned compression methods for bitmap indices
Authors
Gheorghi Guzun
Guadalupe Canahuate
Publication date
01-08-2016
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2016
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-015-0877-9

Other articles of this Issue 2/2016

Knowledge and Information Systems 2/2016 Go to the issue

Premium Partner