2018 | OriginalPaper | Chapter

Entropy Aware Adaptive Compression for SQL Column Stores

Abstract

With the advent of SQL column stores, compression has gained renewed interest and drawn considerable attention from both academia and industry. Unlike row stores, column stores use lightweight compression methods, and compression is generally applied at the granularity of an entire column. In this paper we outline and explore an alternative compression strategy for column stores that works at a different granularity and adapts itself to the data on the fly, using a compression planner. The approach yields good compression ratios, facilitates compression during bulk data load, and mitigates some issues that arise from having to maintain global metadata on compression. We describe its implementation in the analytics database dbX, a cloud-agnostic, columnar MPP SQL product, and present experimental results.
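
The abstract does not spell out how the compression planner decides, so the following is only a rough sketch of what an entropy-aware, per-block choice could look like. It is not the dbX implementation. The function names (`shannon_entropy`, `average_run_length`, `plan_block`), the three candidate schemes, the 32-bit raw value width and the run-length threshold are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import groupby


def shannon_entropy(values):
    """Empirical Shannon entropy of a block, in bits per value."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def average_run_length(values):
    """Mean length of runs of consecutive equal values."""
    runs = sum(1 for _ in groupby(values))
    return len(values) / runs


def plan_block(values, raw_bits=32, min_run=4.0):
    """Pick a scheme for one block of a column.

    Hypothetical rules, not taken from the paper:
    - long runs of equal values favour run-length encoding;
    - otherwise, dictionary coding is chosen when an entropy-based size
      estimate (coded values plus the dictionary itself) beats raw storage;
    - otherwise the block is left uncompressed.
    """
    n = len(values)
    if n == 0:
        return "none"
    if average_run_length(values) >= min_run:
        return "rle"
    distinct = len(set(values))
    estimated_bits = n * shannon_entropy(values) + distinct * raw_bits
    if estimated_bits < n * raw_bits:
        return "dictionary"
    return "none"
```

Under these assumed rules, a block of 100 identical values is planned as `"rle"`, a block that cycles through four values 50 times as `"dictionary"`, and a block of 200 distinct values as `"none"`. Because the decision is made per block from local statistics, no column-wide metadata is consulted, which is the property the abstract highlights.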

Metadata
Title
Entropy Aware Adaptive Compression for SQL Column Stores
Authors
K. T. Sridhar
Jimson Johnson
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-99987-6_7