2015 | OriginalPaper | Book chapter

How Can We Implement a Multidimensional Data Warehouse Using NoSQL?

Written by: Max Chevalier, Mohammed El Malki, Arlind Kopliku, Olivier Teste, Ronan Tournier

Published in: Enterprise Information Systems

Publisher: Springer International Publishing

Abstract

Traditional OLAP (On-Line Analytical Processing) systems store data in relational databases. Unfortunately, it is difficult to manage big data volumes with such systems. As an alternative, NoSQL (Not-only SQL) systems provide scalability and flexibility for an OLAP system. We define a set of rules to map star schemas and their optimization structure, a precomputed aggregate lattice, into two logical NoSQL models: column-oriented and document-oriented. Using these rules, we analyse and implement two decision support systems, one for each model (HBase for the column-oriented model and MongoDB for the document-oriented model). We compare both systems during three phases: loading of data (generated using the TPC-DS benchmark), lattice generation, and querying.
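
As a rough illustration of the idea in the abstract, the following Python sketch (no database driver involved) nests a star-schema fact together with its dimension attributes into one document, then enumerates the nodes of a precomputed aggregate lattice. The dimension names, attributes, sample values, and helper functions are hypothetical and are not taken from the chapter; the chapter's actual mapping rules may differ.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical star schema: each fact row references two dimension rows.
customers = {1: {"city": "Toulouse", "country": "France"},
             2: {"city": "Paris", "country": "France"}}
dates = {10: {"month": "2015-01", "year": 2015},
         11: {"month": "2015-02", "year": 2015}}
sales = [
    {"customer_id": 1, "date_id": 10, "amount": 100.0},
    {"customer_id": 2, "date_id": 10, "amount": 50.0},
    {"customer_id": 1, "date_id": 11, "amount": 25.0},
]


def to_document(fact):
    """Nest dimension attributes and measures in one denormalized document."""
    return {
        "Customer": dict(customers[fact["customer_id"]]),
        "Date": dict(dates[fact["date_id"]]),
        "measures": {"amount": fact["amount"]},
    }


def lattice_nodes(levels):
    """Every subset of the grouping levels is one node (cuboid) of the lattice."""
    return [combo
            for size in range(len(levels), -1, -1)
            for combo in combinations(levels, size)]


def aggregate(documents, group_by):
    """Sum the 'amount' measure, grouping on (dimension, attribute) pairs."""
    totals = defaultdict(float)
    for doc in documents:
        key = tuple(doc[dim][attr] for dim, attr in group_by)
        totals[key] += doc["measures"]["amount"]
    return dict(totals)


documents = [to_document(fact) for fact in sales]
# One chosen attribute per dimension; a fuller lattice could cover every level.
levels = [("Customer", "country"), ("Date", "month")]
lattice = {node: aggregate(documents, node) for node in lattice_nodes(levels)}
```

In a document store, each entry of `documents` could be stored as one document, and each lattice node could be kept as its own precomputed collection; both choices are assumptions of this sketch rather than the design evaluated in the chapter.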

Metadata
Title
How Can We Implement a Multidimensional Data Warehouse Using NoSQL?
Written by
Max Chevalier
Mohammed El Malki
Arlind Kopliku
Olivier Teste
Ronan Tournier
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-29133-8_6