2018 | OriginalPaper | Chapter

DWIaaS: Data Warehouse Infrastructure as a Service for Big Data Analytics

Authors: Hichem Dabbèchi, Ahlem Nabli, Lotfi Bouzguenda, Kais Haddar

Published in: Transactions on Computational Collective Intelligence XXX

Publisher: Springer International Publishing

Abstract

Big Data brings many novel challenges and opportunities that require rethinking many aspects of the traditional data warehouse architecture. Indeed, big data are collections of data sets too large and complex to process using classical data warehousing. This data is sourced from many different places, such as social media, and stored in different formats. It is primarily unstructured data that needs a high-performance information technology infrastructure providing superior computational efficiency and storage capacity. This infrastructure should be flexible and scalable so that it can be managed at large scale. In recent years, cloud computing has been gaining momentum, with more and more successful adoptions. This paper proposes a new data warehouse infrastructure as a service to effectively support the distribution of big data storage, computing and parallelized programming.

Metadata
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-99810-7_7