
2019 | Original Paper | Book Chapter

Enhancing Big Data Warehousing for Efficient, Integrated and Advanced Analytics

Visionary Paper

Authors: Maribel Yasmina Santos, Carlos Costa, João Galvão, Carina Andrade, Oscar Pastor, Ana Cristina Marcén

Published in: Information Systems Engineering in Responsible Information Systems

Publisher: Springer International Publishing


Abstract

The existing capacity to collect, store, process and analyze huge amounts of rapidly generated data, i.e., Big Data, is characterized by fast technological developments and by a limited set of conceptual advances that guide researchers and practitioners in the implementation of Big Data systems. New data stores or processing tools frequently appear, proposing new (and usually more efficient) ways to store and query data (like SQL-on-Hadoop). Although these developments are very relevant, the lack of common methodological guidelines or practices has motivated the implementation of Big Data systems based on use-case driven approaches. This is also the case for one of the most valuable organizational data assets, the Data Warehouse, which needs to be rethought in the way it is designed, modeled, implemented, managed and monitored. This paper addresses some of the research challenges in Big Data Warehousing systems, proposing a vision that looks into: (i) the integration of new business processes and data sources; (ii) the proper way to achieve this integration; (iii) the management of these complex data systems and the enhancement of their performance; (iv) the automation of some of their analytical capabilities with Complex Event Processing and Machine Learning; and (v) the flexible and highly customizable visualization of their data, providing an advanced decision-making support environment.


Metadata
Title
Enhancing Big Data Warehousing for Efficient, Integrated and Advanced Analytics
Authors
Maribel Yasmina Santos
Carlos Costa
João Galvão
Carina Andrade
Oscar Pastor
Ana Cristina Marcén
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-21297-1_19