2019 | OriginalPaper | Chapter

Enhancing Big Data Warehousing for Efficient, Integrated and Advanced Analytics

Visionary Paper

Authors: Maribel Yasmina Santos, Carlos Costa, João Galvão, Carina Andrade, Oscar Pastor, Ana Cristina Marcén

Published in: Information Systems Engineering in Responsible Information Systems

Publisher: Springer International Publishing

Abstract

The capacity to collect, store, process and analyze huge amounts of rapidly generated data, i.e., Big Data, is characterized by fast technological developments but only a limited set of conceptual advances to guide researchers and practitioners in implementing Big Data systems. New data stores and processing tools appear frequently, proposing new (and usually more efficient) ways to store and query data, such as SQL-on-Hadoop. Although these tools are very relevant, the lack of common methodological guidelines or practices has led Big Data systems to be implemented through use-case driven approaches. This is also the case for one of the most valuable organizational data assets, the Data Warehouse, which needs to be rethought in the way it is designed, modeled, implemented, managed and monitored. This paper addresses some of the research challenges in Big Data Warehousing systems, proposing a vision that looks into: (i) the integration of new business processes and data sources; (ii) the proper way to achieve this integration; (iii) the management of these complex data systems and the enhancement of their performance; (iv) the automation of some of their analytical capabilities with Complex Event Processing and Machine Learning; and (v) the flexible and highly customizable visualization of their data, providing an advanced decision-making support environment.

DOI: https://doi.org/10.1007/978-3-030-21297-1_19