
2019 | Original Paper | Book Chapter

Data Lakes: Trends and Perspectives

Written by: Franck Ravat, Yan Zhao

Published in: Database and Expert Systems Applications

Publisher: Springer International Publishing

Abstract

As a relatively new concept, the data lake has neither a standard definition nor an acknowledged architecture. We therefore study the existing work and propose a complete definition and a generic, extensible data lake architecture. We also introduce three future research axes connected with our health-care Information Technology (IT) activities: (i) metadata management, which consists of intra- and inter-metadata, (ii) a unified ecosystem for companies' data warehouses and data lakes, and (iii) data lake governance.

Metadata
Title
Data Lakes: Trends and Perspectives
Authors
Franck Ravat
Yan Zhao
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-27615-7_23
