2020 | OriginalPaper | Chapter

HANDLE - A Generic Metadata Model for Data Lakes

Authors: Rebecca Eichler, Corinna Giebler, Christoph Gröger, Holger Schwarz, Bernhard Mitschang

Published in: Big Data Analytics and Knowledge Discovery

Publisher: Springer International Publishing

Abstract

The substantial increase in generated data has driven the development of new concepts such as the data lake. A data lake is a large storage repository designed to enable flexible extraction of the data’s value. A key aspect of exploiting data value in data lakes is the collection and management of metadata. Storing and handling this metadata requires a generic metadata model that can reflect the metadata of any potential metadata management use case, e.g., data versioning or data lineage. However, an evaluation of existing metadata models shows that none so far is sufficiently generic. In this work, we present HANDLE, a generic metadata model for data lakes that supports the flexible integration of metadata, data lake zones, metadata at various levels of granularity, and any metadata categorization. With these capabilities, HANDLE enables comprehensive metadata management in data lakes. We show HANDLE’s feasibility by applying it to an exemplary access use case and through a prototypical implementation. A comparison with existing models shows that HANDLE can reflect the same information and provides additional capabilities needed for metadata management in data lakes.

Metadata

Copyright Year: 2020

DOI: https://doi.org/10.1007/978-3-030-59065-9_7
