2016 | OriginalPaper | Chapter

Data Analysis of Hierarchical Data for RDF Term Identification

Authors: Pieter Heyvaert, Anastasia Dimou, Ruben Verborgh, Erik Mannens

Published in: Semantic Technology

Publisher: Springer International Publishing

Abstract

Generating Linked Data from existing data sources requires modeling their information structure. This modeling involves identifying potential entities, their attributes, and the relationships among entities. For databases, this identification is not required, because a data schema is always available. For other data formats, such as hierarchical data, a schema is not always available. Therefore, analysis of the data is required to support RDF term and data type identification. We introduce a tool that performs such an analysis on hierarchical data. It implements the two algorithms proposed in this paper, Daro and S-Daro. Based on our evaluation, we conclude that S-Daro scales better in run time as the dataset size grows, and provides more complete results.
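
To make the kind of analysis described above more concrete, the following is a minimal Python sketch of inferring data types and candidate keys from hierarchical (JSON-like) records. It is not Daro or S-Daro, whose details are not given in this abstract; the function names `infer_type` and `analyze`, the path notation, the XSD type labels, and the key heuristic are all assumptions made for illustration.

```python
from collections import defaultdict


def infer_type(value):
    """Map a leaf value to a coarse XSD-style type label (illustrative only)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "xsd:boolean"
    if isinstance(value, int):
        return "xsd:integer"
    if isinstance(value, float):
        return "xsd:double"
    return "xsd:string"


def analyze(records):
    """Return (types per attribute path, sorted list of candidate key paths).

    Heuristic (an assumption, not the paper's method): a path is a candidate
    key if it has as many values as there are records and all are distinct.
    """
    types = defaultdict(set)
    values = defaultdict(list)

    def walk(node, path):
        if isinstance(node, dict):
            for name, child in node.items():
                walk(child, f"{path}/{name}")
        elif isinstance(node, list):
            for child in node:
                walk(child, path)
        elif node is not None:  # skip missing values
            types[path].add(infer_type(node))
            values[path].append(node)

    for record in records:
        walk(record, "")

    n = len(records)
    keys = sorted(
        path for path, vals in values.items()
        if len(vals) == n and len(set(vals)) == n
    )
    return dict(types), keys


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the paper's evaluation.
    records = [
        {"id": 1, "name": "Alice", "address": {"city": "Ghent"}, "tags": ["a", "b"]},
        {"id": 2, "name": "Bob", "address": {"city": "Ghent"}, "tags": ["a"]},
    ]
    types, keys = analyze(records)
    print(types)
    print(keys)
```

In this sketch, paths whose values are unique across records (here `/id` and `/name`) are reported as candidate keys that could serve to build entity identifiers, while the inferred types suggest data types for literals. A real analysis would also need to handle nested entities, composite keys, and mixed types.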

Footnotes
3
 
4 For brevity, we did not include the code that allows users to determine the data model, keys, and data types separately.
5 All experiments were conducted on a 64-bit Ubuntu 14.04 machine with 128 GB of RAM and a 24-core 2.40 GHz CPU. Each algorithm ran in a Docker container that could use at most 8 GB of RAM and 1 CPU core at any moment.
 
Metadata
Title
Data Analysis of Hierarchical Data for RDF Term Identification
Authors
Pieter Heyvaert
Anastasia Dimou
Ruben Verborgh
Erik Mannens
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-50112-3_15