
2021 | Original Paper | Book Chapter

Background Knowledge in Schema Matching: Strategy vs. Data

Authors: Jan Portisch, Michael Hladik, Heiko Paulheim

Published in: The Semantic Web – ISWC 2021

Publisher: Springer International Publishing


Abstract

The use of external background knowledge can be beneficial for the task of matching schemas or ontologies automatically. In this paper, we exploit six general-purpose knowledge graphs as sources of background knowledge for the matching task. The background sources are evaluated by applying three different exploitation strategies. We find that explicit strategies still outperform latent ones and that the choice of the strategy has a greater impact on the final alignment than the actual background dataset on which the strategy is applied. While we could not identify a universally superior resource, BabelNet achieved consistently good results. Our best matcher configuration with BabelNet performs very competitively when compared to other matching systems even though no dataset-specific optimizations were made.


Footnotes
2
The size of the external resource is not relevant within the matching process since all similarity functions applied here are lookup-based. When training an embedding with the external resource, the size of the resource affects scalability; however, the training is a one-time process – once the vectors are available, they can be reused in all other matching tasks.
 
5
Unfortunately, the latest BabelNet release is not available as RDF.
 
13
For example, symposium and conference are hypernyms of each other in WebIsALOD.
 
16
This may seem odd at first. However, the lower recall values result from the Hungarian optimization method used to obtain a 1:1 alignment, which, in that case, extracts more false positives.
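
The effect described in footnote 16 can be illustrated with a minimal sketch. It assumes a toy 2×2 similarity matrix with invented values (not taken from the chapter) and uses exhaustive search over permutations instead of the Hungarian method itself; for such a tiny input both yield the same optimal 1:1 assignment. The names `best_one_to_one`, `sim`, and `reference` are hypothetical.

```python
from itertools import permutations


def best_one_to_one(sim):
    """Return the 1:1 assignment that maximizes the total similarity.

    Exhaustive search over all permutations; assumes a square matrix and
    is only feasible for tiny inputs. The Hungarian method finds the same
    optimum in polynomial time.
    """
    n = len(sim)
    best_perm = max(
        permutations(range(n)),
        key=lambda perm: sum(sim[i][perm[i]] for i in range(n)),
    )
    return [(i, best_perm[i]) for i in range(n)]


# Toy data (assumed for illustration only).
sources = ["A", "B"]
targets = ["X", "Y"]
sim = [
    [0.9, 0.8],  # A-X is the correct match, A-Y is a near miss
    [0.8, 0.0],  # B-X is a near miss, B-Y has no similarity
]
reference = {("A", "X")}  # gold-standard alignment

# The optimizer prefers A-Y + B-X (total 1.6) over A-X + B-Y (total 0.9).
alignment = {(sources[i], targets[j]) for i, j in best_one_to_one(sim)}
true_positives = alignment & reference
recall = len(true_positives) / len(reference)

print(sorted(alignment))
print(recall)
```

In this constructed case, maximizing the total similarity drops the single correct correspondence in favor of two incorrect ones, so recall falls while the number of false positives rises.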
 
Metadata
Title
Background Knowledge in Schema Matching: Strategy vs. Data
Authors
Jan Portisch
Michael Hladik
Heiko Paulheim
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-88361-4_17
