2015 | OriginalPaper | Book chapter

Combining Syntactic and Semantic Evidence for Improving Matching over Linked Data Sources

Written by: Klitos Christodoulou, Alvaro A. A. Fernandes, Norman W. Paton

Published in: Web Information Systems Engineering – WISE 2015

Publisher: Springer International Publishing

Abstract

In the context of Linked Data (LD) sources, the ability to traverse links and retrieve further information can be exploited to harvest semantic annotations. Such annotations can, in turn, underpin the inference of semantic correspondences between sources. This paper shows that using semantic annotations as additional evidence of equivalence between schematic representations of LD sources can improve upon the prevalent, purely syntactic approaches. The paper both describes the construction of probabilistic models that yield degrees of belief on the equivalence of the real-world concepts represented by the data and shows how these models are crucial in underpinning a Bayesian approach to assimilating both syntactic evidence (in the form of similarity scores derived by string-based matchers) and semantic evidence (in the form of semantic annotations stemming from LD vocabularies) of equivalence. The paper presents an empirical evaluation of the techniques described. The main finding is confirmation that, with respect to equivalence judgements made by human experts, the use of the contributed techniques incurs significantly fewer discrepancies than purely syntactic approaches.

Footnotes
4
A Gaussian kernel was used for its mathematical convenience. Any other kernel could be applied instead, although the shape of the resulting distribution may differ depending on the characteristics of the kernel.
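
As an illustration of the point made in this footnote, the following is a minimal sketch of kernel density estimation over similarity scores, assuming a hand-picked bandwidth and a made-up list of scores. The function names (`kde`, `gaussian_kernel`, `epanechnikov_kernel`), the bandwidth of 0.05 and the sample values are all hypothetical and are not taken from the paper. The Epanechnikov kernel is included only to show how a different kernel changes the shape of the estimate.

```python
import math


def gaussian_kernel(u: float) -> float:
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def epanechnikov_kernel(u: float) -> float:
    """Alternative kernel with bounded support on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kde(samples, bandwidth, kernel=gaussian_kernel):
    """Return a function x -> estimated probability density at x."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if not samples:
        raise ValueError("need at least one sample")
    n = len(samples)

    def density(x: float) -> float:
        # Average of kernels centred on each sample, scaled by the bandwidth.
        return sum(kernel((x - s) / bandwidth) for s in samples) / (n * bandwidth)

    return density


# Hypothetical similarity scores for pairs that experts judged equivalent.
scores_equivalent = [0.71, 0.78, 0.83, 0.85, 0.90, 0.94]

gauss_density = kde(scores_equivalent, bandwidth=0.05)
epan_density = kde(scores_equivalent, bandwidth=0.05, kernel=epanechnikov_kernel)

# Both estimates peak near the observed scores, but only the Gaussian one
# assigns non-zero density to scores far away from every sample.
print(gauss_density(0.85), epan_density(0.85))
print(gauss_density(0.30), epan_density(0.30))
```

The Gaussian estimate is smooth and strictly positive everywhere, which is convenient when the density is later used as a likelihood, since a zero likelihood would rule a hypothesis out entirely.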
 
6
Informally, the theorem states that the degree of belief (dob) in the hypothesis given the evidence (the so-called posterior) is equal to the product of the dob in the evidence given the hypothesis (what we called the likelihood in Sect. 3) and the dob in the hypothesis (the so-called prior), divided by the dob in the evidence: \(P(H \mid E) = P(E \mid H)\,P(H) / P(E)\).
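
The theorem described in this footnote can be sketched in a few lines. This is a minimal illustration, not the paper's implementation. It assumes a binary hypothesis (two constructs are equivalent or not) and assumes that successive pieces of evidence are conditionally independent given the hypothesis, so that each posterior can serve as the prior for the next update. The function names and all numeric values are hypothetical.

```python
def bayes_update(prior: float, lik_h: float, lik_not_h: float) -> float:
    """Posterior P(H | E) from the prior P(H) and the likelihoods
    P(E | H) and P(E | not H)."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability")
    # P(E), obtained from the law of total probability.
    evidence = lik_h * prior + lik_not_h * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return lik_h * prior / evidence


def assimilate(prior: float, evidence_items) -> float:
    """Apply Bayes' theorem once per piece of evidence, feeding each
    posterior back in as the next prior (assumes conditional independence)."""
    belief = prior
    for lik_h, lik_not_h in evidence_items:
        belief = bayes_update(belief, lik_h, lik_not_h)
    return belief


# Hypothetical evidence: (P(E | equivalent), P(E | not equivalent)).
syntactic = (0.8, 0.3)  # e.g. a likelihood read off a density over similarity scores
semantic = (0.6, 0.2)   # e.g. a likelihood attached to a semantic annotation

posterior = assimilate(0.5, [syntactic, semantic])
print(posterior)
```

Under the independence assumption, the order in which the evidence is assimilated does not affect the final degree of belief.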
 
7
The survey was distributed to, and completed by, 15 human participants, all of them experts in data integration tasks such as schema matching and mapping.
 
8
BLOOMS was configured with a high threshold, viz., \(> 0.8\).
 
Metadata
Title
Combining Syntactic and Semantic Evidence for Improving Matching over Linked Data Sources
Written by
Klitos Christodoulou
Alvaro A. A. Fernandes
Norman W. Paton
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-26190-4_14