
2016 | OriginalPaper | Book chapter

Measurement of Semantic Similarity: A Concept Hierarchy Based Approach

Authors: Shrutilipi Bhattacharjee, Soumya K. Ghosh

Published in: Proceedings of 3rd International Conference on Advanced Computing, Networking and Informatics

Publisher: Springer India

Abstract

Resolving semantic heterogeneity is a major issue in many fields, including natural language processing, search engine development, document clustering, geospatial information retrieval, and knowledge discovery. Semantic heterogeneity is often considered an obstacle to full interoperability among diverse datasets. An appropriate measurement metric is essential for understanding the extent of similarity between concepts. The proposed approach is based on a concept hierarchy built from WordNet, a semantic lexical database. A measurement metric is also proposed to quantify the extent of similarity between a pair of concepts. The work is compared with existing methodologies on the Miller-Charles benchmark dataset using three correlation coefficients (Pearson’s, Spearman’s, and Kendall’s tau rank correlation coefficients). The proposed approach is found to yield better results than most of the existing techniques.
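The evaluation pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the proposed metric, so a simple inverse path length measure stands in for it; the toy hierarchy `PARENT` is hand-written rather than derived from WordNet; and the `human` ratings are made-up values, not Miller-Charles data. Kendall's tau is computed in its tau-a form, which ignores ties.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy is-a hierarchy (child -> parent); NOT taken from WordNet.
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "car": "vehicle",
    "bicycle": "vehicle",
    "animal": "entity",
    "bird": "animal",
}


def path_to_root(concept):
    """Return the list of concepts from `concept` up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def path_similarity(a, b):
    """Stand-in measure (not the paper's metric): 1 / (1 + number of edges
    on the path between a and b through their lowest common ancestor)."""
    pa, pb = path_to_root(a), path_to_root(b)
    steps_in_b = {c: i for i, c in enumerate(pb)}
    for i, c in enumerate(pa):
        if c in steps_in_b:
            return 1.0 / (1 + i + steps_in_b[c])
    return 0.0  # no common ancestor


def pearson(x, y):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


# Made-up word pairs and made-up human ratings, for illustration only.
pairs = [("car", "bicycle"), ("car", "bird"), ("bird", "animal"), ("car", "car")]
human = [3.0, 0.5, 3.5, 4.0]
scores = [path_similarity(a, b) for a, b in pairs]

print("Pearson :", pearson(scores, human))
print("Spearman:", spearman(scores, human))
print("Kendall :", kendall_tau(scores, human))
```

Reporting all three coefficients is useful because Pearson's coefficient is sensitive to the scale of the scores, while the two rank coefficients only check whether the measure orders the word pairs the way human judges do.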

Metadata
Title
Measurement of Semantic Similarity: A Concept Hierarchy Based Approach
Authors
Shrutilipi Bhattacharjee
Soumya K. Ghosh
Copyright year
2016
Publisher
Springer India
DOI
https://doi.org/10.1007/978-81-322-2529-4_43