Published in: Neural Computing and Applications 8/2020

16.10.2018 | Original Article

Measuring distance-based semantic similarity using meronymy and hyponymy relations

Authors: Yuanyuan Cai, Shirui Pan, Ximeng Wang, Hongshu Chen, Xiaoyan Cai, Min Zuo

Abstract

Assessing the semantic similarity between lexical terms plays a critical part in semantics-oriented applications in natural language processing and cognitive science. Optimizing the calculation models remains a challenge in improving the performance of similarity measurement. In this paper, we investigate WordNet-based measures, which include distance-based, information-based, feature-based and hybrid measures. Among them, the distance-based measures are considered to have the lowest computational complexity because the distance calculation is simple. However, most existing works determine path distances from the hyponymy relation alone: they ignore the meronymy relation between concepts and the non-uniformity of path distances caused by different semantic relations. To solve this problem, we propose a novel model for calculating the path distance between concepts, together with a similarity measure that nonlinearly transforms this distance into semantic similarity. In the proposed model, the edges that link concepts are assigned different weights according to the relation they represent. On the basis of the distance model, we use five structural properties of WordNet for similarity measurement: multiple meanings, multiple inheritance, link type, depth and local density. Our similarity measure is compared against state-of-the-art WordNet-based measures on the M&C, R&G and WS-353 datasets. According to the experimental results, the proposed measure outperforms the others in terms of both Pearson and Spearman correlation coefficients, which indicates the effectiveness of our distance model. In addition, we construct six additional benchmarks to show that the proposed measure maintains stable performance.
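The core idea of the abstract, relation-dependent edge weights followed by a nonlinear distance-to-similarity transform, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy concept graph, the weight values in `RELATION_WEIGHTS`, the exponential decay in `similarity`, and its `alpha` parameter. The paper's actual model also uses depth, local density and other WordNet properties, which this sketch omits.

```python
import heapq
import math

# Hypothetical per-relation edge weights (illustrative values, not from the paper).
RELATION_WEIGHTS = {"hyponymy": 1.0, "meronymy": 1.5}

# Toy concept graph as (concept_a, concept_b, relation); edges are undirected.
EDGES = [
    ("entity", "vehicle", "hyponymy"),
    ("vehicle", "car", "hyponymy"),
    ("vehicle", "bicycle", "hyponymy"),
    ("car", "wheel", "meronymy"),
    ("bicycle", "wheel", "meronymy"),
]


def build_graph(edges, weights):
    """Build an adjacency list whose edge costs depend on the relation type."""
    graph = {}
    for a, b, relation in edges:
        cost = weights[relation]
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    return graph


def weighted_distance(graph, source, target):
    """Shortest weighted path length (Dijkstra); math.inf if unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, cost in graph.get(node, []):
            candidate = dist + cost
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf


def similarity(distance, alpha=0.2):
    """Assumed nonlinear transform: exponential decay of distance into (0, 1]."""
    return math.exp(-alpha * distance)


graph = build_graph(EDGES, RELATION_WEIGHTS)
for pair in [("car", "bicycle"), ("car", "wheel"), ("entity", "wheel")]:
    d = weighted_distance(graph, *pair)
    print(pair, d, round(similarity(d), 3))
```

In this toy graph the route from "car" to "bicycle" through the shared hypernym "vehicle" is shorter than the route through the shared part "wheel", because the meronymy edges were given a larger assumed weight. Changing the values in `RELATION_WEIGHTS` changes which path is shortest, which is the kind of non-uniformity the abstract refers to.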

Metadata
Title
Measuring distance-based semantic similarity using meronymy and hyponymy relations
Authors
Yuanyuan Cai
Shirui Pan
Ximeng Wang
Hongshu Chen
Xiaoyan Cai
Min Zuo
Publication date
16.10.2018
Publisher
Springer London
Published in
Neural Computing and Applications, Issue 8/2020
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-018-3766-9