2019 | Original Paper | Book chapter

Combining Semantic and Lexical Measures to Evaluate Medical Terms Similarity

Authors: Silvio Domingos Cardoso, Marcos Da Silveira, Ying-Chi Lin, Victor Christen, Erhard Rahm, Chantal Reynaud-Delaître, Cédric Pruski

Published in: Data Integration in the Life Sciences

Publisher: Springer International Publishing

Abstract

Similarity measures are a cornerstone of tasks ranging from ontology alignment to information retrieval across many domains. Existing metrics fall into several categories, among which the lexical and semantic families predominate, yet the two have rarely been combined to carry out these tasks. In this paper, we propose an original approach that combines lexical and ontology-based semantic similarity measures to improve the evaluation of term relatedness. We validate our approach through a set of experiments based on a reference corpus constructed by medical domain experts, and we further evaluate the impact of ontology evolution on the semantic similarity measures used.
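
As a rough illustration of what combining the two families can look like, the following Python sketch takes a weighted average of a lexical score (Dice coefficient over character bigrams) and an ontology-based score (Wu-Palmer similarity over a toy is-a hierarchy). The choice of measures, the weight, the hierarchy, and all function names are assumptions made for illustration only; the abstract does not state which measures or which combination scheme the authors use.

```python
from typing import Dict, List, Optional, Set

# Toy is-a hierarchy (child -> parent). Hypothetical data, not a real ontology.
PARENT: Dict[str, Optional[str]] = {
    "disease": None,
    "heart disease": "disease",
    "myocardial infarction": "heart disease",
    "heart attack": "heart disease",
    "lung disease": "disease",
    "pneumonia": "lung disease",
}


def bigrams(term: str) -> Set[str]:
    """Set of character bigrams of the lower-cased term."""
    s = term.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def lexical_sim(a: str, b: str) -> float:
    """Dice coefficient over character bigrams, in [0, 1]."""
    ba, bb = bigrams(a), bigrams(b)
    if not ba and not bb:
        # Terms shorter than two characters have no bigrams.
        return 1.0 if a.lower() == b.lower() else 0.0
    return 2 * len(ba & bb) / (len(ba) + len(bb))


def path_to_root(concept: str) -> List[str]:
    """Concepts from the given one up to the root, inclusive."""
    path = [concept]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def semantic_sim(a: str, b: str) -> float:
    """Wu-Palmer similarity: 2*depth(lcs) / (depth(a) + depth(b)), root depth 1."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    # First shared concept on the way up from a is the deepest common ancestor.
    lcs = next(c for c in path_a if c in ancestors_b)
    return 2 * len(path_to_root(lcs)) / (len(path_a) + len(path_b))


def combined_sim(a: str, b: str, weight: float = 0.5) -> float:
    """Weighted average; `weight` is the share given to the lexical score."""
    return weight * lexical_sim(a, b) + (1 - weight) * semantic_sim(a, b)


if __name__ == "__main__":
    for pair in [("myocardial infarction", "heart attack"),
                 ("myocardial infarction", "pneumonia")]:
        print(pair, round(combined_sim(*pair), 3))
```

In this toy setting, "myocardial infarction" and "heart attack" share almost no characters, so a purely lexical measure scores them low, while the hierarchy places them under the same parent; a combined score lets the semantic signal compensate for the lexical mismatch.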

Metadata
Title
Combining Semantic and Lexical Measures to Evaluate Medical Terms Similarity
Authors
Silvio Domingos Cardoso
Marcos Da Silveira
Ying-Chi Lin
Victor Christen
Erhard Rahm
Chantal Reynaud-Delaître
Cédric Pruski
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-06016-9_2