2018 | OriginalPaper | Chapter

An Empirical Study of Word Sense Disambiguation for Biomedical Information Retrieval System

Authors: Mohammed Rais, Abdelmonaime Lachkar

Published in: Bioinformatics and Biomedical Engineering

Publisher: Springer International Publishing

Abstract

Document representation is an important stage in indexing biomedical documents. The usual way to represent a text is as a bag of words (BoW). This representation lacks sense information, because it ignores the semantics of the original text; conceptualization using background knowledge, by contrast, enriches document representation models. Three strategies can be used to carry out the conceptualization task: Adding Concept, Partial Conceptualization, and Complete Conceptualization. When the senses of a polysemic term are looked up in semantic resources, multiple matches are found, which introduces ambiguity into the final document representation. Three disambiguation strategies can be used: First Concept, All Concepts, and Context-Based. SenseRelate is a well-known Context-Based algorithm that uses a fixed window size and weights each context term by its distance from the target word. Because this weighting may negatively affect the concepts or senses selected, we propose a simple modified version of the SenseRelate algorithm, named NoDistanceSenseRelate, which ignores distance, so that all terms in the context receive the same weight. To evaluate the effect of the conceptualization and disambiguation strategies on the indexing process, we conducted several experiments on a biomedical information retrieval system using the OHSUMED corpus. The results show that the Context-Based methods (SenseRelate and NoDistanceSenseRelate) outperform the other methods when the Adding Concept conceptualization strategy is applied. These results provide evidence for the benefit of adding concept senses to the term representation in the IR process.
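The difference between the two Context-Based variants can be illustrated with a minimal Python sketch. It is not the authors' implementation: the relatedness table, the sense labels, and the 1/distance weighting are all assumptions made for illustration, since the abstract does not specify the relatedness measure or the exact weighting formula. The function scores each candidate sense against the terms in a fixed window around the target word, and the `use_distance` flag switches between distance-weighted scoring (in the spirit of SenseRelate) and uniform weights (in the spirit of NoDistanceSenseRelate).

```python
from typing import Dict, FrozenSet, Optional, Sequence

# Hypothetical toy relatedness scores between a candidate sense and a
# context term. A real system would query a semantic resource instead.
TOY_RELATEDNESS: Dict[FrozenSet[str], float] = {
    frozenset({"cold_temperature", "winter"}): 0.9,
    frozenset({"cold_temperature", "virus"}): 0.1,
    frozenset({"common_cold", "winter"}): 0.3,
    frozenset({"common_cold", "virus"}): 0.8,
}


def relatedness(sense: str, term: str) -> float:
    """Look up the toy relatedness score; unknown pairs score 0."""
    return TOY_RELATEDNESS.get(frozenset({sense, term}), 0.0)


def disambiguate(
    tokens: Sequence[str],
    target_index: int,
    senses: Sequence[str],
    window: int = 2,
    use_distance: bool = True,
) -> Optional[str]:
    """Pick the sense with the highest summed relatedness to the context.

    use_distance=True  -> context terms are weighted by 1/distance
                          (an assumed weighting, for illustration only).
    use_distance=False -> every context term has the same weight.
    """
    lo = max(0, target_index - window)
    hi = min(len(tokens), target_index + window + 1)
    best_sense, best_score = None, float("-inf")
    for sense in senses:
        score = 0.0
        for i in range(lo, hi):
            if i == target_index:
                continue  # the target word is not part of its own context
            weight = 1.0 / abs(i - target_index) if use_distance else 1.0
            score += weight * relatedness(sense, tokens[i])
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


tokens = ["virus", "the", "cold", "winter"]
senses = ["cold_temperature", "common_cold"]
with_distance = disambiguate(tokens, 2, senses, use_distance=True)
without_distance = disambiguate(tokens, 2, senses, use_distance=False)
print(with_distance, without_distance)
```

In this toy input, "virus" is two positions away from the target "cold", so the distance-weighted variant halves its contribution and the adjacent term "winter" dominates. With uniform weights, "virus" counts fully and the other sense is selected, which shows how the choice of weighting alone can change the concept assigned to a term.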

Metadata
Title
An Empirical Study of Word Sense Disambiguation for Biomedical Information Retrieval System
Authors
Mohammed Rais
Abdelmonaime Lachkar
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-78723-7_27
