
2015 | Original Paper | Book Chapter

Low Ambiguity First Algorithm: A New Approach to Knowledge-Based Word Sense Disambiguation

Written by: Dongjin Choi, Myunggwon Hwang, Byeongkyu Ko, Sicheon You, Pankoo Kim

Published in: HCI in Business

Publisher: Springer International Publishing


Abstract

Word Sense Disambiguation (WSD) is considered one of the most important and challenging tasks in Natural Language Processing (NLP) research. Although many researchers have applied robust machine learning, statistical techniques, and structural pattern matching approaches, WSD performance still cannot beat human results because of the complexity of human language. To overcome this limitation, knowledge bases such as WordNet have become highly popular among researchers, because they provide not only the definitions of nouns and verbs but also the semantic networks between senses defined by linguists. However, knowledge bases do not cover all the words of human languages, because maintaining and expanding a knowledge base is a huge task that requires a great deal of effort and time. Expanding the knowledge base is not the concern of this paper; rather, its major goal is a new approach that solves the WSD problem using only limited knowledge resources. In this paper, we propose a method, named the low ambiguity first (LAF) algorithm, which, given the words that are already disambiguated, disambiguates the polysemous words with a low degree of ambiguity first, based on the structural semantic interconnections (SSI) approach. The LAF algorithm rests on two hypotheses: first, adjacent words are more semantically relevant than words far away; second, word ambiguity can be measured by the frequency differences between the synsets of a given word in WordNet. The experimental results support these hypotheses and show that the LAF algorithm can improve on the performance of traditional WSD methods.
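The ordering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the ambiguity formula or the SSI scoring, so everything below is an assumption made for illustration. The `Sense` record, the toy inventory and its frequency counts, the `ambiguity` measure (one minus the normalized gap between the two most frequent senses), and the `score` function (shared related concepts weighted by inverse word distance, standing in for SSI) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    name: str           # hypothetical sense identifier
    freq: int           # hypothetical sense frequency (toy numbers, not WordNet data)
    related: frozenset  # toy stand-in for a sense's semantic interconnections


def ambiguity(senses):
    """Assumed measure: a large frequency gap between the two most
    frequent senses means low ambiguity. Monosemous words score 0."""
    if len(senses) == 1:
        return 0.0
    freqs = sorted((s.freq for s in senses), reverse=True)
    total = sum(freqs)
    if total == 0:
        return 1.0
    return 1.0 - (freqs[0] - freqs[1]) / total


def score(sense, position, resolved):
    """Relatedness to already-disambiguated words, weighted by 1/distance
    so that adjacent words count more (hypothesis 1). Not real SSI."""
    total = 0.0
    for other_pos, other in resolved.items():
        overlap = len(sense.related & other.related)
        total += overlap / abs(position - other_pos)
    return total


def laf_disambiguate(words, inventory):
    """Fix monosemous words, then resolve the rest in order of
    increasing ambiguity, reusing every sense chosen so far."""
    resolved = {i: inventory[w][0] for i, w in enumerate(words)
                if len(inventory[w]) == 1}
    pending = [i for i in range(len(words)) if i not in resolved]
    pending.sort(key=lambda i: ambiguity(inventory[words[i]]))
    for i in pending:
        # Ties in relatedness fall back to the more frequent sense.
        resolved[i] = max(inventory[words[i]],
                          key=lambda s: (score(s, i, resolved), s.freq))
    return [resolved[i].name for i in range(len(words))]


# Toy data: every name and number here is made up for the example.
INVENTORY = {
    "bat": [
        Sense("bat.animal", 5, frozenset({"mammal", "wing", "cave", "nocturnal"})),
        Sense("bat.club", 4, frozenset({"baseball", "wood", "hit"})),
    ],
    "cave": [Sense("cave.n", 10, frozenset({"rock", "underground", "cave"}))],
    "wing": [
        Sense("wing.limb", 8, frozenset({"bird", "mammal", "wing", "flight"})),
        Sense("wing.building", 1, frozenset({"building", "hospital"})),
    ],
}
WORDS = ["bat", "cave", "wing"]

print(laf_disambiguate(WORDS, INVENTORY))
```

In this toy run, "cave" is fixed immediately because it has one sense, "wing" is resolved next because its sense frequencies differ sharply, and "bat" is resolved last, by which point both neighbours supply evidence for the animal sense.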

Footnotes
1
Nocturnal mouselike mammal with forelimbs modified to form membranous wings and anatomical adaptations for echolocation by which they navigate.

2
An n-gram is a contiguous sequence of n items from a given sequence of text or speech.

3
WordNet is a lexical database that provides definitions and relations among synonyms, developed by the Cognitive Science Laboratory of Princeton University.

4
Having only a single meaning or sense.

5
The Brown University Standard Corpus of Present-Day American English.

6
The SemCor corpus is a manually sense-tagged corpus created by the WordNet project research team at Princeton University.

7
We simply ran a comparison test using a small number of sentences.

References
1.
Ide, N., Veronis, J.: Introduction to the special issue on word sense disambiguation. Comput. Linguist. 24(1), 2–40 (1998)
2.
Lesk, M.: Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. In: SIGDOC 1986: Proceedings of the 5th Annual International Conference on Systems Documentation, pp. 24–26. ACM, New York (1986)
3.
Navigli, R., Velardi, P.: Structural semantic interconnections: a knowledge-based approach to word sense disambiguation. IEEE Trans. Pattern Anal. Mach. Intell. 27(7), 1075–1086 (2005)
4.
Weiss, S.F.: Learning to disambiguate. Inform. Storage Retrieval 9(1), 33–41 (1973)
5.
Jurafsky, D., Martin, J.: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall, Upper Saddle River (2000)
6.
Pedersen, T.: Unsupervised corpus-based methods for WSD. In: Agirre, E., Edmonds, P. (eds.) Word Sense Disambiguation: Text, Speech and Language Technology, vol. 33, pp. 133–166. Springer, New York (2006)
7.
Hwang, M., Choi, C., Kim, P.: Automatic enrichment of semantic relation network and its application to word sense disambiguation. IEEE Trans. Knowl. Data Eng. 23(6), 845–858 (2011)
8.
Choi, D., Kim, P.: Identifying the most appropriate expansion of acronyms used in Wikipedia text. Softw. Pract. Experience (2014). doi:10.1002/spe.2006
Metadata
Title
Low Ambiguity First Algorithm: A New Approach to Knowledge-Based Word Sense Disambiguation
Written by
Dongjin Choi
Myunggwon Hwang
Byeongkyu Ko
Sicheon You
Pankoo Kim
Copyright Year
2015
DOI
https://doi.org/10.1007/978-3-319-20895-4_52