2018 | Original paper | Book chapter

3. Morphological Disambiguation for Turkish

Written by: Dilek Zeynep Hakkani-Tür, Murat Saraçlar, Gökhan Tür, Kemal Oflazer, Deniz Yuret

Published in: Turkish Natural Language Processing

Publisher: Springer International Publishing

Abstract

Morphological disambiguation is the task of determining the contextually correct morphological parse of each token in a sentence. A morphological disambiguator takes the set of candidate parses that a morphological analyzer generates for each token and selects one parse per token, using statistical and/or linguistic contextual information. For morphologically rich languages, this task can be seen as a generalization of the part-of-speech (POS) tagging problem. The disambiguated morphological analysis is usually crucial for further processing steps such as dependency parsing. In this chapter, we review the morphological disambiguation problem for Turkish and discuss how approaches to it have evolved from manually crafted constraint-based rule systems to systems employing machine learning.
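
The input/output contract described in the abstract can be sketched as follows. This is a minimal illustration, not the method of any system reviewed in the chapter: it assumes a hypothetical pairwise scoring function over adjacent parses and uses a Viterbi-style search to select one parse per token. The parse strings, the `TOY_WEIGHTS` table, and the `toy_score` function are all invented placeholders.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical placeholder weights: (last tag of previous parse, POS of current parse).
TOY_WEIGHTS: Dict[Tuple[str, str], float] = {
    ("Acc", "Verb"): 2.0,
    ("Nom", "Noun"): 1.0,
    ("Nom", "Verb"): 0.5,
}


def toy_score(prev: str, cur: str) -> float:
    """Invented context score; a real system would learn or hand-craft this."""
    prev_last = prev.split("+")[-1]
    parts = cur.split("+")
    cur_pos = parts[1] if len(parts) > 1 else cur
    return TOY_WEIGHTS.get((prev_last, cur_pos), 0.0)


def disambiguate(
    candidates: List[List[str]],
    score: Callable[[str, str], float],
) -> List[str]:
    """Select one parse per token, maximizing the sum of adjacent-pair scores."""
    if not candidates:
        return []
    # best[i][p] = (score of the best path ending in parse p at token i, backpointer)
    best: List[Dict[str, Tuple[float, Optional[str]]]] = [
        {p: (score("<s>", p), None) for p in candidates[0]}
    ]
    for i in range(1, len(candidates)):
        layer: Dict[str, Tuple[float, Optional[str]]] = {}
        for p in candidates[i]:
            prev, total = max(
                ((q, best[i - 1][q][0] + score(q, p)) for q in candidates[i - 1]),
                key=lambda pair: pair[1],
            )
            layer[p] = (total, prev)
        best.append(layer)
    # Backtrack from the highest-scoring final parse.
    last = max(best[-1], key=lambda p: best[-1][p][0])
    path = [last]
    for i in range(len(candidates) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]


# Placeholder analyzer output: two tokens, each with two candidate parses.
candidates = [
    ["w1+Noun+Nom", "w1+Noun+Acc"],
    ["w2+Noun+Nom", "w2+Verb+Past"],
]
print(disambiguate(candidates, toy_score))
```

Searching over whole sequences rather than choosing each token's parse in isolation lets the choice for one token depend on its neighbors, which is the sense in which the task generalizes POS tagging.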

Footnotes
1. See Chap. 2 for many additional examples of morphological ambiguity.
 
Metadata
Title
Morphological Disambiguation for Turkish
Written by
Dilek Zeynep Hakkani-Tür
Murat Saraçlar
Gökhan Tür
Kemal Oflazer
Deniz Yuret
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-90165-7_3
