2014 | OriginalPaper | Book chapter

Marathi Parts-of-Speech Tagger Using Supervised Learning

Authors: Jyoti Singh, Nisheeth Joshi, Iti Mathur

Published in: Intelligent Computing, Networking, and Informatics

Publisher: Springer India

Abstract

In this paper, we present a parts-of-speech (POS) tagger for Marathi, a language with rich inflectional and derivational morphology. Marathi is spoken by the native people of Maharashtra. The tagger follows a statistical approach based on the hidden Markov model (HMM), and we establish a methodology for POS tagging of Marathi using it. The main idea of an HMM is to compute probabilities in order to determine the most likely sequence of tags for an observed sequence of words. We describe the development of the tagger and report its evaluation.
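
The idea of choosing the best tag sequence for an observed word sequence can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a first-order (bigram) HMM, add-one smoothing, and Viterbi decoding, none of which the abstract specifies. The function names `train_hmm` and `viterbi`, the start symbol `<s>`, the tag names, and the tiny English toy corpus are all hypothetical and stand in for a real tagged Marathi corpus.

```python
import math
from collections import defaultdict


def train_hmm(tagged_sentences, alpha=1.0):
    """Estimate smoothed transition and emission log-probabilities by counting."""
    trans = defaultdict(lambda: defaultdict(int))  # trans[prev_tag][tag] counts
    emit = defaultdict(lambda: defaultdict(int))   # emit[tag][word] counts
    vocab, tags = set(), set()
    for sentence in tagged_sentences:
        prev = "<s>"  # assumed sentence-start symbol
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            tags.add(tag)
            prev = tag
    tags = sorted(tags)
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot reserved for unknown words

    def log_trans(prev, tag):
        total = sum(trans[prev].values())
        return math.log((trans[prev][tag] + alpha) / (total + alpha * n_tags))

    def log_emit(tag, word):
        total = sum(emit[tag].values())
        return math.log((emit[tag][word] + alpha) / (total + alpha * n_words))

    return tags, log_trans, log_emit


def viterbi(words, tags, log_trans, log_emit):
    """Return the most probable tag sequence for `words` under the HMM."""
    if not words:
        return []
    # best[i][t]: best log-probability of tagging words[:i+1] with last tag t
    best = [{t: log_trans("<s>", t) + log_emit(t, words[0]) for t in tags}]
    back = [{}]  # back[i][t]: previous tag on that best path
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + log_trans(p, t)) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + log_emit(t, words[i])
            back[i][t] = prev
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Hypothetical toy corpus (English placeholders, not data from the paper).
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
tags, log_trans, log_emit = train_hmm(corpus)
predicted = viterbi(["the", "cat", "runs"], tags, log_trans, log_emit)
print(predicted)  # ['DET', 'NOUN', 'VERB']
```

Working in log-probabilities avoids numerical underflow on long sentences, and the smoothing keeps unseen words or tag transitions from receiving zero probability. A tagger for a morphologically rich language would likely need a better treatment of unknown words than this sketch provides.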

Metadata
Title
Marathi Parts-of-Speech Tagger Using Supervised Learning
Authors
Jyoti Singh
Nisheeth Joshi
Iti Mathur
Copyright year
2014
Publisher
Springer India
DOI
https://doi.org/10.1007/978-81-322-1665-0_24