2018 | OriginalPaper | Book chapter

Vietnamese Part of Speech Tagging Based on Multi-category Words Disambiguation Model

Authors: Zhao Chen, Liu Yanchao, Guo Jianyi, Chen Wei, Yan Xin, Yu Zhengtao, Chen Xiuqin

Published in: Natural Language Processing and Chinese Computing

Publisher: Springer International Publishing

Abstract

POS tagging is a fundamental task in Natural Language Processing that determines the quality of subsequent processing, and the ambiguity of multi-category words directly affects the accuracy of Vietnamese POS tagging. POS tagging of English and Chinese has already achieved good results, but the accuracy of Vietnamese POS tagging still needs improvement. To address this problem, this paper proposes a novel method of Vietnamese POS tagging based on a multi-category word disambiguation model and a part-of-speech dictionary. A multi-category word dictionary and a non-multi-category word dictionary are generated from the Vietnamese dictionary and used to build a POS tagging corpus. From this corpus, 396,946 multi-category words are extracted, and a maximum entropy disambiguation model of Vietnamese part of speech is constructed using statistical methods. Based on this model, the multi-category words and the non-multi-category words are tagged. Experimental results show that the proposed method achieves higher accuracy than the existing model, which indicates that the method is feasible and effective.
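
The two-stage idea in the abstract (dictionary lookup for non-multi-category words, a maximum entropy model for multi-category words) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy dictionaries, the tag names, the context features (`features`), the tiny gradient-trained classifier (`MaxEnt`), and the two training sentences.

```python
import math
from collections import defaultdict

def features(words, i):
    """Hypothetical context features for the word at position i."""
    prev_w = words[i - 1] if i > 0 else "<s>"
    next_w = words[i + 1] if i + 1 < len(words) else "</s>"
    return [f"w={words[i]}", f"prev={prev_w}", f"next={next_w}"]

class MaxEnt:
    """Tiny maximum entropy (multinomial logistic regression) classifier."""

    def __init__(self, tags):
        self.tags = sorted(tags)
        self.w = defaultdict(float)  # one weight per (feature, tag) pair

    def probs(self, feats, candidates=None):
        """Softmax over the candidate tags (all tags if none are given)."""
        tags = candidates or self.tags
        scores = {t: sum(self.w[(f, t)] for f in feats) for t in tags}
        m = max(scores.values())  # subtract the max for numerical stability
        exp = {t: math.exp(s - m) for t, s in scores.items()}
        z = sum(exp.values())
        return {t: e / z for t, e in exp.items()}

    def train(self, samples, epochs=50, lr=0.5):
        """Stochastic gradient ascent on the conditional log-likelihood."""
        for _ in range(epochs):
            for feats, gold in samples:
                p = self.probs(feats)
                for t in self.tags:
                    grad = (1.0 if t == gold else 0.0) - p[t]
                    for f in feats:
                        self.w[(f, t)] += lr * grad

def tag(words, single, multi, model):
    """Dictionary lookup for unambiguous words, MaxEnt for ambiguous ones."""
    out = []
    for i, w in enumerate(words):
        if w in single:
            out.append(single[w])
        elif w in multi:
            p = model.probs(features(words, i), candidates=multi[w])
            out.append(max(p, key=p.get))
        else:
            out.append("UNK")  # word missing from both toy dictionaries
    return out

# Toy dictionaries (assumed for illustration; tags are placeholders).
single = {"tôi": "P", "bóng": "N", "viên": "Nc", "này": "D"}
multi = {"đá": ["N", "V"]}  # treated here as noun or verb

# Toy tagged sentences; only multi-category words become training samples.
train_sents = [
    (["tôi", "đá", "bóng"], ["P", "V", "N"]),
    (["viên", "đá", "này"], ["Nc", "N", "D"]),
]
samples = [
    (features(ws, i), ts[i])
    for ws, ts in train_sents
    for i, w in enumerate(ws)
    if w in multi
]

model = MaxEnt({"N", "V"})
model.train(samples)
print(tag(["tôi", "đá", "bóng"], single, multi, model))
```

Restricting the softmax to the tags listed for a word in the multi-category dictionary keeps the model from proposing a tag the dictionary does not allow. This is one plausible way to combine the two resources, not necessarily the authors' design.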

Metadata
Title
Vietnamese Part of Speech Tagging Based on Multi-category Words Disambiguation Model
Authors
Zhao Chen
Liu Yanchao
Guo Jianyi
Chen Wei
Yan Xin
Yu Zhengtao
Chen Xiuqin
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-73618-1_23