Published in: International Journal of Machine Learning and Cybernetics 4/2015

1 August 2015 | Original Article

A bilingual word alignment algorithm of Vietnamese-Chinese based on feature constraint

Authors: Yuanyuan Mo, Jianyi Guo, Zhengtao Yu, Lin Luo, Shengxiang Gao

Abstract

Automatic word alignment between Vietnamese and Chinese is difficult because the two languages differ considerably in syntax and structure. We therefore present a novel method for Vietnamese-Chinese word alignment that merges several feature constraint models. The article describes an improved model based on the Vietnamese-Chinese progressive structure and on offset features of the word sequence. The model is trained in a log-linear framework, with its parameters tuned by the minimum error rate algorithm, and is then used to produce the Vietnamese-Chinese alignment. The base model in the experiments is IBM Model 3. The experimental results suggest that the method performs well: precision and recall increase by 28.57 % and 25.02 %, respectively, and the alignment error rate (AER) falls by 14.25 %.
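
For readers unfamiliar with the reported measures, the following is a minimal sketch of how precision, recall and AER are commonly computed for word alignment. It assumes the usual definitions based on "sure" and "possible" gold links; the abstract does not state which definitions or annotation scheme the authors used, and the function name and the toy links below are hypothetical.

```python
from typing import Set, Tuple

# A link pairs a source word index with a target word index.
Link = Tuple[int, int]


def alignment_metrics(
    predicted: Set[Link], sure: Set[Link], possible: Set[Link]
) -> Tuple[float, float, float]:
    """Return (precision, recall, AER) for one set of predicted links.

    Assumed definitions (not taken from the article):
      precision = |A & P| / |A|
      recall    = |A & S| / |S|
      AER       = 1 - (|A & S| + |A & P|) / (|A| + |S|)
    where A = predicted links, S = sure gold links, P = possible gold links.
    """
    if not predicted or not sure:
        raise ValueError("predicted and sure link sets must be non-empty")
    # Every sure link also counts as a possible link.
    possible = possible | sure
    hits_sure = len(predicted & sure)
    hits_possible = len(predicted & possible)
    precision = hits_possible / len(predicted)
    recall = hits_sure / len(sure)
    aer = 1.0 - (hits_sure + hits_possible) / (len(predicted) + len(sure))
    return precision, recall, aer


# Toy example with made-up links, for illustration only.
gold_sure = {(0, 0), (1, 1)}
gold_possible = {(2, 1)}
system = {(0, 0), (1, 1), (2, 1), (2, 2)}
precision, recall, aer = alignment_metrics(system, gold_sure, gold_possible)
print(f"precision={precision:.2f} recall={recall:.2f} AER={aer:.4f}")
```

Lower AER is better, which is why the abstract reports a reduction in AER alongside increases in precision and recall.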

Metadata
Title
A bilingual word alignment algorithm of Vietnamese-Chinese based on feature constraint
Authors
Yuanyuan Mo
Jianyi Guo
Zhengtao Yu
Lin Luo
Shengxiang Gao
Publication date
1 August 2015
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 4/2015
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-014-0293-6
