Skip to main content

2018 | OriginalPaper | Buchkapitel

Combining Machine Translated Sentence Chunks from Multiple MT Systems

verfasst von : Matīss Rikters, Inguna Skadiņa

Erschienen in: Computational Linguistics and Intelligent Text Processing

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

This paper presents a hybrid machine translation (HMT) system that pursues syntactic analysis to acquire phrases of source sentences, translates the phrases using multiple online machine translation (MT) system application program interfaces (APIs) and generates output by combining translated chunks to obtain the best possible translation. The aim of this study is to improve translation quality of English – Latvian texts over each of the individual MT APIs. The selection of the best translation hypothesis is done by calculating the perplexity for each hypothesis using an n-gram language model. The result is a phrase-based multi-system machine translation system that allows to improve MT output compared to individual online MT systems. The proposed approach show improvement up to +1.48 points in BLEU and −0.015 in TER scores compared to the baselines and related research.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
Latvian public administration machine translation service - http://​hugo.​lv.
 
5
Latvian public administration machine translation service API - http://​hugo.​lv/​TranslationAPI.
 
Literatur
1.
Zurück zum Zitat Vasiļjevs, A., Kalniņš, R., Pinnis, M., Skadiņš, R.: Machine translation for e-Government - the Baltic case. In: Proceedings of AMTA 2014, vol. 2: MT Users, pp. 181–193 (2014) Vasiļjevs, A., Kalniņš, R., Pinnis, M., Skadiņš, R.: Machine translation for e-Government - the Baltic case. In: Proceedings of AMTA 2014, vol. 2: MT Users, pp. 181–193 (2014)
2.
Zurück zum Zitat Skadiņš, R., Šics, V., Rozis, R.: Building the world’s best general domain MT for Baltic languages. In: Human Language Technologies – The Baltic Perspective, Proceedings of the Sixth International Conference Baltic HLT 2014, pp. 141–148. IOS Press (2014) Skadiņš, R., Šics, V., Rozis, R.: Building the world’s best general domain MT for Baltic languages. In: Human Language Technologies – The Baltic Perspective, Proceedings of the Sixth International Conference Baltic HLT 2014, pp. 141–148. IOS Press (2014)
3.
Zurück zum Zitat Costa-Jussa, M.R., Fonollosa, J.A.R.: Latest trends in hybrid machine translation and its applications. Comput. Speech Lang. 32(1), 3–10 (2015)CrossRef Costa-Jussa, M.R., Fonollosa, J.A.R.: Latest trends in hybrid machine translation and its applications. Comput. Speech Lang. 32(1), 3–10 (2015)CrossRef
4.
Zurück zum Zitat Rikters, M., Skadiņa, I.: Syntax-based multi-system machine translation. In: LREC 2016 (2016) Rikters, M., Skadiņa, I.: Syntax-based multi-system machine translation. In: LREC 2016 (2016)
5.
Zurück zum Zitat Thurmair, G.: Comparing different architectures of hybrid machine translation systems. In: Proceedings of the MT Summit XII, pp. 340–347 (2009) Thurmair, G.: Comparing different architectures of hybrid machine translation systems. In: Proceedings of the MT Summit XII, pp. 340–347 (2009)
6.
Zurück zum Zitat Mellebeek, B., Owczarzak, K., Van Genabith, J., Way, A.: Multi-engine machine translation by recursive sentence decomposition. In: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, pp. 110–118 (2006) Mellebeek, B., Owczarzak, K., Van Genabith, J., Way, A.: Multi-engine machine translation by recursive sentence decomposition. In: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, pp. 110–118 (2006)
7.
Zurück zum Zitat Ahsan, A., Kolachina, P.: Coupling statistical machine translation with rule-based transfer and generation. In: AMTA-The Ninth Conference of the Association for Machine Translation in the Americas, Denver, Colorado (2010) Ahsan, A., Kolachina, P.: Coupling statistical machine translation with rule-based transfer and generation. In: AMTA-The Ninth Conference of the Association for Machine Translation in the Americas, Denver, Colorado (2010)
8.
Zurück zum Zitat Barrault, L.: MANY: open source machine translation system combination. Prague Bull. Math. Linguist. 93, 147–155 (2010)CrossRef Barrault, L.: MANY: open source machine translation system combination. Prague Bull. Math. Linguist. 93, 147–155 (2010)CrossRef
9.
Zurück zum Zitat Hildebrand, A.S., Vogel, St.: CMU system combination for WMT’09. In: Proceedings of the 4th Workshop on SMT, Athens (2009) Hildebrand, A.S., Vogel, St.: CMU system combination for WMT’09. In: Proceedings of the 4th Workshop on SMT, Athens (2009)
10.
Zurück zum Zitat Rikters, M.: Multi-system machine translation using online APIs for English-Latvian. In: ACL-IJCNLP 2015, p. 6 (2015) Rikters, M.: Multi-system machine translation using online APIs for English-Latvian. In: ACL-IJCNLP 2015, p. 6 (2015)
11.
Zurück zum Zitat Heafield, K., Hanneman, Gr., Lavie, A.: Machine translation system combination with flexible word ordering. In: Proceedings of the 4th Workshop on SMT, Athens (2009) Heafield, K., Hanneman, Gr., Lavie, A.: Machine translation system combination with flexible word ordering. In: Proceedings of the 4th Workshop on SMT, Athens (2009)
12.
Zurück zum Zitat Chen, Y., Jellinghaus, M., Eisele, A., Yi, Zh., Hunsicker, S., Theison, S., Federmann, Ch., Uszkoreit, H.: Combining multi-engine translations with Moses. In: Proceedings of the 4th Workshop on SMT, Athens (2009) Chen, Y., Jellinghaus, M., Eisele, A., Yi, Zh., Hunsicker, S., Theison, S., Federmann, Ch., Uszkoreit, H.: Combining multi-engine translations with Moses. In: Proceedings of the 4th Workshop on SMT, Athens (2009)
13.
Zurück zum Zitat Feng, Y., Liu, Y., Mi, H., Liu, Q., Lü, Y.: Lattice-based system combination for statistical machine translation. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3, vol. 3. Association for Computational Linguistics (2009) Feng, Y., Liu, Y., Mi, H., Liu, Q., Lü, Y.: Lattice-based system combination for statistical machine translation. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3, vol. 3. Association for Computational Linguistics (2009)
14.
Zurück zum Zitat Freitag, M., Peitz, S., Wuebker, J., Ney, H., Huck, M., Sennrich, R., Durrani, N., Nadejde, M., Williams, P., Koehn, P., Herrmann, T., Cho, E., Waibel, A.: EU-BRIDGE MT: combined machine translation. In: ACL 2014 Ninth Workshop on Statistical Machine Translation (WMT 2014), Baltimore, MD, USA, pp. 105–113 (2014) Freitag, M., Peitz, S., Wuebker, J., Ney, H., Huck, M., Sennrich, R., Durrani, N., Nadejde, M., Williams, P., Koehn, P., Herrmann, T., Cho, E., Waibel, A.: EU-BRIDGE MT: combined machine translation. In: ACL 2014 Ninth Workshop on Statistical Machine Translation (WMT 2014), Baltimore, MD, USA, pp. 105–113 (2014)
15.
Zurück zum Zitat Freitag, M., Peter, J., Peitz, S., Feng, M., Ney, H.: Local system voting feature for machine translation system combination. In: EMNLP 2015 Tenth Workshop on Statistical Machine Translation (WMT 2015), Lisbon, Portugal, pp. 467–476 (2015) Freitag, M., Peter, J., Peitz, S., Feng, M., Ney, H.: Local system voting feature for machine translation system combination. In: EMNLP 2015 Tenth Workshop on Statistical Machine Translation (WMT 2015), Lisbon, Portugal, pp. 467–476 (2015)
16.
Zurück zum Zitat Petrov, S., Barrett, L., Thibaux, R., Klein, D.: Learning accurate, compact, and interpretable tree annotation. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics (2006) Petrov, S., Barrett, L., Thibaux, R., Klein, D.: Learning accurate, compact, and interpretable tree annotation. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics (2006)
17.
Zurück zum Zitat Heafield, K.: KenLM: faster and smaller language model queries. In: Proceedings of the Sixth Workshop on Statistical Machine Translation. Association for Computational Linguistics (2011) Heafield, K.: KenLM: faster and smaller language model queries. In: Proceedings of the Sixth Workshop on Statistical Machine Translation. Association for Computational Linguistics (2011)
18.
Zurück zum Zitat Gamon, M., Aue, A., Smets, M.: Sentence-level MT evaluation without reference translations: beyond language modeling. In: Proceedings of EAMT (2005) Gamon, M., Aue, A., Smets, M.: Sentence-level MT evaluation without reference translations: beyond language modeling. In: Proceedings of EAMT (2005)
19.
Zurück zum Zitat Callison-Burch, C., Flournoy, R.S.: A program for automatically selecting the best output from multiple machine translation engines. In: Proceedings of the Machine Translation Summit VIII (2001) Callison-Burch, C., Flournoy, R.S.: A program for automatically selecting the best output from multiple machine translation engines. In: Proceedings of the Machine Translation Summit VIII (2001)
20.
Zurück zum Zitat Akiba, Y., Watanabe, T., Sumita, E.: Using language and translation models to select the best among outputs from multiple MT systems. In: Proceedings of the 19th International Conference on Computational Linguistics, vol. 1. Association for Computational Linguistics (2002) Akiba, Y., Watanabe, T., Sumita, E.: Using language and translation models to select the best among outputs from multiple MT systems. In: Proceedings of the 19th International Conference on Computational Linguistics, vol. 1. Association for Computational Linguistics (2002)
21.
Zurück zum Zitat Steinberger, R., Pouliquen, B., Widiger, A., Ignat, C., Erjavec, T., Tufis, D., Varga, D.: The JRC-Acquis: a multilingual aligned parallel corpus with 20+ languages. arXiv preprint cs/0609058 (2006) Steinberger, R., Pouliquen, B., Widiger, A., Ignat, C., Erjavec, T., Tufis, D., Varga, D.: The JRC-Acquis: a multilingual aligned parallel corpus with 20+ languages. arXiv preprint cs/​0609058 (2006)
22.
Zurück zum Zitat Steinberger, R., Eisele, A., Klocek, S., Pilos, S., Schlüter, P.: DGT-TM: a freely available translation memory in 22 languages. arXiv preprint arXiv:1309.5226 (2013) Steinberger, R., Eisele, A., Klocek, S., Pilos, S., Schlüter, P.: DGT-TM: a freely available translation memory in 22 languages. arXiv preprint arXiv:​1309.​5226 (2013)
23.
Zurück zum Zitat Skadiņš, R., Goba, K., Šics, V.: Improving SMT for Baltic languages with factored models. In: Proceedings of the Fourth International Conference Baltic HLT 2010. Frontiers in Artificial Intelligence and Applications, vol. 2192, pp. 125–132 (2010) Skadiņš, R., Goba, K., Šics, V.: Improving SMT for Baltic languages with factored models. In: Proceedings of the Fourth International Conference Baltic HLT 2010. Frontiers in Artificial Intelligence and Applications, vol. 2192, pp. 125–132 (2010)
24.
Zurück zum Zitat Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics (2002) Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics (2002)
25.
Zurück zum Zitat Doddington, G.: Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In: Proceedings of the Second International Conference on Human Language Technology Research. Morgan Kaufmann Publishers Inc. (2002) Doddington, G.: Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In: Proceedings of the Second International Conference on Human Language Technology Research. Morgan Kaufmann Publishers Inc. (2002)
26.
Zurück zum Zitat Klejch, O., Avramidis, E., Burchardt, A., Popel, M.: MT-ComparEval: graphical evaluation interface for machine translation development. Prague Bull. Math. Linguist. 104(1), 63–74 (2015)CrossRef Klejch, O., Avramidis, E., Burchardt, A., Popel, M.: MT-ComparEval: graphical evaluation interface for machine translation development. Prague Bull. Math. Linguist. 104(1), 63–74 (2015)CrossRef
27.
Zurück zum Zitat Madnani, N.: iBLEU: interactively debugging and scoring statistical machine translation systems. In: 2011 Fifth IEEE International Conference on Semantic Computing (ICSC). IEEE (2011) Madnani, N.: iBLEU: interactively debugging and scoring statistical machine translation systems. In: 2011 Fifth IEEE International Conference on Semantic Computing (ICSC). IEEE (2011)
Metadaten
Titel
Combining Machine Translated Sentence Chunks from Multiple MT Systems
verfasst von
Matīss Rikters
Inguna Skadiņa
Copyright-Jahr
2018
DOI
https://doi.org/10.1007/978-3-319-75487-1_3

Premium Partner