
2020 | Original Paper | Chapter

Improving Unsupervised Neural Machine Translation with Dependency Relationships

Authors: Jia Xu, Na Ye, GuiPing Zhang

Published in: Natural Language Processing and Chinese Computing

Publisher: Springer International Publishing


Abstract

Neural networks are now widely used in machine translation (MT) and have achieved good results. Neural machine translation (NMT) models require large bilingual parallel corpora for training; however, for many languages and domains such corpora are scarce. Unsupervised neural machine translation (UNMT), which does not need bilingual parallel corpora, has therefore attracted wide interest. State-of-the-art UNMT models use the Transformer for training and cannot learn syntactic knowledge from the corpora. In this paper, we propose a method to improve UNMT using dependency relationships extracted by dependency parsing. The extracted dependency relationships are concatenated with the original training data after Byte Pair Encoding (BPE) to obtain new sentence representations for UNMT training. Models that incorporate dependency relationships gain a better understanding of the underlying syntactic structure of sentences and thus improve the quality of UNMT. We leverage linearized parse trees of the training sentences to incorporate syntax into the Transformer architecture without modifying it. Compared with the state-of-the-art UNMT method, our method increased BLEU scores by 5.11 and 9.41 respectively on the WMT 2019 English-French and German-English monolingual news corpora with 5 million sentence pairs.
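The abstract describes concatenating linearized dependency relations with the BPE-segmented training sentences to form new input representations. The sketch below illustrates one plausible reading of that step; the parse is hard-coded for illustration (in practice a dependency parser would produce it), and the separator token and `head|relation|dependent` serialization are assumptions, not the paper's exact format.

```python
def linearize_with_deps(bpe_tokens, dep_relations, sep="<dep>"):
    """Append linearized dependency relations to a BPE-segmented sentence.

    bpe_tokens:    list of subword tokens after BPE segmentation.
    dep_relations: list of (head, relation, dependent) triples from a
                   dependency parse of the original (pre-BPE) sentence.
    Returns a single token sequence: subwords, a separator, then one
    serialized token per dependency relation.
    """
    rel_tokens = [f"{head}|{rel}|{dep}" for head, rel, dep in dep_relations]
    return bpe_tokens + [sep] + rel_tokens


# Hypothetical example: "the cat sleeps" after BPE, with its parse.
bpe = ["the", "ca@@", "t", "sleeps"]
deps = [("sleeps", "nsubj", "cat"),   # "cat" is the subject of "sleeps"
        ("cat", "det", "the")]        # "the" is the determiner of "cat"

print(" ".join(linearize_with_deps(bpe, deps)))
```

The augmented sequence can then be fed to an unmodified Transformer, since the syntax is injected purely through the input tokens rather than through architectural changes.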


Metadata

DOI: https://doi.org/10.1007/978-3-030-60450-9_34