
2020 | Original Paper | Chapter

Improving Unsupervised Neural Machine Translation with Dependency Relationships

Authors: Jia Xu, Na Ye, GuiPing Zhang

Published in: Natural Language Processing and Chinese Computing

Publisher: Springer International Publishing


Abstract

Neural networks are now widely used in machine translation (MT) and have achieved good results. Neural machine translation (NMT) models require large bilingual parallel corpora for training; however, for many languages and domains such corpora are scarce. Unsupervised neural machine translation (UNMT), which does not need bilingual parallel corpora, has therefore attracted wide interest. State-of-the-art UNMT models use the Transformer for training and cannot learn syntactic knowledge from the corpora. In this paper, we propose a method to improve UNMT using dependency relationships extracted by dependency parsing. The extracted dependency relationships are concatenated with the original training data after Byte Pair Encoding (BPE) to obtain new sentence representations for UNMT training. Models that incorporate dependency relationships gain a better understanding of the underlying syntactic structure of sentences and thus improve the quality of UNMT. We leverage linearized parse trees of the training sentences to incorporate syntax into the Transformer architecture without modifying it. Compared with the state-of-the-art UNMT method, our method increased BLEU scores by 5.11 and 9.41 respectively on the WMT 2019 English-French and German-English monolingual news corpora with 5 million sentence pairs.
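The abstract describes concatenating linearized dependency relations with the BPE-segmented training sentences to form new input representations. The sketch below illustrates one plausible reading of that step; the parse is hard-coded for illustration (in practice a dependency parser would produce it), and the separator token and `head|relation|dependent` serialization are assumptions, not the paper's exact format.

```python
def linearize_with_deps(bpe_tokens, dep_relations, sep="<dep>"):
    """Append linearized dependency relations to a BPE-segmented sentence.

    bpe_tokens:    list of subword tokens after BPE segmentation.
    dep_relations: list of (head, relation, dependent) triples from a
                   dependency parse of the original (pre-BPE) sentence.
    Returns a single token sequence: subwords, a separator, then one
    serialized token per dependency relation.
    """
    rel_tokens = [f"{head}|{rel}|{dep}" for head, rel, dep in dep_relations]
    return bpe_tokens + [sep] + rel_tokens


# Hypothetical example: "the cat sleeps" after BPE, with its parse.
bpe = ["the", "ca@@", "t", "sleeps"]
deps = [("sleeps", "nsubj", "cat"),   # "cat" is the subject of "sleeps"
        ("cat", "det", "the")]        # "the" is the determiner of "cat"

print(" ".join(linearize_with_deps(bpe, deps)))
```

The augmented sequence can then be fed to an unmodified Transformer, since the syntax is injected purely through the input tokens rather than through architectural changes.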


Metadata

DOI: https://doi.org/10.1007/978-3-030-60450-9_34