A Survey of Multilingual Neural Machine Translation

Authors:
Raj Dabre

National Institute of Information and Communications Technology (NICT), Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan

National Institute of Information and Communications Technology (NICT), Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan
View Profile

,
Chenhui Chu

Osaka University, Yamadaoka, Suita, Osaka, Japan

Osaka University, Yamadaoka, Suita, Osaka, Japan
View Profile

,
Anoop Kunchukuttan

Microsoft, Hyderabad, Telangana, India

Microsoft, Hyderabad, Telangana, India
View Profile

Authors Info & Claims

ACM Computing Surveys Volume 53 Issue 5Article No.: 99pp 1–38https://doi.org/10.1145/3406095

Published:28 September 2020Publication History

ACM Computing Surveys

Abstract

We present a survey on multilingual neural machine translation (MNMT), which has gained a lot of traction in recent years. MNMT has been useful in improving translation quality as a result of translation knowledge transfer (transfer learning). MNMT is more promising and interesting than its statistical machine translation counterpart, because end-to-end modeling and distributed representations open new avenues for research on machine translation. Many approaches have been proposed to exploit multilingual parallel corpora for improving translation quality. However, the lack of a comprehensive survey makes it difficult to determine which approaches are promising and, hence, deserve further exploration. In this article, we present an in-depth survey of existing literature on MNMT. We first categorize various approaches based on their central use-case and then further categorize them based on resource scenarios, underlying modeling principles, core-issues, and challenges. Wherever possible, we address the strengths and weaknesses of several techniques by comparing them with each other. We also discuss the future directions for MNMT. This article is aimed towards both beginners and experts in NMT. We hope this article will serve as a starting point as well as a source of new ideas for researchers and engineers interested in MNMT.

References

Željko Agić and Ivan Vulić. 2019. JW300: A wide-coverage parallel corpus for low-resource languages. In Proceedings of the 57th Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 3204--3210. DOI:https://doi.org/10.18653/v1/P19-1310Google Scholar
Roee Aharoni, Melvin Johnson, and Orhan Firat. 2019. Massively multilingual neural machine translation. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, 3874--3884. Retrieved from https://www.aclweb.org/anthology/N19-1388.Google ScholarCross Ref
Maruan Al-Shedivat and Ankur Parikh. 2019. Consistency by agreement in zero-shot neural machine translation. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, 1184--1197. Retrieved from https://www.aclweb.org/anthology/N19-1121.Google ScholarCross Ref
Naveen Arivazhagan, Ankur Bapna, Orhan Firat, Roee Aharoni, Melvin Johnson, and Wolfgang Macherey. 2019. The missing ingredient in zero-shot neural machine translation. CoRR abs/1903.07091 (2019).Google Scholar
Naveen Arivazhagan, Ankur Bapna, Orhan Firat, Dmitry Lepikhin, Melvin Johnson, Maxim Krikun, Mia Xu Chen, Yuan Cao, George Foster, Colin Cherry, Wolfgang Macherey, Zhifeng Chen, and Yonghui Wu. 2019. Massively multilingual neural machine translation in the wild: Findings and challenges. CoRR abs/1907.05019 (2019).Google Scholar
Mikel Artetxe, Gorka Labaka, and Eneko Agirre. 2016. Learning principled bilingual mappings of word embeddings while preserving monolingual invariance. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2289--2294. DOI:https://doi.org/10.18653/v1/D16-1250Google ScholarCross Ref
Mikel Artetxe and Holger Schwenk. 2019. Massively multilingual sentence embeddings for zero-shot cross-lingual transfer and beyond. Trans. Assoc. Comput. Ling. 7 (2019), 597--610.Google ScholarCross Ref
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proceedings of the 3rd International Conference on Learning Representations (ICLR’15). Retrieved from http://arxiv.org/abs/1409.0473.Google Scholar
Laura Banarescu, Claire Bonial, Shu Cai, Madalina Georgescu, Kira Griffitt, Ulf Hermjakob, Kevin Knight, Philipp Koehn, Martha Palmer, and Nathan Schneider. 2013. Abstract meaning representation for sembanking. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse. Association for Computational Linguistics, 178--186. Retrieved from http://www.aclweb.org/anthology/W13-2322.Google Scholar
Tamali Banerjee, Anoop Kunchukuttan, and Pushpak Bhattacharya. 2018. Multilingual Indian language translation system at WAT 2018: Many-to-one phrase-based SMT. In Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation. Association for Computational Linguistics. Retrieved from https://www.aclweb.org/anthology/Y18-3013.Google Scholar
Ankur Bapna and Orhan Firat. 2019. Simple, scalable adaptation for neural machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP’19). Association for Computational Linguistics, 1538--1548. DOI:https://doi.org/10.18653/v1/D19-1165Google ScholarCross Ref
Emmanuel Bengio, Pierre-Luc Bacon, Joelle Pineau, and Doina Precup. 2016. Conditional computation in neural networks for faster models. In Proceedings of the International Conference on Learning Representations (ICLR’16) Workshop Track.Google Scholar
Graeme Blackwood, Miguel Ballesteros, and Todd Ward. 2018. Multilingual neural machine translation with task-specific attention. In Proceedings of the 27th International Conference on Computational Linguistics. Association for Computational Linguistics, 3112--3122. Retrieved from http://aclweb.org/anthology/C18-1263.Google Scholar
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Shujian Huang, Matthias Huck, Philipp Koehn, Qun Liu, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Raphael Rubino, Lucia Specia, and Marco Turchi. 2017. Findings of the 2017 conference on machine translation (WMT’17). In Proceedings of the 2nd Conference on Machine Translation. Association for Computational Linguistics, 169--214. Retrieved from http://www.aclweb.org/anthology/W17-4717.Google Scholar
Ondřej Bojar, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Philipp Koehn, and Christof Monz. 2018. Findings of the 2018 conference on machine translation (WMT’18). In Proceedings of the 3rd Conference on Machine Translation: Shared Task Papers. Association for Computational Linguistics, 272--303. Retrieved from http://aclweb.org/anthology/W18-6401.Google Scholar
Mauro Cettolo, Marcello Federico, Luisa Bentivogli, Jan Niehues, Sebastian Stüker, Katsuhito Sudoh, Koichiro Yoshino, and Christian Federmann. 2017. Overview of the IWSLT 2017 evaluation campaign. In Proceedings of the 14th International Workshop on Spoken Language Translation. 2--14.Google Scholar
Sarath Chandar, Stanislas Lauly, Hugo Larochelle, Mitesh Khapra, Balaraman Ravindran, Vikas C. Raykar, and Amrita Saha. 2014. An autoencoder approach to learning bilingual word representations. In Proceedings of the Conference on Advances in Neural Information Processing Systems. 1853--1861.Google Scholar
Rajen Chatterjee, M. Amin Farajian, Matteo Negri, Marco Turchi, Ankit Srivastava, and Santanu Pal. 2017. Multi-source neural automatic post-editing: FBK’s participation in the WMT 2017 APE shared task. In Proceedings of the 2nd Conference on Machine Translation. Association for Computational Linguistics. 630--638. DOI:https://doi.org/10.18653/v1/W17-4773Google Scholar
Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard Hovy, Alan W. Black, Jaime Carbonell, Graham V. Horwood, Shabnam Tafreshi, Mona Diab, Efsun S. Kayi, Noura Farra, and Kathleen McKeown. 2019. The ARIEL-CMU systems for LoReHLT18. CoRR abs/1902.08899 (2019).Google Scholar
Xilun Chen, Ahmed Hassan Awadallah, Hany Hassan, Wei Wang, and Claire Cardie. 2019. Multi-source cross-lingual model transfer: Learning what to share. In Proceedings of the 57th Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 3098--3112. DOI:https://doi.org/10.18653/v1/P19-1299Google ScholarCross Ref
Yun Chen, Yang Liu, Yong Cheng, and Victor O. K. Li. 2017. A teacher-student framework for zero-resource neural machine translation. In Proceedings of the 55th Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 1925--1935. DOI:https://doi.org/10.18653/v1/P17-1176Google Scholar
Yun Chen, Yang Liu, and Victor O. K. Li. 2018. Zero-resource neural machine translation with multi-agent communication game. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence. AAAI Press, 5086--5093.Google Scholar
Yong Cheng, Qian Yang, Yang Liu, Maosong Sun, and Wei Xu. 2017. Joint training for pivot-based neural machine translation. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI’17). 3974--3980. DOI:https://doi.org/10.24963/ijcai.2017/555Google ScholarDigital Library
Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. On the properties of neural machine translation: Encoder--decoder approaches. In Proceedings of the 8th Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST’14). Association for Computational Linguistics, 103--111. DOI:https://doi.org/10.3115/v1/W14-4012Google ScholarCross Ref
Gyu Hyeon Choi, Jong Hun Shin, and Young Kil Kim. 2018. Improving a multi-source neural machine translation model with corpus extension for low-resource languages. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC’18). European Language Resource Association, 900--904. Retrieved from http://aclweb.org/anthology/L18-1144.Google Scholar
Christos Christodouloupoulos and Mark Steedman. 2015. A massively parallel corpus: The Bible in 100 languages. Lang. Resour. Eval. 49, 2 (2015), 375--395.Google ScholarDigital Library
Chenhui Chu and Raj Dabre. 2018. Multilingual and multi-domain adaptation for neural machine translation. In Proceedings of the 24th Meeting of the Association for Natural Language Processing (NLP’18). 909--912.Google Scholar
Chenhui Chu and Raj Dabre. 2019. Multilingual multi-domain adaptation approaches for neural machine translation. CoRR abs/1906.07978 (2019).Google Scholar
Chenhui Chu, Raj Dabre, and Sadao Kurohashi. 2017. An empirical comparison of domain adaptation methods for neural machine translation. In Proceedings of the 55th Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, 385--391. DOI:https://doi.org/10.18653/v1/P17-2061Google ScholarCross Ref
Chenhui Chu and Rui Wang. 2018. A survey of domain adaptation for neural machine translation. In Proceedings of the 27th International Conference on Computational Linguistics. Association for Computational Linguistics, 1304--1319. Retrieved from http://aclweb.org/anthology/C18-1111.Google Scholar
Michael Collins, Philipp Koehn, and Ivona Kučerová. 2005. Clause restructuring for statistical machine translation. In Proceedings of the 43rd Meeting of the Association for Computational Linguistics (ACL’05). Association for Computational Linguistics, 531--540. DOI:https://doi.org/10.3115/1219840.1219906Google ScholarDigital Library
Alexis Conneau and Guillaume Lample. 2019. Cross-lingual language model pretraining. In Proceedings of the 32nd Conference on Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché-Buc, E. Fox, and R. Garnett (Eds.). Curran Associates, Inc., 7059--7069. Retrieved from http://papers.nips.cc/paper/8928-cross-lingual-language-model-pretraining.pdf.Google Scholar
Alexis Conneau, Guillaume Lample, Marc’Aurelio Ranzato, Ludovic Denoyer, and Hervé Jégou. 2018. Word translation without parallel data. In Proceedings of the International Conference on Learning Representations. Retrieved from https://github.com/facebookresearch/MUSE.Google Scholar
Alexis Conneau, Ruty Rinott, Guillaume Lample, Adina Williams, Samuel Bowman, Holger Schwenk, and Veselin Stoyanov. 2018. XNLI: Evaluating cross-lingual sentence representations. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. 2475--2485. Retrieved from https://www.aclweb.org/anthology/D18-1269.Google ScholarCross Ref
Anna Currey and Kenneth Heafield. 2019. Zero-resource neural machine translation with monolingual pivot data. In Proceedings of the 3rd Workshop on Neural Generation and Translation. Association for Computational Linguistics, 99--107. DOI:https://doi.org/10.18653/v1/D19-5610Google ScholarCross Ref
Raj Dabre, Fabien Cromieres, and Sadao Kurohashi. 2017. Enabling multi-source neural machine translation by concatenating source sentences in multiple languages. In Proceedings of the Machine Translation Summit XVI, Vol.1: Research Track. 96--106.Google Scholar
Raj Dabre, Atsushi Fujita, and Chenhui Chu. 2019. Exploiting multilingualism through multistage fine-tuning for low-resource neural machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP’19). Association for Computational Linguistics, 1410--1416. DOI:https://doi.org/10.18653/v1/D19-1146Google ScholarCross Ref
Raj Dabre, Anoop Kunchukuttan, Atsushi Fujita, and Eiichiro Sumita. 2018. NICT’s participation in WAT 2018: Approaches using multilingualism and recurrently stacked layers. In Proceedings of the 5th Workshop on Asian Language Translation.Google Scholar
Raj Dabre and Sadao Kurohashi. 2017. MMCR4NLP: Multilingual multiway corpora repository for natural language processing. arXiv preprint arXiv:1710.01025 (2017).Google Scholar
Raj Dabre, Tetsuji Nakagawa, and Hideto Kazawa. 2017. An empirical study of language relatedness for transfer learning in neural machine translation. In Proceedings of the 31st Pacific Asia Conference on Language, Information and Computation. The National University (Philippines), 282--286. Retrieved from http://aclweb.org/anthology/Y17-1038.Google Scholar
Mattia A. Di Gangi, Roldano Cattoni, Luisa Bentivogli, Matteo Negri, and Marco Turchi. 2019. MuST-C: A multilingual speech translation corpus. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics. 2012--2017. Retrieved from https://www.aclweb.org/anthology/N19-1202.Google Scholar
Daxiang Dong, Hua Wu, Wei He, Dianhai Yu, and Haifeng Wang. 2015. Multi-task learning for multiple language translation. In Proceedings of the 53rd Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, 1723--1732. DOI:https://doi.org/10.3115/v1/P15-1166Google Scholar
Bonnie J. Dorr. 1987. UNITRAN: An interlingua approach to machine translation. In Proceedings of the 6th Conference of the American Association of Artificial Intelligence.Google Scholar
Kevin Duh, Graham Neubig, Katsuhito Sudoh, and Hajime Tsukada. 2013. Adaptation data selection using neural language models: Experiments in machine translation. In Proceedings of the 51st Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 678--683. Retrieved from http://www.aclweb.org/anthology/P13-2119.Google Scholar
Carlos Escolano, Marta R. Costa-jussà, and José A. R. Fonollosa. 2019. From bilingual to multilingual neural machine translation by incremental training. In Proceedings of the 57th Meeting of the Association for Computational Linguistics.Google Scholar
Cristina España-Bonet, Ádám Csaba Varga, Alberto Barrón-Cedeño, and Josef van Genabith. 2017. An empirical analysis of NMT-derived interlingual embeddings and their use in parallel sentence identification. IEEE J. Select. Topics Sig. Proc. 11, 8 (Dec. 2017), 1340--1350. DOI:https://doi.org/10.1109/JSTSP.2017.2764273Google Scholar
Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning (Proceedings of Machine Learning Research), Doina Precup and Yee Whye Teh (Eds.), Vol. 70. 1126--1135. Retrieved from http://proceedings.mlr.press/v70/finn17a.html.Google Scholar
Orhan Firat, Kyunghyun Cho, and Yoshua Bengio. 2016. Multi-way, multilingual neural machine translation with a shared attention mechanism. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 866--875. DOI:https://doi.org/10.18653/v1/N16-1101Google ScholarCross Ref
Orhan Firat, Baskaran Sankaran, Yaser Al-Onaizan, Fatos T. Yarman Vural, and Kyunghyun Cho. 2016. Zero-resource translation with multi-lingual neural machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 268--277. DOI:https://doi.org/10.18653/v1/D16-1026Google ScholarCross Ref
Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, and Victor Lempitsky. 2016. Domain-adversarial training of neural networks. J. Mach. Learn. Res. 17, 1 (2016), 2096--2030.Google ScholarDigital Library
Ekaterina Garmash and Christof Monz. 2016. Ensemble learning for multi-source neural machine translation. In Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers (COLING’16). The COLING 2016 Organizing Committee, 1409--1418. Retrieved from http://aclweb.org/anthology/C16-1133.Google Scholar
Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, and Yann N. Dauphin. 2017. Convolutional sequence to sequence learning. In Proceedings of the 34th International Conference on Machine Learning (Proceedings of Machine Learning Research), Doina Precup and Yee Whye Teh (Eds.), Vol. 70. 1243--1252. Retrieved from http://proceedings.mlr.press/v70/gehring17a.html.Google Scholar
Adrià De Gispert and José B. Mariño. 2006. Catalan-English statistical machine translation without parallel corpus: Bridging through Spanish. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC’06). 65--68.Google Scholar
Jiatao Gu, Hany Hassan, Jacob Devlin, and Victor O. K. Li. 2018. Universal neural machine translation for extremely low resource languages. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Association for Computational Linguistics, 344--354. DOI:https://doi.org/10.18653/v1/N18-1032Google Scholar
Jiatao Gu, Yong Wang, Yun Chen, Victor O. K. Li, and Kyunghyun Cho. 2018. Meta-learning for low-resource neural machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 3622--3631. Retrieved from http://aclweb.org/anthology/D18-1398.Google ScholarCross Ref
Jiatao Gu, Yong Wang, Kyunghyun Cho, and Victor O. K. Li. 2019. Improved zero-shot neural machine translation via ignoring spurious correlations. In Proceedings of the 57th Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 1258--1268. DOI:https://doi.org/10.18653/v1/P19-1121Google Scholar
Francisco Guzmán, Peng-Jen Chen, Myle Ott, Juan Pino, Guillaume Lample, Philipp Koehn, Vishrav Chaudhary, and Marc’Aurelio Ranzato. 2019. The FLORES evaluation datasets for low-resource machine translation: Nepali--English and Sinhala--English. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP’19). Association for Computational Linguistics, 6098--6111. DOI:https://doi.org/10.18653/v1/D19-1632Google ScholarCross Ref
Thanh-Le Ha, Jan Niehues, and Alexander H. Waibel. 2016. Toward multilingual neural machine translation with universal encoder and decoder. In Proceedings of the 13th International Workshop on Spoken Language Translation. 1--7.Google Scholar
Thanh-Le Ha, Jan Niehues, and Alexander H. Waibel. 2017. Effective strategies in zero-shot neural machine translation. In Proceedings of the 14th International Workshop on Spoken Language Translation. 105--112.Google Scholar
Barry Haddow and Faheem Kirefu. 2020. PMIndia—A collection of parallel corpora of languages of India. arxiv 2001.09907 (2020).Google Scholar
Junxian He, Jiatao Gu, Jiajun Shen, and Marc’Aurelio Ranzato. 2020. Revisiting self-training for neural sequence generation. In Proceedings of the 8th International Conference on Learning Representations (ICLR’20). Retrieved from https://openreview.net/forum?id=SJgdnAVKDH.Google Scholar
Carlos Henríquez, Marta R. Costa-jussá, Rafael E. Banchs, Lluis Formiga, and José B. Mariño. 2011. Pivot strategies as an alternative for statistical machine translation tasks involving Iberian languages. In Proceedings of the Workshop on Iberian Cross-language Natural Language Processing Tasks (ICL’11). 22--27.Google Scholar
Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2014. Distilling the knowledge in a neural network. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS’14) Deep Learning Workshop.Google Scholar
Chris Hokamp, John Glover, and Demian Gholipour Ghalandari. 2019. Evaluating the supervised and zero-shot performance of multi-lingual translation models. In Proceedings of the 4th Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1). Association for Computational Linguistics, 209--217. DOI:https://doi.org/10.18653/v1/W19-5319Google ScholarCross Ref
Yanping Huang, Yonglong Cheng, Dehao Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V. Le, and Zhifeng Chen. 2019. GPipe: Efficient training of giant neural networks using pipeline parallelism. In Proceedings of the Conference on Neural Information Processing Systems (NeurIPS’19).Google Scholar
Pratik Jawanpuria, Arjun Balgovind, Anoop Kunchukuttan, and Bamdev Mishra. 2019. Learning multilingual word embeddings in latent metric space: A geometric approach. Trans. Assoc. Comput. Ling. 7 (2019), 107--120. Retrieved from https://www.aclweb.org/anthology/Q19-1007.Google Scholar
Sébastien Jean, Orhan Firat, and Melvin Johnson. 2019. Adaptive scheduling for multi-task learning. In Proceedings of the Continual Learning Workshop at NeurIPS’18.Google Scholar
Girish Nath Jha. 2010. The TDIL program and the Indian language Corpora Intitiative (ILCI). In Proceedings of the 7th Conference on International Language Resources and Evaluation (LREC’10). European Languages Resources Association (ELRA). Retrieved from http://www.lrec-conf.org/proceedings/lrec2010/pdf/874_Paper.pdf.Google Scholar
Baijun Ji, Zhirui Zhang, Xiangyu Duan, Min Zhang, Boxing Chen, and Weihua Luo. 2020. Cross-lingual pre-training based transfer for zero-shot neural machine translation. In Proceedings of the 34th AAAI Conference on Artificial Intelligence.Google ScholarCross Ref
Melvin Johnson, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda Viégas, Martin Wattenberg, Greg Corrado, Macduff Hughes, and Jeffrey Dean. 2017. Google’s multilingual neural machine translation system: Enabling zero-shot translation. Trans. Assoc. Comput. Ling. 5 (2017), 339--351. Retrieved from http://aclweb.org/anthology/Q17-1024.Google ScholarCross Ref
Yunsu Kim, Yingbo Gao, and Hermann Ney. 2019. Effective cross-lingual transfer of neural machine translation models without shared vocabularies. In Proceedings of the 57th Meeting of the Association for Computational Linguistics.Google ScholarCross Ref
Yunsu Kim, Petre Petrov, Pavel Petrushkov, Shahram Khadivi, and Hermann Ney. 2019. Pivot-based transfer learning for neural machine translation between non-English languages. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP’19). Association for Computational Linguistics, 866--876. DOI:https://doi.org/10.18653/v1/D19-1080Google ScholarCross Ref
Yoon Kim and Alexander M. Rush. 2016. Sequence-level knowledge distillation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 1317--1327. DOI:https://doi.org/10.18653/v1/D16-1139Google Scholar
Eliyahu Kiperwasser and Miguel Ballesteros. 2018. Scheduled multi-task learning: From syntax to translation. Trans. Assoc. Comput. Ling. 6 (2018), 225--240. DOI:https://doi.org/10.1162/tacl_a_00017Google Scholar
Alexandre Klementiev, Ivan Titov, and Binod Bhattarai. 2012. Inducing crosslingual distributed representations of words. In Proceedings of the International Conference on Computational Linguistics (COLING’12). The COLING 2012 Organizing Committee, 1459--1474. Retrieved from https://www.aclweb.org/anthology/C12-1089.Google Scholar
Tom Kocmi and Ondřej Bojar. 2018. Trivial transfer learning for low-resource neural machine translation. In Proceedings of the Third Conference on Machine Translation, Volume 1: Research Papers. Association for Computational Linguistics, 244--252. Retrieved from http://www.aclweb.org/anthology/W18-6325.Google ScholarCross Ref
Philipp Koehn. 2005. Europarl: A parallel corpus for statistical machine translation. In Proceedings of the 10th Machine Translation Summit. AAMT, 79--86. Retrieved from http://mt-archive.info/MTS-2005-Koehn.pdf.Google Scholar
Philipp Koehn. 2017. Neural machine translation. CoRR abs/1709.07809 (2017).Google Scholar
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of the 45th Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions. Association for Computational Linguistics, 177--180. Retrieved from http://www.aclweb.org/anthology/P/P07/P07-2045.Google ScholarCross Ref
Philipp Koehn and Rebecca Knowles. 2017. Six challenges for neural machine translation. In Proceedings of the 1st Workshop on Neural Machine Translation. Association for Computational Linguistics, 28--39. Retrieved from http://www.aclweb.org/anthology/W17-3204.Google ScholarCross Ref
Philipp Koehn, Franz J. Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics. 127--133. Retrieved from https://www.aclweb.org/anthology/N03-1017.Google ScholarCross Ref
Taku Kudo and John Richardson. 2018. SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. In Proceedings of the Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Association for Computational Linguistics, 66--71. DOI:https://doi.org/10.18653/v1/D18-2012Google ScholarCross Ref
Sneha Kudugunta, Ankur Bapna, Isaac Caswell, and Orhan Firat. 2019. Investigating multilingual NMT representations at scale. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP’19). Association for Computational Linguistics, 1565--1575. DOI:https://doi.org/10.18653/v1/D19-1167Google ScholarCross Ref
Anoop Kunchukuttan. 2020. IndoWordnet Parallel Corpus. Retrieved from https://github.com/anoopkunchukuttan/indowordnet_parallel.Google Scholar
Anoop Kunchukuttan and Pushpak Bhattacharyya. 2016. Orthographic syllable as basic unit for SMT between related languages. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 1912--1917. DOI:https://doi.org/10.18653/v1/D16-1196Google ScholarCross Ref
Anoop Kunchukuttan and Pushpak Bhattacharyya. 2017. Learning variable length units for SMT between related languages via byte pair encoding. In Proceedings of the 1st Workshop on Subword and Character Level Models in NLP. Association for Computational Linguistics, 14--24. DOI:https://doi.org/10.18653/v1/W17-4102Google ScholarCross Ref
Anoop Kunchukuttan, Abhijit Mishra, Rajen Chatterjee, Ritesh Shah, and Pushpak Bhattacharyya. 2014. Shata-Anuvadak: Tackling multiway translation of Indian languages. In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC’14). European Language Resources Association (ELRA), 1781--1787. Retrieved from http://www.lrec-conf.org/proceedings/lrec2014/pdf/414_Paper.pdf.Google Scholar
Anoop Kunchukuttan, Maulik Shah, Pradyot Prakash, and Pushpak Bhattacharyya. 2017. Utilizing lexical similarity between related, low-resource languages for pivot-based SMT. In Proceedings of the 8th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). Asian Federation of Natural Language Processing, 283--289. Retrieved from http://aclweb.org/anthology/I17-2048.Google Scholar
Surafel Melaku Lakew, Mauro Cettolo, and Marcello Federico. 2018. A comparison of transformer and recurrent neural networks on multilingual neural machine translation. In Proceedings of the 27th International Conference on Computational Linguistics. Association for Computational Linguistics, 641--652. Retrieved from http://aclweb.org/anthology/C18-1054.Google Scholar
Surafel Melaku Lakew, Aliia Erofeeva, Matteo Negri, Marcello Federico, and Marco Turchi. 2018. Transfer learning in multilingual neural machine translation with dynamic vocabulary. In Proceedings of the 15th International Workshop on Spoken Language Translation (IWSLT’18). 54--61.Google Scholar
Surafel Melaku Lakew, Quintino F. Lotito, Matteo Negri, Marco Turchi, and Marcello Federico. 2017. Improving zero-shot translation of low-resource languages. In Proceedings of the 14th International Workshop on Spoken Language Translation. 113--119.Google Scholar
Guillaume Lample, Alexis Conneau, Ludovic Denoyer, and Marc’Aurelio Ranzato. 2018. Unsupervised machine translation using monolingual corpora only. In Proceedings of the International Conference on Learning Representations. Retrieved from https://openreview.net/forum?id=rkYTTf-AZ.Google Scholar
Jason Lee, Kyunghyun Cho, and Thomas Hofmann. 2017. Fully character-level neural machine translation without explicit segmentation. Trans. Assoc. Comput. Ling. 5 (2017), 365--378. Retrieved from http://aclweb.org/anthology/Q17-1026.Google ScholarCross Ref
Yichao Lu, Phillip Keung, Faisal Ladhak, Vikas Bhardwaj, Shaonan Zhang, and Jason Sun. 2018. A neural interlingua for multilingual machine translation. In Proceedings of the 3rd Conference on Machine Translation: Research Papers. Association for Computational Linguistics, 84--92. Retrieved from http://aclweb.org/anthology/W18-6309.Google ScholarCross Ref
Mieradilijiang Maimaiti, Yang Liu, Huanbo Luan, and Maosong Sun. 2019. Multi-round transfer learning for low-resource NMT using multiple high-resource languages. ACM Trans. Asian Low-Resour. Lang. Inf. Proc. 18, 4 (May 2019). DOI:https://doi.org/10.1145/3314945Google Scholar
Chaitanya Malaviya, Graham Neubig, and Patrick Littell. 2017. Learning language representations for typology prediction. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2529--2535. DOI:https://doi.org/10.18653/v1/D17-1268Google ScholarCross Ref
Giulia Mattoni, Pat Nagle, Carlos Collantes, and Dimitar Shterionov. 2017. Zero-shot translation for Indian languages with sparse data. In Proceedings of Machine Translation Summit XVI, Vol. 2: Users and Translators Track. 1--10.Google Scholar
Evgeny Matusov, Nicola Ueffing, and Hermann Ney. 2006. Computing consensus translation for multiple machine translation systems using enhanced hypothesis alignment. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics. 33--40. Retrieved from https://www.aclweb.org/anthology/E06-1005.Google Scholar
Cettolo Mauro, Girardi Christian, and Federico Marcello. 2012. Wit3: Web inventory of transcribed and translated talks. In Proceedings of the 16th Conference of European Association for Machine Translation. 261--268.Google Scholar
Tomas Mikolov, Quoc V. Le, and Ilya Sutskever. 2013. Exploiting similarities among languages for machine translation. CoRR abs/1309.4168 (2013).Google Scholar
Rudra Murthy, Anoop Kunchukuttan, and Pushpak Bhattacharyya. 2019. Addressing word-order divergence in multilingual neural machine translation for extremely low resource languages. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, 3868--3873. Retrieved from https://www.aclweb.org/anthology/N19-1387.Google ScholarCross Ref
Toshiaki Nakazawa, Shohei Higashiyama, Chenchen Ding, Raj Dabre, Anoop Kunchukuttan, Win Pa Pa, Isao Goto, Hideya Mino, Katsuhito Sudoh, and Sadao Kurohashi. 2018. Overview of the 5th workshop on Asian translation. In Proceedings of the 5th Workshop on Asian Translation (WAT’18). 1--41.Google Scholar
Preslav Nakov and Hwee Tou Ng. 2009. Improved statistical machine translation for resource-poor languages using related resource-rich languages. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 1358--1367. Retrieved from https://www.aclweb.org/anthology/D09-1141.Google ScholarCross Ref
Graham Neubig. 2017. Neural machine translation and sequence-to-sequence models: A tutorial. CoRR abs/1703.01619 (2017).Google Scholar
Graham Neubig and Junjie Hu. 2018. Rapid adaptation of neural machine translation to new languages. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 875--880. Retrieved from http://aclweb.org/anthology/D18-1103.Google ScholarCross Ref
Toan Q. Nguyen and David Chiang. 2017. Transfer learning across low-resource, related languages for neural machine translation. In Proceedings of the 8th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). Asian Federation of Natural Language Processing, 296--301. Retrieved from http://aclweb.org/anthology/I17-2050.Google Scholar
Yuta Nishimura, Katsuhito Sudoh, Graham Neubig, and Satoshi Nakamura. 2018. Multi-source neural machine translation with missing data. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation. Association for Computational Linguistics, 92--99. Retrieved from http://aclweb.org/anthology/W18-2711.Google ScholarCross Ref
Yuta Nishimura, Katsuhito Sudoh, Graham Neubig, and Satoshi Nakamura. 2018. Multi-source neural machine translation with data augmentation. In Proceedings of the 15th International Workshop on Spoken Language Translation (IWSLT’18). 48--53. Retrieved from https://arxiv.org/abs/1810.06826.Google Scholar
Eric Nyberg, Teruko Mitamura, and Jaime Carbonell. 1997. The KANT machine translation system: From R&D to initial deployment. In Proceedings of the LISA Workshop on Integrating Advanced Translation Technology. 1--7.Google Scholar
Franz Josef Och and Hermann Ney. 2001. Statistical multi-source translation. In Proceedings of the Machine Translation Summit, Vol. 8. 253--258.Google Scholar
Robert Östling and Jörg Tiedemann. 2017. Continuous multilinguality with language vectors. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers. Association for Computational Linguistics, 644--649. Retrieved from https://www.aclweb.org/anthology/E17-2102.Google ScholarCross Ref
Sinno Jialin Pan and Qiang Yang. 2010. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22, 10 (Oct. 2010), 1345--1359. DOI:https://doi.org/10.1109/TKDE.2009.191Google ScholarDigital Library
Ngoc-Quan Pham, Jan Niehues, Thanh-Le Ha, and Alexander Waibel. 2019. Improving zero-shot translation with language-independent constraints. In Proceedings of the 4th Conference on Machine Translation (Volume 1: Research Papers). Association for Computational Linguistics, 13--23. DOI:https://doi.org/10.18653/v1/W19-5202Google ScholarCross Ref
Telmo Pires, Eva Schlinger, and Dan Garrette. 2019. How multilingual is multilingual BERT? In Proceedings of the 57th Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 4996--5001. DOI:https://doi.org/10.18653/v1/P19-1493Google ScholarCross Ref
Emmanouil Antonios Platanios, Mrinmaya Sachan, Graham Neubig, and Tom Mitchell. 2018. Contextual parameter generation for universal neural machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 425--435. Retrieved from http://aclweb.org/anthology/D18-1039.Google ScholarCross Ref
Matt Post, Chris Callison-Burch, and Miles Osborne. 2012. Constructing parallel corpora for six Indian languages via crowdsourcing. In Proceedings of the 7th Workshop on Statistical Machine Translation. Association for Computational Linguistics, 401--409.Google Scholar
Raj Noel Dabre Prasanna. 2018. Exploiting Multilingualism and Transfer Learning for Low Resource Machine Translation. Ph.D. Dissertation. Kyoto University. Retrieved from http://hdl.handle.net/2433/232411.Google Scholar
Maithra Raghu, Justin Gilmer, Jason Yosinski, and Jascha Sohl-Dickstein. 2017. SVCCA: Singular vector canonical correlation analysis for deep learning dynamics and interpretability. In Proceedings of the 30th Conference on Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Curran Associates, Inc., 6076--6085. Retrieved from http://papers.nips.cc/paper/7188-svcca-singular-vector-canonical-correlation-analysis-for-deep-learning-dynamics-and-interpretability.pdf.Google Scholar
Prajit Ramachandran, Peter Liu, and Quoc Le. 2017. Unsupervised pretraining for sequence to sequence learning. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 383--391. DOI:https://doi.org/10.18653/v1/D17-1039Google ScholarCross Ref
Ananthakrishnan Ramanathan, Jayprasad Hegde, Ritesh Shah, Pushpak Bhattacharyya, and M. Sasikumar. 2008. Simple syntactic and morphological processing can help English-Hindi statistical machine translation. In Proceedings of the International Joint Conference on Natural Language Processing.Google Scholar
Matīss Rikters, Mārcis Pinnis, and Rihards Krišlauks. 2018. Training and adapting multilingual NMT for less-resourced and morphologically rich languages. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC’18). European Language Resources Association (ELRA), 3766--3773.Google Scholar
Sebastian Ruder. 2016. An overview of gradient descent optimization algorithms. CoRR abs/1609.04747 (2016).Google Scholar
Devendra Sachan and Graham Neubig. 2018. Parameter sharing methods for multilingual self-attentional translation models. In Proceedings of the 3rd Conference on Machine Translation: Research Papers. Association for Computational Linguistics, 261--271. Retrieved from http://aclweb.org/anthology/W18-6327.Google ScholarCross Ref
Amrita Saha, Mitesh M. Khapra, Sarath Chandar, Janarthanan Rajendran, and Kyunghyun Cho. 2016. A correlational encoder decoder architecture for pivot based sequence generation. In Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers (COLING’16). The COLING 2016 Organizing Committee, 109--118. Retrieved from https://www.aclweb.org/anthology/C16-1011.Google Scholar
Peter H. Schönemann. 1966. A generalized solution of the orthogonal procrustes problem. Psychometrika 31, 1 (1966), 1--10.Google ScholarCross Ref
Josh Schroeder, Trevor Cohn, and Philipp Koehn. 2009. Word lattices for multi-source translation. In Proceedings of the 12th Conference of the European Chapter of the ACL (EACL’09). Association for Computational Linguistics, 719--727. Retrieved from https://www.aclweb.org/anthology/E09-1082.Google ScholarCross Ref
Mike Schuster and Kaisuke Nakajima. 2012. Japanese and Korean voice search. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP’12). IEEE, 5149--5152. Retrieved from http://dblp.uni-trier.de/db/conf/icassp/icassp2012.html#SchusterN12.Google ScholarCross Ref
Holger Schwenk, Vishrav Chaudhary, Shuo Sun, Hongyu Gong, and Francisco Guzmán. 2019. WikiMatrix: Mining 135M parallel sentences in 1620 language pairs from Wikipedia. CoRR abs/1907.05791 (2019).Google Scholar
Sukanta Sen, Kamal Kumar Gupta, Asif Ekbal, and Pushpak Bhattacharyya. 2019. Multilingual unsupervised NMT using shared encoder and language-specific decoders. In Proceedings of the 57th Meeting of the Association for Computational Linguistics.Google ScholarCross Ref
Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Improving neural machine translation models with monolingual data. In Proceedings of the 54th Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 86--96. Retrieved from http://www.aclweb.org/anthology/P16-1009.Google ScholarCross Ref
Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Neural machine translation of rare words with subword units. In Proceedings of the 54th Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 1715--1725. Retrieved from http://www.aclweb.org/anthology/P16-1162.Google ScholarCross Ref
Lierni Sestorain, Massimiliano Ciaramita, Christian Buck, and Thomas Hofmann. 2018. Zero-shot dual machine translation. CoRR abs/1805.10338 (2018).Google Scholar
Petr Sgall and Jarmila Panevová. 1987. Machine translation, linguistics, and interlingua. In Proceedings of the 3rd Conference on European Chapter of the Association for Computational Linguistics (EACL’87). Association for Computational Linguistics, 99--103. DOI:https://doi.org/10.3115/976858.976876Google ScholarDigital Library
Itamar Shatz. 2016. Native language influence during second language acquisition: A large-scale learner corpus analysis. In Proceedings of the Pacific Second Language Research Forum (PacSLRF’16). 175--180.Google Scholar
Aditya Siddhant, Melvin Johnson, Henry Tsai, Naveen Arivazhagan, Jason Riesa, Ankur Bapna, Orhan Firat, and Karthik Raman. 2020. Evaluating the cross-lingual effectiveness of massively multilingual neural machine translation. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI’20).Google ScholarCross Ref
Shashank Siripragrada, Jerin Philip, Vinay P. Namboodiri, and C. V. Jawahar. 2020. A multilingual parallel corpora collection effort for Indian languages. In Proceedings of the 12th Language Resources and Evaluation Conference. European Language Resources Association, 3743--3751. Retrieved from https://www.aclweb.org/anthology/2020.lrec-1.462.Google Scholar
Anders Søgaard, Sebastian Ruder, and Ivan Vulić. 2018. On the limitations of unsupervised bilingual dictionary induction. In Proceedings of the 56th Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 778--788. DOI:https://doi.org/10.18653/v1/P18-1072Google ScholarCross Ref
Kai Song, Yue Zhang, Heng Yu, Weihua Luo, Kun Wang, and Min Zhang. 2019. Code-switching for enhancing NMT with pre-specified translation. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, 449--459. Retrieved from https://www.aclweb.org/anthology/N19-1044.Google Scholar
Ralf Steinberger, Mohamed Ebrahim, Alexandros Poulis, Manuel Carrasco-Benitez, Patrick Schlüter, Marek Przybyszewski, and Signe Gilbro. 2014. An overview of the European Union’s highly multilingual parallel corpora. Lang. Resour. Eval. 48, 4 (2014), 679--707.Google ScholarDigital Library
Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS’14). The MIT Press, 3104--3112. Retrieved from http://dl.acm.org/citation.cfm?id=2969033.2969173.Google ScholarDigital Library
Xu Tan, Jiale Chen, Di He, Yingce Xia, Tao Qin, and Tie-Yan Liu. 2019. Multilingual neural machine translation with language clustering. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP’19). Association for Computational Linguistics, 963--973. DOI:https://doi.org/10.18653/v1/D19-1089Google ScholarCross Ref
Xu Tan, Yi Ren, Di He, Tao Qin, and Tie-Yan Liu. 2019. Multilingual neural machine translation with knowledge distillation. In Proceedings of the International Conference on Learning Representations (ICLR’19). Retrieved from http://arxiv.org/abs/1902.10461.Google Scholar
Ye Kyaw Thu, Win Pa Pa, Masao Utiyama, Andrew M. Finch, and Eiichiro Sumita. 2016. Introducing the Asian language treebank (ALT). In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC’16). European Language Resources Association (ELRA), 1574--1578.Google Scholar
Jörg Tiedemann. 2012. Character-based pivot translation for under-resourced languages and domains. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 141--151. Retrieved from https://www.aclweb.org/anthology/E12-1015.Google ScholarDigital Library
Jörg Tiedemann. 2012. Parallel data, tools, and interfaces in OPUS. In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC’12). European Language Resources Association (ELRA), 2214--2218. Retrieved from http://www.lrec-conf.org/proceedings/lrec2012/pdf/463_Paper.pdf.Google Scholar
Hiroshi Uchida. 1996. UNL: Universal networking language—An electronic language for communication, understanding, and collaboration. In UNU/IAS/UNL Center. Retrieved from https://www.semanticscholar.org/paper/UNL%3A-Universal-Networking-Language-An-Electronic-Uchida/f281c6a61ee69e4fa0f15f3f6d03faeee7a74e10.Google Scholar
Masao Utiyama and Hitoshi Isahara. 2007. A comparison of pivot methods for phrase-based statistical machine translation. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 484--491. Retrieved from https://www.aclweb.org/anthology/N07-1061.Google Scholar
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the 30th Conference on Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Curran Associates, Inc., 5998--6008. Retrieved from http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf.Google Scholar
Raúl Vázquez, Alessandro Raganato, Jörg Tiedemann, and Mathias Creutz. 2018. Multilingual NMT with a language-independent attention bridge. CoRR abs/1811.00498 (2018).Google Scholar
David Vilar, Jan-Thorsten Peter, and Hermann Ney. 2007. Can we translate letters? In Proceedings of the 2nd Workshop on Statistical Machine Translation. Association for Computational Linguistics, 33--39. Retrieved from https://www.aclweb.org/anthology/W07-0705.Google ScholarCross Ref
Karthik Visweswariah, Rajakrishnan Rajkumar, Ankur Gandhe, Ananthakrishnan Ramanathan, and Jiri Navratil. 2011. A word reordering model for improved machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. 486--496. Retrieved from https://www.aclweb.org/anthology/D11-1045.Google ScholarDigital Library
Rui Wang, Andrew Finch, Masao Utiyama, and Eiichiro Sumita. 2017. Sentence embedding for neural machine translation domain adaptation. In Proceedings of the 55th Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, 560--566. Retrieved from http://aclweb.org/anthology/P17-2089.Google ScholarCross Ref
Rui Wang, Masao Utiyama, Lemao Liu, Kehai Chen, and Eiichiro Sumita. 2017. Instance weighting for neural machine translation domain adaptation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 1482--1488. DOI:https://doi.org/10.18653/v1/D17-1155Google ScholarCross Ref
Xinyi Wang and Graham Neubig. 2019. Target conditioned sampling: Optimizing data selection for multilingual neural machine translation. In Proceedings of the 57th Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 5823--5828. DOI:https://doi.org/10.18653/v1/P19-1583Google ScholarCross Ref
Xinyi Wang, Hieu Pham, Philip Arthur, and Graham Neubig. 2019. Multilingual neural machine translation with soft decoupled encoding. In Proceedings of the International Conference on Learning Representations (ICLR’19). Retrieved from https://arxiv.org/abs/1902.03499.Google Scholar
Yining Wang, Jiajun Zhang, Feifei Zhai, Jingfang Xu, and Chengqing Zong. 2018. Three strategies to improve one-to-many multilingual translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2955--2960. Retrieved from http://aclweb.org/anthology/D18-1326.Google ScholarCross Ref
Yining Wang, Long Zhou, Jiajun Zhang, Feifei Zhai, Jingfang Xu, and Chengqing Zong. 2019. A compact and language-sensitive multilingual translation method. In Proceedings of the 57th Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 1213--1223. DOI:https://doi.org/10.18653/v1/P19-1117Google ScholarCross Ref
Toon Witkam. 2006. History and heritage of the DLT (Distributed Language Translation) project. In Utrecht, The Netherlands: Private Publication. 1--11. Retrieved from http://www.mt-archive.info/Witkam-2006.pdf.Google Scholar
Hua Wu and Haifeng Wang. 2007. Pivot language approach for phrase-based statistical machine translation. Mach. Translat. 21, 3 (2007), 165--181.Google ScholarDigital Library
Hua Wu and Haifeng Wang. 2009. Revisiting pivot language approach for machine translation. In Proceedings of the Joint Conference of the 47th Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP. Association for Computational Linguistics, 154--162. Retrieved from https://www.aclweb.org/anthology/P09-1018.Google Scholar
Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Lukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, and Jeffrey Dean. 2016. Google’s neural machine translation system: Bridging the gap between human and machine translation. CoRR abs/1609.08144 (2016).Google Scholar
Fei Xia and Michael McCord. 2004. Improving a statistical MT system with automatically learned rewrite patterns. In Proceedings of the 20th International Conference on Computational Linguistics (COLING’04). COLING, 508--514. Retrieved from https://www.aclweb.org/anthology/C04-1073.Google ScholarDigital Library
Chang Xu, Tao Qin, Gang Wang, and Tie-Yan Liu. 2019. Polygon-Net: A general framework for jointly boosting multiple unsupervised neural machine translation models. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI’19). International Joint Conferences on Artificial Intelligence Organization, 5320--5326. DOI:https://doi.org/10.24963/ijcai.2019/739Google ScholarCross Ref
Poorya Zaremoodi, Wray Buntine, and Gholamreza Haffari. 2018. Adaptive knowledge sharing in multi-task learning: Improving low-resource neural machine translation. In Proceedings of the 56th Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, 656--661. Retrieved from http://aclweb.org/anthology/P18-2104.Google ScholarCross Ref
Yang Zhao, Jiajun Zhang, and Chengqing Zong. 2018. Exploiting pre-ordering for neural machine translation. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC’18). European Language Resources Association (ELRA). Retrieved from https://www.aclweb.org/anthology/L18-1143.Google Scholar
Long Zhou, Wenpeng Hu, Jiajun Zhang, and Chengqing Zong. 2017. Neural system combination for machine translation. In Proceedings of the 55th Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, 378--384. DOI:https://doi.org/10.18653/v1/P17-2060Google ScholarCross Ref
Michał Ziemski, Marcin Junczys-Dowmunt, and Bruno Pouliquen. 2016. The United Nations parallel corpus v1.0. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC’16). European Language Resources Association (ELRA), 3530--3534. Retrieved from https://www.aclweb.org/anthology/L16-1561.Google Scholar
Barret Zoph and Kevin Knight. 2016. Multi-source neural translation. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 30--34. DOI:https://doi.org/10.18653/v1/N16-1004Google ScholarCross Ref
Barret Zoph and Quoc V. Le. 2017. Neural architecture search with reinforcement learning. In Proceedings of the 5th International Conference on Learning Representations (ICLR’17). Retrieved from https://openreview.net/forum?id=r1Ue8Hcxg.Google Scholar
Barret Zoph, Deniz Yuret, Jonathan May, and Kevin Knight. 2016. Transfer learning for low-resource neural machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 1568--1575. DOI:https://doi.org/10.18653/v1/D16-1163Google ScholarCross Ref

Index Terms

A Survey of Multilingual Neural Machine Translation
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Machine translation

Recommendations

A Survey of Orthographic Information in Machine Translation
Abstract
Machine translation is one of the applications of natural language processing which has been explored in different languages. Recently researchers started paying attention towards machine translation for resource-poor languages and closely related ...
Read More
Multi-way, multilingual neural machine translation

The first attention-based neural-MT for multi-way, multilingual translation is proposed.Multi-way multilingual model is tested on more than 8 languages (En, Fr, Cz, De, Ru, Fi, Tr and Uz).It achieves the translation quality comparable to single-pair ...
Read More
Improving Multilingual Neural Machine Translation with Artificial Labels
SOICT '23: Proceedings of the 12th International Symposium on Information and Communication Technology

Inspired by the work which uses Artificial Translation Units for generation of synthetic data in low-resource Neural Machine Translation systems [12], we propose using these translation units to enhance ability of sharing information between translation ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Computing Surveys Volume 53, Issue 5
September 2021
782 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/3426973
Editor:
Albert Zomaya
University of Sydney, Austraila
Issue’s Table of Contents
Copyright © 2020 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 28 September 2020
- Revised: 1 June 2020
- Accepted: 1 June 2020
- Received: 1 July 2019
Published in csur Volume 53, Issue 5

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Neural machine translation
low-resource
multi-source
multilingualism
survey
zero-shot
Qualifiers
- survey
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 89
  Total Citations
  View Citations
- 12,153
  Total Downloads
- Downloads (Last 12 months)3,024
- Downloads (Last 6 weeks)399
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format .

View HTML Format

A Survey of Multilingual Neural Machine Translation

ACM Computing Surveys

Abstract

References

Cited By

Index Terms

Recommendations

A Survey of Orthographic Information in Machine Translation

Multi-way, multilingual neural machine translation

Improving Multilingual Neural Machine Translation with Artificial Labels