Published in: Neural Processing Letters 1/2020

27.07.2019

Speed Up the Training of Neural Machine Translation

Authors: Xinyue Liu, Weixuan Wang, Wenxin Liang, Yuangang Li



Abstract

Neural machine translation (NMT) has achieved notable success in recent years. Although existing models deliver reasonable translation quality, they require a great deal of training time; when the corpus is enormous, their computational cost becomes extremely high. In this paper, we propose a novel NMT model based on the conventional bidirectional recurrent neural network (bi-RNN). The model applies a tanh activation function that learns future and history context information more thoroughly, thereby speeding up the training process. Experimental results on German–English and English–French translation tasks demonstrate that the proposed model saves considerable training time compared with state-of-the-art models while providing better translation performance.
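The abstract's core building block, a bidirectional RNN whose hidden states pass through a tanh activation so that each position combines history (left-to-right) and future (right-to-left) context, can be sketched as below. This is a minimal illustration only: all dimensions, weight initializations, and class/function names are hypothetical, not the authors' exact architecture or training procedure.

```python
import math
import random

random.seed(0)

def tanh_vec(v):
    """Apply tanh elementwise; outputs lie strictly in (-1, 1)."""
    return [math.tanh(x) for x in v]

class MinimalBiRNN:
    """Toy bidirectional RNN with tanh activation (hypothetical dims)."""

    def __init__(self, input_size, hidden_size):
        self.h = hidden_size
        # Small random weights: Wx maps input -> hidden, Wh hidden -> hidden,
        # with separate parameters for the forward and backward directions.
        def mat(r, c):
            return [[random.uniform(-0.1, 0.1) for _ in range(c)] for _ in range(r)]
        self.Wx_f, self.Wh_f = mat(hidden_size, input_size), mat(hidden_size, hidden_size)
        self.Wx_b, self.Wh_b = mat(hidden_size, input_size), mat(hidden_size, hidden_size)

    def _step(self, Wx, Wh, x, h):
        # One recurrence step: h' = tanh(Wx @ x + Wh @ h)
        return tanh_vec([
            sum(Wx[i][j] * x[j] for j in range(len(x))) +
            sum(Wh[i][j] * h[j] for j in range(self.h))
            for i in range(self.h)
        ])

    def forward(self, xs):
        # Left-to-right pass: each state summarizes the history context.
        h_f = [0.0] * self.h
        fwd = []
        for x in xs:
            h_f = self._step(self.Wx_f, self.Wh_f, x, h_f)
            fwd.append(h_f)
        # Right-to-left pass: each state summarizes the future context.
        h_b = [0.0] * self.h
        bwd = [None] * len(xs)
        for t in range(len(xs) - 1, -1, -1):
            h_b = self._step(self.Wx_b, self.Wh_b, xs[t], h_b)
            bwd[t] = h_b
        # Concatenate both directions per time step.
        return [f + b for f, b in zip(fwd, bwd)]

rnn = MinimalBiRNN(input_size=4, hidden_size=3)
seq = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
states = rnn.forward(seq)
print(len(states), len(states[0]))  # 5 time steps, 2*3 concatenated features
```

In an encoder–decoder NMT system in the style the abstract describes, these concatenated per-position states would serve as the source-sentence annotations consumed by the decoder; the tanh recurrence keeps the states bounded, which is one reason such units are cheap per step compared with gated alternatives.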


Metadata
Title
Speed Up the Training of Neural Machine Translation
Authors
Xinyue Liu
Weixuan Wang
Wenxin Liang
Yuangang Li
Publication date
27.07.2019
Publisher
Springer US
Published in
Neural Processing Letters / Issue 1/2020
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-019-10084-y
