Published in: Neural Processing Letters 1/2020

27.07.2019

Speed Up the Training of Neural Machine Translation

Authors: Xinyue Liu, Weixuan Wang, Wenxin Liang, Yuangang Li



Abstract

Neural machine translation (NMT) has achieved notable success in recent years. Although existing models deliver reasonable translation quality, they require a great deal of training time; when the corpus is enormous, their computational cost becomes extremely high. In this paper, we propose a novel NMT model based on the conventional bidirectional recurrent neural network (bi-RNN). The model applies a tanh activation function that learns future and history context information more thoroughly, thereby speeding up the training process. Experimental results on German–English and English–French translation tasks demonstrate that the proposed model saves considerable training time compared with state-of-the-art models while providing better translation performance.
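The abstract's core building block, a bidirectional RNN whose hidden states pass through a tanh activation so that each position combines history (left-to-right) and future (right-to-left) context, can be sketched as below. This is a minimal illustration only: all dimensions, weight initializations, and class/function names are hypothetical, not the authors' exact architecture or training procedure.

```python
import math
import random

random.seed(0)

def tanh_vec(v):
    """Apply tanh elementwise; outputs lie strictly in (-1, 1)."""
    return [math.tanh(x) for x in v]

class MinimalBiRNN:
    """Toy bidirectional RNN with tanh activation (hypothetical dims)."""

    def __init__(self, input_size, hidden_size):
        self.h = hidden_size
        # Small random weights: Wx maps input -> hidden, Wh hidden -> hidden,
        # with separate parameters for the forward and backward directions.
        def mat(r, c):
            return [[random.uniform(-0.1, 0.1) for _ in range(c)] for _ in range(r)]
        self.Wx_f, self.Wh_f = mat(hidden_size, input_size), mat(hidden_size, hidden_size)
        self.Wx_b, self.Wh_b = mat(hidden_size, input_size), mat(hidden_size, hidden_size)

    def _step(self, Wx, Wh, x, h):
        # One recurrence step: h' = tanh(Wx @ x + Wh @ h)
        return tanh_vec([
            sum(Wx[i][j] * x[j] for j in range(len(x))) +
            sum(Wh[i][j] * h[j] for j in range(self.h))
            for i in range(self.h)
        ])

    def forward(self, xs):
        # Left-to-right pass: each state summarizes the history context.
        h_f = [0.0] * self.h
        fwd = []
        for x in xs:
            h_f = self._step(self.Wx_f, self.Wh_f, x, h_f)
            fwd.append(h_f)
        # Right-to-left pass: each state summarizes the future context.
        h_b = [0.0] * self.h
        bwd = [None] * len(xs)
        for t in range(len(xs) - 1, -1, -1):
            h_b = self._step(self.Wx_b, self.Wh_b, xs[t], h_b)
            bwd[t] = h_b
        # Concatenate both directions per time step.
        return [f + b for f, b in zip(fwd, bwd)]

rnn = MinimalBiRNN(input_size=4, hidden_size=3)
seq = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
states = rnn.forward(seq)
print(len(states), len(states[0]))  # 5 time steps, 2*3 concatenated features
```

In an encoder–decoder NMT system in the style the abstract describes, these concatenated per-position states would serve as the source-sentence annotations consumed by the decoder; the tanh recurrence keeps the states bounded, which is one reason such units are cheap per step compared with gated alternatives.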


Metadata
Title
Speed Up the Training of Neural Machine Translation
Authors
Xinyue Liu
Weixuan Wang
Wenxin Liang
Yuangang Li
Publication date
27.07.2019
Publisher
Springer US
Published in
Neural Processing Letters / Issue 1/2020
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-019-10084-y
