2018 | OriginalPaper | Chapter

A Comparable Study on Model Averaging, Ensembling and Reranking in NMT

Authors: Yuchen Liu, Long Zhou, Yining Wang, Yang Zhao, Jiajun Zhang, Chengqing Zong

Published in: Natural Language Processing and Chinese Computing

Publisher: Springer International Publishing


Abstract

Neural machine translation has become a benchmark method in machine translation. Many novel structures and methods have been proposed to improve translation quality. However, these models are difficult to train, and their parameters are difficult to tune. In this paper, we focus on decoding techniques that boost translation performance by utilizing existing models. We address the problem at three levels: parameter, word, and sentence. These correspond to checkpoint averaging, model ensembling, and candidate reranking, none of which requires retraining the model. Experimental results show that the proposed decoding approaches can significantly improve performance over the baseline model.
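The three levels named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the uniform averaging, the plain-dict data layout, the feature names ("model", "length") and the function names below are all assumptions chosen for illustration.

```python
import math


def average_checkpoints(checkpoints):
    """Parameter level: element-wise mean over saved checkpoints.

    Assumed layout: each checkpoint is a dict mapping a parameter name
    to a flat list of floats, and all checkpoints share the same names.
    """
    n = len(checkpoints)
    return {
        name: [sum(ckpt[name][i] for ckpt in checkpoints) / n
               for i in range(len(values))]
        for name, values in checkpoints[0].items()
    }


def ensemble_step(distributions):
    """Word level: uniform average of the next-word distributions that
    several models predict at one decoding step (word -> probability)."""
    n = len(distributions)
    vocab = set().union(*distributions)
    return {w: sum(d.get(w, 0.0) for d in distributions) / n for w in vocab}


def rerank(candidates, weights):
    """Sentence level: return the candidate text whose weighted feature
    sum is highest. Each candidate is (text, {feature_name: score})."""
    def score(candidate):
        _, features = candidate
        return sum(weights[k] * features[k] for k in weights)
    return max(candidates, key=score)[0]


# Toy inputs, invented for this example only.
averaged = average_checkpoints([{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}])
mixed = ensemble_step([{"cat": 0.6, "dog": 0.4}, {"cat": 0.2, "dog": 0.8}])
best = rerank(
    [("a b", {"model": -1.0, "length": 2.0}),
     ("a b c", {"model": -1.2, "length": 3.0})],
    {"model": 1.0, "length": 0.5},
)
print(averaged, mixed, best)
```

In this sketch none of the three functions updates a model by training: averaging combines stored parameters, ensembling combines per-step predictions, and reranking rescores finished candidates. The reranking weights would normally be tuned on held-out data; here they are fixed by hand.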


Metadata
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-99501-4_26
