2021 | OriginalPaper | Chapter

Improving Word Alignment with Contextualized Embedding and Bilingual Dictionary

Authors: Minhan Xu, Yu Hong

Published in: Big Data

Publisher: Springer Singapore

Abstract

Word alignment is a natural language processing task that identifies the relationships among words or multiword units in a bitext. Large pre-trained models can generate significantly improved contextual word embeddings. However, statistical methods are still the preferred choice. In this paper, we utilize bilingual dictionaries and contextualized word embeddings generated by pre-trained models for word alignment. We first use statistical methods to generate a rough alignment, then use bilingual dictionaries to modify the alignment and make it more accurate. We use this alignment as training data and leverage it in training to optimize alignment. We demonstrate that our approach produces better or comparable performance compared to statistical approaches.
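
The abstract does not say how the bilingual dictionary modifies the rough alignment, so the following is only a minimal sketch under an assumed rule: when a source word has a dictionary translation present in the target sentence, its rough links are replaced by the dictionary-supported links, and all other source words keep their rough links. The function name `refine_with_dictionary`, the toy sentence pair, the toy dictionary, and the rough alignment are all hypothetical and are not taken from the chapter.

```python
from typing import Dict, List, Set, Tuple

# An alignment is a set of (source index, target index) links.
Alignment = Set[Tuple[int, int]]


def refine_with_dictionary(
    src: List[str],
    tgt: List[str],
    rough: Alignment,
    dictionary: Dict[str, Set[str]],
) -> Alignment:
    """Assumed refinement rule (not the authors' published procedure).

    If a source word has at least one dictionary translation in the
    target sentence, use those links; otherwise keep its rough links.
    """
    refined: Alignment = set()
    for i, s in enumerate(src):
        translations = dictionary.get(s.lower(), set())
        supported = {
            (i, j) for j, t in enumerate(tgt) if t.lower() in translations
        }
        if supported:
            refined |= supported
        else:
            refined |= {(a, b) for (a, b) in rough if a == i}
    return refined


# Toy data: the rough alignment stands in for a statistical aligner's output
# and wrongly links "is" to "klein".
src = ["the", "house", "is", "small"]
tgt = ["das", "Haus", "ist", "klein"]
rough = {(0, 0), (1, 1), (2, 3), (3, 3)}
toy_dict = {"house": {"haus"}, "is": {"ist"}}

refined = refine_with_dictionary(src, tgt, rough, toy_dict)
print(sorted(refined))  # [(0, 0), (1, 1), (2, 2), (3, 3)]
```

In this toy case the dictionary entry for "is" corrects the erroneous link to "klein", while "the" and "small", which have no dictionary entry, keep the links proposed by the rough alignment. A refined alignment of this kind is what the abstract describes as training data for the subsequent optimization step.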

Metadata
Title
Improving Word Alignment with Contextualized Embedding and Bilingual Dictionary
Authors
Minhan Xu
Yu Hong
Copyright Year
2021
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-16-0705-9_13
