Published in: Neural Computing and Applications 8/2021

04-08-2020 | Review

An automatic evaluation metric for Ancient-Modern Chinese translation

Authors: Kexin Yang, Dayiheng Liu, Qian Qu, Yongsheng Sang, Jiancheng Lv

Abstract

As a written language used for thousands of years, Ancient Chinese has special characteristics such as complex semantics (for example, polysemy) and one-to-many alignment with Modern Chinese. An Ancient Chinese sentence may therefore be translated in many entirely different but equally correct ways. In the absence of multiple references, reference-dependent metrics such as Bilingual Evaluation Understudy (BLEU) cannot identify potentially correct translations. Research on automatic evaluation of Ancient-Modern Chinese translation is lacking. In this paper, we propose an automatic evaluation metric for Ancient-Modern Chinese translation called DTE (Dual-based Translation Evaluation), which can evaluate one-to-many alignment in the absence of multiple references. When evaluating with DTE, we found that proper nouns often could not be translated correctly. Hence, we designed a new word segmentation method that improves the translation of proper nouns without increasing the size of the model vocabulary. Experiments show that DTE outperforms several general evaluation metrics in terms of similarity to the judgments of human experts. Meanwhile, the new word segmentation method helps Ancient-Modern Chinese translation models translate proper nouns better and achieve higher scores on both BLEU and DTE.

Footnotes
1
In the Experiment section, Table 1 reports length statistics for the Ancient-Modern Chinese corpus. The augmented sentences average about a dozen words, and the original sentences are shorter.
 
2
To ensure an objective comparison, the English translations in all figures are literal translations without modification.
 
3
Elegance and delicacy is also an evaluation requirement in this theory. Because this criterion is highly subjective and our task is to translate narrative Ancient Chinese that records facts of Chinese history, we ignore it here.
 
4
The two sentences are the ancient input sentence and the retranslated sentence produced by the symmetrical Modern-Ancient Chinese translation model.
 
5
We compiled this special dictionary of personal names, place names, and other proper nouns common in ancient China; it contains about 6000 words.
 
6
Most translations of classical poems are subjective: different versions vary widely and require considerable additional explanation.
 
7
A clause is a fragment obtained by splitting a sentence at commas, semicolons, periods, exclamation marks, and question marks.
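The clause definition in this footnote can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `split_clauses` is hypothetical, and the choice to treat both full-width (Chinese) and ASCII punctuation as delimiters and to discard empty fragments is an assumption.

```python
import re

# Delimiters named in the footnote: commas, semicolons, periods,
# exclamation marks and question marks. Including both the ASCII and
# the full-width Chinese forms is an assumption of this sketch.
_CLAUSE_DELIMITERS = re.compile(r"[,;.!?，；。！？]")


def split_clauses(sentence):
    """Split a sentence into clauses at the delimiters above.

    Empty fragments (for example, after a trailing period) are dropped.
    """
    fragments = _CLAUSE_DELIMITERS.split(sentence)
    return [fragment.strip() for fragment in fragments if fragment.strip()]


if __name__ == "__main__":
    for clause in split_clauses("学而时习之，不亦说乎？有朋自远方来，不亦乐乎？"):
        print(clause)
```

Splitting on a single character class keeps the rule easy to audit against the footnote; a real pipeline might also need to handle quotation marks or the enumeration comma, which the footnote does not mention.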
 
9
The weights are determined by the position of each option in the ranking. For example, with three options to rank, the first position has a weight of 3, the second a weight of 2, and the third a weight of 1.
 
10
As with converting the human expert rankings into scores, we ranked the three candidate sentences from high to low under the automatic ranking method. The first candidate then receives 3 points, the second 2 points, and the third 1 point, which converts the ranking into discrete scores.
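The rank-to-score conversion described in footnotes 9 and 10 can be sketched as follows. The function names `rank_by_metric` and `ranks_to_scores` and the example metric values are hypothetical; the footnotes do not say how ties are handled, so this sketch simply keeps the input order for tied candidates.

```python
def rank_by_metric(metric_scores):
    """Order candidate ids from highest to lowest metric score.

    Ties keep their input order (an assumption; the footnotes do not
    specify tie handling).
    """
    return sorted(metric_scores, key=metric_scores.get, reverse=True)


def ranks_to_scores(ranking):
    """Convert an ordered ranking into discrete scores.

    With n candidates, the first position gets n points, the second
    n - 1, and so on down to 1 (3, 2, 1 for three candidates).
    """
    n = len(ranking)
    return {candidate: n - position for position, candidate in enumerate(ranking)}


if __name__ == "__main__":
    # Hypothetical metric outputs for three candidate translations.
    automatic = {"candidate_a": 0.2, "candidate_b": 0.9, "candidate_c": 0.5}
    print(ranks_to_scores(rank_by_metric(automatic)))
```

Mapping both the human rankings and the automatic rankings onto the same 3/2/1 scale makes the two directly comparable, which is the purpose the footnotes give for the conversion.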
 
Metadata
Title
An automatic evaluation metric for Ancient-Modern Chinese translation
Authors
Kexin Yang
Dayiheng Liu
Qian Qu
Yongsheng Sang
Jiancheng Lv
Publication date
04-08-2020
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 8/2021
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-020-05216-8
