2017 | OriginalPaper | Book chapter

ENTF: An Entropy-Based MT Evaluation Metric

Authors: Hui Yu, Weizhi Xu, Shouxun Lin, Qun Liu

Published in: Machine Translation

Publisher: Springer Singapore

Abstract

Widely used automatic evaluation metrics cannot adequately reflect the fluency of translations. N-gram-based metrics such as BLEU limit the maximum length of matched fragments to n and cannot capture matched fragments longer than n, so they reflect fluency only indirectly. METEOR, which is not limited by n-grams, uses the number of matched chunks but does not consider the length of each chunk. In this paper, we propose an entropy-based metric (ENTF), which can sufficiently reflect the fluency of translations through the distribution of matched words. To evaluate accuracy, we also introduce the unigram F-score into the new metric. Experiments show that ENTF obtains state-of-the-art performance at the system level and is comparable with METEOR at the sentence level for the into-English direction on WMT 2012, WMT 2013 and WMT 2014.
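The abstract does not give the ENTF formula, so the following Python sketch only illustrates the two ingredients it names: an entropy over how matched words are distributed across chunks, and a unigram F-score. It is not the authors' definition. The function names, the alignment format (a one-to-one list of `(hypothesis_position, reference_position)` pairs), and the choice of Shannon entropy over chunk shares are all assumptions made for illustration.

```python
import math
from typing import List, Tuple


def chunk_lengths(alignment: List[Tuple[int, int]]) -> List[int]:
    """Group matched words into chunks and return each chunk's length.

    A chunk is a maximal run of matched words that are adjacent in the
    hypothesis and mapped to adjacent positions in the reference.
    The (hyp_pos, ref_pos) pair format is an assumption of this sketch.
    """
    lengths: List[int] = []
    prev = None
    for hyp_pos, ref_pos in sorted(alignment):
        if prev is not None and hyp_pos == prev[0] + 1 and ref_pos == prev[1] + 1:
            lengths[-1] += 1      # extends the current chunk
        else:
            lengths.append(1)     # starts a new chunk
        prev = (hyp_pos, ref_pos)
    return lengths


def chunk_entropy(lengths: List[int]) -> float:
    """Shannon entropy (in bits) of the share of matched words per chunk.

    Illustrative assumption: 0.0 when all matches form one chunk,
    larger when matches are scattered over many short chunks.
    """
    total = sum(lengths)
    if total == 0:
        return 0.0
    return sum((n / total) * math.log2(total / n) for n in lengths)


def unigram_f_score(matches: int, hyp_len: int, ref_len: int) -> float:
    """Balanced F-score of unigram precision and recall."""
    if matches == 0 or hyp_len == 0 or ref_len == 0:
        return 0.0
    precision = matches / hyp_len
    recall = matches / ref_len
    return 2 * precision * recall / (precision + recall)


# Four matched words in the same order form one chunk; reversed order gives four.
in_order = chunk_lengths([(0, 0), (1, 1), (2, 2), (3, 3)])
reversed_order = chunk_lengths([(0, 3), (1, 2), (2, 1), (3, 0)])
```

Under these assumptions, both example hypotheses match all four reference words and so receive the same unigram F-score, while the entropy term separates the single contiguous chunk from the four scattered ones. How ENTF actually combines the two quantities is not stated in the abstract and is left open here.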

The words in each chunk are in adjacent positions in the hypothesis, and are also mapped to unigrams that are in adjacent positions in the reference.

http://www.cs.cmu.edu/~alavie/METEOR/.

http://wordnet.princeton.edu/.

ftp://jaguar.ncsl.nist.gov/mt/resources/mteval-v13a.pl.

http://www.cs.umd.edu/~snover/tercom.

http://www.cs.cmu.edu/~alavie/METEOR/download/meteor-1.4.tgz.

Title: ENTF: An Entropy-Based MT Evaluation Metric
Authors: Hui Yu, Weizhi Xu, Shouxun Lin, Qun Liu
Publisher: Springer Singapore
Book: Machine Translation
Print ISBN: 978-981-10-7133-1
Electronic ISBN: 978-981-10-7134-8
Copyright year: 2017
DOI: https://doi.org/10.1007/978-981-10-7134-8_7
