2017 | OriginalPaper | Book chapter

ENTF: An Entropy-Based MT Evaluation Metric

Authors: Hui Yu, Weizhi Xu, Shouxun Lin, Qun Liu

Published in: Machine Translation

Publisher: Springer Singapore

Abstract

Widely used automatic evaluation metrics do not adequately reflect the fluency of translations. N-gram-based metrics such as BLEU cap the length of matched fragments at n and cannot capture matched fragments longer than n, so they reflect fluency only indirectly. METEOR, which is not limited by n-gram length, uses the number of matched chunks but does not consider the length of each chunk. In this paper, we propose an entropy-based metric (ENTF), which reflects the fluency of translations through the distribution of matched words. To evaluate accuracy, we also incorporate the unigram F-score into the new metric. Experiments show that ENTF obtains state-of-the-art performance at the system level and is comparable with METEOR at the sentence level for the into-English direction on WMT 2012, WMT 2013 and WMT 2014.
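
The abstract does not give the ENTF formula, so the following Python sketch only illustrates the two ingredients it names: a unigram F-score and an entropy over how matched words are distributed across contiguous chunks. Everything specific here is an assumption for illustration, not the authors' definition: the function names, the use of `difflib.SequenceMatcher` as a stand-in word aligner, the balanced harmonic mean, and the choice of chunk-length shares as the entropy's probability distribution.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def unigram_f_score(hyp, ref):
    """Balanced harmonic mean of clipped unigram precision and recall."""
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def matched_chunk_lengths(hyp, ref):
    """Lengths of contiguous matched runs between two token lists.

    Assumption: difflib's monotone matching stands in for a real aligner.
    """
    matcher = SequenceMatcher(None, hyp, ref, autojunk=False)
    return [b.size for b in matcher.get_matching_blocks() if b.size > 0]


def chunk_entropy(lengths):
    """Shannon entropy of the share of matched words falling in each chunk.

    Assumption: p_i = length_i / total matched words.
    """
    total = sum(lengths)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in lengths)


hyp = "a b c d".split()
ref = "a b x c d".split()
lengths = matched_chunk_lengths(hyp, ref)
print(lengths, chunk_entropy(lengths), unigram_f_score(hyp, ref))
```

Under this assumed definition, a single long matched chunk gives entropy 0, while the same number of matched words scattered over many short chunks gives a higher value, which is the kind of chunk-length sensitivity that a plain chunk count lacks.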

Metadata
Title
ENTF: An Entropy-Based MT Evaluation Metric
Authors
Hui Yu
Weizhi Xu
Shouxun Lin
Qun Liu
Copyright year
2017
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-7134-8_7