2017 | OriginalPaper | Book chapter

RGraph: Generating Reference Graphs for Better Machine Translation Evaluation

Authors: Hongjie Ji, Shujian Huang, Qi Hou, Cunyan Yin, Jiajun Chen

Published in: Machine Translation

Publisher: Springer Singapore

Abstract

Statistical machine translation systems perform parameter learning (i.e., training) based on automatic translation evaluation methods, which usually evaluate translation quality against one or more human-translated references. Although producing more references would improve the coverage of translation choices and lead to better training performance, only a few references are typically used because of the cost of human translation. In this paper, we propose automatic methods to explore the information shared among the limited references. By generating a reference graph (RGraph) from the given references, we can automatically generate an exponential number of references. These diverse references make it possible to better evaluate each individual translation without using any other resources. Experiments showed that RGraph could improve evaluation performance and lead to better-tuned machine translation systems. The method could also be extended to improve evaluation with a single reference.
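
The abstract does not say how the reference graph is built, so the following is only a toy sketch of the general idea, not the authors' method. It assumes, hypothetically, that references are merged wherever they share an identical word, and that the resulting graph has no cycles (no word repeats within a reference). The names `build_graph` and `enumerate_paths` and the example sentences are invented for illustration.

```python
from collections import defaultdict


def build_graph(references):
    """Merge whitespace-tokenized references into one directed word graph.

    Assumption (not from the paper): identical words are treated as the
    same node, so references join wherever they share a word.
    """
    edges = defaultdict(set)
    for ref in references:
        tokens = ["<s>"] + ref.split() + ["</s>"]
        for current, following in zip(tokens, tokens[1:]):
            edges[current].add(following)
    return edges


def enumerate_paths(edges, node="<s>", max_len=50):
    """Yield every word sequence leading from `node` to the end marker.

    `max_len` bounds the recursion depth so that an accidental cycle
    cannot cause infinite recursion.
    """
    if node == "</s>":
        yield []
        return
    if max_len == 0:
        return
    for following in sorted(edges[node]):
        for tail in enumerate_paths(edges, following, max_len - 1):
            head = [] if following == "</s>" else [following]
            yield head + tail


references = [
    "he quickly left the room",
    "he rapidly left the building",
]
graph = build_graph(references)
generated = [" ".join(path) for path in enumerate_paths(graph)]

for sentence in generated:
    print(sentence)
print(len(generated))  # 4
```

In this toy case, two references that diverge at two independent points yield four paths, including two sentences that no human wrote. Each additional independent divergence point multiplies the number of paths, which is one way to read the abstract's claim of an exponential number of generated references.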

Footnotes
1. Including LDC2002E18, LDC2003E14, LDC2004E12, LDC2004T08, LDC2005T10, LDC2007T09.

Metadata
Title
RGraph: Generating Reference Graphs for Better Machine Translation Evaluation
Authors
Hongjie Ji
Shujian Huang
Qi Hou
Cunyan Yin
Jiajun Chen
Copyright year
2017
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-7134-8_6