
2015 | Original Paper | Book Chapter

Quality Estimation of MT-Engine Output Using Language Models for Post-editing and Their Comparative Study

Authors: Kuldeep Kumar Yogi, Nishith Joshi, Chandra Kumar Jha

Published in: Information Systems Design and Intelligent Applications

Publisher: Springer India


Abstract

Machine Translation (MT) systems lack real-world knowledge and contextual awareness. MT errors can occur at any level: lexical, grammatical, syntactic, and so on. MT systems produce output that is 10–70 % accurate, so human post-editing (HPE) is required to produce the final output. But HPE is expensive and slow. If we can filter out the good translations, that is, those that can be corrected with minor edits, HPE becomes faster and less expensive. We can estimate the quality of a sentence using a language model (LM). Several LMs are available. Our experiments show that an LM with Kneser-Ney smoothing is the right choice for measuring the quality of MT-engine output for post-editing.
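
The filtering idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the n-gram order, toolkit, training corpus, or acceptance threshold the authors used, so everything below is an assumption made for illustration: a bigram model with interpolated Kneser-Ney smoothing, a fixed discount of 0.75, a small probability floor for words never seen in training (which means the distribution is not strictly normalized for such words), a toy corpus, and the hypothetical names `KneserNeyBigramLM` and `filter_for_post_editing`. A sentence is scored by its average log-probability per token, and candidates at or above a threshold are kept for post-editing.

```python
import math
from collections import Counter, defaultdict


class KneserNeyBigramLM:
    """Bigram LM with interpolated Kneser-Ney smoothing (illustrative sketch)."""

    def __init__(self, sentences, discount=0.75, floor=1e-6):
        self.discount = discount  # assumed fixed absolute discount
        self.floor = floor        # assumed floor for out-of-vocabulary words
        self.bigram_counts = Counter()
        self.context_counts = Counter()
        followers = defaultdict(set)   # distinct words seen after a context
        preceders = defaultdict(set)   # distinct contexts seen before a word
        for tokens in sentences:
            padded = ["<s>"] + tokens + ["</s>"]
            for prev, word in zip(padded, padded[1:]):
                self.bigram_counts[(prev, word)] += 1
                self.context_counts[prev] += 1
                followers[prev].add(word)
                preceders[word].add(prev)
        self.follower_types = {w: len(s) for w, s in followers.items()}
        self.preceder_types = {w: len(s) for w, s in preceders.items()}
        self.bigram_types = len(self.bigram_counts)

    def prob(self, prev, word):
        # Continuation probability: share of distinct bigram types ending in `word`.
        p_cont = self.preceder_types.get(word, 0) / self.bigram_types
        c_prev = self.context_counts.get(prev, 0)
        if c_prev == 0:
            # Unseen context: rely on the continuation probability alone.
            p = p_cont
        else:
            count = self.bigram_counts.get((prev, word), 0)
            discounted = max(count - self.discount, 0) / c_prev
            # Mass removed by discounting is redistributed via p_cont.
            backoff_weight = self.discount * self.follower_types[prev] / c_prev
            p = discounted + backoff_weight * p_cont
        return max(p, self.floor)

    def avg_log_prob(self, tokens):
        # Length-normalized score so long and short sentences are comparable.
        padded = ["<s>"] + tokens + ["</s>"]
        total = sum(math.log(self.prob(p, w)) for p, w in zip(padded, padded[1:]))
        return total / (len(padded) - 1)


def filter_for_post_editing(lm, candidates, threshold):
    """Keep candidate translations whose LM score is at or above the threshold."""
    return [c for c in candidates if lm.avg_log_prob(c) >= threshold]


# Toy target-language corpus (made up for this example).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a cat saw the dog".split(),
]
lm = KneserNeyBigramLM(corpus)

fluent = "the cat sat on the rug".split()
scrambled = "rug the on sat cat the".split()
print(lm.avg_log_prob(fluent), lm.avg_log_prob(scrambled))
```

In practice the threshold would have to be tuned on held-out data with human quality judgments, and a higher-order model trained on a large target-language corpus would replace the toy bigram model shown here.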

Metadata
Title
Quality Estimation of MT-Engine Output Using Language Models for Post-editing and Their Comparative Study
Authors
Kuldeep Kumar Yogi
Nishith Joshi
Chandra Kumar Jha
Copyright year
2015
Publisher
Springer India
DOI
https://doi.org/10.1007/978-81-322-2247-7_52
