ABSTRACT
This work investigates the use of machine learning approaches for confidence estimation within a statistical machine translation application. Specifically, we attempt to learn probabilities of correctness for various model predictions, based on the native probabilities (i.e., the probabilities given by the original model) and on features of the current context. Our experiments were conducted using three original translation models and two types of neural nets (single-layer and multilayer perceptrons) for the confidence estimation task.
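As a rough illustration of the idea, the sketch below trains a single-layer perceptron with a sigmoid output to map a native probability and one context feature to a probability of correctness. It is not the authors' implementation: the data are synthetic, the feature `context`, the functions `train_slp` and `confidence`, and all hyperparameters are assumptions made for this example only. A multilayer perceptron would insert a hidden layer between the inputs and the sigmoid output.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_slp(X, y, lr=0.5, epochs=2000):
    """Fit a single-layer perceptron (sigmoid output) by batch
    gradient descent on the cross-entropy loss."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        grad = p - y                  # d(loss)/d(pre-activation)
        w -= lr * (X.T @ grad) / n
        b -= lr * grad.mean()
    return w, b

def confidence(X, w, b):
    """Estimated probability that each prediction is correct."""
    return sigmoid(X @ w + b)

# Synthetic toy data (an assumption, not taken from the paper).
rng = np.random.default_rng(0)
n = 2000
native_p = rng.uniform(0.0, 1.0, n)   # probability from the base model
context = rng.uniform(0.0, 1.0, n)    # hypothetical context feature

# Assumed ground truth: the base model is overconfident, and the
# context feature modulates how often its predictions are correct.
true_p = native_p ** 2 * (0.5 + 0.5 * context)
y = (rng.uniform(0.0, 1.0, n) < true_p).astype(float)

X = np.column_stack([native_p, context])
w, b = train_slp(X, y)
conf = confidence(X, w, b)
```

Here `conf` holds one estimate per prediction; thresholding it would give an accept/reject decision, with the threshold chosen for the application.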