2013 | OriginalPaper | Buchkapitel
A Language Modeling Approach for Extracting Translation Knowledge from Comparable Corpora
verfasst von : Razieh Rahimi, Azadeh Shakery
Erschienen in: Advances in Information Retrieval
Verlag: Springer Berlin Heidelberg
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
A main challenge in
Cross-Language
information retrieval is to estimate a translation language model, as its quality directly affects the retrieval performance. The translation language model is built using translation resources such as bilingual dictionaries, parallel corpora, or comparable corpora. In general, high quality resources may not be available for scarce-resource languages. For these languages, efficient exploitation of commonly available resources such as
comparable corpora
is considered more crucial. In this paper, we focus on using only comparable corpora to extract translation information more efficiently. We propose a
language modeling
approach for estimating the translation language model. The proposed method is based on probability distribution estimation, and can be tuned easier in comparison with heuristically adjusted previous work. Experiment results show a significant improvement in the translation quality and CLIR performance compared to the previous approaches.