Combining resources with confidence measures for cross language information retrieval

Authors:
Youssef Kadri, Université de Montréal, Montreal, PQ, Canada
Jian-Yun Nie, Université de Montréal, Montreal, PQ, Canada

PIKM '07: Proceedings of the ACM first Ph.D. workshop in CIKM, November 2007, Pages 131–138. https://doi.org/10.1145/1316874.1316896

Published: 9 November 2007

ABSTRACT

Query translation in Cross Language Information Retrieval (CLIR) can draw on multiple resources. Previous attempts to combine different translation resources rely on simple methods such as linear combination. Unfortunately, these approaches are insufficient for combining resources of different types, such as bilingual dictionaries and statistical translation models. In this paper, we use confidence measures to perform this combination for English-Arabic CLIR. A confidence measure adjusts the original scores of the translations and produces weights of the same nature for translations drawn from different resources. We tested this technique on two CLIR test collections from TREC and obtained encouraging improvements over linear combination.
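
The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not describe the confidence estimator, so `toy_confidence` is a hypothetical logistic stand-in, and the candidate labels, scores, weights, and parameters are all made up for illustration. The first function is the linear-combination baseline; the second maps each resource's raw score to a confidence value in (0, 1) before merging, so that both resources contribute weights on a common scale.

```python
import math
from collections import defaultdict


def linear_combination(resources, weights):
    """Baseline: weighted sum of raw per-resource scores, renormalized to sum to 1."""
    combined = defaultdict(float)
    for name, candidates in resources.items():
        for term, score in candidates.items():
            combined[term] += weights[name] * score
    total = sum(combined.values())
    return {t: s / total for t, s in combined.items()} if total else {}


def toy_confidence(score, bias, slope):
    """Hypothetical confidence estimator (assumption): a logistic function of the raw score."""
    return 1.0 / (1.0 + math.exp(-(bias + slope * score)))


def confidence_combination(resources, params):
    """Replace each raw score by a confidence value, then merge and renormalize."""
    combined = defaultdict(float)
    for name, candidates in resources.items():
        bias, slope = params[name]
        for term, score in candidates.items():
            combined[term] += toy_confidence(score, bias, slope)
    total = sum(combined.values())
    return {t: s / total for t, s in combined.items()} if total else {}


# Illustrative candidates for one query word (placeholder labels, made-up scores).
resources = {
    "dictionary": {"cand_a": 0.5, "cand_b": 0.5},  # uniform scores from a dictionary
    "stm": {"cand_a": 0.7, "cand_c": 0.3},         # translation-model probabilities
}

lin = linear_combination(resources, {"dictionary": 0.4, "stm": 0.6})
conf = confidence_combination(resources, {"dictionary": (0.0, 1.0), "stm": (-1.0, 3.0)})
```

In a real system the per-resource parameters would presumably be learned from held-out data rather than fixed by hand; the point of the sketch is only that the confidence step puts heterogeneous scores on one scale before they are merged.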

Index Terms

Combining resources with confidence measures for cross language information retrieval
1. Information systems
  1. Information retrieval
    1. Information retrieval query processing
    2. Retrieval models and ranking

Published in
PIKM '07: Proceedings of the ACM first Ph.D. workshop in CIKM
November 2007
184 pages
ISBN:9781595938329
DOI:10.1145/1316874
General Chairs:
Aparna Varde, Virginia State University, USA
Jian Pei, Simon Fraser University, Canada
Copyright © 2007 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
CLIR
confidence measures
linear combination
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 25 of 62 submissions, 40%