ABSTRACT
We systematically compare five representative state-of-the-art methods for estimating query language models with pseudo feedback in ad hoc information retrieval, including two variants of the relevance language model, two variants of the mixture feedback model, and the divergence minimization estimation method. Our experimental results show that a variant of the relevance model and a variant of the mixture model tend to outperform the other methods. We further propose several heuristics that are intuitively related to the good retrieval performance of an estimation method, and show that the variations in how these heuristics are implemented in different methods provide a good explanation of many empirical observations.
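To make the compared techniques concrete, the mixture feedback model can be illustrated with a minimal sketch: feedback documents are modeled as a two-component mixture of an unknown topic model θ_F and the fixed collection background model p(w|C), and θ_F is estimated with EM. This is a simplified illustration of the general approach, not the paper's exact implementation; the function name, the mixing weight λ, and the iteration count are assumptions chosen for clarity.

```python
from collections import Counter

def mixture_feedback_model(feedback_docs, collection_lm, lam=0.5, iters=30):
    """Estimate a feedback topic model theta_F from pseudo-feedback documents.

    Each word occurrence is assumed drawn from
        lam * theta_F(w) + (1 - lam) * p(w|C),
    and theta_F is fit by EM.  `feedback_docs` is a list of token lists;
    `collection_lm` maps words to background probabilities p(w|C).
    """
    counts = Counter()
    for doc in feedback_docs:
        counts.update(doc)
    vocab = list(counts)
    # Initialize theta_F uniformly over the feedback vocabulary.
    theta = {w: 1.0 / len(vocab) for w in vocab}
    for _ in range(iters):
        # E-step: posterior probability that an occurrence of w was
        # generated by the topic component rather than the background.
        t = {
            w: lam * theta[w]
            / (lam * theta[w] + (1 - lam) * collection_lm.get(w, 1e-9))
            for w in vocab
        }
        # M-step: re-estimate theta_F from the topic-attributed counts.
        norm = sum(counts[w] * t[w] for w in vocab)
        theta = {w: counts[w] * t[w] / norm for w in vocab}
    return theta
```

In use, the estimated θ_F would then be interpolated with the original query model before scoring documents; words that the background model already explains well (e.g. stopwords) receive little mass in θ_F, which is the intuition behind this family of methods.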