ABSTRACT
Most information retrieval evaluation metrics are designed to measure user satisfaction with the results returned by a search engine. To do so, most of these metrics rely on an underlying user model, which aims to capture how users interact with search engine results. Hence, the quality of an evaluation metric is a direct function of the quality of its underlying user model. This paper proposes EBU, a new evaluation metric that uses a sophisticated user model tuned on observations from many thousands of real search sessions. We compare EBU with a number of state-of-the-art evaluation metrics and show that it correlates more strongly with real user behavior as captured by clicks.
Index Terms
- Expected browsing utility for web search evaluation