Evaluating aggregated search using interleaving

Authors:
Aleksandr Chuklin (Yandex & ISLA, University of Amsterdam, Moscow, Russian Fed.)
Anne Schuth (ISLA, University of Amsterdam, Amsterdam, Netherlands)
Katja Hofmann (ISLA, University of Amsterdam, Amsterdam, Netherlands)
Pavel Serdyukov (Yandex, Moscow, Russian Fed.)
Maarten de Rijke (ISLA, University of Amsterdam, Amsterdam, Netherlands)

CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management, October 2013, Pages 669–678. https://doi.org/10.1145/2505515.2505698

Published: 27 October 2013


ABSTRACT

A result page of a modern web search engine is often much more complicated than a simple list of "ten blue links." In particular, a search engine may combine results from different sources (e.g., Web, News, and Images), and display these as grouped results to provide a better user experience. Such a system is called an aggregated or federated search system.

Because search engines evolve over time, their results need to be constantly evaluated. However, one of the most efficient and widely used evaluation methods, interleaving, cannot be directly applied to aggregated search systems, as it ignores the need to group results originating from the same source (vertical results).
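To make the grouping problem concrete, here is a minimal sketch of conventional team-draft interleaving (the method of Radlinski, Kurup, and Joachims), not the vertical-aware algorithm this paper proposes. The function names, document ids, and the two example rankings are hypothetical and chosen only for illustration. The helper at the end checks whether the vertical documents still form one contiguous block after interleaving.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, length=10, rng=None):
    """Conventional team-draft interleaving of two rankings.

    Returns (interleaved, teams), where teams[i] is "A" or "B", the
    ranker credited with contributing the document at position i.
    """
    rng = rng or random.Random()
    sources = {"A": ranking_a, "B": ranking_b}
    picks = {"A": 0, "B": 0}
    interleaved, teams, used = [], [], set()

    while len(interleaved) < length:
        remaining = {
            team: [doc for doc in docs if doc not in used]
            for team, docs in sources.items()
        }
        if not remaining["A"] and not remaining["B"]:
            break  # both rankings are exhausted
        # The team with fewer picks goes next; a coin flip breaks ties.
        if picks["A"] < picks["B"]:
            team = "A"
        elif picks["B"] < picks["A"]:
            team = "B"
        else:
            team = rng.choice(["A", "B"])
        if not remaining[team]:
            team = "B" if team == "A" else "A"
        doc = remaining[team][0]  # that team's highest-ranked unused document
        interleaved.append(doc)
        teams.append(team)
        used.add(doc)
        picks[team] += 1
    return interleaved, teams


def vertical_block_is_contiguous(result_list, vertical_docs):
    """True if all vertical documents occupy adjacent positions."""
    positions = [i for i, doc in enumerate(result_list) if doc in vertical_docs]
    return not positions or positions[-1] - positions[0] + 1 == len(positions)


# Hypothetical example: both rankers show the News block {n1, n2} as a
# group, but at different positions on the page.
ranking_a = ["w1", "n1", "n2", "w2", "w3"]
ranking_b = ["w1", "w2", "w3", "n1", "n2"]
news_block = {"n1", "n2"}
```

With these two inputs, each of which keeps the News block intact, some coin-flip sequences yield an interleaved list such as `w1, w2, n1, w3, n2`, in which the block is split by a Web result. This is the kind of outcome a vertical-aware method has to rule out.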

We propose an interleaving algorithm that allows comparisons of search engine result pages containing grouped vertical documents. We compare our algorithm to existing interleaving algorithms and other evaluation methods (such as A/B-testing), both on real-life click log data and in simulation experiments. We find that our algorithm allows us to perform unbiased and accurate interleaved comparisons that are comparable to conventional evaluation techniques. We also show that our interleaving algorithm produces a ranking that does not substantially alter the user experience, while being sensitive to changes in both the vertical result block and the non-vertical document rankings. All this makes our proposed interleaving algorithm an essential tool for comparing IR systems with complex aggregated pages.
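As a rough illustration of how an interleaved comparison turns clicks into a preference between two systems, the sketch below applies the standard team-draft credit rule: each click is credited to the ranker that contributed the clicked document, and the ranker with more credited clicks wins the impression. This is an assumption made for illustration, not the scoring used in the paper; the function names and the input layout (a team label per rank plus a list of clicked ranks) are hypothetical.

```python
from collections import Counter


def impression_outcome(teams, clicked_ranks):
    """Return "A", "B", or "tie" for a single interleaved impression.

    teams[i] is the ranker credited with position i; clicked_ranks holds
    the zero-based positions the user clicked.
    """
    credit = Counter(teams[rank] for rank in clicked_ranks)
    if credit["A"] > credit["B"]:
        return "A"
    if credit["B"] > credit["A"]:
        return "B"
    return "tie"


def aggregate_preference(impressions):
    """Average outcome over impressions, in [-1, 1].

    Positive values favor ranker A, negative values favor ranker B.
    impressions is an iterable of (teams, clicked_ranks) pairs.
    """
    outcomes = Counter(
        impression_outcome(teams, clicks) for teams, clicks in impressions
    )
    total = sum(outcomes.values())
    if total == 0:
        return 0.0
    return (outcomes["A"] - outcomes["B"]) / total
```

The sign of the aggregate gives the direction of the preference. Whether that direction agrees with an A/B test or an offline metric on the same pair of systems is the kind of agreement the abstract reports.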


Index Terms

1. Information systems
  1. Information retrieval

Published in
CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
October 2013
2612 pages
ISBN: 9781450322638
DOI: 10.1145/2505515
General Chairs: Qi He (LinkedIn, USA), Arun Iyengar (IBM T.J. Watson Research Center, USA)
Program Chairs: Wolfgang Nejdl (L3S Research Center, Germany), Jian Pei (Simon Fraser University, Canada), Rajeev Rastogi (Amazon, India)
Copyright © 2013 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
a/b-testing
evaluation
implicit feedback
vertical search
Qualifiers
- research-article
Conference

Acceptance Rates
CIKM '13 Paper Acceptance Rate: 143 of 848 submissions, 17%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%