DOI: 10.1145/2505515.2505698
Open Access

Evaluating aggregated search using interleaving

Published: 27 October 2013

ABSTRACT

A result page of a modern web search engine is often much more complicated than a simple list of "ten blue links." In particular, a search engine may combine results from different sources (e.g., Web, News, and Images), and display these as grouped results to provide a better user experience. Such a system is called an aggregated or federated search system.

Because search engines evolve over time, their results need to be constantly evaluated. However, one of the most efficient and widely used evaluation methods, interleaving, cannot be directly applied to aggregated search systems, as it ignores the need to group results originating from the same source (vertical results).
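
To make the problem concrete, here is a minimal illustrative sketch of the widely used team-draft form of interleaving. It is not taken from this paper, and all names in it are illustrative: it simply shows why a per-document merge of two rankings can scatter results that should stay grouped.

```python
# Minimal illustrative sketch of team-draft interleaving (assumed standard
# formulation, not this paper's algorithm). Two rankers take turns picking
# their best not-yet-shown document; a coin flip breaks ties in turn order.
import random

def team_draft_interleave(ranking_a, ranking_b, length=10):
    combined, team_a, team_b = [], [], []
    while len(combined) < length:
        a_next = next((d for d in ranking_a if d not in combined), None)
        b_next = next((d for d in ranking_b if d not in combined), None)
        if a_next is None and b_next is None:
            break  # both rankings exhausted
        # A picks if it has contributed fewer documents so far, or wins the coin flip.
        a_picks = (len(team_a) < len(team_b)
                   or (len(team_a) == len(team_b) and random.random() < 0.5))
        if a_picks and a_next is not None:
            combined.append(a_next)
            team_a.append(a_next)
        elif b_next is not None:
            combined.append(b_next)
            team_b.append(b_next)
        else:
            combined.append(a_next)
            team_a.append(a_next)
    return combined, team_a, team_b

# Clicks on documents in team_a vs. team_b decide which ranker wins the
# comparison. Nothing here keeps documents from the same vertical source
# (e.g. a News block) adjacent, so a vertical block can be torn apart.
```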

We propose an interleaving algorithm that allows comparisons of search engine result pages containing grouped vertical documents. We compare our algorithm to existing interleaving algorithms and other evaluation methods (such as A/B-testing), both on real-life click log data and in simulation experiments. We find that our algorithm allows us to perform unbiased and accurate interleaved comparisons that are comparable to conventional evaluation techniques. We also show that our interleaving algorithm produces a ranking that does not substantially alter the user experience, while being sensitive to changes in both the vertical result block and the non-vertical document rankings. All this makes our proposed interleaving algorithm an essential tool for comparing IR systems with complex aggregated pages.
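
The paper's own algorithm is not reproduced here. As a purely hypothetical sketch of the constraint such an algorithm must satisfy, one option is to draft each vertical block as an atomic unit and expand it into its member documents only afterwards; the VerticalBlock type and interleave_with_blocks function below are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch (NOT the paper's algorithm): draft whole vertical
# blocks as single atomic items so grouped results stay grouped, then
# expand them for display. Assumes both rankers agree on a block's contents
# and differ only in where they place it.
from dataclasses import dataclass
import random

@dataclass(frozen=True)
class VerticalBlock:
    source: str   # e.g. "News" or "Images"
    docs: tuple   # documents shown inside the block

def _team_draft(ranking_a, ranking_b, max_items):
    """Plain team draft over opaque items (single documents or blocks)."""
    combined, team_a, team_b = [], [], []
    while len(combined) < max_items:
        a_next = next((i for i in ranking_a if i not in combined), None)
        b_next = next((i for i in ranking_b if i not in combined), None)
        if a_next is None and b_next is None:
            break
        a_picks = (len(team_a) < len(team_b)
                   or (len(team_a) == len(team_b) and random.random() < 0.5))
        if a_picks and a_next is not None:
            combined.append(a_next)
            team_a.append(a_next)
        elif b_next is not None:
            combined.append(b_next)
            team_b.append(b_next)
        else:
            combined.append(a_next)
            team_a.append(a_next)
    return combined, team_a, team_b

def interleave_with_blocks(ranking_a, ranking_b, max_items=10):
    combined, team_a, team_b = _team_draft(ranking_a, ranking_b, max_items)
    # Expand atomic blocks into their member documents for the final page.
    page = []
    for item in combined:
        page.extend(item.docs if isinstance(item, VerticalBlock) else [item])
    return page, team_a, team_b

# Example usage with hypothetical identifiers:
# news = VerticalBlock("News", ("n1", "n2", "n3"))
# page, a, b = interleave_with_blocks(["d1", news, "d2"], [news, "d3", "d1"])
```

In this sketch, a click anywhere inside a block credits whichever team drafted the block; that is only one of several possible credit-assignment choices.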


Published in

CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
October 2013
2612 pages
ISBN: 9781450322638
DOI: 10.1145/2505515

      Copyright © 2013 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 27 October 2013

      Qualifiers

      • research-article

      Acceptance Rates

CIKM '13 Paper Acceptance Rate: 143 of 848 submissions, 17%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
