research-article

Diversifying search results

Authors:
Rakesh Agrawal

Search Labs, Microsoft Research

Search Labs, Microsoft Research
View Profile

,
Sreenivas Gollapudi

Search Labs, Microsoft Research

Search Labs, Microsoft Research
View Profile

,
Alan Halverson

Search Labs, Microsoft Research

Search Labs, Microsoft Research
View Profile

,
Samuel Ieong

Search Labs, Microsoft Research

Search Labs, Microsoft Research
View Profile

WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data MiningFebruary 2009Pages 5–14https://doi.org/10.1145/1498759.1498766

Published:09 February 2009Publication History

WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data Mining

Pages 5–14

ABSTRACT

We study the problem of answering ambiguous web queries in a setting where there exists a taxonomy of information, and that both queries and documents may belong to more than one category according to this taxonomy. We present a systematic approach to diversifying results that aims to minimize the risk of dissatisfaction of the average user. We propose an algorithm that well approximates this objective in general, and is provably optimal for a natural special case. Furthermore, we generalize several classical IR metrics, including NDCG, MRR, and MAP, to explicitly account for the value of diversification. We demonstrate empirically that our algorithm scores higher in these generalized metrics compared to results produced by commercial search engines.

References

Kannan Achan, Ariel Fuxman, Panayiotis Tsaparas, and Rakesh Agrawal. Using the wisdom of the crowds for keyword generation. In WWW, pages 1--8, 2008.Google Scholar
Aris Anagnostopoulos, Andrei Z. Broder, and David Carmel. Sampling search-engine results. In WWW, pages 245--256, 2005. Google ScholarDigital Library
A. Bookstein. Information retrieval: A sequential learning process. Journal of the American Society for Information Sciences (ASIS), 34(5):331--342, 1983.Google Scholar
B. Boyce. Beyond topicality: A two stage view of relevance and the retrieval process. Info. Processing and Management, 18(3):105--109, 1982.Google ScholarCross Ref
Jaime G. Carbonell and Jade Goldstein. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In SIGIR, pages 335--336, 1998. Google ScholarDigital Library
Harr Chen and David R. Karger. Less is more: probabilistic models for retrieving fewer relevant documents. In SIGIR, pages 429--436, 2006. Google ScholarDigital Library
Charles L. A. Clarke, Maheedhar Kolla, Gordon V. Cormack, Olga Vechtomova, Azin Ashkan, Stefan Büttcher, and Ian MacKinnon. Novelty and diversity in information retrieval evaluation. In SIGIR, pages 659--666, 2008. Google ScholarDigital Library
W. Goffman. A searching procedure for information retrieval. Info. Storage and Retrieval, 2:73--78, 1964.Google ScholarCross Ref
Dorit Hochbaum, editor. Approximation Algorithms for NP-Hard problems. Springer, 1999. Google ScholarDigital Library
K. Jarvelin and J. Kekalainen. IR evaluation methods for retrieving highly relevant documents. In SIGIR, pages 41--48, 2000. Google ScholarDigital Library
Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval. Cambridge Univ Press, 2008. Google ScholarDigital Library
G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of the approximations for maximizing submodular set functions. Math. Programming, 14:265--294, 1978.Google ScholarDigital Library
F. Radlinski, R. Kleinberg, and T. Joachims. Learning diverse rankings with multi-armed bandits. In ICML, 2008. First presented at NIPS07 Workshop on Machine Learning for Web Search. Google ScholarDigital Library
Filip Radlinski and Susan T. Dumais. Improving personalized web search using result diversification. In SIGIR, pages 691--692, 2006. Google ScholarDigital Library
Erik Vee, Utkarsh Srivastava, Jayavel Shanmugasundaram, Prashant Bhat, and Sihem Amer-Yahia. Efficient computation of diverse query results. In ICDE, pages 228--236, 2008. Google ScholarDigital Library
E. M. Voorhees. Overview of the trec 2004 robust retrieval track. In TREC, 2004.Google Scholar
Yunjie Xu and Hainan Yin. Novelty and topicality in interactive information retrieval. J. Am. Soc. Inf. Sci. Technol., 59(2):201--215, 2008. Google ScholarDigital Library
ChengXiang Zhai. Risk Minimization and Language Modeling in Information Retrieval. PhD thesis, Carnegie Mellon University, 2002.Google Scholar
ChengXiang Zhai, William W. Cohen, and John D. Lafferty. Beyond independent relevance: methods and evaluation metrics for subtopic retrieval. In SIGIR, pages 10--17, 2003. Google ScholarDigital Library
ChengXiang Zhai and John D. Lafferty. A risk minimzation framework for information retrieval. Info. Processing and Management, 42(1):31--55, 2006. Google ScholarDigital Library
Cai-Nicolas Ziegler, Sean M. McNee, Joseph A. Konstan, and Georg Lausen. Improving recommendation lists through topic diversification. In WWW, pages 22--32, 2005. Google ScholarDigital Library

Index Terms

Diversifying search results
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Recommendations

Intent-based diversification of web search results: metrics and algorithms

We study the problem of web search result diversification in the case where intent based relevance scores are available. A diversified search result will hopefully satisfy the information need of user-L.s who may have different intents. In this context, ...
Read More
Diversifying product search results
SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval

In recent years, online shopping is becoming more and more popular. Users type keyword queries on product search systems to find relevant products, accessories, and even related products. However, existing product search systems always return very ...
Read More
Post-ranking query suggestion by diversifying search results
SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval

Query suggestion refers to the process of suggesting related queries to search engine users. Most existing researches have focused on improving the relevance of suggested queries. In this paper, we introduce the concept of diversifying the content of ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data Mining
February 2009
314 pages
ISBN:9781605583907
DOI:10.1145/1498759
Editors:
Ricardo Baeza-Yates
Yahoo! Research, Spain
,
Paolo Boldi
Universita degli Studi di Milano, Italy
,
Berthier Ribeiro-Neto
Google Engineering, Brazil & CS Dept., Univ. Fed. de Minas Gerais, Brazil
,
B. Barla Cambazoglu
Yahoo! Research
Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 9 February 2009
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
marginal utility
relevance
result diversification
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate498of2,863submissions,17%
Upcoming Conference
WSDM '25

Sponsor:

sigir

sigir

sigir

sigir

The Eighteenth ACM International Conference on Web Search and Data Mining

April 7 - 11, 2025

Hannover , Germany
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 642
  Total Citations
  View Citations
- 3,519
  Total Downloads
- Downloads (Last 12 months)117
- Downloads (Last 6 weeks)10
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Diversifying search results

WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data Mining

ABSTRACT

References

Cited By

Index Terms

Recommendations

Intent-based diversification of web search results: metrics and algorithms

Diversifying product search results

Post-ranking query suggestion by diversifying search results