research-article

Learning to rank for freshness and relevance

Authors:
Na Dai, Lehigh University, Bethlehem, PA, USA
Milad Shokouhi, Microsoft Research, Cambridge, United Kingdom
Brian D. Davison, Lehigh University, Bethlehem, PA, USA

SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, July 2011, Pages 95–104. https://doi.org/10.1145/2009916.2009933

Published: 24 July 2011

ABSTRACT

Freshness of results is important in modern web search. Failing to recognize the temporal aspect of a query can negatively affect the user experience and make the search engine appear stale. While freshness and relevance can be closely related for some topics (e.g., news queries), they are more independent in others (e.g., time-insensitive queries). Therefore, optimizing one criterion does not necessarily improve the other, and can even do harm in some cases.

We propose a machine-learning framework for simultaneously optimizing freshness and relevance, in which the trade-off is automatically adaptive to query temporal characteristics. We start by illustrating different temporal characteristics of queries, and the features that can be used for capturing these properties. We then introduce our supervised framework that leverages the temporal profile of queries (inferred from pseudo-feedback documents) along with the other ranking features to improve both freshness and relevance of search results. Our experiments on a large archival web corpus demonstrate the efficacy of our techniques.
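As a rough illustration of the idea of a query-adaptive trade-off, the sketch below builds a temporal profile from the timestamps of pseudo-feedback documents and uses it to weight freshness against relevance. This is not the authors' framework, which is supervised and learned; the function names, the histogram profile, the "mass in the most recent bin" weight, and the linear score combination are all assumptions made here for illustration only.

```python
from typing import List, Sequence, Tuple


def temporal_profile(doc_times: Sequence[float], start: float,
                     end: float, bins: int) -> List[float]:
    """Normalized histogram of pseudo-feedback document timestamps
    over the window [start, end]. (Assumed representation.)"""
    counts = [0] * bins
    width = (end - start) / bins
    for t in doc_times:
        if start <= t <= end:
            # Clamp so that t == end falls into the last bin.
            idx = min(int((t - start) / width), bins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        # No temporal evidence: fall back to a uniform profile.
        return [1.0 / bins] * bins
    return [c / total for c in counts]


def freshness_weight(profile: Sequence[float]) -> float:
    """Hypothetical heuristic: the share of feedback documents in the
    most recent bin, used as a trade-off weight in [0, 1]."""
    return profile[-1]


def rerank(docs: Sequence[Tuple[str, float, float]],
           weight: float) -> List[Tuple[str, float]]:
    """Rank (doc_id, relevance, freshness) triples by a linear mix of
    the two scores; both scores are assumed to lie in [0, 1]."""
    scored = [(doc_id, (1 - weight) * rel + weight * fresh)
              for doc_id, rel, fresh in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Most feedback documents are recent, so freshness gets more weight.
    profile = temporal_profile([9.0, 9.5, 9.9, 1.0], start=0.0, end=10.0, bins=2)
    w = freshness_weight(profile)
    docs = [("old_page", 0.9, 0.1), ("new_page", 0.6, 0.9)]
    print(profile, w, rerank(docs, w))
```

With a weight of 0 the ranking reduces to relevance alone, which is the behaviour one would want for a time-insensitive query; a learned model would replace the hand-set weight used here.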

Index Terms

1. Information systems
  1. Information retrieval

Published in
SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
July 2011
1374 pages
ISBN: 9781450307574
DOI: 10.1145/2009916
General Chairs: Wei-Ying Ma (Microsoft Research Asia, China), Jian-Yun Nie (University of Montreal, Canada)
Program Chairs: Ricardo Baeza-Yates (Yahoo! Research, Spain), Tat-Seng Chua (National University of Singapore), W. Bruce Croft (University of Massachusetts, Amherst, USA)
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
freshness ranking
query classification
temporal profiles
Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%