ABSTRACT
Freshness of results is important in modern web search. Failing to recognize the temporal aspect of a query can negatively affect the user experience and make the search engine appear stale. While freshness and relevance can be closely related for some topics (e.g., news queries), they are more independent for others (e.g., time-insensitive queries). Therefore, optimizing one criterion does not necessarily improve the other, and can even do harm in some cases.
We propose a machine-learning framework for simultaneously optimizing freshness and relevance, in which the trade-off is automatically adaptive to query temporal characteristics. We start by illustrating different temporal characteristics of queries, and the features that can be used for capturing these properties. We then introduce our supervised framework that leverages the temporal profile of queries (inferred from pseudo-feedback documents) along with the other ranking features to improve both freshness and relevance of search results. Our experiments on a large archival web corpus demonstrate the efficacy of our techniques.
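The idea of a query-adaptive trade-off can be illustrated with a minimal sketch. It is not the supervised framework the abstract describes: it replaces the learned model with a hand-set linear interpolation, and every name in it (`Doc`, `freshness_weight`, `freshness`, `rank`, the 30-day thresholds) is a hypothetical choice made for illustration. The temporal profile is approximated here by the share of pseudo-feedback documents that were published recently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    doc_id: str
    relevance: float  # hypothetical relevance score in [0, 1]
    age_days: float   # days since publication at query time


def freshness_weight(feedback_ages: List[float], recent_days: float = 30.0) -> float:
    """Assumed temporal profile: share of pseudo-feedback documents that are recent."""
    if not feedback_ages:
        return 0.0
    recent = sum(1 for age in feedback_ages if age <= recent_days)
    return recent / len(feedback_ages)


def freshness(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 for a brand-new page, 0.5 at the half-life."""
    return 0.5 ** (age_days / half_life_days)


def rank(docs: List[Doc], feedback_ages: List[float]) -> List[Doc]:
    """Order documents by a per-query blend of relevance and freshness."""
    alpha = freshness_weight(feedback_ages)  # 0 = ignore freshness, 1 = ignore relevance

    def score(doc: Doc) -> float:
        return (1.0 - alpha) * doc.relevance + alpha * freshness(doc.age_days)

    return sorted(docs, key=score, reverse=True)


docs = [Doc("old", 0.9, 365.0), Doc("new", 0.7, 1.0)]

# Time-insensitive query: the feedback documents are all old, so relevance decides.
stable_order = [d.doc_id for d in rank(docs, [400.0, 500.0, 700.0])]

# Time-sensitive query: most feedback documents are recent, so freshness dominates.
timely_order = [d.doc_id for d in rank(docs, [1.0, 2.0, 3.0, 400.0])]
```

With the same two documents, the ordering flips between the two queries because the weight on freshness is derived from each query's feedback documents rather than fixed globally.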
Index Terms
- Learning to rank for freshness and relevance