DOI: 10.1145/2009916.2009933
research-article

Learning to rank for freshness and relevance

Published: 24 July 2011

ABSTRACT

Freshness of results is important in modern web search. Failing to recognize the temporal aspect of a query can degrade the user experience and make the search engine appear stale. While freshness and relevance can be closely related for some topics (e.g., news queries), they are more independent for others (e.g., time-insensitive queries). Therefore, optimizing one criterion does not necessarily improve the other, and can even harm it in some cases.

We propose a machine-learning framework for simultaneously optimizing freshness and relevance, in which the trade-off adapts automatically to the temporal characteristics of each query. We start by illustrating different temporal characteristics of queries and the features that can be used to capture them. We then introduce our supervised framework, which leverages the temporal profile of a query (inferred from pseudo-feedback documents) along with the other ranking features to improve both the freshness and the relevance of search results. Our experiments on a large archival web corpus demonstrate the efficacy of our techniques.
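
To make the idea of a query-adaptive trade-off concrete, here is a minimal Python sketch. It is not the paper's method: the abstract describes a supervised learned framework, whereas this toy uses a fixed linear mix of two scores. The `Doc` fields, the exponential-decay `freshness` function, the 30-day half-life, the 7-day recency window, and the use of the top-`k` documents as pseudo-feedback are all assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    doc_id: str
    relevance: float  # hypothetical topical relevance score in [0, 1]
    age_days: float   # hypothetical document age in days at query time


def freshness(doc: Doc, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 for a brand-new page, 0.5 at one half-life."""
    return 0.5 ** (doc.age_days / half_life_days)


def temporal_weight(feedback: List[Doc], recent_days: float = 7.0) -> float:
    """Fraction of pseudo-feedback documents that are recent.

    A crude stand-in for a query temporal profile: if most top-ranked
    documents are new, the query is treated as time-sensitive.
    """
    if not feedback:
        return 0.0
    recent = sum(1 for d in feedback if d.age_days <= recent_days)
    return recent / len(feedback)


def rerank(docs: List[Doc], k: int = 3) -> List[Doc]:
    """Re-rank with a query-adaptive mix of relevance and freshness."""
    # Pseudo-feedback set: the top-k documents by relevance alone.
    by_relevance = sorted(docs, key=lambda d: d.relevance, reverse=True)
    alpha = temporal_weight(by_relevance[:k])
    # alpha = 0 reduces to pure relevance; alpha = 1 to pure freshness.
    return sorted(
        docs,
        key=lambda d: (1 - alpha) * d.relevance + alpha * freshness(d),
        reverse=True,
    )
```

With this sketch, a query whose top results are mostly recent pages gets a large freshness weight, while a query whose top results are all old falls back to ordering by relevance alone. A learned ranker would replace the hand-set mix with a model trained on labeled data.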


    • Published in

      SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
      July 2011
      1374 pages
      ISBN: 9781450307574
      DOI: 10.1145/2009916

      Copyright © 2011 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 792 of 3,983 submissions, 20%
