skip to main content
10.1145/2009916.2009934acmconferencesArticle/Chapter ViewAbstractPublication PagesirConference Proceedingsconference-collections
research-article

A cascade ranking model for efficient ranked retrieval

Authors Info & Claims
Published:24 July 2011Publication History

ABSTRACT

There is a fundamental tradeoff between effectiveness and efficiency when designing retrieval models for large-scale document collections. Effectiveness tends to derive from sophisticated ranking functions, such as those constructed using learning to rank, while efficiency gains tend to arise from improvements in query evaluation and caching strategies. Given their inherently disjoint nature, it is difficult to jointly optimize effectiveness and efficiency in end-to-end systems. To address this problem, we formulate and develop a novel cascade ranking model, which unlike previous approaches, can simultaneously improve both top k ranked effectiveness and retrieval efficiency. The model constructs a cascade of increasingly complex ranking functions that progressively prunes and refines the set of candidate documents to minimize retrieval latency and maximize result set quality. We present a novel boosting algorithm for learning such cascades to directly optimize the tradeoff between effectiveness and efficiency. Experimental results show that our cascades are faster and return higher quality results than comparable ranking models.

References

  1. A. Arampatzis, J. Kamps, and S. Robertson. Where to stop reading a ranked list? SIGIR 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. R. Baeza-Yates, A. Gionis, F. Junqueira, V. Murdock, V. Plachouras, and F. Silvestri. The impact of caching on search engines. SIGIR 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. L. Barroso and U. Hölzle. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines. Morgan & Claypool, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. M. Bendersky, D. Metzler, and W. Croft. Learning concept importance using a weighted dependence model. WSDM 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. E. Brown. Fast evaluation of structured queries for information retrieval. SIGIR 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender. Learning to rank using gradient descent. ICML 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. F. Cacheda, V. Plachouras, and I. Ounis. A case study of distributed information retrieval architectures to index one terabyte of text. IP &M, 41:1141--1161, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. B. Cambazoglu, H. Zaragoza, O. Chapelle, J. Chen, C. Liao, Z. Zheng, and J. Degenhardt. Early exit optimizations for additive machine learned ranking systems. WSDM 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Y. Cao, J. Xu, T.-Y. Liu, H. Li, Y. Huang, and H.-W. Hon. Adapting ranking SVM to document retrieval. SIGIR 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. D. Carmel, D. Cohen, R. Fagin, E. Farchi, M. Herscovici, Y. Maarek, and A. Soffer. Static index pruning for information retrieval systems. SIGIR 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. K. Collins-Thompson. Reducing the risk of query expansion via robust constrained optimization. CIKM 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. J. Hamilton. On designing and deploying Internet-scale services. LISA 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer, 2009.Google ScholarGoogle ScholarCross RefCross Ref
  14. E. Kanoulas, V. Pavlu, K. Dai, and J. Aslam. Modeling the score distribution of relevant and non-relevant documents. ICTIR 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. J. Lin, D. Metzler, T. Elsayed, and L. Wang. Of Ivory and Smurfs: Loxodontan MapReduce experiments for web search. TREC 2009.Google ScholarGoogle Scholar
  16. T.-Y. Liu. Learning to rank for information retrieval. FnTIR, 3(3), 2009.Google ScholarGoogle Scholar
  17. I. Matveeva, C. Burges, T. Burkard, A. Laucius, and L. Wong. High accuracy retrieval with multiple nested ranker. SIGIR 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. D. Metzler. Automatic feature selection in the Markov random field model for information retrieval. CIKM 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. J. Nocedal and S. Wright. Numerical optimization. Springer, 2006.Google ScholarGoogle Scholar
  20. A. Ntoulas and J. Cho. Pruning policies for two-tiered inverted index with correctness guarantee. SIGIR 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. G. Skobeltsyn, F. Junqueira, V. Plachouras, and R. Baeza-Yates. ResIn: A combination of results caching and index pruning for high-performance web search engines. SIGIR 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. R. Steuer. Multiple Criteria Optimization: Theory, Computation and Application. John Wiley, 1986.Google ScholarGoogle Scholar
  23. T. Strohman, H. Turtle, and W. Croft. Optimization strategies for complex queries. SIGIR 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, 58(1):267--288, 1996.Google ScholarGoogle Scholar
  25. H. Turtle and J. Flood. Query evaluation: Strategies and optimizations. IP&M, 31(6):831--850, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. P. Viola and M. Jones. Robust real-time object detection. International Journal of Computer Vision, 57(2):137--154, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. L. Wang, J. Lin, and D. Metzler. Learning to efficiently rank. SIGIR 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. L. Wang, D. Metzler, and J. Lin. Ranking under temporal constraints. CIKM 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. D. Weiss & B. Taskar. Structured prediction cascades. 13th Conference on AI and Statistics, 2010.Google ScholarGoogle Scholar
  30. J. Xu & H. Li. AdaRank: A boosting algorithm for information retrieval. SIGIR 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. A cascade ranking model for efficient ranked retrieval

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
      July 2011
      1374 pages
      ISBN:9781450307574
      DOI:10.1145/2009916

      Copyright © 2011 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 24 July 2011

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate792of3,983submissions,20%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader