DOI: 10.1145/1277741.1277792
Article

A regression framework for learning ranking functions using relative relevance judgments

Published: 23 July 2007

ABSTRACT

Effective ranking functions are an essential part of commercial search engines. We develop a regression framework for learning ranking functions that improve the relevance of search engines serving diverse streams of user queries. We explore supervised learning methodology from machine learning and distinguish two types of relevance judgments used as training data: 1) absolute relevance judgments arising from explicit labeling of search results; and 2) relative relevance judgments extracted from user clickthroughs on search results or converted from the absolute relevance judgments. We propose a novel optimization framework emphasizing the use of relative relevance judgments. The main contribution is a regression-based algorithm that can be applied to objective functions involving preference data, i.e., data indicating that one document is more relevant than another with respect to a query. Experiments are carried out on data sets obtained from a commercial search engine. Our results show significant improvements of the proposed methods over some existing methods.
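
To make the two kinds of training data concrete, the following is a minimal sketch of how absolute relevance grades might be converted into relative preferences, together with one possible objective over such pairs. It is an illustration only: the function names, the toy grades and scores, and the squared hinge penalty with a unit margin are assumptions introduced here, not details stated in the abstract.

```python
from itertools import combinations


def preference_pairs(labels_by_query):
    """Turn absolute grades into (query, preferred_doc, other_doc) triples.

    labels_by_query maps a query to a dict of {doc_id: grade}, where a
    higher grade means more relevant (hypothetical input layout).
    """
    pairs = []
    for query, graded_docs in labels_by_query.items():
        for (doc_a, grade_a), (doc_b, grade_b) in combinations(graded_docs.items(), 2):
            if grade_a > grade_b:
                pairs.append((query, doc_a, doc_b))
            elif grade_b > grade_a:
                pairs.append((query, doc_b, doc_a))
            # Documents with equal grades express no preference and are skipped.
    return pairs


def pairwise_squared_hinge(scores, pairs, margin=1.0):
    """Illustrative preference objective (an assumed choice of loss).

    A pair is penalized when the preferred document does not outscore
    the other one by at least `margin`.
    """
    total = 0.0
    for query, better, worse in pairs:
        gap = scores[(query, better)] - scores[(query, worse)]
        total += max(0.0, margin - gap) ** 2
    return total


# Toy data: one query with three graded documents.
labels = {"q1": {"d1": 3, "d2": 1, "d3": 1}}
pairs = preference_pairs(labels)

# Hypothetical scores produced by some ranking function.
scores = {("q1", "d1"): 2.0, ("q1", "d2"): 0.5, ("q1", "d3"): 1.5}
loss = pairwise_squared_hinge(scores, pairs)
```

Preferences extracted from clickthroughs could be fed to the same objective as additional triples, since the loss only needs to know which document of a pair is preferred for a given query.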


      Published in

        SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
        July 2007
        946 pages
        ISBN: 9781595935977
        DOI: 10.1145/1277741

        Copyright © 2007 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 792 of 3,983 submissions, 20%
