Article

A regression framework for learning ranking functions using relative relevance judgments

Authors:
Zhaohui Zheng (Yahoo! Inc.)
Keke Chen (Yahoo! Inc.)
Gordon Sun (Yahoo! Inc.)
Hongyuan Zha (Georgia Institute of Technology)

SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, July 2007, Pages 287–294. https://doi.org/10.1145/1277741.1277792

Published: 23 July 2007

ABSTRACT

Effective ranking functions are an essential part of commercial search engines. We focus on developing a regression framework for learning ranking functions that improve the relevance of search engines serving diverse streams of user queries. We explore supervised learning methodology from machine learning, and we distinguish two types of relevance judgments used as training data: 1) absolute relevance judgments arising from explicit labeling of search results; and 2) relative relevance judgments extracted from user clickthroughs of search results or converted from the absolute relevance judgments. We propose a novel optimization framework emphasizing the use of relative relevance judgments. The main contribution is the development of an algorithm based on regression that can be applied to objective functions involving preference data, i.e., data indicating that one document is more relevant than another with respect to a query. Experiments are carried out using data sets obtained from a commercial search engine. Our results show significant improvements of our proposed methods over some existing methods.
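
To make the idea of regression on preference data concrete, the following is a minimal sketch and not the authors' implementation. It assumes a squared hinge objective over preference pairs, minimized in the spirit of functional gradient descent: each round fits a regressor to the negative gradients of the violated pairs. The function names (`pairs_from_grades`, `fit_linear`, `learn_ranker`), the `margin` and `shrinkage` parameters, and the linear base regressor are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Pair = Tuple[Vector, Vector]  # (features of preferred doc, features of less relevant doc)


def pairs_from_grades(docs: List[Tuple[Vector, int]]) -> List[Pair]:
    """Convert absolute grades for ONE query into relative preference pairs."""
    pairs = []
    for (xa, ga), (xb, gb) in combinations(docs, 2):
        if ga > gb:
            pairs.append((xa, xb))
        elif gb > ga:
            pairs.append((xb, xa))
    return pairs  # ties carry no preference and are skipped


def fit_linear(xs: List[Vector], ys: List[float],
               epochs: int = 500, lr: float = 0.05) -> Callable[[Vector], float]:
    """Least-squares linear fit by gradient descent (assumed stand-in base regressor).

    Assumes features are roughly unit scale so the fixed step size stays stable.
    """
    dim, n = len(xs[0]), len(xs)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for j in range(dim):
                gw[j] += err * x[j]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b


def learn_ranker(pairs: List[Pair], rounds: int = 20,
                 margin: float = 1.0, shrinkage: float = 0.5) -> Callable[[Vector], float]:
    """Boosting-style loop on the assumed loss 0.5 * max(0, margin - (h(better) - h(worse)))^2."""
    learners: List[Callable[[Vector], float]] = []

    def score(x: Vector) -> float:
        return sum(shrinkage * g(x) for g in learners)

    for _ in range(rounds):
        xs, targets = [], []
        for better, worse in pairs:
            gap = margin - (score(better) - score(worse))
            if gap > 0:  # preference violated or inside the margin
                xs.append(better)
                targets.append(gap)    # negative gradient w.r.t. h(better)
                xs.append(worse)
                targets.append(-gap)   # negative gradient w.r.t. h(worse)
        if not xs:
            break  # every preference already holds with the margin
        learners.append(fit_linear(xs, targets))
    return score


# Usage: one query, one feature per document, graded labels 0..2.
docs = [([3.0], 2), ([1.5], 1), ([0.5], 1), ([0.0], 0)]
pairs = pairs_from_grades(docs)
score = learn_ranker(pairs)
```

The design point is that the ranking problem is reduced to a sequence of ordinary regression problems: only the pairs that contradict the current ranking function generate regression targets, so any off-the-shelf regressor could replace `fit_linear` in this sketch.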

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
  2. Information systems applications

Published in
SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
July 2007
946 pages
ISBN: 9781595935977
DOI: 10.1145/1277741
General Chairs:
Wessel Kraaij (TNO, The Netherlands)
Arjen P. de Vries (CWI, The Netherlands)
Program Chairs:
Charles L. A. Clarke (University of Waterloo, Canada)
Norbert Fuhr (University of Duisburg-Essen, Germany)
Noriko Kando (National Institute of Informatics, Japan)
Copyright © 2007 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
absolute relevance judgment
clickthroughs
functional gradient descent
gradient boosting
machine learning
preferences
ranking function
regression
relative relevance judgment
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%
Article Metrics
- Total Citations: 130
- Total Downloads: 1,580
- Downloads (Last 12 months): 18
- Downloads (Last 6 weeks): 2