ABSTRACT
In this paper we address the problem of learning to rank for document retrieval using Thurstonian models based on sparse Gaussian processes. Thurstonian models represent each document for a given query as a probability distribution in a score space; these distributions over scores naturally give rise to distributions over document rankings. However, in general we do not have observed rankings with which to train the model; instead, each document in the training set is judged to have a particular relevance level: for example "Bad", "Fair", "Good", or "Excellent". The performance of the model is then evaluated using information retrieval (IR) metrics such as Normalised Discounted Cumulative Gain (NDCG). Recently, Taylor et al. presented a method called SoftRank, which allows the direct gradient optimisation of a smoothed version of NDCG using a Thurstonian model. In this approach, document scores are represented by the outputs of a neural network, and score distributions are created artificially by adding random noise to the scores. The SoftRank mechanism is a general one: it can be applied to different IR metrics and can make use of different underlying models. In this paper we extend the SoftRank framework to make use of the score uncertainties naturally provided by a Gaussian process (GP), a probabilistic non-linear regression model. We further develop the model by using sparse Gaussian process techniques, which give improved performance and efficiency, and show competitive results against baseline methods when tested on the publicly available LETOR OHSUMED data set. We also explore how the available uncertainty information can be used in prediction and how it affects model performance.
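The smoothed NDCG at the heart of SoftRank can be sketched briefly. The following is a minimal, simplified illustration assuming a single shared score variance (rather than the per-document variances a GP would supply), with the standard gain 2^l - 1 and log2 discount; the function name soft_ndcg and its arguments are illustrative, not the authors' implementation.

```python
# Minimal sketch of a SoftRank-style smoothed NDCG for one query, assuming
# every document score is Gaussian with a shared variance sigma^2.
import numpy as np
from scipy.stats import norm

def soft_ndcg(means, labels, sigma=1.0):
    """means: score means per document; labels: relevance levels (0=Bad ... 3=Excellent)."""
    means = np.asarray(means, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n = len(means)

    # P(s_i > s_j) when s_i ~ N(mu_i, sigma^2) and s_j ~ N(mu_j, sigma^2).
    pi = norm.cdf((means[:, None] - means[None, :]) / (np.sqrt(2.0) * sigma))

    gains = 2.0 ** labels - 1.0
    discounts = 1.0 / np.log2(np.arange(n) + 2.0)  # ranks are 0-based

    ideal_dcg = np.sum(np.sort(gains)[::-1] * discounts)
    if ideal_dcg == 0.0:
        return 0.0

    expected_dcg = 0.0
    for j in range(n):
        # Rank distribution of document j: a Poisson-binomial built by adding
        # the other documents one at a time; each beats j with probability pi[i, j].
        p = np.zeros(n)
        p[0] = 1.0
        for i in range(n):
            if i == j:
                continue
            p[1:] = p[1:] * (1.0 - pi[i, j]) + p[:-1] * pi[i, j]
            p[0] *= 1.0 - pi[i, j]
        expected_dcg += gains[j] * np.sum(p * discounts)

    return expected_dcg / ideal_dcg
```

Because every quantity above is a smooth function of the score means (and variances), the expected NDCG can be differentiated and optimised by gradient methods, which is the property the SoftRank framework exploits.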
- C. Burges, R. Ragno, and Q. V. Le. Learning to rank with nonsmooth cost functions. In NIPS, 2006.
- C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender. Learning to rank using gradient descent. In ICML, 2005.
- W. Chu and Z. Ghahramani. Gaussian processes for ordinal regression. JMLR, 6:1019--1041, 2005.
- K. Crammer and Y. Singer. Pranking with ranking. In NIPS 14, 2002.
- R. Herbrich, T. Graepel, and K. Obermayer. Large margin rank boundaries for ordinal regression. In Advances in Large Margin Classifiers, pages 115--132. MIT Press, 2000.
- K. Järvelin and J. Kekäläinen. IR evaluation methods for retrieving highly relevant documents. In SIGIR, 2000.
- T. Joachims. Optimizing search engines using clickthrough data. In KDD, 2002.
- N. D. Lawrence. Learning for larger datasets with the Gaussian process latent variable model. In M. Meila and X. Shen, editors, AISTATS 11. Omnipress, 2007.
- T.-Y. Liu. LETOR: Benchmark datasets for learning to rank, 2007. Microsoft Research Asia. http://research.microsoft.com/users/LETOR/.
- R. M. Neal. Bayesian Learning for Neural Networks. Lecture Notes in Statistics 118. Springer, 1996.
- J. Nocedal and S. Wright. Numerical Optimization, second edition. Springer, 2006.
- J. Quiñonero Candela and C. E. Rasmussen. A unifying view of sparse approximate Gaussian process regression. JMLR, 6:1939--1959, Dec 2005.
- C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.
- S. Robertson and H. Zaragoza. On rank-based effectiveness measures and optimization. Information Retrieval, 10(3):321--339, 2007.
- S. Robertson, H. Zaragoza, and M. Taylor. A simple BM25 extension to multiple weighted fields. In CIKM, pages 42--49, 2004.
- E. Snelson and Z. Ghahramani. Sparse Gaussian processes using pseudo-inputs. In Y. Weiss, B. Schölkopf, and J. Platt, editors, NIPS 18, pages 1257--1264. MIT Press, Cambridge, MA, 2006.
- M. Taylor, J. Guiver, S. Robertson, and T. Minka. SoftRank: optimizing non-smooth rank metrics. In WSDM '08, pages 77--86. ACM, 2008.
- L. L. Thurstone. A law of comparative judgment. Psychological Review, 34:273--286, 1927.