research-article

Combined regression and ranking

Author:
D. Sculley

Google, Inc., Pittsburgh, PA, USA

Google, Inc., Pittsburgh, PA, USA
View Profile

KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data miningJuly 2010Pages 979–988https://doi.org/10.1145/1835804.1835928

Published:25 July 2010Publication History

KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining

Pages 979–988

ABSTRACT

Many real-world data mining tasks require the achievement of two distinct goals when applied to unseen data: first, to induce an accurate preference ranking, and second to give good regression performance. In this paper, we give an efficient and effective Combined Regression and Ranking method (CRR) that optimizes regression and ranking objectives simultaneously. We demonstrate the effectiveness of CRR for both families of metrics on a range of large-scale tasks, including click prediction for online advertisements. Results show that CRR often achieves performance equivalent to the best of both ranking-only and regression-only approaches. In the case of rare events or skewed distributions, we also find that this combination can actually improve regression performance due to the addition of informative ranking constraints.

Supplemental Material

kdd2010_sculley_crr_01.mov

mov

98.6 MB

Download

References

G. Aggarwal, A. Goel, and R. Motwani. Truthful auctions for pricing search keywords. In EC '06: Proceedings of the 7th ACM conference on Electronic commerce, 2006. Google ScholarDigital Library
C. M. Bishop. Pattern Recognition and Machine Learning. Springer-Verlag New York, Inc., 2006. Google ScholarDigital Library
L. Bottou and O. Bousquet. The tradeoffs of large scale learning. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems, volume 20, pages 161--168. NIPS Foundation (http://books.nips.cc), 2008.Google Scholar
A. P. Bradley. The use of the area under the roc curve in the evaluation of machine learning algorithms. Pattern Recognition, 30, 1997. Google ScholarDigital Library
C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender. Learning to rank using gradient descent. In ICML '05: Proceedings of the 22nd international conference on Machine learning, 2005. Google ScholarDigital Library
Z. Cao, T. Qin, T.-Y. Liu, M.-F. Tsai, and H. Li. Learning to rank: from pairwise approach to listwise approach. In ICML '07: Proceedings of the 24th international conference on Machine learning, 2007. Google ScholarDigital Library
D. Chakrabarti, D. Agarwal, and V. Josifovski. Contextual advertising by combining relevance with click feedback. In WWW '08: Proceeding of the 17th international conference on World Wide Web, 2008. Google ScholarDigital Library
M. Ciaramita, V. Murdock, and V. Plachouras. Online learning from click data for sponsored search. In WWW '08: Proceeding of the 17th international conference on World Wide Web, 2008. Google ScholarDigital Library
J. Duchi, S. Shalev-Shwartz, Y. Singer, and T. Chandra. Efficient projections onto the l1-ball for learning in high dimensions. In ICML '08: Proceedings of the 25th international conference on Mach ine learning, 2008. Google ScholarDigital Library
J. L. Elsas, V. R. Carvalho, and J. G. Carbonell. Fast learning of document ranking functions with the committee perceptron. In WSDM '08: Proceedings of the international conference on Web search and web data mining, 2008. Google ScholarDigital Library
Y. Freund, R. Iyer, R. E. Schapire, and Y. Singer. An efficient boosting algorithm for combining preferences. J. Mach. Learn. Res., 4, 2003. Google ScholarDigital Library
T. Joachims. Optimizing search engines using clickthrough data. In KDD '02: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, 2002. Google ScholarDigital Library
T. Joachims. A support vector method for multivariate performance measures. In ICML '05: Proceedings of the 22nd international conference on Machine learning, 2005. Google ScholarDigital Library
T. Joachims. Training linear svms in linear time. In KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, 2006. Google ScholarDigital Library
K. Koh, S.-J. Kim, and S. Boyd. An interior-point method for large-scale l1-regularized logistic regression. J. Mach. Learn. Res., 8, 2007. Google ScholarDigital Library
J. Langford, L. Li, and T. Zhang. Sparse online learning via truncated gradient. J. Mach. Learn. Res., 10, 2009. Google ScholarDigital Library
D. D. Lewis, Y. Yang, T. G. Rose, and F. Li. RCV1: A new benchmark collection for text categorization research. J. Mach. Learn. Res., 5:361--397, 2004. Google ScholarDigital Library
T.-Y. Liu, T. Qin, J. Xu, W. Xiong, and H. Li. LETOR: Benchmark dataset for research on learning to rank for information retrieval. In LR4IR 2007: Workshop on Learning to Rank for Information Retrieval, in conjunction with SIGIR 2007, 2007.Google Scholar
M.-F. M.F. Balcan and A. Blum. On a theory of learning with similarity functions. In ICML '06: Proceedings of the 23rd international conference on Machine learning, 2006. Google ScholarDigital Library
T. M. Mitchell. Generative and discriminative classifiers: Naive bayes and logistic regression. In Machine Learning. http://www.cs.cmu.edu/~tom/mlbook/NBayesLogReg.pdf, 2005.Google Scholar
D. Sculley. Large scale learning to rank. In NIPS 2009 Workshop on Advances in Ranking, 2009.Google Scholar
D. Sculley, R. G. Malkin, S. Basu, and R. J. Bayardo. Predicting bounce rates in sponsored search advertisements. In KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, 2009. Google ScholarDigital Library
S. Shalev-Shwartz, Y. Singer, and N. Srebro. Pegasos: Primal estimated sub-gradient solver for svm. In ICML '07: Proceedings of the 24th international conference on Machine learning, 2007. Google ScholarDigital Library
J. Xu and H. Li. Adarank: a boosting algorithm for information retrieval. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, 2007. Google ScholarDigital Library
Y. Yue, T. Finley, F. Radlinski, and T. Joachims. A support vector method for optimizing average precision. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, 2007. Google ScholarDigital Library

Index Terms

Combined regression and ranking
1. Information systems
  1. Information retrieval

Recommendations

Re-ranking search results using query logs
CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge management

This work addresses two common problems in search, frequently occurring with underspecified user queries: the top-ranked results for such queries may not contain documents relevant to the user's search intent, and fresh and relevant pages may not get ...
Read More
A regression framework for learning ranking functions using relative relevance judgments
SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval

Effective ranking functions are an essential part of commercial search engines. We focus on developing a regression framework for learning ranking functions for improving relevance of search engines serving diverse streams of user queries. We explore ...
Read More
Regression by Re-Ranking
Highlights
- A novel rank-based approach is proposed for regression tasks;
- The method allows ...
Abstract
Several approaches based on regression have been developed in the past few years with the goal of improving prediction results, including the use of ranking strategies. Re-ranking has been exploited and successfully employed in several ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
July 2010
1240 pages
ISBN:9781450300551
DOI:10.1145/1835804
General Chairs:
Bharat Rao
Siemens
,
Balaji Krishnapuram
Siemens
,
Program Chairs:
Andrew Tomkins
Google Inc.
,
Qiang Yang
Hong Kong University of Science and Technology
Copyright © 2010 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 25 July 2010
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
large-scale data
ranking
regression
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate1,133of8,635submissions,13%
Upcoming Conference
KDD '24

Sponsor:

sigkdd

sigkdd

The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 25 - 29, 2024

Barcelona , Spain
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 75
  Total Citations
  View Citations
- 1,217
  Total Downloads
- Downloads (Last 12 months)70
- Downloads (Last 6 weeks)8
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Combined regression and ranking

KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining

ABSTRACT

Supplemental Material

References

Cited By

Index Terms

Recommendations

Re-ranking search results using query logs

A regression framework for learning ranking functions using relative relevance judgments

Regression by Re-Ranking