Published in: Discover Computing 3/2010

01.06.2010 | Learning to rank for information retrieval

Gradient descent optimization of smoothed information retrieval metrics

Authors: Olivier Chapelle, Mingrui Wu


Abstract

Most ranking algorithms are based on the optimization of some loss function, such as the pairwise loss. However, these loss functions often differ from the criteria adopted to measure the quality of web page ranking results. To overcome this problem, we propose an algorithm that directly optimizes popular measures such as Normalized Discounted Cumulative Gain and Average Precision. The basic idea is to minimize a smooth approximation of these measures with gradient descent. Crucial to this kind of approach is the choice of the smoothing factor. We provide several theoretical analyses of that choice and propose an annealing algorithm that iteratively minimizes a less and less smoothed approximation of the measure of interest. Results on the LETOR benchmark datasets show that the proposed algorithm achieves state-of-the-art performance.
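The annealing idea in the abstract can be sketched in code. The following is an illustrative simplification, not the paper's exact method: soft ranks are built from sigmoids of score differences (one common smoothing; the paper's own approximation may differ), the smoothed NDCG is maximized by gradient ascent on a linear scorer, and the smoothing width `sigma` is annealed towards zero. Gradients are taken by finite differences for brevity; a real implementation would use the analytic gradient. All names and defaults here are assumptions.

```python
import math

def _sigmoid(t):
    # numerically safe logistic function
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def soft_ndcg(scores, rels, sigma):
    """Smoothed NDCG: replace each document's hard rank by a soft rank
    built from sigmoids of score differences; exact NDCG as sigma -> 0."""
    n = len(scores)
    dcg = 0.0
    for i in range(n):
        # soft rank of document i (1-based): 1 plus the "probability"
        # that each other document outranks it
        r = 1.0 + sum(_sigmoid((scores[j] - scores[i]) / sigma)
                      for j in range(n) if j != i)
        dcg += (2 ** rels[i] - 1) / math.log2(1 + r)
    ideal = sum((2 ** g - 1) / math.log2(2 + k)
                for k, g in enumerate(sorted(rels, reverse=True)))
    return dcg / ideal

def anneal_optimize(feats, rels, sigmas=(10.0, 3.0, 1.0, 0.3, 0.1),
                    lr=0.5, steps=200):
    """Gradient ascent on a linear scorer w, annealing sigma so that a
    less and less smoothed NDCG is optimized at each stage."""
    d = len(feats[0])
    w = [0.0] * d
    eps = 1e-4
    for sigma in sigmas:                 # less and less smoothing
        for _ in range(steps):
            grad = []
            for k in range(d):           # finite-difference gradient
                wp = w[:]; wp[k] += eps
                wm = w[:]; wm[k] -= eps
                sp = [sum(wi * xi for wi, xi in zip(wp, x)) for x in feats]
                sm = [sum(wi * xi for wi, xi in zip(wm, x)) for x in feats]
                grad.append((soft_ndcg(sp, rels, sigma)
                             - soft_ndcg(sm, rels, sigma)) / (2 * eps))
            w = [wi + lr * g for wi, g in zip(w, grad)]
    return w
```

Starting at a large `sigma` makes the objective nearly convex and easy to descend; each subsequent stage starts from the previous solution, so the final, barely smoothed objective is entered near a good basin.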

Footnotes
1
When the number of queries is large, the average IR metric becomes approximately smooth and it is possible to compute an empirical gradient (Yue and Burges 2007).
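As an illustration of the empirical-gradient idea in this footnote, here is a minimal sketch of simultaneous perturbation stochastic approximation (SPSA), the technique named in the Yue and Burges (2007) report: all coordinates are perturbed at once with a random ±1 direction, so a single pair of function evaluations yields a gradient estimate regardless of dimension. The function name and defaults are assumptions, not from the paper.

```python
import random

def spsa_gradient(f, w, c=0.01):
    """One SPSA estimate of grad f at w. A single +/- perturbation pair
    gives a noisy but unbiased (to first order) estimate of every
    coordinate of the gradient; averaging many estimates reduces noise."""
    delta = [random.choice((-1.0, 1.0)) for _ in w]
    wp = [wi + c * di for wi, di in zip(w, delta)]
    wm = [wi - c * di for wi, di in zip(w, delta)]
    diff = (f(wp) - f(wm)) / (2 * c)
    return [diff / di for di in delta]
```

This is attractive exactly in the setting the footnote describes: when `f` is an average IR metric over many queries, each evaluation is expensive, and SPSA needs only two evaluations per step instead of two per parameter.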
 
References
Bartlett, P., & Mendelson, S. (2003). Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3, 463–482.
Bridle, J. (1990). Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. In F. F. Soulie & J. Herault (Eds.), Neurocomputing: Algorithms, architectures and applications (pp. 227–236). Berlin: Springer.
Burges, C. J., Le, Q. V., & Ragno, R. (2007). Learning to rank with nonsmooth cost functions. In B. Schölkopf, J. Platt, & T. Hofmann (Eds.), Advances in neural information processing systems (Vol. 19).
Burges, C., Shaked, T., Renshaw, E., Lazier, A., Deeds, M., Hamilton, N., et al. (2005). Learning to rank using gradient descent. In Proceedings of the international conference on machine learning.
Cao, Y., Xu, J., Liu, T. Y., Li, H., Huang, Y., & Hon, H. W. (2006). Adapting ranking SVM to document retrieval. In SIGIR.
Cao, Z., Qin, T., Liu, T. Y., Tsai, M. F., & Li, H. (2007). Learning to rank: From pairwise approach to listwise approach. In International conference on machine learning.
Chakrabarti, S., Khanna, R., Sawant, U., & Bhattacharyya, C. (2008). Structured learning for non-smooth ranking losses. In International conference on knowledge discovery and data mining (KDD).
Chapelle, O., Chi, M., & Zien, A. (2006). A continuation method for semi-supervised SVMs. In International conference on machine learning.
Chapelle, O., Le, Q., & Smola, A. (2007). Large margin optimization of ranking measures. In NIPS workshop on machine learning for web search.
Cossock, D., & Zhang, T. (2006). Subset ranking using regression. In 19th annual conference on learning theory, Lecture Notes in Computer Science (Vol. 4005, pp. 605–619). Springer.
Donmez, P., Svore, K., & Burges, C. (2009). On the local optimality of LambdaRank. In SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on research and development in information retrieval (pp. 460–467). ACM.
Dunlavy, D., & O'Leary, D. (2005). Homotopy optimization methods for global optimization. Tech. Rep. SAND2005-7495, Sandia National Laboratories.
Freund, Y., Iyer, R., Schapire, R. E., & Singer, Y. (2003). An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research, 4, 933–969.
Guiver, J., & Snelson, E. (2008). Learning to rank with SoftRank and Gaussian processes. In SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval (pp. 259–266). ACM.
Herbrich, R., Graepel, T., & Obermayer, K. (2000). Large margin rank boundaries for ordinal regression. In A. Smola, P. Bartlett, B. Schölkopf, & D. Schuurmans (Eds.), Advances in large margin classifiers. Cambridge, MA: MIT Press.
Jarvelin, K., & Kekalainen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems, 20(4), 422–446.
Joachims, T. (2002). Optimizing search engines using clickthrough data. In Proceedings of the ACM conference on knowledge discovery and data mining (KDD). ACM.
Le, Q. V., Smola, A., Chapelle, O., & Teo, C. H. (2009). Optimization of ranking measures (submitted).
Ledoux, M., & Talagrand, M. (1991). Probability in Banach spaces: Isoperimetry and processes. Berlin: Springer.
Li, P., Burges, C., & Wu, Q. (2008). McRank: Learning to rank using multiple classification and gradient boosting. In J. Platt, D. Koller, Y. Singer, & S. Roweis (Eds.), Advances in neural information processing systems (Vol. 20, pp. 897–904). Cambridge, MA: MIT Press.
Liu, T. Y., Xu, J., Qin, T., Xiong, W., & Li, H. (2007). LETOR: Benchmark dataset for research on learning to rank for information retrieval. In LR4IR 2007, in conjunction with SIGIR 2007.
Robertson, S. E., & Walker, S. (1994). Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. In Proceedings of the 17th annual international ACM SIGIR conference on research and development in information retrieval.
Shewchuk, J. R. (1994). An introduction to the conjugate gradient method without the agonizing pain. Tech. Rep. CMU-CS-94-125, School of Computer Science, Carnegie Mellon University.
Taylor, M., Guiver, J., Robertson, S., & Minka, T. (2008). SoftRank: Optimizing non-smooth rank metrics. In WSDM '08: Proceedings of the international conference on web search and web data mining (pp. 77–86). ACM.
Tsai, M. F., Liu, T. Y., Qin, T., Chen, H. H., & Ma, W. Y. (2007). FRank: A ranking method with fidelity loss. In International ACM SIGIR conference on research and development in information retrieval.
Wu, M., Chang, Y., Zheng, Z., & Zha, H. (2009). Smoothing DCG for learning to rank: A novel approach using smoothed hinge functions. In Proceedings of the 17th ACM conference on information and knowledge management (CIKM).
Xia, F., Liu, T., Wang, J., Zhang, W., & Li, H. (2008). Listwise approach to learning to rank: Theory and algorithm. In Proceedings of the 25th international conference on machine learning (pp. 1192–1199).
Xu, J., & Li, H. (2007). AdaRank: A boosting algorithm for information retrieval. In International ACM SIGIR conference on research and development in information retrieval.
Yilmaz, E., & Robertson, S. (2009). On the choice of effectiveness measures for learning to rank. Information Retrieval, to appear in the special issue on Learning to Rank for Information Retrieval. doi:10.1007/s10791-009-9116-x.
Yue, Y., & Burges, C. (2007). On using simultaneous perturbation stochastic approximation for learning to rank, and the empirical optimality of LambdaRank. Tech. Rep. MSR-TR-2007-115, Microsoft Research.
Yue, Y., Finley, T., Radlinski, F., & Joachims, T. (2007). A support vector method for optimizing average precision. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on research and development in information retrieval (pp. 271–278). ACM.
Zheng, Z., Zha, H., Zhang, T., Chapelle, O., Chen, K., & Sun, G. (2008). A general boosting method and its application to learning ranking functions for web search. In Advances in neural information processing systems (Vol. 20, pp. 1697–1704). MIT Press.
Metadata
Title
Gradient descent optimization of smoothed information retrieval metrics
Authors
Olivier Chapelle
Mingrui Wu
Publication date
01.06.2010
Publisher
Springer Netherlands
Published in
Discover Computing / Issue 3/2010
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-009-9110-3
