Published in: Discover Computing 3/2010

01.06.2010 | Learning to rank for information retrieval

Gradient descent optimization of smoothed information retrieval metrics

Authors: Olivier Chapelle, Mingrui Wu


Abstract

Most ranking algorithms are based on the optimization of some loss function, such as the pairwise loss. However, these loss functions often differ from the criteria adopted to measure the quality of web page ranking results. To overcome this problem, we propose an algorithm that directly optimizes popular measures such as Normalized Discounted Cumulative Gain and Average Precision. The basic idea is to minimize a smooth approximation of these measures with gradient descent. Crucial to this kind of approach is the choice of the smoothing factor. We provide several theoretical analyses of that choice and propose an annealing algorithm that iteratively minimizes a less and less smoothed approximation of the measure of interest. Results on the LETOR benchmark datasets show that the proposed algorithm achieves state-of-the-art performance.
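The annealing idea in the abstract can be sketched in code. The following is an illustrative simplification, not the paper's exact method: soft ranks are built from sigmoids of score differences (one common smoothing; the paper's own approximation may differ), the smoothed NDCG is maximized by gradient ascent on a linear scorer, and the smoothing width `sigma` is annealed towards zero. Gradients are taken by finite differences for brevity; a real implementation would use the analytic gradient. All names and defaults here are assumptions.

```python
import math

def _sigmoid(t):
    # numerically safe logistic function
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def soft_ndcg(scores, rels, sigma):
    """Smoothed NDCG: replace each document's hard rank by a soft rank
    built from sigmoids of score differences; exact NDCG as sigma -> 0."""
    n = len(scores)
    dcg = 0.0
    for i in range(n):
        # soft rank of document i (1-based): 1 plus the "probability"
        # that each other document outranks it
        r = 1.0 + sum(_sigmoid((scores[j] - scores[i]) / sigma)
                      for j in range(n) if j != i)
        dcg += (2 ** rels[i] - 1) / math.log2(1 + r)
    ideal = sum((2 ** g - 1) / math.log2(2 + k)
                for k, g in enumerate(sorted(rels, reverse=True)))
    return dcg / ideal

def anneal_optimize(feats, rels, sigmas=(10.0, 3.0, 1.0, 0.3, 0.1),
                    lr=0.5, steps=200):
    """Gradient ascent on a linear scorer w, annealing sigma so that a
    less and less smoothed NDCG is optimized at each stage."""
    d = len(feats[0])
    w = [0.0] * d
    eps = 1e-4
    for sigma in sigmas:                 # less and less smoothing
        for _ in range(steps):
            grad = []
            for k in range(d):           # finite-difference gradient
                wp = w[:]; wp[k] += eps
                wm = w[:]; wm[k] -= eps
                sp = [sum(wi * xi for wi, xi in zip(wp, x)) for x in feats]
                sm = [sum(wi * xi for wi, xi in zip(wm, x)) for x in feats]
                grad.append((soft_ndcg(sp, rels, sigma)
                             - soft_ndcg(sm, rels, sigma)) / (2 * eps))
            w = [wi + lr * g for wi, g in zip(w, grad)]
    return w
```

Starting at a large `sigma` makes the objective nearly convex and easy to descend; each subsequent stage starts from the previous solution, so the final, barely smoothed objective is entered near a good basin.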

Footnotes
1
When the number of queries is large, the average IR metric becomes approximately smooth and it is possible to compute an empirical gradient (Yue and Burges 2007).
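As an illustration of the empirical-gradient idea in this footnote, here is a minimal sketch of simultaneous perturbation stochastic approximation (SPSA), the technique named in the Yue and Burges (2007) report: all coordinates are perturbed at once with a random ±1 direction, so a single pair of function evaluations yields a gradient estimate regardless of dimension. The function name and defaults are assumptions, not from the paper.

```python
import random

def spsa_gradient(f, w, c=0.01):
    """One SPSA estimate of grad f at w. A single +/- perturbation pair
    gives a noisy but unbiased (to first order) estimate of every
    coordinate of the gradient; averaging many estimates reduces noise."""
    delta = [random.choice((-1.0, 1.0)) for _ in w]
    wp = [wi + c * di for wi, di in zip(w, delta)]
    wm = [wi - c * di for wi, di in zip(w, delta)]
    diff = (f(wp) - f(wm)) / (2 * c)
    return [diff / di for di in delta]
```

This is attractive exactly in the setting the footnote describes: when `f` is an average IR metric over many queries, each evaluation is expensive, and SPSA needs only two evaluations per step instead of two per parameter.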
 
References
Bartlett, P., & Mendelson, S. (2003). Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3, 463–482.
Bridle, J. (1990). Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. In F. F. Soulie & J. Herault (Eds.), Neurocomputing: Algorithms, architectures and applications (pp. 227–236). Berlin: Springer.
Burges, C. J., Le, Q. V., & Ragno, R. (2007). Learning to rank with nonsmooth cost functions. In B. Schölkopf, J. Platt, & T. Hofmann (Eds.), Advances in neural information processing systems (Vol. 19).
Burges, C., Shaked, T., Renshaw, E., Lazier, A., Deeds, M., Hamilton, N., et al. (2005). Learning to rank using gradient descent. In Proceedings of the international conference on machine learning.
Cao, Y., Xu, J., Liu, T. Y., Li, H., Huang, Y., & Hon, H. W. (2006). Adapting ranking SVM to document retrieval. In SIGIR.
Cao, Z., Qin, T., Liu, T. Y., Tsai, M. F., & Li, H. (2007). Learning to rank: From pairwise approach to listwise approach. In International conference on machine learning.
Chakrabarti, S., Khanna, R., Sawant, U., & Bhattacharyya, C. (2008). Structured learning for non-smooth ranking losses. In International conference on knowledge discovery and data mining (KDD).
Chapelle, O., Chi, M., & Zien, A. (2006). A continuation method for semi-supervised SVMs. In International conference on machine learning.
Chapelle, O., Le, Q., & Smola, A. (2007). Large margin optimization of ranking measures. In NIPS workshop on machine learning for web search.
Cossock, D., & Zhang, T. (2006). Subset ranking using regression. In 19th annual conference on learning theory, Lecture Notes in Computer Science (Vol. 4005, pp. 605–619). Springer.
Donmez, P., Svore, K., & Burges, C. (2009). On the local optimality of LambdaRank. In SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on research and development in information retrieval (pp. 460–467). ACM.
Dunlavy, D., & O'Leary, D. (2005). Homotopy optimization methods for global optimization. Tech. Rep. SAND2005-7495, Sandia National Laboratories.
Freund, Y., Iyer, R., Schapire, R. E., & Singer, Y. (2003). An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research, 4, 933–969.
Guiver, J., & Snelson, E. (2008). Learning to rank with SoftRank and Gaussian processes. In SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval (pp. 259–266). ACM.
Herbrich, R., Graepel, T., & Obermayer, K. (2000). Large margin rank boundaries for ordinal regression. In A. Smola, P. Bartlett, B. Schölkopf, & D. Schuurmans (Eds.), Advances in large margin classifiers. Cambridge, MA: MIT Press.
Jarvelin, K., & Kekalainen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems, 20(4), 422–446.
Joachims, T. (2002). Optimizing search engines using clickthrough data. In Proceedings of the ACM conference on knowledge discovery and data mining (KDD). ACM.
Le, Q. V., Smola, A., Chapelle, O., & Teo, C. H. (2009). Optimization of ranking measures (submitted).
Ledoux, M., & Talagrand, M. (1991). Probability in Banach spaces: Isoperimetry and processes. Berlin: Springer.
Li, P., Burges, C., & Wu, Q. (2008). McRank: Learning to rank using multiple classification and gradient boosting. In J. Platt, D. Koller, Y. Singer, & S. Roweis (Eds.), Advances in neural information processing systems (Vol. 20, pp. 897–904). Cambridge, MA: MIT Press.
Liu, T. Y., Xu, J., Qin, T., Xiong, W., & Li, H. (2007). LETOR: Benchmark dataset for research on learning to rank for information retrieval. In LR4IR 2007, in conjunction with SIGIR 2007.
Robertson, S. E., & Walker, S. (1994). Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. In Proceedings of the 17th annual international ACM SIGIR conference on research and development in information retrieval.
Shewchuk, J. R. (1994). An introduction to the conjugate gradient method without the agonizing pain. Tech. Rep. CMU-CS-94-125, School of Computer Science, Carnegie Mellon University.
Taylor, M., Guiver, J., Robertson, S., & Minka, T. (2008). SoftRank: Optimizing non-smooth rank metrics. In WSDM '08: Proceedings of the international conference on web search and web data mining (pp. 77–86). ACM.
Tsai, M. F., Liu, T. Y., Qin, T., Chen, H. H., & Ma, W. Y. (2007). FRank: A ranking method with fidelity loss. In International ACM SIGIR conference on research and development in information retrieval.
Wu, M., Chang, Y., Zheng, Z., & Zha, H. (2009). Smoothing DCG for learning to rank: A novel approach using smoothed hinge functions. In Proceedings of the 17th ACM conference on information and knowledge management (CIKM).
Xia, F., Liu, T., Wang, J., Zhang, W., & Li, H. (2008). Listwise approach to learning to rank: Theory and algorithm. In Proceedings of the 25th international conference on machine learning (pp. 1192–1199).
Xu, J., & Li, H. (2007). AdaRank: A boosting algorithm for information retrieval. In International ACM SIGIR conference on research and development in information retrieval.
Yilmaz, E., & Robertson, S. (2009). On the choice of effectiveness measures for learning to rank. Information Retrieval, to appear in the special issue on Learning to Rank for Information Retrieval. doi:10.1007/s10791-009-9116-x.
Yue, Y., & Burges, C. (2007). On using simultaneous perturbation stochastic approximation for learning to rank, and the empirical optimality of LambdaRank. Tech. Rep. MSR-TR-2007-115, Microsoft Research.
Yue, Y., Finley, T., Radlinski, F., & Joachims, T. (2007). A support vector method for optimizing average precision. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on research and development in information retrieval (pp. 271–278). ACM.
Zheng, Z., Zha, H., Zhang, T., Chapelle, O., Chen, K., & Sun, G. (2008). A general boosting method and its application to learning ranking functions for web search. In Advances in neural information processing systems (Vol. 20, pp. 1697–1704). MIT Press.
Metadata
Title
Gradient descent optimization of smoothed information retrieval metrics
Authors
Olivier Chapelle
Mingrui Wu
Publication date
01.06.2010
Publisher
Springer Netherlands
Published in
Discover Computing / Issue 3/2010
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-009-9110-3
