Published in: Discover Computing 6/2018

02-05-2018

Linear feature extraction for ranking

Authors: Gaurav Pandey, Zhaochun Ren, Shuaiqiang Wang, Jari Veijalainen, Maarten de Rijke

Abstract

We address the feature extraction problem for document ranking in information retrieval, and propose LifeRank, a Linear feature extraction algorithm for Ranking. LifeRank regards each document collection for ranking as a matrix, referred to as the original matrix, and optimizes a transformation matrix so that a new matrix (dataset) is generated as the product of the original matrix and the transformation matrix. The transformation matrix projects high-dimensional document vectors into lower dimensions. In theory, there is a very large number of possible transformation matrices, each leading to a different generated matrix. LifeRank produces a transformation matrix whose generated matrix fits the learning-to-rank problem. Extensive experiments on benchmark datasets show the performance gains of LifeRank in comparison with state-of-the-art feature selection algorithms.
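The core operation described in the abstract — generating a lower-dimensional dataset as the product of the original matrix and a transformation matrix — can be sketched in a few lines of NumPy. In this toy example the matrix sizes are arbitrary and W is just a random orthonormal basis obtained via QR decomposition; LifeRank itself optimizes W with respect to a ranking objective, which this sketch does not attempt to reproduce.

```python
import numpy as np

# Toy "original matrix": 6 documents, each a 5-dimensional feature vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 5))

# A transformation matrix projecting 5 features down to k = 2 dimensions.
# Here W is an arbitrary orthonormal basis (reduced QR of a random matrix);
# LifeRank instead learns W so the projected dataset suits learning to rank.
k = 2
W, _ = np.linalg.qr(rng.normal(size=(5, k)))

# The new, lower-dimensional dataset is the product of the two matrices.
X_new = X @ W
print(X_new.shape)  # (6, 2)
```

Any downstream learning-to-rank model is then trained on X_new in place of X, with each document now represented by k features instead of the original dimensionality.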


Metadata
Title
Linear feature extraction for ranking
Authors
Gaurav Pandey
Zhaochun Ren
Shuaiqiang Wang
Jari Veijalainen
Maarten de Rijke
Publication date
02-05-2018
Publisher
Springer Netherlands
Published in
Discover Computing / Issue 6/2018
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-018-9330-5
