Published in: Discover Computing 6/2018

02-05-2018

Linear feature extraction for ranking

Authors: Gaurav Pandey, Zhaochun Ren, Shuaiqiang Wang, Jari Veijalainen, Maarten de Rijke

Abstract

We address the feature extraction problem for document ranking in information retrieval, and propose LifeRank, a Linear feature extraction algorithm for Ranking. LifeRank regards each document collection for ranking as a matrix, referred to as the original matrix, and optimizes a transformation matrix so that a new matrix (dataset) is generated as the product of the original matrix and the transformation matrix. The transformation matrix projects high-dimensional document vectors into lower dimensions. In theory, there is a very large number of possible transformation matrices, each leading to a different generated matrix. LifeRank produces a transformation matrix whose generated matrix fits the learning-to-rank problem. Extensive experiments on benchmark datasets show the performance gains of LifeRank in comparison with state-of-the-art feature selection algorithms.
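The core operation described in the abstract — generating a lower-dimensional dataset as the product of the original matrix and a transformation matrix — can be sketched in a few lines of NumPy. In this toy example the matrix sizes are arbitrary and W is just a random orthonormal basis obtained via QR decomposition; LifeRank itself optimizes W with respect to a ranking objective, which this sketch does not attempt to reproduce.

```python
import numpy as np

# Toy "original matrix": 6 documents, each a 5-dimensional feature vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 5))

# A transformation matrix projecting 5 features down to k = 2 dimensions.
# Here W is an arbitrary orthonormal basis (reduced QR of a random matrix);
# LifeRank instead learns W so the projected dataset suits learning to rank.
k = 2
W, _ = np.linalg.qr(rng.normal(size=(5, k)))

# The new, lower-dimensional dataset is the product of the two matrices.
X_new = X @ W
print(X_new.shape)  # (6, 2)
```

Any downstream learning-to-rank model is then trained on X_new in place of X, with each document now represented by k features instead of the original dimensionality.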


Metadata
Title
Linear feature extraction for ranking
Authors
Gaurav Pandey
Zhaochun Ren
Shuaiqiang Wang
Jari Veijalainen
Maarten de Rijke
Publication date
02-05-2018
Publisher
Springer Netherlands
Published in
Discover Computing / Issue 6/2018
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-018-9330-5
