ABSTRACT
The central problem for many applications in Information Retrieval is ranking and learning to rank is considered as a promising approach for addressing the issue. Ranking SVM, for example, is a state-of-the-art method for learning to rank and has been empirically demonstrated to be effective. In this paper, we study the issue of learning to rank, particularly the approach of using SVM techniques to perform the task. We point out that although Ranking SVM is advantageous, it still has shortcomings. Ranking SVM employs a single hyperplane in the feature space as the model for ranking, which is too simple to tackle complex ranking problems. Furthermore, the training of Ranking SVM is also computationally costly. In this paper, we look at an alternative approach to Ranking SVM, which we call "Multiple Hyperplane Ranker" (MHR), and make comparisons between the two approaches. MHR takes the divide-and-conquer strategy. It employs multiple hyperplanes to rank instances and finally aggregates the ranking results given by the hyperplanes. MHR contains Ranking SVM as a special case, and MHR can overcome the shortcomings which Ranking SVM suffers from. Experimental results on two information retrieval datasets show that MHR can outperform Ranking SVM in ranking.
- Baeza-Yates, R., Ribeiro-Neto, B. Modern Information Retrieval, Addison Wesley, 1999. Google ScholarDigital Library
- Borda, J. C. Mémoire sur les élections au scrutin. Histoire de l'Académie Royale des Sciences, 1781.Google Scholar
- Burges, C. J. C., Shaked, T., Renshaw, E., Lazier, A., Deeds, M., Hamilton, N., Hullender, G. Learning to Rank using Gradient Descent, 22nd International Conference on Machine Learning, Bonn, 2005. Google ScholarDigital Library
- Cao, Y., Xu, J., Liu, T. Y., Li, H., Huang, Y., Hon, H. W. Adapting ranking SVM to document retrieval. In SIGIR 2006: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, Seattle, Washington, USA. Google ScholarDigital Library
- Chirita, P. A., Diederich, J., and Nejdl, W. MailRank: using ranking for spam detection, Proceedings of the 14th ACM international conference on Information and knowledge management, 2005. Google ScholarDigital Library
- Chu , W., and S. Sathiya Keerthi, New approaches to support vector ordinal regression, Proceedings of the 22nd international conference on Machine learning, p. 145--152, August 07-11, 2005, Bonn, Germany. Google ScholarDigital Library
- Cohen, W. W., Schapire, R. E., and Singer, Y. Learning to order things, Proceedings of the 1997 conference on Advances in neural information processing systems 10, p.451--457, July 1998, Denver, Colorado, United States. Google ScholarDigital Library
- Collins, M. Ranking algorithms for named-entity extraction: boosting and the voted perceptron, Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, July 07-12, 2002, Philadelphia, Pennsylvania. Google ScholarDigital Library
- Dave, K., Lawrence, S., Pennock, D. M. Mining the peanut gallery: opinion extraction and semantic classification of product reviews, Proceedings of the 12th international conference on World Wide Web, May 20-24, 2003, Budapest, Hungary. Google ScholarDigital Library
- Dietterich, T. G. and Bakiri, G. Solving multiclass learning problems via error correcting output codes. Journal of artificial Intelligence Research, 2:263--286, January 1995. Google ScholarDigital Library
- Dwork, C., Kumar, R., Naor, M., and Sivakumar, D. Rank aggregation methods for the web. In Proceedings of the 10th International World Wide Web Conference. 2001, 613--622. Google ScholarDigital Library
- Fagin, R., Kumar, R., and Sivakumar, D. Efficient similarity search and classification via rank aggregation. In Proceedings of the 2003 ACM SIGMOD International Conference on management of data. San Diego, 2003, 301--312. Google ScholarDigital Library
- Freund, Y., Iyer, R., Schapire, R., & Singer, Y., An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research, 2003 (4). Google ScholarDigital Library
- Gyongyi, Z., Garcia-Molina, H., and Pedersen, J. Combating web spam with TrustRank. In Proceedings of the 30th VLDB Conference, Sept. 2004. Google ScholarDigital Library
- Harrington, E. F. Online Ranking/Collaborative filtering using the Perceptron Algorithm. Proceedings of the 20th International Conference on Machine Learning, pages 250--257, 2003.Google Scholar
- Har-Peled, S. , Roth, D., and Zimak, D. Constraint classification: A new approach to multiclass classification and ranking. In Proc. Advances in Neural Information Processing Systems (NIPS'02), 2002.Google ScholarDigital Library
- Hersh, W. R., Buckley, C., Leone, T. J., and Hickam, D. H. OHSUMED: An interactive retrieval evaluation and new large test collection for research. Proceedings of the 17th Annual ACM SIGIR Conference, pages 192--201, 1994. Google ScholarDigital Library
- Herbrich, R., Graepel, T., & Obermayer, K. (2000). Large margin rank boundaries for ordinal regression. Advances in Large Margin Classiers, MIT Press, Pages: 115--132.Google Scholar
- Hsu, C. W., and Lin, C. J. A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks, 13(2):415--425, 2002. Google ScholarDigital Library
- Jarvelin, K., & Kekalainen, J. (2000). IR evaluation methods for retrieving highly relevant documents. Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, p.41--48, July 24-28, 2000, Athens, Greece. Google ScholarDigital Library
- Jarvelin, K., & Kekalainen, J. Cumulated Gain-Based Evaluation of IR Techniques, ACM Transactions on Information Systems, 2002. Google ScholarDigital Library
- Joachims, T. Making large-Scale SVM Learning Practical. Advances in Kernel Methods - Support Vector Learning, B. Schölkopf and C. Burges and A. Smola (ed.), MIT-Press, 1999. Google ScholarDigital Library
- Joachims, T. Optimizing Search Engines Using Clickthrough Data, Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), ACM, 2002. Google ScholarDigital Library
- Kong, E. B., and Dietterich, T. G. Error-correcting output coding corrects bias and variance. In Proceedings of the Twelfth International Conference on Machine Learning, pages 313--321, 1995.Google ScholarDigital Library
- Matveeva, I., Burges, C., Burkard, T., Laucius, A., and Wong, L. High Accuracy Retrieval with Multiple Nested Ranker. In SIGIR 2006: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, Seattle, Washington, USA. Google ScholarDigital Library
- Nallapati, R. Discriminative models for information retrieval. Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, July 25-29, 2004, Sheffield, United Kingdom. Google ScholarDigital Library
- Pang, B., and Lee, L. Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales. ACL 2005 Google ScholarDigital Library
- Robertson, S. E. Overview of the okapi projects, Journal of Documentation, Vol. 53, No. 1, 1997, pp. 3--7.Google ScholarCross Ref
- Shashua, A., and Levin, A. Taxonomy of Large Margin Principle Algorithm for Ordinal Regression Problems. Advances in Neural Information Processing Systems 15. Cambridge, MA: MIT Press, 2000.Google Scholar
- Weston, J. and Watkins, C. Multi-class support vector machines. Technical Report CSD-TR-98-04, Department of Computer Science, Royal Holloway, University of London, Egham, TW20 0EX, UK, 1998.Google Scholar
- Xu, J., Cao, Y.B., Li, H., Zhao, M. Ranking definitions with supervised learning methods, Special interest tracks and posters of the 14th international conference on World Wide Web, May 10-14, 2005, Chiba, Japan. Google ScholarDigital Library
Index Terms
- Ranking with multiple hyperplanes
Recommendations
Adapting ranking SVM to document retrieval
SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrievalThe paper is concerned with applying learning to rank to document retrieval. Ranking SVM is a typical method of learning to rank. We point out that there are two factors one must consider when applying Ranking SVM, in general a "learning to rank" method,...
Learning to Rank with Ensemble Ranking SVM
In this paper, we propose a novel learning to rank method using Ensemble Ranking SVM. Ensemble Ranking SVM is based on Ranking SVM which has been commonly used for learning to rank. The basic idea of Ranking SVM is to formulate the problem of learning ...
Nomogram visualization for ranking support vector machine
ISNN'11: Proceedings of the 8th international conference on Advances in neural networks - Volume Part IIIn this paper, we propose a visualization model for a trained ranking support vector machine. In addition, we also introduce a feature selection method for the ranking support vector machine, and show visually each feature's effect. Nomogram is a well-...
Comments