ABSTRACT
Linear Support Vector Machines (SVMs) have become one of the most prominent machine learning techniques for high-dimensional sparse data commonly encountered in applications like text classification, word-sense disambiguation, and drug design. These applications involve a large number of examples n as well as a large number of features N, while each example has only s << N non-zero features. This paper presents a Cutting-Plane Algorithm for training linear SVMs that provably has training time O(sn) for classification problems and O(sn log(n)) for ordinal regression problems. The algorithm is based on an alternative, but equivalent, formulation of the SVM optimization problem. Empirically, the Cutting-Plane Algorithm is several orders of magnitude faster than decomposition methods like SVM-Light for large datasets.
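The abstract describes the cutting-plane idea only at a high level. The following is a minimal illustrative sketch, not the paper's implementation: it trains a linear classifier (without a bias term) using a one-slack cutting-plane scheme, where each iteration adds the most violated constraint and re-solves the restricted problem over the working set. The restricted QP is solved here with SciPy's general-purpose SLSQP solver rather than a specialized QP routine, and the function name `cutting_plane_svm` and its parameters are our own for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def cutting_plane_svm(X, y, C=1.0, eps=1e-3, max_iter=50):
    """Cutting-plane training sketch for a linear SVM (one-slack form).

    Solves min_w 0.5*||w||^2 + C*xi subject to, for each cut (a, b):
        w @ a + xi >= b,   xi >= 0
    where each cut averages the margin-violating examples.
    """
    n, d = X.shape
    w, xi = np.zeros(d), 0.0
    cuts = []  # working set of constraints (a, b)
    for _ in range(max_iter):
        # Most violated constraint: include exactly the examples
        # whose margin y_i * (w @ x_i) is below 1.
        viol = y * (X @ w) < 1
        a = (y[viol][:, None] * X[viol]).sum(axis=0) / n
        b = viol.mean()
        if b - w @ a <= xi + eps:
            break  # no constraint violated by more than eps: done
        cuts.append((a, b))
        # Re-solve the restricted QP over the current working set.
        # Variables z = (w_1..w_d, xi); SLSQP handles the inequalities.
        def obj(z):
            return 0.5 * z[:d] @ z[:d] + C * z[d]
        cons = [{'type': 'ineq',
                 'fun': (lambda z, ak=ak, bk=bk: z[:d] @ ak + z[d] - bk)}
                for ak, bk in cuts]
        cons.append({'type': 'ineq', 'fun': lambda z: z[d]})  # xi >= 0
        res = minimize(obj, np.concatenate([w, [xi]]),
                       constraints=cons, method='SLSQP')
        w, xi = res.x[:d], res.x[d]
    return w
```

Because each cut aggregates all violating examples into a single constraint, the working set stays small regardless of n; computing the cut is the only step that touches the data, which is the source of the linear-time behavior the abstract claims.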
- R. Caruana, T. Joachims, and L. Backstrom. KDD Cup 2004: Results and analysis. ACM SIGKDD Newsletter, 6(2):95--108, 2004.
- C.-C. Chang and C.-J. Lin. LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
- R. Collobert and S. Bengio. SVMTorch: Support vector machines for large-scale regression problems. Journal of Machine Learning Research (JMLR), 1:143--160, 2001.
- J. Díez, J. del Coz, and A. Bahamonde. A support vector method for ranking minimizing the number of swapped pairs. Technical report, Artificial Intelligence Centre, Universidad de Oviedo at Gijón, 2006.
- S. Dumais, J. Platt, D. Heckerman, and M. Sahami. Inductive learning algorithms and representations for text categorization. In Proceedings of ACM-CIKM98, November 1998.
- M. Ferris and T. Munson. Interior-point methods for massive support vector machines. SIAM Journal on Optimization, 13(3):783--804, 2003.
- G. Fung and O. Mangasarian. Proximal support vector classifiers. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2001.
- R. Herbrich, T. Graepel, and K. Obermayer. Large margin rank boundaries for ordinal regression. In Advances in Large Margin Classifiers, pages 115--132. MIT Press, Cambridge, MA, 2000.
- D. Hush and C. Scovel. Polynomial-time decomposition algorithms for support vector machines. Machine Learning, 51:51--71, 2003.
- T. Joachims. Text categorization with support vector machines: Learning with many relevant features. In Proceedings of the European Conference on Machine Learning, pages 137--142, Berlin, 1998. Springer.
- T. Joachims. Making large-scale SVM learning practical. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning, chapter 11, pages 169--184. MIT Press, Cambridge, MA, 1999.
- T. Joachims. Optimizing search engines using clickthrough data. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), 2002.
- T. Joachims. Learning to align sequences: A maximum-margin approach. Online manuscript, August 2003.
- T. Joachims. A support vector method for multivariate performance measures. In International Conference on Machine Learning (ICML), 2005.
- S. Keerthi and D. DeCoste. A modified finite Newton method for fast solution of large scale linear SVMs. Journal of Machine Learning Research (JMLR), 6:341--361, 2005.
- J. Kelley. The cutting-plane method for solving convex programs. Journal of the Society for Industrial and Applied Mathematics, 8:703--712, 1960.
- D. Lewis, Y. Yang, T. Rose, and F. Li. RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research (JMLR), 5:361--397, 2004.
- O. Mangasarian and D. Musicant. Lagrangian support vector machines. Journal of Machine Learning Research (JMLR), 1:161--177, 2001.
- J. Platt. Fast training of support vector machines using sequential minimal optimization. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning, chapter 12. MIT Press, 1999.
- A. Rakotomamonjy. SVMs and area under ROC curve. Technical report, PSI-INSA de Rouen, 2004.
- B. Schölkopf and A. J. Smola. Learning with Kernels. The MIT Press, Cambridge, MA, 2002.
- B. Schölkopf, A. J. Smola, R. C. Williamson, and P. L. Bartlett. New support vector algorithms. Neural Computation, 12:1207--1245, 2000.
- I. Tsang, J. Kwok, and P.-M. Cheung. Core vector machines: Fast SVM training on very large data sets. Journal of Machine Learning Research (JMLR), 6:363--392, 2005.
- I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun. Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research (JMLR), 6:1453--1484, September 2005.
Index Terms
- Training linear SVMs in linear time