
A Feature Selection Newton Method for Support Vector Machine Classification

  • Published in: Computational Optimization and Applications

Abstract

A fast Newton method that suppresses input space features is proposed for a linear programming formulation of support vector machine classifiers. The proposed stand-alone method can handle classification problems in very high dimensional spaces, such as 28,032 dimensions, and generates a classifier that depends on very few input features, such as 7 out of the original 28,032. The method can also handle problems with a large number of data points and requires no specialized linear programming package, merely a linear equation solver. For nonlinear kernel classifiers, the method utilizes a minimal number of kernel functions in the classifier that it generates.
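The abstract's key computational claim is that each step needs only a linear equation solve, not a linear programming package. A minimal sketch of that idea is the finite Newton iteration below for a 2-norm-regularized squared-hinge linear SVM, using a generalized Hessian over the margin-violating points. This is an illustration of the Newton-per-step/linear-solve structure only, not the paper's exact algorithm: the function name, toy data, and parameter `nu` are assumptions, and the paper's feature suppression comes from a 1-norm term on the weight vector, which this sketch omits.

```python
import numpy as np

def newton_svm(A, d, nu=1.0, tol=1e-6, max_iter=50):
    """Illustrative finite Newton iteration for a squared-hinge linear SVM.

    Minimizes f(w) = (nu/2) * ||max(0, e - D A w)||^2 + (1/2) ||w||^2,
    where A is the m x n data matrix and D = diag(d) holds +1/-1 labels.
    Each step solves one n x n linear system with a generalized Hessian
    built from the currently margin-violating points.
    """
    m, n = A.shape
    w = np.zeros(n)
    for _ in range(max_iter):
        margins = 1.0 - d * (A @ w)          # componentwise e - D A w
        active = margins > 0                 # points violating the margin
        grad = w - nu * A[active].T @ (d[active] * margins[active])
        if np.linalg.norm(grad) < tol:
            break
        # Generalized Hessian: identity plus Gram matrix of active rows
        H = np.eye(n) + nu * A[active].T @ A[active]
        w -= np.linalg.solve(H, grad)        # plain linear equation solve
    return w

# Toy separable data: the first feature carries the class information
A = np.array([[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]])
d = np.array([1.0, 1.0, -1.0, -1.0])
w = newton_svm(A, d)
print(w, np.sign(A @ w))
```

Because the objective is piecewise quadratic, the active set stabilizes after finitely many steps and the iteration terminates, which is the sense in which Newton methods of this family are "finite".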


Cite this article

Fung, G.M., Mangasarian, O. A Feature Selection Newton Method for Support Vector Machine Classification. Computational Optimization and Applications 28, 185–202 (2004). https://doi.org/10.1023/B:COAP.0000026884.66338.df
