Research article · DOI: 10.1145/1401890.1401985

Training structural SVMs with kernels using sampled cuts

Published: 24 August 2008

ABSTRACT

Discriminative training for structured outputs has found increasing application in areas such as natural language processing, bioinformatics, information retrieval, and computer vision. Focusing on large-margin methods, the most general (in terms of loss function and model structure) training algorithms known to date are based on cutting-plane approaches. While these algorithms are very efficient for linear models, their training complexity becomes quadratic in the number of examples when kernels are used. To overcome this bottleneck, we propose new training algorithms that use approximate cutting planes and random sampling to enable efficient training with kernels. We prove that these algorithms have improved time complexity while providing approximation guarantees. In empirical evaluations, our algorithms produced solutions with training and test error rates close to those of exact solvers. Even on binary classification problems, where highly optimized conventional solvers exist (e.g., SVM-light), our methods are about an order of magnitude faster on large datasets while remaining competitive in speed on medium-sized datasets.
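The sampled-cuts idea sketched in the abstract can be illustrated with a deliberately simplified example. The sketch below is *not* the paper's algorithm (which handles general structured outputs, arbitrary loss functions, and kernels); it is a minimal linear, binary, one-slack cutting-plane trainer in which each cut is estimated from a random sample of `r` examples instead of a full pass over all `n`. All function names and parameter values are hypothetical choices for illustration.

```python
import numpy as np

def sampled_cut(w, X, y, rng, r):
    """Approximate most-violated one-slack cut, estimated from r sampled examples.

    The exact cut averages y_i * x_i over all margin violators (y_i w.x_i < 1);
    sampling r examples gives an unbiased estimate of that average at O(r*d) cost.
    """
    idx = rng.choice(len(y), size=r, replace=False)
    ys, Xs = y[idx], X[idx]
    viol = ys * (Xs @ w) < 1.0                     # sampled margin violators
    g = (ys[viol][:, None] * Xs[viol]).sum(axis=0) / r
    b = viol.sum() / r
    return g, b

def train_sampled_cutting_plane(X, y, C=10.0, iters=30, r=40, seed=0):
    """Cutting-plane training with sampled cuts (linear, binary, one-slack sketch).

    Primal: min 0.5||w||^2 + C*xi  s.t.  w.g_j >= b_j - xi  for every cut j.
    The restricted dual (alpha_j >= 0, sum alpha_j <= C) is re-solved after each
    new cut by exact coordinate ascent, maintaining w = sum_j alpha_j g_j.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    cuts_g, cuts_b, alpha = [], [], []
    w = np.zeros(d)
    for _ in range(iters):
        g, b = sampled_cut(w, X, y, rng, r)
        cuts_g.append(g); cuts_b.append(b); alpha.append(0.0)
        for _ in range(50):                        # inner dual re-optimization
            for j in range(len(alpha)):
                gj = cuts_g[j]
                q = gj @ gj
                if q < 1e-12:                      # degenerate (empty) cut
                    continue
                w_minus = w - alpha[j] * gj        # w without cut j's contribution
                cap = C - (sum(alpha) - alpha[j])  # keep sum(alpha) <= C
                a_new = float(np.clip((cuts_b[j] - gj @ w_minus) / q, 0.0, cap))
                w = w_minus + a_new * gj
                alpha[j] = a_new
    return w
```

Because the sampled cut is an unbiased estimate of the exact cut, each iteration costs O(r·d) rather than O(n·d), which is the source of the speedup the abstract describes; the paper's analysis bounds the approximation error this sampling introduces.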


Published in: KDD '08: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2008, 1116 pages. ISBN: 9781605581934. DOI: 10.1145/1401890. General Chair: Ying Li. Program Chairs: Bing Liu, Sunita Sarawagi.

      Copyright © 2008 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance rates: KDD '08 paper acceptance rate: 118 of 593 submissions (20%). Overall acceptance rate: 1,133 of 8,635 submissions (13%).
