ABSTRACT
Discriminative training for structured outputs has found increasing application in areas such as natural language processing, bioinformatics, information retrieval, and computer vision. Focusing on large-margin methods, the most general training algorithms known to date (in terms of loss function and model structure) are based on cutting-plane approaches. While these algorithms are very efficient for linear models, their training complexity becomes quadratic in the number of examples when kernels are used. To overcome this bottleneck, we propose new training algorithms that use approximate cutting planes and random sampling to enable efficient training with kernels. We prove that these algorithms have improved time complexity while providing approximation guarantees. In empirical evaluations, our algorithms produced solutions with training and test error rates close to those of exact solvers. Even on binary classification problems, where highly optimized conventional solvers exist (e.g., SVM-light), our methods are about an order of magnitude faster on large datasets, while remaining competitive in speed on datasets of medium size.