DOI: 10.1145/1015330.1015341
Article

Support vector machine learning for interdependent and structured output spaces

Published: 04 July 2004

ABSTRACT

Learning general functional dependencies is one of the main goals in machine learning. Recent progress in kernel-based methods has focused on designing flexible and powerful input representations. This paper addresses the complementary issue of problems involving complex outputs such as multiple dependent output variables and structured output spaces. We propose to generalize multiclass Support Vector Machine learning in a formulation that involves features extracted jointly from inputs and outputs. The resulting optimization problem is solved efficiently by a cutting plane algorithm that exploits the sparseness and structural decomposition of the problem. We demonstrate the versatility and effectiveness of our method on problems ranging from supervised grammar learning and named-entity recognition, to taxonomic text classification and sequence alignment.
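
As a rough illustration of "features extracted jointly from inputs and outputs", the sketch below works through the simplest case, a multiclass problem. It is not the paper's implementation: the block-structured feature map, the 0/1 loss, and all function names (`joint_features`, `score`, `predict`, `most_violated`) are assumptions chosen for illustration. A cutting plane method would repeatedly call something like `most_violated` and add the returned constraint to a working set whenever the violation is large enough.

```python
from typing import List, Sequence, Tuple


def joint_features(x: Sequence[float], y: int, num_classes: int) -> List[float]:
    """Hypothetical joint feature map: copy x into the block for class y."""
    d = len(x)
    psi = [0.0] * (d * num_classes)
    psi[y * d:(y + 1) * d] = list(x)
    return psi


def score(w: Sequence[float], x: Sequence[float], y: int, num_classes: int) -> float:
    """Linear compatibility score between input x and candidate output y."""
    psi = joint_features(x, y, num_classes)
    return sum(wi * pi for wi, pi in zip(w, psi))


def predict(w: Sequence[float], x: Sequence[float], num_classes: int) -> int:
    """Predict the output with the highest joint score."""
    return max(range(num_classes), key=lambda y: score(w, x, y, num_classes))


def most_violated(
    w: Sequence[float], x: Sequence[float], y_true: int, num_classes: int
) -> Tuple[int, float]:
    """Find the output that most violates the margin (assumed 0/1 loss).

    Returns the maximizer of loss + score and the resulting violation,
    clipped at zero.
    """
    def augmented(y: int) -> float:
        loss = 0.0 if y == y_true else 1.0
        return loss + score(w, x, y, num_classes)

    y_hat = max(range(num_classes), key=augmented)
    violation = augmented(y_hat) - score(w, x, y_true, num_classes)
    return y_hat, max(0.0, violation)


if __name__ == "__main__":
    w = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]  # 3 classes, 2 input features
    x = [2.0, 1.5]
    print(predict(w, x, 3))
    print(most_violated(w, x, 0, 3))
```

For structured outputs such as sequences or parse trees, the exhaustive `max` over labels would have to be replaced by a problem-specific search (for example dynamic programming), which is where the structural decomposition mentioned in the abstract would come in.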

      • Published in

        ICML '04: Proceedings of the twenty-first international conference on Machine learning
        July 2004
        934 pages
        ISBN:1581138385
        DOI:10.1145/1015330
        • Conference Chair: Carla Brodley

        Copyright © 2004 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • Article

        Acceptance Rates

        Overall Acceptance Rate: 140 of 548 submissions, 26%
