ABSTRACT
Learning general functional dependencies is one of the main goals in machine learning. Recent progress in kernel-based methods has focused on designing flexible and powerful input representations. This paper addresses the complementary issue of problems involving complex outputs such as multiple dependent output variables and structured output spaces. We propose to generalize multiclass Support Vector Machine learning in a formulation that involves features extracted jointly from inputs and outputs. The resulting optimization problem is solved efficiently by a cutting plane algorithm that exploits the sparseness and structural decomposition of the problem. We demonstrate the versatility and effectiveness of our method on problems ranging from supervised grammar learning and named-entity recognition, to taxonomic text classification and sequence alignment.
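The core ingredients the abstract names, a joint feature map over inputs and outputs, loss-augmented inference to find the most violated constraint, and an outer loop that grows a working set of cutting planes, can be sketched for the multiclass special case. This is a minimal illustration, not the paper's algorithm: the restricted problems are re-solved here by simple decaying-step subgradient descent rather than by the QP solver the paper uses, and the feature map (input vector copied into the block for its class) and all function names are illustrative assumptions.

```python
import numpy as np

def joint_feature(x, y, n_classes):
    """Illustrative joint feature map Phi(x, y): x placed in the block for class y."""
    phi = np.zeros(n_classes * x.size)
    phi[y * x.size:(y + 1) * x.size] = x
    return phi

def most_violated(w, x, y_true, n_classes):
    """Loss-augmented inference: argmax_y [Delta(y_true, y) + <w, Phi(x, y)>],
    here with 0/1 loss Delta."""
    scores = [(0.0 if y == y_true else 1.0) + w @ joint_feature(x, y, n_classes)
              for y in range(n_classes)]
    return int(np.argmax(scores))

def train(X, Y, n_classes, C=1.0, outer_iters=20, inner_iters=200):
    d = X.shape[1] * n_classes
    w = np.zeros(d)
    working_sets = [set() for _ in range(len(X))]  # cutting planes per example
    t = 0
    for _ in range(outer_iters):
        # Step 1: add the currently most violated label for each example.
        for i, (x, y) in enumerate(zip(X, Y)):
            working_sets[i].add(most_violated(w, x, y, n_classes))
        # Step 2: approximately re-solve the problem restricted to the working
        # sets (subgradient descent stands in for the paper's QP solver).
        for _ in range(inner_iters):
            t += 1
            grad = w.copy()  # gradient of the regularizer 0.5 * ||w||^2
            for i, (x, y) in enumerate(zip(X, Y)):
                phi_true = joint_feature(x, y, n_classes)
                worst, worst_val = None, 0.0
                for ybar in working_sets[i]:
                    if ybar == y:
                        continue
                    val = 1.0 + w @ joint_feature(x, ybar, n_classes) - w @ phi_true
                    if val > worst_val:
                        worst, worst_val = ybar, val
                if worst is not None:  # hinge term active: add its subgradient
                    grad += C * (joint_feature(x, worst, n_classes) - phi_true)
            w -= grad / t  # decaying step size
    return w

def predict(w, x, n_classes):
    return int(np.argmax([w @ joint_feature(x, y, n_classes)
                          for y in range(n_classes)]))
```

The sparseness the abstract mentions shows up as the small per-example working sets: only labels that were once maximally violating ever enter the optimization, rather than all exponentially many outputs of a structured space.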
Support vector machine learning for interdependent and structured output spaces