Elsevier

Knowledge-Based Systems

Volume 43, May 2013, Pages 74-81

Structural twin support vector machine for classification

https://doi.org/10.1016/j.knosys.2013.01.008

Abstract

It has been shown that the structural information of data may contain useful prior domain knowledge for training a classifier. How to exploit this structural information to build a good classifier has recently become a research focus. To the best of our knowledge, all existing structural large margin methods share a common limitation: they incorporate all the structural information within classes into a single model. As a result, these methods do not balance the relationships among the structural information both within and between classes, so this prior information is not exploited sufficiently. In this paper, we design a new Structural Twin Support Vector Machine (called S-TWSVM). Unlike existing methods based on structural information, S-TWSVM uses two hyperplanes to decide the category of new data; each model considers the structural information of only one class, staying close to that class while keeping far away from the other class. This allows S-TWSVM to fully exploit the prior knowledge and directly improve the generalization capacity of the algorithm. All experiments show that our proposed method consistently outperforms the state-of-the-art algorithms based on structural information of data in both computation time and classification accuracy.

Introduction

In the last decade, Support Vector Machines (SVMs) [1], as powerful tools for pattern classification and regression, have been successfully applied in a wide variety of fields [2], [3], [4], [5], [6], [7], [8], [9]. For standard support vector classification (SVC), the basic idea is to find the optimal separating hyperplane between the positive and negative examples. The optimal hyperplane is obtained by maximizing the margin between two parallel hyperplanes, which involves solving a quadratic programming problem (QPP). By introducing the kernel trick into the dual QPP, SVC can also solve nonlinear classification problems successfully. As extensions of SVM, many new large margin classifiers based on structural information have been proposed. In fact, the traditional SVM does not sufficiently exploit the prior distribution information of examples within classes, and an optimal classifier should be sensitive to the structure of the data distribution. Exploiting clustering algorithms to extract the structural information embedded within classes is one popular strategy [10], [11], [12]. The structured large margin machine (SLMM) [10] is a representative work based on this strategy. SLMM first explores the structural information within classes by applying Ward's agglomerative hierarchical clustering method to the input data [13], and then introduces the related structural information into the constraints. The resulting problem is solved by a sequential second order cone programming (SOCP). Experimentally, SLMM is superior to the support vector machine (SVM), the minimax probability machine (MPM) [14], and the maxi-min margin machine (M4) [15]. However, solving the involved SOCP problem is more difficult than the QPP in SVM, so SLMM has a much higher computational complexity than the traditional SVM. Consequently, a novel structural support vector machine (SRSVM) was proposed by Xue et al. [12].
Unlike SLMM, SRSVM exploits the classical framework of SVM, embedding the structural information into the objective rather than into the constraints as in SLMM, so the corresponding optimization problem can still be solved as a QPP. SRSVM has been shown to be theoretically and empirically better in generalization than SVM and SLMM.
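The margin idea behind SVC that this paragraph describes can be illustrated with a small sketch (toy data, not from the paper): the distance of a point x to the hyperplane w·x + b = 0 is |w·x + b| / ||w||, and the margin between the two parallel hyperplanes w·x + b = ±1 is 2 / ||w||.

```python
import numpy as np

def distance_to_hyperplane(w, b, x):
    # Geometric distance from point x to the hyperplane w.x + b = 0.
    return abs(np.dot(w, x) + b) / np.linalg.norm(w)

def margin(w):
    # Margin between the parallel hyperplanes w.x + b = +1 and w.x + b = -1.
    return 2.0 / np.linalg.norm(w)

# Hypothetical hyperplane with ||w|| = 5.
w = np.array([3.0, 4.0])
b = -1.0
print(distance_to_hyperplane(w, b, np.array([2.0, 1.0])))  # |6 + 4 - 1| / 5 = 1.8
print(margin(w))                                           # 2 / 5 = 0.4
```

Maximizing this margin subject to correct separation is exactly the QPP that standard SVC solves.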

In this paper, inspired by the success of TWSVM methods [16], [17], [18], [19], [20], [21], [22], we propose a new Structural Twin Support Vector Machine for classification (called S-TWSVM). Similar to structural SVM methods, S-TWSVM extracts the structural information within classes by a clustering technique, and then introduces the data distribution information into the model of S-TWSVM to construct a more reasonable classifier. Besides the features above, S-TWSVM has the following compelling properties.

  • To our knowledge, S-TWSVM is the first TWSVM implementation based on the structural information of the data, which is a useful extension of TWSVM.

  • We show that TWSVM and TBSVM [17] (TWSVM with regularization terms) are special cases of our proposed model. This provides an alternative explanation for the success of S-TWSVM.

  • Different from all existing structural classifiers such as [10], [11], [12], [14], [15], S-TWSVM considers only one class's structural information in each model. This gives S-TWSVM the following advantages: (1) it further reduces the computational complexity of the related QPP; (2) it can effectively handle the case where the structural information of the positive and negative classes is contradictory, and it applies the prior structural information within classes more reasonably; and (3) it further improves the flexibility of the algorithm and the generalization capacity of the model.
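The two-hyperplane decision rule shared by TWSVM-style methods, which S-TWSVM inherits, can be sketched as follows (a minimal illustration with hypothetical hyperplanes, not the paper's trained model): each class k has its own hyperplane w_k·x + b_k = 0, and a new point is assigned to the class whose hyperplane is nearer.

```python
import numpy as np

def twsvm_predict(x, w_pos, b_pos, w_neg, b_neg):
    # Assign x to the class whose hyperplane lies closer to it.
    d_pos = abs(np.dot(w_pos, x) + b_pos) / np.linalg.norm(w_pos)
    d_neg = abs(np.dot(w_neg, x) + b_neg) / np.linalg.norm(w_neg)
    return 1 if d_pos <= d_neg else -1

# Illustrative planes: x1 = 0 for the positive class, x1 = 4 for the negative.
w_p, b_p = np.array([1.0, 0.0]), 0.0
w_n, b_n = np.array([1.0, 0.0]), -4.0
print(twsvm_predict(np.array([1.0, 2.0]), w_p, b_p, w_n, b_n))  # 1
print(twsvm_predict(np.array([3.5, 0.0]), w_p, b_p, w_n, b_n))  # -1
```

In S-TWSVM, each of the two hyperplanes is fitted using the structural information of only its own class.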

The remaining parts of the paper are organized as follows. Section 2 briefly introduces the background of SLMM and SRSVM; Section 3 describes S-TWSVM in detail; all experimental results are reported in Section 4; the last section gives the conclusions.

Section snippets

Background

For classification, consider the training set T = {(x_1, y_1), …, (x_l, y_l)} ∈ (R^n × Y)^l, where x_i ∈ R^n, y_i ∈ Y = {1, −1}, i = 1, …, l.

Suppose there are C_P and C_N clusters in classes P and N respectively, i.e., P = P_1 ∪ … ∪ P_i ∪ … ∪ P_{C_P} and N = N_1 ∪ … ∪ N_j ∪ … ∪ N_{C_N}.

Extracting structural information within classes

Following the strategy of SLMM and SRSVM, S-TWSVM also has two steps. The first step is to extract the structural information within classes by some clustering method; the second step is model learning. In order to compare the main difference in the second step between S-TWSVM and the other two methods, here we adopt the same clustering method: Ward's linkage clustering (WIL) [10], [11], [12], [13], which is one of the hierarchical clustering analysis methods. A main advantage of WIL is
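The extraction step can be sketched as below. This is a minimal, illustrative implementation of Ward's linkage (not the paper's code), applied to one class at a time: at each step, the pair of clusters whose merge gives the smallest increase in within-cluster sum of squares, Δ(A, B) = |A||B| / (|A| + |B|) · ||μ_A − μ_B||², is joined.

```python
import numpy as np

def ward_clusters(X, n_clusters):
    # Agglomerative clustering with Ward's criterion: repeatedly merge
    # the pair of clusters with the smallest increase in within-cluster
    # sum of squares until n_clusters remain.  Returns lists of indices.
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                mu_a = X[clusters[a]].mean(axis=0)
                mu_b = X[clusters[b]].mean(axis=0)
                na, nb = len(clusters[a]), len(clusters[b])
                delta = na * nb / (na + nb) * np.sum((mu_a - mu_b) ** 2)
                if best is None or delta < best[0]:
                    best = (delta, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Toy positive class with two obvious groups (illustrative data only).
P = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
print(sorted(sorted(c) for c in ward_clusters(P, 2)))  # [[0, 1], [2, 3]]
```

In S-TWSVM this procedure would be run separately on class P and class N, giving the partitions P_1, …, P_{C_P} and N_1, …, N_{C_N} defined above.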

Experiments

In this section we compare S-TWSVM against TBSVM [17], SLMM [10], and SRSVM [11], [12] on various data sets.

For simplicity, let c1 = c4, c2 = c5, c3 = c6 in S-TWSVM. The testing accuracies of all experiments are computed using standard 10-fold cross validation. c1, c2, c3 and the RBF kernel parameter σ are all selected from the set {2^i | i = −7, …, 7} by 10-fold cross validation on a tuning set comprising a random 10% of the training data. Once the parameters are selected, the tuning set was
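The parameter grid described above can be written out as a short sketch; the grid itself follows the paper, while the RBF kernel parameterization shown (k(x, z) = exp(−||x − z||² / σ²)) is one common convention and is an assumption here.

```python
import numpy as np

# Candidate values for c1, c2, c3 and sigma: {2^i | i = -7, ..., 7}.
param_grid = [2.0 ** i for i in range(-7, 8)]
print(len(param_grid))                 # 15 candidate values
print(param_grid[0], param_grid[-1])   # 0.0078125 128.0

def rbf_kernel(x, z, sigma):
    # One common RBF parameterization (assumed, not stated in the paper):
    # k(x, z) = exp(-||x - z||^2 / sigma^2).
    return np.exp(-np.sum((x - z) ** 2) / sigma ** 2)
```

In the experiments, each candidate combination would be scored by 10-fold cross validation on the 10% tuning set and the best one retained.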

Conclusion

In this paper, we proposed a new Structural Twin Support Vector Machine (called S-TWSVM), which is sensitive to the structure of the data distribution. From the viewpoint of structural information, we first point out the shortcomings of the existing algorithms based on structural information. Next, we design the new S-TWSVM algorithm and analyze its advantages and its relationships with other algorithms. Theoretical analysis and all experimental results show S-TWSVM can more fully exploit these prior

Acknowledgment

This work has been partially supported by grants from the National Natural Science Foundation of China (Nos. 70921061, 11271361), the CAS/SAFEA International Partnership Program for Creative Research Teams, the Major International (Regional) Joint Research Project (No. 71110107026), and the President Fund of GUCAS.

References (30)

  • B. Schölkopf, I. Guyon, J. Weston, Statistical Learning and Kernel Methods in Bioinformatics, Tech. Rep.,...
  • Y. Tian et al., Recent advances on support vector machines research, Technological and Economic Development of Economy (2012)
  • J. Tan, Z. Zhang, L. Zhen, C. Zhang, N. Deng, Adaptive feature selection via a new version of support vector machine,...
  • D. Yeung et al., Structured large margin machines: sensitive to data distributions, Machine Learning (2007)
  • H. Xue, S. Chen, Q. Yang, Structural support vector machine, in: The 15th International Symposium on Neural Networks,...