Elsevier

Knowledge-Based Systems

Volume 43, May 2013, Pages 74-81

Structural twin support vector machine for classification

https://doi.org/10.1016/j.knosys.2013.01.008

Abstract

It has been shown that the structural information of data may contain useful prior domain knowledge for training a classifier. How to exploit this structural information to build a good classifier has recently become a research focus. To the best of our knowledge, all existing structural large margin methods share a common limitation: they incorporate all the structural information within classes into a single model. As a result, these methods do not balance the relationships among the structural information both within and between classes, so this prior information is not exploited sufficiently. In this paper, we design a new Structural Twin Support Vector Machine (called S-TWSVM). Unlike existing methods based on structural information, S-TWSVM uses two hyperplanes to decide the category of new data; each model considers the structural information of only one class, staying close to that class while keeping far away from the other class. This allows S-TWSVM to fully exploit the prior knowledge and directly improve the generalization capacity of the algorithm. All experiments show that our proposed method consistently outperforms the state-of-the-art algorithms based on structural information of data in both computation time and classification accuracy.

Introduction

In the last decade, Support Vector Machines (SVMs) [1], as powerful tools for pattern classification and regression, have been successfully applied in a wide variety of fields [2], [3], [4], [5], [6], [7], [8], [9]. For standard support vector classification (SVC), the basic idea is to find the optimal separating hyperplane between the positive and negative examples. The optimal hyperplane is obtained by maximizing the margin between two parallel hyperplanes, which involves solving a quadratic programming problem (QPP). By introducing the kernel trick into the dual QPP, SVC can also solve nonlinear classification problems successfully. As extensions of SVM, many new large margin classifiers based on structural information have been proposed. In fact, the traditional SVM does not sufficiently exploit the prior distribution information of examples within classes, and an optimal classifier should be sensitive to the structure of the data distribution. Exploiting clustering algorithms to extract the structural information embedded within classes is one popular strategy [10], [11], [12]. The structured large margin machine (SLMM) [10] is a representative work based on this strategy. SLMM first explores the structural information within classes by applying Ward's agglomerative hierarchical clustering method to the input data [13], and then introduces the related structural information into the constraints. The resulting problem is solved by a sequential second order cone programming (SOCP). Experimentally, SLMM is superior to the support vector machine (SVM), the minimax probability machine (MPM) [14], and the maxi-min margin machine (M4) [15]. However, solving the involved SOCP problem is more difficult than the QPP in SVM, so SLMM has a much higher computational complexity than the traditional SVM. Consequently, a novel structural support vector machine (SRSVM) was proposed by Xue et al. [12].
Unlike SLMM, SRSVM exploits the classical framework of SVM, embedding the structural information into the objective rather than into the constraints as in SLMM, so the corresponding optimization problem can still be solved as a QPP. SRSVM has been shown to be theoretically and empirically better in generalization than SVM and SLMM.
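The margin idea behind SVC that this paragraph describes can be illustrated with a small sketch (toy data, not from the paper): the distance of a point x to the hyperplane w·x + b = 0 is |w·x + b| / ||w||, and the margin between the two parallel hyperplanes w·x + b = ±1 is 2 / ||w||.

```python
import numpy as np

def distance_to_hyperplane(w, b, x):
    # Geometric distance from point x to the hyperplane w.x + b = 0.
    return abs(np.dot(w, x) + b) / np.linalg.norm(w)

def margin(w):
    # Margin between the parallel hyperplanes w.x + b = +1 and w.x + b = -1.
    return 2.0 / np.linalg.norm(w)

# Hypothetical hyperplane with ||w|| = 5.
w = np.array([3.0, 4.0])
b = -1.0
print(distance_to_hyperplane(w, b, np.array([2.0, 1.0])))  # |6 + 4 - 1| / 5 = 1.8
print(margin(w))                                           # 2 / 5 = 0.4
```

Maximizing this margin subject to correct separation is exactly the QPP that standard SVC solves.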

In this paper, inspired by the success of TWSVM methods [16], [17], [18], [19], [20], [21], [22], we propose a new Structural Twin Support Vector Machine for classification (called S-TWSVM). Similar to structural SVM methods, S-TWSVM extracts the structural information within classes by a clustering technique, and then introduces the data distribution information into the model of S-TWSVM to construct a more reasonable classifier. Besides the features above, S-TWSVM has the following compelling properties.

  • To our knowledge, S-TWSVM is the first TWSVM implementation based on the structural information of the data, which is a useful extension of TWSVM.

  • We show that TWSVM and TBSVM [17] (TWSVM with regularization terms) are special cases of our proposed model. This provides an alternative explanation for the success of S-TWSVM.

  • Different from all existing structural classifiers such as [10], [11], [12], [14], [15], S-TWSVM considers only one class's structural information in each model. This gives S-TWSVM the following advantages: (1) it further reduces the computational complexity of the related QPP; (2) it can effectively handle the case where the structural information of the positive and negative classes is contradictory, and it applies the prior structural information within classes more reasonably; and (3) it further improves the flexibility of the algorithm and the generalization capacity of the model.
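The two-hyperplane decision rule shared by TWSVM-style methods, which S-TWSVM inherits, can be sketched as follows (a minimal illustration with hypothetical hyperplanes, not the paper's trained model): each class k has its own hyperplane w_k·x + b_k = 0, and a new point is assigned to the class whose hyperplane is nearer.

```python
import numpy as np

def twsvm_predict(x, w_pos, b_pos, w_neg, b_neg):
    # Assign x to the class whose hyperplane lies closer to it.
    d_pos = abs(np.dot(w_pos, x) + b_pos) / np.linalg.norm(w_pos)
    d_neg = abs(np.dot(w_neg, x) + b_neg) / np.linalg.norm(w_neg)
    return 1 if d_pos <= d_neg else -1

# Illustrative planes: x1 = 0 for the positive class, x1 = 4 for the negative.
w_p, b_p = np.array([1.0, 0.0]), 0.0
w_n, b_n = np.array([1.0, 0.0]), -4.0
print(twsvm_predict(np.array([1.0, 2.0]), w_p, b_p, w_n, b_n))  # 1
print(twsvm_predict(np.array([3.5, 0.0]), w_p, b_p, w_n, b_n))  # -1
```

In S-TWSVM, each of the two hyperplanes is fitted using the structural information of only its own class.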

The remaining parts of the paper are organized as follows. Section 2 briefly introduces the background of SLMM and SRSVM; Section 3 describes S-TWSVM in detail; all experimental results are reported in Section 4; the last section gives the conclusions.

Section snippets

Background

For classification, consider the training set T = {(x_1, y_1), …, (x_l, y_l)} ∈ (R^n × Y)^l, where x_i ∈ R^n, y_i ∈ Y = {1, −1}, i = 1, …, l.

Suppose there are C_P and C_N clusters in classes P and N respectively, i.e., P = P_1 ∪ … ∪ P_i ∪ … ∪ P_{C_P} and N = N_1 ∪ … ∪ N_j ∪ … ∪ N_{C_N}.

Extracting structural information within classes

Following the strategy of SLMM and SRSVM, S-TWSVM also has two steps. The first step is to extract the structural information within classes by some clustering method; the second step is model learning. In order to compare the main difference in the second step between S-TWSVM and the other two methods, here we adopt the same clustering method: Ward's linkage clustering (WIL) [10], [11], [12], [13], which is one of the hierarchical clustering analysis methods. A main advantage of WIL is
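The extraction step can be sketched as below. This is a minimal, illustrative implementation of Ward's linkage (not the paper's code), applied to one class at a time: at each step, the pair of clusters whose merge gives the smallest increase in within-cluster sum of squares, Δ(A, B) = |A||B| / (|A| + |B|) · ||μ_A − μ_B||², is joined.

```python
import numpy as np

def ward_clusters(X, n_clusters):
    # Agglomerative clustering with Ward's criterion: repeatedly merge
    # the pair of clusters with the smallest increase in within-cluster
    # sum of squares until n_clusters remain.  Returns lists of indices.
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                mu_a = X[clusters[a]].mean(axis=0)
                mu_b = X[clusters[b]].mean(axis=0)
                na, nb = len(clusters[a]), len(clusters[b])
                delta = na * nb / (na + nb) * np.sum((mu_a - mu_b) ** 2)
                if best is None or delta < best[0]:
                    best = (delta, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Toy positive class with two obvious groups (illustrative data only).
P = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
print(sorted(sorted(c) for c in ward_clusters(P, 2)))  # [[0, 1], [2, 3]]
```

In S-TWSVM this procedure would be run separately on class P and class N, giving the partitions P_1, …, P_{C_P} and N_1, …, N_{C_N} defined above.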

Experiments

In this section we compare S-TWSVM against TBSVM [17], SLMM [10], and SRSVM [11], [12] on various data sets.

For simplicity, let c1 = c4, c2 = c5, c3 = c6 in S-TWSVM. The testing accuracies of all experiments are computed using standard 10-fold cross validation. c1, c2, c3 and the RBF kernel parameter σ are all selected from the set {2^i | i = −7, …, 7} by 10-fold cross validation on a tuning set comprising a random 10% of the training data. Once the parameters are selected, the tuning set was
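The parameter grid described above can be written out as a short sketch; the grid itself follows the paper, while the RBF kernel parameterization shown (k(x, z) = exp(−||x − z||² / σ²)) is one common convention and is an assumption here.

```python
import numpy as np

# Candidate values for c1, c2, c3 and sigma: {2^i | i = -7, ..., 7}.
param_grid = [2.0 ** i for i in range(-7, 8)]
print(len(param_grid))                 # 15 candidate values
print(param_grid[0], param_grid[-1])   # 0.0078125 128.0

def rbf_kernel(x, z, sigma):
    # One common RBF parameterization (assumed, not stated in the paper):
    # k(x, z) = exp(-||x - z||^2 / sigma^2).
    return np.exp(-np.sum((x - z) ** 2) / sigma ** 2)
```

In the experiments, each candidate combination would be scored by 10-fold cross validation on the 10% tuning set and the best one retained.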

Conclusion

In this paper, we proposed a new Structural Twin Support Vector Machine (called S-TWSVM), which is sensitive to the structure of the data distribution. From the viewpoint of structural information, we first point out the shortcomings of the existing algorithms based on structural information. Next, we design the new S-TWSVM algorithm and analyze its advantages and its relationships with other algorithms. Theoretical analysis and all experimental results show S-TWSVM can more fully exploit these prior

Acknowledgment

This work has been partially supported by grants from the National Natural Science Foundation of China (Nos. 70921061, 11271361), the CAS/SAFEA International Partnership Program for Creative Research Teams, the Major International (Regional) Joint Research Project (No. 71110107026), and the President Fund of GUCAS.

References (30)

  • B. Schölkopf, I. Guyon, J. Weston, Statistical Learning and Kernel Methods in Bioinformatics, Tech. Rep.,...
  • Y. Tian et al., Recent advances on support vector machines research, Technological and Economic Development of Economy (2012)
  • J. Tan, Z. Zhang, L. Zhen, C. Zhang, N. Deng, Adaptive feature selection via a new version of support vector machine,...
  • D. Yeung et al., Structured large margin machines: sensitive to data distributions, Machine Learning (2007)
  • H. Xue, S. Chen, Q. Yang, Structural support vector machine, in: The 15th International Symposium on Neural Networks,...