1 Introduction
2 Basic concepts
2.1 Support vector machine
2.2 Twin support vector machine
2.3 Least squares twin support vector machine
3 Proposed algorithm
3.1 Simulated annealing
3.2 SA-LSTSVM
Data set | # features | # samples | Missing values? |
---|---|---|---|
Australian Credit Approval | 14 | 690 | No |
Liver Disorders | 7 | 345 | No |
Contraceptive Method Choice (CMC) | 9 | 1473 | No |
Statlog (Heart) | 13 | 270 | No |
Hepatitis | 19 | 155 | Yes |
Ionosphere | 34 | 351 | No |
Connectionist Bench (Sonar) | 60 | 208 | No |
Congressional Voting Records | 16 | 435 | Yes |
Breast Cancer Wisconsin (Prognostic) | 34 | 198 | No |

Data set | SA-LSTSVM | LSTSVM | TSVM | GEPSVM | PSVM | SVM | C4.5 |
---|---|---|---|---|---|---|---|
Australian Credit Approval | 88.21 \(\pm \) 0.02 (c = 0.5, sigma = 0.015) | 86.61 \(\pm \) 4.0 | 86.91 \(\pm \) 3.5 | 80.00 \(\pm \) 3.99 | 85.43 \(\pm \) 3.0 | 85.51 \(\pm \) 4.58 | 85.2 \(\pm \) 1.3 |
Liver Disorder | 71.3 \(\pm \) 0.15 (c = 0.0004, sigma = 0.037) | 70.90 \(\pm \) 6.09 | 70.5 \(\pm \) 6.6 | 66.36 \(\pm \) 4.39 | 70.15 \(\pm \) 8.82 | 58.32 \(\pm \) 8.2 | 68.3 \(\pm \) 0.7 |
Contraceptive Method Choice (CMC) | 70.48 \(\pm \) 0.04 (c = 0.5, sigma = 5.08E-05) | 68.84 \(\pm \) 2.77 | 68.84 \(\pm \) 2.39 | 68.76 \(\pm \) 2.98 | 68.98 \(\pm \) 3.95 | 67.82 \(\pm \) 2.63 | 65.1 \(\pm \) 0.02 |
Statlog (Heart) | 90.61 \(\pm \) 0.43 (c = 0.5, sigma = 0.0004) | 85.55 \(\pm \) 4.07 | 86.66 \(\pm \) 6.8 | 85.55 \(\pm \) 6.1 | 85.55 \(\pm \) 7.27 | 84.07 \(\pm \) 4.4 | 76.6 \(\pm \) 0.4 |
Hepatitis | 98.21 \(\pm \) 0.30 (c = 0.007, sigma = 2.09E-07) | 86.42 \(\pm \) 9.78 | 85.71 \(\pm \) 6.73 | 85 \(\pm \) 9.19 | 85.71 \(\pm \) 5.83 | 80.83 \(\pm \) 8.3 | 60.6 \(\pm \) 1.08 |
Ionosphere | 91.37 \(\pm \) 0.11 (c = 0.5, sigma = 5.08E-05) | 89.70 \(\pm \) 5.58 | 88.23 \(\pm \) 3.10 | 84.11 \(\pm \) 3.2 | 89.11 \(\pm \) 2.79 | 86.04 \(\pm \) 2.37 | 90.8 \(\pm \) 2.3 |
Connectionist Bench (Sonar) | 82.81 \(\pm \) 0.18 (c = 0.031, sigma = 1.69E-05) | 80.47 \(\pm \) 6.7 | 80.52 \(\pm \) 4.9 | 79.47 \(\pm \) 7.6 | 78.94 \(\pm \) 4.43 | 79.79 \(\pm \) 5.31 | 68.3 \(\pm \) 3.5 |
Congressional Voting Records | 98.22 \(\pm \) 0.01 (c = 0.25, sigma = 5.08E-05) | 95.23 \(\pm \) 1.94 | 95.9 \(\pm \) 2.2 | 95 \(\pm \) 2.36 | 95 \(\pm \) 3.06 | 94.5 \(\pm \) 2.71 | 91.6 \(\pm \) 0.87 |
Breast Cancer Wisconsin (Prognostic) | 97.35 \(\pm \) 0.005 (c = 0.5, sigma = 0.012) | 83.88 \(\pm \) 5.52 | 83.68 \(\pm \) 6.24 | 81.11 \(\pm \) 7.94 | 83.3 \(\pm \) 4.53 | 79.92 \(\pm \) 9.18 | 90.5 \(\pm \) 3.9 |
4 Experimental results
4.1 Small data sets
4.2 Larger data sets
4.3 Statistical comparison of classifiers

Data set | SA-LSTSVM | LSTSVM | TSVM | GEPSVM | PSVM | SVM | C4.5 |
---|---|---|---|---|---|---|---|
NDC-3k | 85.16 | 79.24 | 77.73 | 77.20 | 79.23 | 62 | 80 |
NDC-4k | 84.32 | 79.87 | 78.65 | 75.98 | 79.87 | 61.85 | 80.42 |
NDC-5k | 85.40 | 78.93 | 77.49 | 75.43 | 78.01 | 61.72 | 79.52 |
NDC-10k | 87.64 | 86.17 | 85.31 | 84.32 | 85.95 | 61.4 | 82.5 |
NDC-100k | 88.31 | 86.07 | \(*\) | 84.02 | 86.32 | \(*\) | 89.2 |

Data set | SA-LSTSVM | LSTSVM | TSVM | GEPSVM | PSVM | SVM | C4.5 |
---|---|---|---|---|---|---|---|
Australian Credit Approval | 88.21 (1) | 86.61 (3) | 86.91 (2) | 80.00 (7) | 85.43 (5) | 85.51 (4) | 85.2 (6) |
Liver Disorder | 71.3 (1) | 70.90 (2) | 70.5 (3) | 66.36 (6) | 70.15 (4) | 58.32 (7) | 68.3 (5) |
Contraceptive Method Choice (CMC) | 70.48 (1) | 68.84 (3.5) | 68.84 (3.5) | 68.76 (5) | 68.98 (2) | 67.82 (6) | 65.1 (7) |
Statlog (Heart) | 90.61 (1) | 85.55 (4) | 86.66 (2) | 85.55 (4) | 85.55 (4) | 84.07 (6) | 76.6 (7) |
Hepatitis | 98.21 (1) | 86.42 (2) | 85.71 (3.5) | 85 (5) | 85.71 (3.5) | 80.83 (6) | 60.6 (7) |
Ionosphere | 91.37 (1) | 89.70 (3) | 88.23 (5) | 84.11 (7) | 89.11 (4) | 86.04 (6) | 90.8 (2) |
Connectionist Bench (Sonar) | 82.81 (1) | 80.47 (3) | 80.52 (2) | 79.47 (5) | 78.94 (6) | 79.79 (4) | 68.3 (7) |
Congressional Voting Records | 98.22 (1) | 95.23 (3) | 95.9 (2) | 95 (4.5) | 95 (4.5) | 94.5 (6) | 91.6 (7) |
Breast Cancer Wisconsin (Prognostic) | 97.35 (1) | 83.88 (3) | 83.68 (4) | 81.11 (6) | 83.3 (5) | 79.92 (7) | 90.5 (2) |
NDC-3k | 85.16 (1) | 79.24 (3) | 77.73 (5) | 77.20 (6) | 79.23 (4) | 62 (7) | 80 (2) |
NDC-4k | 84.32 (1) | 79.87 (3.5) | 78.65 (5) | 75.98 (6) | 79.87 (3.5) | 61.85 (7) | 80.42 (2) |
NDC-5k | 85.40 (1) | 78.93 (3) | 77.49 (5) | 75.43 (6) | 78.01 (4) | 61.72 (7) | 79.52 (2) |
NDC-10k | 87.64 (1) | 86.17 (2) | 85.31 (4) | 84.32 (5) | 85.95 (3) | 61.4 (7) | 82.5 (6) |
Average rank | 1 | 2.923 | 3.538 | 5.576 | 4.038 | 6.153 | 4.769 |
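
The average ranks in the last row of the table above can be reproduced from the accuracies alone. The following is a minimal sketch, assuming that the best accuracy on a data set receives rank 1 and that tied accuracies share the mean of the positions they occupy (which matches the fractional ranks such as 3.5 and 4.5 in the table). The names `ACCURACY`, `average_ranks_desc`, and `mean_ranks` are illustrative and do not come from the paper. NDC-100k is left out because the table ranks only the 13 data sets on which every algorithm produced a result.

```python
from typing import Dict, List

ALGORITHMS = ["SA-LSTSVM", "LSTSVM", "TSVM", "GEPSVM", "PSVM", "SVM", "C4.5"]

# Accuracies (%) copied from the ranking table, in the column order above.
ACCURACY: Dict[str, List[float]] = {
    "Australian Credit Approval": [88.21, 86.61, 86.91, 80.00, 85.43, 85.51, 85.2],
    "Liver Disorder": [71.3, 70.90, 70.5, 66.36, 70.15, 58.32, 68.3],
    "Contraceptive Method Choice (CMC)": [70.48, 68.84, 68.84, 68.76, 68.98, 67.82, 65.1],
    "Statlog (Heart)": [90.61, 85.55, 86.66, 85.55, 85.55, 84.07, 76.6],
    "Hepatitis": [98.21, 86.42, 85.71, 85.0, 85.71, 80.83, 60.6],
    "Ionosphere": [91.37, 89.70, 88.23, 84.11, 89.11, 86.04, 90.8],
    "Connectionist Bench (Sonar)": [82.81, 80.47, 80.52, 79.47, 78.94, 79.79, 68.3],
    "Congressional Voting Records": [98.22, 95.23, 95.9, 95.0, 95.0, 94.5, 91.6],
    "Breast Cancer Wisconsin (Prognostic)": [97.35, 83.88, 83.68, 81.11, 83.3, 79.92, 90.5],
    "NDC-3k": [85.16, 79.24, 77.73, 77.20, 79.23, 62.0, 80.0],
    "NDC-4k": [84.32, 79.87, 78.65, 75.98, 79.87, 61.85, 80.42],
    "NDC-5k": [85.40, 78.93, 77.49, 75.43, 78.01, 61.72, 79.52],
    "NDC-10k": [87.64, 86.17, 85.31, 84.32, 85.95, 61.4, 82.5],
}


def average_ranks_desc(scores: List[float]) -> List[float]:
    """Rank scores so the highest gets rank 1; ties share their mean position."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of scores equal to the one at `pos`.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks


def mean_ranks(table: Dict[str, List[float]]) -> Dict[str, float]:
    """Average each algorithm's per-data-set rank over all data sets."""
    totals = [0.0] * len(ALGORITHMS)
    for row in table.values():
        for j, rank in enumerate(average_ranks_desc(row)):
            totals[j] += rank
    return {name: totals[j] / len(table) for j, name in enumerate(ALGORITHMS)}


if __name__ == "__main__":
    for name, rank in mean_ranks(ACCURACY).items():
        print(f"{name:10s} {rank:.3f}")
```

Under these assumptions the computed averages agree with the table's last row up to the third decimal place; the GEPSVM and SVM entries (72.5/13 and 80/13) appear to be truncated rather than rounded in the table.
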

Data set | SVM | LSTSVM | SA-LSTSVM |
---|---|---|---|
Australian Credit Approval | 1.9 | 0.014 | 1.74 |
Liver Disorder | 1.85 | 0.008 | 1.01 |
Contraceptive Method Choice (CMC) | 3.6 | 0.018 | 0.87 |
Statlog (Heart) | 1.58 | 0.013 | 1.11 |
Hepatitis | 1.3 | 0.009 | 0.93 |
Ionosphere | 1.49 | 0.035 | 0.69 |
Connectionist Bench (Sonar) | 1.45 | 0.053 | 1.29 |
Congressional Voting Records | 3.21 | 0.008 | 1.6 |
Breast Cancer Wisconsin (Prognostic) | 3.73 | 0.028 | 0.8 |
NDC-3k | 11.08 | 0.009 | 3.05 |
NDC-4k | 22.83 | 0.014 | 7.54 |
NDC-5k | 59.58 | 0.018 | 45.50 |
NDC-10k | 241.68 | 0.026 | 211.56 |
NDC-100k | \(*\) | 0.19 | 1684.82 |

- SA-LSTSVM performs significantly better than LSTSVM, since \(1-2.923 < 2.235\)
- SA-LSTSVM performs significantly better than TSVM, since \(1-3.538 < 2.235\)
- SA-LSTSVM performs significantly better than GEPSVM, since \(1-5.576 < 2.235\)
- SA-LSTSVM performs significantly better than PSVM, since \(1-4.038 < 2.235\)
- SA-LSTSVM performs significantly better than SVM, since \(1-6.153 < 2.235\)
- SA-LSTSVM performs significantly better than C4.5, since \(1-4.769 < 2.235\).
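
The text here does not say how the threshold 2.235 in the comparisons above was obtained. One reading that reproduces it is the critical difference \(CD = q_{\alpha}\sqrt{k(k+1)/(6N)}\) used in post-hoc rank tests, with \(k = 7\) classifiers and \(N = 13\) data sets. The sketch below is written under that assumption: the value `Q_ALPHA = 2.638` is the commonly tabulated Bonferroni-Dunn critical value for seven classifiers at \(\alpha = 0.05\) and is not stated in the source, and the names `critical_difference` and `rank_gaps` are illustrative. The average ranks are copied from the ranking table.

```python
import math
from typing import Dict

# Average ranks copied from the last row of the ranking table.
AVERAGE_RANK: Dict[str, float] = {
    "SA-LSTSVM": 1.0,
    "LSTSVM": 2.923,
    "TSVM": 3.538,
    "GEPSVM": 5.576,
    "PSVM": 4.038,
    "SVM": 6.153,
    "C4.5": 4.769,
}

# Assumption: Bonferroni-Dunn critical value for k = 7 at alpha = 0.05.
# The source does not state which value or which post-hoc test was used.
Q_ALPHA = 2.638
K_CLASSIFIERS = 7   # number of algorithms compared
N_DATASETS = 13     # data sets with a result for every algorithm


def critical_difference(q_alpha: float, k: int, n: int) -> float:
    """CD = q_alpha * sqrt(k * (k + 1) / (6 * n))."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


def rank_gaps(ranks: Dict[str, float], control: str) -> Dict[str, float]:
    """Average-rank gap of every other algorithm relative to the control."""
    return {
        name: rank - ranks[control]
        for name, rank in ranks.items()
        if name != control
    }


if __name__ == "__main__":
    cd = critical_difference(Q_ALPHA, K_CLASSIFIERS, N_DATASETS)
    print(f"critical difference: {cd:.3f}")
    for name, gap in rank_gaps(AVERAGE_RANK, "SA-LSTSVM").items():
        print(f"{name:8s} rank gap to SA-LSTSVM: {gap:.3f}")
```

With these inputs the computed critical difference rounds to 2.235, matching the threshold quoted in the list. The script prints each rank gap next to that threshold so the two can be compared directly.
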