1 Introduction
- We have viewed the problem of learning a mixture of hyperspheres from a probabilistic perspective and proposed a mixture of support vector data descriptions (mSVDD) for novelty detection. We have employed the EM algorithm to train mSVDD. The resulting model automatically discovers the appropriate number of experts in use and the weight of each expert.
- We have conducted experiments on several benchmark datasets. The results show that mSVDD learns a mixture of hyperspheres in the input space that approximately represents the set of contours generated by the traditional SVDD in the feature space. While mSVDD attains comparable accuracy, it trains faster than the kernelized baselines, since all computations are performed directly in the input space.
- Compared with the conference version [15], we have introduced two new strategies, namely approximate mSVDD (amSVDD) and probabilistic mSVDD (pmSVDD), to improve the training time and accuracy of the general mSVDD. We have also conducted additional experiments to evaluate these two strategies and to investigate the behavior of the proposed algorithms.
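The EM-style training scheme summarized above can be sketched in code. The following is an illustrative simplification, not the paper's exact algorithm: the E-step computes Gaussian-style responsibilities from distances to the sphere centers, and the M-step replaces the paper's weighted-SVDD solve with a responsibility-weighted mean and a distance quantile for the radius.

```python
import numpy as np

def fit_msvdd(X, n_experts=3, n_iters=20, seed=0):
    # EM loop: alternate soft assignment (E-step) and sphere refitting (M-step).
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    centers = X[rng.choice(n, size=n_experts, replace=False)]
    mix = np.full(n_experts, 1.0 / n_experts)

    for _ in range(n_iters):
        # E-step: responsibility of expert k for point x_i,
        # proportional to mix_k * exp(-||x_i - c_k||^2)
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        log_r = np.log(mix)[None, :] - d2
        log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: move each center to the responsibility-weighted mean
        # and update the mixing weights
        nk = resp.sum(axis=0)
        centers = (resp.T @ X) / nk[:, None]
        mix = nk / n

    # Set each radius from the points assigned to that expert (a stand-in
    # for the radius a weighted-SVDD solver would return)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)
    radii = np.array([
        np.sqrt(np.quantile(d2[assign == k, k], 0.95)) if (assign == k).any() else 0.0
        for k in range(n_experts)
    ])
    return centers, radii, mix

def predict_normal(X, centers, radii):
    # A point is classified as normal if it lies inside at least one hypersphere
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return (np.sqrt(d2) <= radii[None, :]).any(axis=1)
```

Since all distances are computed in the input space, each iteration costs time linear in the number of points and experts, which is the source of the speed-up over kernelized training.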
2 Related work
3 Background
3.1 Expectation maximization principle
3.2 Weighted support vector data description
4 Mixture of support vector data descriptions
4.1 Idea of mixture of support vector data descriptions
4.2 Optimization problem
4.2.1 E-step
4.2.2 M-step
4.3 How to do model selection in mixture of SVDDs
4.4 How to make a decision in mSVDD
4.5 Approximate mixture of SVDDs
5 Experiment
5.1 Experimental settings
Datasets | #Train | #Dim | Domain |
---|---|---|---|
a9a | 26,049 | 123 | Social survey |
usps | 5833 | 256 | OCR images |
Mushroom | 6500 | 112 | Biology |
Shuttle | 34,800 | 9 | Physical |
Splice | 800 | 60 | Biology |
5.2 How the proposed method compares with the baselines in speed and accuracy
One-class accuracy

Dataset | mSVDD | KSVDD | KBVM | KCVM | LCVM
---|---|---|---|---|---
a9a | 76.18 | 75.06 | 76.62 | 75.24 | 50.00 |
usps | 94.81 | 91.65 | 99.15 | 98.89 | 50.02 |
Shuttle | 95.43 | 98.24 | 99.99 | 99.99 | 50.00 |
Mushroom | 95.48 | 99.75 | 100 | 100 | 50.10 |
Splice | 80.00 | 70.98 | 84.92 | 83.84 | 48.28 |
Negative predictive value

Dataset | mSVDD | KSVDD | KBVM | KCVM | LCVM
---|---|---|---|---|---
a9a | 88.99 | 89.80 | 88.13 | 87.24 | 73.44 |
usps | 98.20 | 96.84 | 99.67 | 99.58 | 86.67 |
Shuttle | 95.68 | 80.63 | 99.99 | 99.97 | 20.00 |
Mushroom | 95.61 | 99.64 | 100 | 100 | 62.86 |
Splice | 75.70 | 67.52 | 83.60 | 84.11 | 12.50 |
\(F_{1}\) score

Dataset | mSVDD | KSVDD | KBVM | KCVM | LCVM
---|---|---|---|---|---
a9a | 63.76 | 59.75 | 65.91 | 64.30 | 38.76 |
usps | 92.60 | 90.82 | 99.07 | 98.04 | 28.14 |
Shuttle | 95.40 | 97.45 | 100 | 99.99 | 90.99 |
Mushroom | 95.76 | 99.74 | 100 | 100 | 65.02 |
Splice | 79.80 | 69.96 | 85.24 | 84.59 | 66.44 |
Training time

Dataset | mSVDD | KSVDD | KBVM | KCVM | LCVM
---|---|---|---|---|---
a9a | 55.48 | 1,334.75 | 1,342.05 | 3,396.91 | 1.45 |
usps | 1.53 | 11.92 | 10.95 | 33.30 | 1.02 |
Shuttle | 5.00 | 57.11 | 10.55 | 20.92 | 0.62 |
Mushroom | 5.00 | 39.36 | 11.90 | 7.29 | 0.53 |
Splice | 0.50 | 8.10 | 1.75 | 4.47 | 0.06 |
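The three quality metrics reported in the tables above can be computed from confusion-matrix counts. A minimal sketch, assuming label 1 marks the normal (positive) class and 0 the novelty (negative) class, and following the standard definition of negative predictive value as TN / (TN + FN):

```python
def one_class_metrics(y_true, y_pred):
    # Confusion-matrix counts
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    acc = (tp + tn) / len(y_true)                        # one-class accuracy
    npv = tn / (tn + fn) if (tn + fn) else 0.0           # negative predictive value
    prec = tp / (tp + fp) if (tp + fp) else 0.0
    rec = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
    return acc, npv, f1
```

For example, `one_class_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])` has TP=2, TN=1, FP=1, FN=1, giving accuracy 0.6 and NPV 0.5.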
5.3 How the approximate and probabilistic approaches improve mSVDD
One-class accuracy and training time

Dataset | mSVDD (acc.) | amSVDD (acc.) | pmSVDD (acc.) | mSVDD (time) | amSVDD (time) | pmSVDD (time)
---|---|---|---|---|---|---
a9a | 76.18 | 77.42 | 77.43 | 55.48 | 60.89 | 58.77 |
usps | 94.81 | 93.29 | 95.20 | 1.53 | 1.41 | 1.09 |
Mushroom | 95.48 | 92.74 | 92.35 | 5.00 | 4.74 | 5.39 |
Australian | 81.39 | 81.06 | 82.57 | 0.06 | 0.02 | 0.06 |
Breast-cancer | 96.44 | 94.51 | 96.75 | 0.02 | 0.01 | 0.02 |
Negative predictive value and \(F_{1}\) score

Dataset | mSVDD (NPV) | amSVDD (NPV) | pmSVDD (NPV) | mSVDD (\(F_{1}\)) | amSVDD (\(F_{1}\)) | pmSVDD (\(F_{1}\))
---|---|---|---|---|---|---
a9a | 88.99 | 90.77 | 90.78 | 63.76 | 63.49 | 63.49 |
usps | 98.20 | 97.77 | 98.36 | 92.60 | 89.85 | 92.68 |
Mushroom | 95.61 | 92.14 | 91.42 | 95.76 | 92.25 | 91.79 |
Australian | 83.02 | 84.69 | 84.12 | 79.23 | 79.90 | 80.66 |
Breast-cancer | 92.37 | 86.59 | 94.19 | 97.13 | 95.14 | 97.54 |
5.4 How variation of the parameter \(\delta \) influences accuracy
\(\delta\) | –1 | –0.75 | –0.50 | –0.25 | 0 | 0.25 | 0.50 | 0.75
---|---|---|---|---|---|---|---|---
a9a | 74.93 | 76.08 | 75.54 | 74.78 | 74.32 | 75.69 | 75.07 | 75.41 |
usps | 93.96 | 95.011 | 92.68 | 93.51 | 93.41 | 94.46 | 93.50 | 92.57 |
Mushroom | 95.55 | 95.30 | 96.20 | 95.83 | 95.86 | 95.00 | 95.05 | 95.55 |
Australian | 78.90 | 79.04 | 79.08 | 77.12 | 78.94 | 78.06 | 77.42 | 77.49 |
Diabetes | 67.99 | 68.40 | 68.30 | 72.78 | 72.70 | 71.69 | 74.03 | 73.21 |
\(\delta\) | –1 | –0.75 | –0.50 | –0.25 | 0 | 0.25 | 0.50 | 0.75
---|---|---|---|---|---|---|---|---
a9a | 87.86 | 88.83 | 88.56 | 87.69 | 87.49 | 88.39 | 88.00 | 88.14 |
usps | 98.01 | 98.37 | 97.56 | 97.79 | 97.68 | 98.08 | 97.78 | 97.52 |
Mushroom | 95.61 | 96.12 | 96.19 | 95.24 | 96.29 | 94.45 | 95.02 | 94.82
Australian | 79.73 | 79.21 | 80.00 | 77.30 | 79.87 | 78.62 | 77.34 | 76.74
Diabetes | 49.28 | 50.73 | 51.88 | 51.78 | 52.99 | 52.42 | 54.21 | 52.10
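The grid sweep behind the two tables above can be sketched as follows. This is a hypothetical harness: `train` and `score` are placeholders for the paper's training and evaluation routines, which are not specified in this section.

```python
def sweep_delta(train, score, X_train, X_test, y_test,
                deltas=(-1.0, -0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75)):
    # Train one model per candidate delta, record its test score,
    # and report the best-scoring setting alongside the full grid.
    results = {d: score(train(X_train, d), X_test, y_test) for d in deltas}
    best = max(results, key=results.get)
    return best, results
```

As the tables show, the best \(\delta\) varies by dataset, so a per-dataset sweep of this kind is how the value would typically be chosen in practice.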