1 Introduction
2 Background
2.1 TSK inference
1. Determine the firing strength of each rule \(R_r\) (\(r\in \{1,2,\ldots ,n\}\)) by integrating the similarity degrees between its antecedents and the given inputs:
$$\begin{aligned} \alpha _r = S(A_{1}^*,A_{1r}) \wedge \ldots \wedge S(A_m^*,A_{mr}), \end{aligned}$$(2)
where \(\wedge \) is a t-norm, usually implemented as the minimum operator, and \(S(A_s^*,A_{sr})\) (\(s\in \{1,2,\ldots , m\}\)) is the similarity degree between fuzzy sets \(A_s^*\) and \(A_{sr}\):
$$\begin{aligned} S(A_s^*,A_{sr}) = \max \{\min \{\mu _{A_s^*}(x),\mu _{A_{sr}}(x)\}\}, \end{aligned}$$(3)
where \(\mu _{A_{s}^*}(x)\) and \(\mu _{A_{sr}}(x)\) are the degrees of membership of a given value x within the domain.
2. Calculate the sub-output led by each rule \(R_r\) based on the given observation (\(A_1^*,\ldots ,A_m^*\)):
$$\begin{aligned} f_r(x_1^*,\ldots ,x_m^*) = \beta _{0r}+\beta _{1r}Rep(A_1^*)+ \cdots + \beta _{mr}Rep(A_m^*), \end{aligned}$$(4)
where \(Rep(A^*_s)\) is the representative value, or defuzzified value, of fuzzy set \(A^*_s\), which is often calculated as the centre of area of the membership function.
3. Generate the final output by integrating the sub-outputs from all the rules:
$$\begin{aligned} y=\frac{\displaystyle \sum \nolimits _{r=1}^{n}\alpha _r f_r(x_1^*,\ldots ,x_m^*)}{\displaystyle \sum \nolimits _{r=1}^{n}\alpha _r}. \end{aligned}$$(5)
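The three steps of Eqs. (2)-(5) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in this section: fuzzy sets are triangular and written as `(left, peak, right)`, the maximum in Eq. (3) is approximated on a uniform grid, the t-norm is the minimum operator, and \(Rep\) is taken as the centroid of the triangle. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Triangle = Tuple[float, float, float]  # assumed shape: (left foot, peak, right foot)


def membership(x: float, tri: Triangle) -> float:
    """Membership degree of x in a triangular fuzzy set."""
    a, b, c = tri
    if x == b:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if b < x < c:
        return (c - x) / (c - b)
    return 0.0


def similarity(obs: Triangle, ante: Triangle, steps: int = 2000) -> float:
    """Eq. (3): max over x of min(mu_obs(x), mu_ante(x)), approximated on a grid."""
    lo, hi = min(obs[0], ante[0]), max(obs[2], ante[2])
    best = 0.0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        best = max(best, min(membership(x, obs), membership(x, ante)))
    return best


def rep(tri: Triangle) -> float:
    """Representative value: centroid (centre of area) of a triangle."""
    return sum(tri) / 3.0


def tsk_infer(observation: Sequence[Triangle],
              rules: List[Tuple[Sequence[Triangle], Sequence[float]]]) -> float:
    """Eqs. (2), (4), (5). Each rule is (antecedents, (beta_0, beta_1, ..., beta_m))."""
    numerator = denominator = 0.0
    for antecedents, betas in rules:
        # Eq. (2): firing strength, with the minimum operator as t-norm.
        alpha = min(similarity(o, a) for o, a in zip(observation, antecedents))
        # Eq. (4): first-order sub-output of the rule.
        f_r = betas[0] + sum(b * rep(o) for b, o in zip(betas[1:], observation))
        numerator += alpha * f_r
        denominator += alpha
    if denominator == 0.0:
        raise ValueError("no rule fires for this observation")
    # Eq. (5): weighted average of the sub-outputs.
    return numerator / denominator


if __name__ == "__main__":
    toy_rules = [
        ([(0.0, 1.0, 2.0), (0.0, 1.0, 2.0)], (1.0, 2.0, 3.0)),
        ([(10.0, 11.0, 12.0), (10.0, 11.0, 12.0)], (0.0, 0.0, 0.0)),
    ]
    print(tsk_infer([(0.0, 1.0, 2.0), (0.0, 1.0, 2.0)], toy_rules))
```

The `ValueError` branch marks the case in which the observation overlaps with no rule antecedent, so that Eq. (5) is undefined.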
2.2 TSK rule base generation
3 TSK inference with fuzzy interpolation (TSK+)
3.1 Modified similarity measure
3.2 Extended TSK inference
4 Sparse TSK rule base generation
4.1 Rule base initialisation
4.1.1 Number of clusters determination
4.1.2 Dense sub-data set generation
4.1.3 Rule cluster generation
4.1.4 Fuzzy rule extraction
4.2 Rule base optimisation
4.2.1 Problem representation
4.2.2 Population initialisation
4.2.3 Objective function
4.2.4 Selection
4.2.5 Reproduction
4.2.6 Iteration and termination
5 Experimentation
5.1 Experiment 1
5.1.1 Rule base initialisation
Number of clusters k | SSE | PI |
---|---|---|
1 | 2961.91 | |
2 | 2248.98 | 712.93 |
3 | 1763.74 | 485.24 |
4 | 1478.32 | 285.42 |
5 | 1286.41 | 191.91 |
6 | 1180.48 | 105.93 |
7 | 1079.48 | 101.00 |
8 | 997.626 | 81.85 |
9 | 946.238 | 51.39 |
10 | 880.43 | 65.81 |
11 | 840.865 | 39.57 |
12 | 808.51 | 32.36 |
13 | 767.61 | 40.90 |
14 | 709.997 | 57.61 |
15 | 689.301 | 20.70 |
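The PI values in the table above are consistent with the drop in SSE obtained when moving from k-1 to k clusters (for example, 2961.91 - 2248.98 = 712.93 for k = 2). The following Python sketch recomputes PI from the SSE column under that reading. The function names and the threshold-based choice of k are hypothetical illustrations, not the selection rule used in this work.

```python
from typing import Dict


def performance_improvement(sse: Dict[int, float]) -> Dict[int, float]:
    """PI(k) = SSE(k_prev) - SSE(k) for consecutive values of k."""
    ks = sorted(sse)
    return {k: sse[prev] - sse[k] for prev, k in zip(ks, ks[1:])}


def choose_k(sse: Dict[int, float], threshold: float) -> int:
    """Assumed rule: keep increasing k while the improvement stays >= threshold."""
    pi = performance_improvement(sse)
    ks = sorted(sse)
    chosen = ks[0]
    for k in ks[1:]:
        if pi[k] < threshold:
            break
        chosen = k
    return chosen


if __name__ == "__main__":
    # First six rows of the SSE column in the table above.
    sse_values = {1: 2961.91, 2: 2248.98, 3: 1763.74,
                  4: 1478.32, 5: 1286.41, 6: 1180.48}
    for k, value in performance_improvement(sse_values).items():
        print(k, round(value, 2))
    print(choose_k(sse_values, threshold=150.0))  # threshold is an arbitrary example
```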
k | \(T_1\) SSE | \(T_1\) PI | \(T_2\) SSE | \(T_2\) PI | \(T_3\) SSE | \(T_3\) PI | \(T_4\) SSE | \(T_4\) PI | \(T_5\) SSE | \(T_5\) PI | \(T_6\) SSE | \(T_6\) PI |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 159.9 | | 271.6 | | 128.3 | | 222.28 | | 133.98 | | 259.65 | |
2 | 109.42 | 50.48 | 187.65 | 83.95 | 68.61 | 59.69 | 140.16 | 82.12 | 96.06 | 37.92 | 187.95 | 71.70 |
3 | 81.44 | 27.98 | 149.88 | 37.78 | 56.66 | 11.95 | 114.23 | 25.93 | 66.42 | 29.64 | 147.02 | 40.93 |
4 | 66.85 | 14.86 | 126.35 | 23.53 | 46.16 | 10.50 | 99.48 | 14.75 | 52.46 | 13.96 | 123.73 | 23.29 |
5 | 56.45 | 10.13 | 109.73 | 16.62 | 39.87 | 6.29 | 85.06 | 14.42 | 42.79 | 9.67 | 109.39 | 14.34 |
6 | 48.04 | 8.41 | 94.14 | 15.59 | 33.04 | 6.83 | 77.22 | 7.84 | 37.69 | 5.10 | 95.25 | 14.104 |
7 | 42.01 | 6.03 | 83.67 | 10.47 | 28.33 | 4.71 | 69.90 | 7.32 | 33.52 | 4.17 | 85.26 | 9.99 |
8 | 36.22 | 5.79 | 77.63 | 6.04 | 26.26 | 2.07 | 61.87 | 8.03 | 29.18 | 4.34 | 80.06 | 5.2 |
9 | 32.4 | 3.82 | 74.63 | 8.00 | 24.03 | 2.23 | 56.17 | 5.70 | 25.01 | 4.17 | 74.62 | 5.44 |
10 | 28.77 | 3.63 | 67.11 | 7.52 | 22.69 | 1.34 | 52.39 | 3.78 | 24.29 | 0.72 | 68.56 | 6.06 |
5.1.2 TSK fuzzy interpolation
5.1.3 Rule base optimisation
Sub-data set | \(T_1\) | \(T_2\) | \(T_3\) | \(T_4\) | \(T_5\) | \(T_6\) |
---|---|---|---|---|---|---|
Determined value of k | 5 | 5 | 3 | 4 | 6 | 5 |
No. | Input 1: \(a_{11}\) | Input 1: \(a_{12}\) | Input 1: \(a_{13}\) | Input 2: \(a_{21}\) | Input 2: \(a_{22}\) | Input 2: \(a_{23}\) | Output: \(\beta _0\) | Output: \(\beta _1\) | Output: \(\beta _2\) |
---|---|---|---|---|---|---|---|---|---|
1 | 10.08 | 12.72 | 15.36 | 26.22 | 27.00 | 27.78 | \(-\)0.14 | 0.06 | \(-\)0.38 |
2 | 13.22 | 14.87 | 16.52 | 27.87 | 28.84 | 29.82 | \(-\)0.00 | 0.28 | \(-\)8.24 |
3 | 11.82 | 13.02 | 14.22 | 22.97 | 24.46 | 25.96 | \(-\)0.15 | \(-\)0.02 | 1.72 |
4 | 10.90 | 13.68 | 16.45 | 18.85 | 20.50 | 22.16 | \(-\)0.04 | \(-\)0.20 | 4.47 |
5 | 10.06 | 11.18 | 12.30 | 14.89 | 16.91 | 18.93 | 0.25 | \(-\)0.06 | \(-\)1.49 |
6 | 13.16 | 14.57 | 15.99 | 15.63 | 17.69 | 19.76 | 0.00 | \(-\)0.21 | 4.30 |
7 | 19.24 | 20.48 | 21.71 | 26.83 | 28.25 | 29.68 | 0.14 | \(-\)0.06 | \(-\)1.21 |
8 | 19.59 | 20.52 | 21.45 | 22.15 | 22.89 | 23.64 | 0.24 | 0.08 | \(-\)6.50 |
9 | 19.20 | 20.39 | 21.58 | 23.90 | 24.66 | 25.42 | 0.30 | 0.00 | \(-\)6.00 |
10 | 10.16 | 11.26 | 12.37 | 10.85 | 12.44 | 14.02 | 0.12 | 0.09 | \(-\)2.22 |
11 | 13.36 | 15.15 | 16.95 | 10.05 | 10.78 | 11.51 | \(-\)0.01 | 0.30 | \(-\)2.85 |
12 | 12.25 | 14.14 | 16.03 | 11.75 | 13.42 | 15.09 | 0.05 | 0.10 | \(-\)2.21 |
13 | 20.51 | 21.24 | 21.97 | 15.28 | 17.48 | 19.69 | \(-\)0.17 | 0.12 | 1.32 |
14 | 19.27 | 20.31 | 21.35 | 19.30 | 20.33 | 21.35 | 0.13 | \(-\)0.00 | \(-\)2.49 |
15 | 16.64 | 18.59 | 20.54 | 15.11 | 16.71 | 18.32 | \(-\)0.27 | \(-\)0.00 | 5.31 |
16 | 20.95 | 21.42 | 21.89 | 11.14 | 12.25 | 13.35 | \(-\)0.17 | \(-\)0.13 | 4.92 |
17 | 19.23 | 20.61 | 21.98 | 12.46 | 13.57 | 14.69 | \(-\)0.28 | \(-\)0.01 | 5.71 |
18 | 19.10 | 20.14 | 21.19 | 10.00 | 10.81 | 11.62 | \(-\)0.06 | \(-\)0.03 | 1.52 |
19 | 24.39 | 25.46 | 26.53 | 28.19 | 28.94 | 29.68 | \(-\)0.00 | \(-\)0.27 | 8.02 |
20 | 24.02 | 25.03 | 26.04 | 21.77 | 24.08 | 26.39 | \(-\)0.01 | 0.05 | \(-\)0.13 |
21 | 27.38 | 28.63 | 29.89 | 21.95 | 24.91 | 27.88 | \(-\)0.27 | \(-\)0.00 | 8.17 |
22 | 24.32 | 26.04 | 27.75 | 17.93 | 19.32 | 20.71 | 0.03 | 0.28 | \(-\)6.23 |
23 | 24.31 | 25.73 | 27.15 | 15.75 | 16.37 | 16.98 | 0.09 | 0.15 | \(-\)5.72 |
24 | 28.40 | 29.19 | 29.99 | 15.89 | 18.20 | 20.52 | 0.23 | 0.03 | \(-\)7.43 |
25 | 26.03 | 27.77 | 29.52 | 10.50 | 11.39 | 12.29 | 0.07 | \(-\)0.18 | \(-\)0.01 |
26 | 24.15 | 25.29 | 26.42 | 10.16 | 11.62 | 13.08 | 0.03 | \(-\)0.26 | 1.80 |
27 | 28.40 | 29.16 | 29.92 | 12.79 | 13.89 | 14.99 | 0.29 | \(-\)0.02 | \(-\)8.38 |
28 | 25.07 | 26.62 | 28.17 | 12.92 | 13.92 | 14.92 | 0.18 | \(-\)0.06 | \(-\)4.66 |
5.1.4 Results comparison
5.2 Experiment 2
p | \(S(A_1^*,A_{p1})\) | \(S(A_2^*,A_{p2})\) | \(FD_p\) | Consequence |
---|---|---|---|---|
1 | 0.0053 | 0.059 | 0.053 | \(-\)0.014 |
2 | 0.020 | 0.003 | 0.003 | \(-\)0.012 |
3 | 0.004 | 0.010 | 0.004 | \(-\)0.014 |
4 | 0.001 | 0.836 | 0.001 | 0.006 |
5 | 0.007 | 0.455 | 0.007 | 0.004 |
6 | 0.016 | 0.780 | 0.016 | 0.023 |
7 | 0.467 | 0.157 | 0.157 | 0.170 |
8 | 0.461 | 0.004 | 0.004 | 0.008 |
9 | 0.449 | 0.052 | 0.052 | 0.118 |
10 | 0.012 | 0.954 | 0.012 | 0.018 |
11 | 0.024 | 0.870 | 0.024 | 0.024 |
12 | 0.002 | 0.939 | 0.002 | 0.005 |
13 | 0.211 | 0.848 | 0.211 | \(-\)0.430 |
14 | 0.567 | 0.812 | 0.567 | \(-\)0.996 |
15 | 0.440 | 0.480 | 0.440 | 0.114 |
16 | 0.591 | 0.941 | 0.591 | \(-\)0.913 |
17 | 0.480 | 0.969 | 0.480 | \(-\)1.042 |
18 | 0.414 | 0.871 | 0.414 | \(-\)0.232 |
19 | 0.926 | 0.009 | 0.009 | 0.005 |
20 | 0.932 | 0.004 | 0.004 | 0.014 |
21 | 0.925 | 0.077 | 0.077 | 0.005 |
22 | 0.927 | 0.868 | 0.868 | \(-\)0.937 |
23 | 0.918 | 0.736 | 0.736 | \(-\)0.471 |
24 | 0.932 | 0.616 | 0.616 | \(-\)1.030 |
25 | 0.948 | 0.902 | 0.902 | \(-\)0.622 |
26 | 0.947 | 0.966 | 0.947 | \(-\)0.547 |
27 | 0.906 | 0.913 | 0.906 | \(-\)0.849 |
28 | 0.920 | 0.964 | 0.920 | \(-\)0.589 |
Parameters | Values |
---|---|
Population Size | 100 |
Max value in fitness function | 2 |
Crossover rate | 0.85 |
Mutation rate | 0.05 |
Maximum iteration | 10,000 |
Termination threshold | 0.01 |
5.3 Experiment 3
5.3.1 Data set pre-processing
5.3.2 TSK+ model construction
Feature # | Feature |
---|---|
5 | Size from source to destination |
6 | Size from destination to source |
23 | Number of connections in the past 2 seconds |
35 | Different services rate for destination host |
Datasets | No. of instances | Classes | Rule consequence |
---|---|---|---|
\(\mathbb {T}_1\) | 53,874 | Normal traffic | 1 |
\(\mathbb {T}_2\) | 36,741 | DoS | 2 |
\(\mathbb {T}_3\) | 42 | U2R | 3 |
\(\mathbb {T}_4\) | 796 | R2U | 4 |
\(\mathbb {T}_5\) | 9,325 | Probes | 5 |
Dense data set | \(T_{11}\) | \(T_{12}\) | \(T_{13}\) | \(T_{21}\) | \(T_{22}\) | \(T_{23}\) | \(T_{31}\) | \(T_{41}\) | \(T_{42}\) | \(T_{43}\) | \(T_{51}\) | \(T_{52}\) | \(T_{53}\) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Class (data set) | Normal \(\mathbb {T}_1\) | Normal \(\mathbb {T}_1\) | Normal \(\mathbb {T}_1\) | DoS \(\mathbb {T}_2\) | DoS \(\mathbb {T}_2\) | DoS \(\mathbb {T}_2\) | U2R \(\mathbb {T}_3\) | R2U \(\mathbb {T}_4\) | R2U \(\mathbb {T}_4\) | R2U \(\mathbb {T}_4\) | Probes \(\mathbb {T}_5\) | Probes \(\mathbb {T}_5\) | Probes \(\mathbb {T}_5\) |
Index i of rule cluster \(RC_i\) | 1–4 | 5–7 | 8–11 | 12–14 | 15–18 | 19–24 | 25–30 | 31–35 | 36–38 | 39–41 | 42–44 | 45 | 46 |
Index i of rule \(R_i\) | 1–4 | 5–7 | 8–11 | 12–14 | 15–18 | 19–24 | 25–30 | 31–35 | 36–38 | 39–41 | 42–44 | 45 | 46 |
Approach | Normal traffic | DoS | U2R | R2U | Probes |
---|---|---|---|---|---|
Decision tree Wang et al. (2010) | 91.22 | 97.24 | 15.38 | 1.43 | 78.13 |
Naive Bayes Wang et al. (2010) | 89.22 | 96.65 | 7.69 | 8.57 | 88.13 |
BPNN Wang et al. (2010) | 89.75 | 97.20 | 23.08 | 5.71 | 88.75 |
FC-ANN Wang et al. (2010) | 91.32 | 96.70 | 76.92 | 58.57 | 80.00 |
MOPF Bostani and Sheikhan (2017) | N/A | 96.89 | 77.98 | 81.13 | 85.92 |
TSK+ without GA (the proposed approach) | 77.10 | 94.07 | 57.69 | 55.29 | 78.71 |
TSK+ (the proposed approach) | 93.10 | 97.84 | 65.38 | 84.65 | 85.69 |