1 Introduction
2 Related work
2.1 Methods based on PSO
2.2 Hybrid methods based on PSO and regressions
3 Proposed method for training fuzzy systems
3.1 High-order Takagi–Sugeno fuzzy system
3.2 Training the consequent parameters
3.2.1 Ordinary least squares
3.2.2 Ridge regression
3.2.3 Sparse regressions
- Forward selection (FS)—This is a stepwise regression, i.e., variables are added to the model one by one. The algorithm starts with all coefficients equal to zero, and each next variable is chosen based on a selection criterion. For example, it can be the variable with the highest correlation with the current residual vector [27].
- Least angle regression (LAR) [8, 27]—LAR works similarly to the FS procedure, but instead of moving in the direction of a single variable, the estimated parameters move in a direction that makes equal angles with each of the variables currently in the model. The LAR algorithm is the basis for other sparse methods, such as the least absolute shrinkage and selection operator and elastic net regression.
- Least absolute shrinkage and selection operator (LASSO) [27, 34]—This regression combines coefficient shrinkage with variable selection. The cost function combines the sum of the squared errors with a penalty based on the \(L_1\) norm:
  $$\begin{aligned} J_{\mathrm {LASSO}}({\mathbf {w}},\delta )&= {\Vert {\mathbf {y}}-\mathbf {Xw} \Vert }_2^2 + \delta {\Vert {\mathbf {w}}\Vert }_1 \end{aligned} \quad (19)$$
  where \(\delta \) is a nonnegative regularization parameter.
- Elastic net (ENET) [27, 44]—The ENET regression combines the features of ridge regression and the LASSO. The cost function contains penalty terms based on both the \(L_1\) and the \(L_2\) norms:
  $$\begin{aligned} J_{\mathrm {ENET}}({\mathbf {w}},\lambda ,\delta )&= {\Vert {\mathbf {y}}-\mathbf {Xw} \Vert }_2^2 + \lambda {\Vert {\mathbf {w}}\Vert }_2^2 + \delta {\Vert {\mathbf {w}}\Vert }_1 \end{aligned} \quad (20)$$
  where \(\lambda \) and \(\delta \) are nonnegative regularization parameters. The solution is found with the LARS-EN algorithm, which is based on the LARS algorithm [8].

The sparse regressions have been implemented in MATLAB using the SpaSM toolbox [27].
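The paper solves these problems with the LARS-type algorithms of the SpaSM toolbox; purely as an illustration of what the LASSO cost (19) optimizes, the sketch below minimizes it by cyclic coordinate descent with soft thresholding. This is a Python sketch with our own function names (`lasso_cd`, `soft_threshold`), not the paper's MATLAB implementation:

```python
import numpy as np

def soft_threshold(rho, t):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    return np.sign(rho) * max(abs(rho) - t, 0.0)

def lasso_cd(X, y, delta, n_iter=200):
    """Minimize ||y - Xw||_2^2 + delta * ||w||_1 by cyclic coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)  # precomputed X_j^T X_j for each column
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with the j-th variable removed from the fit
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r
            # closed-form coordinate minimizer of (19) in w_j
            w[j] = soft_threshold(rho, delta / 2.0) / col_sq[j]
    return w
```

The elastic net cost (20) can be handled by the same machinery, since it reduces to a LASSO problem on augmented data [44].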
3.2.4 Example of an application
3.3 Training the antecedent parameters
- the cognition component: attracts each particle toward the most promising position it has found so far.
- the social component: attracts particles toward the global best position discovered by the swarm.
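As a minimal illustration of how these two attraction terms enter the velocity update, the sketch below implements a basic PSO loop in Python; the inertia weight and acceleration coefficients (`inertia`, `c1`, `c2`) and their values are our illustrative choices, not the settings used in the paper:

```python
import numpy as np

def pso_minimize(f, dim, n_particles=30, n_iter=100, bounds=(-5.0, 5.0),
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))   # particle positions
    v = np.zeros((n_particles, dim))              # particle velocities
    pbest = x.copy()                              # personal best positions
    pbest_f = np.array([f(p) for p in x])
    g = pbest[np.argmin(pbest_f)].copy()          # global best position
    for _ in range(n_iter):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = (inertia * v
             + c1 * r1 * (pbest - x)   # cognition component: own best
             + c2 * r2 * (g - x))      # social component: swarm best
        x = np.clip(x + v, lo, hi)
        fx = np.array([f(p) for p in x])
        better = fx < pbest_f
        pbest[better], pbest_f[better] = x[better], fx[better]
        g = pbest[np.argmin(pbest_f)].copy()
    return g, float(pbest_f.min())
```

For example, minimizing the 2-D sphere function `lambda p: float((p ** 2).sum())` with the defaults above drives the best cost close to zero.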
3.4 Performance criteria
3.5 Procedure for designing fuzzy models
- OLS: the method in which the fuzzy sets are defined by the user, while the polynomials are determined by the OLS regression.
- RIDGE: the method in which the fuzzy sets are defined by the user, while the polynomials are determined by the ridge regression.
- SR: the method in which the fuzzy sets are defined by the user, while the polynomials are determined by a sparse regression (SR), e.g., FS, LAR, LASSO, or ENET.
- PSO-OLS: the method in which the fuzzy sets are determined by the PSO algorithm, while the polynomials are determined by the OLS regression.
- PSO-RIDGE: the method in which the fuzzy sets are determined by the PSO algorithm, while the polynomials are determined by the ridge regression.
- PSO-SR: the method in which the fuzzy sets are determined by the PSO algorithm, while the polynomials are determined by a sparse regression.
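As a sketch of how the PSO-x variants couple the two training stages, the hypothetical fitness function below scores one candidate antecedent encoding (Gaussian centers and widths) by fitting the consequent polynomials with a regression and returning the training RMSE, which PSO would then minimize. This is a deliberately simplified single-input system, with plain least squares standing in for the sparse regression; all function names and details are our illustrative assumptions, not the paper's formulation:

```python
import numpy as np

def gaussian_mf(x, center, sigma):
    """Gaussian membership function of one fuzzy set."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

def ts_design_matrix(x, centers, sigmas, order=2):
    """Regression matrix of a single-input TS system with polynomial consequents.

    Each rule i contributes columns h_i(x) * x^k, k = 0..order, where h_i is
    the normalized firing strength of rule i."""
    mu = np.stack([gaussian_mf(x, c, s) for c, s in zip(centers, sigmas)])
    h = mu / mu.sum(axis=0)                     # normalized firing strengths
    cols = [h[i] * x ** k
            for i in range(len(centers)) for k in range(order + 1)]
    return np.stack(cols, axis=1)

def fitness(params, x, y, n_rules):
    """RMSE of a TS model whose consequents are refit for this particle.

    Plain least squares is used here; a PSO-SR variant would replace lstsq
    with a sparse regression (FS, LAR, LASSO, or ENET)."""
    centers = params[:n_rules]
    sigmas = np.abs(params[n_rules:]) + 1e-6    # keep widths positive
    X = ts_design_matrix(x, centers, sigmas)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sqrt(np.mean((y - X @ w) ** 2)))
```

In this arrangement PSO searches only over the antecedent parameters, while every fitness evaluation solves the consequent parameters in closed form, which is the division of labor shared by all the PSO-x methods above.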
4 Experimental results and discussion
4.1 Experiment 1
Algorithm | RMSE | z | q |
---|---|---|---|
OLS | 1.635e−03 | 0 | 1 |
RIDGE | 2.795e−02 | 0 | 9.045 |
FS | 1.454e−03 | 0.0370 | 0.9260 |
LAR | 1.514e−03 | 0.0370 | 0.9445 |
LASSO | 1.298e−03 | 0.0370 | 0.8784 |
ENET | 1.901e−03 | 0 | 1.081 |
PSO-OLS | * | * | * |
PSO-RIDGE | 1.762e−04 | 0 | 0.5032 |
PSO-FS | 2.171e−04 | 0.5185 | 0.3071 |
PSO-LAR | 2.363e−04 | 0.5185 | 0.3130 |
PSO-LASSO | 1.690e−04 | 0.4444 | 0.3295 |
PSO-ENET | 1.186e−04 | 0.4815 | 0.2955 |
Rule | p | \(\sigma \) | \(w_2\) | \(w_1\) | \(w_0\) |
---|---|---|---|---|---|
OLS | |||||
\(R_{1}\) | 3 | 0.2123 | − 13.05 | 71.89 | − 98.03 |
\(R_{2}\) | 3.5 | 0.2123 | 20.82 | − 148.3 | 262.5 |
\(R_{3}\) | 4 | 0.2123 | − 4.886 | 45.18 | − 103.4 |
\(R_{4}\) | 4.5 | 0.2123 | − 30.30 | 260.5 | − 561.4 |
\(R_{5}\) | 5 | 0.2123 | 33.93 | − 349.8 | 893.7 |
\(R_{6}\) | 5.5 | 0.2123 | 18.79 | − 189.9 | 469.8 |
\(R_{7}\) | 6 | 0.2123 | − 38.98 | 473.9 | − 1441 |
\(R_{8}\) | 6.5 | 0.2123 | 30.23 | − 391.9 | 1272 |
\(R_{9}\) | 7 | 0.2123 | 1.451 | − 1.518 | − 54.22 |
PSO-ENET | |||||
\(R_{1}\) | 3 | 1.362 | 1.085 | − 23.27 | 0 |
\(R_{2}\) | 3.939 | 0.4366 | 5.993 | − 31.31 | 0 |
\(R_{3}\) | 3.428 | 0.5482 | − 4.775 | 0 | 0 |
\(R_{4}\) | 4.397 | 0.4716 | − 3.421 | 1.371 | 0 |
\(R_{5}\) | 5.187 | 0.6116 | − 2.290 | − 32.52 | 0 |
\(R_{6}\) | 3.496 | 1.999 | 14.70 | − 4.811 | 0 |
\(R_{7}\) | 6.047 | 0.6575 | − 20.51 | 112.3 | 0 |
\(R_{8}\) | 6.726 | 5 | 0 | − 6.985 | 0 |
\(R_{9}\) | 7 | 5 | 0 | 0 | 0 |
4.2 Experiment 2
Algorithm | RMSE | z | q |
---|---|---|---|
OLS | 4.144e−02 | 0 | 1 |
RIDGE | 3.700e−02 | 0 | 0.9464 |
FS | 3.526e−02 | 0.1482 | 0.8514 |
LAR | 3.047e−02 | 0.0741 | 0.8306 |
LASSO | 3.179e−02 | 0.1482 | 0.8095 |
ENET | 3.179e−02 | 0.1482 | 0.8095 |
PSO-OLS | * | * | * |
PSO-RIDGE | 1.413e−03 | 0 | 0.5191 |
PSO-FS | 2.941e−03 | 0.2593 | 0.4059 |
PSO-LAR | 3.128e−03 | 0.2593 | 0.4081 |
PSO-LASSO | 2.963e−03 | 0.2222 | 0.4246 |
PSO-ENET | 3.004e−03 | 0.2593 | 0.4066 |
Rule | p | \(\sigma \) | \(w_2\) | \(w_1\) | \(w_0\) |
---|---|---|---|---|---|
OLS | |||||
\(R_{1}\) | − 8 | 1.062 | 3.429 | 56.91 | 240.3 |
\(R_{2}\) | − 5.5 | 1.062 | − 4.327 | − 48.15 | − 135.1 |
\(R_{3}\) | − 3 | 1.062 | 6.106 | 39.67 | 73.47 |
\(R_{4}\) | − 0.5 | 1.062 | − 10.14 | − 12.99 | − 6.421 |
\(R_{5}\) | 2 | 1.062 | 6.344 | − 28.99 | 38.91 |
\(R_{6}\) | 4.5 | 1.062 | − 3.369 | 32.26 | − 79.26 |
\(R_{7}\) | 7 | 1.062 | 1.986 | − 28.49 | 105.4 |
\(R_{8}\) | 9.5 | 1.062 | − 1.399 | 27.04 | − 130.4 |
\(R_{9}\) | 12 | 1.062 | 0.8924 | − 22.08 | 138.6 |
PSO-FS | |||||
\(R_{1}\) | − 8 | 0.9144 | − 0.0565 | − 0.7357 | 0 |
\(R_{2}\) | − 3.040 | 1.756 | − 0.2759 | − 2.456 | − 1.725 |
\(R_{3}\) | − 1.624 | 0.5104 | − 1.138 | − 2.621 | 0 |
\(R_{4}\) | − 0.8196 | 0.5 | − 3.031 | − 3.186 | 1.676 |
\(R_{5}\) | − 1.919 | 0.8712 | 3.216 | 19.18 | 36.43 |
\(R_{6}\) | 9.408 | 6.938 | 0.5113 | 2.321 | − 1.786 |
\(R_{7}\) | 8.237 | 4.424 | 0.1441 | − 3.222 | 0 |
\(R_{8}\) | 11.55 | 5.518 | − 0.5326 | 0 | 0 |
\(R_{9}\) | 12 | 3.921 | 0.0117 | 0 | 0 |
4.3 Experiment 3
Algorithm | RMSE | z | q |
---|---|---|---|
OLS | 9.688e−03 | 0 | 1 |
RIDGE | 4.039e−02 | 0 | 2.585 |
FS | 8.539e−03 | 0.0741 | 0.9037 |
LAR | 9.505e−03 | 0.0370 | 0.9720 |
LASSO | 9.507e−03 | 0.0741 | 0.9537 |
ENET | 9.945e−03 | 0 | 1.013 |
PSO-OLS | * | * | * |
PSO-RIDGE | 1.351e−04 | 0 | 0.5017 |
PSO-FS | 1.935e−04 | 0.4074 | 0.3063 |
PSO-LAR | 1.060e−03 | 0.5185 | 0.2954 |
PSO-LASSO | * | * | * |
PSO-ENET | 5.946e−03 | 0.5185 | 0.2714 |
Rule | p | \(\sigma \) | \(w_2\) | \(w_1\) | \(w_0\) |
---|---|---|---|---|---|
OLS | |||||
\(R_{1}\) | 0 | 0.0531 | − 46.46 | − 1.027 | − 0.0060 |
\(R_{2}\) | 0.125 | 0.0531 | 132.9 | − 22.80 | 1.798 |
\(R_{3}\) | 0.25 | 0.0531 | − 97.88 | 64.29 | − 7.759 |
\(R_{4}\) | 0.375 | 0.0531 | − 3.852 | 4.315 | 2.501 |
\(R_{5}\) | 0.5 | 0.0531 | − 425.3 | 374.3 | − 81.12 |
\(R_{6}\) | 0.625 | 0.0531 | 1030 | − 1267 | 389.8 |
\(R_{7}\) | 0.75 | 0.0531 | − 922.5 | 1432 | − 553.3 |
\(R_{8}\) | 0.875 | 0.0531 | 121.9 | − 291.9 | 162.9 |
\(R_{9}\) | 1 | 0.0531 | 655.8 | − 1249 | 595.1 |
PSO-ENET | |||||
\(R_{1}\) | 0 | 0.2139 | 0 | 14.51 | − 2.028 |
\(R_{2}\) | 0.8145 | 0.1 | 0 | 0 | 126.3 |
\(R_{3}\) | 0.0090 | 0.9261 | − 28.92 | − 33.44 | 0 |
\(R_{4}\) | 0.7227 | 0.1 | − 81.64 | 0 | 0 |
\(R_{5}\) | 0.8748 | 0.1 | − 108.1 | 0 | 0 |
\(R_{6}\) | 0.7960 | 0.1356 | 0 | 0 | 0 |
\(R_{7}\) | 0.6655 | 0.1225 | 17.93 | 0 | − 10.26 |
\(R_{8}\) | 0.5463 | 0.2385 | − 19.69 | 0 | 30.78 |
\(R_{9}\) | 1 | 0.1 | 32.53 | 0 | 32.71 |
4.4 Discussion
- in the first variant (without PSO), the use of sparse regressions may reduce the validation error \(\mathrm {RMSE}\) and the quality index q compared to the fuzzy OLS-based reference model,
- in the second variant (with PSO), the experiments did not give results for the PSO-OLS method, but they did for the PSO-RIDGE method,
- comparing the results in the second variant against the PSO-RIDGE method:
  - in the first experiment, a reduction in both the error \(\mathrm {RMSE}\) and the quality index q was obtained for the PSO-LASSO and PSO-ENET methods,
  - in the second and third experiments, the error \(\mathrm {RMSE}\) of the sparse regression methods was worse than that of the PSO-RIDGE method, while the quality index q was better,
- the error \(\mathrm {RMSE}\) and the quality index q obtained in the second variant are better than those in the first variant,
- in each experiment, the model was simplified by reducing the number of polynomial coefficients: by 48% in the first experiment, by 26% in the second, and by 52% in the third.