
Open Access 26.06.2022

An efficient DBSCAN optimized by arithmetic optimization algorithm with opposition-based learning

Authors: Yang Yang, Chen Qian, Haomiao Li, Yuchao Gao, Jinran Wu, Chan-Juan Liu, Shangrui Zhao

Published in: The Journal of Supercomputing | Issue 18/2022

Abstract

As an unsupervised learning method, clustering is widely used in data processing. The density-based spatial clustering of applications with noise algorithm (DBSCAN) obtains clusters by finding high-density areas separated by low-density areas. Unlike other clustering methods, DBSCAN works well for clusters of arbitrary shape in a spatial database and can effectively cluster exceptional data. However, using DBSCAN requires presetting the parameters EPS and MinPts for each clustering object, which greatly influences its performance. To optimize these parameters automatically and improve the performance of DBSCAN, we propose an improved DBSCAN optimized by the arithmetic optimization algorithm (AOA) with opposition-based learning (OBL), named OBLAOA-DBSCAN. In detail, the reverse search capability of OBL is added to AOA to obtain proper parameters for DBSCAN and thus achieve adaptive parameter optimization. In addition, our proposed OBLAOA optimizer is compared with standard AOA and several recent meta-heuristic algorithms on 8 benchmark functions from CEC2021, which validates the exploration improvement brought by OBL. To validate the clustering performance of OBLAOA-DBSCAN, 5 classical clustering methods on 10 real datasets are chosen as comparison models in terms of computational cost and accuracy. Based on the experimental results, we draw two conclusions: (1) the proposed OBLAOA-DBSCAN provides highly accurate clusters more efficiently; and (2) OBLAOA significantly improves the exploration ability and thus provides better optimal parameters.
Notes
Chen Qian and Haomiao Li have contributed equally to this work.


1 Introduction

Clustering, a common unsupervised learning technique [1-4], groups the samples of an unlabeled dataset according to their features, so that the similarity of data objects within the same cluster is highest while that across different clusters is lowest [5-7]. Clustering is widely used in biology [8], medicine [9], psychology [10], statistics [11], mathematics [12] and computer science [13]. Since the early 1950s, many clustering algorithms have been proposed. In this paper, considering the novelty and effectiveness of density-based methods, we focus on density-based spatial clustering of applications with noise (DBSCAN) and explore an adaptive method to tune its hyperparameters instead of setting them empirically.

1.1 Literature review

Among clustering algorithms, K-means [14], the most basic partition-based method, has the advantages of a simple principle, strong practicability, fast convergence and good interpretability. However, it struggles to converge on non-convex datasets and often stops at a local optimum.
Different from K-means, DBSCAN [15, 16] is a popular density-based clustering algorithm. It obtains clusters by finding high-density areas separated by low-density areas. Compared with clustering algorithms based on the distance between objects, DBSCAN is suitable for finding clusters of arbitrary shape in a spatial database and connecting adjacent regions of comparable density. It can effectively handle abnormal data, especially when clustering spatial data [17]. Despite these advantages, DBSCAN still has shortcomings: for each dataset, it requires the most appropriate parameters, MinPts and EPS, to achieve the best clustering effect. To some extent, this parameter-setting process limits the application of DBSCAN [18].
Over the years, many researchers have improved DBSCAN [19] with meta-heuristic algorithms [20-23] to automatically search for and determine the EPS and MinPts parameters. For example, Lai et al. [24] proposed a multi-segment optimization algorithm; with its special variable-updating method, it shows good optimization performance, achieves good DBSCAN accuracy, and quickly finds an appropriate EPS. Ji'an et al. [25] proposed an adaptive DBSCAN that treats the target solution and its motion range as noise points, in which the DBSCAN \(\epsilon\)-neighborhood is affected by specific physical factors. Zhu et al. [26] applied the harmony search optimization algorithm to DBSCAN and obtained better clustering parameters and results. Hu et al. [27] proposed KR-DBSCAN, a density-based clustering algorithm based on reverse nearest neighbors and influence space. Li et al. [28] combined an improved DBSCAN based on bat optimization with the DP algorithm for clustering and obtained good results. However, these methods still suffer from low convergence accuracy, poor universality, and slow convergence speed.
Meta-heuristic algorithms, such as the Grey Wolf Optimizer (GWO), the Dragonfly Algorithm (DA) and the Ant Lion Optimizer (ALO), have become popular in recent years. With their high convergence accuracy and strong robustness, they can be used to select the parameters of DBSCAN. However, common meta-heuristic algorithms easily fall into local optima. Therefore, we choose the arithmetic optimization algorithm (AOA) as the optimizer. AOA is a new population-based meta-heuristic proposed by Abualigah [29] that uses the four basic arithmetic operators of mathematics. AOA can handle not only low-dimensional problems [30] but also high-dimensional ones [31]. Its distribution mechanism enhances its global search ability, and its population-based design [32] also helps achieve faster convergence.
However, the ability of standard AOA to balance global and local search is still insufficient, as is its optimization accuracy. To better balance exploitation (local search) and exploration (global search) and to improve optimization accuracy, we adopt search strategies that strengthen both. Opposition-based learning (OBL) [33-35] is one of the most popular strategies for enhancing exploration, as it improves the population diversity of the algorithm in the search space. In an optimization problem, checking a candidate solution and its opposite solution at the same time speeds up convergence toward the global optimum.
In general, the clustering effect of DBSCAN is limited by how well its parameters are optimized. Existing optimization algorithms for DBSCAN parameter tuning have low convergence accuracy and easily fall into local optima. Although standard AOA improves global exploration compared with other optimization algorithms, it still suffers from insufficient convergence accuracy and global search ability.

1.2 The gap

To sum up, the demand for higher accuracy from the DBSCAN clustering algorithm keeps increasing. Meeting it requires more advanced machine learning methods that automatically optimize DBSCAN's parameters.

1.3 The contribution

To improve the accuracy and convergence speed of automatic DBSCAN parameter selection, this paper proposes a new meta-heuristic improvement strategy, OBLAOA-DBSCAN, which combines the advantages of AOA and OBL with DBSCAN to dynamically adjust its two parameters. According to the experimental results, DBSCAN improved with OBLAOA performs well on a variety of public datasets. The contributions of this article are as follows:
(1) An OBLAOA-DBSCAN clustering algorithm is proposed, which realizes automatic parameter search and improves clustering accuracy and efficiency.
(2) By adding the OBL strategy, an OBLAOA optimizer is established, which effectively improves the exploration performance of AOA.
(3) The proposed OBLAOA-DBSCAN algorithm provides better clustering results than other clustering algorithms, including K-means, Spectral, Optics, DPC and combinations of DBSCAN with other meta-heuristic optimization algorithms.

1.4 The structure of the paper

The remaining contents are organized as follows. Section 2 outlines the background of DBSCAN and AOA. Section 3 introduces OBLAOA and gives its principle and concrete operation. Section 4 illustrates the proposed OBLAOA-DBSCAN algorithm. Section 5 compares the proposed OBLAOA with the original AOA on 8 benchmark functions. Section 6 demonstrates the superiority of the proposed algorithm on 10 datasets by comparison with several clustering algorithms. Section 7 concludes the paper.

2 Background

2.1 The basic theory of DBSCAN

DBSCAN, an unsupervised learning method, was proposed in [36] to handle clustering problems efficiently based on density. DBSCAN can identify noise points efficiently and accurately, and it can distinguish clusters of arbitrary shape.
In this clustering method, two parameters, epsilon (EPS) and MinPts, must be preset to appraise the density distribution of points. DBSCAN starts from a random unvisited point and counts the points that fall within a radius EPS of it.
If the number of such points is at least MinPts, the current point and its neighbors form a cluster, and the starting point is marked as visited. All points in the cluster that are not yet marked as visited are then processed recursively in the same way, expanding the cluster. Otherwise, the point is temporarily marked as a noise point. Once the cluster is fully expanded, that is, all points in the cluster are marked as visited, the same procedure is applied to the remaining unvisited points. The clustering process ends when every object has been assigned to a cluster or marked as noise. The DBSCAN algorithm flow is presented in Algorithm 1.
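For concreteness, the procedure above can be sketched in a few lines of Python. This is a minimal illustration of the algorithm as just described, not the authors' implementation; the brute-force neighborhood query is kept only for clarity.

```python
import numpy as np

def dbscan(X, eps, min_pts):
    """Minimal DBSCAN sketch: label -1 marks noise, 0..k-1 mark clusters."""
    n = len(X)
    labels = np.full(n, -1)                 # every point starts as noise
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        seeds = list(np.where(np.linalg.norm(X - X[i], axis=1) <= eps)[0])
        if len(seeds) < min_pts:
            continue                        # tentatively noise; a later cluster may claim it
        labels[i] = cluster
        while seeds:                        # iterative version of the recursive expansion
            j = seeds.pop()
            if not visited[j]:
                visited[j] = True
                nb = np.where(np.linalg.norm(X - X[j], axis=1) <= eps)[0]
                if len(nb) >= min_pts:      # j is a core point: keep growing
                    seeds.extend(nb)
            if labels[j] == -1:
                labels[j] = cluster         # core or border point joins the cluster
        cluster += 1
    return labels
```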
DBSCAN suffers from the need to determine these two parameters. Previous studies have shown that they can be found by statistical and classical methods combined with various data mining techniques, but such methods consume excessive time. Therefore, we introduce a meta-heuristic optimizer to improve the accuracy and efficiency of finding these parameters considerably, achieving clustering faster and more precisely.

2.2 The arithmetic optimization algorithm

The Arithmetic Optimization Algorithm (AOA) is a new meta-heuristic optimization algorithm [29] inspired by the four major arithmetic operators: multiplication (M), division (D), subtraction (S), and addition (A). The mathematical models of the exploration and exploitation phases are detailed below. Note that the choice between the exploration stage and the exploitation stage is conditioned on the math optimizer accelerated (MOA) function. It is calculated by
$$\begin{aligned} \mathrm{MOA}(C_{\mathrm{Iter}}) =\delta +C_{\mathrm{Iter}} \times \left( \frac{\gamma -\delta }{M_{\mathrm{Iter}}}\right) , \end{aligned}$$
(1)
where \(M_{\mathrm{Iter}}\) is the maximum number of iterations and \(C_{\mathrm{Iter}}\) is the current iteration, which lies between 1 and \(M_{\mathrm{Iter}}\). \(\mathrm{MOA}(C_{\mathrm{Iter}})\) is the value of MOA at the current iteration, and \(\gamma\) and \(\delta\) are set to 1 and 0.2, respectively. The math optimizer probability (MOP) at the current iteration is calculated by
$$\begin{aligned} \mathrm{MOP}(C_{\mathrm{Iter}}) = 1 - \frac{{C_{\mathrm{Iter}}}^{\frac{1}{\alpha }}}{{M_{\mathrm{Iter}}}^{\frac{1}{\alpha }}}, \end{aligned}$$
(2)
where \(\alpha\) is a sensitive parameter that controls the exploitation accuracy over the iterations; it is set to 0.5.
\(r_1, r_2, r_3\) are random numbers. When \(\mathrm{MOA} < r_1\), the exploration phase is carried out by executing D or M. The position update in the exploration stage is as follows:
$$\begin{aligned} x_{i,j}(C_\mathrm{Iter}+1) = {\left\{ \begin{array}{ll} x^{\star }(C_\mathrm{Iter}) \div (\mathrm{MOP} + \epsilon ) \times ( (ub_j - lb_j ) \times \mu + lb_j), &{} r_2 < 0.5 \\ x^{\star }(C_\mathrm{Iter}) \times \mathrm{MOP} \times ((ub_j - lb_j) \times \mu + lb_j), &{} \text { otherwise}, \end{array}\right. } \end{aligned}$$
(3)
where \(x_{i,j}(C_{\text {Iter}}+1)\) denotes the jth dimension of the ith solution in the next iteration, and \(x^{\star }(C_{\mathrm{Iter}})\) is the best solution obtained in the previous iteration. \(\epsilon\) is a small constant that prevents division by zero, and \(ub_j\) and \(lb_j\) are the upper and lower bounds of the jth position. \(\mu\) is a control parameter, set to 0.5.
When \(\mathrm{MOA} \ge r_1\), the exploitation phase is carried out by executing S or A. In the case of \(r_3 < 0.5\), S performs the task (first rule in Eq. 4); otherwise, A performs it (second rule in Eq. 4). The position update in the exploitation stage is as follows:
$$\begin{aligned} x_{i,j}(C_{\text {Iter}} + 1 ) = {\left\{ \begin{array}{ll} x^{\star }(C_\mathrm{Iter}) - \mathrm{MOP} \times ((ub_j - lb_j) \times \mu + lb_j), &{} r_3 < 0.5 \\ x^{\star }(C_\mathrm{Iter}) + \mathrm{MOP} \times ((ub_j - lb_j) \times \mu + lb_j), &{} \text{ otherwise } . \end{array}\right. } \end{aligned}$$
(4)
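Equations (1)-(4) translate directly into code. The following sketch of a single AOA iteration is ours (the function name `aoa_step` and the clipping to the bounds are illustrative choices, not part of the original formulation); `best` plays the role of \(x^{\star }\), and the parameter values match those stated above.

```python
import numpy as np

def aoa_step(X, best, lb, ub, c_iter, m_iter,
             gamma=1.0, delta=0.2, alpha=0.5, mu=0.5, eps=1e-12):
    """One AOA iteration following Eqs. (1)-(4); X has one row per solution."""
    moa = delta + c_iter * (gamma - delta) / m_iter          # Eq. (1)
    mop = 1 - c_iter ** (1 / alpha) / m_iter ** (1 / alpha)  # Eq. (2)
    n, d = X.shape
    X_new = np.empty_like(X)
    for i in range(n):
        for j in range(d):
            r1, r2, r3 = np.random.rand(3)
            scale = (ub[j] - lb[j]) * mu + lb[j]
            if moa < r1:                     # exploration via D or M, Eq. (3)
                if r2 < 0.5:
                    X_new[i, j] = best[j] / (mop + eps) * scale
                else:
                    X_new[i, j] = best[j] * mop * scale
            else:                            # exploitation via S or A, Eq. (4)
                X_new[i, j] = best[j] - mop * scale if r3 < 0.5 else best[j] + mop * scale
    return np.clip(X_new, lb, ub)            # keep candidates inside [lb, ub]
```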

2.3 The opposition-based learning

Opposition-based learning (OBL) considers candidate solutions together with their opposites. Depending on whether an estimate or its opposite is closer to the solution, the search interval can be recursively halved until one of them is close enough to the existing solution. Whether the original solution x is replaced by the opposite solution \(\bar{x}\) is determined by comparing their fitness function values. For a solution \({x} \in [lb,ub]\), \(\bar{x}\) is calculated by the following equation:
$$\begin{aligned} \bar{x} = ub + lb - x. \end{aligned}$$
(5)
The equation above generalizes to n dimensions via:
$$\begin{aligned} \bar{x}_{j}=ub_{j} +lb_{j}-x_{j}, j = 1,2,\cdots ,n. \end{aligned}$$
(6)
According to this comparison, the better of the two solutions is stored.
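In code, Eqs. (5) and (6) reduce to one vectorized expression. The sketch below is an illustration (the helper names are ours): it builds the opposite solution and keeps whichever of the pair has the better, i.e., smaller, fitness.

```python
import numpy as np

def opposition(x, lb, ub):
    """Opposite solution of Eq. (6): elementwise ub + lb - x."""
    return ub + lb - x

def keep_better(x, lb, ub, fitness):
    """Evaluate a solution and its opposite, and keep the fitter one."""
    x_opp = opposition(x, lb, ub)
    return x if fitness(x) <= fitness(x_opp) else x_opp
```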

3 The proposed OBLAOA

OBL takes both candidate solutions and their opposite solutions into consideration, which offers a greater chance of reaching the global optimum and faster convergence than considering the candidates alone. It finds the solution opposite to the present one and then decides whether to use it by comparing their fitness function values. For example, if \(f(x^{\star }(C_{\text {Iter}})) \le f(\bar{x}^{\star }(C_{\text {Iter}}))\), then \(x^{\star }(C_{\text {Iter}})\) is kept; otherwise, \(\bar{x}^{\star }(C_{\text {Iter}})\) is stored. The equation used in OBLAOA to obtain the opposite solution is
$$\begin{aligned} \bar{x}^{\star }(C_{\text {Iter}}) = ub + lb - x^{\star }(C_{\text {Iter}}) \end{aligned}$$
(7)
where \(x^{\star }(C_{\text {Iter}})\) denotes the position of the best solution in the current iteration. \(\bar{x}^{\star }(C_\mathrm{Iter} )\) denotes the opposite position of the best solution in the current iteration.
The flowchart of the proposed OBLAOA is given in Fig. 1 and the pseudocode is recorded in Algorithm 2.
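In outline, one OBLAOA iteration is simply a standard AOA update followed by the opposition check of Eq. (7) on the incumbent best solution. A sketch under that reading, reusing the illustrative `aoa_step`, `opposition` and `keep_better` helpers from above:

```python
import numpy as np

def oblaoa(fitness, lb, ub, n_pop=20, m_iter=500):
    """Sketch of the OBLAOA optimizer: AOA updates plus an OBL check on x*."""
    d = len(lb)
    X = lb + np.random.rand(n_pop, d) * (ub - lb)    # random initial population
    best = min(X, key=fitness).copy()
    for c_iter in range(1, m_iter + 1):
        X = aoa_step(X, best, lb, ub, c_iter, m_iter)
        cand = min(X, key=fitness)                   # best candidate this iteration
        if fitness(cand) < fitness(best):
            best = cand.copy()
        best = keep_better(best, lb, ub, fitness)    # Eq. (7) applied to x*
    return best, fitness(best)
```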

4 The improved DBSCAN with OBLAOA

In this section, we apply OBLAOA to optimize the two parameters of DBSCAN (EPS and MinPts). The resulting method, named OBLAOA-DBSCAN, further improves the performance of the clustering algorithm.
In detail, OBLAOA-DBSCAN determines the parameters EPS and MinPts automatically over an extensive search space via a meta-heuristic method. First, the normalized range matrix of the two parameters (EPS and MinPts) is set as the upper bounds (\(ub_{j}\)) and lower bounds (\(lb_{j}\)) of the search space. Then, OBLAOA searches for suitable parameters within this effective search space.
To obtain the best clustering results, the fitness function in OBLAOA-DBSCAN is defined as the sum of the average distances within each cluster, based on the distance
$$\begin{aligned} \mathrm {D} \left( o_{i}, o_{l}\right) =\left( \sum _{j=1}^{m}\left( o_{i j}-o_{l j}\right) ^{r}\right) ^{\frac{1}{r}} \end{aligned}$$
(8)
where \(D(o_{i}, o_{l})\) is the distance function between objects i and l (the Euclidean distance when r = 2), and \(o_{ij}, o_{lj}\ (i, l=1, \ldots , n,\ j=1, \ldots ,m)\) denote the value of the jth attribute of objects i and l, respectively.
As the fitness value updates, the position of the best solution, which encodes the two parameters, varies, and the corresponding MinPts and EPS change accordingly. Once the fitness value no longer changes, the obtained parameters are applied to DBSCAN for clustering.
When DBSCAN is used alone, problems such as low clustering accuracy and poorly defined noise points often appear because the parameters are set manually. By introducing OBL to enhance the exploration ability of AOA, OBLAOA can provide effective parameter solutions for DBSCAN and thereby improve its clustering ability. The flowchart is shown in Fig. 2. The time complexity of OBLAOA-DBSCAN is \(O(N(1 + M \times n\log n + M \times n))\), where N is the number of candidate solutions, M is the number of iterations, and n is the dimension of the problem.
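To make the pipeline concrete, the sketch below evaluates the fitness of a candidate (EPS, MinPts) pair by running DBSCAN and summing the average intra-cluster distances of Eq. (8) with r = 2. It uses scikit-learn's DBSCAN for brevity rather than the authors' implementation; the rounding of EPS to one decimal and of MinPts down to an integer follows the convention described later in Sect. 6.3, and the penalty for finding no clusters is our own assumption.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def clustering_fitness(params, X):
    """Fitness of a candidate [eps, min_pts]: sum over clusters of the
    average pairwise Euclidean distance inside the cluster (Eq. 8, r = 2)."""
    eps = max(round(float(params[0]), 1), 0.1)   # EPS kept to one decimal, > 0
    min_pts = max(int(params[1]), 1)             # MinPts rounded down, >= 1
    labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(X)
    total = 0.0
    for c in set(labels) - {-1}:                 # ignore points labelled noise
        pts = X[labels == c]
        diff = pts[:, None, :] - pts[None, :, :]
        total += np.sqrt((diff ** 2).sum(-1)).mean()
    return total if total > 0 else np.inf       # penalize finding no clusters (assumption)
```

Minimizing this quantity with OBLAOA then yields the (EPS, MinPts) pair that is finally handed to DBSCAN for the reported clustering.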

5 Numerical simulation

5.1 The benchmark functions

To evaluate the performance of the proposed OBLAOA optimizer, we conducted numerical simulations on 8 test functions from CEC2021. The benchmark functions are presented in Table 1; the constraint range of each is given in the Range column.
Table 1
The CEC2021 benchmark functions

| Function | Description | Range |
|---|---|---|
| \(F_{1}\) | \(f(x)= x_1^2 + 10^6\sum _{i = 2}^D x_i^2\) | [−100, 100] |
| \(F_{2}\) | \(f(x) = \sum _{i = 1}^D (x_i^2 - 10\cos (2\pi x_i) + 10)\) | [−100, 100] |
| \(F_{3}\) | \(f(x) = \sum _{i = 1}^D (10^6)^{\frac{i-1}{D-1}} x_i^2\) | [−100, 100] |
| \(F_{4}\) | \(f(x) = \left| \left( \sum _{i = 1}^D x_i^2\right) ^2 - \left( \sum _{i = 1}^D x_i\right) ^2\right| ^{1/2} + \left( 0.5\sum _{i = 1}^D x_i^2 + \sum _{i = 1}^D x_i\right) /D + 0.5\) | [−100, 100] |
| \(F_{5}\) | \(f(x)= \sum _{i = 1}^{D-1} \left( 100(x_i^2 - x_{i+1})^2 + (x_i - 1)^2\right)\) | [−100, 100] |
| \(F_{6}\) | \(f(x)= \sum _{i = 1}^D \frac{x_i^2}{4000} - \prod _{i = 1}^D \cos \left( \frac{x_i}{\sqrt{i}}\right) + 1\) | [−100, 100] |
| \(F_{7}\) | \(f(x)= -20\exp \left( -0.2\sqrt{\frac{1}{D}\sum _{i = 1}^D x_i^2}\right) - \exp \left( \frac{1}{D}\sum _{i = 1}^D \cos (2\pi x_i)\right) + 20 + e\) | [−100, 100] |
| \(F_{8}\) | \(f(x)= \left| \sum _{i = 1}^D x_i^2 - D\right| ^{1/4} + \left( 0.5\sum _{i = 1}^D x_i^2 + \sum _{i = 1}^D x_i\right) /D + 0.5\) | [−100, 100] |
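As an illustration of how these benchmark functions are evaluated, two entries of Table 1 are written out below (F6 is the Griewank-type function and F7 the Ackley-type function); the remaining functions follow the same pattern.

```python
import numpy as np

def f6(x):
    """F6 from Table 1 (Griewank)."""
    i = np.arange(1, len(x) + 1)
    return np.sum(x ** 2 / 4000) - np.prod(np.cos(x / np.sqrt(i))) + 1

def f7(x):
    """F7 from Table 1 (Ackley)."""
    d = len(x)
    return (-20 * np.exp(-0.2 * np.sqrt(np.sum(x ** 2) / d))
            - np.exp(np.sum(np.cos(2 * np.pi * x)) / d) + 20 + np.e)
```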

5.2 The setting of experimental parameters

The results of OBLAOA are saved and compared with five other methods (AOA, IAOA, DAOA, EN-GWO and WSSA) on each test case. The parameters of each algorithm are set as follows. The maximum number of iterations and the population size of all algorithms are set to 500 and 20, respectively, and the number of function evaluations is 30 [37]. In addition, the initial random population of all algorithms is the same. All CEC2021 test functions are simulated in 10 and 20 dimensions.
Table 2
Results of 10-dimensional CEC2021 test functions (\(F_1\)-\(F_8\))

| Algorithm | \(F_1\) Avg | Best | Std | h | \(F_2\) Avg | Best | Std | h |
|---|---|---|---|---|---|---|---|---|
| AOA | 2.94e+9 | 1.08e-3 | 3.10e+18 | 1 | 1.94e+3 | 1.77e+3 | 2.11e+4 | 1 |
| DAOA | 1.04e+10 | 1.04e-10 | 3.97e+16 | 1 | 2.02e+3 | 2.17e+3 | 2.57e+3 | 1 |
| IAOA | 2.95e+7 | 2.20e-198 | 3.43e+17 | 1 | 203.06 | 0 | 1.3e+5 | 1 |
| ENGWO | 9.90e+9 | 9.85e-9 | 2.38e+16 | 1 | 2.31e+3 | 2.28e+3 | 1.5e+3 | 1 |
| WSSA | 1.30e+10 | 1.30e-10 | 5.26e-9 | 1 | 2.59e+3 | 2.58e+3 | 869.1 | 1 |
| OBLAOA | 2.62e+7 | 0 | 3.38e-17 | 0 | 179.94 | 0 | 2.11e+4 | 0 |

| Algorithm | \(F_3\) Avg | Best | Std | h | \(F_4\) Avg | Best | Std | h |
|---|---|---|---|---|---|---|---|---|
| AOA | 218.02 | 200.89 | 2.59e+3 | 1 | 4.16e+4 | 9.31e+5 | 1.52e+11 | 1 |
| DAOA | 559.48 | 468.61 | 28.37 | 1 | 7.74e+4 | 5.01e+5 | 2.23e+11 | 1 |
| IAOA | 114.78 | 109.62 | 490.39 | 1 | 3.49e+5 | 3.29e+5 | 8.46e+10 | 1 |
| ENGWO | 453.99 | 450.90 | 35.45 | 1 | 1.64e+5 | 1.35e+5 | 8.83e+10 | 1 |
| WSSA | 559.48 | 559.48 | 0 | 1 | 5.95e+6 | 5.92e+6 | 1.50e+10 | 1 |
| OBLAOA | 106.76 | 106.76 | 1.10e+13 | 0 | 2.37e+5 | 1.99e+5 | 9.47e+10 | 0 |

| Algorithm | \(F_5\) Avg | Best | Std | h | \(F_6\) Avg | Best | Std | h |
|---|---|---|---|---|---|---|---|---|
| AOA | 2.71e+7 | 2.17e+3 | 3.40e+14 | 1 | 2.14e+3 | 2.86e+3 | 2.29e+3 | 1 |
| DAOA | 6.04e+7 | 5.01e+3 | 3.77e+14 | 1 | 2.18e+3 | 2.05e+3 | 4.16e+3 | 1 |
| IAOA | 1.53e+5 | 1700 | 1.10e+14 | 1 | 1.62e+3 | 1600 | 3.52e+3 | 1 |
| ENGWO | 2.11e+7 | 2.09e+7 | 5.89e+12 | 1 | 2.14e+3 | 2.13e+3 | 330.07 | 1 |
| WSSA | 2.85e+7 | 2.75e+7 | 2.85e+13 | 1 | 2.35e+3 | 2.35e+3 | 0.02 | 1 |
| OBLAOA | 106.76 | 1700 | 1.10e+13 | 0 | 1.62e+3 | 1600 | 2.35e+3 | 0 |

| Algorithm | \(F_7\) Avg | Best | Std | h | \(F_8\) Avg | Best | Std | h |
|---|---|---|---|---|---|---|---|---|
| AOA | 1.24e+7 | 1.42e+6 | 3.72e+13 | 1 | 0.29 | 3.19e+3 | 3.03e+3 | 1 |
| DAOA | 1.67e+7 | 6.06e+6 | 4.70e+12 | 1 | 3.98e+3 | 3.68e+3 | 2.40e+4 | 1 |
| IAOA | 6.89e+5 | 5.8e+5 | 6.84e+11 | 1 | 3.33e+3 | 3.33e+3 | 1.10e+4 | 1 |
| ENGWO | 7.42e+6 | 7.05e+6 | 7.08e+11 | 1 | 4.18e+3 | 4.13e+3 | 716.21 | 1 |
| WSSA | 1.74e+6 | 1.74e+7 | 3.08e-36 | 1 | 4.4e+3 | 4.4e+3 | 0.32 | 1 |
| OBLAOA | 3.23e+5 | 1.40e+5 | 8.19e+11 | 0 | 3.04 | 2.99e+3 | 9.45 | 0 |
Table 3
Results of 20-dimensional CEC2021 test functions (\(F_1\)-\(F_8\))

| Algorithm | \(F_1\) Avg | Best | Std | h | \(F_2\) Avg | Best | Std | h |
|---|---|---|---|---|---|---|---|---|
| AOA | 3.36e+10 | 3.2e-10 | 4.7e+17 | 1 | 5.7e+3 | 3.3e+3 | 4.9e+4 | 1 |
| DAOA | 3.7e+10 | 3.6e-10 | 7.6e+15 | 1 | 5.7e+3 | 5.4e+3 | 3.5e+4 | 1 |
| IAOA | 9.9e+7 | 2.2e-160 | 3e+18 | 1 | 109 | 0 | 3.2e+5 | 1 |
| ENGWO | 3.2e+10 | 3.2e-10 | 1.04e+17 | 1 | 5.8e+3 | 5.7e+3 | 1.26e+3 | 1 |
| WSSA | 3.7e+10 | 3.7e-10 | 1.3e-8 | 1 | 6.2e+3 | 6.2e+3 | 5.6e+22 | 1 |
| OBLAOA | 7.7e+7 | 2.6e-177 | 2.8e+18 | 0 | 108 | 0 | 1.25e+5 | 0 |

| Algorithm | \(F_3\) Avg | Best | Std | h | \(F_4\) Avg | Best | Std | h |
|---|---|---|---|---|---|---|---|---|
| AOA | 1.2e+3 | 1.2e+3 | 3.6e+3 | 1 | 1.24e+6 | 1.06e+6 | 9.2e+11 | 1 |
| DAOA | 1.7e+3 | 1.62e+3 | 11 | 1 | 5.6e+6 | 3.6e+6 | 1.03e+13 | 1 |
| IAOA | 341 | 333 | 4e+9 | 1 | 4.5e+5 | 4.17e+5 | 4.2e+11 | 1 |
| ENGWO | 1.56e+3 | 1.56e+3 | 34 | 1 | 4.05e+6 | 4.01e+6 | 2.23e+11 | 1 |
| WSSA | 1.69e+3 | 1.69e+3 | 0 | 1 | 1.45e+7 | 1.45e+7 | 1.6e+14 | 1 |
| OBLAOA | 305 | 300 | 4e+3 | 0 | 4.5e+5 | 3.5e+5 | 4.5e+11 | 0 |

| Algorithm | \(F_5\) Avg | Best | Std | h | \(F_6\) Avg | Best | Std | h |
|---|---|---|---|---|---|---|---|---|
| AOA | 8.1e+7 | 2.8e+4 | 4.5e+15 | 1 | 3.55e+3 | 3.35e+3 | 2.98e+3 | 1 |
| DAOA | 2.2e+8 | 3.17e+6 | 1.13e+13 | 1 | 3.83e+3 | 3.56e+3 | 1.14e+3 | 1 |
| IAOA | 4.9e+5 | 1700 | 1.13e+14 | 1 | 1.66e+3 | 1600 | 2.4e+4 | 1 |
| ENGWO | 1.26e+8 | 1.25e+8 | 2.78e+13 | 1 | 3.71e+3 | 3.7e+3 | 405 | 1 |
| WSSA | 1.84e+8 | 1.814e+8 | 1.08e+14 | 1 | 1.84e+8 | 1.81e+8 | 1.08e+14 | 1 |
| OBLAOA | 4.7e+5 | 1700 | 1.13e+14 | 0 | 1.64e+3 | 1600 | 1.45e+4 | 0 |

| Algorithm | \(F_7\) Avg | Best | Std | h | \(F_8\) Avg | Best | Std | h |
|---|---|---|---|---|---|---|---|---|
| AOA | 6.9e+7 | 1.06e+7 | 3.49e+15 | 1 | 7.6e+3 | 7.32e+3 | 4.78e+4 | 1 |
| DAOA | 1.8e+8 | 6.12e+7 | 2.53e+15 | 1 | 7.65e+3 | 7.32e+3 | 4.99e+4 | 1 |
| IAOA | 1.53e+7 | 1.45e+7 | 8.6e+13 | 1 | 6.74e+3 | 6.62e+3 | 7.3774e+4 | 1 |
| ENGWO | 5.55e+7 | 5.46e+7 | 5.37e+13 | 1 | 8.19e+3 | 8.19e+3 | 1.06e+4 | 1 |
| WSSA | 2.12e+8 | 2.12e+8 | 1.11e+11 | 1 | 8.72e+3 | 8.71e+3 | 1.24e+3 | 1 |
| OBLAOA | 2.86e+6 | 7.6e+5 | 1.89e+14 | 0 | 6.6e+3 | 2.99e+3 | 5.05e+4 | 0 |
Table 4
Results of three engineering problems

Engineering problem A: welded beam design (WBD)

| Algorithm | Avg | Best | Std | h |
|---|---|---|---|---|
| AOA | 101.40 | 2.47 | 2.46e+6 | 1 |
| DAOA | 158.78 | 2.3768 | 34.54e+6 | 1 |
| IAOA | 1217 | 2.85 | 1.32e+7 | 1 |
| ENGWO | 1922 | 4.38 | 32.85e+7 | 1 |
| WSSA | 335 | 235.41 | 352.12e+6 | 1 |
| OBLAOA | 158.78 | 2.37 | 4.54e+6 | 0 |

Engineering problem B: compression spring design (CSD)

| Algorithm | Avg | Best | Std | h |
|---|---|---|---|---|
| AOA | 12.33 | 6.44 | 16.22 | 1 |
| DAOA | 11.75 | 7.12 | 137.92 | 1 |
| IAOA | 9.15 | 9.82 | 129.11 | 1 |
| ENGWO | 28.98 | 16.86 | 120.95 | 1 |
| WSSA | 33.31 | 26.25 | 380.73 | 1 |
| OBLAOA | 12.09 | 4.25 | 129.1167 | 0 |

Engineering problem C: design problem of an I-beam (IBP)

| Algorithm | Avg | Best | Std | h |
|---|---|---|---|---|
| AOA | 189.42 | 187.63 | 3.68 | 1 |
| DAOA | 190.63 | 187.28 | 10.41 | 1 |
| IAOA | 192.97 | 187 | 24.34 | 1 |
| ENGWO | 188.05 | 187.73 | 1.48 | 1 |
| WSSA | 188.73 | 186.73 | 5.46 | 1 |
| OBLAOA | 189.47 | 186.42 | 5.43 | 0 |

5.3 Analysis of the results

The numerical simulation results are recorded in Tables 2 and 3. To verify the effectiveness of OBLAOA, we compared its results with standard AOA, IAOA, DAOA, ENGWO and WSSA. We report the average value (Avg), standard deviation (Std) and best value (Best) as performance indicators in all tables, with the better results shown in bold. In addition, the Wilcoxon rank test was applied to all results, and its outcome (h) is 1 for every compared algorithm. The tables show that OBLAOA performs better than standard AOA and the other currently popular optimization algorithms (IAOA, DAOA, ENGWO and WSSA). In the high-dimensional (20-dimensional) tests, the average and best values of OBLAOA are better than those of standard AOA and the popular algorithms for all functions. In the low-dimensional (10-dimensional) tests, the average and best values of OBLAOA are better than those of AOA for F1, F2, F3, F5, F6, F7 and F8. In some experiments, OBLAOA improves markedly over AOA. Taking the 10-dimensional F3 function as an example, the best value of OBLAOA is 106.76, which is 46.62% lower than standard AOA, 77.17% lower than DAOA, 2.67% lower than ENGWO and 80.91% lower than WSSA. For F6, the best value is 1600, which is 44% lower than standard AOA, 21.95% lower than DAOA, 24.88% lower than ENGWO and 31.91% lower than WSSA. For F8, the best value is 2.99e+3, which is 59.15% lower than standard AOA, 59.14% lower than DAOA, 54.83% lower than IAOA, 63.49% lower than ENGWO and 65.67% lower than WSSA. To sum up, our proposed OBLAOA outperforms standard AOA and other currently popular algorithms on complex functions.
To further demonstrate the optimization ability of OBLAOA, we selected three practical engineering problems for verification: welded beam design [38], compression spring design [39] and the design problem of an I-beam [40]. The results are recorded in Table 4 and shown in Fig. 3. To verify the adequacy of the experimental results, we also carried out the Wilcoxon signed rank test; the outcomes, expressed as h and recorded in Table 4, are all 1. From Fig. 3 we can see that OBLAOA optimizes better than the other algorithms and has the highest convergence accuracy on all problems. Specifically, on the CSD problem, OBLAOA converges first, and its convergence is greatly improved compared with ENGWO and WSSA. In general, OBLAOA converges better when solving practical engineering problems. As can be seen from Table 4, OBLAOA also has clear advantages over standard AOA and the latest algorithms, obtaining the best value on all three engineering problems. Taking the CSD problem as an example, our best value is 4.25, which is 34% lower than standard AOA, 40.3% lower than DAOA and 56.72% lower than IAOA; ENGWO and WSSA do not converge, which differs greatly from OBLAOA. Figures 4 and 5 also show that OBLAOA converges earlier and faster, with a lower final fitness value than the other algorithms.

6 Experiment and performance evaluation

This section is organized as follows. In Sect. 6.1, we describe the datasets used in the experiments. In Sect. 6.2, we introduce the evaluation indexes. In Sect. 6.3, we describe the parameter-setting process in detail. In Sect. 6.4, we use ten datasets to test different optimization algorithms. In Sect. 6.5, we compare the optimized OBLAOA-DBSCAN with five classical clustering algorithms.

6.1 The datasets

In this part, we use ten datasets to test the performance of OBLAOA-DBSCAN. The datasets contain 788, 399, 373, 150, 251, 300, 198, 1980, 341 and 846 instances; their dimensions are 3, 3, 3, 5, 3, 2, 34, 3, 3 and 19; and their cluster counts are 7, 6, 2, 3, 3, 5, 2, 5, 9 and 4. Table 5 lists the ten datasets used as experimental data. We compare the real labels with the clustering labels and use the comparison result as the evaluation index of the algorithm; therefore, we use datasets with real labels.
Table 5
Datasets used in experiments

| Dataset | Instance | Dimension | Cluster |
|---|---|---|---|
| Aggregation | 788 | 3 | 7 |
| Compound | 399 | 3 | 6 |
| Jain | 373 | 3 | 2 |
| Iris | 150 | 5 | 3 |
| Spiral | 251 | 3 | 3 |
| Pathbased | 300 | 2 | 5 |
| Wpbc | 198 | 34 | 2 |
| Synthesis | 1980 | 3 | 5 |
| R15 | 341 | 3 | 9 |
| Vehicle | 846 | 19 | 4 |

6.2 The error index

To measure the clustering results of the improved method, we use Accuracy, the Davies-Bouldin index (DBI), the Silhouette index (SIL), the Rand index (RI) [41, 42], Normalized Mutual Information (NMI), Homogeneity, Completeness, and V-measure [43]. Because the datasets carry real labels, we use the accuracy index to show the performance of the proposed method.
Accuracy is the ratio of correctly clustered data to total data, where correctly clustered data are obtained by comparing the cluster labels K with the actual labels C. DBI weighs the distance within clusters against the distance between clusters; a smaller DBI means smaller within-cluster distances and larger between-cluster distances. It is formulated as:
$$\begin{aligned} \mathrm {DBI}=\frac{1}{N}\sum _{i=1}^{N}\left( \max \limits _{j=1,\ldots ,N, j \ne i} \left( \frac{S_i+S_j}{d_{i j}}\right) \right) , \end{aligned}$$
where N is the number of clusters and \(d_{i j}\) is the distance between clusters i and j. In addition, \(S_{i}\) and \(S_{j}\) are the mean within-cluster distances of clusters i and j.
The Silhouette value describes the separation between clusters: the larger it is, the more similar a point is to its own cluster and the less similar to other clusters. The formula is as follows:
$$\begin{aligned} \mathrm {SIL}=\frac{1}{N} \sum _{i=1}^{N}\left( \frac{b\left( {i}\right) -a\left( {i}\right) }{\max \left\{ a\left( {i}\right) , b\left( {i}\right) \right\} }\right) , \end{aligned}$$
where a(i) is the average distance between point i and all other points in its own cluster \(C_{i}\), and b(i) is the smallest average distance from point i to the points of any other cluster.
The Rand index compares the similarity of the results of two different clustering methods; the larger the value, the more consistent the clustering result is with the real situation. The formula is as follows:
$$\begin{aligned} \mathrm {RI}=\frac{x+y}{C_{n}^{2}}, \end{aligned}$$
where x is the number of point pairs assigned the same label in both C and K, y is the number of pairs assigned different labels in both C and K, and \(C_{n}^{2}\) is the number of point pairs that can be formed from the dataset.
NMI measures the degree of coincidence of two label sets and reflects the correlation between two sets of results; the greater the NMI, the greater the correlation between categories. With the entropies
$$\begin{aligned} \mathrm{Hl} = -\sum _{i=1}^{N}\left( \frac{\mathrm{Ml}}{N}\log _{2}\frac{\mathrm{Ml}}{N}\right) ,\quad \mathrm{Hr} = -\sum _{i=1}^{N}\left( \frac{\mathrm{Mr}}{N}\log _{2}\frac{\mathrm{Mr}}{N}\right) ,\quad \mathrm{Hlr} = -\sum _{i=1}^{N}\left( \frac{\mathrm{Ml}\,\mathrm{Mr}}{N}\log _{2}\frac{\mathrm{Ml}\,\mathrm{Mr}}{N}\right) , \end{aligned}$$
the index is
$$\begin{aligned} \mathrm{NMI} = \sqrt{\frac{\mathrm{Hl}+\mathrm{Hr}-\mathrm{Hlr}}{\mathrm{Hl}}\times \frac{\mathrm{Hl}+\mathrm{Hr}-\mathrm{Hlr}}{\mathrm{Hr}}}, \end{aligned}$$
where Ml is the cluster distribution of a randomly selected object from the clustering result K, and Mr is the cluster distribution of a randomly selected object from the actual labels C.
Homogeneity means that each cluster contains only members of a single class, and Completeness means that all members of a given class are assigned to the same cluster. V-measure is the harmonic mean of Homogeneity and Completeness. The formulas are as follows:
$$\begin{aligned} \mathrm{homogeneity} = \frac{\mathrm{Hl}+\mathrm{Hr}-\mathrm{Hlr}}{\mathrm{Hl}},\quad \mathrm{completeness} = \frac{\mathrm{Hl}+\mathrm{Hr}-\mathrm{Hlr}}{\mathrm{Hr}}, \end{aligned}$$
and
$$\begin{aligned} \mathrm {V\text {-}measure} = \frac{2\times \mathrm{homogeneity}\times \mathrm{completeness}}{\mathrm{completeness}+\mathrm{homogeneity}}. \end{aligned}$$
The DBI index is usually below 1, and the lower it is, the better the performance. SIL and RI values are at most 1; the closer they are to 1, the better the clustering performance. The larger the Accuracy, NMI, Homogeneity, Completeness and V-measure, the closer the clustering results are to the truth. By analyzing these evaluation indexes, we can clearly compare the clustering performance of the new algorithm.
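All of these indexes are available in scikit-learn, so the evaluation can be reproduced with a small helper such as the sketch below (`y_true` are the real labels and `y_pred` the cluster labels; the cluster-to-class matching needed for Accuracy is omitted for brevity, and at least two distinct clusters are assumed for DBI and SIL).

```python
from sklearn import metrics

def report(X, y_true, y_pred):
    """Internal (DBI, SIL) and external (RI, NMI, ...) clustering indexes."""
    return {
        "DBI": metrics.davies_bouldin_score(X, y_pred),      # lower is better
        "SIL": metrics.silhouette_score(X, y_pred),          # closer to 1 is better
        "RI": metrics.rand_score(y_true, y_pred),
        "NMI": metrics.normalized_mutual_info_score(y_true, y_pred),
        "Homogeneity": metrics.homogeneity_score(y_true, y_pred),
        "Completeness": metrics.completeness_score(y_true, y_pred),
        "V-measure": metrics.v_measure_score(y_true, y_pred),
    }
```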

6.3 Experiment settings

Table 6
The range of parameters for the investigated datasets

| Dataset | EPS | MinPts |
|---|---|---|
| Aggregation | [1.0, 3.0] | [2, 25] |
| Compound | [1.2, 2.8] | [2, 15] |
| Jain | [2.4, 3.3] | [4, 16] |
| Iris | [0.5, 1.8] | [0, 30] |
| Spiral | [1.1, 4.0] | [0, 10] |
| Pathbased | [1.6, 2.1] | [2, 6] |
| Wpbc | [0.5, 1.2] | [2, 12] |
| Synthesis | [0.5, 6.6] | [2, 10] |
| R15 | [0.5, 1.2] | [0, 20] |
| Vehicle | [0.3, 1.1] | [3, 12] |
DBSCAN [44] requires two parameters to be selected for clustering; changing the values of EPS and MinPts yields different clustering results. We first ran with a broad range for the two parameters, EPS in 0-20 and MinPts in 0-40, to find appropriate clustering results, and then adjusted the ranges manually. The resulting ranges are shown in Table 6. By comparing the results, we obtained a more precise range for each dataset, which is used in the following experiments. We keep EPS to one decimal place and round MinPts down. The OBLAOA-DBSCAN algorithm optimizes these two parameters in the experiments. First, we compare the optimization algorithms: the results of OBLAOA are compared with the Arithmetic Optimization Algorithm (AOA), Whale Optimization Algorithm (WOA) [45], Salp Swarm Algorithm (SSA) [46], Weighted Salp Swarm Algorithm (WSSA) [47], Exponential Neighborhood Grey Wolf Optimization (ENGWO) [48], developed Arithmetic Optimization Algorithm (dAOA) [49] and improved Arithmetic Optimization Algorithm (IAOA) [50]. Second, we compare OBLAOA-DBSCAN with five classical clustering algorithms: K-means [51], Spectral [52], OPTICS [53], clustering by fast search and find of density peaks (DPC) [54] and the original DBSCAN.
To compare the algorithms conveniently and clearly, we set the test parameters as follows. The maximum number of iterations and the population size of all algorithms are set to 100 and 20, respectively. In addition, we run each algorithm 20 times and take the average result to eliminate experimental error. The experiments were run in MATLAB 2017b.
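Putting the pieces together, a single experimental run under these settings would look like the sketch below, reusing the illustrative `oblaoa` and `clustering_fitness` helpers from Sects. 3 and 4; the synthetic `X` is only a stand-in for a real dataset such as Aggregation, whose Table 6 bounds are used.

```python
import numpy as np

X = np.random.rand(200, 2) * 30                 # stand-in for a real dataset
lb = np.array([1.0, 2.0])                       # Aggregation bounds from Table 6
ub = np.array([3.0, 25.0])
best, fit = oblaoa(lambda p: clustering_fitness(p, X), lb, ub,
                   n_pop=20, m_iter=100)        # settings of Sect. 6.3
eps, min_pts = round(float(best[0]), 1), int(best[1])
print(f"EPS={eps}, MinPts={min_pts}, fitness={fit:.4f}")
```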

6.4 Experimental results of the optimization algorithm

In this part, we compare our improved optimization algorithm OBLAOA with seven other meta-heuristic optimization algorithms. We take the Euclidean distance as the fitness function and obtain its convergence curves. Tables 7, 8, 9 and 10 report the error indexes of the different algorithms, with the better indexes in bold. Figure 6 shows the convergence curves for six datasets; those for the remaining datasets are in Fig. 9 in the Appendix.
Table 7
The evaluation indexes of datasets in DBSCAN optimized by different meta-heuristic algorithms I

| Dataset | Algorithm | Accuracy | DBI | RI | SIL |
|---|---|---|---|---|---|
| Aggregation | WOA-DBSCAN | 0.9949 | 0.3651 | 1 | 0.6813 |
| | SSA-DBSCAN | 0.9949 | 0.3651 | 1 | 0.6813 |
| | WSSA-DBSCAN | 0.9949 | 0.3651 | 1 | 0.6813 |
| | ENGWO-DBSCAN | 0.9949 | 0.3651 | 1 | 0.6813 |
| | AOA-DBSCAN | 0.9949 | 0.3651 | 1 | 0.6813 |
| | dAOA-DBSCAN | 0.9949 | 0.3651 | 1 | 0.6813 |
| | IAOA-DBSCAN | 0.9949 | 0.3651 | 1 | 0.6813 |
| | OBLAOA-DBSCAN | 0.9949 | 0.3651 | 1 | 0.6813 |
| Compound | WOA-DBSCAN | 0.7443 | 0.4221 | 0.8916 | 0.6488 |
| | SSA-DBSCAN | 0.7757 | 0.4334 | 0.9083 | 0.5890 |
| | WSSA-DBSCAN | 0.7453 | 0.4218 | 0.8922 | 0.6411 |
| | ENGWO-DBSCAN | 0.8321 | 1.0888 | 0.9324 | 0.1298 |
| | AOA-DBSCAN | 0.7966 | 1.0815 | 0.9171 | 0.1041 |
| | dAOA-DBSCAN | 0.8375 | 1.0888 | 0.9347 | 0.1341 |
| | IAOA-DBSCAN | 0.7966 | 1.0815 | 0.9171 | 0.1041 |
| | OBLAOA-DBSCAN | 0.8538 | 1.0941 | 0.9415 | 0.1488 |
| Jain | WOA-DBSCAN | 0.6728 | 0.4920 | 0.8517 | 0.4272 |
| | SSA-DBSCAN | 0.6518 | 0.4828 | 0.8427 | 0.4047 |
| | WSSA-DBSCAN | 0.6518 | 0.4828 | 0.8427 | 0.4047 |
| | ENGWO-DBSCAN | 0.6728 | 0.4920 | 0.8517 | 0.4272 |
| | AOA-DBSCAN | 0.6834 | 0.4999 | 0.8562 | 0.4355 |
| | dAOA-DBSCAN | 0.6834 | 0.4999 | 0.8562 | 0.4355 |
| | IAOA-DBSCAN | 0.7151 | 0.5037 | 0.8700 | 0.4064 |
| | OBLAOA-DBSCAN | 0.7151 | 0.5037 | 0.8700 | 0.4064 |
| Iris | WOA-DBSCAN | 0.9800 | 0.3773 | 0.9911 | 0.6642 |
| | SSA-DBSCAN | 0.9400 | 0.3760 | 0.9740 | 0.7148 |
| | WSSA-DBSCAN | 0.9600 | 0.3765 | 0.9825 | 0.7272 |
| | ENGWO-DBSCAN | 1 | 0.3654 | 1 | 0.7478 |
| | AOA-DBSCAN | 0.9400 | 0.3760 | 0.9740 | 0.7148 |
| | dAOA-DBSCAN | 0.9400 | 0.3760 | 0.9740 | 0.7148 |
| | IAOA-DBSCAN | 0.9600 | 0.3765 | 0.9825 | 0.7272 |
| | OBLAOA-DBSCAN | 1 | 0.3654 | 1 | 0.7478 |
| Spiral | WOA-DBSCAN | 1 | 2.2296 | 1 | -0.1081 |
| | SSA-DBSCAN | 1 | 2.2296 | 1 | -0.1081 |
| | WSSA-DBSCAN | 1 | 2.2296 | 1 | -0.1081 |
| | ENGWO-DBSCAN | 1 | 2.2296 | 1 | -0.1081 |
| | AOA-DBSCAN | 1 | 2.2296 | 1 | -0.1081 |
| | dAOA-DBSCAN | 1 | 2.2296 | 1 | -0.1081 |
| | IAOA-DBSCAN | 1 | 2.2296 | 1 | -0.1081 |
| | OBLAOA-DBSCAN | 1 | 2.2296 | 1 | -0.1081 |
Table 8
The evaluation indexes of datasets in DBSCAN optimized by different meta-heuristic algorithms II

| Dataset | Algorithm | Accuracy | DBI | RI | SIL |
|---|---|---|---|---|---|
| Pathbased | WOA-DBSCAN | 0.8100 | 0.9482 | 0.8129 | 0.4349 |
| | SSA-DBSCAN | 0.8100 | 0.9482 | 0.8129 | 0.4349 |
| | WSSA-DBSCAN | 0.8100 | 0.9482 | 0.8129 | 0.4349 |
| | ENGWO-DBSCAN | 0.8100 | 0.9482 | 0.8129 | 0.4349 |
| | AOA-DBSCAN | 0.8233 | 0.9482 | 0.8158 | 0.4515 |
| | dAOA-DBSCAN | 0.8233 | 0.9482 | 0.8158 | 0.4515 |
| | IAOA-DBSCAN | 0.8233 | 0.9482 | 0.8158 | 0.4515 |
| | OBLAOA-DBSCAN | 0.8233 | 0.9482 | 0.8158 | 0.4515 |
| Wpbc | WOA-DBSCAN | 0.1314 | 0.3792 | 0.6699 | 0.0751 |
| | SSA-DBSCAN | 0.2595 | 0.4859 | 0.7075 | 0.1425 |
| | WSSA-DBSCAN | 0.2595 | 0.4859 | 0.7075 | 0.1425 |
| | ENGWO-DBSCAN | 0.8913 | 0.7829 | 0.9505 | 0.3282 |
| | AOA-DBSCAN | 0.8270 | 0.7552 | 0.9221 | 0.2937 |
| | dAOA-DBSCAN | 0.8270 | 0.7552 | 0.9221 | 0.2937 |
| | IAOA-DBSCAN | 0.8913 | 0.7829 | 0.9505 | 0.3282 |
| | OBLAOA-DBSCAN | 0.9346 | 0.8023 | 0.9700 | 0.3447 |
| Synthesis | WOA-DBSCAN | 0.9884 | 0.1926 | 0.9956 | 0.8542 |
| | SSA-DBSCAN | 0.9860 | 0.2152 | 0.9930 | 0.8606 |
| | WSSA-DBSCAN | 0.9884 | 0.1926 | 0.9956 | 0.8542 |
| | ENGWO-DBSCAN | 0.9970 | 0.1791 | 0.9985 | 0.8534 |
| | AOA-DBSCAN | 0.9970 | 0.1791 | 0.9985 | 0.8534 |
| | dAOA-DBSCAN | 0.9970 | 0.1791 | 0.9985 | 0.8534 |
| | IAOA-DBSCAN | 0.9860 | 0.2152 | 0.9930 | 0.8606 |
| | OBLAOA-DBSCAN | 0.9998 | 0.1787 | 0.9999 | 0.8517 |
| R15 | WOA-DBSCAN | 1 | 0.3044 | 1 | 0.8966 |
| | SSA-DBSCAN | 0.9971 | 0.3047 | 0.9986 | 0.8946 |
| | WSSA-DBSCAN | 1 | 0.3044 | 1 | 0.8966 |
| | ENGWO-DBSCAN | 1 | 0.3044 | 1 | 0.8966 |
| | AOA-DBSCAN | 1 | 0.3044 | 1 | 0.8966 |
| | dAOA-DBSCAN | 1 | 0.3044 | 1 | 0.8966 |
| | IAOA-DBSCAN | 1 | 0.3044 | 1 | 0.8966 |
| | OBLAOA-DBSCAN | 1 | 0.3044 | 1 | 0.8966 |
| Vehicle | WOA-DBSCAN | 0.4823 | 1.5198 | 0.9605 | 0.1637 |
| | SSA-DBSCAN | 0.9561 | 1.5527 | 0.9835 | 0.1772 |
| | WSSA-DBSCAN | 0.9561 | 1.5527 | 0.9835 | 0.1772 |
| | ENGWO-DBSCAN | 0.9624 | 1.5534 | 0.9859 | 0.1790 |
| | AOA-DBSCAN | 0.9561 | 1.5527 | 0.9835 | 0.1772 |
| | dAOA-DBSCAN | 0.9248 | 1.5301 | 0.9718 | 0.1701 |
| | IAOA-DBSCAN | 0.9624 | 1.5534 | 0.9859 | 0.1790 |
| | OBLAOA-DBSCAN | 0.9656 | 1.5541 | 0.9871 | 0.1808 |
Table 9
The evaluation indexes of datasets in DBSCAN optimized by different meta-heuristic algorithms III

| Dataset | Algorithm | NMI | Homogeneity | Completeness | Vmeasure |
|---|---|---|---|---|---|
| Aggregation | WOA-DBSCAN | 1 | 1 | 1 | 1 |
| | SSA-DBSCAN | 1 | 1 | 1 | 1 |
| | WSSA-DBSCAN | 1 | 1 | 1 | 1 |
| | ENGWO-DBSCAN | 1 | 1 | 1 | 1 |
| | AOA-DBSCAN | 1 | 1 | 1 | 1 |
| | dAOA-DBSCAN | 1 | 1 | 1 | 1 |
| | IAOA-DBSCAN | 1 | 1 | 1 | 1 |
| | OBLAOA-DBSCAN | 1 | 1 | 1 | 1 |
| Compound | WOA-DBSCAN | 0.8073 | 0.9531 | 0.6838 | 0.7963 |
| | SSA-DBSCAN | 0.8021 | 0.8928 | 0.7207 | 0.7976 |
| | WSSA-DBSCAN | 0.8061 | 0.9478 | 0.6856 | 0.7956 |
| | ENGWO-DBSCAN | 0.8736 | 0.9434 | 0.8089 | 0.8710 |
| | AOA-DBSCAN | 0.8364 | 0.9118 | 0.7672 | 0.8333 |
| | dAOA-DBSCAN | 0.8796 | 0.9488 | 0.8154 | 0.8770 |
| | IAOA-DBSCAN | 0.8364 | 0.9118 | 0.7672 | 0.8333 |
| | OBLAOA-DBSCAN | 0.9049 | 0.9729 | 0.8417 | 0.9026 |
| Jain | WOA-DBSCAN | 0.5987 | 0.6584 | 0.5409 | 0.5939 |
| | SSA-DBSCAN | 0.5781 | 0.6435 | 0.5194 | 0.5748 |
| | WSSA-DBSCAN | 0.5781 | 0.6435 | 0.5194 | 0.5748 |
| | ENGWO-DBSCAN | 0.5987 | 0.6584 | 0.5409 | 0.5939 |
| | AOA-DBSCAN | 0.6062 | 0.6660 | 0.5518 | 0.6035 |
| | dAOA-DBSCAN | 0.6062 | 0.6660 | 0.5518 | 0.6035 |
| | IAOA-DBSCAN | 0.6353 | 0.6894 | 0.5855 | 0.6332 |
| | OBLAOA-DBSCAN | 0.6353 | 0.6894 | 0.5855 | 0.6332 |
| Iris | WOA-DBSCAN | 0.9702 | 0.9703 | 0.9701 | 0.9702 |
| | SSA-DBSCAN | 0.9306 | 0.9311 | 0.9300 | 0.9306 |
| | WSSA-DBSCAN | 0.9488 | 0.9490 | 0.9486 | 0.9488 |
| | ENGWO-DBSCAN | 1 | 1 | 1 | 1 |
| | AOA-DBSCAN | 0.9306 | 0.9311 | 0.9300 | 0.9306 |
| | dAOA-DBSCAN | 0.9306 | 0.9311 | 0.9300 | 0.9306 |
| | IAOA-DBSCAN | 0.9488 | 0.9490 | 0.9486 | 0.9488 |
| | OBLAOA-DBSCAN | 1 | 1 | 1 | 1 |
| Spiral | WOA-DBSCAN | 1 | 1 | 1 | 1 |
| | SSA-DBSCAN | 1 | 1 | 1 | 1 |
| | WSSA-DBSCAN | 1 | 1 | 1 | 1 |
| | ENGWO-DBSCAN | 1 | 1 | 1 | 1 |
| | AOA-DBSCAN | 1 | 1 | 1 | 1 |
| | dAOA-DBSCAN | 1 | 1 | 1 | 1 |
| | IAOA-DBSCAN | 1 | 1 | 1 | 1 |
| | OBLAOA-DBSCAN | 1 | 1 | 1 | 1 |
Table 10
The evaluation indexes of datasets in DBSCAN optimized by different meta-heuristic algorithms IV

| Dataset | Algorithm | NMI | Homogeneity | Completeness | Vmeasure |
|---|---|---|---|---|---|
| Pathbased | WOA-DBSCAN | 0.6907 | 0.7114 | 0.6706 | 0.6904 |
| | SSA-DBSCAN | 0.6907 | 0.7114 | 0.6706 | 0.6904 |
| | WSSA-DBSCAN | 0.6907 | 0.7114 | 0.6706 | 0.6904 |
| | ENGWO-DBSCAN | 0.6907 | 0.7114 | 0.6706 | 0.6904 |
| | AOA-DBSCAN | 0.7012 | 0.7226 | 0.6804 | 0.7009 |
| | dAOA-DBSCAN | 0.7012 | 0.7226 | 0.6804 | 0.7009 |
| | IAOA-DBSCAN | 0.7012 | 0.7226 | 0.6804 | 0.7009 |
| | OBLAOA-DBSCAN | 0.7012 | 0.7226 | 0.6804 | 0.7009 |
| Wpbc | WOA-DBSCAN | 0.1655 | 0.3324 | 0.0824 | 0.1320 |
| | SSA-DBSCAN | 0.2649 | 0.4102 | 0.1711 | 0.2415 |
| | WSSA-DBSCAN | 0.2649 | 0.4102 | 0.1711 | 0.2415 |
| | ENGWO-DBSCAN | 0.8199 | 0.8443 | 0.7961 | 0.8195 |
| | AOA-DBSCAN | 0.7438 | 0.7817 | 0.7078 | 0.7429 |
| | dAOA-DBSCAN | 0.7438 | 0.7817 | 0.7078 | 0.7429 |
| | IAOA-DBSCAN | 0.8199 | 0.8443 | 0.7961 | 0.8195 |
| | OBLAOA-DBSCAN | 0.8786 | 0.8936 | 0.8637 | 0.8784 |
| Synthesis | WOA-DBSCAN | 0.9674 | 0.9820 | 0.9530 | 0.9673 |
| | SSA-DBSCAN | 0.9505 | 0.9809 | 0.9386 | 0.9593 |
| | WSSA-DBSCAN | 0.9674 | 0.9820 | 0.9530 | 0.9673 |
| | ENGWO-DBSCAN | 0.9836 | 0.9914 | 0.9758 | 0.9835 |
| | AOA-DBSCAN | 0.9836 | 0.9914 | 0.9758 | 0.9835 |
| | dAOA-DBSCAN | 0.9836 | 0.9914 | 0.9758 | 0.9835 |
| | IAOA-DBSCAN | 0.9505 | 0.9809 | 0.9386 | 0.9593 |
| | OBLAOA-DBSCAN | 0.9934 | 0.9980 | 0.9888 | 0.9934 |
| R15 | WOA-DBSCAN | 1 | 1 | 1 | 1 |
| | SSA-DBSCAN | 0.9937 | 0.9937 | 0.9937 | 0.9937 |
| | WSSA-DBSCAN | 1 | 1 | 1 | 1 |
| | ENGWO-DBSCAN | 1 | 1 | 1 | 1 |
| | AOA-DBSCAN | 1 | 1 | 1 | 1 |
| | dAOA-DBSCAN | 1 | 1 | 1 | 1 |
| | IAOA-DBSCAN | 1 | 1 | 1 | 1 |
| | OBLAOA-DBSCAN | 1 | 1 | 1 | 1 |
| Vehicle | WOA-DBSCAN | 0.8938 | 0.8953 | 0.8924 | 0.8938 |
| | SSA-DBSCAN | 0.9436 | 0.9439 | 0.9432 | 0.9436 |
| | WSSA-DBSCAN | 0.9436 | 0.9439 | 0.9432 | 0.9436 |
| | ENGWO-DBSCAN | 0.9500 | 0.9503 | 0.9497 | 0.9500 |
| | AOA-DBSCAN | 0.9436 | 0.9439 | 0.9432 | 0.9436 |
| | dAOA-DBSCAN | 0.9161 | 0.9169 | 0.9154 | 0.9161 |
| | IAOA-DBSCAN | 0.9500 | 0.9503 | 0.9497 | 0.9500 |
| | OBLAOA-DBSCAN | 0.9535 | 0.9538 | 0.9533 | 0.9585 |
The experiments show that our OBLAOA algorithm is better than the original AOA algorithm and is the best among the eight optimization algorithms when applied to DBSCAN. We present this through the convergence curves and the error indexes. The convergence curves in Fig. 6 show that our optimization algorithm achieves a better fitness value and convergence rate: on all datasets, the fitness of OBLAOA is better than that of the original AOA and the other optimization algorithms, and its convergence accuracy and speed exceed those of AOA. On the Aggregation, Jain and Synthesis datasets, all algorithms converge more slowly as the function approaches convergence, and AOA sometimes falls into a local optimum. However, because the OBL strategy strengthens the search, OBLAOA can still update the optimal solution.
According to the error indexes in Tables 7, 8, 9 and 10, our OBLAOA algorithm performs better than the other optimization algorithms. On the Compound, Jain, Iris, Wpbc, Synthesis and Vehicle datasets, OBLAOA-DBSCAN is clearly more accurate: its DBI index is smaller than the others', and its SIL, RI, NMI, Homogeneity, Completeness and V-measure indexes are larger. Its accuracy is the best of the eight algorithms: 0.8538 on Compound, 0.7151 on Jain, 1 on Iris, 0.9346 on Wpbc, 0.9998 on Synthesis and 0.9656 on Vehicle. Although some indexes are tied, OBLAOA is better overall. On the four datasets Aggregation, Spiral, Pathbased and R15, the accuracy and evaluation indexes of the different algorithms are similar. Overall, however, OBLAOA analyzes clustering problems better than the original AOA and the six other meta-heuristic algorithms, so OBLAOA-DBSCAN benefits the clustering of these datasets.

6.5 Experimental results of clustering algorithm

Table 11
The evaluation indexes of datasets in different clustering algorithms I

| Dataset | Algorithm | Accuracy | DBI | RI | SIL |
|---|---|---|---|---|---|
| Aggregation | K-means | 0.9226 | 0.8668 | 0.5323 | 0.6729 |
| | Spectral | 0.9727 | 0.9870 | 0.3853 | 0.6808 |
| | Optics | 0.5533 | 0.8301 | 0.9274 | 0.4362 |
| | DPC | 0.6586 | 0.3738 | 0.9055 | 0.3023 |
| | DBSCAN | 0.9898 | 0.3657 | 0.9993 | 0.5877 |
| | OBLAOA-DBSCAN | 0.9949 | 0.3651 | 1 | 0.6813 |
| Compound | K-means | 0.5740* | 0.6115 | 0.8234 | 0.5329 |
| | Spectral | 0.5789 | 0.5692 | 0.8462 | 0.606 |
| | Optics | 0.4286 | 0.9662 | 0.9302 | 0.0496 |
| | DPC | 0.2130 | 0.4052 | 0.8444 | 0.3998 |
| | DBSCAN | 0.8450* | 1.0888 | 0.9347 | 0.1341 |
| | OBLAOA-DBSCAN | 0.8538 | 1.0941 | 0.9415 | 0.1488 |
| Jain | K-means | 0.7748 | 0.3923 | 0.6501 | 0.6722 |
| | Spectral | 0.7105 | 0.3879 | 0.5875 | 0.6466 |
| | Optics | 0.1930 | 0.5667 | 0.6876 | 0.3049 |
| | DPC | 0.6793 | 2.1268 | 0.5624 | 0.0688 |
| | DBSCAN | 0.6834 | 0.4999 | 0.8562 | 0.4355 |
| | OBLAOA-DBSCAN | 0.7151 | 0.5037 | 0.8700 | 0.4064 |
| Iris | K-means | 0.8830* | 0.3960 | 0.8997 | 0.7242 |
| | Spectral | 0.9933 | 0.3824 | 0.9910 | 0.7540 |
| | Optics | 0.6600 | 0.6453 | 0.7719 | 0.6943 |
| | DPC | 0.9067 | 0.3902 | 0.8923 | 0.7023 |
| | DBSCAN | 0.6400* | 0.3654 | 1 | 0.7478 |
| | OBLAOA-DBSCAN | 1 | 0.3654 | 1 | 0.7478 |
| Spiral | K-means | 0.3720 | 0.6149 | 0.5156 | 0.5492 |
| | Spectral | 0.2815 | 0.5707 | 0.5339 | 0.5415 |
| | Optics | 0.9760 | 2.1960 | 0.9888 | -0.0206 |
| | DPC | 1 | 2.2296 | 1 | -0.1081 |
| | DBSCAN | 1 | 2.2296 | 1 | -0.1081 |
| | OBLAOA-DBSCAN | 1 | 2.2296 | 1 | -0.1081 |
| Pathbased | K-means | 0.7500 | 0.4437 | 0.7319 | 0.7482 |
| | Spectral | 0.7500 | 0.4437 | 0.7319 | 0.7482 |
| | Optics | 0.6167 | 1.3392 | 0.7499 | 0.2123 |
| | DPC | 0.6467 | 0.8539 | 0.6769 | 0.4039 |
| | DBSCAN | 0.8100 | 0.9482 | 0.8129 | 0.4349 |
| | OBLAOA-DBSCAN | 0.8233 | 0.9482 | 0.8158 | 0.4515 |
| Wpbc | K-means | 0.6010 | 1 | 0.5180 | 0.2872 |
| | Spectral | 0.6041 | 1.4287 | 0.5201 | 0.0796 |
| | Optics | 0.7020 | 1.6713 | 0.5795 | 0.3222 |
| | DPC | 0.6670 | 0.2529 | 0.6308 | 0.5213 |
| | DBSCAN | 0.8270 | 0.7552 | 0.9221 | 0.2937 |
| | OBLAOA-DBSCAN | 0.9346 | 0.8023 | 0.9700 | 0.3447 |
| Synthesis | K-means | 0.6074 | 0.5452 | 0.7573 | 0.6496 |
| | Spectral | 0.8025 | 0.2735 | 0.9018 | 0.7964 |
| | Optics | 0.9788 | 0.3227 | 0.9989 | 0.8517 |
| | DPC | 0.9829 | 0.3680 | 0.9965 | 0.7789 |
| | DBSCAN | 0.9884 | 0.1926 | 0.9956 | 0.8542 |
| | OBLAOA-DBSCAN | 0.9998 | 0.1787 | 0.9999 | 0.8517 |
| R15 | K-means | 0.8106 | 0.4353 | 0.9598 | 0.6844 |
| | Spectral | 0.8409 | 0.3953 | 0.9659 | 0.7473 |
| | Optics | 0.1202 | 1.0480 | 0.2310 | 0.5570 |
| | DPC | 0.9941 | 0.3500 | 0.9973 | 0.8398 |
| | DBSCAN | 1 | 0.3044 | 1 | 0.8966 |
| | OBLAOA-DBSCAN | 1 | 0.3044 | 1 | 0.8966 |
| Vehicle | K-means | 0.2920 | 1.4350 | 0.6530 | 0.3075 |
| | Spectral | 0.3014 | 0.7864 | 0.6903 | 0.3264 |
| | Optics | 0.2577 | 0.9972 | 0.2625 | 0.2327 |
| | DPC | 0.3262 | 0.8936 | 0.5118 | 0.0943 |
| | DBSCAN | 0.9309 | 1.5369 | 0.9741 | 0.1717 |
| | OBLAOA-DBSCAN | 0.9656 | 1.7856 | 0.9871 | 0.1808 |
Table 12
The evaluation indexes of datasets in different clustering algorithms II

| Dataset | Algorithm | NMI | Homogeneity | Completeness | Vmeasure |
|---|---|---|---|---|---|
| Aggregation | K-means | 0.8509* | 0.7911 | 0.8722 | 0.8297 |
| | Spectral | 0.9927 | 0.9912 | 0.9942 | 0.9927 |
| | Optics | 0.8833 | 0.9766 | 0.7990 | 0.8789 |
| | DPC | 0.9769* | 0.8221 | 0.7950 | 0.8083 |
| | DBSCAN | 0.9960 | 0.9954 | 0.9966 | 0.9960 |
| | OBLAOA-DBSCAN | 1 | 1 | 1 | 1 |
| Compound | K-means | 0.6604 | 0.6351 | 0.6867 | 0.6599 |
| | Spectral | 0.7236 | 0.6890 | 0.7600 | 0.7228 |
| | Optics | 0.8317 | 0.9104 | 0.7598 | 0.8283 |
| | DPC | 0.7597 | 0.7639 | 0.7559 | 0.7597 |
| | DBSCAN | 0.8796 | 0.9488 | 0.8154 | 0.8770 |
| | OBLAOA-DBSCAN | 0.9049 | 0.9729 | 0.8417 | 0.9026 |
| Jain | K-means | 0.3690* | 0.3375 | 0.4070 | 0.3690 |
| | Spectral | 0.3072 | 0.2804 | 0.3367 | 0.3060 |
| | Optics | 0.2216 | 0.3209 | 0.1530 | 0.2072 |
| | DPC | 0.6183* | 0.1048 | 0.1264 | 0.1146 |
| | DBSCAN | 0.6062 | 0.6660 | 0.5518 | 0.6035 |
| | OBLAOA-DBSCAN | 0.6353 | 0.6894 | 0.5855 | 0.6332 |
| Iris | K-means | 0.7766* | 0.7650 | 0.7515 | 0.7582 |
| | Spectral | 0.9702 | 0.9703 | 0.9701 | 0.9702 |
| | Optics | 0.7220 | 0.9063 | 0.5752 | 0.7037 |
| | DPC | 0.8350* | 0.7960 | 0.8156 | 0.8057 |
| | DBSCAN | 1 | 1 | 1 | 1 |
| | OBLAOA-DBSCAN | 1 | 1 | 1 | 1 |
| Spiral | K-means | 0.0636 | 0.0619 | 0.0654 | 0.0636 |
| | Spectral | 0.0586 | 0.0579 | 0.0594 | 0.0586 |
| | Optics | 0.9582 | 0.9597 | 0.9567 | 0.9582 |
| | DPC | 1 | 1 | 1 | 1 |
| | DBSCAN | 1 | 1 | 1 | 1 |
| | OBLAOA-DBSCAN | 1 | 1 | 1 | 1 |
| Pathbased | K-means | 0.5463* | 0.5846 | 0.5140 | 0.5470 |
| | Spectral | 0.5482 | 0.5846 | 0.5140 | 0.5470 |
| | Optics | 0.6398 | 0.7799 | 0.5248 | 0.6274 |
| | DPC | 0.5390* | 0.4845 | 0.3597 | 0.4129 |
| | DBSCAN | 0.6907 | 0.7114 | 0.6706 | 0.6904 |
| | OBLAOA-DBSCAN | 0.7012 | 0.7226 | 0.6804 | 0.7009 |
| Wpbc | K-means | 0.0270 | 0.0241 | 0.0302 | 0.0268 |
| | Spectral | 0.4389 | 0.5421 | 0.3553 | 0.4293 |
| | Optics | 0.0089 | 0.0124 | 0.0064 | 0.0084 |
| | DPC | 0.0104 | 0.0432 | 0.0025 | 0.0047 |
| | DBSCAN | 0.7438 | 0.7817 | 0.7078 | 0.7429 |
| | OBLAOA-DBSCAN | 0.8786 | 0.8936 | 0.8637 | 0.8784 |
| Synthesis | K-means | 0.6313 | 0.8546 | 0.4734 | 0.6309 |
| | Spectral | 0.8209 | 0.7201 | 0.9346 | 0.8140 |
| | Optics | 0.9826 | 0.9801 | 0.9852 | 0.9826 |
| | DPC | 0.9726 | 0.9619 | 0.9835 | 0.9726 |
| | DBSCAN | 0.9674 | 0.9820 | 0.9530 | 0.9673 |
| | OBLAOA-DBSCAN | 0.9934 | 0.9980 | 0.9888 | 0.9934 |
| R15 | K-means | 0.9942* | 0.8839 | 0.9182 | 0.9007 |
| | Spectral | 0.9441 | 0.9225 | 0.9630 | 0.9439 |
| | Optics | 0.2991 | 0.1244 | 0.7190 | 0.2122 |
| | DPC | 0.9933* | 0.9874 | 0.9874 | 0.9874 |
| | DBSCAN | 1 | 1 | 1 | 1 |
| | OBLAOA-DBSCAN | 1 | 1 | 1 | 1 |
| Vehicle | K-means | 0.0999 | 0.0997 | 1 | 0.0999 |
| | Spectral | 0.3927 | 0.3489 | 0.4420 | 0.3900 |
| | Optics | 0.0351 | 0.0078 | 0.1585 | 0.0148 |
| | DPC | 0.1106 | 0.0770 | 0.1587 | 0.1037 |
| | DBSCAN | 0.9208 | 0.9215 | 0.9202 | 0.9208 |
| | OBLAOA-DBSCAN | 0.9535 | 0.9538 | 0.9533 | 0.9585 |
The specific clustering results of these datasets are shown in Figs. 7 and 8, which present the results of the K-means, Spectral, Optics, DPC and DBSCAN algorithms and of the best clustering optimization algorithm (OBLAOA-DBSCAN). Each colour in the figures represents one cluster. By comparing the graphs of each cluster, we can make a basic judgment about the clustering effect. Figures 7 and 8 show that OBLAOA-DBSCAN produces better clustering results: it clusters the data into better shapes and finds the actual number of clusters. The graphs of the datasets without illustrations are in Fig. 10 in the Appendix. Tables 11 and 12 report the error indexes of the different clustering algorithms, with the better indexes in bold; values marked with * are taken from articles [55] and [56].
In Fig. 7, compared with K-means, the result on the Aggregation dataset shows that our algorithm yields more reliable clusters: each cluster in the figure is clearly separated, while some clusters from K-means are not. In Fig. 8, compared with Spectral, the result on the Synthesis dataset shows that our algorithm clusters the left side of the graph more accurately, and its clustering of a whole block of data is better than the Spectral algorithm's. From the graphs of Jain, Spiral and Pathbased in Figs. 7 and 8, OBLAOA-DBSCAN is more accurate than K-means and Spectral on circular datasets.
In Fig. 8, the graphs of the Pathbased and R15 datasets show that our algorithm clusters dense data more accurately than Optics. When dealing with discrete data points, Optics marks them as noise, whereas our algorithm handles these points more accurately, as the Synthesis dataset shows. From the Aggregation and Jain datasets in Fig. 7, the DPC algorithm marks boundary points as noise. Therefore, by comparing the cluster graphs, we find that OBLAOA-DBSCAN clusters circular datasets better than Optics and DPC. In addition, OBLAOA-DBSCAN correctly identifies groups of data points in areas of lower local density, as well as edge points, which the original DBSCAN fails to cluster accurately.
From Tables 11 and 12, the Accuracy, RI, SIL, NMI, Homogeneity, Completeness and V-measure indexes of OBLAOA-DBSCAN are significantly higher than those of the K-means, Spectral, Optics, DPC and DBSCAN algorithms, and its DBI index is lower than that of the K-means and Spectral algorithms. Therefore, the improved OBLAOA-DBSCAN clusters the datasets more accurately than the original DBSCAN.
Compared with the indexes reported by other articles in Table 11, our algorithm has better NMI indexes than the K-means and original DBSCAN algorithms: on the Compound dataset, the NMI of OBLAOA-DBSCAN is 48.74% higher than that of K-means and 1.04% higher than that of DBSCAN; on Iris, it is 13.25% higher than K-means and 56.25% higher than DBSCAN. Compared with the indexes in Table 12, our algorithm also has better NMI indexes than the K-means and DPC algorithms: on Aggregation, 17.52% higher than K-means and 2.08% higher than DPC; on Jain, 72.16% higher than K-means and 2.74% higher than DPC; on Pathbased, 28.99% higher than K-means and 16.22% higher than DPC; and on R15, 0.58% higher than K-means and 0.67% higher than DPC.
In Table 11, the DBI and RI indexes on the Spiral and Pathbased datasets are not the best, but the accuracy against the real labels is better. From the figures, we conclude that for circular datasets such as those in Figs. 7 and 8, our DBSCAN algorithm determines the shape of the clusters more accurately and obtains better results. In Table 11, the SIL index takes negative values on the circular Spiral dataset, yet the clustering shapes are more consistent with the real labels. Through the above comparative analysis, OBLAOA-DBSCAN not only optimizes better than the other optimization algorithms but also performs better in clustering analysis than several classical clustering algorithms. In general, we can conclude that OBLAOA-DBSCAN has a very good effect on the clustering of these datasets.

7 Conclusion

In this paper, we have proposed a new clustering algorithm named OBLAOA-DBSCAN. We introduce OBL into the AOA algorithm to develop the OBLAOA optimizer, improving the global search ability and convergence accuracy of standard AOA. We then use OBLAOA to tune the EPS and MinPts parameters of DBSCAN, improving its clustering effect, and obtain the hybrid clustering algorithm OBLAOA-DBSCAN. In our numerical simulations, we demonstrated that the improved OBLAOA is more effective than the original AOA and other currently popular algorithms. We also validated the effectiveness of OBLAOA-DBSCAN on many clustering tasks and found that it achieves accurate and reliable clustering results at lower computational cost.
Although OBLAOA-DBSCAN achieves a significant improvement, some insufficiencies remain: the selection of the best parameters of the optimization algorithm, as well as its global search ability and clustering effect, need further improvement. In the future, we will apply OBLAOA-DBSCAN to clustering problems on more datasets. In addition, OBLAOA can be applied to other clustering-like application problems, such as image classification and recognition, speech signal classification, and electrical information classification, which merit further research.

Acknowledgements

The authors would like to thank the six excellent reviewers for their constructive comments and suggestions, which have led to a much-improved paper. The authors would also like to acknowledge Ms. Xin Jiang and Ms. Xia Lin for their preparation of the original manuscript. This work is supported in part by the National Natural Science Foundation of China under Grant 61873130 and Grant 61833011, in part by the Natural Science Foundation of Jiangsu Province under Grant BK20191377, in part by the 1311 Talent Project of Nanjing University of Posts and Telecommunications, in part by the Natural Science Foundation of Nanjing University of Posts and Telecommunications under Grant NY220194 and Grant NY221082, in part by the Australian Research Council project DP160104292, and in part by the National Natural Science Foundation of China under Grant 62001337.

Declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


Appendix A The additional results for our experiments

See Figs. 9 and 10.