Grid-DPC: Improved density peaks clustering based on spatial grid walk

Liang, Bo; Cai, JiangHui; Yang, HaiFeng

doi:10.1007/s10489-022-03705-y

Grid-DPC: Improved density peaks clustering based on spatial grid walk

Published: 25 May 2022

Volume 53, pages 3221–3239, (2023)
Cite this article

Applied Intelligence Aims and scope Submit manuscript

558 Accesses
2 Citations
1 Altmetric
Explore all metrics

Abstract

Traditional clustering methods need to find the initial centers first. A reasonable cluster center can improve the efficiency and accuracy of the algorithm. However, finding centers is not an easy task. It often needs much calculation and easily falls into local optimal points. In allusion to the problem, an improved density peaks clustering algorithm based on spatial grid walk (Grid-DPC) is proposed. Grid-DPC uses a spatial grid walk method to determine the initial cluster centers, avoiding the traditional method requiring multiple iterations to optimize the centers and preventing falling into the local optimal. In terms of density definition, Grid-DPC uses grid density instead of the traditional method, which reduces the time cost of finding neighbors. In terms of the random walk strategy, the adjacent higher grid weight direction and Lévy Flight based step length method are adopted to improve the convergence speed of the algorithm effectively. Aiming at the phenomenon of “target loss” that may occur in random walks, the variable neighborhood search method is used to help the algorithm find high-weight grids and prolong the number of steps per walk as long as possible. Simulation experiments show that the algorithm can effectively find clustering centers, complete the task of clustering arbitrarily distributed data, and has high efficiency for massive data.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

DPCG: an efficient density peaks clustering algorithm based on grid

Article 26 September 2016

An Efficient Clustering Algorithm Based on Grid Density and its Application in Human Mobility Analysis

Density Peak Clustering Based on Cumulative Nearest Neighbors Degree and Micro Cluster Merging

Article 02 August 2019

References

Dogan A, Birant D (2021) Machine learning and data mining in manufacturing. Expert Syst Appl 166:114060. https://doi.org/10.1016/j.eswa.2020.114060
Article Google Scholar
Maia J, Severiano C, Guimarães F, Castro C, Lemos A, Galindo J, Weiss M (2020) Evolving clustering algorithm based on mixture of typicalities for stream data mining. Future Generation Computer Systems, 106. https://doi.org/10.1016/j.future.2020.01.017
Adolfsson A, Ackerman M, Brownstein NC (2019) To cluster, or not to cluster: an analysis of clusterability methods. Pattern Recogn 88:13–26. https://doi.org/10.1016/j.patcog.2018.10.026
Article Google Scholar
Chauhan R, Batra P, Chaudhary S (2014) A survey of density based clustering algorithms. www.ijcst.com 5. https://doi.org/10.1007/s11704-019-9059-
Zaman M, Hassan A (2019) Improved statistical features-based control chart patterns recognition using anfis with fuzzy clustering. Neural Comput Appl 31:5935–5949. https://doi.org/10.1007/s00521-018-3388-2
Article Google Scholar
Chen L (2019) Opportunities and challenges for resource management and machine learning clusters, pp 165–166. https://doi.org/10.1145/3368235.3369376
Curiel A, Gutiérrez-Soto C, Soto-Borquez P-N, Galdames P (2020) Measuring the effects of summarization in cluster-based information retrieval, 1–8. https://doi.org/10.1109/SCCC51225.2020.9281189
Gao X, Shen W, Zhang L, Hu J, Fortin N, Frostig R, Ombao H (2020) Regularized matrix data clustering and its application to image analysis. Biometrics 01:1–13. https://doi.org/10.1111/biom.13354
Google Scholar
Saha J, Mukherjee J (2021) Cnak: Cluster number assisted k-means. Pattern Recogn, 110. https://doi.org/10.1016/j.patcog.2020.107625
Song K, Yao X, Nie F, Li X, Xu M (2021) Weighted bilateral k-means algorithm for fast co-clustering and fast spectral clustering. Pattern Recogn, 109. https://doi.org/10.1016/j.patcog.2020.107560
Pan Pin-chen LY-J, Jiang H (2019) Initial center optimization method of k-means algorithm within non-independent and identi cally distribution context. J Chin Comput Syst 40:1254–1259
Google Scholar
Chen Guang-ping HJ, Wang W-P (2012) Improved initial clustering center selection method for k-means algorithm. J Chin Comput Syst 33:1320–1323
Google Scholar
Dinh T, Huynh V-N (2020) K-pbc: an improved cluster center initialization for categorical data clustering. Appl Intell, 2610–2632. https://doi.org/10.1007/s10489-020-01677-5
Liu Y, Jiang Y, Hou T, Liu F (2021) A new robust fuzzy clustering validity index for imbalanced data sets. Inf Sci 547:579–591. https://doi.org/10.1016/j.ins.2020.08.041
Article MathSciNet MATH Google Scholar
Bouguettaya A, Yu Q, Liu X, Zhou X, Song A (2015) Efficient agglomerative hierarchical clustering. Expert Syst Appl 42(5):2785–2797. https://doi.org/10.1016/j.eswa.2014.09.054
Article Google Scholar
Ros F, Guillaume S, El Hajji M, Riad R (2020) Kdmutual: a novel clustering algorithm combining mutual neighboring and hierarchical approaches using a new selection criterion. Knowl-Based Syst, 204. https://doi.org/10.1016/j.knosys.2020.106220
Xu Q, Zhang Q, Liu J, Luo B (2020) Efficient synthetical clustering validity indexes for hierarchical clustering. Expert Syst Appl, 151. https://doi.org/10.1016/j.eswa.2020.113367
Shi P, Zhao Z, Zhong H, Shen H, Ding L (2020) An improved agglomerative hierarchical clustering anomaly detection method for scientific data. Concurrency and Computation: Practice and Experience, 33. https://doi.org/10.1002/cpe.6077
Pasupathi S, Shanmuganthan V, Madasamy K, Robinson H, Kim M (2021) Trend analysis using agglomerative hierarchical clustering approach for time series big data. J Supercomput 77:1–20. https://doi.org/10.1007/s11227-020-03580-9
Article Google Scholar
Cirrincione G, Ciravegna G, Barbiero P, Randazzo V, Pasero E (2020) The gh-exin neural network for hierarchical clustering. Neural Netw 121:57–73. https://doi.org/10.1016/j.neunet.2019.07.018
Article Google Scholar
Wang X, Wang X, Wilkes M (2020) A fast distance-based outlier detection technique using a divisive hierarchical clustering algorithm, pp 39–69. https://doi.org/10.1007/978-981-15-9519-6_3
Yamada Y, Masuyama N, Amako N, Nojima Y, Chu Kiong L, Ishibuchi H (2020) Divisive hierarchical clustering based on adaptive resonance theory, 1–6. https://doi.org/10.1109/CcS49175.2020.9231474
Zhang T, Ramakrishnan R, Livny M (1996) Birch: an efficient data clustering method for very large databases. SIGMOD Rec 25(2):103–114. https://doi.org/10.1145/235968.233324
Article Google Scholar
Guha S, Rastogi R, Shim K (1998) Cure: an efficient clustering algorithm for large databases. Inf Syst 26:35–58. https://doi.org/10.1016/S0306-4379(01)00008-4
Article MATH Google Scholar
Guha S, Rastogi R, Shim K (2002) Rock: a robust clustering algorithm for categorical attributes. Inf Syst 25(5):345–366. https://doi.org/10.1016/S0306-4379(00)00022-3
Article Google Scholar
Shi J, Zhu Q, Li J (2021) A novel hierarchical clustering algorithm with merging strategy based on shared subordinates. Applied Intelligence. https://doi.org/10.1007/s10489-021-02830-4
Alex R, Alessandro L (2014) Clustering by fast search and find of density peaks. Science 344:1492–1496. https://doi.org/10.1126/science.1242072
Article Google Scholar
Wang S, Li Q, Zhao C, Zhu X, Yuan H, Dai T (2021) Extreme clustering – a clustering method via density extreme points. Inf Sci 542:24–39. https://doi.org/10.1016/j.ins.2020.06.069
Article MathSciNet MATH Google Scholar
Hou J, Zhang A, Qi N (2020) Density peak clustering based on relative density relationship. Pattern Recognition 108:107554. https://doi.org/10.1016/j.patcog.2020.107554
Article Google Scholar
Hamidzadeh J, Monsefi R, Sadoghi Yazdi H (2015) Irahc: Instance reduction algorithm using hyperrectangle clustering. Pattern Recogn 48(5):1878–1889. https://doi.org/10.1016/j.patcog.2014.11.005
Article MATH Google Scholar
Flores KG, Garza SE (2020) Density peaks clustering with gap-based automatic center detection. Knowl-Based Syst, 206. https://doi.org/10.1016/j.knosys.2020.106350
Dogan A, Birant D (2021) K-centroid link: a novel hierarchical clustering linkage method. Applied Intelligence. https://doi.org/10.1007/s10489-021-02624-8
Skurichina M, Duin R (2002) Limited bagging, boosting and the random subspace method for linear classifiers. Pattern Anal Applic, 5. https://doi.org/10.1007/s100440200011
Yu Q (2017) Optimization of initial clustering centers selection method for k-means algorithm. Comput Syst Appl 26:170–174. https://doi.org/10.15888/j.cnki.csa.005733
Google Scholar
Zhang S, Zhao H (2017) Algorithm research of optimal cluster number and initial cluster center. Appl Res Comput 34:1617–1620. https://doi.org/10.3969/j.issn.1001-3695.2017.06.004
Google Scholar
Chen Y, Hu X, Fan W, Shen L, Zhang Z, Liu X, Du J, Li H, Chen Y, Li H (2020) Fast density peak clustering for large scale data based on knn. Knowledge-Based Systems 187:104824. https://doi.org/10.1016/j.knosys.2019.06.032
Article Google Scholar
Ullah I, Youn H, Han Y-H (2021) An efficient data aggregation and outlier detection scheme based on radial basis function neural network for wsn. Journal of Ambient Intelligence and Humanized Computing. https://doi.org/10.1007/s12652-020-02703-7
Paikray H, Das P, Panda S (2021) Optimal multi-robot path planning using particle swarm optimization algorithm improved by sine and cosine algorithms. Arabian Journal for Science and Engineering. https://doi.org/10.1007/s13369-020-05046-9
Surono S, Putri R (2020) Optimization of fuzzy c-means clustering algorithm with combination of minkowski and chebyshev distance using principal component analysis. International Journal of Fuzzy Systems. https://doi.org/10.1007/s40815-020-00997-5
Pearson K (1905) The problem of the random walk. Nature 72:1865
Article Google Scholar
Liting G (2018) Improved wolf pack algorithm based on adaptive step length and adjustable scouting direction. Journal of Zhejiang University (Science Edition) 45:284–293. https://doi.org/10.3785/j.issn.1008-9497.2018.03.003
Google Scholar
Mendonça JRG (2020) A numerical investigation into the scaling behavior of the longest increasing subsequences of the symmetric ultra-fat tailed random walk. Phys Lett A 384(29):126753. https://doi.org/10.1016/j.physleta.2020.126753
Article MathSciNet Google Scholar
Abdel-Basset M, Hessin A-N, Abdel-Fatah L (2018) A comprehensive study of cuckoo-inspired algorithms. Neural Comput Appl 29(2):345–361. https://doi.org/10.1007/s00521-016-2464-8
Article Google Scholar
Schelling B, Plant C (2016) Clustering with the levy walk: hunting for clusters. In: 2016 IEEE 16Th international conference on data mining workshops (ICDMW), pp 1251–1260. https://doi.org/10.1109/ICDMW.2016.0180
kelidari M, Hamidzadeh J (2021) Feature selection by using chaotic cuckoo optimization algorithm with levy flight, opposition-based learning and disruption operator. Soft Comput 25:1–23. https://doi.org/10.1007/s00500-020-05349-x
Article Google Scholar
Amiri E, Mahmoudi S (2016) Efficient protocol for data clustering by fuzzy cuckoo optimization algorithm. Appl Soft Comput 41:15–21. https://doi.org/10.1016/j.asoc.2015.12.00
Article Google Scholar
Boushaki SI, Kamel N, Bendjeghaba O (2018) A new quantum chaotic cuckoo search algorithm for data clustering. Expert Syst Appl 96:358–372. https://doi.org/10.1016/j.eswa.2017.12.001
Article MATH Google Scholar
Niu X, Zheng Y, Fournier Viger P, Wang B (2021) Parallel grid-based density peak clustering of big trajectory data. Applied Intelligence. https://doi.org/10.1007/s10489-021-02757-w
Mantegna R (1994) Accurate algorithm for numerical simulation of levy stable stochastic processes. Physical Review E, Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics 49:4677–4683
Article Google Scholar
Zhang G, Zhang C, Zhang H (2018) Improved k-means algorithm based on density canopy. Knowledge-Based Systems 145:289–297. https://doi.org/10.1016/j.knosys.2018.01.031
Article Google Scholar
Yuan X-T, Hu B-G, He R (2012) Agglomerative mean-shift clustering. IEEE Trans Knowl Data Eng 24(2):209–219. https://doi.org/10.1109/TKDE.2010.232
Article Google Scholar
de Amorim RC, Hennig C (2015) Recovering the number of clusters in data sets with noise features using feature rescaling factors. Inf Sci 324:126–145. https://doi.org/10.1016/j.ins.2015.06.039
Article MathSciNet Google Scholar

Download references

Acknowledgements

The work is supported by the National Natural Science Foundation of China [Grant Nos. U1931209], Key Research and Development Projects of Shanxi Province [Grant Nos. 201903D121116], Central Government Guides Local Science and Technology Development Funds [Grant No. 20201070], Fundamental Research Program of Shanxi Province [Grant Nos. 20210302123223, 202103021224275], and the Graduate Education Innovation Project of Shanxi Province [Grant No.2019BY117].

Author information

Authors and Affiliations

School of Computer Science and Technology, Taiyuan University of Science and Technology, No.66 WaLiu Road, Taiyuan, 030024, ShanXi, China
Bo Liang, JiangHui Cai & HaiFeng Yang
School of Computer Science and Technology, Taiyuan Normal University, No.319 University Road, YuCi, 030619, ShanXi, China
Bo Liang
North University of China, No. 3 XueYuan Road, Taiyuan, 030051, ShanXi, China
JiangHui Cai

Authors

Bo Liang
View author publications
You can also search for this author in PubMed Google Scholar
JiangHui Cai
View author publications
You can also search for this author in PubMed Google Scholar
HaiFeng Yang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to JiangHui Cai.

Ethics declarations

Conflict of Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Liang, B., Cai, J. & Yang, H. Grid-DPC: Improved density peaks clustering based on spatial grid walk. Appl Intell 53, 3221–3239 (2023). https://doi.org/10.1007/s10489-022-03705-y

Download citation

Accepted: 29 April 2022
Published: 25 May 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s10489-022-03705-y

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Grid-DPC: Improved density peaks clustering based on spatial grid walk

Abstract

Access this article

Similar content being viewed by others

DPCG: an efficient density peaks clustering algorithm based on grid

An Efficient Clustering Algorithm Based on Grid Density and its Application in Human Mobility Analysis

Density Peak Clustering Based on Cumulative Nearest Neighbors Degree and Micro Cluster Merging

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of Interests

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Grid-DPC: Improved density peaks clustering based on spatial grid walk

Abstract

Access this article

Similar content being viewed by others

DPCG: an efficient density peaks clustering algorithm based on grid

An Efficient Clustering Algorithm Based on Grid Density and its Application in Human Mobility Analysis

Density Peak Clustering Based on Cumulative Nearest Neighbors Degree and Micro Cluster Merging

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of Interests

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation