Skip to main content
Top
Published in: International Journal of Machine Learning and Cybernetics 5/2018

26-09-2016 | Original Article

DPCG: an efficient density peaks clustering algorithm based on grid

Authors: Xiao Xu, Shifei Ding, Mingjing Du, Yu Xue

Published in: International Journal of Machine Learning and Cybernetics | Issue 5/2018

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

To deal with the complex structure of the data set, density peaks clustering algorithm (DPC) was proposed in 2014. The density and the delta-distance are utilized to find the clustering centers in the DPC method. It detects outliers efficiently and finds clusters of arbitrary shape. But unfortunately, we need to calculate the distance between all data points in the first process, which limits the running speed of DPC algorithm on large datasets. To address this issue, this paper introduces a novel approach based on grid, called density peaks clustering algorithm based on grid (DPCG). This approach can overcome the operation efficiency problem. When calculating the local density, the idea of the grid is introduced to reduce the computation time based on the DPC algorithm. Neither it requires calculating all the distances nor much input parameters. Moreover, DPCG algorithm successfully inherits the all merits of the DPC algorithm. Experimental results on UCI data sets and artificial data show that the DPCG algorithm is flexible and effective.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Show more products
Literature
1.
go back to reference Morris K, Mcnicholas PD (2016) Clustering, classification, discriminant analysis, and dimension reduction via generalized hyperbolic mixtures. Comput Stat Data Anal 97:133–150MathSciNetCrossRef Morris K, Mcnicholas PD (2016) Clustering, classification, discriminant analysis, and dimension reduction via generalized hyperbolic mixtures. Comput Stat Data Anal 97:133–150MathSciNetCrossRef
2.
go back to reference Cai R, Zhang Z, Tung AKH et al (2014) A general framework of hierarchical clustering and its applications. Inf Sci 272(C):29–48MathSciNetCrossRef Cai R, Zhang Z, Tung AKH et al (2014) A general framework of hierarchical clustering and its applications. Inf Sci 272(C):29–48MathSciNetCrossRef
3.
go back to reference Yu Z, Luo P, You J et al (2015) Incremental semi-supervised clustering ensemble for high dimensional data clustering. IEEE Trans Knowl Data Eng 28(3):701–714CrossRef Yu Z, Luo P, You J et al (2015) Incremental semi-supervised clustering ensemble for high dimensional data clustering. IEEE Trans Knowl Data Eng 28(3):701–714CrossRef
4.
go back to reference Xia Z, Wang X, Zhang L et al (2016) A privacy-preserving and copy-deterrence content-based image retrieval scheme in cloud computing. IEEE Transactions on Information Forensics and Security 1–1 Xia Z, Wang X, Zhang L et al (2016) A privacy-preserving and copy-deterrence content-based image retrieval scheme in cloud computing. IEEE Transactions on Information Forensics and Security 1–1
5.
go back to reference Yu Z, Li L, Liu J et al (2015) Adaptive noise immune cluster ensemble using affinity propagation. IEEE Trans Knowl Data Eng 27(12):3176–3189CrossRef Yu Z, Li L, Liu J et al (2015) Adaptive noise immune cluster ensemble using affinity propagation. IEEE Trans Knowl Data Eng 27(12):3176–3189CrossRef
6.
go back to reference Yu Z, Chen H, You J et al (2015) Adaptive fuzzy consensus clustering framework for clustering analysis of cancer data. IEEE/ACM Trans Comput Biol Bioinf 12(4):887–901CrossRef Yu Z, Chen H, You J et al (2015) Adaptive fuzzy consensus clustering framework for clustering analysis of cancer data. IEEE/ACM Trans Comput Biol Bioinf 12(4):887–901CrossRef
7.
go back to reference Ren Y, Shen J, Wang J et al (2015) Mutual verifiable provable data auditing in public cloud storage. J Internet Technol 16(2):317–323 Ren Y, Shen J, Wang J et al (2015) Mutual verifiable provable data auditing in public cloud storage. J Internet Technol 16(2):317–323
8.
go back to reference Bayat A (2013) Uniform voltage distribution based constructive algorithm for optimal reconfiguration of electric distribution networks. Electr Power Syst Res 104(9):146–155CrossRef Bayat A (2013) Uniform voltage distribution based constructive algorithm for optimal reconfiguration of electric distribution networks. Electr Power Syst Res 104(9):146–155CrossRef
9.
go back to reference Yu Z, Zhu X, Wong HS et al (2016) Distribution-Based Cluster Structure Selection. IEEE Transactions on Cybernetics Yu Z, Zhu X, Wong HS et al (2016) Distribution-Based Cluster Structure Selection. IEEE Transactions on Cybernetics
10.
go back to reference Zhang Z, Ding F, Liu X (2011) Hierarchical gradient based iterative parameter estimation algorithm for multivariable output error moving average systems. Comput Math Appl 61(3):672–682MathSciNetCrossRefMATH Zhang Z, Ding F, Liu X (2011) Hierarchical gradient based iterative parameter estimation algorithm for multivariable output error moving average systems. Comput Math Appl 61(3):672–682MathSciNetCrossRefMATH
11.
go back to reference Fraley C, Raftery AE (2011) Model-based clustering, discriminant analysis, and density estimation. J Am Stat Assoc 97(458):611–631MathSciNetCrossRefMATH Fraley C, Raftery AE (2011) Model-based clustering, discriminant analysis, and density estimation. J Am Stat Assoc 97(458):611–631MathSciNetCrossRefMATH
12.
go back to reference He Y, Tan H, Luo W, et al (2011) MR-DBSCAN: an efficient parallel density-based clustering algorithm using mapreduce. International Conference on Parallel and Distributed Systems Proceedings, pp 473–480 He Y, Tan H, Luo W, et al (2011) MR-DBSCAN: an efficient parallel density-based clustering algorithm using mapreduce. International Conference on Parallel and Distributed Systems Proceedings, pp 473–480
13.
go back to reference Li C, Li L, Zhang J et al (2012) Highly efficient and exact method for parallelization of grid-based algorithms and its implementation in DelPhi. J Comput Chem 33(24):1960–1966CrossRef Li C, Li L, Zhang J et al (2012) Highly efficient and exact method for parallelization of grid-based algorithms and its implementation in DelPhi. J Comput Chem 33(24):1960–1966CrossRef
14.
go back to reference Ros F, Guillaume S (2016) DENDIS: a new density-based sampling for clustering algorithm. Expert Syst Appl 56:349–359CrossRef Ros F, Guillaume S (2016) DENDIS: a new density-based sampling for clustering algorithm. Expert Syst Appl 56:349–359CrossRef
15.
go back to reference Nanda SJ, Panda G (2015) Design of computationally efficient density-based clustering algorithms. Data Knowl Eng 95:23–38CrossRef Nanda SJ, Panda G (2015) Design of computationally efficient density-based clustering algorithms. Data Knowl Eng 95:23–38CrossRef
16.
go back to reference Yue SH, Wang JS, Tao G et al (2010) An unsupervised grid-based approach for clustering analysis. Sciece China Inf Sci 53(7):1345–1357CrossRef Yue SH, Wang JS, Tao G et al (2010) An unsupervised grid-based approach for clustering analysis. Sciece China Inf Sci 53(7):1345–1357CrossRef
17.
go back to reference Huang J, Zhang X (2013) An incremental grid clustering algorithm based on density-dimension-tree. International Conference on Machine Learning and Cybernetics. IEEE, pp 356–361 Huang J, Zhang X (2013) An incremental grid clustering algorithm based on density-dimension-tree. International Conference on Machine Learning and Cybernetics. IEEE, pp 356–361
18.
go back to reference Zhang D, Tian H, Ying P et al (2012) A clustering algorithm based on density-grid for stream data. International Conference on Parallel and Distributed Computing, Applications, and Technologies (PDCAT), 398–403 Zhang D, Tian H, Ying P et al (2012) A clustering algorithm based on density-grid for stream data. International Conference on Parallel and Distributed Computing, Applications, and Technologies (PDCAT), 398–403
19.
go back to reference Zhong Z (2011) A kind of data stream clustering algorithm based on grid-density. Commun Comput Inf Sci 215:418–423 Zhong Z (2011) A kind of data stream clustering algorithm based on grid-density. Commun Comput Inf Sci 215:418–423
20.
go back to reference Rodrigitez A, Laio A (2014) Clustering by fast search and find of density peaks. Science 344(6191):1492–1496CrossRef Rodrigitez A, Laio A (2014) Clustering by fast search and find of density peaks. Science 344(6191):1492–1496CrossRef
21.
go back to reference Zheng Y, Byeungwoo J, Xu D et al (2015) Image segmentation by generalized hierarchical fuzzy C-means algorithm. J Intell Fuzzy Syst 28(2):4024–4028 Zheng Y, Byeungwoo J, Xu D et al (2015) Image segmentation by generalized hierarchical fuzzy C-means algorithm. J Intell Fuzzy Syst 28(2):4024–4028
22.
go back to reference Arora P, Deepali, Varshney S (2016) Analysis of K-Means and K-Medoids algorithm for big data. Procedia Comput Sci 78:507–512CrossRef Arora P, Deepali, Varshney S (2016) Analysis of K-Means and K-Medoids algorithm for big data. Procedia Comput Sci 78:507–512CrossRef
23.
go back to reference Jia H, Ding S, Du M (2015) Self-tuning P-spectral clustering based on shared nearest neighbors. Cogn Comput 7(5):622–632CrossRef Jia H, Ding S, Du M (2015) Self-tuning P-spectral clustering based on shared nearest neighbors. Cogn Comput 7(5):622–632CrossRef
24.
go back to reference Yu Z, Li L, You J et al (2012) SC3: triple spectral clustering-based consensus clustering framework for class discovery from cancer gene expression profiles. IEEE/ACM Trans Comput Biol Bioinf 9(6):1751–1765CrossRef Yu Z, Li L, You J et al (2012) SC3: triple spectral clustering-based consensus clustering framework for class discovery from cancer gene expression profiles. IEEE/ACM Trans Comput Biol Bioinf 9(6):1751–1765CrossRef
25.
go back to reference Cheng Q, Liu Z, Huang J et al (2016) Community detection in hypernetwork via density-ordered tree partition. Appl Math Comput 276(C):384–393 Cheng Q, Liu Z, Huang J et al (2016) Community detection in hypernetwork via density-ordered tree partition. Appl Math Comput 276(C):384–393
26.
go back to reference Yin Y, Wang X, Xu D et al (2016) Robust visual detection-learning-tracking framework for autonomous aerial refueling of UAVs. IEEE Trans Instrum Meas 65(3):1–12CrossRef Yin Y, Wang X, Xu D et al (2016) Robust visual detection-learning-tracking framework for autonomous aerial refueling of UAVs. IEEE Trans Instrum Meas 65(3):1–12CrossRef
27.
go back to reference Du M, Ding S, Jia H (2016) Study on density peaks clustering based on k-nearest neighbors and principal component analysis. Knowl-Based Syst 99:135–145CrossRef Du M, Ding S, Jia H (2016) Study on density peaks clustering based on k-nearest neighbors and principal component analysis. Knowl-Based Syst 99:135–145CrossRef
28.
go back to reference Wang S, Wang D, Caoyuan LI et al (2016) Clustering by fast search and find of density peaks with data field. Chin J Electron 25(3):397–402CrossRef Wang S, Wang D, Caoyuan LI et al (2016) Clustering by fast search and find of density peaks with data field. Chin J Electron 25(3):397–402CrossRef
29.
go back to reference Guo L, Lin JH, Guo Q et al (2015) Identifying multiple influential spreaders in term of the distance-based coloring. Phys Lett A 380(7–8):837–842 Guo L, Lin JH, Guo Q et al (2015) Identifying multiple influential spreaders in term of the distance-based coloring. Phys Lett A 380(7–8):837–842
30.
31.
go back to reference Kumar KM, Reddy ARM (2016) A fast DBSCAN clustering algorithm by accelerating neighbor searching using Groups method. Pattern Recogn 58:39–48CrossRef Kumar KM, Reddy ARM (2016) A fast DBSCAN clustering algorithm by accelerating neighbor searching using Groups method. Pattern Recogn 58:39–48CrossRef
32.
go back to reference Kanungo T, Mount DM, Netanyahu NS et al (2002) An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans Pattern Anal Mach Intell 24(7):881–892CrossRef Kanungo T, Mount DM, Netanyahu NS et al (2002) An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans Pattern Anal Mach Intell 24(7):881–892CrossRef
33.
go back to reference Xie J, Gao H, Xie W et al (2016) Robust clustering by detecting density peaks and assigning points based on fuzzy weighted K -nearest neighbors. Inf Sci 354:19–40CrossRef Xie J, Gao H, Xie W et al (2016) Robust clustering by detecting density peaks and assigning points based on fuzzy weighted K -nearest neighbors. Inf Sci 354:19–40CrossRef
34.
go back to reference Wang S, Wang D, Caoyuan LI et al (2016) Clustering by fast search and find of density peaks with data field. Chin J Electron 25(3):397–402CrossRef Wang S, Wang D, Caoyuan LI et al (2016) Clustering by fast search and find of density peaks with data field. Chin J Electron 25(3):397–402CrossRef
35.
go back to reference Liang Z, Chen P (2016) Delta-density based clustering with a divide-and-conquer strategy: 3DC clustering. Pattern Recogn Lett 73(C):52–59 Liang Z, Chen P (2016) Delta-density based clustering with a divide-and-conquer strategy: 3DC clustering. Pattern Recogn Lett 73(C):52–59
36.
go back to reference Ban Z, Liu J, Yuan L et al (2015) A modified density-based clustering algorithm and its implementation. PATTERN RECOGNITION AND COMPUTER VISION, 9813 Ban Z, Liu J, Yuan L et al (2015) A modified density-based clustering algorithm and its implementation. PATTERN RECOGNITION AND COMPUTER VISION, 9813
37.
go back to reference Wang D, Zhang B, Wang K (2014) A vertex-clustering algorithm based on the cluster-clique. Algorithms Archit Parallel Process 8631:376–385 Wang D, Zhang B, Wang K (2014) A vertex-clustering algorithm based on the cluster-clique. Algorithms Archit Parallel Process 8631:376–385
38.
go back to reference Bohn B, Garcke J, Griebel M (2016) A sparse grid based method for generative dimensionality reduction of high-dimensional data. J Comput Phys 309:1–17MathSciNetCrossRefMATH Bohn B, Garcke J, Griebel M (2016) A sparse grid based method for generative dimensionality reduction of high-dimensional data. J Comput Phys 309:1–17MathSciNetCrossRefMATH
39.
go back to reference Fu Z, Sun X, Liu Q, et al (2015) Achieving efficient cloud search services: multi-keyword ranked search over encrypted cloud data supporting parallel computing. Ieice Trans Commun E98.B(1):190–200 Fu Z, Sun X, Liu Q, et al (2015) Achieving efficient cloud search services: multi-keyword ranked search over encrypted cloud data supporting parallel computing. Ieice Trans Commun E98.B(1):190–200
40.
go back to reference Xia Z, Wang X, Sun X et al (2016) A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data. IEEE Trans Parallel Distrib Syst 27(2):340–352CrossRef Xia Z, Wang X, Sun X et al (2016) A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data. IEEE Trans Parallel Distrib Syst 27(2):340–352CrossRef
41.
go back to reference Veenman CJ, Reinders MJT, Backer E (2002) A maximum variance cluster algorithm. IEEE Trans Pattern Anal Mach Intell 24(9):1273–1280CrossRef Veenman CJ, Reinders MJT, Backer E (2002) A maximum variance cluster algorithm. IEEE Trans Pattern Anal Mach Intell 24(9):1273–1280CrossRef
42.
go back to reference Fränti P, Virmajoki O (2006) Iterative shrinking method for clustering problems. Pattern Recogn 39(5):761–775CrossRefMATH Fränti P, Virmajoki O (2006) Iterative shrinking method for clustering problems. Pattern Recogn 39(5):761–775CrossRefMATH
43.
go back to reference Gionis A, Mannila H, Tsaparas P (2007) Clustering aggregation. ACM Trans Knowl Discov Data 1(1):341–352CrossRef Gionis A, Mannila H, Tsaparas P (2007) Clustering aggregation. ACM Trans Knowl Discov Data 1(1):341–352CrossRef
44.
go back to reference Fu L, Medico E (2007) FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data. BMC Bioinf 8(1):1–15CrossRef Fu L, Medico E (2007) FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data. BMC Bioinf 8(1):1–15CrossRef
45.
go back to reference Kärkkäinen I, Fränti P (2002) Dynamic local search algorithm for the clustering problem. Technical Report A-2002-6, Department of Computer Science, University of Joensuu Kärkkäinen I, Fränti P (2002) Dynamic local search algorithm for the clustering problem. Technical Report A-2002-6, Department of Computer Science, University of Joensuu
Metadata
Title
DPCG: an efficient density peaks clustering algorithm based on grid
Authors
Xiao Xu
Shifei Ding
Mingjing Du
Yu Xue
Publication date
26-09-2016
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 5/2018
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-016-0603-2

Other articles of this Issue 5/2018

International Journal of Machine Learning and Cybernetics 5/2018 Go to the issue