Published in: Soft Computing 13/2019

11.04.2018 | Methodologies and Application

A feasible density peaks clustering algorithm with a merging strategy

Authors: Xiao Xu, Shifei Ding, Hui Xu, Hongmei Liao, Yu Xue


Abstract

The density peaks clustering (DPC) algorithm handles data sets with complex structure efficiently by finding their density peaks. It requires neither an iterative process nor many parameters. DPC uses a density–distance criterion to find the density peaks. Unfortunately, it splits one cluster into several clusters when that cluster contains multiple density peaks, and it is ineffective on data sets of relatively high dimension. To overcome the first problem, we propose FDPC, an algorithm based on a novel merging strategy motivated by the support vector machine. First, after clustering with DPC, the strategy uses the support vectors to calculate feedback values between every pair of clusters. Then, it merges clusters recursively according to these feedback values to obtain accurate clustering results. To address the second limitation, we introduce nonnegative matrix factorization into FDPC to preprocess high-dimensional data sets before clustering. Experimental results on real-world and artificial data sets demonstrate that our algorithm is robust and flexible, recognizes clusters of arbitrary shape effectively regardless of the space dimension, and outperforms DPC.
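For orientation, the baseline DPC step that the abstract builds on can be sketched as follows. This is a minimal illustration of the density–distance idea from Rodríguez and Laio (2014), not the authors' FDPC: it omits the SVM-motivated merging strategy and the nonnegative matrix factorization preprocessing, whose details the abstract does not give. The function name `density_peaks`, the cutoff kernel, and the parameters `d_c` and `n_clusters` are assumptions chosen for this sketch.

```python
import math


def density_peaks(points, d_c, n_clusters):
    """Baseline DPC sketch with a cutoff kernel (assumed, not the paper's FDPC)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta_i: distance to the nearest point that precedes i in `order`
    # (i.e. a point of higher density). The densest point gets its
    # largest distance to any other point instead.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Density peaks: the points with the largest gamma = rho * delta.
    gamma = [rho[i] * delta[i] for i in range(n)]
    peaks = sorted(order, key=lambda i: -gamma[i])[:n_clusters]

    # Each remaining point inherits the label of its nearest denser neighbour.
    labels = [-1] * n
    for c, p in enumerate(peaks):
        labels[p] = c
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return rho, delta, labels
```

Because every non-peak point simply follows its nearest denser neighbour, a cluster that contains two density peaks is split in two whenever both peaks are selected. That is the failure mode the merging strategy in the abstract is meant to repair.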


Metadata
Title
A feasible density peaks clustering algorithm with a merging strategy
Authors
Xiao Xu
Shifei Ding
Hui Xu
Hongmei Liao
Yu Xue
Publication date
11.04.2018
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 13/2019
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-018-3183-0