Soft Computing

25.08.2020 | Methodologies and Application

A methodology for automatic parameter-tuning and center selection in density-peak clustering methods

Published in: Soft Computing | Issue 2/2021

Abstract

The density-peak clustering algorithm (DPC) is a novel and efficient density-based clustering approach. It can recover non-convex clusters as well as clusters of varying size and density, but it also has limitations, such as the need to locate centers visually and to tune parameters. This paper describes an optimization-based methodology for automatic parameter and center selection that applies both to DPC and to algorithms derived from it. The objective function is an internal or external cluster validity index, and the decision variables are the parameterization of the algorithm and the choice of centers. Internal validation measures lead to an automatic parameter-tuning process; external validation measures lead to the so-called optimal rules, a tool for bounding from above the performance of a given algorithm over the set of parameterizations. A numerical experiment with real data was performed for DPC and for the fuzzy weighted k-nearest neighbor DPC (FKNN-DPC). It validates the automatic parameter-tuning methodology and demonstrates its efficiency compared with the state of the art.
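The idea of treating parameter and center selection as an optimization problem can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name the optimizer or the validity index, so the code assumes an exhaustive grid search, the silhouette coefficient as internal index, a Gaussian-kernel density with cutoff `dc`, and centers taken as the `k` points with the largest product of density and distance to the nearest denser point. The names `dpc`, `silhouette`, and `tune` are hypothetical.

```python
import math
import random

def dist_matrix(pts):
    return [[math.dist(p, q) for q in pts] for p in pts]

def dpc(D, dc, k):
    """Minimal density-peak clustering on a distance matrix D (assumed variant)."""
    n = len(D)
    # Local density with a Gaussian kernel of width dc.
    rho = [sum(math.exp(-(D[i][j] / dc) ** 2) for j in range(n) if j != i)
           for i in range(n)]
    order = sorted(range(n), key=lambda i: -rho[i])  # decreasing density
    delta, parent = [0.0] * n, [-1] * n
    for pos, i in enumerate(order):
        if pos == 0:
            delta[i] = max(D[i])  # densest point: use its largest distance
        else:
            j = min(order[:pos], key=lambda h: D[i][h])  # nearest denser point
            delta[i], parent[i] = D[i][j], j
    # Centers: top-k by rho * delta (stable sort keeps the densest point first).
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:k]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    for i in order:  # every non-center inherits the label of its denser parent
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels

def silhouette(D, labels):
    """Mean silhouette coefficient; singletons contribute 0."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(D[i][j] for j in own) / (len(own) - 1)
        b = min(sum(D[i][j] for j in m) / len(m)
                for other, m in clusters.items() if other != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)

def tune(points, dc_grid, k_grid, index=silhouette):
    """Grid search: pick (dc, k) that maximizes the validity index."""
    D = dist_matrix(points)
    best = None
    for dc in dc_grid:
        for k in k_grid:
            labels = dpc(D, dc, k)
            score = index(D, labels)
            if best is None or score > best[0]:
                best = (score, dc, k, labels)
    return best

# Synthetic demo data (not from the paper): three well-separated blobs.
random.seed(0)
blob_centers = [(0.0, 0.0), (6.0, 0.0), (3.0, 6.0)]
points = [(random.gauss(cx, 0.3), random.gauss(cy, 0.3))
          for cx, cy in blob_centers for _ in range(20)]
score, dc, k, labels = tune(points, dc_grid=[0.2, 0.5, 1.0], k_grid=range(2, 6))
print(f"best dc={dc}, k={k}, silhouette={score:.3f}")
```

Under the same assumptions, replacing `index` with a function that compares `labels` against known ground-truth classes would turn the search into an estimate of the best score reachable over the grid, which is the role the abstract assigns to external indices.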

A Scopus search on 27 September 2019 returned 1475 references.

Title: A methodology for automatic parameter-tuning and center selection in density-peak clustering methods
Publication date: 25.08.2020
Published in: Soft Computing / Issue 2/2021
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI: https://doi.org/10.1007/s00500-020-05244-5
