Skip to main content
Erschienen in: Pattern Analysis and Applications 3/2021

12.05.2021 | Theoretical advances

A novel clustering algorithm by adaptively merging sub-clusters based on the Normal-neighbor and Merging force

verfasst von: Guan Junyi , Sheng li, He Xiongxiong, Chen Jiajia

Erschienen in: Pattern Analysis and Applications | Ausgabe 3/2021

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Clustering by fast search and find of density peaks (DPC) is a popular clustering method based on density and distance. In DPC, each non-center point’s cluster label is led by its nearest point with higher density, which may cause some misclassifications of non-center points and interfere with the choice of correct cluster centers in the decision graph. To avoid these defects, we propose a novel clustering algorithm that automatically generates clusters without using the decision graph based on the Normal-neighbor and Merging force (NM-DPC). We conduct a series of experiments on various challenging synthetic datasets. Experimental results demonstrate that NM-DPC can better identify clusters of complex shapes and automatically recognize the number of clusters.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Jain AK (1999) Data clustering: a review. ACM Comput Surv 31(3):264–323CrossRef Jain AK (1999) Data clustering: a review. ACM Comput Surv 31(3):264–323CrossRef
2.
Zurück zum Zitat Hansen P, Jaumard B (1997) Cluster analysis and mathematical programming. Math Program 79(1–3):191–215MathSciNetMATH Hansen P, Jaumard B (1997) Cluster analysis and mathematical programming. Math Program 79(1–3):191–215MathSciNetMATH
3.
Zurück zum Zitat Xu R, Wunsch D II (2007) Computational intelligence in clustering algorithms, with applications Xu R, Wunsch D II (2007) Computational intelligence in clustering algorithms, with applications
4.
Zurück zum Zitat Xu R, Wunsch DC (2010) Clustering algorithms in biomedical research: a review. IEEE Rev Biomed Eng 3:120–154CrossRef Xu R, Wunsch DC (2010) Clustering algorithms in biomedical research: a review. IEEE Rev Biomed Eng 3:120–154CrossRef
5.
Zurück zum Zitat Jain AK, Dubes RC (1988) Algorithms for clustering data. Technometrics 32(2):227–229MATH Jain AK, Dubes RC (1988) Algorithms for clustering data. Technometrics 32(2):227–229MATH
6.
Zurück zum Zitat X Qian, Y Wu, M Li, Y Ren, S Jiang, Z Li (2020) LAST: location-appearance-semantic-temporal clustering based POI summarization. IEEE Trans Multimed X Qian, Y Wu, M Li, Y Ren, S Jiang, Z Li (2020) LAST: location-appearance-semantic-temporal clustering based POI summarization. IEEE Trans Multimed
7.
Zurück zum Zitat Wu Z, Leahy Richard M (1993) An optimal graph theoretic approach to data clustering: theory and its application to image segmentation. IEEE Trans Pattern Anal Mach Intell 15(11):1101–1113CrossRef Wu Z, Leahy Richard M (1993) An optimal graph theoretic approach to data clustering: theory and its application to image segmentation. IEEE Trans Pattern Anal Mach Intell 15(11):1101–1113CrossRef
8.
Zurück zum Zitat Berry Michael W, Castellanos Malu (2007) Survey of text mining: clustering, classification, and retrieval. Springer, Berlin Berry Michael W, Castellanos Malu (2007) Survey of text mining: clustering, classification, and retrieval. Springer, Berlin
9.
Zurück zum Zitat Jordan MI, Mitchell TM (2015) Machine learning: trends, perspectives, and prospects. Science 349(6245):255–260MathSciNetCrossRef Jordan MI, Mitchell TM (2015) Machine learning: trends, perspectives, and prospects. Science 349(6245):255–260MathSciNetCrossRef
10.
Zurück zum Zitat Macqueen J (1965) Some methods for classification and analysis of multivariate observations. In: Proceedings of Berkeley symposium on mathematical statistics and probability Macqueen J (1965) Some methods for classification and analysis of multivariate observations. In: Proceedings of Berkeley symposium on mathematical statistics and probability
11.
Zurück zum Zitat Jain AK (2008) Data clustering: 50 years beyond k-means. In: Machine learning and knowledge discovery in databases Jain AK (2008) Data clustering: 50 years beyond k-means. In: Machine learning and knowledge discovery in databases
12.
Zurück zum Zitat Rahman MA, Islam MZ (2014) A hybrid clustering technique combining a novel genetic algorithm with K-Means. Knowl Based Syst 7:1345–365 Rahman MA, Islam MZ (2014) A hybrid clustering technique combining a novel genetic algorithm with K-Means. Knowl Based Syst 7:1345–365
13.
Zurück zum Zitat Tzortzis G, Likas A (2014) The MinMax K-Means clustering algorithm. Pattern Recognit 47(7):2505–2516CrossRef Tzortzis G, Likas A (2014) The MinMax K-Means clustering algorithm. Pattern Recognit 47(7):2505–2516CrossRef
14.
Zurück zum Zitat Likas A, Vlassis N, Verbeek JJ (2003) The global K-Means clustering algorithm. Pattern Recognit 36(2):451–46CrossRef Likas A, Vlassis N, Verbeek JJ (2003) The global K-Means clustering algorithm. Pattern Recognit 36(2):451–46CrossRef
15.
Zurück zum Zitat Xie J, Jiang S, Xie W, Gao X (2011) An efficient global K-Means clustering algorithm. JCP 6(2):27–279 Xie J, Jiang S, Xie W, Gao X (2011) An efficient global K-Means clustering algorithm. JCP 6(2):27–279
17.
Zurück zum Zitat Ester M, Kriegel HP, Xu X, Sanders J (1996) A density-based algorithm for discovering clusters a density-based algorithm for discovering clusters in large spatial databases with noise. In: International conference on knowledge discovery and data mining Ester M, Kriegel HP, Xu X, Sanders J (1996) A density-based algorithm for discovering clusters a density-based algorithm for discovering clusters in large spatial databases with noise. In: International conference on knowledge discovery and data mining
18.
Zurück zum Zitat Han J, Kamber M (2006) Data mining: concepts and techniques. In: Data mining concepts models methods and, 2nd edn, vol 5, no 4, pp 1–18 Han J, Kamber M (2006) Data mining: concepts and techniques. In: Data mining concepts models methods and, 2nd edn, vol 5, no 4, pp 1–18
19.
Zurück zum Zitat Xu R, Wunsch DC (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678CrossRef Xu R, Wunsch DC (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678CrossRef
20.
Zurück zum Zitat Rodriguez A, Laio A (2014) Clustering by fast search and find of density peaks. Science 344:1492–1496CrossRef Rodriguez A, Laio A (2014) Clustering by fast search and find of density peaks. Science 344:1492–1496CrossRef
21.
Zurück zum Zitat Pizzagalli Diego Ulisse, Gonzalez Santiago F, Krause Rolf (2019) A trainable clustering algorithm based on shortest paths from density peaks. Sci Adv 5(10):eaax3770CrossRef Pizzagalli Diego Ulisse, Gonzalez Santiago F, Krause Rolf (2019) A trainable clustering algorithm based on shortest paths from density peaks. Sci Adv 5(10):eaax3770CrossRef
22.
Zurück zum Zitat Xie J, Gao H, Xie W, Liu X, Grant PW (2016) Robust clustering by detecting density peaks and assigning points based on fuzzy weighted k-nearest neighbors. Inf Sci 354:19–40CrossRef Xie J, Gao H, Xie W, Liu X, Grant PW (2016) Robust clustering by detecting density peaks and assigning points based on fuzzy weighted k-nearest neighbors. Inf Sci 354:19–40CrossRef
23.
Zurück zum Zitat Du M, Ding S, Jia H (2016) Study on density peaks clustering based on k-nearest neighbors and principal component analysis. Knowl Based Syst 99:135–145CrossRef Du M, Ding S, Jia H (2016) Study on density peaks clustering based on k-nearest neighbors and principal component analysis. Knowl Based Syst 99:135–145CrossRef
24.
Zurück zum Zitat Liu R, Wang H, Yu X (2018) Shared-nearest-neighbor-based clustering by fast search and find of density peaks. Inf Sci 450:200–226MathSciNetCrossRef Liu R, Wang H, Yu X (2018) Shared-nearest-neighbor-based clustering by fast search and find of density peaks. Inf Sci 450:200–226MathSciNetCrossRef
25.
Zurück zum Zitat Jain AK, Law MH (2005) Data clustering: a user’s dilemma. In: International conference on pattern recognition and machine intelligence, pp 1–10 Jain AK, Law MH (2005) Data clustering: a user’s dilemma. In: International conference on pattern recognition and machine intelligence, pp 1–10
26.
Zurück zum Zitat Ball GH, Hall DJ (1965) ISODATA, a novel method of data analysis and pattern classification. Stanford Research Iinst, Menlo Park CA Ball GH, Hall DJ (1965) ISODATA, a novel method of data analysis and pattern classification. Stanford Research Iinst, Menlo Park CA
27.
Zurück zum Zitat Chang H, Yeung D-Y (2008) Robust path-based spectral clustering. Pattern Recognit 41(1):191–203CrossRef Chang H, Yeung D-Y (2008) Robust path-based spectral clustering. Pattern Recognit 41(1):191–203CrossRef
28.
Zurück zum Zitat Zahn CT (1971) Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Trans Comput 100(1):68–86CrossRef Zahn CT (1971) Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Trans Comput 100(1):68–86CrossRef
29.
Zurück zum Zitat Fu L, Medico E (2007) FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data. BMC Bioinform 8(1):1–15CrossRef Fu L, Medico E (2007) FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data. BMC Bioinform 8(1):1–15CrossRef
30.
Zurück zum Zitat Gionis A, Mannila H, Tsaparas P (2007) Clustering aggregation. ACM Trans Knowl Discov Data 1(1):4CrossRef Gionis A, Mannila H, Tsaparas P (2007) Clustering aggregation. ACM Trans Knowl Discov Data 1(1):4CrossRef
31.
Zurück zum Zitat Frnti P, Virmajoki O (2006) Iterative shrinking method for clustering problems. Pattern Recognit 39(5):761–775CrossRef Frnti P, Virmajoki O (2006) Iterative shrinking method for clustering problems. Pattern Recognit 39(5):761–775CrossRef
32.
Zurück zum Zitat Veenman CJ, Reinders MJT, Backer E (2002) A maximum variance cluster algorithm. IEEE Trans Pattern Anal Mach Intell 24(9):1273–1280CrossRef Veenman CJ, Reinders MJT, Backer E (2002) A maximum variance cluster algorithm. IEEE Trans Pattern Anal Mach Intell 24(9):1273–1280CrossRef
33.
Zurück zum Zitat L Zelnikmanor, P Perona (2004) Self-tuning spectral clustering. Neural Inf Process Syst L Zelnikmanor, P Perona (2004) Self-tuning spectral clustering. Neural Inf Process Syst
34.
Zurück zum Zitat Vinh HX, Epps J, Bailey J (2010) Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J Mach Learn Res 11(1):2837–2854MathSciNetMATH Vinh HX, Epps J, Bailey J (2010) Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J Mach Learn Res 11(1):2837–2854MathSciNetMATH
35.
Zurück zum Zitat Fowlkes EB, Mallows CL (1983) A method for comparing two hierarchical clusterings. J Am Stat Assoc 78(383):553–569CrossRef Fowlkes EB, Mallows CL (1983) A method for comparing two hierarchical clusterings. J Am Stat Assoc 78(383):553–569CrossRef
36.
Zurück zum Zitat Franti Pasi, Virmajoki Olli, Hautamaki Ville (2006) Fast agglomerative clustering using a k-nearest neighbor graph. IEEE Trans Pattern Anal Mach Intell 28(11):1875–1881CrossRef Franti Pasi, Virmajoki Olli, Hautamaki Ville (2006) Fast agglomerative clustering using a k-nearest neighbor graph. IEEE Trans Pattern Anal Mach Intell 28(11):1875–1881CrossRef
37.
Zurück zum Zitat Samaria FS, Harter AC (1994) Parameterisation of a stochastic model for human face identification. In: Proceedings of the second IEEE workshop on applications of computer vision, pp 138–142 Samaria FS, Harter AC (1994) Parameterisation of a stochastic model for human face identification. In: Proceedings of the second IEEE workshop on applications of computer vision, pp 138–142
Metadaten
Titel
A novel clustering algorithm by adaptively merging sub-clusters based on the Normal-neighbor and Merging force
verfasst von
Guan Junyi
Sheng li
He Xiongxiong
Chen Jiajia
Publikationsdatum
12.05.2021
Verlag
Springer London
Erschienen in
Pattern Analysis and Applications / Ausgabe 3/2021
Print ISSN: 1433-7541
Elektronische ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-021-00981-1

Weitere Artikel der Ausgabe 3/2021

Pattern Analysis and Applications 3/2021 Zur Ausgabe

Premium Partner