As data mining having attracted a significant amount of research attention, many clustering algorithms have been proposed in the past decades. However, most of existing clustering methods have high computational time or are not suitable for discovering clusters with non-convex shape. In this paper, an efficient clustering algorithm CHSMST is proposed, which is based on clustering based on hyper surface (CHS) and minimum spanning tree. In the first step, CHSMST applies CHS to obtain initial clusters immediately. Thereafter, minimum spanning tree is introduced to handle locally dense data which is hard for CHS to deal with. The experiments show that CHSMST can discover clusters with arbitrary shape. Moreover, CHSMST is insensitive to the order of input samples and the run time of the algorithm increases moderately as the scale of dataset becomes large.
Bitte loggen Sie sich ein, um Zugang zu diesem Inhalt zu erhalten
Agrawal R, Gehrke J, Gunopulos D, Raghavan P (1998) Automatic subspace clustering of high dimensional data for data mining applications. In: Proceedings of the ACM SIGMOD international conference on management of data. Seattle, Washington, DC, USA, pp 94–105
Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Plenum, New York
MATH
Ester M, Kriegel HP, Sander J, Xu XW (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the 2nd international conference on knowledge discovery and data mining, pp 226–231
Everitt B, Landau S, Leese M (2001) Cluster analysis. Arnold, London
MATH
Gao XB, Xie WX (2000) Advances in theory and applications of fuzzy clustering. Chin Sci Bull 45(11):961–970
CrossRefMATH
Gao XB, Ji H, Li J (2002) An advanced cluster analysis method based on statistical test. In: Proceedings of IEEE international conference on signal processing, pp 1100–1103
Gao XB, Li J, Tao DC, Li XL (2007) Fuzziness measurement of fuzzy sets and its application in cluster validity analysis. Int J Fuzzy Syst 9(4):188–197
MathSciNet
Guha S, Rastogi R, Shim K (1998) CURE: an efficient clustering algorithm for large databases. In: Proceedings of ACM SIGMOD international conference on management of data, pp 73–84
Han JW, Kamber M (2001) Data mining: concepts and techniques. Morgan Kaufmann Publishers, San Francisco
Hansen P, Jaumard B (1997) Cluster analysis and mathematical programming. Math Program 79:191–215
MathSciNetMATH
Hathaway R, Bezdek J (2001) Fuzzy c-means clustering of incomplete data. IEEE Trans Syst Man Cybern, Part B 31(5):735–744
CrossRef
He Q, Shi ZZ, Ren LA (2002) The classification method based on hyper surface. In: Proceedings of the 2002 international joint conference on neural networks, pp 1499–1503
He Q, Shi ZZ, Ren LA, Lee ES (2003) A novel classification method based on hypersurface. Int J Math Comput Model 38:395–407
MathSciNetCrossRefMATH
He Q, Zhao XR, Shi ZZ (2008) Minimal consistent subset for hyper surface classification method. Int J Pattern Recognit Artif Intell 22(1):95–108
CrossRef
Hung M, Yang D (2001) An efficient fuzzy c-means clustering algorithm. In: Proceedings of IEEE international conference on data mining, pp 225–232
Jain AK, Dubes RC (1988) Algorithms for clustering data. Prentice-Hall, Englewood Cliffs
MATH
Jain AK, Duin RPW, Mao J (2000) Statistical pattern recognition: a review. IEEE Trans Pattern Anal Mach Intell 22(1):4–36
CrossRef
Jain AK, Murty MN, Flynn PJ (1999) Data clustering: a review. ACM Comput Surv 31(3):264–323
CrossRef
Karypis G, Han E, Kumar V (1999) Chameleon: hierarchical clustering using dynamic modeling. IEEE Comput 32(8):68–75
Kolen J, Hutcheson T (2002) Reducing the time complexity of the fuzzy c-means algorithm. IEEE Trans Fuzzy Syst 10(2):263–267
CrossRef
Kruskal JB (1956) On the shortest spanning subtree of a graph and the traveling salesman problem. Proc Am Math Soc 7:48–50
McQueen J (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley symposium on mathematical statistics and probability, vol 1, pp 281–287
Ng RT, Han JW (1994) Efficient and effective clustering methods for spatial data mining. In: Proceedings of the 20th VLDB conference, pp 144–155
Prim RC (1957) Shortest connection networks and some generalizations. Bell Syst Tech J 36(1):1389–1401
Raraldi A, Blonda P (1999) A survey of fuzzy clustering algorithms for pattern recognition, Part I and II. IEEE Trans Syst Man Cybern, Part B 29(6):778–801
CrossRef
Stonebraker M, Frew J, Gardels K, Meredith J (1993) The SEQUOIA 2000 storage benchmark. In: Proceedings of ACM SIGMOD international conference on management of data, pp 2–11
Wu KL, Yu J, Yang MS (2005) A novel fuzzy clustering algorithm based on a fuzzy scatter matrix with optimality tests. Pattern Recognit Lett 26:639–652
CrossRef
Xu R, Wunsch II D (2005) Survey of clustering algorithms. IEEE Trans Neural Net 16(3):645–678
CrossRef
Yu J (2005) General c-means clustering model. IEEE Trans Pattern Anal Mach Intell 27(8):1197–1211
CrossRef
Yu J, Cheng QS, Huang HK (2004) Analysis of the weighting exponent in the FCM. IEEE Trans Syst Man Cybern, Part B: Cybern 34(1):634–639
CrossRef
Yu J, Yang MS (2005) Optimality test for generalized FCM and its application on parameter selection. IEEE Trans Fuzzy Syst 13(1):164–176
CrossRef
Über diesen Artikel
Titel
CHSMST: a clustering algorithm based on hyper surface and minimum spanning tree
Autoren:
Qing He Weizhong Zhao Zhongzhi Shi
Publikationsdatum
01.06.2011
DOI
https://doi.org/10.1007/s00500-010-0585-z
Verlag
Springer-Verlag
Zeitschrift
Soft Computing
A Fusion of Foundations, Methodologies and Applications
Ausgabe 6/2011
Print ISSN: 1432-7643
Elektronische ISSN: 1433-7479
Die B2B-Firmensuche für Industrie und Wirtschaft: Kostenfrei in Firmenprofilen nach Lieferanten, Herstellern, Dienstleistern und Händlern recherchieren.
Unternehmen haben das Innovationspotenzial der eigenen Mitarbeiter auch außerhalb der F&E-Abteilung erkannt. Viele Initiativen zur Partizipation scheitern in der Praxis jedoch häufig. Lesen Sie hier - basierend auf einer qualitativ-explorativen Expertenstudie - mehr über die wesentlichen Problemfelder der mitarbeiterzentrierten Produktentwicklung und profitieren Sie von konkreten Handlungsempfehlungen aus der Praxis. Jetzt gratis downloaden!