Abstract
K-means clustering is one of the most popular clustering algorithms and is embedded in other clustering methods, e.g. as the last step of spectral clustering. In this paper, we propose two techniques that improve the k-means clustering algorithm by designing two different adjacent matrices. Extensive experiments on public UCI datasets show that our proposed algorithms significantly outperform three classical clustering algorithms under several evaluation metrics.
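For context, a minimal sketch of the standard (Lloyd's) k-means baseline that the paper builds on; this is not the paper's weighted-adjacent-matrix method, and the function and parameter names here are illustrative:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # Initialize centroids by sampling k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points;
        # keep the old centroid if a cluster becomes empty.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids
```

Spectral clustering typically runs this same procedure on the rows of an eigenvector matrix derived from an adjacency (affinity) matrix, which is where the paper's reweighting applies.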
Cite this article
Zhou, J., Liu, T. & Zhu, J. Weighted adjacent matrix for K-means clustering. Multimed Tools Appl 78, 33415–33434 (2019). https://doi.org/10.1007/s11042-019-08009-x