Similar Coefficient of Cluster for Discrete Elements

Sankhya B

Abstract

This article proposes a new concept, the Cluster Similar Coefficient (CSC), for discrete elements. CSC serves both as a criterion for building clusters by hierarchical and non-hierarchical approaches and as a measure for evaluating the quality of established clusters. Based on CSC, we also propose four algorithms: to determine a suitable number of clusters, to analyze non-fuzzy clusters, to analyze fuzzy clusters, and to build clusters with a given CSC. The proposed algorithms are implemented as Matlab procedures so that users can apply them efficiently and conveniently in practice. Numerical examples demonstrate the suitability and advantages of using CSC as a clustering criterion compared with other criteria.

Author information

Corresponding author

Correspondence to Thao Nguyen Trang.

Cite this article

VoVan, T., Nguyen Trang, T. Similar Coefficient of Cluster for Discrete Elements. Sankhya B 80, 19–36 (2018). https://doi.org/10.1007/s13571-018-0159-0

  • DOI: https://doi.org/10.1007/s13571-018-0159-0
