2018 | OriginalPaper | Chapter

4. Cluster Quality Versus Choice of Parameters

Authors: Sławomir T. Wierzchoń, Mieczysław A. Kłopotek

Published in: Modern Algorithms of Cluster Analysis

Publisher: Springer International Publishing

Abstract

This chapter is devoted to the actions to be performed in order to gain maximum insight into the data through the application of clustering algorithms. For the data preprocessing stage, methods for choosing an appropriate set of features and algorithms for selecting the proper number of clusters are presented. For post-processing of the results of cluster analysis algorithms, criteria for evaluating the quality of the obtained clusters are presented, both for the output of a single clustering algorithm and for the case where multiple algorithms are applied. Multiple internal and external quality measures are suggested.

Footnotes
1
Cf. e.g. E. Siegel, Predictive Analytics: The power to predict who will click, buy, lie, or die. Wiley 2013.
 
2
We recommend the procedure VARCLUST, described in detail in Chap. 104 of the SAS/STAT® 13.1 User’s Guide. Cary, NC: SAS Institute Inc.
 
3
Cf. A. Blum, “Random projection, margins, kernels, and feature-selection”. In: C. Saunders, M. Grobelnik, S. Gunn, and J. Shawe-Taylor, eds. Subspace, Latent Structure and Feature Selection. LNCS 3940. Springer Berlin Heidelberg, 2006, 52–68.
 
4
Cf. e.g. I. Borg and P. J. F. Groenen. Modern Multidimensional Scaling: Theory and Applications. Springer, 2005.
 
5
M. Ming-Tso Chiang and B. Mirkin: Intelligent Choice of the Number of Clusters in K-Means Clustering: An Experimental Study with Different Cluster Spreads. Journal of Classification 27 (2009).
 
6
D. Pelleg and A. Moore: X-means: Extending k-means with efficient estimation of the number of clusters. Proc. 17th International Conf. on Machine Learning, 2000.
 
7
Cf. K. Mardia et al. Multivariate Analysis. Academic Press 1979, p. 365.
 
8
The basic reference, usually cited when this method is discussed, is R.L. Thorndike, “Who Belongs in the Family?”. Psychometrika 18 (4) 1953. A modification of this method was applied in the paper C. Goutte, P. Toft, E. Rostrup, F.A. Nielsen, L.K. Hansen, “On clustering fMRI time series”. NeuroImage 9 (3): 298–310 (March 1999).
 
9
The interested reader should visit the website http://cran.r-project.org/web/packages/clusterCrit/, which continues to be maintained by Bernard Desgraupes. A description of 42 quality indexes can be found there, as well as the clusterCrit package for R, which provides their implementation.
 
10
Another reference is G. Saporta, G. Youness, “Comparing two partitions: Some proposals and experiments”. Compstat 2002, pp. 243–248. The website http://darwin.phyloviz.net/ComparingPartitions/ is also worth consulting.
 
Metadata
Title
Cluster Quality Versus Choice of Parameters
Authors
Sławomir T. Wierzchoń
Mieczysław A. Kłopotek
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-69308-8_4