
2019 | OriginalPaper | Book chapter

12. Cluster Analysis

Written by: Thomas Cleff

Published in: Applied Statistics and Multivariate Data Analysis for Business and Economics

Publisher: Springer International Publishing


Abstract

Before we turn to the subject of cluster analysis, think for a moment about the meaning of the word cluster. The term refers to a group of individuals or objects that converge around a certain point and are thus closely related in their position. In astronomy there are clusters of stars; in chemistry, clusters of atoms. Economic research often relies on techniques that consider groups within a total population. For instance, firms that engage in target group marketing must first divide consumers into segments, or clusters of potential customers. Indeed, in many contexts researchers and economists need accurate methods for delineating homogeneous groups within a set of observations. Groups may contain individuals (such as people or their behaviours) or objects (such as firms, products, or patents). This chapter thus takes a cue from Goethe’s Faust (1987, Line 1943–45): “You soon will [understand]; just carry on as planned/You’ll learn reductive demonstrations/And all the proper classifications”.


Footnotes
1
By contrast, divisive clustering methods start by collecting all observations in one cluster. They proceed by splitting the initial cluster into two groups and then splitting the subgroups, repeating this process down the line. The main disadvantage of divisive methods is their high computational complexity. With agglomerative methods, the most demanding set of calculations comes in the first step: for n observations, a total of n(n−1)/2 distance measurements must be performed. With divisive methods, the first step alone has \( 2^{n-1}-1 \) possible ways of splitting the n observations into two non-empty clusters. The greater time required for calculating divisive hierarchical clusters explains why this method is used infrequently by researchers and is not included in standard statistics software.
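To get a feel for the difference in effort, the two counts from this footnote can be compared for a few sample sizes. This is only an illustrative sketch: the function names are hypothetical, and the divisive count uses the standard result that n observations can be split into two non-empty clusters in 2^(n−1) − 1 ways.

```python
def agglomerative_first_step_distances(n: int) -> int:
    """Pairwise distances among n observations: n(n-1)/2."""
    return n * (n - 1) // 2


def divisive_first_step_splits(n: int) -> int:
    """Ways to split n observations into two non-empty clusters: 2^(n-1) - 1."""
    return 2 ** (n - 1) - 1


# Compare how quickly the two counts grow with the number of observations.
for n in (5, 10, 20, 50):
    print(n, agglomerative_first_step_distances(n), divisive_first_step_splits(n))
```

The pairwise count grows quadratically, while the number of candidate splits grows exponentially, which is the complexity gap the footnote refers to.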
 
2
In the case of two dimensions, the Euclidean distance between two points equals the length of the hypotenuse given by the Pythagorean theorem.
 
3
In standardization – sometimes also called z-transform – the mean of x is subtracted from each x variable value and the result divided by the standard deviation (S) of the x variable: \( {z}_i=\frac{x_i-\overline{x}}{S} \).
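A minimal sketch of this z-transform, using only the Python standard library. The sample values are made up for illustration, and the use of the sample standard deviation (division by n − 1) for S is an assumption, since the footnote does not say which variant is meant.

```python
from statistics import mean, stdev


def z_standardize(values):
    """Return z_i = (x_i - mean) / S for every value in the list."""
    x_bar = mean(values)
    s = stdev(values)  # assumption: sample standard deviation (n - 1 in the denominator)
    return [(x - x_bar) / s for x in values]


# Hypothetical calorie values, not taken from the chapter's data set.
calories = [70.0, 110.0, 150.0, 145.0, 100.0]
z = z_standardize(calories)
print([round(v, 3) for v in z])
```

After the transform the values have mean 0 and standard deviation 1, so variables measured on different scales contribute comparably to a distance measure.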
 
4
Say we wanted to dichotomize calories per fl. oz. using three calorie variables. Calorie variable 1 assumes the value of one when the calories in a beer lie between 60 and 99.99 calories, otherwise it is equal to zero. Calorie variable 2 assumes the value one when the calories in a beer lie between 100 and 139.99 calories, otherwise it is equal to zero. Calorie variable 3 assumes the value one when the calories in a beer lie between 140 and 200 calories; otherwise it is equal to zero.
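The three dummy variables described here could be coded roughly as follows. The function name is hypothetical, and treating the upper bounds 99.99 and 139.99 as half-open intervals (below 100, below 140) is an assumption made so that no value falls between two classes.

```python
def calorie_dummies(calories: float) -> tuple:
    """Return (calorie variable 1, calorie variable 2, calorie variable 3) as 0/1 values."""
    d1 = int(60 <= calories < 100)    # 60 to 99.99 calories
    d2 = int(100 <= calories < 140)   # 100 to 139.99 calories
    d3 = int(140 <= calories <= 200)  # 140 to 200 calories
    return d1, d2, d3


# Hypothetical beers, one per calorie class.
for value in (72, 100, 150):
    print(value, calorie_dummies(value))
```

Values outside the range of 60 to 200 calories receive zeros on all three variables under this coding.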
 
5
The centroid is determined by calculating the mean for every variable for all observations of each cluster separately.
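As a sketch of this definition, the centroid of each cluster can be computed by averaging every variable over the cluster's members. The function name, the toy observations, and the cluster labels below are all made up for illustration.

```python
from collections import defaultdict


def cluster_centroids(observations, labels):
    """Return {cluster label: tuple of per-variable means} for the given observations."""
    groups = defaultdict(list)
    for obs, label in zip(observations, labels):
        groups[label].append(obs)
    # zip(*members) regroups the members' values variable by variable.
    return {
        label: tuple(sum(column) / len(column) for column in zip(*members))
        for label, members in groups.items()
    }


# Hypothetical two-variable observations assigned to clusters "A" and "B".
points = [(1.0, 2.0), (3.0, 4.0), (10.0, 0.0)]
assignment = ["A", "A", "B"]
result = cluster_centroids(points, assignment)
print(result)
```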
 
6
Euclidean distance of #9 to centroid CLU#1: \( \sqrt{{\left(-0.571-\left(-0.401\right)\right)}^2+{\left(0.486-\left(-0.563\right)\right)}^2}=1.06 \).
 
7
Euclidean distance of #9 to centroid CLU#2: \( \sqrt{{\left(1.643-\left(-0.401\right)\right)}^2+{\left(0.719-\left(-0.563\right)\right)}^2}=2.41 \).
 
8
Euclidean distance of #9 to centroid CLU#3: \( \sqrt{{\left(-0.401-\left(-0.401\right)\right)}^2+{\left(-1.353-\left(-0.563\right)\right)}^2}=0.79 \).
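The three distance calculations above can be reproduced with a few lines of code. The coordinates are read off the formulas in footnotes 6 to 8 (observation #9 at (−0.401, −0.563)); the variable names are hypothetical, and picking the smallest distance at the end is an assumption about how these values are used.

```python
import math

# Standardized coordinates taken from the formulas in footnotes 6 to 8.
observation_9 = (-0.401, -0.563)
centroid_coordinates = {
    "CLU#1": (-0.571, 0.486),
    "CLU#2": (1.643, 0.719),
    "CLU#3": (-0.401, -1.353),
}


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


distances = {
    name: euclidean(centroid, observation_9)
    for name, centroid in centroid_coordinates.items()
}
nearest = min(distances, key=distances.get)  # assumption: nearest centroid is of interest

for name, d in distances.items():
    print(name, round(d, 2))
print("nearest:", nearest)
```

Rounded to two decimals, the distances match the values 1.06, 2.41, and 0.79 given in the footnotes, with CLU#3 being the closest centroid.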
 
References
Backhaus, K., Erichson, B., Plinke, W., Weiber, R. (2016). Multivariate Analysemethoden. Eine Anwendungsorientierte Einführung, 14th Edition. Berlin, Heidelberg: Springer.
Berg, S. (1981). Optimalität bei Cluster-Analysen. Münster: Dissertation, Fachbereich Wirtschafts- und Sozialwissenschaften, Westfälische Wilhelms-Universität Münster.
Bühl, A. (2019). SPSS: Einführung in die moderne Datenanalyse ab SPSS 25, 16th Edition. Munich: Pearson Studium.
Everitt, B.S., Rabe-Hesketh, S. (2004). A Handbook of Statistical Analyses Using Stata, 3rd Edition. Boca Raton: Chapman & Hall.
Goethe, J.W. (1987). Faust Part One. Translated with an Introduction and Notes by David Luke. New York: Oxford University Press.
Janssens, W., Wijnen, K., Pelsmacker de, P., Kenhove van, P. (2008). Marketing Research with SPSS. Essex: Pearson Education.
Kaufman, L., Rousseeuw, P.J. (1990). Finding Groups in Data. New York: Wiley.
Mooi, E., Sarstedt, M. (2019). A Concise Guide to Market Research. The Process, Data, and Methods Using IBM SPSS Statistics, 3rd Edition. Berlin, Heidelberg: Springer.
Ward, J. H., Jr. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58, 236–244.
Metadata
Title
Cluster Analysis
Written by
Thomas Cleff
Copyright year
2019
Publisher
Springer International Publishing
DOI
https://doi.org/10.1007/978-3-030-17767-6_12
