Published in: Journal of Classification 3/2020

16 July 2019

Versatile Linkage: a Family of Space-Conserving Strategies for Agglomerative Hierarchical Clustering

Authors: Alberto Fernández, Sergio Gómez

Abstract

Agglomerative hierarchical clustering can be implemented with several strategies that differ in the way elements of a collection are grouped together to build a hierarchy of clusters. Here we introduce versatile linkage, a new infinite system of agglomerative hierarchical clustering strategies based on generalized means, which go from single linkage to complete linkage, passing through arithmetic average linkage and other clustering methods yet unexplored such as geometric linkage and harmonic linkage. We compare the different clustering strategies in terms of cophenetic correlation, mean absolute error, and also tree balance and space distortion, two new measures proposed to describe hierarchical trees. Unlike the β-flexible clustering system, we show that the versatile linkage family is space-conserving.
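To make the idea of a linkage family based on generalized means concrete, here is a minimal sketch. It assumes that the distance between two clusters is the generalized (power) mean, with exponent p, of all pairwise distances between their members; the abstract does not state the exact formula, so this reading, the function names `power_mean` and `versatile_distance`, and the toy data are illustrative assumptions rather than the authors' implementation. Under this assumption, p = 1 gives arithmetic average linkage, p = 0 geometric linkage, p = -1 harmonic linkage, and the limits p → -∞ and p → +∞ give single and complete linkage.

```python
import math


def power_mean(values, p):
    """Generalized (power) mean of strictly positive values with exponent p."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if p == -math.inf:
        return min(values)  # limit p -> -inf: minimum
    if p == math.inf:
        return max(values)  # limit p -> +inf: maximum
    if p == 0:
        # limit p -> 0: geometric mean (requires values > 0)
        return math.exp(sum(math.log(v) for v in values) / len(values))
    # for p < 0 the values must also be > 0
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def versatile_distance(cluster_a, cluster_b, dist, p):
    """Assumed inter-cluster distance: power mean of all pairwise distances."""
    pairwise = [dist(a, b) for a in cluster_a for b in cluster_b]
    return power_mean(pairwise, p)


if __name__ == "__main__":
    # Hypothetical one-dimensional toy clusters, for illustration only.
    a, b = [0.0, 1.0], [3.0, 5.0]

    def absolute_difference(x, y):
        return abs(x - y)

    for label, p in [("single", -math.inf), ("harmonic", -1),
                     ("geometric", 0), ("arithmetic", 1),
                     ("complete", math.inf)]:
        print(label, versatile_distance(a, b, absolute_difference, p))
```

Because a power mean never decreases as p grows, the resulting inter-cluster distance rises monotonically from the single-linkage value to the complete-linkage value, which is one way to picture how the family spans the two extremes.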

Metadata
Title
Versatile Linkage: a Family of Space-Conserving Strategies for Agglomerative Hierarchical Clustering
Authors
Alberto Fernández
Sergio Gómez
Publication date
16 July 2019
Publisher
Springer US
Published in
Journal of Classification / Issue 3/2020
Print ISSN: 0176-4268
Electronic ISSN: 1432-1343
DOI
https://doi.org/10.1007/s00357-019-09339-z
