Efficient algorithms for divisive hierarchical clustering with the diameter criterion

Guénoche, A.; Hansen, P.; Jaumard, B.

doi:10.1007/BF02616245

Efficient algorithms for divisive hierarchical clustering with the diameter criterion

Published: January 1991

Volume 8, pages 5–30, (1991)
Cite this article

Journal of Classification Aims and scope Submit manuscript

A. Guénoche¹,
P. Hansen² &
B. Jaumard³

764 Accesses
93 Citations
Explore all metrics

Abstract

Divisive hierarchical clustering algorithms with the diameter criterion proceed by recursively selecting the cluster with largest diameter and partitioning it into two clusters whose largest diameter is smallest possible. We provide two such algorithms with complexitiesO(\(\overline M \) N ²) andO(N ²logN) respectively, where\(\overline M \) denotes the maximum number of clusters in a partition andN the number of entities to be clustered. The former algorithm, an efficient implementation of an algorithm of Hubert, allows to find all partitions into at most\(\overline M \) clusters and is inO(N ²) for fixed\(\overline M \). Moreover, if in each partitioning the size of the largest cluster is bounded byp times the number of entities in the set to be partitioned, with 1/2<=p<1, it provides a complete hierarchy of partitionsO(N ² logN) time. The latter algorithm, a refinement of an algorithm of Rao allows to build a complete hierarchy of partitions inO(N ² logN) time without any restriction. Comparative computational experiments with both algorithms and with an agglomerative hierarchical algorithm of Benzécri are reported.

Résumé

Les algorithmes de classification hiérarchique descendante utilisant le critère du diamètre, sélectionnent récursivement la classe de plus grand diamètre et la partitionnent en deux classes, dont le plus grand diamètre est le plus, petit possible. Nous proposons deux tels algorithmes, avec des complexités enO (\(\overline M \)N²) etO(N ² logN) respectivement, où\(\overline M \) désigne le nombre maximum de classes d'une partition etN le nombre d'objets à classifier. Le premier algorithme, une implantation d'un algorithme de Hubert, permet de construire des partitions avec au plus\(\overline M \) classes et est enO(N ²) pour\(\overline M \) fixé. De plus, si dans chaque bipartition le nombre d'objets de la plus grande classe, est borné parp fois le nombre d'objets de l'ensemble à partitionner, où 1/2≤p<1, cet algorithme permet de construire une hiérarchie complète de partitions en tempsO(N ² logN). Le second algorithme, un raffinement d'un algorithme de Rao, permet de construire une hiérarchie complète de partitions en tempsO(N ² logN) sans aucune restriction On présente également des résultats de calcul comparatifs pour les deux algorithmes et pour l'algorithme de classification hiérarchique ascendante de Benzécri.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

A Comparative Study of Divisive and Agglomerative Hierarchical Clustering Algorithms

Article 01 July 2018

Maurice Roux

A Heuristic Automatic Clustering Method Based on Hierarchical Clustering

A New, Fast and Accurate Algorithm for Hierarchical Clustering on Euclidean Distances

References

AHO, A. V., HOPCROFT, J. F., and ULLMAN, J. D. (1974).The Design and Analysis of Computer Algorithms, Reading, Massachusetts: Addison-Wesley.
MATH Google Scholar
BENZÉCRI, J. P. (1973),L'Analyse de Données. 1, La Taxinomie, Paris: Dunod.
Google Scholar
BENZÉCRI, J. P. (1982), “Construction d'une classification ascendante hiérarchique par la recherche en chaîne des voisins réciproques,”Les Cahiers de l'Analyse des Données, (VII 2, 209–218.
Google Scholar
BRUCKER, P. (1978), “On the Complexity of Clustering Problems,” inOptimization and Operations Research, Lecture Notes in Economics and Mathematical Systems, 157, Eds. M. Beckmann and H. Kunzi, Heidelberg: Springer-Verlag, 45–54.
Google Scholar
DAY, H. E., and EDELSBRUNNER, H. (1984), “Effcient Algorithrns for Agglomerative Hierarchical Clustering Methods,”Journal of Classification, 1, 7–24.
Article MATH Google Scholar
DEFAYS, D. (1977), “An Efficient Algorithm for a Complete Link Method,”The Computer Journal, 20(4) 364–366.
Article MathSciNet MATH Google Scholar
DELATTRE, M., and HANSEN, P. (1980), “Bicriterion Cluster Analysis,”IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-2(4), 277–291.
Article MATH Google Scholar
DIDAY, E., et al. (1979),Optimisation en Classification Automatique, Le Chesnay: INRIA.
MATH Google Scholar
DIJKSTRA, E. W. (1959), “A Note on Two Problems in Connexion with Graphs,”Numerische Mathematik, 1, 269–271.
Article MATH MathSciNet Google Scholar
DORING, V., HANSEN, P., and JAUMARD, B. (1990), “Divisive Hierarchical Clustering with the Diameter Criterion: Hubet's Algorithm,”Cahiers du GERAD (forthcoming).
DORING, V., HANSEN, P., and JAUMARD, B. (1990), “Divisive Hierarchical Clustering with the Diameter Criterion: Algorithm Alltrees,”Cahiers du GERAD (forthcoming).
EDWARDS, A. F. S., and CAVALLI-SFORZA, L. L. (1965), “A Method for Cluster Analysis,”Biometrics, 21(2), 362–375.
Article Google Scholar
FISHER, R. A. (1936), “The Use of Multiple Measurements in Taxonomic Problems,”Annals of Eugenics, VII(II), 179–188.
Google Scholar
GAREY, M. R., and JOHNSON, D. S. (1979),Computers and Intractability: A Guide to the Theory of NP-Completeness. San Francisco: Freeman.
MATH Google Scholar
GROTSCHEL, M., and WAKABAYASHI, Y. (1989), “A Cutting-plane Algorithm for a Clustering Problem,”Mathematical Programing, 45(1),Series B, 59–96.
Article MathSciNet Google Scholar
GUÉNOCHE, A. (1989), “Partitions with Minimum Diameter,” Conference of the I.F.C.S., Charlottesville.
HANSEN, P., and DELATTRE, M. (1978), “Complete Link Cluster Analysis by Graph Coloring,”Journal of the American Statistical Association, 73, 397–403.
Article Google Scholar
HANSEN, P., and JAUMARD, B. (1987), “Minimum Sum of Diameters Clustering,”Journal of Classification, 4, 215–226.
Article MATH MathSciNet Google Scholar
HARMAN, H. H. (1967),Modern Factor Analysis Chicago: University of Chicago Press.
MATH Google Scholar
HUBERT, L. (1973), “Monotone Invariant Clustering Procedures,”Psychometrika, 38(1), 47–62.
Article MATH Google Scholar
JAMBU, M. (1978),Classification automatique pour l'analyse des données, Paris: Dunod (Cluster Analysis and Data Analysis, Amsterdam: North Holland, 1983).
Google Scholar
JUAN, J. (1982), “Programme de classification hiérarchique par l'algorithme de la recherche en chaîne des voisns réciproques,”Les Cahiers de l'Analyse des Données, VII(2), 219–225.
Google Scholar
KRUSKAL, J.B. (1956), “On the Shortest Spanning Subtree of a Graph,”Proceedings of the American Mathematical Society, 7, 48–50.
Article MathSciNet Google Scholar
LANCE, G. N., and WILLIAMS, W. T. (1967), “Mixed-data Classificatory Systems: I. Agglomerative Systems,”Australian Computer Journal, 1, 15–20.
Google Scholar
LECLERC, B. (1986), “Caractérisation, construction et dénombrement des ultramétriques supérieures minimales,”Statistique et Analyse des Données, 11(2), 26–50.
MATH MathSciNet Google Scholar
McNAUGHTON-SMITH, P., WILLIAMS, W. T., DALE, M. B., and MOCKETT, L. G. (1964), “Dissimilarity Analysis,” Nature, 202, 1034–1035.
Article MATH Google Scholar
MONMA, C., and SURI, S. (1989), “Partitioning Points and Graphs to Minimize the Maximum or the Sum of Diameters,”Proceedings of the Sixth International Conference on the Theory and Applications of Graphs, New York: Wiley.
Google Scholar
MURTAGH, F. (1983), “A Survey of Recent Advances in Hierarchical Algorithms,”The Computer Journal, 26(4), 354–359.
MATH Google Scholar
MURTAGH, F. (1984), “Complexities of Hierarchic Clustering Algorithms: State of the Art,”Computational Statistics Quarterly, 1(2), 101–113.
MATH MathSciNet Google Scholar
PRIM, R.C. (1957), “Shortest Connection Networks and Some Generalizations”Bell System Technical Journal, 36, 1389–1401.
Google Scholar
RAO, M. R. (1971), “Cluster Analysis and Mathematical Programming,”Journal of the American Statistical Association, 66, 622–626.
Article MATH Google Scholar
VICHI, M. (1985), “On a Flexible and Computationally Feasible Divisive Clustering Technique,”Revista di Statistics, 18(4), 200–208.
Google Scholar
WELCH, W.J. (1982), “Algorithmic Complexity: Three NP-Hard Problems in Computational Statistics,”Journal of Statistical Computing and Simulation, 15, 17–25.
MATH MathSciNet Google Scholar
WILLIAMS, J.W.J. (1964), “Algorithm 232, Heapsort,”Communications of the Association for Computing Machinery, 7(6), 347–348.
Google Scholar

Download references

Author information

Authors and Affiliations

GRTC-CNRS, 31 Ch. J. Aiguier, 13402, Marseille, France
A. Guénoche
Rutgers Center for Operations Research, Hill Center for the Mathematical Sciences, Rutgers University, 08903, New Brunswick, New Jersey, USA
P. Hansen
GERAD and Départment de Mathématiques Apliguées, Ecolle Polytechnique de Montréal, Succursale “A”, Case Postale 6079, H3C 3A7, Montréal, Québec, Canada
B. Jaumard

Authors

A. Guénoche
View author publications
You can also search for this author in PubMed Google Scholar
P. Hansen
View author publications
You can also search for this author in PubMed Google Scholar
B. Jaumard
View author publications
You can also search for this author in PubMed Google Scholar

Rights and permissions

Reprints and permissions

About this article

Cite this article

Guénoche, A., Hansen, P. & Jaumard, B. Efficient algorithms for divisive hierarchical clustering with the diameter criterion. Journal of Classification 8, 5–30 (1991). https://doi.org/10.1007/BF02616245

Download citation

Issue Date: January 1991
DOI: https://doi.org/10.1007/BF02616245

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Efficient algorithms for divisive hierarchical clustering with the diameter criterion

Abstract

Résumé

Access this article

Similar content being viewed by others

A Comparative Study of Divisive and Agglomerative Hierarchical Clustering Algorithms

A Heuristic Automatic Clustering Method Based on Hierarchical Clustering

A New, Fast and Accurate Algorithm for Hierarchical Clustering on Euclidean Distances

References

Author information

Authors and Affiliations

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Abstract

Résumé

Access this article

Similar content being viewed by others

A Comparative Study of Divisive and Agglomerative Hierarchical Clustering Algorithms

A Heuristic Automatic Clustering Method Based on Hierarchical Clustering

A New, Fast and Accurate Algorithm for Hierarchical Clustering on Euclidean Distances

References

Author information

Authors and Affiliations

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation