
Clustering the rows and columns of a contingency table

Journal of Classification

Abstract

A number of ways of investigating heterogeneity in a two-way contingency table are reviewed. In particular, we consider chi-square decompositions of the Pearson chi-square statistic with respect to the nodes of a hierarchical clustering of the rows and/or the columns of the table. A cut-off point which indicates “significant clustering” may be defined on the binary trees associated with the respective row and column cluster analyses. This approach provides a simple graphical procedure which is useful in interpreting a significant chi-square statistic of a contingency table.
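The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not state the clustering criterion, so the sketch assumes a greedy agglomeration that, at each step, pools the pair of rows whose merging reduces the Pearson chi-square statistic the least. The function names `pearson_chi2` and `cluster_rows` are hypothetical, and the table is assumed to have strictly positive row and column totals. Because pooling rows never increases the statistic and the fully pooled one-row table has a chi-square of zero, the reductions recorded at the nodes of the tree add up to the chi-square of the original table. The cut-off for "significant clustering" is not implemented here, since the abstract does not give its critical values.

```python
import numpy as np


def pearson_chi2(table):
    """Pearson chi-square statistic of a two-way table of counts."""
    t = np.asarray(table, dtype=float)
    n = t.sum()
    # Expected counts under independence of rows and columns.
    expected = np.outer(t.sum(axis=1), t.sum(axis=0)) / n
    return float(((t - expected) ** 2 / expected).sum())


def cluster_rows(table):
    """Greedy hierarchical clustering of the rows (illustrative assumption).

    Returns one tuple per node of the binary tree:
    (members of first cluster, members of second cluster, chi-square reduction).
    """
    rows = [np.asarray(r, dtype=float) for r in table]
    members = [(i,) for i in range(len(rows))]
    nodes = []
    while len(rows) > 1:
        current = pearson_chi2(rows)
        best = None
        # Try every pair and keep the one whose pooling loses the least chi-square.
        for a in range(len(rows)):
            for b in range(a + 1, len(rows)):
                pooled = [r for k, r in enumerate(rows) if k not in (a, b)]
                pooled.append(rows[a] + rows[b])
                drop = current - pearson_chi2(pooled)
                if best is None or drop < best[0]:
                    best = (drop, a, b)
        drop, a, b = best
        nodes.append((members[a], members[b], drop))
        merged_row = rows[a] + rows[b]
        merged_members = members[a] + members[b]
        rows = [r for k, r in enumerate(rows) if k not in (a, b)] + [merged_row]
        members = [m for k, m in enumerate(members) if k not in (a, b)] + [merged_members]
    return nodes


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    example = [[20, 10, 5], [18, 12, 6], [5, 10, 25], [4, 12, 30]]
    for left, right, drop in cluster_rows(example):
        print(left, right, round(drop, 4))
    print("total chi-square:", round(pearson_chi2(example), 4))
```

Applying `cluster_rows` to the transposed table clusters the columns in the same way. The per-node reductions are the quantities one would plot as node heights on the binary tree.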

Additional information

The author gratefully acknowledges the constructive comments of the referees and the editor.

About this article

Cite this article

Greenacre, M.J. Clustering the rows and columns of a contingency table. Journal of Classification 5, 39–51 (1988). https://doi.org/10.1007/BF01901670

  • DOI: https://doi.org/10.1007/BF01901670
