Abstract
A number of ways of investigating heterogeneity in a two-way contingency table are reviewed. In particular, we consider chi-square decompositions of the Pearson chi-square statistic with respect to the nodes of a hierarchical clustering of the rows and/or the columns of the table. A cut-off point which indicates “significant clustering” may be defined on the binary trees associated with the respective row and column cluster analyses. This approach provides a simple graphical procedure which is useful in interpreting a significant chi-square statistic of a contingency table.
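The decomposition described above can be illustrated with a minimal sketch. It assumes a Ward-type agglomeration in which, at each step, the two rows whose merging loses the least Pearson chi-square are combined; the abstract does not state the clustering criterion, so this choice is an assumption. The table of counts and the function names `pearson_chi2` and `cluster_rows` are hypothetical, and the choice of cut-off for "significant clustering" is not shown. Because collapsing all rows into one leaves a chi-square of zero, the losses at the nodes of the tree sum to the chi-square statistic of the original table.

```python
import itertools
import numpy as np


def pearson_chi2(table):
    """Pearson chi-square statistic of a two-way table of counts.

    Assumes every row and column total is positive.
    """
    table = np.asarray(table, dtype=float)
    n = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    return float(((table - expected) ** 2 / expected).sum())


def cluster_rows(table):
    """Greedy agglomeration of the rows of a contingency table.

    At each step the pair of (possibly already merged) rows whose pooling
    reduces the chi-square statistic the least is combined. Returns one
    tuple (members_a, members_b, chi2_loss) per node of the binary tree.
    """
    rows = [np.asarray(r, dtype=float) for r in table]
    members = [(i,) for i in range(len(rows))]
    nodes = []
    while len(rows) > 1:
        current = pearson_chi2(np.vstack(rows))
        best = None
        for a, b in itertools.combinations(range(len(rows)), 2):
            rest = [r for k, r in enumerate(rows) if k not in (a, b)]
            collapsed = np.vstack(rest + [rows[a] + rows[b]])
            loss = current - pearson_chi2(collapsed)
            if best is None or loss < best[0]:
                best = (loss, a, b)
        loss, a, b = best
        nodes.append((members[a], members[b], loss))
        new_row = rows[a] + rows[b]
        new_members = members[a] + members[b]
        rows = [r for k, r in enumerate(rows) if k not in (a, b)] + [new_row]
        members = [m for k, m in enumerate(members) if k not in (a, b)]
        members.append(new_members)
    return nodes


# Hypothetical 4 x 3 table of counts, for illustration only.
table = np.array([
    [20, 10, 5],
    [18, 12, 6],
    [5, 10, 20],
    [6, 9, 22],
])

total = pearson_chi2(table)
nodes = cluster_rows(table)

for left, right, loss in nodes:
    print(f"merge {left} + {right}: chi-square lost = {loss:.3f}")
print(f"sum over nodes = {sum(loss for _, _, loss in nodes):.3f}")
print(f"table chi-square = {total:.3f}")
```

Applying the same function to the transposed table clusters the columns. Plotting each node at a height equal to its chi-square loss gives a binary tree on which a critical value, once chosen, can be drawn as a horizontal cut.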
Additional information
The author gratefully acknowledges the constructive comments of the referees and the editor.
Cite this article
Greenacre, M.J. Clustering the rows and columns of a contingency table. Journal of Classification 5, 39–51 (1988). https://doi.org/10.1007/BF01901670