Abstract
Clustering or classifying individuals into groups such that there is relative homogeneity within the groups and heterogeneity between the groups is a problem which has been considered for many years. Most available clustering techniques are applicable only to a two-way data set, where one of the modes is to be partitioned into groups on the basis of the other mode. Suppose, however, that the data set is three-way. Then what is needed is a multivariate technique which will cluster one of the modes on the basis of both of the other modes simultaneously. It is shown that, by appropriate specification of the underlying model, the mixture maximum likelihood approach to clustering can be applied in the context of a three-way table. The approach is illustrated using a soybean data set which consists of multiattribute measurements on a number of genotypes, each grown in several environments. Although the problem is set in the framework of clustering genotypes, the technique is applicable to other types of three-way data sets.
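As a rough illustration of how a mixture likelihood might be fitted to a three-way array, the sketch below runs an EM algorithm under one possible model specification. The specification is an assumption made here, since the abstract does not state it: given membership of group k, the attribute vector of a genotype in environment j is taken to be multivariate normal with mean mu[k, j] and covariance Sigma[k], independently across environments. The function name `fit_three_way_mixture`, the array layout (genotypes × environments × attributes), the farthest-point starting partition, and the small ridge term are all hypothetical choices for this example and are not taken from the article.

```python
import numpy as np


def fit_three_way_mixture(X, n_groups, n_iter=50, ridge=1e-6):
    """EM sketch for a normal mixture on a three-way array X of shape (n, r, p).

    Assumed model (illustrative only): within group k, the p attributes of a
    genotype in environment j are N(mu[k, j], Sigma[k]), independent over the
    r environments. Returns mixing proportions, means, covariances, and the
    posterior group-membership probabilities tau of shape (n, n_groups).
    """
    n, r, p = X.shape

    # Deterministic start: farthest-point centres on the flattened profiles,
    # then a hard assignment of every genotype to its nearest centre.
    Z = X.reshape(n, -1)
    centers = [0]
    for _ in range(n_groups - 1):
        d2 = ((Z[:, None, :] - Z[centers][None, :, :]) ** 2).sum(-1).min(axis=1)
        centers.append(int(d2.argmax()))
    d2 = ((Z[:, None, :] - Z[centers][None, :, :]) ** 2).sum(-1)
    tau = np.eye(n_groups)[d2.argmin(axis=1)]

    for _ in range(n_iter):
        # M-step: weighted proportions, environment-specific means, and one
        # covariance matrix per group pooled over the r environments.
        nk = tau.sum(axis=0) + 1e-12
        pi = nk / nk.sum()
        mu = np.einsum("ik,ijp->kjp", tau, X) / nk[:, None, None]
        Sigma = np.empty((n_groups, p, p))
        for k in range(n_groups):
            d = X - mu[k]
            S = np.einsum("i,ijp,ijq->pq", tau[:, k], d, d)
            Sigma[k] = S / (r * nk[k]) + ridge * np.eye(p)

        # E-step: log of (proportion times group density) for each genotype,
        # summing the Mahalanobis terms over environments.
        logw = np.empty((n, n_groups))
        for k in range(n_groups):
            d = X - mu[k]
            sol = np.linalg.solve(Sigma[k], d.reshape(-1, p).T).T.reshape(n, r, p)
            maha = np.einsum("ijp,ijp->i", d, sol)
            _, logdet = np.linalg.slogdet(Sigma[k])
            logw[:, k] = np.log(pi[k]) - 0.5 * (
                maha + r * logdet + r * p * np.log(2.0 * np.pi)
            )
        logw -= logw.max(axis=1, keepdims=True)  # stabilise before exponentiating
        tau = np.exp(logw)
        tau /= tau.sum(axis=1, keepdims=True)

    return pi, mu, Sigma, tau
```

Each genotype can then be assigned to the group with the largest posterior probability, for example with `tau.argmax(axis=1)`. Because the whole (environments × attributes) profile of a genotype enters a single group density, the genotypes are clustered on the basis of both other modes at once rather than one environment or one attribute at a time.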
Cite this article
Basford, K.E., McLachlan, G.J. The mixture method of clustering applied to three-way data. Journal of Classification 2, 109–125 (1985). https://doi.org/10.1007/BF01908066
DOI: https://doi.org/10.1007/BF01908066