Abstract
A method for structural analysis of multivariate data is proposed that combines features of regression analysis and principal component analysis. In this method, the original data are first decomposed into several components according to external information. The components are then subjected to principal component analysis to explore structures within the components. It is shown that this requires the generalized singular value decomposition of a matrix with certain metric matrices. A numerical method based on the QR decomposition is described, which simplifies the computation considerably. The proposed method includes a number of interesting special cases, whose relations to existing methods are discussed. Examples are given to demonstrate practical uses of the method.
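The two stages described above (decompose the data using external information, then run a principal component analysis on each part) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes identity metric matrices, so that the generalized singular value decomposition reduces to an ordinary SVD, and it assumes the external information arrives as full-column-rank matrices. The names `Z` (data), `G` (external information on subjects), `H` (external information on variables), and all function names are hypothetical choices made for this illustration.

```python
import numpy as np

def orthonormal_basis(X):
    """Orthonormal basis of the column space of X from a reduced QR
    decomposition (assumes X has full column rank)."""
    Q, _ = np.linalg.qr(X)
    return Q

def decompose(Z, G, H):
    """Split Z into four mutually orthogonal parts using projectors onto
    the column spaces of G (rows/subjects) and H (columns/variables).
    Identity metrics are assumed throughout."""
    Qg = orthonormal_basis(G)
    Qh = orthonormal_basis(H)
    Pg = Qg @ Qg.T                      # projector onto span(G)
    Ph = Qh @ Qh.T                      # projector onto span(H)
    In = np.eye(Z.shape[0])
    Im = np.eye(Z.shape[1])
    return {
        "both": Pg @ Z @ Ph,                     # explained by G and H
        "subjects_only": Pg @ Z @ (Im - Ph),     # explained by G alone
        "variables_only": (In - Pg) @ Z @ Ph,    # explained by H alone
        "residual": (In - Pg) @ Z @ (Im - Ph),   # explained by neither
    }

def pca_of_component(C, rank):
    """Rank-limited SVD of one component (ordinary PCA under identity metrics)."""
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return U[:, :rank], s[:rank], Vt[:rank]

# Synthetic data, used only to exercise the sketch.
rng = np.random.default_rng(0)
Z = rng.standard_normal((20, 6))   # 20 subjects, 6 variables
G = rng.standard_normal((20, 3))   # hypothetical subject-side information
H = rng.standard_normal((6, 2))    # hypothetical variable-side information

parts = decompose(Z, G, H)
U, s, Vt = pca_of_component(parts["both"], rank=2)
```

Because the four parts are mutually orthogonal, their sums of squares add up to the total sum of squares of `Z`, which is what makes a separate component analysis of each part interpretable. Non-identity metric matrices would require transforming `Z`, `G`, and `H` before these steps; that case is not covered here.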
Additional information
The work reported in this paper was supported by grant A6394 from the Natural Sciences and Engineering Research Council of Canada to the first author. Thanks are due to Jim Ramsay, Haruo Yanai, Henk Kiers, and Shizuhiko Nishisato for their insightful comments on earlier versions of this paper. Jim Ramsay, in particular, suggested the use of the QR decomposition, which simplified the presentation of the paper considerably.
About this article
Cite this article
Takane, Y., Shibayama, T. Principal component analysis with external information on both subjects and variables. Psychometrika 56, 97–120 (1991). https://doi.org/10.1007/BF02294589
DOI: https://doi.org/10.1007/BF02294589