Published in: Advances in Data Analysis and Classification 3/2018

13.04.2017 | Regular Article

Sparsest factor analysis for clustering variables: a matrix decomposition approach

Authors: Kohei Adachi, Nickolay T. Trendafilov

Abstract

We propose a new procedure for sparse factor analysis (FA) in which each variable loads on only one common factor. Thus, the loading matrix has a single nonzero element in each row and zeros elsewhere. Such a loading matrix is the sparsest possible for a given number of variables and common factors. For this reason, the proposed method is named sparsest FA (SSFA). It may also be called FA-based variable clustering, since the variables loading on the same common factor can be classified into one cluster. In SSFA, all model parts of FA (common factors, their correlations, loadings, unique factors, and unique variances) are treated as fixed unknown parameter matrices, and their least squares function is minimized through a specific data matrix decomposition. A useful feature of the algorithm is that the matrix of common factor scores is re-parameterized using the QR decomposition in order to estimate factor correlations efficiently. A simulation study shows that the proposed procedure can exactly identify the true sparsest models. Real data examples demonstrate the usefulness of the variable clustering performed by SSFA.
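The two structural ideas in the abstract can be illustrated with a small sketch. It is not the authors' SSFA algorithm: the function names, the toy loading matrix, and the toy score matrix F are all hypothetical, and a plain Gram-Schmidt routine stands in for whatever QR implementation the paper uses. The first part reads variable clusters off a loading matrix with one nonzero entry per row; the second part checks the identity F'F = R'R that holds when F = QR with orthonormal Q, which is why the factor cross-products can be carried by the triangular factor R alone.

```python
from math import sqrt, isclose


def is_sparsest(loadings, tol=1e-12):
    """True if every row of the loading matrix has exactly one nonzero entry."""
    return all(sum(abs(x) > tol for x in row) == 1 for row in loadings)


def clusters_from_loadings(loadings, tol=1e-12):
    """Group variable indices by the single factor each variable loads on.

    Assumes is_sparsest(loadings) holds.
    """
    clusters = {}
    for var, row in enumerate(loadings):
        factor = next(j for j, x in enumerate(row) if abs(x) > tol)
        clusters.setdefault(factor, []).append(var)
    return clusters


def thin_qr(F):
    """Classical Gram-Schmidt QR of an n x m matrix given as a list of rows.

    Assumes full column rank. Returns Q (n x m, orthonormal columns)
    and R (m x m, upper triangular) with F = Q R.
    """
    n, m = len(F), len(F[0])
    cols = [[F[i][j] for i in range(n)] for j in range(m)]
    q_cols = []
    R = [[0.0] * m for _ in range(m)]
    for j, col in enumerate(cols):
        v = col[:]
        for k, q in enumerate(q_cols):
            R[k][j] = sum(qi * ci for qi, ci in zip(q, col))
            v = [vi - R[k][j] * qi for vi, qi in zip(v, q)]
        R[j][j] = sqrt(sum(vi * vi for vi in v))
        q_cols.append([vi / R[j][j] for vi in v])
    Q = [[q_cols[j][i] for j in range(m)] for i in range(n)]
    return Q, R


def gram(M):
    """Return M'M for a matrix stored as a list of rows."""
    m = len(M[0])
    return [[sum(row[a] * row[b] for row in M) for b in range(m)]
            for a in range(m)]


# Hypothetical loadings: 5 variables, 2 factors, one nonzero per row.
loadings = [
    [0.8, 0.0],
    [0.7, 0.0],
    [0.0, 0.9],
    [0.0, -0.6],
    [0.5, 0.0],
]
clusters = clusters_from_loadings(loadings)  # {0: [0, 1, 4], 1: [2, 3]}

# Hypothetical factor score matrix (4 observations, 2 factors).
F = [[1.0, 0.5], [-1.0, 0.5], [1.0, -1.5], [-1.0, 0.5]]
Q, R = thin_qr(F)

# Because Q has orthonormal columns, F'F equals R'R.
same_cross_products = all(
    isclose(a, b, abs_tol=1e-9)
    for row_f, row_r in zip(gram(F), gram(R))
    for a, b in zip(row_f, row_r)
)
```

In this toy case variables 0, 1 and 4 form one cluster and variables 2 and 3 the other. If F were scaled so that F'F divided by the number of observations has a unit diagonal, the same identity would give the factor correlation matrix from R alone; how SSFA exploits this in its updates is described in the paper itself, not here.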

Metadata
Title
Sparsest factor analysis for clustering variables: a matrix decomposition approach
Authors
Kohei Adachi
Nickolay T. Trendafilov
Publication date
13.04.2017
Publisher
Springer Berlin Heidelberg
Published in
Advances in Data Analysis and Classification / Issue 3/2018
Print ISSN: 1862-5347
Electronic ISSN: 1862-5355
DOI
https://doi.org/10.1007/s11634-017-0284-z
