Published in: Journal of Quantitative Economics, Special Issue 1/2022

23 August 2022 | Original Article

Factor Analysis Regression for Predictive Modeling with High-Dimensional Data

Authors: Randy Carter, Netsanet Michael

Abstract

Factor analysis regression (FAR) of \(y_i\) on \({\varvec{x}}_i=(x_{1i},x_{2i},\ldots ,x_{pi})\), \(i = 1,2,\ldots ,n\), has been studied only in the low-dimensional case \((p<n)\), using maximum likelihood (ML) factor extraction. The ML method breaks down in high-dimensional cases \((p>n)\). In this paper, we develop a high-dimensional version of FAR based on a computationally efficient method of factor extraction. We compare the performance of our high-dimensional FAR with that of partial least squares regression (PLSR) and principal component regression (PCR) under three underlying correlation structures: arbitrary correlation, a factor model correlation structure, and independence of \(y\) and \({\varvec{x}}\). Under each structure, we generated Monte Carlo training samples of sizes \(n<p\) from the corresponding multivariate normal distribution, with parameters fixed at estimates obtained from analyses of real data sets. Under the independence structure, PLSR over-fitted severely compared with FAR and PCR. Under the two dependent structures, FAR had a notably better average mean square error of prediction than PCR, while the performance of FAR and PLSR did not differ notably. Thus, overall, FAR performed better than either PLSR or PCR.
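
To make the Monte Carlo design concrete, the following is a minimal sketch of one of the three scenarios only: PCR under the independence structure with \(p>n\). It is not the authors' FAR method, whose factor extraction step is not described in the abstract. The function names `pcr_fit_predict` and `simulate_mse`, the dimensions (`n=40`, `p=200`), the number of components (`k=5`), and the use of standard normal data are all illustrative assumptions; the paper instead fixes parameters at estimates from real data sets.

```python
import numpy as np


def pcr_fit_predict(X_train, y_train, X_test, k):
    """Principal component regression with k components (NumPy only)."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    # Thin SVD is well defined when p > n; rows of Vt are the PC loadings.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                       # p x k loading matrix
    scores = Xc @ V_k                    # n x k component scores
    # Least squares regression of centered y on the k scores.
    gamma, *_ = np.linalg.lstsq(scores, y_train - y_mean, rcond=None)
    return y_mean + (X_test - x_mean) @ V_k @ gamma


def simulate_mse(n=40, p=200, k=5, n_test=500, reps=50, seed=0):
    """Average test MSE of PCR when y is independent of x (assumed setup)."""
    rng = np.random.default_rng(seed)
    mses = []
    for _ in range(reps):
        X = rng.standard_normal((n, p))          # training predictors
        y = rng.standard_normal(n)               # response, independent of X
        X_new = rng.standard_normal((n_test, p))
        y_new = rng.standard_normal(n_test)
        pred = pcr_fit_predict(X, y, X_new, k)
        mses.append(np.mean((y_new - pred) ** 2))
    return float(np.mean(mses))


if __name__ == "__main__":
    print("average test MSE under independence:", simulate_mse())
```

Because \(y\) carries no signal in this scenario, no method can achieve a test error below the variance of \(y\) (here 1), so any excess reflects over-fitting. Raising `k` toward `n - 1` lets PCR reproduce the training responses exactly, which is one simple way to see how over-fitting arises when \(p>n\).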

Metadata
Title
Factor Analysis Regression for Predictive Modeling with High-Dimensional Data
Authors
Randy Carter
Netsanet Michael
Publication date
23 August 2022
Publisher
Springer India
Published in
Journal of Quantitative Economics / Special Issue 1/2022
Print ISSN: 0971-1554
Electronic ISSN: 2364-1045
DOI
https://doi.org/10.1007/s40953-022-00322-x