
2016 | Original Paper | Book Chapter

2. Principal Component Analysis

Authors: René Vidal, Yi Ma, S. Shankar Sastry

Published in: Generalized Principal Component Analysis

Publisher: Springer New York


Abstract

Principal component analysis (PCA) is the problem of fitting a low-dimensional affine subspace to a set of data points in a high-dimensional space. PCA is, by now, well established in the literature, and has become one of the most useful tools for data modeling, compression, and visualization.
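The abstract's description of PCA as fitting a low-dimensional affine subspace can be made concrete with a short numerical sketch (not from the chapter; the data, dimensions, and variable names here are illustrative). The standard recipe is: center the data, then keep the top singular directions of the centered data matrix.

```python
import numpy as np

# Illustrative data: 200 points near a 2-dimensional affine subspace of R^5.
rng = np.random.default_rng(0)
basis = np.linalg.qr(rng.standard_normal((5, 2)))[0]  # orthonormal basis of the subspace
coords = rng.standard_normal((200, 2)) * [3.0, 1.5]   # low-dimensional coordinates
mean = np.array([1.0, -2.0, 0.5, 0.0, 2.0])           # offset of the affine subspace
X = coords @ basis.T + mean + 0.01 * rng.standard_normal((200, 5))

# PCA: center the data, then keep the top-d singular directions.
mu = X.mean(axis=0)
Xc = X - mu
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
d = 2
components = Vt[:d]                   # principal directions, one per row
projected = Xc @ components.T         # d-dimensional representation
X_hat = projected @ components + mu   # reconstruction on the fitted affine subspace

print(np.abs(X - X_hat).max())  # small: the data lies near the fitted subspace
```

Because the samples are stored as rows here, the principal directions come out as rows of `Vt`; with the book's convention of a D × N data matrix they would be columns of `U`.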


Footnotes
1. The reason for this is that both \(\boldsymbol{u}_{1}\) and its orthogonal complement \(\boldsymbol{u}_{1}^{\perp }\) are invariant subspaces of \(\Sigma _{\boldsymbol{x}}\).
 
2. In the statistical setting, \(\boldsymbol{x}_{j}\) and \(\boldsymbol{y}_{j}\) will be samples of two random variables \(\boldsymbol{x}\) and \(\boldsymbol{y}\), respectively. Then this constraint is equivalent to setting their means to zero.
 
3. From a statistical standpoint, the column vectors of U give the directions in which the data X has the largest variance, whence the name “principal components.”
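Footnote 3 can be checked numerically. In the sketch below (not from the chapter; the data and names are illustrative, and samples are stored as rows, so the principal directions appear as rows of `Vt` rather than columns of U as in the book's D × N convention), the top eigenvector of the sample covariance coincides with the leading singular direction of the centered data, and the variance along it dominates the variance along a random direction:

```python
import numpy as np

rng = np.random.default_rng(1)
# Anisotropic cloud: the coordinate variances are roughly 16, 4, and 0.25.
X = rng.standard_normal((500, 3)) @ np.diag([4.0, 2.0, 0.5])
Xc = X - X.mean(axis=0)

# Top eigenvector of the sample covariance ...
Sigma = Xc.T @ Xc / len(Xc)
evals, evecs = np.linalg.eigh(Sigma)  # eigenvalues in ascending order
u1 = evecs[:, -1]                     # direction of largest variance

# ... coincides (up to sign) with the leading singular direction of the data.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
print(np.allclose(np.abs(Vt[0]), np.abs(u1)))  # True

# The variance along u1 dominates the variance along any other direction.
v = rng.standard_normal(3)
v /= np.linalg.norm(v)
print(np.var(Xc @ u1) >= np.var(Xc @ v))  # True
```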
 
4. In Section 1.2.1, we have seen an example in which a similar process can be applied to an ensemble of face images from multiple subspaces, where the first d = 3 principal components are calculated and visualized.
 
5. We leave it as an exercise to the reader to calculate the number of parameters needed to specify a d-dimensional subspace in \(\mathbb{R}^{D}\) and then the additional parameters needed to specify a Gaussian distribution inside the subspace.
 
6. Even if one chooses to compare models by their algorithmic complexity, such as the minimum message length (MML) criterion (Wallace and Boulton 1968) (an extension of Kolmogorov complexity to model selection), a strong connection with the above information-theoretic criteria, such as minimum description length (MDL), can be readily established via Shannon’s optimal coding theory (see Wallace and Dowe (1999)).
 
7. It can be shown that the nuclear norm is a convex envelope of the rank function for matrices.
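Because the nuclear norm is the convex envelope of the rank function, it is the standard convex surrogate in low-rank recovery; its proximal operator is singular value thresholding, the operation at the heart of the algorithm of Cai et al. (2008) in the references. A minimal sketch (the matrix sizes, noise level, and threshold are illustrative choices, not from the chapter):

```python
import numpy as np

def svt(A, tau):
    """Singular value thresholding: the proximal operator of tau * nuclear norm.

    Shrinks every singular value of A toward zero by tau, zeroing those below
    tau, which both denoises A and lowers its rank.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(2)
L = rng.standard_normal((30, 4)) @ rng.standard_normal((4, 30))  # rank-4 matrix
A = L + 0.1 * rng.standard_normal((30, 30))                      # noisy observation

A_hat = svt(A, tau=3.0)
print(np.linalg.matrix_rank(A_hat))  # far smaller than 30
```

The threshold trades off fidelity against rank: the noise contributes only small singular values, so shrinking by a `tau` above the noise level recovers a low-rank estimate.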
 
References

Akaike, H. (1977). A new look at the statistical model selection. IEEE Transactions on Automatic Control, 16(6), 716–723.
Basri, R., & Jacobs, D. (2003). Lambertian reflection and linear subspaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(2), 218–233.
Beltrami, E. (1873). Sulle funzioni bilineari. Giornale di Matematiche di Battaglini, 11, 98–106.
Cai, J.-F., Candès, E. J., & Shen, Z. (2008). A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4), 1956–1982.
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245–276.
Collins, M., Dasgupta, S., & Schapire, R. (2001). A generalization of principal component analysis to the exponential family. In Neural Information Processing Systems (Vol. 14).
Ding, C., Zha, H., He, X., Husbands, P., & Simon, H. D. (2004). Link analysis: Hubs and authorities on the world wide web. SIAM Review, 46(2), 256–268.
Donoho, D., & Gavish, M. (2014). The optimal hard threshold for singular values is \(4/\sqrt{3}\). IEEE Transactions on Information Theory, 60(8), 5040–5053.
Eckart, C., & Young, G. (1936). The approximation of one matrix by another of lower rank. Psychometrika, 1, 211–218.
Gabriel, K. R. (1978). Least squares approximation of matrices by additive and multiplicative models. Journal of the Royal Statistical Society B, 40, 186–196.
Georghiades, A., Belhumeur, P., & Kriegman, D. (2001). From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 643–660.
Hansen, M., & Yu, B. (2001). Model selection and the principle of minimum description length. Journal of the American Statistical Association, 96, 746–774.
Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24, 417–441.
Hubert, L., Meulman, J., & Heiser, W. (2000). Two purposes for matrix factorization: A historical appraisal. SIAM Review, 42(1), 68–82.
Jolliffe, I. (2002). Principal Component Analysis (2nd ed.). New York: Springer.
Jordan, M. (1874). Mémoire sur les formes bilinéaires. Journal de Mathématiques Pures et Appliquées, 19, 35–54.
Kanatani, K. (1998). Geometric information criterion for model selection. International Journal of Computer Vision (pp. 171–189).
Kleinberg, J. M. (1999). Authoritative sources in a hyperlinked environment. Journal of the ACM, 48, 604–632.
Minka, T. (2000). Automatic choice of dimensionality for PCA. In Neural Information Processing Systems (Vol. 13, pp. 598–604).
Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. The London, Edinburgh and Dublin Philosophical Magazine and Journal of Science, 2, 559–572.
Recht, B., Fazel, M., & Parrilo, P. (2010). Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Review, 52(3), 471–501.
Rissanen, J. (1978). Modeling by shortest data description. Automatica, 14, 465–471.
Shabalin, A., & Nobel, A. (2010). Reconstruction of a low-rank matrix in the presence of Gaussian noise (pp. 1–34). arXiv preprint 1007.4148.
Tipping, M., & Bishop, C. (1999b). Probabilistic principal component analysis. Journal of the Royal Statistical Society, 61(3), 611–622.
Turk, M., & Pentland, A. (1991). Face recognition using eigenfaces. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 586–591).
Wallace, C., & Boulton, D. (1968). An information measure for classification. The Computer Journal, 11, 185–194.
Wallace, C., & Dowe, D. (1999). Minimum message length and Kolmogorov complexity. The Computer Journal, 42(4), 270–283.
DOI: https://doi.org/10.1007/978-0-387-87811-9_2
