
2016 | OriginalPaper | Chapter

2. Principal Component Analysis

Authors: René Vidal, Yi Ma, S. Shankar Sastry

Published in: Generalized Principal Component Analysis

Publisher: Springer New York


Abstract

Principal component analysis (PCA) is the problem of fitting a low-dimensional affine subspace to a set of data points in a high-dimensional space. PCA is by now well established in the literature and has become one of the most useful tools for data modeling, compression, and visualization.
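
To make this description concrete, here is a minimal NumPy sketch (our own illustration, not code from the chapter; all names are ours) that fits a d-dimensional affine subspace by centering the data and truncating its singular value decomposition:

```python
import numpy as np

def pca_fit(X, d):
    """Fit a d-dimensional affine subspace to the columns of X (D x N).

    Returns the sample mean mu (the offset of the affine subspace) and a
    D x d orthonormal basis U spanning the directions of largest variance.
    """
    mu = X.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(X - mu, full_matrices=False)
    return mu, U[:, :d]                  # top-d left singular vectors

# Example: noisy samples near a 2-dimensional plane in R^5.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 200)) \
    + 0.01 * rng.standard_normal((5, 200))
mu, U = pca_fit(X, d=2)
X_hat = mu + U @ (U.T @ (X - mu))        # project data onto the fitted subspace
print(np.linalg.norm(X - X_hat))         # small residual: the fit is good
```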


Footnotes
1
The reason for this is that both \(\boldsymbol{u}_{1}\) and its orthogonal complement \(\boldsymbol{u}_{1}^{\perp }\) are invariant subspaces of \(\Sigma _{\boldsymbol{x}}\).
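To spell out the invariance claim (a one-line verification in our notation, using the symmetry of \(\Sigma _{\boldsymbol{x}}\)): if \(\Sigma _{\boldsymbol{x}}\boldsymbol{u}_{1} =\lambda _{1}\boldsymbol{u}_{1}\), then for every \(\boldsymbol{v}\perp \boldsymbol{u}_{1}\),
\[
\boldsymbol{u}_{1}^{\top }(\Sigma _{\boldsymbol{x}}\boldsymbol{v}) = (\Sigma _{\boldsymbol{x}}\boldsymbol{u}_{1})^{\top }\boldsymbol{v} =\lambda _{1}\,\boldsymbol{u}_{1}^{\top }\boldsymbol{v} = 0,
\]
so \(\Sigma _{\boldsymbol{x}}\) maps \(\boldsymbol{u}_{1}^{\perp }\) into itself, and it trivially maps the span of \(\boldsymbol{u}_{1}\) into itself.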
 
2
In the statistical setting, \(\boldsymbol{x}_{j}\) and \(\boldsymbol{y}_{j}\) will be samples of two random variables \(\boldsymbol{x}\) and \(\boldsymbol{y}\), respectively. Then this constraint is equivalent to setting their means to zero.
 
3
From a statistical standpoint, the column vectors of U give the directions in which the data X has the largest variance, whence the name “principal components.”
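A quick numerical check of this statistical interpretation (our own sketch; the data are synthetic): the variance of the data projected onto the i-th column of U equals \(\sigma _{i}^{2}/N\), where \(\sigma _{i}\) is the i-th singular value of the centered data matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 1000))        # D = 4 variables, N = 1000 samples
Xc = X - X.mean(axis=1, keepdims=True)    # center: enforce zero sample mean

U, s, _ = np.linalg.svd(Xc, full_matrices=False)
N = Xc.shape[1]

# Rows of U.T @ Xc are the data coordinates along each principal direction;
# their variances are the (decreasing) squared singular values divided by N.
proj_var = np.var(U.T @ Xc, axis=1)
print(np.allclose(proj_var, s**2 / N))    # True
```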
 
4
In Section 1.2.1, we have seen an example in which a similar process can be applied to an ensemble of face images from multiple subspaces, where the first d = 3 principal components are calculated and visualized.
 
5
We leave it as an exercise to the reader to calculate the number of parameters needed to specify a d-dimensional subspace in \(\mathbb{R}^{D}\), and then the additional parameters needed to specify a Gaussian distribution inside the subspace.
 
6
Even if one chooses to compare models by their algorithmic complexity, such as the minimum message length (MML) criterion (Wallace and Boulton 1968), an extension of Kolmogorov complexity to model selection, a strong connection with the above information-theoretic criteria, such as minimum description length (MDL), can be readily established via Shannon's optimal coding theory (see Wallace and Dowe 1999).
 
7
It can be shown that the nuclear norm is the convex envelope of the rank function over the set of matrices whose spectral norm is at most one.
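As a small numerical illustration of the envelope property (our own sketch, not from the chapter): on the set of matrices with spectral norm at most one, every singular value lies in [0, 1], so their sum, the nuclear norm, lower-bounds the number of nonzero singular values, i.e., the rank.

```python
import numpy as np

def nuclear_norm(A):
    """Sum of the singular values of A."""
    return np.linalg.svd(A, compute_uv=False).sum()

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 6))
A /= np.linalg.norm(A, ord=2)          # rescale so the spectral norm equals 1

# Each singular value now lies in [0, 1], so their sum (the nuclear norm)
# is at most the count of nonzero singular values (the rank).
print(nuclear_norm(A) <= np.linalg.matrix_rank(A))   # True
```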
 
Literature
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723.
Basri, R., & Jacobs, D. (2003). Lambertian reflectance and linear subspaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(2), 218–233.
Beltrami, E. (1873). Sulle funzioni bilineari. Giornale di Matematiche di Battaglini, 11, 98–106.
Cai, J.-F., Candès, E. J., & Shen, Z. (2008). A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4), 1956–1982.
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245–276.
Collins, M., Dasgupta, S., & Schapire, R. (2001). A generalization of principal component analysis to the exponential family. In Neural Information Processing Systems (Vol. 14).
Ding, C., Zha, H., He, X., Husbands, P., & Simon, H. D. (2004). Link analysis: Hubs and authorities on the world wide web. SIAM Review, 46(2), 256–268.
Donoho, D., & Gavish, M. (2014). The optimal hard threshold for singular values is \(4/\sqrt{3}\). IEEE Transactions on Information Theory, 60(8), 5040–5053.
Eckart, C., & Young, G. (1936). The approximation of one matrix by another of lower rank. Psychometrika, 1, 211–218.
Gabriel, K. R. (1978). Least squares approximation of matrices by additive and multiplicative models. Journal of the Royal Statistical Society B, 40, 186–196.
Georghiades, A., Belhumeur, P., & Kriegman, D. (2001). From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 643–660.
Hansen, M., & Yu, B. (2001). Model selection and the principle of minimum description length. Journal of the American Statistical Association, 96, 746–774.
Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24, 417–441.
Hubert, L., Meulman, J., & Heiser, W. (2000). Two purposes for matrix factorization: A historical appraisal. SIAM Review, 42(1), 68–82.
Jolliffe, I. (2002). Principal Component Analysis (2nd ed.). New York: Springer.
Jordan, M. (1874). Mémoire sur les formes bilinéaires. Journal de Mathématiques Pures et Appliquées, 19, 35–54.
Kanatani, K. (1998). Geometric information criterion for model selection. International Journal of Computer Vision, 26(3), 171–189.
Minka, T. (2000). Automatic choice of dimensionality for PCA. In Neural Information Processing Systems (Vol. 13, pp. 598–604).
Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. The London, Edinburgh and Dublin Philosophical Magazine and Journal of Science, 2, 559–572.
Recht, B., Fazel, M., & Parrilo, P. (2010). Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Review, 52(3), 471–501.
Shabalin, A., & Nobel, A. (2010). Reconstruction of a low-rank matrix in the presence of Gaussian noise. arXiv preprint arXiv:1007.4148.
Tipping, M., & Bishop, C. (1999b). Probabilistic principal component analysis. Journal of the Royal Statistical Society B, 61(3), 611–622.
Turk, M., & Pentland, A. (1991). Face recognition using eigenfaces. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 586–591).
Wallace, C., & Boulton, D. (1968). An information measure for classification. The Computer Journal, 11, 185–194.
Wallace, C., & Dowe, D. (1999). Minimum message length and Kolmogorov complexity. The Computer Journal, 42(4), 270–283.
Metadata
Title
Principal Component Analysis
Authors
René Vidal
Yi Ma
S. Shankar Sastry
Copyright Year
2016
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-0-387-87811-9_2