
About this Book

This book reports on the latest advances in concepts and further developments of principal component analysis (PCA), addressing a number of open problems related to dimensional reduction techniques and their extensions in detail. Bringing together research results previously scattered throughout many scientific journal papers worldwide, the book presents them in a methodologically unified form. Offering vital insights into the subject matter in self-contained chapters that balance the theory and concrete applications, and especially focusing on open problems, it is essential reading for all researchers and practitioners with an interest in PCA.



Sparse Principal Component Analysis via Rotation and Truncation

This chapter begins with the motivation of sparse PCA: to improve the physical interpretation of the loadings. Second, we introduce the issues involved in the sparse PCA problem that are distinct from those of the PCA problem. Third, we briefly review some sparse PCA algorithms in the literature, and comment on their limitations as well as unresolved problems. Fourth, we introduce one of the state-of-the-art algorithms, SPCArt (Hu et al., IEEE Trans. Neural Networks Learn. Syst. 27(4):875–890, 2016), including its motivating idea, formulation, optimization solution, and performance analysis. Along with this introduction, we describe how SPCArt addresses the unresolved problems. Fifth, based on the Eckart-Young theorem, we provide a unified view of a series of sparse PCA algorithms, including SPCArt. Finally, we make concluding remarks.
Zhenfang Hu, Gang Pan, Yueming Wang, Zhaohui Wu
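The rotate-and-truncate idea behind SPCArt can be sketched in a few lines of NumPy: alternately hard-threshold the rotated loadings and re-solve an orthogonal Procrustes problem for the rotation. This is a minimal illustration of the alternating scheme, not the full SPCArt algorithm (which offers several truncation operators and a performance analysis); the toy data and threshold value are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))

# PCA loadings: top-k right singular vectors of the centered data
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 3
V = Vt[:k].T                          # (10, 3) orthonormal loadings

R = np.eye(k)
lam = 0.25                            # hard-threshold level (assumed, illustrative)
for _ in range(50):
    # Truncation step: hard-threshold the rotated loadings
    Z = V @ R
    Z[np.abs(Z) < lam] = 0.0
    # renormalize nonzero columns to unit length
    norms = np.linalg.norm(Z, axis=0)
    Z[:, norms > 0] /= norms[norms > 0]
    # Rotation step: orthogonal Procrustes, R = argmin_R ||V R - Z||_F
    U, _, Wt = np.linalg.svd(V.T @ Z)
    R = U @ Wt

sparse_loadings = Z
print(np.count_nonzero(sparse_loadings), "nonzero entries out of", sparse_loadings.size)
```

Because the rotation stays orthogonal, the sparse loadings remain close to an orthonormal basis of the original principal subspace while many small entries are zeroed out.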

PCA, Kernel PCA and Dimensionality Reduction in Hyperspectral Images

In this chapter, applications of PCA and kernel PCA, together with their modified versions, are discussed in the field of dimensionality reduction of hyperspectral images. A hyperspectral image cube is a set of images from hundreds of narrow and contiguous bands of the electromagnetic spectrum, from the visible to the near-infrared regions, and usually contains a large amount of information with which to identify and distinguish spectrally unique materials. In hyperspectral image analysis, reducing the dimensionality is an important step whose aim is to discard the redundant bands and make classification less time consuming. Principal component analysis (PCA) and a modified version of PCA, i.e., segmented PCA, are useful for reducing the dimensionality. These PCA-based methods for hyperspectral images are described here in brief, along with their advantages and disadvantages. Dimensionality reduction using kernel PCA (a nonlinear variant of PCA) and its modification, i.e., clustering-oriented kernel PCA, are also elaborated in this chapter. The advantages and disadvantages of all these methods are experimentally evaluated over a few hyperspectral data sets with different performance measures.
Aloke Datta, Susmita Ghosh, Ashish Ghosh
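As a toy illustration of the dimensionality-reduction step, the sketch below applies linear PCA and RBF-kernel PCA to synthetic "pixel spectra". The data, the scikit-learn API choice, and the kernel parameter are assumptions for illustration, not the chapter's experimental setup.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA

rng = np.random.default_rng(0)
# toy stand-in for hyperspectral pixels: 500 pixels x 200 spectral bands,
# generated from 3 latent "material" signatures plus noise (assumed data)
abund = rng.dirichlet(np.ones(3), size=500)
sigs = rng.random((3, 200))
pixels = abund @ sigs + 0.01 * rng.standard_normal((500, 200))

# linear PCA: keep enough components to explain 99% of the variance
pca = PCA(n_components=0.99).fit(pixels)
reduced_lin = pca.transform(pixels)

# kernel PCA with an RBF kernel can capture nonlinear band dependencies
kpca = KernelPCA(n_components=10, kernel="rbf", gamma=1.0)
reduced_rbf = kpca.fit_transform(pixels)

print(reduced_lin.shape, reduced_rbf.shape)
```

Both reduced representations can then be fed to a classifier in place of the original 200 bands, which is the time saving the abstract refers to.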

Principal Component Analysis in the Presence of Missing Data

The aim of this chapter is to provide an overview of recent developments in principal component analysis (PCA) methods when the data are incomplete. Missing data bring uncertainty into the analysis and their treatment requires statistical approaches that are tailored to cope with specific missing data processes (i.e., ignorable and nonignorable mechanisms). Since the publication of the classic textbook by Jolliffe, which includes a short, same-titled section on the missing data problem in PCA, there have been a few methodological contributions that hinge upon a probabilistic approach to PCA. In this chapter, we unify methods for ignorable and nonignorable missing data in a general likelihood framework. We also provide real data examples to illustrate the application of these methods using the R language and environment for statistical computing and graphics.
Marco Geraci, Alessio Farcomeni
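One simple way to see the probabilistic idea in action is EM-style iterative imputation under an ignorable missingness mechanism: impute missing entries from the current low-rank reconstruction, refit the PCA model, and repeat. The chapter's own examples use R; the sketch below is an assumed NumPy analogue on toy data, not the chapter's likelihood framework.

```python
import numpy as np

rng = np.random.default_rng(0)
# low-rank data with entries missing completely at random (assumed toy data)
U = rng.standard_normal((200, 2))
V = rng.standard_normal((2, 8))
X = U @ V + 0.05 * rng.standard_normal((200, 8))
mask = rng.random(X.shape) < 0.2                        # True where missing
X_obs = np.where(mask, np.nan, X)

# EM-style iterative PCA: impute from the current rank-k reconstruction,
# refit, repeat (reasonable under an ignorable missingness mechanism)
k = 2
Xf = np.where(mask, np.nanmean(X_obs, axis=0), X_obs)   # mean-impute to start
for _ in range(100):
    mu = Xf.mean(axis=0)
    Uc, s, Vt = np.linalg.svd(Xf - mu, full_matrices=False)
    recon = mu + Uc[:, :k] * s[:k] @ Vt[:k]
    Xf = np.where(mask, recon, X_obs)

err = np.sqrt(np.mean((Xf[mask] - X[mask]) ** 2))
print("RMSE on missing entries:", round(err, 4))
```

Under a nonignorable mechanism this simple scheme is biased, which is precisely why the chapter develops tailored likelihood-based methods.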

Robust PCAs and PCA Using Generalized Mean

In this chapter, a robust principal component analysis (PCA) is described, which can overcome the problem that PCA is prone to outliers in the training set. Unlike other alternatives, which commonly replace the \(L_{2}\)-norm with other distance measures, our method alleviates the negative effect of outliers using the characteristics of the generalized mean while keeping the use of the Euclidean distance. The optimization problem based on the generalized mean is solved by a novel method. We also present a generalized sample mean, a generalization of the sample mean, to estimate a robust mean in the presence of outliers. The proposed method shows better than or equivalent performance to the conventional PCAs in various problems such as face reconstruction, clustering, and object categorization.
Jiyong Oh, Nojun Kwak
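A tiny numeric example shows why a generalized (power) mean of squared errors with exponent p < 1 tames outliers, while the ordinary arithmetic mean is dominated by them. The values below are made up for illustration; this is not the chapter's optimization algorithm.

```python
import numpy as np

def generalized_mean(a, p):
    """Power mean M_p(a) = (mean(a_i^p))^(1/p) for positive values a."""
    a = np.asarray(a, dtype=float)
    return np.mean(a ** p) ** (1.0 / p)

# squared reconstruction errors for 9 inliers and 1 large outlier (toy values)
errors = np.array([1.0] * 9 + [100.0])

m_arith = generalized_mean(errors, 1.0)   # ordinary mean: dominated by the outlier
m_robust = generalized_mean(errors, 0.1)  # p < 1: the outlier's influence shrinks

print(m_arith, m_robust)
```

The arithmetic mean jumps to 10.9, while the p = 0.1 generalized mean stays near the inlier level, so an objective built on it is far less distorted by the single outlier.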

Principal Component Analysis Techniques for Visualization of Volumetric Data

We investigate the use of Principal Component Analysis (PCA) for the visualization of 3D volumetric data. For static volume datasets, we assume, as input training samples, a set of images rendered from spherically distributed viewing positions using a state-of-the-art volume rendering technique. We compute a high-dimensional eigenspace that we can then use to synthesize arbitrary views of the dataset with minimal computation at run-time. Visual quality is improved by subdividing the training samples using two techniques: cell-based decomposition into equally sized spatial partitions and a more generalized variant, which we refer to as band-based PCA. The latter approach is further extended to the direct compression of time-varying volume data. This is achieved by taking, as input, the full 3D volumes that constitute the time-steps of the time-varying sequence and generating an eigenspace of volumes. Results indicate that, in both cases, PCA can be used for effective compression with minimal loss of perceptual quality, and could benefit applications such as client-server visualization systems.
Salaheddin Alakkari, John Dingliana
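The eigenspace idea can be sketched as PCA on vectorized views: each view is stored as a handful of coefficients and synthesized back with one small matrix product at run-time. The "rendered views" below are synthetic stand-ins lying near a low-dimensional subspace, so the numbers only illustrate the mechanism, not the paper's visual-quality results.

```python
import numpy as np

rng = np.random.default_rng(0)
# stand-in for rendered views: 60 "images" of 32x32 pixels lying near a
# low-dimensional subspace, as views of a volume from nearby angles do (toy data)
basis = rng.standard_normal((5, 32 * 32))
coeffs = rng.standard_normal((60, 5))
views = coeffs @ basis + 0.01 * rng.standard_normal((60, 32 * 32))

# build the eigenspace from the training views
mean = views.mean(axis=0)
_, s, Vt = np.linalg.svd(views - mean, full_matrices=False)
k = 5
eigenspace = Vt[:k]                          # (k, 1024)

# a view is stored as k coefficients instead of 1024 pixels ...
codes = (views - mean) @ eigenspace.T        # (60, k)
# ... and synthesized back with one small matrix product at run-time
recon = mean + codes @ eigenspace

err = np.max(np.abs(recon - views))
ratio = views.size / (codes.size + eigenspace.size + mean.size)
print("max reconstruction error:", err, " compression ratio ~", round(ratio, 1))
```

In a client-server setting, only the codes need to be transmitted per view once the eigenspace is cached, which is where the compression benefit comes from.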

Outlier-Resistant Data Processing with L1-Norm Principal Component Analysis

Principal Component Analysis (PCA) has been a cornerstone of data analysis for more than a century, with important applications across most fields of science and engineering. However, despite its many strengths, PCA is known to have a major drawback: it is very sensitive to the presence of outliers among the processed data. To counteract the impact of outliers in data analysis, researchers have long been working on robust modifications of PCA. One of the most successful (and promising) PCA alternatives is L1-PCA. L1-PCA relies on the L1-norm of the processed data and, thus, tames any outliers that may exist in the dataset. Experimental studies in various applications have shown that L1-PCA (i) attains similar performance to PCA when the processed data are outlier-free and (ii) maintains sturdy resistance against outliers when the processed data are corrupted. Thus, L1-PCA is expected to play a significant role in the big-data era, when large datasets are often outlier-corrupted. In this chapter, we present the theoretical foundations of L1-PCA, optimal and state-of-the-art approximate algorithms for its implementation, and some numerical studies that demonstrate its favorable performance.
Panos P. Markopoulos, Sandipan Kundu, Shubham Chamadia, Nicholas Tsagkarakis, Dimitris A. Pados
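For a small number of samples N, the rank-1 L1 principal component, i.e. the unit vector maximizing the L1-norm of the projections, can be found exactly by searching over antipodal sign vectors, a reduction that underlies the optimal algorithms in this line of work. The toy data with one gross outlier and the exhaustive search are assumptions for illustration; practical algorithms avoid the exponential search.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
# D=3 dimensions, N=8 samples, one gross outlier in the last column (toy data)
X = rng.standard_normal((3, 8))
X[:, -1] += 20.0

# Exact rank-1 L1-PCA: maximize ||X^T w||_1 over unit-norm w.  A known
# reduction expresses the optimum as w = X b / ||X b||_2 for the best
# sign vector b in {-1, +1}^N, found here by exhaustive search.
best_val, best_b = -np.inf, None
for signs in itertools.product((-1.0, 1.0), repeat=X.shape[1]):
    b = np.array(signs)
    val = np.linalg.norm(X @ b)
    if val > best_val:
        best_val, best_b = val, b

w_l1 = X @ best_b
w_l1 /= np.linalg.norm(w_l1)

# compare with an ordinary (L2) principal component for reference
U, _, _ = np.linalg.svd(X - X.mean(axis=1, keepdims=True))
w_l2 = U[:, 0]
print("L1 objective at w_l1:", np.abs(X.T @ w_l1).sum())
```

By construction, no unit vector, including the L2 principal component, can achieve a larger L1 objective than w_l1 on this dataset.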

Damage and Fault Detection of Structures Using Principal Component Analysis and Hypothesis Testing

This chapter illustrates the application of principal component analysis (PCA) plus statistical hypothesis testing to online damage detection in structures, and to fault detection of an advanced wind turbine benchmark under actuator (pitch and torque) and sensor (pitch angle measurement) faults. A baseline pattern or PCA model is created for the healthy state of the structure using data from sensors. Subsequently, when the structure is inspected or supervised, new measurements are obtained and projected into the baseline PCA model. When the two sets of data are compared, both univariate and multivariate statistical hypothesis tests are used to make a decision. In this work, both experimental results (with a small aluminum plate) and numerical simulations (with a well-known benchmark wind turbine) show that the proposed technique is a valuable tool to detect structural changes or faults.
Francesc Pozo, Yolanda Vidal
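A minimal sketch of the baseline-model workflow: fit PCA to healthy measurements, then flag new samples whose Hotelling T² (model subspace) or SPE/Q (residual subspace) statistic exceeds an empirical control limit. The toy sensor data, the shift used as "damage", and the 99% empirical limits are assumptions; the chapter instead applies formal univariate and multivariate hypothesis tests to real structural and wind-turbine data.

```python
import numpy as np

rng = np.random.default_rng(0)
# healthy baseline: 300 samples of 6 correlated "sensor" channels (toy data)
A = rng.standard_normal((6, 6))
healthy = rng.standard_normal((300, 6)) @ A

# baseline PCA model built from the healthy state
mu, sd = healthy.mean(axis=0), healthy.std(axis=0)
_, s, Vt = np.linalg.svd((healthy - mu) / sd, full_matrices=False)
k = 3
P = Vt[:k].T                              # retained loadings
var = (s[:k] ** 2) / (len(healthy) - 1)   # variance of each retained score

def t2_and_spe(samples):
    """Hotelling T^2 (model subspace) and SPE/Q (residual subspace)."""
    z = (samples - mu) / sd
    scores = z @ P
    t2 = np.sum(scores ** 2 / var, axis=1)
    spe = np.sum((z - scores @ P.T) ** 2, axis=1)
    return t2, spe

# project new measurements into the baseline model and test them
new_ok = rng.standard_normal((100, 6)) @ A
new_damaged = new_ok + np.array([0.0, 0.0, 5.0, 0.0, 0.0, 0.0])  # sensor shift

t2_h, spe_h = t2_and_spe(healthy)
limit_t2, limit_spe = np.quantile(t2_h, 0.99), np.quantile(spe_h, 0.99)
t2_d, spe_d = t2_and_spe(new_damaged)
print("damaged samples flagged:", np.mean((t2_d > limit_t2) | (spe_d > limit_spe)))
```

The two statistics are complementary: T² reacts to unusual behavior inside the subspace learned from the healthy state, while SPE/Q reacts to departures from that subspace.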

Principal Component Analysis for Exponential Family Data

This chapter reviews exponential family principal component analysis (ePCA), a family of statistical methods for dimension reduction of large-scale data that are not real-valued, such as user ratings for items in e-commerce, categorical/count genetic data in bioinformatics, and digital images in computer vision. The ePCA framework extends the applications of traditional PCA to modern data containing various data types. A sparse version of ePCA further helps overcome the model inconsistency and improve interpretability when applied to high-dimensional data. Model formulations and solution strategies of ePCA and sparse ePCA are discussed with real-world applications.
Meng Lu, Kai He, Jianhua Z. Huang, Xiaoning Qian
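For Poisson-distributed counts, the ePCA idea replaces PCA's squared-error objective with the exponential-family negative log-likelihood over a low-rank natural-parameter matrix. The sketch below fits such a model by plain alternating gradient descent on toy data; the generative model, step size, and solver are assumptions for illustration, not the chapter's formulations or algorithms.

```python
import numpy as np

rng = np.random.default_rng(0)
# toy count data (e.g. ratings or genomic counts): Poisson with a low-rank
# natural-parameter matrix Theta = U V^T (assumed generative model)
n, d, k = 100, 20, 2
U_true = 0.5 * rng.standard_normal((n, k))
V_true = 0.5 * rng.standard_normal((d, k))
X = rng.poisson(np.exp(U_true @ V_true.T))

# Poisson ePCA: minimize the negative log-likelihood (up to constants)
#   sum_ij exp(Theta_ij) - X_ij * Theta_ij,   with Theta = U V^T,
# here by simple alternating gradient descent
U = 0.1 * rng.standard_normal((n, k))
V = 0.1 * rng.standard_normal((d, k))
lr = 1e-3
for _ in range(2000):
    G = np.exp(U @ V.T) - X               # gradient of the loss w.r.t. Theta
    U, V = U - lr * G @ V, V - lr * G.T @ U

nll = np.sum(np.exp(U @ V.T) - X * (U @ V.T))
print("final negative log-likelihood (up to constants):", round(nll, 2))
```

Swapping the Poisson likelihood for a Gaussian one recovers ordinary PCA, which is the sense in which ePCA extends PCA to non-real-valued data.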

Application and Extension of PCA Concepts to Blind Unmixing of Hyperspectral Data with Intra-class Variability

The most standard blind source separation (BSS) methods address the situation in which a set of signals is available, e.g. from measurements, and all of them are linear memoryless combinations, with unknown coefficient values, of the same limited set of unknown source signals. BSS methods aim at estimating these unknown source signals and/or coefficients. This generic problem is faced, e.g., in the field of Earth observation (where it is also called “unsupervised unmixing”), when considering the commonly used (over)simplified model of hyperspectral images. Each pixel of such an image has an associated reflectance spectrum derived from measurements, which is defined by the fraction of sunlight power reflected by the corresponding Earth surface at each wavelength. Each source signal is then the single reflectance spectrum associated with one of the classes of pure materials present in the region of Earth covered by the overall considered hyperspectral image. In addition, the associated coefficients define the surfaces on Earth covered with each of these pure materials in each sub-region corresponding to one pixel of the considered image. However, real hyperspectral data, e.g. obtained in urban areas, have a much more complex structure than the above basic model: each class of pure materials (e.g. roof tiles, trees or asphalt) has so-called spectral or intra-class variability, i.e. it yields a somewhat different spectral component in each pixel of the image. In this complex framework, this chapter shows that Principal Component Analysis (PCA) and its proposed extension are of high interest at three stages of our investigation. First, PCA allows us to analyze the above-mentioned spectral variability of real high-dimensional hyperspectral data and to propose an extended data model suited to these complex data.
We then develop a versatile extension of BSS methods based on Nonnegative Matrix Factorization, which adds the capability to handle arbitrary forms of intra-class variability by transposing PCA concepts to this original version of the BSS framework. Finally, PCA again proves to be very well suited to analyzing the high-dimensional data obtained as the result of the proposed BSS method.
Yannick Deville, Charlotte Revel, Véronique Achard, Xavier Briottet
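The NMF-based unmixing step can be sketched on a toy linear scene without intra-class variability: factor the pixel-by-band matrix into nonnegative abundances and spectra. The synthetic data and the scikit-learn solver settings are assumptions for illustration; the chapter's method extends NMF well beyond this basic model.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
# toy hyperspectral scene: 3 pure-material spectra over 50 bands, mixed
# linearly in 400 pixels with nonnegative abundances (assumed data; real
# scenes add the intra-class variability the chapter addresses)
spectra = rng.random((3, 50))
abund = rng.dirichlet(np.ones(3), size=400)
pixels = abund @ spectra + 0.005 * rng.standard_normal((400, 50))
pixels = np.clip(pixels, 0, None)          # NMF requires nonnegative input

# blind unmixing: factor pixels ~ W H, with W the abundances, H the spectra
nmf = NMF(n_components=3, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(pixels)
H = nmf.components_
err = np.linalg.norm(pixels - W @ H) / np.linalg.norm(pixels)
print("relative reconstruction error:", round(err, 4))
```

Under intra-class variability each material no longer has a single spectrum per scene, which breaks this fixed-factor model and motivates the extended factorization the chapter develops.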