Related and independent variable fault detection based on KPCA and SVDD
Introduction
The scale of modern industries keeps increasing, and the monitoring of industrial processes plays an important role in ensuring production safety and product quality. High demands for system reliability and stability call for efficient algorithms for fault detection and diagnosis. Large amounts of data are collected and stored in modern industries as a result of the widespread adoption of distributed control systems and advances in computing. Multivariate statistical process monitoring (MSPM) has developed rapidly under these circumstances [1], [2], [3], [4], [5]. MSPM seeks a low-dimensional representation of the original data and constructs statistics to monitor the whole system. Of all the data-based process monitoring methods, principal component analysis (PCA) has been the most successfully applied [6], [7]. PCA projects a linear data matrix onto an uncorrelated subspace, which achieves dimensionality reduction while retaining important information. However, real industrial data exhibit complex characteristics, such as nonlinearity, dynamics, and non-Gaussianity, which degrade PCA performance. Several modified PCA algorithms have been proposed to address these problems [8], [9], [10].
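To make the PCA baseline concrete, the following is a minimal sketch of PCA-based monitoring with the two standard statistics, Hotelling's T² and SPE. The function name and the omission of control limits (normally derived from F and chi-squared distributions) are illustrative choices, not part of the original paper.

```python
import numpy as np

def pca_monitor(X_train, X_test, n_components=2):
    """Fit PCA on normal operating data, then compute T^2 and SPE
    for new samples. Control-limit estimation is omitted here."""
    # Standardize test data with training statistics
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
    Xs = (X_train - mu) / sigma
    # Principal directions from the covariance eigendecomposition
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xs, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    lam = eigvals[order][:n_components]        # retained variances
    P = eigvecs[:, order][:, :n_components]    # loading matrix
    # Monitoring statistics on test data
    Xt = (X_test - mu) / sigma
    T = Xt @ P                                 # scores in the PC subspace
    T2 = np.sum(T ** 2 / lam, axis=1)          # Hotelling's T^2
    resid = Xt - T @ P.T                       # residual subspace part
    SPE = np.sum(resid ** 2, axis=1)           # squared prediction error
    return T2, SPE
```

A sample is flagged when T² or SPE exceeds its control limit; T² captures variation inside the principal subspace, SPE captures variation the model cannot explain.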
PCA can monitor changes in the value of a single variable, and it can also identify changes in the relationships among variables. Monitoring changes in relationships is more difficult than monitoring variable values. Three major directions are pursued in current research to characterize the relationships within data.
The first direction assumes a single structure for the whole process (i.e., a global or a local structure). Traditional PCA, partial least squares (PLS) [11], [12], and independent component analysis (ICA) [13], [14] are all based on the global model assumption. Kernel PCA (KPCA) [15], [16], kernel PLS [17], and kernel ICA [18], [19] were proposed for nonlinear processes to overcome the restriction of the traditional methods to linear relationships. Ge and Song [20] proposed a combined method called ICA–PCA to handle processes in which Gaussian and non-Gaussian data coexist. The aforementioned global dimensionality reduction methods have been successfully applied in industrial processes. However, these algorithms ignore the local structure. Several manifold learning methods, such as isometric feature mapping [21], locally linear embedding [22], Laplacian eigenmaps [23], and locality preserving projections (LPP) [24], [25], are currently used in pattern recognition. The main idea of these methods is to capture the local neighborhood structure of the dataset. LPP is a linear projection method that optimally preserves the geometric structure; among these methods, it is often preferred because its projection has an explicit form that is easily obtained. Global structures mainly capture the outer organization of a process, whereas local structures consider the inner shape. Either a global or a local structure alone therefore describes only part of the structure of a dataset.
The second direction searches for relationships in the data based on a combination of global and local structures. A global–local structure analysis (GLSA) was proposed by Zhang et al. [26] to exploit the underlying geometric manifold while keeping global information; GLSA combines the advantages of LPP and PCA for fault detection and identification. Yu [27] proposed a local and global PCA that preserves both local and global information. Tong and Yan [28] proposed a multimanifold method that extracts low-dimensional information through a global-graph maximization and local-graph minimization strategy. These global-and-local approaches still build only a single model in the data space. However, a single model may not be well interpreted; that is, one model may not be sensitive to different abnormal events.
The third direction separates the whole model into several submodels and monitors each submodel. MacGregor et al. [12] employed a multiblock PLS (MBPLS) projection method that builds monitoring charts for each subsection and for the entire process, so that an abnormal event can be detected early and diagnosed easily. Qin et al. [3] further explored four multiblock PCA and MBPLS algorithms and defined block and variable contributions to T2 and SPE (squared prediction error). A new multiblock PCA algorithm was proposed by Ge and Song [29] that constructs sub-blocks along the different directions of PCA. Lv et al. [30] proposed a multi-subspace multiway PCA with Bayesian inference that divides the loading matrix; the method was applied to a batch process. Huang and Yan [31] proposed a multiblock monitoring method based on variable distribution characteristics, accounting for process data that follow different distributions. Li and Yang [32] improved the KPCA-based method by incorporating an ensemble learning strategy, which yields a robust width-parameter selection. Many multimodel algorithms have thus been proposed, but no single best method has yet been identified.
The relationships among variables in real industrial processes are complicated. Some variables are linearly correlated, others are nonlinearly correlated, and some are independent of the rest. These related and independent characteristics of variables have received little attention in existing multiblock or multispace process monitoring studies. This study is the first to use the related and independent characteristics as the basis for variable division. In this paper, a new related and independent variable monitoring algorithm based on KPCA and support vector data description (SVDD) [33], [34], [35] is proposed. The traditional KPCA method assumes that the relationships among the variables are nonlinear. However, several variables are independent of the others in real industries; only their values need to be considered when analyzing them. Therefore, this kind of variable should be separated and monitored independently. First, mutual information (MI) [36], [37] is adopted to determine whether a variable is independent. An MI value close to 0 indicates that the two variables are unrelated. A variable is considered independent if every entry of its MI vector, formed from the MI between that variable and each of the other variables, is close to 0. After each variable is examined, the independent variables are separated from the related variables. Fault detection for the independent and the related variables is then conducted with SVDD [34], [35] and KPCA, respectively.
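The MI-based variable division described above can be sketched as follows. This is an illustrative implementation using scikit-learn's k-nearest-neighbor MI estimator; the threshold value is an assumed cut-off for "close to 0", not a figure taken from the original paper.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def split_variables(X, threshold=0.05):
    """Partition the columns of X into independent and related groups.

    For each variable j, the MI vector against all other variables is
    estimated; if every entry falls below `threshold` (an assumed
    cut-off), variable j is treated as independent of the rest.
    """
    n_vars = X.shape[1]
    independent, related = [], []
    for j in range(n_vars):
        others = np.delete(np.arange(n_vars), j)
        mi = mutual_info_regression(X[:, others], X[:, j], random_state=0)
        (independent if np.all(mi < threshold) else related).append(j)
    return independent, related
```

The independent columns would then be passed to the SVDD model and the related columns to KPCA, as the paper proposes.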
The rest of this paper is structured as follows. KPCA and SVDD are briefly reviewed in Section 2. The KPCA–SVDD fault detection scheme is presented in Section 3. In Section 4, a numerical system and the Tennessee Eastman (TE) [38], [39] process are used to test the efficiency of the proposed method. Conclusions are presented in Section 5.
Preliminary
This section briefly reviews KPCA and SVDD schemes.
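As a quick illustration of the two building blocks, the sketch below fits KPCA and an SVDD-style boundary on synthetic normal data. It relies on the fact that SVDD with a Gaussian kernel coincides with the one-class SVM, which scikit-learn provides; the data and parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_normal = rng.normal(size=(200, 4))  # stand-in for normal operating data

# KPCA: nonlinear feature extraction via principal components
# of an RBF (Gaussian) kernel feature space
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=0.5)
scores = kpca.fit_transform(X_normal)

# SVDD analogue: with a Gaussian kernel, SVDD is equivalent to the
# one-class SVM; nu upper-bounds the fraction of training outliers
svdd = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.05).fit(X_normal)
flags = svdd.predict(X_normal)  # +1 inside the description, -1 outside
```

In monitoring, the KPCA scores feed a T²-type statistic, while the SVDD boundary directly classifies a new sample as normal or faulty.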
KPCA–SVDD monitoring scheme
This section provides the details of the proposed method.
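A minimal end-to-end sketch of the scheme, under stated assumptions, looks as follows: the related block is monitored with a KPCA T²-like statistic (scaled by training score variances, with a 99% empirical quantile as the limit), and the independent block with a Gaussian-kernel one-class SVM standing in for SVDD. Function names, the quantile-based limit, and the alarm logic are illustrative choices, not the paper's exact formulation.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.svm import OneClassSVM

def fit_kpca_svdd(X_train, related, independent, n_components=2):
    """Fit both sub-models on normal data; `related`/`independent` are
    column index lists, e.g. obtained from an MI-based split."""
    kpca = KernelPCA(n_components=n_components, kernel="rbf").fit(X_train[:, related])
    scores = kpca.transform(X_train[:, related])
    lam = scores.var(axis=0)                    # score variances scale T^2
    t2_limit = np.quantile(np.sum(scores ** 2 / lam, axis=1), 0.99)
    svdd = OneClassSVM(kernel="rbf", nu=0.05).fit(X_train[:, independent])
    return kpca, lam, t2_limit, svdd

def is_faulty(x, related, independent, kpca, lam, t2_limit, svdd):
    """Alarm if the KPCA T^2 limit is exceeded on the related block
    or the SVDD boundary rejects the independent block."""
    t2 = float(np.sum(kpca.transform(x[related][None, :]) ** 2 / lam))
    outside = svdd.predict(x[independent][None, :])[0] == -1
    return t2 > t2_limit or outside
```

Either sub-model raising an alarm flags the sample, so faults confined to the independent variables are not diluted inside the kernel model of the related variables.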
Case study
In this section, a numerical system and TE are employed to test the efficiency of the proposed KPCA–SVDD method.
Conclusion
In this paper, a new related and independent variable monitoring method based on KPCA and SVDD (KPCA–SVDD) is presented. The original variables are divided into independent variables and related variables, and the two parts are monitored with SVDD and KPCA, respectively. A numerical system and the well-known TE process are adopted to test and verify the efficiency of this method. The monitoring performance shows that the KPCA–SVDD method outperforms the PCA, KPCA, and SVDD methods. However, the proposed method
Acknowledgments
The authors gratefully acknowledge the support from the following foundations: 973 Project of China (2013CB733600), National Natural Science Foundation of China (21176073), Program for New Century Excellent Talents in University (NCET-09-0346) and the Fundamental Research Funds for the Central Universities.
References (41)
Data-driven design of monitoring and diagnosis systems for dynamic processes: a review of subspace technique based schemes and some recent results, J. Process Control (2014)
Subspace method aided data-driven design of fault detection and isolation systems, J. Process Control (2009)
A new multivariate statistical process monitoring method using principal component analysis, Comput. Chem. Eng. (2001)
Disturbance detection and isolation by dynamic principal component analysis, Chemom. Intell. Lab. Syst. (1995)
Statistical process monitoring with independent component analysis, J. Process Control (2004)
Independent component analysis: algorithms and applications, Neural Netw. (2000)
Nonlinear process monitoring using kernel principal component analysis, Chem. Eng. Sci. (2004)
Improved kernel PCA-based monitoring approach for nonlinear processes, Chem. Eng. Sci. (2009)
Process data modeling using modified kernel partial least squares, Chem. Eng. Sci. (2010)
Enhanced statistical analysis of nonlinear processes using KPCA, KICA and SVM, Chem. Eng. Sci. (2009)
Multivariate statistical process control based on multiway locality preserving projections, J. Process Control
Generalized orthogonal locality preserving projections for nonlinear fault detection and diagnosis, Chemom. Intell. Lab. Syst.
Local and global principal component analysis for process monitoring, J. Process Control
Statistical process monitoring based on a multi-manifold projection algorithm, Chemom. Intell. Lab. Syst.
On-line dynamic process monitoring using wavelet-based generic dissimilarity measure, Chem. Eng. Res. Des.
Support vector domain description, Pattern Recogn. Lett.
A plant-wide industrial process control problem, Comput. Chem. Eng.
Plant-wide control of the Tennessee Eastman problem, Comput. Chem. Eng.
Nonlinear process monitoring based on linear subspace and Bayesian inference, J. Process Control
Statistical process monitoring: basics and beyond, J. Chemom.