Journal of Process Control

Volume 39, March 2016, Pages 88-99
Related and independent variable fault detection based on KPCA and SVDD

https://doi.org/10.1016/j.jprocont.2016.01.001

Highlights

  • A new related and independent variable monitoring algorithm is proposed.

  • An independent variable division strategy based on mutual information is presented.

  • Fault detection in the independent variable space and the related variable space is conducted by SVDD and KPCA, respectively.

  • KPCA–SVDD has better monitoring performance than traditional methods.

Abstract

This paper proposes a new independent and related variable monitoring algorithm based on kernel principal component analysis (KPCA) and support vector data description (SVDD). Some process variables are independent of the others, so independent and related variables should be monitored separately. First, an independent variable division strategy based on mutual information is presented. Second, SVDD and KPCA are adopted to monitor the independent variable space and the related variable space, respectively. Finally, a general statistic is built from the monitoring results of SVDD and KPCA. The proposed KPCA–SVDD method accounts for the related and independent characteristics of variables, combining the advantages of KPCA in handling nonlinearly related variables with those of SVDD in handling independent variables. A numerical system and the Tennessee Eastman process are used to examine the efficiency of the proposed method. Simulation results demonstrate the superiority of the KPCA–SVDD method.

Introduction

As the scale of modern industries grows, the monitoring of industrial processes plays an increasingly important role in ensuring production safety and product quality. High demands for system reliability and stability call for efficient algorithms for fault detection and diagnosis. Large amounts of data are collected and stored in modern plants owing to the deployment of distributed control systems and advances in computing, and multivariate statistical process monitoring (MSPM) has developed rapidly under these circumstances [1], [2], [3], [4], [5]. MSPM seeks a low-dimensional representation of the original data and constructs statistics to monitor the whole system. Of all the data-based process monitoring methods, principal component analysis (PCA) has been the most successfully applied [6], [7]. PCA projects the data matrix onto an uncorrelated subspace, which achieves dimensionality reduction while retaining important information. However, real industrial data exhibit complex characteristics, such as nonlinearity, dynamics, and non-Gaussianity, which weaken PCA performance. Modified PCA algorithms have been proposed to address these problems [8], [9], [10].
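To make the PCA monitoring idea concrete, the following minimal sketch builds a PCA model on simulated normal data and computes the two standard monitoring statistics: Hotelling's T2 in the score space and SPE in the residual space. The simulated data, the number of retained components, and the fault magnitude are illustrative assumptions, not settings from this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated training data: 3 latent factors driving 6 correlated variables.
T = rng.normal(size=(200, 3))
P = rng.normal(size=(3, 6))
X = T @ P + 0.05 * rng.normal(size=(200, 6))

# Standardize, then extract principal components via SVD.
mu, sd = X.mean(0), X.std(0)
Xs = (X - mu) / sd
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
k = 3                               # retained components (assumed known here)
P_k = Vt[:k].T                      # loading matrix (6 x k)
lam = (S[:k] ** 2) / (len(X) - 1)   # variances of the retained components

def t2_spe(x):
    """Hotelling T2 (score space) and SPE (residual space) for one sample."""
    xs = (x - mu) / sd
    t = xs @ P_k                    # scores
    t2 = np.sum(t ** 2 / lam)       # Mahalanobis distance in score space
    resid = xs - t @ P_k.T          # part of the sample outside the model
    spe = resid @ resid             # squared prediction error
    return t2, spe
```

A control limit (e.g., from an F or chi-squared approximation, or an empirical quantile of the training statistics) would normally be attached to each statistic; a sample whose SPE exceeds its limit violates the correlation structure captured by the model, even if every individual variable stays in range.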

PCA can monitor changes in the value of a single variable, and it can also identify changes in the relationships among variables. Monitoring changes in relationships is more difficult than monitoring variable values. Three major directions are pursued in the current literature to capture the relationships within data.

The first direction assumes a single structure for the whole process (i.e., a global or local structure). Traditional PCA, partial least squares (PLS) [11], [12], and independent component analysis (ICA) [13], [14] are all based on the global model assumption. Kernel PCA (KPCA) [15], [16], kernel PLS [17], and kernel ICA [18], [19] were proposed for nonlinear processes to overcome the restriction of the traditional methods to linear relationships. Ge and Song [20] proposed a combined method, ICA–PCA, to handle Gaussian data that coexist with non-Gaussian data. These global dimensionality reduction methods have been successfully applied in industrial processes; however, they ignore the local structure. Several manifold learning methods, such as isometric feature mapping [21], locally linear embedding [22], Laplacian eigenmaps [23], and locality preserving projections (LPP) [24], [25], are widely used in pattern recognition. Their main idea is to preserve the local neighborhood structure of the dataset. LPP is a linear projection method that optimally preserves geometric structure, and it is often preferred among these methods because its explicit projection is easily obtained. Global structures mainly capture the outer organization of a process, whereas local structures describe its inner shape; either one alone reflects only part of the structure of a dataset.
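As an illustration of the kernel trick that KPCA relies on, the sketch below eigendecomposes a centred RBF kernel matrix instead of the covariance of an explicit nonlinear feature map; the kernel width and component count are arbitrary choices for the example, not values from the paper.

```python
import numpy as np

def kpca_scores(X, n_comp=2, gamma=1.0):
    """Kernel PCA scores via the kernel trick: work entirely with the
    n x n RBF kernel matrix, never forming the nonlinear feature map."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * d2)                 # RBF kernel matrix
    # Double-centre K so the implicit feature map has zero mean.
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    # Top eigenpairs of the centred kernel matrix give the components.
    w, V = np.linalg.eigh(Kc)
    order = np.argsort(w)[::-1][:n_comp]
    w, V = w[order], V[:, order]
    # Scores of the training samples on the leading kernel components.
    return Kc @ V / np.sqrt(np.maximum(w, 1e-12))
```

Monitoring statistics analogous to T2 and SPE can then be built on these kernel scores, which is what makes KPCA suitable for nonlinearly related variables.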

The second direction searches for the relationships within data based on a combination of global and local structures. A global–local structure analysis (GLSA) was proposed by Zhang et al. [26] to exploit the underlying geometrical manifold while keeping global information; GLSA combines the advantages of LPP and PCA for fault detection and identification. Yu [27] proposed a local and global PCA that preserves both local and global information. Tong and Yan [28] proposed a multimanifold method that extracts low-dimensional information through a global-graph maximum and local-graph minimum strategy. These approaches still build a single model in the data space; however, a single model may not be equally sensitive to different abnormal events.

The third direction separates the whole model into several submodels and monitors each submodel. MacGregor et al. [12] employed a multiblock PLS (MBPLS) projection method that builds monitoring charts for each subsection and for the entire process, so that abnormal events are detected early and diagnosed easily. Qin et al. [3] further explored four multiblock PCA and MBPLS algorithms and defined block and variable contributions to T2 and SPE (squared prediction error). A new multiblock PCA algorithm was proposed by Ge and Song [29] that constructs sub-blocks along the different directions of PCA. Lv et al. [30] proposed a multi-subspace multiway PCA with Bayesian inference that divides the loading matrix; this method was applied to batch processes. Huang and Yan [31] proposed a multiblock monitoring method based on variable distribution characteristics, considering that process data contain different distributions. Li and Yang [32] improved the KPCA-based method by incorporating an ensemble learning strategy, which yields a robust width parameter selection. Many multimodel algorithms have thus been proposed, but no single best method has yet been identified.

The relationships among variables in real industrial processes are complicated: some variables are linearly correlated, others are nonlinearly correlated, and some are independent of the rest. The related and independent characteristics of variables have received little attention in existing multiblock or multispace process monitoring studies. This study is the first to take these characteristics as the basis of variable division. In this paper, a new related and independent variable monitoring algorithm based on KPCA and support vector data description (SVDD) [33], [34], [35] is proposed. The traditional KPCA method assumes that the relationships among variables are nonlinear; however, several variables in real industries are independent of the others, and only their values need to be considered in analysis. Such variables should therefore be separated out and monitored independently. First, mutual information (MI) [36], [37] is adopted to determine whether a variable is independent: an MI value close to 0 indicates that two variables are unrelated, and a variable is considered independent if every entry of its MI vector, computed between that variable and each of the other variables, is close to 0. After examining each variable, the independent variables are separated from the related variables. Fault detection of the independent and related variables is then conducted with SVDD [34], [35] and KPCA, respectively.
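The MI-based division step described above can be sketched as follows, using a simple histogram estimator of mutual information. The bin count and the "close to 0" threshold `eps` are illustrative assumptions: the histogram estimator is biased upward for finite samples, so the threshold must sit above that bias, and the paper's own estimator and threshold may differ.

```python
import numpy as np

def mutual_info(x, y, bins=8):
    """Histogram estimate of mutual information (in nats) between
    two 1-D samples x and y."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()                          # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y
    nz = pxy > 0                              # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def split_variables(X, eps=0.1):
    """Flag variable j as independent when its MI with every other
    variable stays below eps (the 'MI vector close to 0' criterion)."""
    n, m = X.shape
    independent = []
    for j in range(m):
        mis = [mutual_info(X[:, j], X[:, k]) for k in range(m) if k != j]
        if max(mis) < eps:
            independent.append(j)
    related = [j for j in range(m) if j not in independent]
    return independent, related
```

Variables flagged as independent would then be routed to the SVDD monitor, and the remaining related variables to the KPCA monitor.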

The rest of this paper is structured as follows. KPCA and SVDD are briefly reviewed in Section 2. KPCA–SVDD fault detection scheme is presented in Section 3. In Section 4, a numerical system and the Tennessee Eastman (TE) [38], [39] process are used to test the efficiency of the proposed method. The conclusion is presented in Section 5.

Section snippets

Preliminary

This section briefly reviews KPCA and SVDD schemes.
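As a rough illustration of the SVDD idea (enclosing the normal data in a minimum-volume hypersphere in kernel feature space), the sketch below replaces the quadratic program of Tax and Duin with a simplified centroid-based ball: the centre is taken as the kernel-space mean and the radius as an empirical quantile of the training distances. This is a stand-in for illustration only, not the full SVDD optimization.

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

class KernelBall:
    """Simplified SVDD stand-in: squared distance to the kernel-space
    centroid, with an empirical-quantile radius as the control limit."""
    def fit(self, X, quantile=0.95):
        self.X = X
        K = rbf(X, X)
        self.k_mean = K.mean(axis=0)     # mean kernel value per sample
        self.K_mm = K.mean()             # grand mean of the kernel matrix
        # ||phi(x_i) - centre||^2, using k(x, x) = 1 for the RBF kernel.
        d2 = 1.0 - 2.0 * self.k_mean + self.K_mm
        self.radius2 = np.quantile(d2, quantile)
        return self

    def distance2(self, Z):
        Kz = rbf(Z, self.X)
        return 1.0 - 2.0 * Kz.mean(axis=1) + self.K_mm

    def is_fault(self, Z):
        return self.distance2(Z) > self.radius2
```

In the full SVDD, slack variables and a penalty parameter let the ball exclude a fraction of the training points, and only the support vectors define the boundary; the quantile above plays a loosely analogous role for this sketch.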

KPCA–SVDD monitoring scheme

This section provides the details of the proposed method.

Case study

In this section, a numerical system and the TE process are employed to test the efficiency of the proposed KPCA–SVDD method.

Conclusion

In this paper, a new related and independent variable monitoring method based on KPCA and SVDD (KPCA–SVDD) is presented. The original variables are divided into independent variables and related variables, which are monitored with SVDD and KPCA, respectively. A numerical system and the well-known TE process are adopted to verify the efficiency of the method. The monitoring performance shows that the KPCA–SVDD method outperforms the PCA, KPCA, and SVDD methods. However, the proposed method

Acknowledgments

The authors gratefully acknowledge the support from the following foundations: 973 Project of China (2013CB733600), National Natural Science Foundation of China (21176073), Program for New Century Excellent Talents in University (NCET-09-0346) and the Fundamental Research Funds for the Central Universities.

References (41)

  • K.L. Hu et al., Multivariate statistical process control based on multiway locality preserving projections, J. Process Control (2008)

  • J.D. Shao et al., Generalized orthogonal locality preserving projections for nonlinear fault detection and diagnosis, Chemom. Intell. Lab. (2009)

  • J.B. Yu, Local and global principal component analysis for process monitoring, J. Process Control (2012)

  • C.D. Tong et al., Statistical process monitoring based on a multi-manifold projection algorithm, Chemom. Intell. Lab. (2014)

  • S.I. Alabi et al., On-line dynamic process monitoring using wavelet-based generic dissimilarity measure, Chem. Eng. Res. Des. (2005)

  • D.M.J. Tax et al., Support vector domain description, Pattern Recogn. Lett. (1999)

  • J.J. Downs et al., A plant-wide industrial process control problem, Comput. Chem. Eng. (1993)

  • P.R. Lyman et al., Plant-wide control of the Tennessee Eastman problem, Comput. Chem. Eng. (1995)

  • Z.Q. Ge et al., Nonlinear process monitoring based on linear subspace and Bayesian inference, J. Process Control (2010)

  • S.J. Qin, Statistical process monitoring: basics and beyond, J. Chemom. (2003)