Journal of Process Control

Volume 39, March 2016, Pages 88-99
Related and independent variable fault detection based on KPCA and SVDD

https://doi.org/10.1016/j.jprocont.2016.01.001

Highlights

  • A new related and independent variable monitoring algorithm is proposed.

  • An independent variable division strategy based on mutual information is presented.

  • Fault detection in the independent variable space and the related variable space is conducted by SVDD and KPCA, respectively.

  • KPCA–SVDD has better monitoring performance than traditional methods.

Abstract

This paper proposes a new independent and related variable monitoring algorithm based on kernel principal component analysis (KPCA) and support vector data description (SVDD). Some process variables are independent of the others, so independent and related variables should be monitored separately. First, an independent variable division strategy based on mutual information is presented. Second, SVDD and KPCA are adopted to monitor the independent variable space and the related variable space, respectively. Finally, a general statistic is built from the monitoring results of SVDD and KPCA. The proposed KPCA–SVDD method accounts for the related and independent characteristics of variables, combining the advantages of KPCA in handling nonlinearly related variables with those of SVDD in handling independent variables. A numerical system and the Tennessee Eastman process are used to examine the efficiency of the proposed method. Simulation results demonstrate the superiority of the KPCA–SVDD method.

Introduction

As the scale of modern industries grows, the monitoring of industrial processes plays an increasingly important role in ensuring production safety and product quality. High demands for system reliability and stability call for efficient algorithms for fault detection and diagnosis. Large amounts of data are collected and stored in modern plants owing to the deployment of distributed control systems and advances in computing, and multivariate statistical process monitoring (MSPM) has developed rapidly under these circumstances [1], [2], [3], [4], [5]. MSPM seeks a low-dimensional representation of the original data and constructs statistics to monitor the whole system. Of all the data-based process monitoring methods, principal component analysis (PCA) has been the most successfully applied [6], [7]. PCA projects the data matrix onto an uncorrelated subspace, which achieves dimensionality reduction while retaining important information. However, real industrial data exhibit complex characteristics, such as nonlinearity, dynamics, and non-Gaussianity, which weaken PCA performance. Modified PCA algorithms have been proposed to address these problems [8], [9], [10].
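To make the PCA monitoring idea concrete, the following minimal sketch builds a PCA model on simulated normal data and computes the two standard monitoring statistics: Hotelling's T2 in the score space and SPE in the residual space. The simulated data, the number of retained components, and the fault magnitude are illustrative assumptions, not settings from this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated training data: 3 latent factors driving 6 correlated variables.
T = rng.normal(size=(200, 3))
P = rng.normal(size=(3, 6))
X = T @ P + 0.05 * rng.normal(size=(200, 6))

# Standardize, then extract principal components via SVD.
mu, sd = X.mean(0), X.std(0)
Xs = (X - mu) / sd
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
k = 3                               # retained components (assumed known here)
P_k = Vt[:k].T                      # loading matrix (6 x k)
lam = (S[:k] ** 2) / (len(X) - 1)   # variances of the retained components

def t2_spe(x):
    """Hotelling T2 (score space) and SPE (residual space) for one sample."""
    xs = (x - mu) / sd
    t = xs @ P_k                    # scores
    t2 = np.sum(t ** 2 / lam)       # Mahalanobis distance in score space
    resid = xs - t @ P_k.T          # part of the sample outside the model
    spe = resid @ resid             # squared prediction error
    return t2, spe
```

A control limit (e.g., from an F or chi-squared approximation, or an empirical quantile of the training statistics) would normally be attached to each statistic; a sample whose SPE exceeds its limit violates the correlation structure captured by the model, even if every individual variable stays in range.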

PCA can monitor changes in the value of a single variable, and it can also identify changes in the relationships among variables. Monitoring changes in relationships is more difficult than monitoring variable values. Three major directions are pursued in the current literature to capture the relationships within data.

The first direction assumes a single structure for the whole process (i.e., a global or local structure). Traditional PCA, partial least squares (PLS) [11], [12], and independent component analysis (ICA) [13], [14] are all based on the global model assumption. Kernel PCA (KPCA) [15], [16], kernel PLS [17], and kernel ICA [18], [19] were proposed for nonlinear processes to overcome the restriction of the traditional methods to linear relationships. Ge and Song [20] proposed a combined method, ICA–PCA, to handle Gaussian data that coexist with non-Gaussian data. These global dimensionality reduction methods have been successfully applied in industrial processes; however, they ignore the local structure. Several manifold learning methods, such as isometric feature mapping [21], locally linear embedding [22], Laplacian eigenmaps [23], and locality preserving projections (LPP) [24], [25], are widely used in pattern recognition. Their main idea is to preserve the local neighborhood structure of the dataset. LPP is a linear projection method that optimally preserves geometric structure, and it is often preferred among these methods because its explicit projection is easily obtained. Global structures mainly capture the outer organization of a process, whereas local structures describe its inner shape; either one alone reflects only part of the structure of a dataset.
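As an illustration of the kernel trick that KPCA relies on, the sketch below eigendecomposes a centred RBF kernel matrix instead of the covariance of an explicit nonlinear feature map; the kernel width and component count are arbitrary choices for the example, not values from the paper.

```python
import numpy as np

def kpca_scores(X, n_comp=2, gamma=1.0):
    """Kernel PCA scores via the kernel trick: work entirely with the
    n x n RBF kernel matrix, never forming the nonlinear feature map."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * d2)                 # RBF kernel matrix
    # Double-centre K so the implicit feature map has zero mean.
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    # Top eigenpairs of the centred kernel matrix give the components.
    w, V = np.linalg.eigh(Kc)
    order = np.argsort(w)[::-1][:n_comp]
    w, V = w[order], V[:, order]
    # Scores of the training samples on the leading kernel components.
    return Kc @ V / np.sqrt(np.maximum(w, 1e-12))
```

Monitoring statistics analogous to T2 and SPE can then be built on these kernel scores, which is what makes KPCA suitable for nonlinearly related variables.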

The second direction searches for the relationships within data based on a combination of global and local structures. A global–local structure analysis (GLSA) was proposed by Zhang et al. [26] to exploit the underlying geometrical manifold while keeping global information; GLSA combines the advantages of LPP and PCA for fault detection and identification. Yu [27] proposed a local and global PCA that preserves both local and global information. Tong and Yan [28] proposed a multimanifold method that extracts low-dimensional information through a global-graph maximum and local-graph minimum strategy. These approaches still build a single model in the data space; however, a single model may not be equally sensitive to different abnormal events.

The third direction separates the whole model into several submodels and monitors each submodel. MacGregor et al. [12] employed a multiblock PLS (MBPLS) projection method that builds monitoring charts for each subsection and for the entire process, so that abnormal events are detected early and diagnosed easily. Qin et al. [3] further explored four multiblock PCA and MBPLS algorithms and defined block and variable contributions to T2 and SPE (squared prediction error). A new multiblock PCA algorithm was proposed by Ge and Song [29] that constructs sub-blocks along the different directions of PCA. Lv et al. [30] proposed a multi-subspace multiway PCA with Bayesian inference that divides the loading matrix; this method was applied to batch processes. Huang and Yan [31] proposed a multiblock monitoring method based on variable distribution characteristics, considering that process data contain different distributions. Li and Yang [32] improved the KPCA-based method by incorporating an ensemble learning strategy, which yields a robust width parameter selection. Many multimodel algorithms have thus been proposed, but no single best method has yet been identified.

The relationships among variables in real industrial processes are complicated: some variables are linearly correlated, others are nonlinearly correlated, and some are independent of the rest. The related and independent characteristics of variables have received little attention in existing multiblock or multispace process monitoring studies. This study is the first to take these characteristics as the basis of variable division. In this paper, a new related and independent variable monitoring algorithm based on KPCA and support vector data description (SVDD) [33], [34], [35] is proposed. The traditional KPCA method assumes that the relationships among variables are nonlinear; however, several variables in real industries are independent of the others, and only their values need to be considered in analysis. Such variables should therefore be separated out and monitored independently. First, mutual information (MI) [36], [37] is adopted to determine whether a variable is independent: an MI value close to 0 indicates that two variables are unrelated, and a variable is considered independent if every entry of its MI vector, computed between that variable and each of the other variables, is close to 0. After examining each variable, the independent variables are separated from the related variables. Fault detection of the independent and related variables is then conducted with SVDD [34], [35] and KPCA, respectively.
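The MI-based division step described above can be sketched as follows, using a simple histogram estimator of mutual information. The bin count and the "close to 0" threshold `eps` are illustrative assumptions: the histogram estimator is biased upward for finite samples, so the threshold must sit above that bias, and the paper's own estimator and threshold may differ.

```python
import numpy as np

def mutual_info(x, y, bins=8):
    """Histogram estimate of mutual information (in nats) between
    two 1-D samples x and y."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()                          # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y
    nz = pxy > 0                              # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def split_variables(X, eps=0.1):
    """Flag variable j as independent when its MI with every other
    variable stays below eps (the 'MI vector close to 0' criterion)."""
    n, m = X.shape
    independent = []
    for j in range(m):
        mis = [mutual_info(X[:, j], X[:, k]) for k in range(m) if k != j]
        if max(mis) < eps:
            independent.append(j)
    related = [j for j in range(m) if j not in independent]
    return independent, related
```

Variables flagged as independent would then be routed to the SVDD monitor, and the remaining related variables to the KPCA monitor.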

The rest of this paper is structured as follows. KPCA and SVDD are briefly reviewed in Section 2. KPCA–SVDD fault detection scheme is presented in Section 3. In Section 4, a numerical system and the Tennessee Eastman (TE) [38], [39] process are used to test the efficiency of the proposed method. The conclusion is presented in Section 5.

Section snippets

Preliminary

This section briefly reviews KPCA and SVDD schemes.
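As a rough illustration of the SVDD idea (enclosing the normal data in a minimum-volume hypersphere in kernel feature space), the sketch below replaces the quadratic program of Tax and Duin with a simplified centroid-based ball: the centre is taken as the kernel-space mean and the radius as an empirical quantile of the training distances. This is a stand-in for illustration only, not the full SVDD optimization.

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

class KernelBall:
    """Simplified SVDD stand-in: squared distance to the kernel-space
    centroid, with an empirical-quantile radius as the control limit."""
    def fit(self, X, quantile=0.95):
        self.X = X
        K = rbf(X, X)
        self.k_mean = K.mean(axis=0)     # mean kernel value per sample
        self.K_mm = K.mean()             # grand mean of the kernel matrix
        # ||phi(x_i) - centre||^2, using k(x, x) = 1 for the RBF kernel.
        d2 = 1.0 - 2.0 * self.k_mean + self.K_mm
        self.radius2 = np.quantile(d2, quantile)
        return self

    def distance2(self, Z):
        Kz = rbf(Z, self.X)
        return 1.0 - 2.0 * Kz.mean(axis=1) + self.K_mm

    def is_fault(self, Z):
        return self.distance2(Z) > self.radius2
```

In the full SVDD, slack variables and a penalty parameter let the ball exclude a fraction of the training points, and only the support vectors define the boundary; the quantile above plays a loosely analogous role for this sketch.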

KPCA–SVDD monitoring scheme

This section provides the details of the proposed method.

Case study

In this section, a numerical system and the TE process are employed to test the efficiency of the proposed KPCA–SVDD method.

Conclusion

In this paper, a new related and independent variable monitoring method based on KPCA and SVDD (KPCA–SVDD) is presented. The original variables are divided into independent variables and related variables, which are monitored with SVDD and KPCA, respectively. A numerical system and the well-known TE process are adopted to verify the efficiency of the method. The monitoring performance shows that the KPCA–SVDD method outperforms the PCA, KPCA, and SVDD methods. However, the proposed method

Acknowledgments

The authors gratefully acknowledge the support from the following foundations: 973 Project of China (2013CB733600), National Natural Science Foundation of China (21176073), Program for New Century Excellent Talents in University (NCET-09-0346) and the Fundamental Research Funds for the Central Universities.

References (41)

  • K.L. Hu et al., Multivariate statistical process control based on multiway locality preserving projections, J. Process Control (2008)

  • J.D. Shao et al., Generalized orthogonal locality preserving projections for nonlinear fault detection and diagnosis, Chemom. Intell. Lab. (2009)

  • J.B. Yu, Local and global principal component analysis for process monitoring, J. Process Control (2012)

  • C.D. Tong et al., Statistical process monitoring based on a multi-manifold projection algorithm, Chemom. Intell. Lab. (2014)

  • S.I. Alabi et al., On-line dynamic process monitoring using wavelet-based generic dissimilarity measure, Chem. Eng. Res. Des. (2005)

  • D.M.J. Tax et al., Support vector domain description, Pattern Recogn. Lett. (1999)

  • J.J. Downs et al., A plant-wide industrial process control problem, Comput. Chem. Eng. (1993)

  • P.R. Lyman et al., Plant-wide control of the Tennessee Eastman problem, Comput. Chem. Eng. (1995)

  • Z.Q. Ge et al., Nonlinear process monitoring based on linear subspace and Bayesian inference, J. Process Control (2010)

  • S.J. Qin, Statistical process monitoring: basics and beyond, J. Chemom. (2003)