An improved weighted recursive PCA algorithm for adaptive fault detection

https://doi.org/10.1016/j.conengprac.2016.02.010

Highlights

  • A novel weighted recursive PCA-based fault detection technique is developed.

  • The false alarm rate in process monitoring due to time-drifting changes is reduced.

  • Performance of two existing algorithms is compared to the proposed technique.

  • The computational complexity (FLOPs required for update) is significantly reduced.

Abstract

A novel weighted adaptive recursive fault detection technique based on Principal Component Analysis (PCA) is proposed to address the increase in false alarm rate in process monitoring schemes caused by the natural, slow and normal process changes (aging) that often occur in real processes. The technique is named weighted adaptive recursive PCA (WARP).

The aforementioned problem is addressed by recursively updating the eigenstructure (eigenvalues and eigenvectors) of the statistical detection model when the false alarm rate increases while the process is known to be in a non-faulty condition. The update incorporates the newly available information within a specific online process dataset, instead of keeping a fixed statistical model as conventional PCA does. To achieve this recursive updating, equations for the means, standard deviations, covariance matrix, eigenvalues and eigenvectors are developed. The statistical thresholds and the number of principal components are updated as well.

The proposed algorithm is compared with other recursive PCA-based algorithms in terms of false alarm rate, misdetection rate, detection delay and computational complexity. WARP features a significant reduction in computational complexity while maintaining a false alarm rate, misdetection rate and detection delay similar to those of the other existing PCA-based recursive algorithms. The computational complexity is assessed in terms of the floating-point operations (FLOPs) needed to carry out the update.

Introduction

PCA is a dimensionality reduction technique used for fault detection that optimally captures the maximum variance of the data in a low-dimensional space, and it has been widely used in process monitoring (McGregor & Kourti, 1995; Iwashita, 1997; Russell et al., 2000; Qin, 2003; Choi & Lee, 2004; Amanian, Salahshoor, Jafari, & Mosallaei, 2007; Jeng, 2010). The large number of observations gathered from sensors and actuators is condensed into a couple of meaningful measures, such as the T² and Q statistics (Iwashita, 1997; Chen, Kruger, Meronk, & Leung, 2004; Chiu & Ling, 2009). Its ultimate advantage is that monitoring can be performed as with univariate charts, by comparing a calculated measure with a statistical threshold, both arranged in a single plot (Chow, Tan, Tabe, Zhang, & Thornhill, 1999).

Among the drawbacks ascribed to conventional PCA, perhaps the major one is that, once the fault detection models have been structured in the training step, their monitoring schemes remain invariant. This feature becomes a significant disadvantage considering that real industrial processes usually demonstrate slow time-varying behaviors, such as catalyst deactivation, heat exchanger fouling, equipment and sensor aging and process time-drifting (Chen & Liao, 2002; Gallagher, Wise, Butler, White, & Barna, 1997; Wold, 1994; Yingwei, Shuai, & Yongdong, 2012). As a consequence, false alarms will eventually occur unless the underlying statistical structure is updated.

When processes exhibit slowly varying changes, adaptive/recursive approaches are more suitable to address the false alarm issue. On the other hand, when processes exhibit several different operating conditions, multimode monitoring approaches should be implemented (Zhiqiang, Zhihuan & Furong, 2013). Other monitoring methods that tackle both time-varying and multimode processes are found in the literature, such as the just-in-time-learning model proposed by Cheng and Chiu (2005), the adaptive local model proposed by Ge and Song (2008), and the external analysis combined with ICA (Independent Component Analysis) proposed by Kano, Hasebe, Hashimoto and Ohno (2004). An overview of these methods is given in Fig. 1.

Many adaptive approaches have been developed based on PCA and Partial Least Squares (PLS) algorithms, as these techniques account for the most successful industrial implementations. Alkaya and Eker (2011) proposed a PCA-based variance-sensitive fault detection algorithm combined with a dynamic threshold, which mitigates false alarms caused by time-drifting behaviors by following the trend of the T² statistic and adjusting the detection threshold. Accounting for time-varying behavior and variable autocorrelation is a key feature of a robust fault detection scheme and one of the most important development branches in the fault detection field. Some approaches combine univariate techniques such as Exponentially Weighted Moving Average (EWMA) and Cumulative Sum (CUSUM) charts with PCA; for example, Wold (1994) discussed the use of EWMA filters in conjunction with PCA and PLS. Ku, Storer, and Georgakis (1995) proposed a modification of PCA that includes time-lagged information to account for the temporal correlation among the process variables; this method is considered a dynamic version of conventional PCA, hence dynamic PCA (DPCA) (Maravelakis & Castagliola, 2009; Rato & Reis, 2013; Weihua & Qin, 2001).
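The time-lagged augmentation behind DPCA can be illustrated with a minimal sketch. It assumes the usual construction in which each sample is stacked with its previous samples before PCA is applied; the function name `lagged_matrix` and the parameter `lags` are hypothetical and are not taken from Ku et al. (1995).

```python
import numpy as np


def lagged_matrix(X, lags):
    """Augment each sample x(t) with its `lags` previous samples.

    Row t of the result is [x(t), x(t-1), ..., x(t-lags)], so an
    (n, m) input yields an (n - lags, m * (lags + 1)) output.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 0 <= lags < n:
        raise ValueError("lags must satisfy 0 <= lags < number of samples")
    # Block j holds the data delayed by j samples.
    blocks = [X[lags - j : n - j] for j in range(lags + 1)]
    return np.hstack(blocks)


X = np.arange(10.0).reshape(5, 2)    # 5 samples of 2 variables
X_lagged = lagged_matrix(X, lags=2)
print(X_lagged.shape)                # (3, 6)
print(X_lagged[0])                   # x(2), x(1), x(0) side by side
```

Conventional PCA would then be fitted to the augmented matrix instead of the raw data, at the cost of a wider matrix and `lags` fewer usable rows.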

Wang, Kruger, and Irwin (2005) presented a fast moving window PCA approach to improve the efficiency of monitoring time-varying processes, and Liu, Kruger, and Littler (2009) developed a moving window kernel PCA for non-linear time-varying processes. Rigopoulos, Arkun, Kayihan and Hanezyc (1996) used a similar window scheme to identify significant modes in a simulated paper machine profile. Rannar, MacGregor and Wold (1997) used a hierarchical PCA for adaptive batch monitoring in a way similar to EWMA-based PCA. For industrial processes with multiple modes, different multimode approaches to process monitoring have been developed, such as the real-time monitoring approach proposed by Hwang and Han (1999). Moreover, when the transition period between two operation modes must be monitored, soft modeling algorithms offer an alternative for performing fault detection (Choi et al., 2005; Ge & Song, 2010; Yu & Qin, 2008).

Besides adaptive approaches, there are also solutions that rely on the periodic incorporation of new process data, thus recursively updating the statistical fault detection model. Dayal and MacGregor (1997) developed a recursive exponentially weighted PLS method for adaptive control in industrial processes, and Wang, Kruger and Lennox (2003) built a recursive PLS (RPLS) model for adaptive monitoring in complex industrial processes. Naik, Yin, Ding and Zhang (2010) proposed algorithms for the recursive identification of parity-based fault detection systems that update their eigenstructure after every new measurement, which improves fault detection performance under frequent shifts in operating point or parameter variations. Qin (1998a, 1998b) proposed several RPLS algorithms for both off-line and on-line process modeling, which allow adaptation to process changes and handle a large number of data samples. These algorithms include a block-wise RPLS with a moving window and a forgetting factor adaptation scheme, and an off-line block-wise RPLS used to reduce computation time and computer memory usage in PLS regression and cross-validation.

Like adaptive approaches, recursive techniques have been developed based on PCA to take advantage of its widespread implementation. Jeng (2010) proposed a recursive PCA (RPCA) algorithm based on a rank-one matrix update. This algorithm mean-centers the data; however, it does not perform auto-scaling, and so neglects the effect of process changes on the standard deviations of the process variables. Besides, the update is made after every new measurement (sample by sample), which makes it inconvenient because of the large number of floating-point operations (FLOPs) required. Weihua, Yue, Valle-Cervantes, and Qin (2000) proposed two PCA-based algorithms, using a rank-one modification and a Lanczos tridiagonalization, respectively. After a computational complexity assessment, Weihua et al. concluded that the algorithm based on rank-one modification is less demanding. The rank-one algorithm carries out auto-scaling to account for changes in the standard deviations of process variables; nevertheless, it requires two spectral decompositions to update the eigenstructure. In addition, the formulas used to update the covariance matrix and standard deviations may be improved to lower their complexity. This algorithm also features a forgetting factor μ to weight current and new datasets.
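To make the idea of a sample-by-sample rank-one update concrete, the sketch below uses the textbook recursion for the sample mean and the unbiased covariance matrix. It is an illustration only, not the equations of Jeng (2010) or Weihua et al. (2000): it applies no forgetting factor and no auto-scaling, and it does not update the eigenvectors. The name `update_mean_cov` is hypothetical.

```python
import numpy as np


def update_mean_cov(mean, cov, k, x):
    """Fold one new sample x into a mean and unbiased covariance built from k samples."""
    if k < 2:
        raise ValueError("need at least two samples in the current model")
    d = x - mean                        # deviation from the old mean
    new_mean = mean + d / (k + 1)
    # Rank-one modification: rescale the old covariance and add one outer product.
    new_cov = (k - 1) / k * cov + np.outer(d, d) / (k + 1)
    return new_mean, new_cov, k + 1


rng = np.random.default_rng(0)
data = rng.normal(size=(50, 3))

# Start from a batch model of the first 10 samples, then update sample by sample.
k = 10
mean = data[:k].mean(axis=0)
cov = np.cov(data[:k], rowvar=False)
for x in data[k:]:
    mean, cov, k = update_mean_cov(mean, cov, k, x)

# Compare the recursive result with the batch estimate over all 50 samples.
mean_ok = np.allclose(mean, data.mean(axis=0))
cov_ok = np.allclose(cov, np.cov(data, rowvar=False))
print(mean_ok, cov_ok)
```

Because each step changes the covariance matrix by a rescaling plus a single outer product, the eigenstructure can in principle be refreshed with a rank-one eigenproblem solver instead of a full decomposition, which is the motivation for the rank-one algorithms discussed above.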

In this paper, a new weighted adaptive recursive PCA-based algorithm is developed. The proposed algorithm is compared with other recursive PCA-based algorithms (Weihua, Yue, Valle-Cervantes, & Qin, 2000; Jeng, 2010) in terms of false alarm rate, misdetection rate, detection delay and computational complexity. The paper is organized as follows. Section 2 presents the background on conventional PCA. Section 3 develops the recursive formulas proposed for PCA (WARP). Section 4 compares the computational complexity of the proposed recursive formulas for updating means, standard deviations and the eigenstructure with that of two sets of recursive formulas found in the literature. Section 5 assesses the performance of all the algorithms on a benchmark process, whilst Section 6 contains a second validation with real process data from a regional natural gas pipeline. Finally, Section 7 presents the conclusions and future work related to this investigation.

Section snippets

Conventional PCA algorithm

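Historical process data corresponding to normal operation is arranged in a matrix $\chi_0 \in \mathbb{R}^{n \times m_0}$, where $m_0$ is the number of process variables being measured and $n$ is the number of samples. Variables with null variance or missing signal problems are removed, so that $\chi_0$ becomes $X_\rho \in \mathbb{R}^{n \times m}$, with $m \le m_0$. The means of these $m$ remaining variables are contained in the vector $b \in \mathbb{R}^{m}$:

$$b = \frac{1}{n} (X_\rho)^{T} I_n, \qquad I_n = [1 \; 1 \; \cdots \; 1]^{T} \in \mathbb{R}^{n}$$

The standard deviations of the $m$ variables are contained in:

$$\Sigma = \begin{bmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_m \end{bmatrix} \in \mathbb{R}^{m \times m}$$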

The data
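The training step of conventional PCA can be sketched as follows. Since the remainder of this section is not reproduced here, the sketch assumes the common procedure: the data are auto-scaled with the mean vector b and the standard deviations in Σ, the correlation matrix is decomposed, and the T² and Q statistics are evaluated for a sample. The function names, the use of the sample standard deviation (`ddof=1`), and the cumulative-variance rule for choosing the number of components are all assumptions, not the paper's stated choices.

```python
import numpy as np


def fit_pca(X, var_fraction=0.9):
    """Auto-scale X (n samples by m variables) and decompose its correlation matrix."""
    n, m = X.shape
    b = X.T @ np.ones(n) / n                  # mean vector, b = (1/n) X^T 1_n
    sigma = X.std(axis=0, ddof=1)             # diagonal entries of Sigma (assumed ddof=1)
    Z = (X - b) / sigma                       # auto-scaled data
    R = Z.T @ Z / (n - 1)                     # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Assumed selection rule: keep enough components to reach var_fraction.
    cum = np.cumsum(eigvals) / eigvals.sum()
    a = min(int(np.searchsorted(cum, var_fraction)) + 1, m)
    return b, sigma, eigvals, eigvecs, a


def t2_and_q(x, b, sigma, eigvals, eigvecs, a):
    """Hotelling's T2 and the Q statistic (squared prediction error) of one raw sample."""
    z = (x - b) / sigma
    P = eigvecs[:, :a]                        # retained loadings
    t = P.T @ z                               # scores
    t2 = float(np.sum(t**2 / eigvals[:a]))
    residual = z - P @ t                      # part of z outside the PCA subspace
    q = float(residual @ residual)
    return t2, q


rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(200, 5))

model = fit_pca(X)
t2, q = t2_and_q(X[0], *model)
print(model[-1], t2, q)
```

In a monitoring scheme, each statistic would be compared against its statistical threshold; the thresholds are omitted here because their formulas are not given in this excerpt.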

Recursive formulas proposed for PCA

A new set of recursive equations is developed to update the conventional PCA statistical model by incorporating the newly available information within a specific online process dataset. A weighting (or learning) factor is introduced to assign “importance” to the new online dataset about to be included. The notation used throughout this section is presented below.
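The WARP update equations themselves are not reproduced in this excerpt. As a generic illustration of how a weighting factor can blend an existing model with a new block of online data, the sketch below combines the two as a weighted mixture of means and covariance matrices. The function `weighted_block_update` and the parameter `weight_new` are hypothetical, and the formulas are the standard mixture identities, not the authors' recursions.

```python
import numpy as np


def weighted_block_update(b_old, C_old, X_new, weight_new):
    """Blend an existing mean/covariance with a new data block.

    weight_new in (0, 1) is the importance given to the new block;
    the old model keeps 1 - weight_new.
    """
    if not 0.0 < weight_new < 1.0:
        raise ValueError("weight_new must lie strictly between 0 and 1")
    b_blk = X_new.mean(axis=0)
    C_blk = np.cov(X_new, rowvar=False, ddof=0)
    b_new = (1.0 - weight_new) * b_old + weight_new * b_blk
    d_old, d_blk = b_old - b_new, b_blk - b_new
    # Each covariance is re-centred on the new mean before blending.
    C_new = (1.0 - weight_new) * (C_old + np.outer(d_old, d_old)) \
        + weight_new * (C_blk + np.outer(d_blk, d_blk))
    return b_new, C_new


rng = np.random.default_rng(2)
X_old = rng.normal(size=(300, 4))
X_new = rng.normal(loc=0.5, size=(100, 4))      # block with a drifted mean

b_old = X_old.mean(axis=0)
C_old = np.cov(X_old, rowvar=False, ddof=0)

# With the weight equal to the new block's share of the samples,
# the blend reproduces the batch statistics of the pooled data.
w = len(X_new) / (len(X_old) + len(X_new))
b_new, C_new = weighted_block_update(b_old, C_old, X_new, weight_new=w)

X_all = np.vstack([X_old, X_new])
print(np.allclose(b_new, X_all.mean(axis=0)))
```

Choosing a weight larger than the new block's sample share makes the model adapt faster to recent data, whereas a smaller weight makes it more conservative. The updated standard deviations follow from the square roots of the diagonal of `C_new`.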

Computational complexity

Computational complexity is assessed in terms of the FLOPs spent by all the operations required to update the eigenstructure. The assessment covers the proposed algorithm and two sets of recursive formulas found in the literature: one proposed by Jeng (2010) and another proposed by Weihua et al. (2000). The results of the comparison are summarized in Table 1. As a remark, the computational complexity of the three sets has been expressed according to the notation presented

Performance assessment

As may be observed in Fig. 4, the algorithm proposed in this paper has the lowest computational complexity compared to Jeng (2010) and Weihua, Yue, Valle-Cervantes, & Qin (2000), consuming between 7 and 20 times fewer FLOPs within the evaluated span of process variables. It is expected to perform even better on this indicator for more complex processes with a larger number of variables. This reduction of computational demand is of special significance considering that a

Validation with real process data: natural gas transmission pipeline

A complementary validation of the proposed technique (WARP) alongside a reference technique (Weihua et al.) is performed using a real process dataset from a regional natural gas transmission pipeline. This dataset was also used by Torres, Posada, Garcia and Sanjuan (2012) in a previous research study on fault detection in natural gas pipelines.

Conclusions

A literature review showed that among the drawbacks ascribed to conventional PCA, perhaps the major one is that, once the fault detection models have been structured in the training step, their monitoring schemes remain invariant. This feature becomes a significant disadvantage considering that real industrial processes usually demonstrate slow time-varying behaviors. A new weighted recursive PCA-based algorithm (WARP) was developed in order to address the rise of false alarms in process

References (48)

  • J.C. Jeng

    Adaptive process monitoring using efficient recursive PCA and moving window PCA algorithms

    Journal of the Taiwan Institute of Chemical Engineers

    (2010)
  • M. Kano et al.

    Evolution of multivariate statistical process control: application of independent component analysis and external analysis

    Computers and Chemical Engineering

    (2004)
  • W. Ku et al.

    Disturbance detection and isolation by dynamic principal component analysis

    Chemometrics and Intelligent Laboratory Systems

    (1995)
  • J.M. Lee et al.

    Statistical monitoring of dynamic processes based on dynamic independent component analysis

    Chemical Engineering Science

    (2004)
  • W. Lin et al.

    Nonlinear dynamic principal component analysis for on-line process monitoring and diagnosis

    Computers and Chemical Engineering

    (2000)
  • X. Liu et al.

    Moving window kernel PCA for adaptive monitoring of nonlinear processes

    Chemometrics and Intelligent Laboratory Systems

    (2009)
  • P. Maravelakis et al.

    EWMA chart for monitoring the process standard deviation when parameters are estimated

    Computational Statistics and Data Analysis

    (2009)
  • A.S. Naik et al.

    Recursive identification algorithms to design fault detection systems

    Journal of Process Control

    (2010)
  • T.J. Rato et al.

    Fault detection in the Tennessee Eastman benchmark process using dynamic principal component analysis based on decorrelated residuals (DPCA-DR)

    Chemometrics and Intelligent Laboratory Systems

    (2013)
  • V. Venkatasubramanian et al.

    A review of process fault detection and diagnosis Part III: Process history based methods

    Computers and Chemical Engineering

    (2003)
  • X. Wang et al.

    Recursive partial least squares algorithms for monitoring complex industrial processes

    Control Engineering Practice

    (2003)
  • S. Wold

    Exponentially weighted moving principal component analysis and projection to latent structures

    Chemometrics and Intelligent Laboratory Systems

    (1994)
  • Amanian, K., Salahshoor, K., Jafari, M.R., Mosallaei, M. (2007). Soft Sensor Based on Dynamic Principal Component...
  • J.R. Bunch et al.

    Rank-one modification of the symmetric eigenproblem

    Numerische Mathematik

    (1978)