Real-time business process monitoring method for prediction of abnormal termination using KNNI-based LOF prediction
Introduction
A business process monitoring system is defined as an information system that provides real-time access to process management indicators (Buytendijk & Flint, 2002). A business process is represented as a set of tasks and their flows, orchestrated to achieve a common business goal (Keung & Kawalek, 1997). Each execution of a process generates a process instance, which is observed through a set of process- or task-relevant attributes. The monitoring system records these attributes, which include the start and completion times of each task, input and output data, and resources or information from any event occurring during execution (Grigori et al., 2004). By archiving these instance logs and analyzing the relations between process attributes and results, we can extract knowledge for monitoring running instances through their observed attributes, which helps to diagnose the current state of a running instance and predict probable issues (Wang & Romagnoli, 2005). Such knowledge is extracted in the form of If-Then rules by means of an inductive data mining technique (Ma & Wang, 2009); this is known as the rule-based approach. A rule is defined as a correlation between a pattern of process attributes and the corresponding result. In rule-based monitoring, when a specific condition of the attributes is detected, the status of the process is identified and a predefined operation is triggered (Grigori et al., 2001; Grigori et al., 2004). One goal of rule-based monitoring is fault detection, that is, the identification of infrequent process patterns that deviate from the normal, frequent pattern.
Various fault detection algorithms have been applied in fields such as municipal solid waste incineration (Chen & Lin, 2008), fraud detection in financial processes (Yue, Wu, Wang, Li, & Chu, 2007), emergency department triage (Nie, Zhang, Liu, Zheng, & Shi, 2010), and infrequent image detection in video streams (Medioni, Cohen, Hongeng, Bremond, & Nevatia, 2001).
Local outlier factor (LOF) is one of the most widely used fault detection algorithms. It is based on relative density: it measures how isolated a pattern is with respect to its surrounding neighbors, and this degree of isolation indicates how likely the pattern is to be a fault (Breunig, Kriegel, Ng, & Sander, 2000). LOF can therefore detect local faults as well as global faults; indeed, it typically achieves the best performance among the algorithms against which it has been tested (Lazarevic, Ertoz, Ozgur, Srivastava, & Kumar, 2003).
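The relative-density idea can be made concrete with a small sketch. The code below follows the standard LOF definition of Breunig et al. (2000) in plain Python; the function names, the Euclidean distance, and the toy data are assumptions chosen for illustration, not details taken from this paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length attribute vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lof_scores(points, k):
    """Return the LOF value of every point.

    Assumes len(points) > k and that no point has k or more exact
    duplicates (otherwise a local reachability density is undefined).
    """
    n = len(points)
    dist = [[euclidean(p, q) for q in points] for p in points]

    # k-distance and k-neighborhood of each point (ties are included).
    k_dist, neighbors = [], []
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]
        k_dist.append(kd)
        neighbors.append([j for d, j in others if d <= kd])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean density of the neighbors relative to the point's own density.
    return [
        sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]


# Toy data (hypothetical): four clustered instances and one isolated instance.
instances = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = lof_scores(instances, k=2)
```

Points inside the cluster receive values close to 1, while the isolated point receives a clearly larger value, which is the property the monitoring approach relies on.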
Existing rule-based approaches require that all attributes be obtained. In that sense, a process state is identified effectively, instance by instance, when all attribute conditions are detected instantly or near instantly at the end of the process. However, these approaches show limitations when applied to real-time process monitoring. As a process becomes longer and more complex, executing a process instance takes more time, so attributes are detected gradually as the execution period elapses (Kang, Lee, Min, & Cho, 2009). Therefore, when monitoring an ongoing instance in real time, the monitoring system has to remain idle until all conditions are detected upon termination (Grigori et al., 2001; Grigori et al., 2004). Even if the instance terminates abnormally, the monitoring system can offer only a reactive operation that resolves the abnormal termination after it has actually occurred (Kim, Choi, & Park, 2010). These limitations call for proactive real-time process monitoring systems that predict final outcomes from the current status at midcourse (Leitner, Wetzstein, Rosenberg, Michlmayr, & Leymann, 2010).
To alleviate these limitations, this paper proposes a novel approach to real-time business process monitoring for fault (especially abnormal termination) prediction using LOF and an imputation method. Over the course of real-time monitoring, attributes that are still unknown are replaced, by imputation, with assumed values corresponding to probable results after the current monitoring period. LOF values are then computed from plausible instances composed of the observed attributes and the attributes imputed from the probable next states given the current state. Next, the LOF value that the ongoing instance will have at termination is estimated probabilistically in each monitoring period. With the proposed method, probable outcomes are predicted throughout the monitoring periods on the basis of current performance. By observing this tendency as the instance progresses in real time, an abnormal termination can be predicted proactively, before it actually occurs.
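The imputation step can be illustrated with a minimal sketch of k-nearest-neighbor imputation. It assumes that attributes are numeric, that the first t of m attributes have been observed, and that each unobserved attribute is filled with the mean over the k most similar historical instances; the function name `knn_impute` and the sample data are hypothetical. The paper's probabilistic estimation of LOF values over plausible instances is not reproduced here.

```python
import math


def knn_impute(partial, history, k):
    """Complete a partially observed instance from historical instances.

    partial: the t attribute values observed so far (the first t of m).
    history: complete historical instances, each with m attribute values.
    Returns the observed values followed by imputed values, where each
    imputed value is the mean of that attribute over the k nearest
    historical instances (distance measured on observed attributes only).
    """
    t = len(partial)
    m = len(history[0])

    def distance(row):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(partial, row[:t])))

    nearest = sorted(history, key=distance)[:k]
    imputed = [sum(row[j] for row in nearest) / k for j in range(t, m)]
    return list(partial) + imputed


# Hypothetical training data: three attributes per completed instance.
history = [
    [1.0, 1.0, 10.0],
    [1.0, 2.0, 12.0],
    [9.0, 9.0, 50.0],
    [8.0, 9.0, 48.0],
]
# Two attributes observed so far; the third is imputed from the 2 neighbors.
plausible = knn_impute([1.0, 1.5], history, k=2)
```

The completed instance can then be scored with LOF against the historical instances, which gives an indicator at midcourse rather than only at termination.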
The rest of the paper is organized as follows. Section 2 reviews existing research on LOF algorithms and rule-based monitoring approaches. Section 3 describes the motivation behind the proposed real-time monitoring method and its concept. Section 4 presents the details of the real-time business process monitoring scheme using LOF estimation based on k-nearest-neighbor imputation (KNNI). Section 5 discusses the results of experiments conducted with an example scenario. Finally, Section 6 presents conclusions and future work.
Local outlier factor (LOF) algorithm
In this paper, we focus on the local outlier factor (LOF) algorithm among the various unsupervised fault detection algorithms used with rule-based monitoring approaches. For each object, the LOF algorithm computes the degree to which the object is a fault, called the local outlier factor, according to how isolated the object is from the surrounding objects. A higher LOF value indicates that the local density of a data point is smaller than that of its surrounding points. The LOF value for data
Concept of proposed method
In this section, we present the motivations and concepts behind the proposed method. Fig. 1 schematizes three approaches to LOF-based fault detection for real-time business process monitoring. Let us suppose that, after executing a process model, an ongoing process instance is monitored through a set of m attributes, which are generated gradually with real-time progress and observed by a monitoring system. At monitoring period t at midcourse, only t attributes (t < m) have been recorded; the rest
Overall procedures
In this section, we formulate overall procedures for the real-time business process monitoring method using KNNI-based LOF prediction. Fig. 2 shows the procedures as categorized by phase, either preprocessing or real-time monitoring.
Preprocessing aims at defining an upper control limit of an LOF value by analyzing historical process instances as training data. After calculating their LOF values, an upper control limit (UCL) can be derived by applying kernel density estimation (KDE) of LOF
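As a rough illustration of the preprocessing phase, the sketch below derives an upper control limit from historical LOF values with a Gaussian kernel density estimate. The excerpt does not state how the UCL is read off the estimated density, so taking it as the (1 - alpha) quantile of the KDE is an assumption, as are the function names, the rule-of-thumb bandwidth, and the sample values.

```python
import math
import statistics


def kde_cdf(x, samples, h):
    """CDF of a Gaussian KDE: the mean of normal CDFs centred on each sample."""
    return sum(
        0.5 * (1.0 + math.erf((x - s) / (h * math.sqrt(2.0)))) for s in samples
    ) / len(samples)


def kde_upper_control_limit(lof_values, alpha=0.05, h=None):
    """Return the value below which a fraction (1 - alpha) of KDE mass lies.

    Assumes at least two distinct LOF values so that the bandwidth is positive.
    """
    if h is None:
        # Normal-reference rule-of-thumb bandwidth (an assumption).
        h = 1.06 * statistics.stdev(lof_values) * len(lof_values) ** (-1 / 5)
    lo = min(lof_values) - 10 * h
    hi = max(lof_values) + 10 * h
    # Bisection on the monotone KDE CDF.
    for _ in range(100):
        mid = (lo + hi) / 2
        if kde_cdf(mid, lof_values, h) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical LOF values of historical (training) instances.
training_lof = [1.0, 1.05, 0.95, 1.1, 0.98, 1.02, 1.2, 0.9, 1.0, 1.07]
ucl = kde_upper_control_limit(training_lof, alpha=0.05)
```

Under these assumptions, an instance whose LOF value exceeds `ucl` would be flagged as a candidate abnormal termination.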
Experimental design
We conducted experiments with an example scenario to describe how the proposed method can be applied to real-time business process monitoring. The visualized indicators included the expected LOF value, confidence intervals and the probability of abnormal termination. Then, the error of expected LOF value was analyzed through entire monitoring periods in order to observe real-time progress with decreasing uncertainty. Finally, an early alarm was generated by comparing the probability of abnormal
Conclusion
In this paper, we proposed a novel approach to real-time business process monitoring for prediction of abnormal termination. To realize this monitoring method, we devised a KNNI-based LOF prediction algorithm. The conventional rule-based approach, especially LOF-based fault detection, is inefficient as applied to real-time monitoring, and indeed shows limitations such as no indicator or late alarm, due to inevitably unobserved attributes according to the monitoring period. To improve these
Acknowledgment
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2010-0020943).
References (28)
- et al. (2008). Diagnosis for monitoring system of municipal solid waste incineration plant. Expert Systems with Applications.
- et al. (1997). Development and testing of regeneration imputation models for forests in Minnesota. Forest Ecology and Management.
- et al. (2004). Business process intelligence. Computers in Industry.
- et al. (2010). Processing online analytics with classification and association rule mining. Knowledge-Based Systems.
- et al. (2009). Inductive data mining based on genetic programming: Automatic generation of decision trees from data for process historical data analysis. Computers and Chemical Engineering.
- et al. (2007). Real time diagnostics of technological processes and field equipment. Chemometrics and Intelligent Laboratory Systems.
- et al. (2005). Robust multi-scale principal components analysis with applications to process monitoring. Journal of Process Control.
- et al. (2007). Support vector machine in machine condition monitoring and fault diagnosis. Mechanical Systems and Signal Processing.
- et al. (2004). On-line monitoring of batch processes using multiway independent component analysis. Chemometrics and Intelligent Laboratory Systems.
- et al. (2000). Implementing an industrial continuous improvement system: a knowledge management case study. Industrial Management & Data Systems.
- LOF: Identifying density based local outliers. ACM SIGMOD Record.
- Predictive business operations management. International Journal of Computational Science and Engineering.
- The cases for quantitative process management. IEEE Software.