Run-time prediction of business process indicators using evolutionary decision rules

doi:10.1016/j.eswa.2017.05.069

Expert Systems with Applications

Volume 87, 30 November 2017, Pages 1-14

https://doi.org/10.1016/j.eswa.2017.05.069

Highlights

• An evolutionary rule-based system for the prediction of BP indicators is proposed.
• Generated decision rules can be easily interpreted by users.
• A software stack to support the stages of a predictive monitoring system is presented.

Abstract

Predictive monitoring of business processes is a challenging topic in process mining that is concerned with the prediction of process indicators of running process instances. The main value of predictive monitoring is to provide information in order to take proactive and corrective actions to improve process performance and mitigate risks in real time. In this paper, we present an approach for predictive monitoring based on the use of evolutionary algorithms. Our method provides a novel event window-based encoding and generates a set of decision rules for the run-time prediction of process indicators according to event log properties. These rules can be interpreted by users to extract further insight into the business processes while keeping a high level of accuracy. Furthermore, we have developed a full software stack consisting of a tool to support the training phase and a framework that enables the integration of run-time predictions with business process management systems. The results obtained show the validity of our proposal for two large real-life datasets: BPI Challenge 2013 and IT Department of Andalusian Health Service (SAS).

Introduction

Process mining techniques allow the extraction of useful information from the event log and historical data of business processes (Li, Ge, Huang, Hu, Wu, Yang, Hu, & Luo, 2016a; Schönig, Cabanillas, Jablonski, & Mendling, 2016). Knowledge can be generated from this information to improve the processes (Kamsu-Foguem, Rigal, & Mauget, 2013; Nkambou, Fournier-Viger, & Mephu-Nguifo, 2011; Potes-Ruiz, Kamsu-Foguem, & Grabot, 2014). Generally, this knowledge is extracted after the process has finished. Nevertheless, interest in applying process mining to running process instances is increasing (Maggi, Francescomarino, Dumas, & Ghidini, 2014). One of the main issues in process mining is the predictive monitoring of business processes (de Leoni, van der Aalst, & Dees, 2016). The main value of predictive monitoring is to provide information in order to take proactive and corrective actions to improve process performance and mitigate risks in real time. Predictive monitoring of business processes predicts the indicators of a running process instance by means of generated predictive models. Business process indicators are quantifiable metrics that can be measured directly from data generated within the process flow (del Río-Ortega, Resinas, Cabanillas, & Ruiz-Cortés, 2013). On many occasions, an improvement in the prediction of these indicators also means savings in human and economic resources and prevents significant losses of turnover for companies. Some issues of real companies can also be solved with predictive monitoring. For instance, the Push to Front problem, detailed in Verbeek (2013) and covered in this work, tries to identify those incidents that are not resolved by the service desk and are pushed to the other support lines of the company.

Since predicting these process indicators can be interpreted as a classification or regression problem, machine learning algorithms can be used for this task (Francescomarino, Dumas, Maggi, & Teinemaa, 2015; Maggi, Francescomarino, Dumas, & Ghidini, 2014). Classification and regression are used for the prediction of discrete and continuous target values, respectively. For instance, predicting an indicator such as the cycle time of a process instance can be regarded as a regression problem. By contrast, predicting the fulfillment of a given target, e.g. the process instance must complete in less than 4 h, or of a condition, e.g. whether a specific activity occurs in the process instance, can be interpreted as a classification problem.
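The distinction between the two kinds of target can be illustrated with a minimal Python sketch. The function names, the sample timestamps and the activity names below are hypothetical and chosen only for illustration; they are not taken from the paper or its datasets.

```python
from datetime import datetime, timedelta


def cycle_time_hours(timestamps):
    """Regression target: hours elapsed between the first and last event."""
    return (max(timestamps) - min(timestamps)) / timedelta(hours=1)


def completes_within(timestamps, limit_hours=4.0):
    """Classification target: did the instance finish in less than the limit?"""
    return cycle_time_hours(timestamps) < limit_hours


def activity_occurs(activities, name):
    """Classification target: does a specific activity occur in the instance?"""
    return name in activities


# Hypothetical finished process instance (illustrative values only).
trace_times = [
    datetime(2017, 1, 9, 8, 0),
    datetime(2017, 1, 9, 9, 30),
    datetime(2017, 1, 9, 13, 15),
]
trace_activities = ["Accepted", "Queued", "Completed"]

regression_target = cycle_time_hours(trace_times)                  # continuous value
deadline_target = completes_within(trace_times)                    # discrete value
occurrence_target = activity_occurs(trace_activities, "Queued")    # discrete value
```

At training time such targets would be computed from completed traces; at run time the model has to predict them from the events observed so far.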

Multiple machine learning approaches have been applied for predictive monitoring, such as decision trees (Maggi, Francescomarino, Dumas, & Ghidini, 2014), clustering methods (Francescomarino, Dumas, Maggi, & Teinemaa, 2015) or neural networks (Tax, Verenich, Rosa, & Dumas, 2016). Nevertheless, to the best of our knowledge, evolutionary algorithms (EAs) have not been applied for the prediction of process indicators. The use of an evolutionary algorithm may be justified for four different reasons (Fogel, 1997): (a) it can handle continuous and discrete attributes and automatically discretizes the continuous features; (b) it also handles missing attribute values and noise; (c) it can build models that can be easily interpreted by humans; and finally (d) it finds a subset of the features that are relevant to the classification without the use of feature selection. In addition, EAs have shown the capacity to find suboptimal solutions when the search space is characterized by high dimensionality (Marquez-Chamorro, Asencio-Cortes, Divina, & Aguilar-Ruiz, 2014). In this case, the set of possible state conditions of a process, encoded in decision rules, determines the search space and fulfils these requirements. Some methods in the process mining area also use association or decision rules to improve the performance of the processes (Karray, Chebel-Morello, & Zerhouni, 2014; Wen, Zhong, & Wang, 2015).
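To make the idea of a decision rule over mixed attributes concrete, the following Python sketch shows one possible representation: interval conditions for continuous attributes and value sets for discrete ones. The `Rule` class, the attribute names and the convention that a missing value fails a condition are assumptions made for this illustration, not the representation used by the authors.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """IF every condition holds THEN predict `label` (hypothetical representation)."""
    intervals: dict = field(default_factory=dict)  # continuous attribute -> (low, high)
    allowed: dict = field(default_factory=dict)    # discrete attribute -> set of values
    label: str = "positive"

    def covers(self, instance):
        # Continuous attributes must fall inside their interval.
        for attr, (low, high) in self.intervals.items():
            value = instance.get(attr)
            # Assumption: a missing value does not satisfy the condition.
            if value is None or not (low <= value <= high):
                return False
        # Discrete attributes must take one of the allowed values.
        for attr, values in self.allowed.items():
            if instance.get(attr) not in values:
                return False
        return True


def predict(rules, instance, default="negative"):
    """Return the label of the first rule that covers the instance."""
    for rule in rules:
        if rule.covers(instance):
            return rule.label
    return default


# Illustrative rule; attribute names and thresholds are invented.
rules = [
    Rule(
        intervals={"elapsed_hours": (2.0, 48.0)},
        allowed={"last_activity": {"Queued"}},
        label="pushed",
    )
]
covered = predict(rules, {"elapsed_hours": 5.0, "last_activity": "Queued", "impact": "Low"})
not_covered = predict(rules, {"elapsed_hours": 0.5, "last_activity": "Queued"})
```

Attributes that a rule does not mention (here `impact`) are simply ignored, which is how a rule of this shape expresses a subset of relevant features. It can also be read directly as "IF elapsed_hours in [2, 48] AND last_activity = Queued THEN pushed".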

In this work, we have developed a general method based on an evolutionary rule learning approach for the prediction of business process indicators at execution time. The resulting model consists of a set of decision rules that determine a prediction for an indicator of a running process instance. As encoded features, we have employed a window of the events that precede the point in the process execution where the prediction is carried out. This window of events considers attributes of a typical event log, such as activity name or timestamps, together with the data of each event. A combination of continuous and discrete values is allowed by the evolutionary algorithm. An advantage of this approach is that the generated decision rules can be interpreted by users to extract further insight into the business processes. Furthermore, as previously mentioned, the method incorporates a new encoding based on event windows of different sizes, which provides more information from event logs. Additionally, this method is accompanied by a full software stack that we have developed to support both the training and the prediction phases of our predictive monitoring approach. The learning phase is supported by a ProM plugin that helps in the computation of process indicators and the preprocessing of the event log for the machine learning algorithm. The prediction phase is supported by a framework that enables the integration of run-time predictions, obtained from the predictive models generated in the training phase, with business process management systems like Camunda (Camunda, 2016).
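A window-based encoding of this kind can be sketched in Python as follows. This is a simplified illustration under stated assumptions: events are plain dictionaries, the feature names follow an invented `attribute_slot` pattern, and empty slots near the start of a trace are filled with `None`. The exact encoding used by the authors is described in the full paper and may differ.

```python
def encode_window(events, position, size, attributes=("activity", "timestamp")):
    """Flatten the `size` events that precede `position` into one feature dict.

    Slots for which no event exists yet (near the start of a trace)
    are filled with None.
    """
    window = events[max(0, position - size):position]
    padding = [None] * (size - len(window))
    features = {}
    for slot, event in enumerate(padding + window, start=1):
        for attr in attributes:
            features[f"{attr}_{slot}"] = None if event is None else event.get(attr)
    return features


# Hypothetical running trace; timestamps are hours since the case started.
events = [
    {"activity": "Accepted", "timestamp": 0.0},
    {"activity": "Queued", "timestamp": 1.5},
    {"activity": "Accepted", "timestamp": 3.0},
]

# Prediction point after the second event, using a window of three events.
features = encode_window(events, position=2, size=3)
```

Each prediction point in each trace yields one such fixed-length instance, so windows of different sizes simply produce instances with more or fewer slots.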

Our approach was exhaustively tested with two different real-life event logs to assess the validity of the proposal. The datasets belong to the IT Department of Health Services of Andalusia (Spain) and the BPI 2013 Challenge (Verbeek, 2013). For the validation of the proposal, we also include a comparison with a method from the literature, described in Breuker, Matzner, Delfmann, and Becker (2016), and with several machine learning approaches, under the same experimental conditions, in order to justify the use of the evolutionary algorithm.

The remainder of this paper is organized as follows. Section 2 introduces the main concepts referred to throughout the paper. Section 3 summarizes the related work in this area. Section 4 introduces our methodology. Section 5 presents the experimentation and the results obtained. Finally, Section 6 includes some conclusions and possible future work.

Background on predictive monitoring

The goal of predictive monitoring is to predict some aspect of the execution of a running process instance. To do so, it relies on the existence of an event log that contains the relevant information of the execution of a business process. An event log (L) is composed of a set of traces (T) that contain each event (E) that occurs in the different instances of a business process. Each execution of a process instance is reflected in a trace. Formally, we can express a trace T_i as a list of events
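The structure just described (a log made of traces, each trace a list of events) can be sketched with Python dataclasses. The class and field names are assumptions chosen for illustration; they do not correspond to a specific event log standard or to the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Event:
    activity: str
    timestamp: float                                       # e.g. seconds since the epoch
    data: Dict[str, Any] = field(default_factory=dict)     # additional event attributes


@dataclass
class Trace:
    case_id: str
    events: List[Event] = field(default_factory=list)

    def prefix(self, k):
        """Events observed so far when the running instance has executed k events."""
        return self.events[:k]


@dataclass
class EventLog:
    traces: List[Trace] = field(default_factory=list)

    def num_events(self):
        return sum(len(trace.events) for trace in self.traces)


# Hypothetical log with a single process instance.
log = EventLog(traces=[
    Trace("case-1", [
        Event("Accepted", 0.0, {"impact": "Low"}),
        Event("Queued", 5400.0),
        Event("Completed", 18900.0),
    ]),
])
running = log.traces[0].prefix(2)
```

A prefix of a trace is what is available at prediction time for a running instance, whereas complete traces are used for training.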

Related work

Several approaches for the predictive monitoring of business process indicators can be found in the literature. We have classified some of these methods according to the type of predicted outcome, such as time, risk indicators, SLA violation indicators and other indicators.

Time is one of the most valuable indicators during the execution of a business process. The following two works predict the expected time of the process. In Polato, Sperduti, Burattin, and de Leoni (2014), the authors present a

Predictive monitoring with evolutionary algorithms

This section describes our proposal for the prediction of business process indicators. In particular, our proposal is based on an evolutionary algorithm (EA) using the information data of a window of events described in the event logs. This section is divided into three different subsections. The first one introduces the procedure of predictive monitoring. The calculation of indicators and the encoding of the algorithm are detailed in Section 4.2. Finally, the evolutionary method and a brief

A software system for predictive monitoring

Two software systems were developed for our proposal: one of them is used to preprocess the event log during the training phase, whereas the other is used to support the prediction phase. In addition, generic machine learning software was used to support the process of creating the prediction model. The reason why we decided not to include support for building the prediction model together with the log preprocessor is that there are dozens of machine learning frameworks that provide very

Experimentation

This section presents the experimentation results obtained by the evolutionary approach. The aim of this experimentation is to provide an analysis of the effectiveness of our system with respect to other machine learning approaches. In particular, our evaluation focuses on the following research questions:

• RQ1. Does the proposed approach provide reliable results in terms of prediction with respect to other similar machine learning approaches?
• RQ2. Does the proposed encoding provide

Conclusions and future work

In this paper, an evolutionary decision rule-based system for the prediction of business process indicators is described. The encoding of this approach is based on the attributes of the events extracted from the event logs. The decision rules determine a prediction of a specified process indicator. Our system can predict instance-level indicators using both next-event and end-of-instance predictions. Unlike the prediction models obtained by other machine learning techniques, these generated

Acknowledgements

This work has received funding from the European Commission (FEDER), the Spanish and the Andalusian R+D+I programmes, BELI and COPAS, [grant numbers TIN2015-70560-R, P12TIC-1867] and Juan de la Cierva program [JCF 2015].

References (57)

R. Conforti et al. A recommendation system for predicting risks across multiple business process instances. Decision Support Systems (2015)

J.B. Gray et al. Classification tree analysis using target. Computational Statistics and Data Analysis (2008)

B. Kamsu-Foguem et al. Mining association rules for the quality improvement of the production process. Expert Systems with Applications (2013)

B. Kang et al. Real-time business process monitoring method for prediction of abnormal termination using KNNI-based LOF prediction. Expert Systems with Applications (2012)

M.H. Karray et al. PETRA: Process evolution using a trace-based system on a maintenance platform. Knowledge-Based Systems (2014)

C. Li et al. Process mining with token carried data. Information Sciences (2016)

G. Li et al. Data-driven root cause diagnosis of faults in process industries. Chemometrics and Intelligent Laboratory Systems (2016)

F.M. Maggi et al. Predictive monitoring of business processes. Advanced information systems engineering (2014)

M. Park et al. Workload and delay analysis in manufacturing process using process mining. Asia Pacific business process management - third Asia Pacific conference, AP-BPM 2015, Busan, South Korea, June 24-26 (2015)

M. Polato et al. Data-aware remaining time prediction of business process instances. International joint conference on neural networks (IJCNN14) (2014)

J.R. Quinlan. C4.5: Programs for machine learning (1993)

A. del Río-Ortega et al. Visual PPINOT: A graphical notation for process performance indicators. Business & Information Systems Engineering (2017)

S. Schönig et al. A framework for efficiently mining the organisational perspective of business processes. Decision Support Systems (2016)

K. Yoon et al. An unsupervised learning approach to resolving the data imbalanced issue in supervised learning problems in functional genomics. 5th international conference on hybrid intelligent systems (HIS05), Rio de Janeiro, Brazil (2005)

J.S. Aguilar-Ruiz et al. Evolutionary learning of hierarchical decision rules. IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics (2003)

J. Alcala-Fdez et al. KEEL data-mining software tool: Data set repository, integration of algorithms and experimental analysis framework. Journal of Multiple-Valued Logic and Soft Computing (2011)

G.E. Batista et al. A study of the behavior of several methods for balancing machine learning training data. SIGKDD Explorations (2004)

P. Branco et al. A survey of predictive modeling on imbalanced domains. ACM Computing Surveys (2016)

C. Bratosin et al. Discovering process models with genetic algorithms using sampling. Proceedings of the 14th international conference on knowledge-based and intelligent information and engineering systems: Part I (2010)

D. Breuker et al. Comprehensible predictive models for business processes. MIS Quarterly (2016)

D.S. Broomhead et al. Multivariable functional interpolation and adaptive networks. Complex Systems (1988)

C. Cabanillas et al. Predictive task monitoring for business processes. Business process management (2014)

Camunda (2016). An open source platform for workflow and business process management. Accessed: 09.03.17....

N.V. Chawla et al. SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research (2002)

R. Conforti et al. Supporting risk-informed decisions during business process execution. Advanced information systems engineering (2013)

A.L. Corcoran et al. Using real-valued genetic algorithms to evolve rule sets for classification. 1st IEEE conference on evolutionary computation. Orlando, USA (1994)

C. Cortes et al. Support vector networks. Machine Learning (1995)

B. van Dongen et al. The ProM framework: A new era in process mining tool support. Applications and theory of petri nets (2005)
