Run-time prediction of business process indicators using evolutionary decision rules
Introduction
Process mining techniques allow the extraction of useful information from the event logs and historical data of business processes (Li, Ge, Huang, Hu, Wu, Yang, Hu, Luo, 2016a, Schönig, Cabanillas, Jablonski, Mendling, 2016). Knowledge can be generated from this information to improve the processes (Kamsu-Foguem, Rigal, Mauget, 2013, Nkambou, Fournier-Viger, Mephu-Nguifo, 2011, Potes-Ruiz, Kamsu-Foguem, Grabot, 2014). Generally, this knowledge is extracted after the process has finished. Nevertheless, interest in applying process mining to running process instances is increasing (Maggi, Francescomarino, Dumas, & Ghidini, 2014). One of the main issues in process mining is the predictive monitoring of business processes (de Leoni, van der Aalst, & Dees, 2016). The main value of predictive monitoring is that it provides the information needed to take proactive and corrective actions that improve process performance and mitigate risks in real time. Predictive monitoring of business processes generates predictive models that predict the business process indicators of a running process instance. Business process indicators are quantifiable metrics that can be measured directly from data generated within the process flow (del Río-Ortega, Resinas, Cabanillas, & Ruiz-Cortés, 2013). An improvement in the prediction of these indicators often also means savings in human and economic resources and prevents significant losses of turnover for companies. Predictive monitoring can also address practical problems faced by real companies. For instance, the push-to-front problem, detailed in Verbeek (2013) and covered in this work, consists of identifying those incidents that are not resolved by the service desk and are pushed to other support lines of the company.
Since predicting these process indicators can be interpreted as a classification or regression problem, machine learning algorithms can be used for this task (Francescomarino, Dumas, Maggi, Teinemaa, 2015, Maggi, Francescomarino, Dumas, Ghidini, 2014). Classification and regression are used to predict discrete and continuous target values, respectively. For instance, predicting an indicator such as the cycle time of a process instance can be regarded as a regression problem. By contrast, predicting the fulfillment of a given target (e.g., the process instance must complete in less than 4 h) or of a condition (e.g., whether a specific activity occurs in the process instance) can be interpreted as a classification problem.
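The distinction between the two problem types can be illustrated with a minimal Python sketch. The trace layout (a list of activity name and timestamp pairs), the activity names, and the helper names `cycle_time`, `completes_within` and `activity_occurs` are assumptions made for this illustration and are not taken from the original work.

```python
from datetime import datetime, timedelta

# Hypothetical trace of one process instance: (activity name, timestamp) pairs.
trace = [
    ("Register incident", datetime(2016, 3, 1, 9, 0)),
    ("Assign technician", datetime(2016, 3, 1, 9, 40)),
    ("Resolve incident", datetime(2016, 3, 1, 12, 30)),
]


def cycle_time(events):
    """Continuous indicator (regression target): time between first and last event."""
    timestamps = [timestamp for _, timestamp in events]
    return max(timestamps) - min(timestamps)


def completes_within(events, limit):
    """Discrete indicator (classification target): is the cycle time below the limit?"""
    return cycle_time(events) < limit


def activity_occurs(events, activity):
    """Discrete indicator (classification target): does the activity appear in the trace?"""
    return any(name == activity for name, _ in events)


regression_target = cycle_time(trace)                       # a duration
target_label = completes_within(trace, timedelta(hours=4))  # a boolean class
condition_label = activity_occurs(trace, "Escalate to second line")
```

For a completed trace these values serve as training targets; for a running instance they are the quantities a predictive model would have to estimate before the instance ends.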
Multiple machine learning approaches have been applied to predictive monitoring, such as decision trees (Maggi, Francescomarino, Dumas, & Ghidini, 2014), clustering methods (Francescomarino, Dumas, Maggi, & Teinemaa, 2015) or neural networks (Tax, Verenich, Rosa, & Dumas, 2016). Nevertheless, to the best of our knowledge, evolutionary algorithms (EAs) have not been applied to the prediction of process indicators. The use of an evolutionary algorithm may be justified for four reasons (Fogel, 1997): (a) it can handle continuous and discrete attributes and automatically discretizes the continuous features; (b) it also handles missing attribute values and noise; (c) it can build models that are easily interpreted by humans; and (d) it finds a subset of the features that are relevant to the classification without a separate feature selection step. In addition, EAs have shown the capacity to find suboptimal solutions when the search space is characterized by high dimensionality (Marquez-Chamorro, Asencio-Cortes, Divina, & Aguilar-Ruiz, 2014). In this case, the set of possible state conditions of a process, encoded in decision rules, determines the search space, which is of this high-dimensional kind. Some methods in the process mining area also use association or decision rules to improve the performance of processes (Karray, Chebel-Morello, Zerhouni, 2014, Wen, Zhong, Wang, 2015).
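As a rough illustration of properties (a), (c) and (d), a decision rule can be pictured as a conjunction of conditions over a subset of the attributes, with intervals for continuous attributes and value sets for discrete ones. The following Python sketch is only an assumption about what such a rule might look like: the `Rule` class, the attribute names, and the choice that a missing value fails a condition are all hypothetical and do not reproduce the encoding used in this work.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, List, Tuple, Union

Interval = Tuple[float, float]              # closed interval for a continuous attribute
Condition = Union[Interval, FrozenSet[str]]  # or a set of allowed discrete values


@dataclass
class Rule:
    """Hypothetical rule: IF all conditions hold THEN predict `prediction`."""
    conditions: Dict[str, Condition]
    prediction: Any

    def covers(self, example: Dict[str, Any]) -> bool:
        for attribute, condition in self.conditions.items():
            value = example.get(attribute)
            if value is None:
                # Assumption of this sketch: a missing value fails the condition.
                return False
            if isinstance(condition, frozenset):
                if value not in condition:
                    return False
            else:
                low, high = condition
                if not low <= value <= high:
                    return False
        return True


def predict(rules: List[Rule], example: Dict[str, Any], default: Any) -> Any:
    """Return the prediction of the first rule that covers the example."""
    for rule in rules:
        if rule.covers(example):
            return rule.prediction
    return default


# The rule tests only two attributes, so all other attributes are ignored.
rules = [
    Rule(
        conditions={
            "elapsed_min": (60.0, 240.0),
            "activity": frozenset({"Diagnose", "Escalate"}),
        },
        prediction="violation",
    )
]
example = {"activity": "Diagnose", "elapsed_min": 95.0, "priority": "high"}
result = predict(rules, example, default="ok")
```

Because each rule reads as a plain IF-THEN statement over named attributes, a rule set of this shape can be inspected directly by a process analyst.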
In this work, we have developed a general method based on an evolutionary rule learning approach for the prediction of business process indicators at run time. The resulting model consists of a set of decision rules that determine a prediction for an indicator of a running process instance. As encoded features, we employ a window of the events that precede the point in the process execution at which the prediction is made. This window of events considers attributes of a typical event log, such as activity names or timestamps, together with the data of each event. The evolutionary algorithm allows a combination of continuous and discrete values. An advantage of this approach is that the generated decision rules can be interpreted by users to gain further insight into the business processes. Furthermore, the method incorporates a new encoding based on event windows of different sizes, which extracts more information from event logs. Additionally, the method is accompanied by a full software stack that we have developed to support both the training and the prediction phases of our predictive monitoring approach. The training phase is supported by a ProM plugin that helps in the computation of process indicators and the preprocessing of the event log for the machine learning algorithm. The prediction phase is supported by a framework that integrates the run-time predictions obtained from the predictive models generated in the training phase with business process management systems such as Camunda (Camunda, 2016).
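The idea of encoding a window of previous events can be sketched as follows. This is a minimal Python illustration under stated assumptions: events are represented as dictionaries, the function name `encode_window`, the attribute names, and the padding of short prefixes with `None` are hypothetical choices and are not the encoding defined in this work.

```python
from typing import Any, Dict, List, Optional

Event = Dict[str, Any]


def encode_window(prefix: List[Event], window_size: int,
                  attributes: List[str]) -> Dict[str, Any]:
    """Flatten the last `window_size` events of a running instance into features.

    Position 1 is the oldest event in the window and position `window_size`
    the most recent one. Positions without an event are filled with None.
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    window = prefix[-window_size:]
    padding = window_size - len(window)
    features: Dict[str, Any] = {}
    for position in range(window_size):
        event: Optional[Event] = window[position - padding] if position >= padding else None
        for attribute in attributes:
            key = f"{attribute}_{position + 1}"
            features[key] = event.get(attribute) if event is not None else None
    return features


# Hypothetical prefix of a running instance (events seen so far).
prefix = [
    {"activity": "Register", "elapsed_min": 0.0, "priority": "high"},
    {"activity": "Assign", "elapsed_min": 40.0, "priority": "high"},
    {"activity": "Diagnose", "elapsed_min": 95.0, "priority": "high"},
]
features = encode_window(prefix, window_size=2, attributes=["activity", "elapsed_min"])
```

Varying `window_size` yields feature vectors of different lengths from the same prefix, mixing discrete values (activity names) with continuous ones (elapsed time).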
Our approach was extensively tested on two real-life event logs to assess the validity of the proposal. The datasets belong to the IT Department of the Health Services of Andalusia (Spain) and the BPI 2013 Challenge (Verbeek, 2013). To validate the proposal, we also include a comparison with a method from the literature, described in Breuker, Matzner, Delfmann, and Becker (2016), and with several machine learning approaches, under the same experimental conditions, in order to justify the use of the evolutionary algorithm.
The remainder of this paper is organized as follows. Section 2 introduces the main concepts referred to throughout the paper. Section 3 summarizes the related work in this area. Section 4 introduces our methodology. Section 5 presents the experimentation and the results obtained. Finally, Section 6 presents some conclusions and possible future work.
Section snippets
Background on predictive monitoring
The goal of predictive monitoring is to predict some aspect of the execution of a running process instance. To do so, it relies on the existence of an event log that contains the relevant information of the execution of a business process. An event log (L) is composed of a set of traces (T) that contain each event (E) that occurs in the different instances of a business process. Each execution of a process instance is reflected in a trace. Formally, we can express a trace Ti as a list of events
Related work
Several approaches for the predictive monitoring of business process indicators can be found in the literature. We have classified some of these methods according to the type of predicted outcome, such as time, risk indicators, SLA violation indicators and other indicators.
Time is one of the most valuable indicators during the execution of a business process. The following two works predict the expected time of the process. In Polato, Sperduti, Burattin, and de Leoni (2014), the authors present a
Predictive monitoring with evolutionary algorithms
This section describes our proposal for the prediction of business process indicators. In particular, our proposal is based on an evolutionary algorithm (EA) that uses the data of a window of events recorded in the event logs. This section is divided into three subsections. The first one introduces the predictive monitoring procedure. The calculation of indicators and the encoding used by the algorithm are detailed in Section 4.2. Finally, the evolutionary method and a brief
A software system for predictive monitoring
Two software systems were developed for our proposal: one preprocesses the event log during the training phase, whereas the other supports the prediction phase. In addition, generic machine learning software was used to create the prediction model. We decided not to include support for building the prediction model in the log preprocessor because there are dozens of machine learning frameworks that provide very
Experimentation
This section presents the experimental results obtained by the evolutionary approach. The aim of this experimentation is to analyze the effectiveness of our system with respect to other machine learning approaches. In particular, our evaluation focuses on the following research questions:
- RQ1. Does the proposed approach provide reliable results in terms of prediction with respect to other similar machine learning approaches?
- RQ2. Does the proposed encoding provide
Conclusions and future work
In this paper, an evolutionary decision rule-based system for the prediction of business process indicators is described. The encoding of this approach is based on the attributes of the events extracted from the event logs. The decision rules determine a prediction of a specified process indicator. Our system can predict instance-level indicators using both next-event and end-of-instance predictions. Unlike the prediction models obtained by other machine learning techniques, these generated
Acknowledgements
This work has received funding from the European Commission (FEDER), the Spanish and the Andalusian R+D+I programmes, BELI and COPAS, [grant numbers TIN2015-70560-R, P12TIC-1867] and Juan de la Cierva program [JCF 2015].
References (57)
- et al. (2015). A recommendation system for predicting risks across multiple business process instances. Decision Support Systems.
- et al. (2008). Classification tree analysis using target. Computational Statistics and Data Analysis.
- et al. (2013). Mining association rules for the quality improvement of the production process. Expert Systems with Applications.
- et al. (2012). Real-time business process monitoring method for prediction of abnormal termination using KNNI-based LOF prediction. Expert Systems with Applications.
- et al. (2014). PETRA: Process evolution using a trace-based system on a maintenance platform. Knowledge-Based Systems.
- et al. (2016). Process mining with token carried data. Information Sciences.
- et al. (2016). Data-driven root cause diagnosis of faults in process industries. Chemometrics and Intelligent Laboratory Systems.
- et al. (2014). Predictive monitoring of business processes. Advanced information systems engineering.
- et al. (2015). Workload and delay analysis in manufacturing process using process mining. Asia Pacific business process management - third Asia Pacific conference, AP-BPM 2015, Busan, South Korea, June 24-26.
- et al. (2014). Data-aware remaining time prediction of business process instances. International joint conference on neural networks (IJCNN14).
- C4.5: Programs for machine learning.
- Visual PPINOT: A graphical notation for process performance indicators. Business & Information Systems Engineering.
- A framework for efficiently mining the organisational perspective of business processes. Decision Support Systems.
- An unsupervised learning approach to resolving the data imbalanced issue in supervised learning problems in functional genomics. 5th international conference on hybrid intelligent systems (HIS05), Rio de Janeiro (Brazil, 2005).
- Evolutionary learning of hierarchical decision rules. IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics.
- KEEL data-mining software tool: Data set repository, integration of algorithms and experimental analysis framework. Journal of Multiple-Valued Logic and Soft Computing.
- A study of the behavior of several methods for balancing machine learning training data. SIGKDD Explorations.
- A survey of predictive modeling on imbalanced domains. ACM Computing Surveys.
- Discovering process models with genetic algorithms using sampling. Proceedings of the 14th international conference on knowledge-based and intelligent information and engineering systems: Part I.
- Comprehensible predictive models for business processes. MIS Quarterly.
- Multivariable functional interpolation and adaptive networks. Complex Systems.
- Predictive task monitoring for business processes. Business process management.
- SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research.
- Supporting risk-informed decisions during business process execution. Advanced information systems engineering.
- Using real-valued genetic algorithms to evolve rule sets for classification. 1st IEEE conference on evolutionary computation. Orlando, USA.
- Support vector networks. Machine Learning.
- The ProM framework: A new era in process mining tool support. Applications and theory of petri nets.