1 Introduction
2 Running Example
3 The Family of Predictive Process Monitoring Approaches
- type of prediction (i.e., the type of predictions provided as output);
- type of adopted approach and technique;
- type of information exploited in order to get predictions (i.e., the type of information taken as input).

- predictions related to predefined categorical or boolean outcome values (outcome-based predictions);
- predictions related to measures of interest taking numeric or continuous values (numeric value predictions);
- predictions related to sequences of future activities and related data payloads (next event predictions).

- approaches relying on an explicit model (model-based approaches), e.g., annotated transition systems. The explicit model is either discovered from the event log and then enriched with the information the log contains or, if an explicit model is already available, enriched directly. In model-based approaches, the model leveraged at runtime to get predictions is an (enriched) model in which the process control flow is made explicit (see the blue box in the middle on the right of Fig. 3).
- approaches leveraging machine learning and statistical techniques, e.g., classification and regression models, as well as neural networks. These approaches rely only on (implicit) predictive models built by encoding event log information as features to be used as input for machine/deep learning techniques (see the blue box at the top on the right of Fig. 3).

- information related to the control flow, i.e., the sequence of events. As depicted in the fourth row of Fig. 4, in the example of John’s history this is the information related to the activities carried out by John (e.g., check in to the hospital, go to the radiology department, ...).
- information related to the structured data payload associated with the events. This information usually includes the timestamp of the events, but it can also include other types of data attributes. For instance, in John’s history, besides the timestamp associated with each event, the data payload of the event Visit patient also includes the doctor who has visited John, i.e., Alice (see the third row in Fig. 4).
- information related to unstructured (textual) content, which can be available together with the event log. Indeed, it often happens that, together with the structured information related to the events and data payload, some unstructured information is also available. In John’s example, for instance, the text of Alice’s medical report is available together with the event Visit patient (see the second row in Fig. 4) and could provide useful information on what John is going to do later on.
- information related to process context, such as workload or resource availability. In John’s example, this kind of information could concern, for instance, the availability of free ultrasound scan machines (first row in Fig. 4). Contextual information could provide useful information on what John is going to do later on and when. For example, the time John needs to undergo an ultrasound could depend on the immediate availability of scan equipment.
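
The four kinds of input listed above can be pictured as the fields of one record per event. The following is only a sketch: the class name `Event`, its field names, the timestamp value and the context attribute `free_ultrasound_machines` are hypothetical placeholders, while the activity Visit patient and the doctor Alice come from John’s example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    """One event of a trace, with the four kinds of input information."""
    activity: str                                  # control flow
    timestamp: str                                 # structured payload: timestamp
    payload: dict = field(default_factory=dict)    # other structured attributes
    text: Optional[str] = None                     # unstructured (textual) content
    context: dict = field(default_factory=dict)    # process context


# Hypothetical values, except the activity name and the doctor.
visit = Event(
    activity="Visit patient",
    timestamp="08:00",
    payload={"doctor": "Alice"},
    text="(text of the medical report)",
    context={"free_ultrasound_machines": 1},
)


def control_flow(trace):
    """Project a trace onto its control flow: the sequence of activity names."""
    return [event.activity for event in trace]


print(control_flow([visit]))
```

An approach that exploits only the control flow would work on the output of `control_flow`, whereas richer approaches would also read the other fields.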
3.1 Predictive Process Monitoring Approaches
Visit patient is executed at time 08:00, the activity Compute rate at time 10:00 and so on. Figure 7 shows the transition system computed using the name of the activity as event representation abstraction and the activity set as state representation.

Compute rate (CR) {12:00}, Visit patient (VP) {13:00}). Two measurements are associated with the corresponding state of the transition system in Fig. 7 (see the state in light green), i.e., 6 and 2 hours. Considering the average as prediction function, the average value of the measurements (4 hours) can be used to compute the predicted completion time: added to the time of the last observed event (13:00), it yields the prediction that the patient will complete his process at 17:00.
4 Predicting Outcomes
4.1 Typical Data Encodings
Visit patient is the first event of sequence \(\sigma _1\). Its data payload “\(\{\)33, radiology\(\}\)” corresponds to the data associated with the attributes age and department. Note that the value of age is static: it is the same for all the events in a case, while the value of department is different for every event. In the payload of an event, the entire set of attributes available in the log is considered. If the value of a specific attribute is not available for some event, the value \(\bot \) (unknown) is specified for it.

Visit patient occurs two times in \(\sigma _i\) and Get Payment occurs four times in \(\sigma _k\).
4.2 Mostly Used Approaches: Classification-Based Approaches
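
Counting how often each activity occurs in a trace, as in the example closing Sect. 4.1, gives one fixed-length feature vector per trace, which is the kind of input a classifier expects. The sketch below illustrates this under stated assumptions: the function name `frequency_encode`, the activity alphabet and the other events of the two traces are hypothetical; only the counts for Visit patient (two) and Get Payment (four) follow the text.

```python
from collections import Counter


def frequency_encode(trace, alphabet):
    """Return one count per activity of the alphabet, in alphabet order."""
    counts = Counter(trace)
    return [counts[activity] for activity in alphabet]


# Hypothetical alphabet and traces; only the two counts named in the
# text (Visit patient x2, Get Payment x4) are taken from the source.
alphabet = ["Visit patient", "Compute rate", "Get Payment"]
sigma_i = ["Visit patient", "Compute rate", "Visit patient"]
sigma_k = ["Compute rate"] + ["Get Payment"] * 4

vec_i = frequency_encode(sigma_i, alphabet)
vec_k = frequency_encode(sigma_k, alphabet)
print(vec_i)  # [2, 1, 0]
print(vec_k)  # [0, 1, 4]
```

The order of events is lost in this representation, which is the price paid for obtaining vectors of equal length from traces of different length.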
5 Predicting Numeric Values
5.1 Typical Data Encodings
5.2 Mostly Used Approaches: Regression-Based Approaches
6 Predicting Next Events
6.1 Typical Data Encodings
6.2 Mostly Used Approaches: LSTM-Based Approaches
7 New Trends in ML-Driven Operational Support
7.1 Intercase Predictions
trace | event_1 | event_m | simult. trace # | avg. duration | label
---|---|---|---|---|---
\(\sigma _1\) | Visit patient | Perform ultrasound | 10 | 6 | False
\(\sigma _k\) | Compute rate | Get Payment | 18 | 8 | True
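
The column "simult. trace #" in the table can be read as the number of cases running at the same time as the trace being encoded. A minimal sketch of how such a feature might be computed is given below; it assumes that each case is summarised by a start and an end time (here plain hours), and the function name, the case identifiers and all the values are hypothetical.

```python
def simultaneous_cases(intervals, t):
    """Count the cases whose half-open interval [start, end) contains time t."""
    return sum(1 for start, end in intervals.values() if start <= t < end)


# Hypothetical cases: identifier -> (start hour, end hour).
cases = {"c1": (8, 12), "c2": (9, 17), "c3": (13, 15)}

print(simultaneous_cases(cases, 10))  # 2 (c1 and c2 are running)
```

The resulting number would be appended to the encoding of the trace as one more feature, next to the intra-case features such as the event names.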
7.2 Explainable Predictions
Visit patient {20, clinic}, Perform X-Ray {20, radiology}, Perform ultrasound {20, radiology}), which we have observed up to event 3, the prediction of our predictive model is that the patient will recover soon. To understand whether we can trust the prediction, we need to understand why our predictive model has returned it. Figure 13 shows an example of a possible explanation returned by a prediction explainer such as LIME [37] or SHAP [27] applied to our specific Predictive Process Monitoring problem. The plot shows the impact of each feature (and related value) towards (in case of positive values) or against (in case of negative values) the fast recovery of the patient. In the example, the feature that has impacted most on the prediction of the fast recovery of the patient is her young age.
7.3 Predictions with A-Priori Knowledge
K at prediction time by guiding the Predictive Process Monitoring algorithm towards a solution that is compliant with the a-priori knowledge [14]. In [14], for instance, an approach using LSTM for predicting the next activities has been enriched with a mechanism able to take into account background knowledge K, expressed in terms of LTL formulae, in order to guide the LSTM algorithm to make predictions compliant with the a-priori knowledge. The LSTM approach keeps returning likely predictions on the suffix of the current ongoing trace (up to the last event \(\omega \)) until it finds a suffix that is compliant with K. In more detail, the LSTM network uses a beam search algorithm that considers, at each time step, the bw (beam width) most likely next events. Figure 14 shows the idea of the beam-search approach with \(bw=2\). \(\sigma ^m = \langle \)Take shuttle, Enter via door 3, Check in\(\rangle \) is the current ongoing trace at time step m. At time step \(m+1\), among the three possible next events we take the bw most likely next events (the green nodes in Fig. 14) and keep exploring those future paths. At time step \(m+2\), we again select the 2 most likely next events and keep exploring the next events of these sequences. Whenever we find a sequence that is not compliant with K, as at time step \(m+3\), we discard that path and keep on exploring bw compliant paths. We stop whenever we predict the last event \(\omega \) (see the circle with the thicker border) and the considered trace is still compliant with K.
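
The beam search described above can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation of [14]: a hand-written probability table (`MODEL`) stands in for the LSTM, the string `"END"` stands in for the last event \(\omega \), and K is reduced to a single hypothetical rule ("Perform X-Ray never occurs") that can be checked on partial traces. All function names and probabilities are made up for the example.

```python
import math


def beam_search(prefix, next_probs, compliant, bw=2, end="END", max_steps=10):
    """Return the predicted suffix of `prefix`, or None if none is found.

    next_probs(trace) -> {event: probability}; stands in for the LSTM.
    compliant(trace)  -> False if the trace violates the knowledge K.
    """
    beam = [(0.0, list(prefix))]  # (log-probability of the suffix, trace)
    for _ in range(max_steps):
        candidates = []
        for logp, trace in beam:
            for event, p in next_probs(trace).items():
                extended = trace + [event]
                if p > 0 and compliant(extended):  # discard paths violating K
                    candidates.append((logp + math.log(p), extended))
        if not candidates:
            return None
        # Keep only the bw most likely compliant paths.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:bw]
        # Stop as soon as a kept path has reached the last event.
        for _, trace in beam:
            if trace[-1] == end:
                return trace[len(prefix):]
    return None


# Hypothetical next-event probabilities, keyed by the last observed event.
MODEL = {
    "Check in": {"Visit patient": 0.6, "Perform X-Ray": 0.4},
    "Visit patient": {"Perform X-Ray": 0.7, "Perform ultrasound": 0.3},
    "Perform X-Ray": {"END": 1.0},
    "Perform ultrasound": {"END": 1.0},
}


def next_probs(trace):
    return MODEL.get(trace[-1], {})


def no_xray(trace):
    """Hypothetical knowledge K: the activity Perform X-Ray never occurs."""
    return "Perform X-Ray" not in trace


prefix = ["Take shuttle", "Enter via door 3", "Check in"]
suffix = beam_search(prefix, next_probs, no_xray, bw=2)
print(suffix)  # ['Visit patient', 'Perform ultrasound', 'END']
```

Filtering non-compliant extensions before truncating to the bw best candidates is what keeps bw compliant paths alive at each step. For rules that can only be judged on complete traces, the check would have to be moved to the point where `end` is predicted.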