XES 2.0 Workshop and Survey

Frontmatter

Open Access

Rethinking the Input for Process Mining: Insights from the XES Survey and Workshop

Abstract

Although the popularity and adoption of process mining techniques grew rapidly in recent years, a large portion of effort invested in process mining initiatives is still consumed by event data extraction and transformation rather than process analysis. The IEEE Task Force on Process Mining conducted a study focused on the challenges faced during event data preparation (from source data to event log). This paper presents findings from the online survey with 289 participants spanning the roles of practitioners, researchers, software vendors, and end-users. These findings were presented at the XES 2.0 workshop co-located with the 3rd International Conference on Process Mining. The workshop also hosted presentations from various stakeholder groups and a discussion panel on the future of XES and the input needed for process mining. This paper summarises the main findings of both the survey and the workshop. These outcomes help us to accelerate and improve the standardisation process, hopefully leading to a new standard widely adopted by both academia and industry.

Moe Thandar Wynn, Julian Lebherz, Wil M. P. van der Aalst, Rafael Accorsi, Claudio Di Ciccio, Lakmali Jayarathna, H. M. W. Verbeek

EdbA 2021: 2nd International Workshop on Event Data and Behavioral Analytics

Frontmatter

Open Access

Probability Estimation of Uncertain Process Trace Realizations

Abstract

Process mining is a scientific discipline that analyzes event data, often collected in databases called event logs. Recently, uncertain event logs have become of interest, which contain non-deterministic and stochastic event attributes that may represent many possible real-life scenarios. In this paper, we present a method to reliably estimate the probability of each of such scenarios, allowing their analysis. Experiments show that the probabilities calculated with our method closely match the true chances of occurrence of specific outcomes, enabling more trustworthy analyses on uncertain data.

Marco Pegoraro, Bianka Bakullari, Merih Seran Uysal, Wil M. P. van der Aalst

Open Access

Visualizing Trace Variants from Partially Ordered Event Data

Abstract

Executing operational processes generates event data, which contain information on the executed process activities. Process mining techniques make it possible to systematically analyze event data to gain insights that are then used to optimize processes. Visual analytics for event data are essential for the application of process mining. Visualizing unique process executions (also called trace variants, i.e., unique sequences of executed process activities) is a common technique implemented in many scientific and industrial process mining applications. Most existing visualizations assume a total order on the executed process activities, i.e., these techniques assume that process activities are atomic and were executed at a specific point in time. In reality, however, the executions of activities are not atomic. Multiple timestamps are recorded for an executed process activity, e.g., a start-timestamp and a complete-timestamp. Therefore, the execution of process activities may overlap and, thus, cannot be represented as a total order if more than one timestamp is to be considered. In this paper, we present a visualization approach for trace variants that incorporates start- and complete-timestamps of activities.

Daniel Schuster, Lukas Schade, Sebastiaan J. van Zelst, Wil M. P. van der Aalst

Open Access

Analyzing Multi-level BOM-Structured Event Data

Abstract

With the advent of Industry 4.0, increasing amounts of data on operational processes (e.g., manufacturing processes) are becoming available. These processes can involve hundreds of different materials for a relatively small number of manufactured special-purpose machines, rendering classical process discovery and analysis techniques infeasible. However, in contrast to most standard business processes, additional structural information is often available, for example, Bills of Materials (BOMs), listing the required materials, or Multi-level Manufacturing Bills of Materials (M²BOMs), which additionally show the material composition. This work investigates how structural information given by M²BOMs can be integrated into a top-down operational process analysis framework to improve special-purpose machine manufacturing processes. The approach is evaluated on industrial-scale printer assembly data provided by Heidelberger Druckmaschinen AG.

Tobias Brockhoff, Merih Seran Uysal, Isabelle Terrier, Heiko Göhner, Wil M. P. van der Aalst

Open Access

Linac: A Smart Environment Simulator of Human Activities

Abstract

The identification and construction of datasets of human activities is an extremely time-consuming and resource-intensive task, yet researchers cannot do without such datasets. The publicly available datasets may not reflect all the researchers’ requirements and are not scrupulously documented. In addition, these datasets cover only a limited and predefined set of behaviors. To address these challenges, we developed an instrument that simulates the behavior of agents interacting with an environment. The environment is a customized configuration, equipped with sensors. The simulation generates as output a stream of events stemming from activated sensors. In addition, the agents’ behavior is not fully deterministic, so as to reflect the dynamic nature of human beings and to be as realistic as possible.

Gemma Di Federico, Erik Ravn Nikolajsen, Mamuna Azam, Andrea Burattin

Open Access

Root Cause Analysis in Process Mining with Probabilistic Temporal Logic

Abstract

Process mining is a research domain that enables businesses to analyse and improve their processes by extracting insights from event logs. While determining the root causes of, for example, a negative case outcome can provide valuable insights for business users, only limited research has been conducted to uncover true causal relations within the process mining field. Therefore, this paper proposes AITIA-PM, a novel technique to measure cause-effect relations in event logs based on causality theory. The AITIA-PM algorithm employs probabilistic temporal logic to formally yet flexibly define hypotheses and then automatically tests them for causal relations from data. We demonstrate this by applying AITIA-PM on a real-life dataset. The case study shows that, after a well-thought-out hypotheses definition and information extraction, the AITIA-PM algorithm can be applied on rich event logs, expanding the possibilities of meaningful root cause analysis in a process mining context.

Greg Van Houdt, Benoît Depaire, Niels Martin

Open Access

xPM: A Framework for Process Mining with Exogenous Data

Abstract

Process mining facilitates analysis of business processes using event logs derived from historical records of process executions stored in organisations’ information systems. Most existing process mining techniques only consider data directly related to process execution (endogenous data). Data not directly representable as attributes of either events or traces (which includes exogenous data) are generally not considered. Exogenous data may be used by process participants in making decisions about execution paths. However, as exogenous data is not represented in event logs, its impact on such decision making is opaque and cannot currently be assessed by existing process mining techniques. This paper shows how exogenous data can be used in process mining, in particular discovery and enhancement techniques, to understand its influence on process decisions. Specifically, we focus on time series which represent periodic observations of, e.g., weather measurements, city health alerts or patient vital signs. We show that exogenous time series can be aligned and transformed into new attributes to annotate events in an event log. Then, we use these attributes to discover preconditions in a Petri net with exogenous data (xDPN), thus revealing the exogenous data’s influence on the process. Using our framework and a real-life data set from the medical domain, we evaluate the influence of exogenous data on decision points that are non-deterministic in an xDPN.

Adam Banham, Sander J. J. Leemans, Moe T. Wynn, Robert Andrews

Open Access

A Bridging Model for Process Mining and IoT

Abstract

Contextualisation is an important challenge in process mining. While Internet of Things (IoT) devices are collecting more and more data on the physical context in which business processes are executed, the IoT and process mining fields are still considerably disintegrated. Important concepts, such as event or context, are not understood in the same way, which causes confusion and hinders cooperation between the two domains. Based on IoT ontologies and business process context models, this paper proposes a model to bridge the conceptualisation gap between the IoT and the process mining fields. The model defines the necessary concepts and relationships to build process mining techniques that take the physical context into account. As a first validation, the model is used to describe a lifelike process example, showing how IoT data and process events are related. Using this conceptualisation, both practitioners and researchers from the IoT and the process mining communities can reason about the use of IoT data in process mining and find support for data understanding, event abstraction and IoT and process data integration.

Yannis Bertrand, Jochen De Weerdt, Estefanía Serral

ML4PM 2021: 2nd International Workshop in Leveraging Machine Learning for Process Mining

Frontmatter

Open Access

Exploiting Instance Graphs and Graph Neural Networks for Next Activity Prediction

Abstract

Nowadays, a lot of data regarding business process executions are maintained in event logs. The next activity prediction task exploits such event logs to predict how process executions will unfold up until their completion. The present paper proposes a new approach to address this task: instead of using traces to perform predictions, we propose to use the instance graphs derived from traces. To make the most out of such a representation, we train a message passing neural network, specifically a Deep Graph Convolutional Neural Network, to predict the next activity that will be performed in the process execution. The experiments show promising performance, hinting that exploiting information about parallelism among activities in a process can improve performance in highly parallel processes.

Andrea Chiorrini, Claudia Diamantini, Alex Mircoli, Domenico Potena

Open Access

Can Deep Neural Networks Learn Process Model Structure? An Assessment Framework and Analysis

Abstract

Predictive process monitoring concerns itself with the prediction of ongoing cases in (business) processes. Prediction tasks typically focus on remaining time, outcome, next event or full case suffix prediction. Various methods using machine and deep learning have been proposed for these tasks in recent years. Especially recurrent neural networks (RNNs) such as long short-term memory nets (LSTMs) have gained in popularity. However, no research focuses on whether such neural network-based models can truly learn the structure of underlying process models. For instance, can such neural networks effectively learn parallel behaviour or loops? Therefore, in this work, we propose an evaluation scheme complemented with new fitness, precision, and generalisation metrics, specifically tailored towards measuring the capacity of deep learning models to learn process model structure. We apply this framework to several process models with simple control-flow behaviour, on the task of next-event prediction. Our results show that, even for such simplistic models, careful tuning of overfitting countermeasures is required to allow these models to learn process model structure.

Jari Peeperkorn, Seppe vanden Broucke, Jochen De Weerdt

Open Access

Remaining Time Prediction for Processes with Inter-case Dynamics

Abstract

Process mining techniques use event data to describe business processes, where the provided insights are used for predicting processes’ future states (Predictive Process Monitoring). Remaining Time Prediction of process instances is an important task in the field of Predictive Process Monitoring (PPM). Existing approaches have two key limitations in developing Remaining Time Prediction Models (RTM): (1) The features used for predictions lack process context, and the created models are black-boxes. (2) The process instances are considered to be in isolation, despite the fact that process states, e.g., the number of running instances, influence the remaining time of a single process instance. Recent approaches improve the quality of RTMs by utilizing process context related to batching-at-end inter-case dynamics in the process, e.g., using the time to batching as a feature. We propose an approach that decreases the previous approaches’ reliance on user knowledge for discovering fine-grained process behavior. Furthermore, we enrich our RTMs with the extracted features for multiple performance patterns (caused by inter-case dynamics), which increases the interpretability of models. We assess our proposed remaining time prediction method using two real-world event logs. Incorporating the created inter-case features into RTMs results in more accurate and interpretable predictions.

Mahsa Pourbafrani, Shreya Kar, Sebastian Kaiser, Wil M. P. van der Aalst

Open Access

Event Log Sampling for Predictive Monitoring

Abstract

Predictive process monitoring is a subfield of process mining that aims to estimate case or event features for running process instances. Such predictions are of significant interest to the process stakeholders. However, state-of-the-art methods for predictive monitoring require the training of complex machine learning models, which is often inefficient. This paper proposes an instance selection procedure that allows sampling training process instances for prediction models. We show that our sampling method allows for a significant increase of training speed for next activity prediction methods while maintaining reliable levels of prediction accuracy.

Mohammadreza Fani Sani, Mozhgan Vazifehdoostirani, Gyunam Park, Marco Pegoraro, Sebastiaan J. van Zelst, Wil M. P. van der Aalst

Open Access

Active Anomaly Detection for Key Item Selection in Process Auditing

Abstract

Process mining allows auditors to retrieve crucial information about transactions by analysing the process data of a client. We propose an approach that supports the identification of unusual or unexpected transactions, also referred to as exceptions. These exceptions can be selected by auditors as “key items”, meaning the auditors want to look further into the underlying documentation of the transaction. The approach encodes the traces, assigns an anomaly score to each trace, and uses the domain knowledge of auditors to update the assigned anomaly scores through active anomaly detection. The approach is evaluated with three groups of auditors over three cycles. The results of the evaluation indicate that the approach has the potential to support the decision-making process of auditors. Although auditors still need to make a manual selection of key items, they are able to better substantiate this selection. As such, our research can be seen as a step forward with respect to the usage of anomaly detection and data analysis in process auditing.

Ruben Post, Iris Beerepoot, Xixi Lu, Stijn Kas, Sebastiaan Wiewel, Angelique Koopman, Hajo Reijers

Open Access

Prescriptive Process Monitoring Under Resource Constraints: A Causal Inference Approach

Abstract

Prescriptive process monitoring is a family of techniques to optimize the performance of a business process by triggering interventions at runtime. Existing prescriptive process monitoring techniques assume that the number of interventions that may be triggered is unbounded. In practice, though, interventions consume resources with finite capacity. For example, in a loan origination process, an intervention may consist of preparing an alternative loan offer to increase the applicant’s chances of taking a loan. This intervention requires time from a credit officer. Thus, it is not possible to trigger this intervention in all cases. This paper proposes a prescriptive monitoring technique that triggers interventions to optimize a cost function under fixed resource constraints. The technique relies on predictive modeling to identify cases that are likely to lead to a negative outcome, in combination with causal inference to estimate the effect of an intervention on a case’s outcome. These estimates are used to allocate resources to interventions to maximize a cost function. A preliminary evaluation suggests that the approach produces a higher net gain than a purely predictive (non-causal) baseline.

Mahmoud Shoush, Marlon Dumas

Open Access

Quantifying Explainability in Outcome-Oriented Predictive Process Monitoring

Abstract

The growing interest in applying machine and deep learning algorithms in an Outcome-Oriented Predictive Process Monitoring (OOPPM) context has recently fuelled a shift to use models from the explainable artificial intelligence (XAI) paradigm, a field of study focused on creating explainability techniques on top of AI models in order to legitimize the predictions made. Nonetheless, most classification models are evaluated primarily on a performance level, whereas XAI requires striking a balance between simple models (e.g. linear regression) and models using complex inference structures (e.g. neural networks) with post-processing to calculate feature importance. In this paper, a comprehensive range of predictive models with varying intrinsic complexity is evaluated for explainability using model-agnostic quantitative evaluation metrics. To this end, explainability is designed as a symbiosis between interpretability and faithfulness, thereby making it possible to compare inherently created explanations (e.g. decision tree rules) with post-hoc explainability techniques (e.g. Shapley values) on top of AI models. Moreover, two improved versions of the logistic regression model, both capable of capturing non-linear interactions and of inherently generating their own explanations, are proposed in the OOPPM context. These models are benchmarked against two common state-of-the-art models with post-hoc explanation techniques in the explainability-performance space.

Alexander Stevens, Johannes De Smedt, Jari Peeperkorn

SA4PM 2021: 2nd International Workshop on Streaming Analytics for Process Mining

Frontmatter

Open Access

Online Prediction of Aggregated Retailer Consumer Behaviour

Abstract

Predicting the behaviour of consumers provides valuable information for retailers, such as the expected spend of a consumer or the total turnover of the retailer. The ability to make predictions on an individual level is useful, as it allows retailers to accurately perform targeted marketing. However, with the expected large number of consumers and their diverse behaviour, making accurate predictions on an individual consumer level is difficult. In this paper we present a framework that focuses on this trade-off in an online setting. By making predictions on a larger number of consumers at a time, we improve the predictive accuracy but at the cost of usefulness, as we can say less about the individual consumers. The framework is developed in an online setting, where we update the prediction model and make new predictions over time. We show the existence of the trade-off in an experimental evaluation on a real-world dataset consisting of 39 weeks of transaction data.

Yorick Spenrath, Marwan Hassani, Boudewijn F. van Dongen

Open Access

PErrCas: Process Error Cascade Mining in Trace Streams

Abstract

Efficient and quick detection of problems is an essential task in online process monitoring. Many anomaly detection approaches excel in finding local deviations. We propose a novel approach that tracks local deviations over multiple process instances and visualizes correlations of deviation points. PErrCas provides knowledge about current cascades of deviations to give process analysts a starting point for rational root-cause analysis if processes leave their in-control parameters. PErrCas monitors deviations online and maintains cascades of varying timespans. Hence, our approach avoids defining an observation window beforehand, which is a significant advantage because predefining expected cascade properties is impractical in exploratory scenarios.

Anna Wimbauer, Florian Richter, Thomas Seidl

Open Access

Continuous Performance Evaluation for Business Process Outcome Monitoring

Abstract

While a few approaches to online predictive monitoring have focused on concept drift model adaptation, none have considered in depth the issue of performance evaluation for online process outcome prediction. Without such a continuous evaluation, users may be unaware of the performance of predictive models, resulting in inaccurate and misleading predictions. This paper fills this gap by proposing a framework for evaluating online process outcome predictions, comprising two different evaluation methods. These methods are partly inspired by the literature on streaming classification with delayed labels and complement each other to provide a comprehensive evaluation of process monitoring techniques: one focuses on real-time performance evaluation, i.e., evaluating the performance of the most recent predictions, whereas the other focuses on progress-based evaluation, i.e., evaluating the ability of a model to output correct predictions at different prefix lengths. We present an evaluation involving three publicly available event logs, including a log characterised by concept drift.

Suhwan Lee, Marco Comuzzi, Xixi Lu

PQMI 2021: 6th International Workshop on Process Querying, Manipulation, and Intelligence

Frontmatter

Open Access

An Event Data Extraction Approach from SAP ERP for Process Mining

Abstract

The extraction, transformation, and loading of event logs from information systems is the first and the most expensive step in process mining. In particular, extracting event logs from popular ERP systems such as SAP poses major challenges, given the size and the structure of the data. Open-source support for ETL is scarce, while commercial process mining vendors maintain connectors to ERP systems supporting ETL of a limited number of business processes in an ad-hoc manner. In this paper, we propose an approach to facilitate event data extraction from SAP ERP systems. In the proposed approach, we store event data in the format of object-centric event logs that efficiently describe executions of business processes supported by ERP systems. To evaluate the feasibility of the proposed approach, we have developed a tool implementing it and conducted case studies with a real-life SAP ERP system.

Alessandro Berti, Gyunam Park, Majid Rafiei, Wil M. P. van der Aalst

Open Access

Towards a Natural Language Conversational Interface for Process Mining

Abstract

Despite all the recent advances in process mining, making it accessible to non-technical users remains a challenge. In order to democratize this technology and make process mining ubiquitous, we propose a conversational interface that allows non-technical professionals to retrieve relevant information about their processes and operations by simply asking questions in their own language. In this work, we propose a reference architecture to support a conversational, process mining oriented interface to existing process mining tools. We combine classic natural language processing techniques (such as entity recognition and semantic parsing) with an abstract logical representation for process mining queries. We also provide a compilation of real natural language questions (aiming to form a dataset of that sort) and an implementation of the architecture that interfaces to an existing commercial tool: Everflow. Last but not least, we analyze the performance of this implementation and point out directions for future work.

Luciana Barbieri, Edmundo Roberto Mauro Madeira, Kleber Stroeh, Wil M. P. van der Aalst

Open Access

On the Performance Analysis of the Adversarial System Variant Approximation Method to Quantify Process Model Generalization

Abstract

Process mining algorithms discover a process model from an event log. The resulting process model is supposed to describe all possible event sequences of the underlying system. Generalization is a process model quality dimension of interest. A generalization metric should quantify the extent to which a process model represents the observed event sequences contained in the event log and the unobserved event sequences of the system. Most of the available metrics in the literature cannot properly quantify the generalization of a process model. A recently published method called Adversarial System Variant Approximation leverages Generative Adversarial Networks to approximate the underlying event sequence distribution of a system from an event log. While this method demonstrated performance gains over existing methods in measuring the generalization of process models, its experimental evaluations have been performed under ideal conditions. This paper experimentally investigates the performance of Adversarial System Variant Approximation under non-ideal conditions such as biased and limited event logs. Moreover, experiments are performed to investigate the effect of the originally proposed sampling parameter value on the method's performance in measuring generalization. The results confirm the need to raise awareness about the working conditions of the Adversarial System Variant Approximation method and serve to initiate future research directions.

Julian Theis, Ilia Mokhtarian, Houshang Darabi

PODS4H 2021: 4th International Workshop on Process-Oriented Data Science for Healthcare

Frontmatter

Open Access

Verifying Guideline Compliance in Clinical Treatment Using Multi-perspective Conformance Checking: A Case Study

Abstract

Clinical guidelines support physicians in the evidence-based treatment of patients. The technical verification of guideline compliance is not trivial, since guideline knowledge is usually represented textually and none of the approaches to computer-interpretable guideline representation has yet been able to establish itself. Due to the procedural nature of treatment sequences, this case study examines the applicability of a guideline process model to real hospital data for verification of guideline compliance. For this purpose, the limitations and challenges in the transformation of clinical data into an event log and in the application of conformance checking to align the data with the guideline reference model are investigated. As a data set, we use treatment data of skin tumor patients from a cancer registry enriched by hospital information system data. The results show the difficulty of applying process mining to medically complex and heterogeneous data and the need for complex preprocessing. The variability of clinical processes makes the application of global conformance checking algorithms challenging. In addition, the work shows the semantic weakness of the alignments and the need for new semantically sensitive approaches.

Joscha Grüger, Tobias Geyer, Martin Kuhn, Stephan A. Braun, Ralph Bergmann

Open Access

Patient Discharge Classification Based on the Hospital Treatment Process

Abstract

Heart failure is one of the leading causes of hospitalization and rehospitalization in American hospitals, leading to high expenditures and increased medical risk for patients. The discharge location has a strong association with the risk of rehospitalization and mortality, which makes determining the most suitable discharge location for a patient a crucial task. So far, work regarding patient discharge classification is limited to the state of the patients at the end of the treatment, including statistical analysis and machine learning. However, the treatment process has not been considered yet. In this contribution, the methods of process outcome prediction are utilized to predict the discharge location for patients with heart failure by incorporating the patient’s department visits and measurements during the treatment process. This paper shows that, with the help of convolutional neural networks, an accuracy of 77% can be achieved for the hospital discharge classification of heart failure patients. The model has been trained and evaluated on the MIMIC-IV real-world dataset on hospitalizations in the US.

Jonas Cremerius, Maximilian König, Christian Warmuth, Mathias Weske

Open Access

Combining the Clinical and Operational Perspectives in Heterogeneous Treatment Effect Inference in Healthcare Processes

Abstract

Recent developments in causal machine learning open perspectives for new approaches that support decision-making in healthcare processes using causal models. In particular, Heterogeneous Treatment Effect (HTE) inference enables the estimation of causal treatment effects for individual cases, offering great potential in a process mining context. At the same time, HTE literature typically focuses on clinical outcome measures, disregarding process efficiency. This paper shows the potential of jointly considering the clinical and operational effects of treatments in the context of healthcare processes. Moreover, we present a simple pipeline that makes existing HTE machine learning techniques directly applicable to event logs. Besides these conceptual contributions, a proof-of-concept application starting from the publicly available sepsis event log is outlined, forming the basis for a critical reflection regarding HTE estimation in a process mining context.

Sam Verboven, Niels Martin

Open Access

Interactive Process Mining Applied in a Cardiology Outpatient Department

Abstract

Cardiology departments receive many outpatients from primary care services and it is necessary to differentiate which patients need special attention. One-stop clinics were deployed in a hospital in Salamanca (Spain) to triage such patients, separating those who needed further examination and those who were discharged.

Data (covering December 2018 to August 2020) was explored in an iterative process in which clinicians, process miners and technical staff at the hospital interacted in special interviews or Data Rodeos. Interactive Process Indicators (IPIs) were generated. During Data Rodeos, data quality problems arose and were tackled, input data was cleaned and preconditioned, and process activities were discovered and modelled.

The original assumption, that the iterative implementation of the IPI would allow clinicians and managers to gain a deeper understanding of the one-stop cardiology clinics process, was evaluated and validated by them. After each iteration, they found the IPI more useful and closer to the reality they see every day.

The final IPI was easy for the clinicians to interpret. In the end, many key indicators were extracted, but most importantly, clinicians had a comprehensive tool that they could use by themselves, without technical assistance, to extract and interpret different indicators at any time, providing a high-quality source of information to improve patient-centered daily medical care.

Juan José Lull, Adrián Cid-Menéndez, Gema Ibanez-Sanchez, Pedro Luis Sanchez, Jose Luis Bayo-Monton, Vicente Traver, Carlos Fernandez-Llatas

Open Access

Discovering Care Pathways for Multi-morbid Patients Using Event Graphs

Abstract

Patients suffering from multiple diseases (multi-morbid patients) often have complex clinical pathways. They are diagnosed and treated by different specialties and undergo other clinical actions related to various diagnoses. Coordination of care for these patients is often challenging, and it would be of great benefit to get better insight into how the clinical pathways develop in reality. Discovering these pathways using traditional process mining techniques and standard event logs may be difficult because the patient is involved in several highly independent clinical processes. Our objective is to explore the potential of analyzing these pathways using an event log representation reflecting the independent clinical processes. Our main research question is: How can we identify valuable insights by using a multi-entity event data representation for clinical pathways of multi-morbid patients? Our method was built on the idea of representing multiple entities in event logs as event graphs. The MIMIC-III dataset was used to evaluate the feasibility of this approach. Several clinical entities were identified and then mapped into an event graph. Finally, multi-entity directly-follows graphs were discovered by querying the event graph and visualizing them. Our results show that paths involving multiple entities include traditional process mining concepts not for one clinical process but for all involved processes. In addition, the relationship between activities of different clinical processes, which was not recognizable in traditional models, is visible in the event graph representation.

Milad Naeimaei Aali, Felix Mannhardt, Pieter Jelle Toussaint

TPSA 2021: 2nd International Workshop on Trust, Privacy, and Security in Process Analytics

Frontmatter

Open Access

Process Mining in Trusted Execution Environments: Towards Hardware Guarantees for Trust-Aware Inter-organizational Process Analysis

Abstract

Process mining techniques enable business process analysis on event logs extracted from information systems. Currently, industry applications and research in process mining predominantly analyze intra-organizational processes, which deal with the workflows within a single organization. However, analyzing inter-organizational processes across separate companies has the potential to generate further insights. Process analysts can use these insights for optimizations such as workflow improvements and process cost reductions. It is characteristic of inter-organizational process analysis that the insights cannot be uncovered by analyzing the event logs of a single organization in isolation. On the other hand, privacy and trust issues are a considerable obstacle to adopting inter-organizational process mining applications. The independent companies fear competitive disadvantages from letting third parties access their valuable process logs. This paper proposes a concept for inter-organizational process mining using trusted execution environments in a decentralized cloud. The hardware-based approach aims to technically prevent data leakage to unauthorized parties without the need for a trusted intermediary. The contributions of this paper are theoretical and identify future research challenges for implementing the concept.

Marcel Müller, Anthony Simonet-Boulogne, Souvik Sengupta, Oliver Beige

Open Access

Quantifying the Re-identification Risk in Published Process Models

Abstract

Event logs are the basis of process mining operations such as process discovery, conformance checking, and process optimization. Sensitive information may be obtained by adversaries when re-identifying individuals that relate to the traces of an event log. This re-identification risk depends on the assumed background information of an attacker. Multiple techniques have been proposed to quantify the re-identification risks for published event logs. However, in many scenarios there is no need to release the full event log; a discovered process model annotated with frequencies suffices. This raises the question of how to quantify the re-identification risk in published process models. We propose a method based on generating sample traces to quantify this risk for process trees annotated with frequencies. The method was applied to several real-life event logs and process trees discovered by Inductive Miner. Our results show that there can still be a significant re-identification risk when publishing a process tree; however, this risk is often lower than that for releasing the original event log.

Karim Maatouk, Felix Mannhardt

Open Access

Trustworthy Artificial Intelligence and Process Mining: Challenges and Opportunities

Abstract

The premise of this paper is that compliance with Trustworthy AI governance best practices and regulatory frameworks is an inherently fragmented process spanning diverse organizational units, external stakeholders, and systems of record, resulting in process uncertainties and compliance gaps that may expose organizations to reputational and regulatory risks. Moreover, there are complexities associated with meeting the specific dimensions of Trustworthy AI best practices such as data governance, conformance testing, quality assurance of AI model behaviors, transparency, accountability, and confidentiality requirements. These processes involve multiple steps, hand-offs, re-works, and human-in-the-loop oversight. In this paper, we demonstrate that process mining can provide a useful framework for gaining fact-based visibility into AI compliance process execution, surfacing compliance bottlenecks, and providing an automated approach to analyze, remediate and monitor uncertainty in AI regulatory compliance processes.

Andrew Pery, Majid Rafiei, Michael Simon, Wil M. P. van der Aalst
