2025 | Open Access | Book

Process Mining Workshops

ICPM 2024 International Workshops, Lyngby, Denmark, October 14–18, 2024, Revised Selected Papers

About this book

This book constitutes the revised accepted papers of several workshops which were held in conjunction with the 6th International Conference on Process Mining, ICPM 2024, held in Lyngby, Denmark, during October 2024.

The 56 revised full papers presented in this book were carefully reviewed and selected from 126 submissions.

The papers presented in this volume stem from the following workshops:

– 9th International Workshop on Process Querying, Manipulation, and Intelligence (PQMI)

– 3rd International Workshop on Education Meets Process Mining (EduPM)

– 3rd International Workshop on Collaboration Mining for Distributed Systems (CoMinDS)

– 5th International Workshop on Leveraging Machine Learning in Process Mining (ML4PM)

– 5th International Workshop on Event Data and Behavioral Analytics (EdbA)

– 7th International Workshop on Process-Oriented Data Science for Healthcare (PODS4H)

– 1st International Workshop on Empirical Research in Process Mining (ERPM)

– 1st International Workshop on Generative Artificial Intelligence for Process Mining (GenAI4PM)

– 4th International Workshop on Stream Management & Analytics for Process Mining (SMA4PM)

– 1st International Workshop on Process Mining for Sustainability (PM4S).

Table of Contents

Frontmatter

9th International Workshop on Process Querying, Manipulation, and Intelligence (PQMI 2024)

Frontmatter

Open Access

An LLM-Based Q&A Natural Language Interface to Process Mining

Process Mining has come a long way to meet the needs of organizations that must optimize their operations. However, its use is still driven by technical users who can interpret process maps, models, graphs and other types of analyses. Business users, on the other hand, frequently report being intimidated by Process Mining tools’ interfaces and not knowing “what to do next”. An alternative to address this issue is providing more fluid and friendly interfaces for non-technical users based on natural language querying. Recent advances in Large Language Models (LLMs) have expanded the horizon for such interfaces. In this work we propose a new strategy to combine LLM capabilities with a framework for a natural language question-and-answer interface to Process Mining, which combines the flexibility of the former with the scalability and precision of the latter. We expand upon previous works in the area to research the dimensions of flexibility, generalization, scalability and precision. Finally, we implement such an LLM-enhanced framework and test it against a real-life compilation of questions to compare the performance of LLM-based, non LLM-based and hybrid implementations and point to directions in this field of research.

Luciana Barbieri, Kleber Stroeh, Edmundo R. M. Madeira, Wil M. P. van der Aalst

Open Access

One Language to Rule Them All: Behavioural Querying of Process Data Using SQL

State-of-the-art solutions for process mining rely on proprietary, domain-specific languages to query data recorded during business process execution. To support common analysis tasks, these languages focus on the definition of queries for behavioural patterns. Yet, the use of domain-specific languages for process mining has drawbacks: they require specific user training, lead to a decoupling of the query models for (i) data extraction and transformation, and (ii) the actual analysis, and induce engineering overhead through the development of a dedicated query engine. In this work, we therefore explore the use of standard SQL for process mining tasks. In particular, we demonstrate that the SQL concepts for row pattern recognition as realised by the MATCH_RECOGNIZE clause are sufficient to capture queries for behavioural patterns as specified in the SIGNAL language by SAP Signavio as well as the Process Querying Language (PQL) by Celonis. Based on a discussion of the respective language features, we outline a translation of SIGNAL and PQL queries into standard SQL. This way, we provide the basis for the adoption of widely used, general purpose query engines for process mining tasks.

Jakob Brand, Timotheus Kampik, Cem Okulmus, Matthias Weidlich

Open Access

EVErPREP: Towards an Event Knowledge Graph Enhanced Workflow Model for Event Log Preparation

Event data preparation is a critical yet time-consuming phase in process mining projects, often slowed down by complex relational data models and a lack of domain knowledge. This paper presents EVErPREP, a novel workflow model that leverages Event Knowledge Graphs to enhance event data preparation for event logs. EVErPREP uses Semantic Web technologies to improve the exploration, extraction, and processing of event data, ultimately improving the quality and interpretability of event data and event logs. The approach is evaluated through a case study at Munich Airport’s Baggage Handling System, demonstrating its effectiveness in reducing complexity and improving explainability in event data preparation. By providing a more structured and semantically enriched foundation for process mining, EVErPREP increases the efficiency and effectiveness of process mining projects.

Peter Filipp, Rene Dorsch, Andreas Harth

Open Access

Representative Sampling in Process Mining: Two Novel Sampling Algorithms for Event Logs

Process mining allows the discovery of business processes from an event log. However, event logs are rapidly increasing in size and process mining algorithms struggle with the computational load when efficient processing is required. This calls for methods that decrease the event log size while still preserving the representativeness of the event log. This paper presents two new algorithms for sampling event logs. The first algorithm called RemainderPlus chooses traces from an event log above a threshold and subsequently selects traces with underrepresented Directly Follows Relations. The second sampling algorithm called AllBehavior selects samples that have a high intersection of Directly Follows Relations with the original event log. Usually, AllBehavior is complemented with RemainderPlus for a more accurate sample representation. They perform well for conformance checking and excel in certain scenarios for process discovery. Thus, both algorithms outperform existing sampling algorithms.

Frederik Fonger, Niclas Nebelung, Arvid Lepsien, Milda Aleknonytė-Resch, Agnes Koschmider
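
The abstract above relies on Directly Follows Relations (DFRs) as the yardstick for representativeness. The following is a minimal sketch of that underlying notion only; it is not the RemainderPlus or AllBehavior algorithm. It assumes a log is a list of traces, each a tuple of activity labels, and the function names `directly_follows` and `dfr_coverage` are hypothetical.

```python
def directly_follows(log):
    """Return the set of (a, b) pairs where b immediately follows a in some trace."""
    return {(a, b) for trace in log for a, b in zip(trace, trace[1:])}


def dfr_coverage(sample, log):
    """Fraction of the original log's directly-follows pairs that the sample also contains.

    This is an illustrative measure chosen for the sketch, not a metric
    taken from the paper.
    """
    original = directly_follows(log)
    if not original:
        return 1.0  # nothing to cover
    return len(directly_follows(sample) & original) / len(original)


log = [("a", "b", "c"), ("a", "c"), ("a", "b", "c")]
sample = [("a", "b", "c")]
print(dfr_coverage(sample, log))
```

Here the sample misses the pair `("a", "c")`, so it covers two of the three pairs observed in the full log.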

Open Access

Root Cause Analysis Using Rule Mining on Object-Centric Event Logs

In business processes, the behavior, evolution and interactions of objects influence the outcome of process instances, and thus the value that a business user may assign to them. For example, in an order-to-cash process, a complete and timely delivery of a package is desirable, but depends on what happens to other objects upstream, like production batches. Negative outcomes call for a Root Cause Analysis (RCA) on the process. While many approaches for RCA using process mining exist, none is native to object-centric frameworks and thus suitable for capturing dependencies across object types. This work presents a method for RCA that operates on object-centric event logs (OCELs). Given an OCEL, our method returns a set of association rules on the activity level. These rules associate descriptive patterns over the various object types occurring at events with patterns indicating the process outcome. The patterns are abstracted from the log with the help of a first-order logic based query engine. A case study confirmed that our method can identify problematic interactions across various object types in real-life business processes.

Benedikt Knopp, Mahsa Pourbafrani, Wil van der Aalst

Open Access

The Jensen-Shannon Distance for Stochastic Conformance Checking

A sub-field of process mining, conformance checking, quantifies how well the process behavior of a model represents the observed behavior recorded in a log. A stochastic-aware perspective that accounts for the probability of behavior in both model and log is necessary to support conformance checking. However, existing stochastic conformance checking measures are not comparable for a broad framework that includes log-to-log (L2L), log-to-model (L2M), and model-to-model (M2M) comparison settings. Therefore, we propose a stochastic conformance checking measure based on the Jensen-Shannon Distance (JSD), which interprets models and logs as probability distributions over traces. It can be applied to perform L2L, L2M, and M2M conformance, while the latter requires approximation. Notably, it is the only known stochastic conformance measure that is a metric. JSD has been implemented and is publicly available. Our quantitative evaluations show the feasibility of computing JSD over real-life event logs, and that it provides diagnostic results different from those of existing measures. Moreover, experiments in the M2M setting confirm that our measure can be approximated using unbiased sampling.

Tian Li, Sander J. J. Leemans, Artem Polyvyanyy
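
For the log-to-log setting described above, the idea of treating logs as probability distributions over traces can be sketched as follows. This is a minimal illustration, not the authors' published implementation. It assumes each log is a non-empty list of traces (tuples of activity labels), uses relative trace frequencies as probabilities, and uses base-2 logarithms so that the distance lies in [0, 1]. The function names are hypothetical.

```python
from collections import Counter
from math import log2, sqrt


def trace_distribution(log):
    """Map each distinct trace to its relative frequency in the log."""
    if not log:
        raise ValueError("log must contain at least one trace")
    counts = Counter(tuple(trace) for trace in log)
    total = sum(counts.values())
    return {trace: count / total for trace, count in counts.items()}


def jensen_shannon_distance(log_a, log_b):
    """Square root of the Jensen-Shannon divergence between two trace distributions."""
    p = trace_distribution(log_a)
    q = trace_distribution(log_b)
    divergence = 0.0
    for trace in set(p) | set(q):
        pa = p.get(trace, 0.0)
        qa = q.get(trace, 0.0)
        mixture = 0.5 * (pa + qa)
        # Terms with zero probability contribute nothing (0 * log 0 := 0).
        if pa > 0:
            divergence += 0.5 * pa * log2(pa / mixture)
        if qa > 0:
            divergence += 0.5 * qa * log2(qa / mixture)
    # Guard against tiny negative values caused by floating-point rounding.
    return sqrt(max(divergence, 0.0))


log_1 = [("a", "b"), ("a", "b"), ("a", "c")]
log_2 = [("a", "b"), ("a", "c"), ("a", "c")]
print(jensen_shannon_distance(log_1, log_2))
```

Identical logs yield a distance of 0, and logs that share no trace yield 1 under the base-2 convention assumed here.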

Open Access

A Dynamic Programming Approach for Alignments on Process Trees

A fundamental task in conformance checking is to compute optimal alignments between a given event log and a process model. In general, it is known that this unavoidably incurs high computational costs which, in turn, leads to poor scalability in practice. One angle to attack the complexity is to develop alignment algorithms that exploit particular syntactic restrictions of the underlying process models. In this article, we study alignments for process trees with unique labels. These models are the output of the Inductive Miner, a family of state-of-the-art process discovery algorithms also used by the leading process mining tools. Our main contribution is a novel algorithm that constructs optimal alignments for process trees with unique labels efficiently, i.e., in polynomial time. This is in contrast with general process trees where the problem is NP-complete and general workflow nets where the problem is PSPACE-hard. We give a proof-of-concept implementation of our algorithm in PM4Py and evaluate it on a collection of real-life event logs.

Christopher T. Schwanen, Wied Pakusa, Wil M. P. van der Aalst

3rd International Workshop on Education Meets Process Mining (EduPM 2024)

Frontmatter

Open Access

Constructive Alignment in Process Mining

Constructive alignment is a well-established concept in education that helps teachers design courses and modules. When we look at constructive alignment as engineers, we can interpret it as a set of axioms for setting up successful process mining projects. While in process mining we have several useful methodologies (e.g. PM2), they cover some, but not all aspects of constructive alignment. In this paper we translate the ideas from constructive alignment to terms in process mining. We exploit the key similarity between process mining and teaching, which is that there is someone who wants to learn something. From the analysis of PM2, we identify two main types of problems and discuss these in a bit more detail before concluding the paper with ideas for future work.

Mitchel Brunings, Dirk Fahland, Boudewijn van Dongen

Open Access

Understanding Student Behavior Using Active Window Tracking and Process Mining

This paper proposes a new way of collecting and processing event logs using Active Window Tracking (AWT) to investigate media multitasking (MMT) among students in higher education institutions in Indonesia. Students recorded their computer windows while doing assignments and midterms. Data from the students were preprocessed and structured into event logs. Correlation analysis indicated that MMT has no direct correlation with performance. The PM results revealed that students engaging in MMT frequently switch between assignments, social media, and multimedia. High-scoring students focused more on assignment-related activities, while low-scoring students started late, multitasked extensively, and submitted their work close to the deadline. While these results indicate that MMT does not directly affect the student’s performance for the type of assignment, MMT extends work duration. Students tend to work closer to the deadline, so they often work very late into the night, negatively impacting their well-being. Recommendations are provided to mitigate these issues.

E. R. Mahendrawathi, Wouter van der Waal, Iris Beerepoot, M. Aqmal R. R. Putra, Hardhika Propitadewa

Open Access

Measuring Skill Acquisition and Retention: A Case Study of Math Fluency

This study examines the application of process-oriented techniques to analyse learning stages in primary education, specifically developing math fact fluency. Utilizing data from an arithmetic practice platform used in primary schools, this research addresses three primary questions: the amount of practice time required to master arithmetic operations, the learning characteristics influencing this duration, and the impact of different learning characteristics on skill retention. The analysis reveals significant variability in the time and effort needed for a pupil to master a skill, influenced by initial skill levels and whether students prioritise accuracy or speed. Students leaning towards accuracy tend to achieve steady progress, attaining higher accuracy before enhancing their speed, whereas those leaning towards speed may reach mastery more quickly but risk inconsistencies in accuracy resulting in a lower skill retention. The findings highlight the effectiveness of process-oriented methodologies in education, in providing more nuanced insights into student learning phases. The case study underscores the necessity for adaptive learning platforms and personalised educational strategies that accommodate diverse learning behaviours and needs. It furthermore highlights that gamification tactics should facilitate these diverse learning behaviours, rather than counteract them.

Gert Janssenswillen, Seppe Van Daele, Marc Van Daele

Open Access

Assessing the Impact of Exam Preparation Process on Students’ Careers

Educational Process Mining techniques leverage educational data to gather relevant insights on the corresponding processes, ultimately supporting the development of evidence-based strategies for their improvement. In this work, we analyze students’ exam preparation process to i) uncover process patterns describing students’ behaviors and ii) develop predictive models capable of predicting students’ performance regarding graduation times. The results of the analysis can be employed both to formulate improvements to the study curricula and to enable the early detection of students who are likely to struggle in their career, to support them at an early stage of their studies.

Domenico Potena, Laura Genga, Lorenzo Galeazzi, Gianmarco Vigano, Claudia Diamantini

Open Access

Evaluation of Study Plans Using Partial Orders

In higher education, data are collected that indicate the term(s) in which a course is taken and when it is passed. Often, study plans propose a suggested course order to students. Study planners can adjust these based on detected deviations between the proposed and actual order of the courses being taken. In this work, we detect deviations by combining (1) the deviation between the proposed and actual course order with (2) the temporal difference between the expected and actual course-taking term(s). Partially ordered alignments identify the deviations between the proposed and actual order. We compute a partial order alignment by modeling a study plan as a process model and a student’s course-taking behavior as a partial order. Using partial orders in such use cases allows one to relax the constraints of strictly ordered traces. This makes our approach less sensitive to the order in which courses are offered. Further, when modeling course-taking behavior as partial orders, we propose distinguishing intended course-taking behavior from actual course-passing behavior of students by including either all terms in which a course is attempted or only the term in which a course is passed, respectively. This provides more perspectives when comparing the proposed and actual course-taking behavior. The proposed deviation measuring approach is evaluated on real-life data from RWTH Aachen University.

Christian Rennert, Mahsa Pourbafrani, Wil van der Aalst

3rd International Workshop on Collaboration Mining for Distributed Systems (CoMinDS 2024)

Frontmatter

Open Access

Towards Standardized Modeling of Collaboration Processes in Collaboration Process Discovery

Collaboration processes represent behavior of collaborating cases within multiple process orchestrations that interact via collaboration concepts such as organizations, agents, objects, and services. The heterogeneity of collaboration concepts and types such as message exchange and synchronous collaboration has led to different models targeted by collaboration process discovery (CPD) techniques, but a standard model class is lacking. In this paper, in order to reduce heterogeneity among model classes and to reveal similarities between CPD techniques, we prove that the synchronous collaboration type simulates message exchanges, but not vice versa. This constitutes a step towards a standard CPD model class that achieves comparability between CPD techniques, enables approach and property transfer, and is a condition for a standardized collaboration mining pipeline similar to process mining.

Janik-Vasily Benzin, Stefanie Rinderle-Ma

Open Access

Revealing One-to-Many Event Relationships in Event Knowledge Graphs

Object-centric process mining is recognized to overcome the limitations of traditional process mining by offering approaches for the analysis of processes with multiple case notions such as collaborations. Event knowledge graphs are an effective tool for gathering, manipulating, and visualizing event and entity relations. Current approaches focus on inferring correlations between events and objects and directly-follows relationships between events correlated to the same object. However, object-to-object relations may hide one-to-many relations between events essential for understanding the actual flow among processes. We propose an approach to reveal these one-to-many causal relationships in an event knowledge graph by defining when two events are causally related and extending the standard approach to event knowledge graph construction accordingly. We assess the approach using two case studies.

Alessio Giacché, Sara Pettinari, Lorenzo Rossi

5th International Workshop on Leveraging Machine Learning in Process Mining (ML4PM 2024)

Frontmatter

Open Access

On the Impact of Low-Quality Activity Labels in Predictive Process Monitoring

While event log data quality is recognized as a crucial concern in process mining, the impact of event log errors on different types of process mining tasks has remained largely unexplored. This paper aims to fill such a gap by analyzing how various errors affect analysis results. In particular, we aim to assess whether and to what extent different types of errors that impact the quality of activity labels affect the performance of predictive process monitoring models, considering the three main tasks of next activity, outcome, and remaining time prediction, using publicly available and simulated event logs. The results of the experiments are used to extract preliminary insights into the design of data preparation pipelines for predictive process monitoring.

Marco Comuzzi, Sungkyu Kim, Jonghyeon Ko, Musa Salamov, Cinzia Cappiello, Barbara Pernici

Open Access

Towards Accurate Predictions in ITSM: A Study on Transformer-Based Predictive Process Monitoring

The accurate prediction of service process performance, particularly in IT service management (ITSM), is critical for adhering to service-level agreements and avoiding associated penalties. However, existing predictive process monitoring solutions, predominantly based on recurrent neural networks, have been found to be inadequate in handling ITSM processes. Notably, the heterogeneity in process artifacts and environments impairs process predictions. This research proposes a novel transformer-based architecture to effectively handle IT service process event logs. The architecture integrates advanced positional encoding techniques, distinguishes between static and dynamic attributes, and is evaluated using multiple publicly available ITSM event logs. It demonstrates its potential to deliver more accurate remaining time predictions than LSTM models. This work provides experimental insights into the application of transformer architectures for predictive process monitoring, paving the way for enhanced efficiency in ITSM.

Marc C. Hennig

Open Access

Predictions in Predictive Process Monitoring with Previously Unseen Categorical Values

Predictive process monitoring (PPM) methods provide users with real-time predictions about ongoing process instances. Machine learning models used for such tasks do not account for data variability, such as the occurrence of previously unseen categorical feature values. Concept drift adaptation solutions are suggested in such scenarios. However, adapting to new feature values requires time and a sample size large enough to train a well-generalizing model. Still, users expect seamless communication during the timeframe between the first occurrence of a new value and the availability of an updated model. Dedicated solutions are needed since encoding techniques like one hot encoding cannot handle previously unseen values by default. In this work, we first introduce and discuss possible solutions from a business perspective, ranging from temporary shutdowns to dedicated manual and technical solutions for an uninterrupted continuation of predictive services. Next, we present five variants for one hot encoding to handle previously unseen categorical values. This is followed by a case study using six real-world event logs and two machine learning models, XGBoost and LSTM, to identify the variants that produce the most reliable remaining time predictions. The study also includes the evaluation of two baseline models as an alternative to the machine learning models. The results show that previously unseen categorical values can be handled on a technical level without severely affecting the remaining time prediction quality. However, future research is required to provide more practical recommendations.

Johannes Roider, Weixin Wang, Dario Zanca, Martin Matzner, Bjoern M. Eskofier
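
The abstract above notes that one hot encoding cannot handle previously unseen categorical values by default. The sketch below illustrates two generic fallback strategies: an all-zero vector and a dedicated "unknown" column. These are assumptions chosen for illustration and are not claimed to be among the five variants evaluated in the paper. The function names and the `unseen` parameter are hypothetical.

```python
def fit_vocabulary(values):
    """Assign a column index to each distinct training value, in first-seen order."""
    return {value: index for index, value in enumerate(dict.fromkeys(values))}


def one_hot(value, vocabulary, unseen="zeros"):
    """Encode a value; unseen values fall back to zeros or to an extra 'unknown' column."""
    if unseen not in ("zeros", "unknown_column"):
        raise ValueError("unseen must be 'zeros' or 'unknown_column'")
    # The 'unknown_column' strategy reserves one extra trailing column.
    size = len(vocabulary) + (1 if unseen == "unknown_column" else 0)
    vector = [0] * size
    if value in vocabulary:
        vector[vocabulary[value]] = 1
    elif unseen == "unknown_column":
        vector[-1] = 1
    return vector


vocabulary = fit_vocabulary(["Create Order", "Approve", "Create Order"])
print(one_hot("Approve", vocabulary))
print(one_hot("Escalate", vocabulary))
print(one_hot("Escalate", vocabulary, unseen="unknown_column"))
```

With the all-zero strategy the model input keeps its original width, whereas the extra column has to be present at training time as well, which is one reason the choice between such strategies is a design decision rather than a detail.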

Open Access

Differentially Private Event Logs with Case Attributes

Event logs capture the execution of processes, record activities and additional information. A trace represents a single instance of a process and includes a sequence of activity records and case attributes with additional information. Event logs may contain sensitive personal information that could harm an individual’s privacy if it is published without pre-processing. Differential privacy (DP) limits the disclosure of new information about any individual when publishing an event log beyond the publicly available background knowledge. Many privacy-preserving approaches to event log publishing ensure DP. Traditional methods focus on preserving the control flow but omit case attributes, limiting comprehensive process analysis based on these attributes. This work addresses this limitation by proposing a novel privacy-preserving event log publishing framework. Our approach ensures privacy for the control flow and case attributes, utilising synthetic tabular data generation approaches based on machine learning that guarantee DP. The framework allows for the use of various tabular data generation approaches. Experimental results with real-world event data demonstrate the framework’s feasibility and highlight the trade-off between data utility and the guaranteed levels of privacy.

Hannes Ueck, Robert Andrews, Moe T. Wynn, Sander J. J. Leemans

Open Access

CaLenDiR: Mitigating Case-Length Distortion in Deep-Learning-Based Predictive Process Monitoring

Predictive Process Monitoring (PPM) in Process Mining (PM) focuses on forecasting future aspects of ongoing business processes. Recent Deep Learning (DL) models excel at these tasks but suffer from case-length distortion, where longer cases dominate training and skew evaluation metrics. We propose the CaLenDiR (Case Length Distribution-Reflective) framework to address this, aligning DL training and evaluation with true case length distributions. CaLenDiR incorporates Uniform Case-Based Sampling (UCBS) and suffix-length-normalized loss functions for balanced training, along with case-based metrics for evaluation. Our experiments show that CaLenDiR enhances model robustness and provides new insights into the interaction between log characteristics and model behavior.

Brecht Wuyts, Seppe Vanden Broucke, Jochen De Weerdt

Open Access

CC-HIT: Creating Counterfactuals from High-Impact Transitions

Smooth process execution relies on high-quality insights extracted from event data. For instance, trace durations heavily affect performance and increase resource consumption. While many predictive systems aim to identify these inefficiencies, they often focus on individual process instances, missing the global perspective. It is essential to detect where delays occur globally and pinpoint specific activity transitions causing them. To address this, we propose CC-HIT (Creating Counterfactuals from High-Impact Transitions), which identifies temporal activity dependencies across the process. CC-HIT uses a modified game theoretic approach and counterfactual information to generate reference event logs to estimate the consequences of activity transitions. It highlights key activity transitions impacting process performance, offering actionable insights for optimization. Validation on the BPIC 2020 dataset demonstrates its effectiveness over baseline methods.

Zhicong Xian, Ludwig Zellner, Gabriel Marques Tavares, Thomas Seidl

Open Access

Multivariate Approaches for Process Model Forecasting

Recently, inspired by predictive process monitoring, the modeling and prediction of the entire process information system has been proposed as process model forecasting. By forecasting individual elements of a directly-follows graph, the future state of the system can be predicted. However, the current state-of-the-art principally employs univariate forecasting of directly-follows relationships (DFs). This univariate approach overlooks the process structure and possible relations between different elements within the process. This paper introduces a comprehensive deployment of multivariate time series models, more specifically a range of different machine learning and deep learning approaches, to forecast DFs. These are benchmarked on different event logs collected from real-life processes. Our extensive experiments reveal that the performance of these forecasting models varies significantly across different processes, highlighting the importance of model selection.

Yongbo Yu, Jari Peeperkorn, Johannes De Smedt, Jochen De Weerdt

Open Access

Enhancing Predictive Process Monitoring Using Semantic Information

Predictive Process Monitoring (PPM) leverages historical data to forecast information about ongoing business processes. Recent methods have utilized advanced deep learning and classical machine learning models. However, the role of semantic information that can be extracted from event logs has been underexplored, although such information has been demonstrated to have significant advantages for other process mining tasks, such as anomaly detection. Therefore, this paper proposes a novel mechanism that aims to exploit semantic information for PPM, particularly by extracting information regarding the status of business objects associated with process instances from event data. We evaluate this mechanism in outcome-oriented and next activity prediction tasks, using state-of-the-art large language models (LLMs) for semantic extraction. Our results show that integrating semantic information improves prediction performance across these tasks. This work demonstrates that utilizing semantic information in PPM has considerable potential, especially in combination with advanced language models.

Jiaxin Yuan, Daniela Grigori, Han van der Aa

5th International Workshop on Event Data and Behavioral Analytics (EdbA 2024)

Frontmatter

Open Access

A Classification of Data Quality Issues in Object-Centric Event Data

Process analysis is concerned with analyzing recorded process executions to validate, monitor, or improve the underlying processes according to business goals. In this context, the paradigm of object-centric event data (OCED) has recently emerged, which relates activity executions to multiple objects instead of a single, pre-determined case. Since OCED can integrate various process perspectives simultaneously, it represents real-life activities more accurately than traditional event logs. Being the input of Object-Centric Process Mining (OCPM), the quality of the data recorded in OCED logs directly influences the results of the process analysis. To ensure reliable outcomes, it is imperative to assess potential quality problems manifesting in the data. While frameworks for such an assessment are available for classic event data, equivalent approaches for assessing quality issues in OCED do not yet exist. This paper provides an analysis and classification of data quality issues in OCED, and compares them to the issues in traditional event data. Thus, this study is a first step in the systematic assessment and management of data quality in OCED.

Maike Basmer, Martin Kabierski, Kristina Sahling, Agnieszka Patecka, Saimir Bala, Jan Mendling

Open Access

Analyzing the Evolution of Boards in Collaborative Work Management Tools

Board-Based Collaborative Work Management Tools (BBTs) like Trello and Microsoft Planner are widespread today. Their use includes the management of projects, static information, or processes, which is achieved by assigning and moving cards through lists representing specific states, steps, or other classification criteria. BBTs are a flexible solution since boards, lists and cards can be changed by the user to adapt to new situations, e.g., changes in the processes or projects. However, understanding how a board is being used is challenging because what can be seen at a glance is a static snapshot of its current state. BBTs usually produce logs that capture all the activity that has taken place within the boards. In this paper, we leverage that data to mine BBT logs to understand how boards are used and evolve over time. Specifically, we introduce an approach that aims to detect structural changes in the boards, and visualize the evolution of the boards’ lists. We have analyzed 63 real-life BBT logs and tested the approach with three case studies.

Alfonso Bravo, Cristina Cabanillas, Joaquín Peña, Manuel Resinas

Open Access

Extending Process Intelligence with Quantity-Related Process Mining

Process mining uses data logged during the execution of processes to understand, analyse, and improve processes. Logistics process management and optimisation are highly relevant for the industry, as they are crucial to the business’ operations but not intrinsically value-adding. Despite the advantages of applying process mining to logistics processes, its full potential cannot yet be leveraged. Current process mining techniques assume that an event’s execution solely depends on its associated identifiable objects, their attributes, relationships, and previously executed events. However, in logistics processes, counts of items, which may not be uniquely identifiable, play a crucial role. For instance, a replenishment order is triggered when the stock level falls below a threshold, or a second shipment is dispatched if not all ordered items are available during the first shipment. This work proposes a framework that integrates the concept of a quantity state based on properties derived from common logistics processes. We introduce extensions to object-centric event logs and object-centric Petri nets that include such counts of items. We show the feasibility of detecting the quantity state from the proposed event log class and demonstrate its capability to convey quantity dependencies using a Python-based implementation.

Nina Graves, Tobias Brockhoff, István Koren, Wil M. P. van der Aalst
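
The replenishment example in the abstract above can be illustrated with a minimal sketch. It is not the authors' formalism or implementation: the `QuantityEvent` record, the `replay_quantity_state` function, and the single global threshold are hypothetical names and simplifications chosen for illustration only. The sketch replays a log of item-count changes and reports the positions at which a stock level first drops below the threshold.

```python
from dataclasses import dataclass


@dataclass
class QuantityEvent:
    """Hypothetical event record: an activity that changes an item count."""
    activity: str
    item: str
    delta: int  # positive = items added, negative = items removed


def replay_quantity_state(events, initial, threshold):
    """Replay events in order and track the item counts (the quantity state).

    Returns the final stock levels and a list of (position, item, level)
    tuples for every event at which a level crosses below the threshold.
    """
    stock = dict(initial)
    triggers = []
    for pos, ev in enumerate(events):
        before = stock.get(ev.item, 0)
        after = before + ev.delta
        stock[ev.item] = after
        # Report only the crossing, not every event while stock stays low.
        if before >= threshold and after < threshold:
            triggers.append((pos, ev.item, after))
    return stock, triggers


log = [
    QuantityEvent("pick", "bolt", -4),
    QuantityEvent("pick", "bolt", -3),
    QuantityEvent("pick", "bolt", -1),
    QuantityEvent("replenish", "bolt", 20),
]
final_stock, triggers = replay_quantity_state(log, {"bolt": 10}, threshold=5)
```

In this toy log the second pick is the point at which a replenishment order would be expected, which is the kind of dependency that cannot be derived from identifiable objects alone.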

Open Access

Ranking the Top-K Realizations of Stochastically Known Event Logs

Various kinds of uncertainty can occur in event logs, e.g., due to flawed recording, data quality issues, or the use of probabilistic models for activity recognition. Stochastically known event logs make these uncertainties transparent by encoding multiple possible realizations for events. However, the number of realizations encoded by a stochastically known log grows exponentially with its size, making exhaustive exploration infeasible even for moderately sized event logs. Thus, considering only the top-K most probable realizations has been proposed in the literature. In this paper, we implement an efficient algorithm to calculate a top-K realization ranking of an event log under event independence within O(Kn), where n is the number of uncertain events in the log. This algorithm is used to investigate the benefit of top-K rankings over top-1 interpretations of stochastically known event logs. Specifically, we analyze the usefulness of top-K rankings against different properties of the input data. We show that the benefit of a top-K ranking depends on the length of the input event log and the distribution of the event probabilities. The results highlight the potential of top-K rankings to enhance uncertainty-aware process mining techniques.

Arvid Lepsien, Marco Pegoraro, Frederik Fonger, Dominic Langhammer, Milda Aleknonytė-Resch, Agnes Koschmider
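
To make the notion of a top-K realization ranking concrete, here is a minimal sketch under the event-independence assumption named in the abstract above. It is a generic heap-based best-first search, not the algorithm of the paper, and it makes no claim to reach the stated O(Kn) bound. The function name `top_k_realizations` and the input layout (one dictionary of candidate labels and probabilities per uncertain event) are assumptions made for illustration.

```python
import heapq
from math import prod


def top_k_realizations(events, k):
    """Return up to k (labels, probability) pairs, most probable first.

    events: list of dicts, one per uncertain event, mapping each candidate
    activity label to its probability. Events are assumed independent, so a
    realization's probability is the product of its chosen probabilities.
    """
    # Sort each event's candidates by descending probability.
    ranked = [sorted(e.items(), key=lambda kv: -kv[1]) for e in events]

    def prob(idx):
        return prod(ranked[i][j][1] for i, j in enumerate(idx))

    start = (0,) * len(ranked)  # most probable candidate for every event
    heap = [(-prob(start), start)]
    seen = {start}
    result = []
    while heap and len(result) < k:
        neg_p, idx = heapq.heappop(heap)
        result.append(([ranked[i][j][0] for i, j in enumerate(idx)], -neg_p))
        # Successors: move exactly one event to its next-best candidate.
        for i in range(len(idx)):
            if idx[i] + 1 < len(ranked[i]):
                nxt = idx[:i] + (idx[i] + 1,) + idx[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-prob(nxt), nxt))
    return result


uncertain_log = [{"A": 0.7, "B": 0.3}, {"C": 0.6, "D": 0.4}]
ranking = top_k_realizations(uncertain_log, 3)
```

Because moving any event to a less probable candidate can never increase the product, the heap always yields the next most probable realization without enumerating the exponentially many alternatives.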

Open Access

Framework for Extracting Real-World Object-Centric Event Logs from Game Data

In recent years, process mining has shifted towards an object-centric perspective on processes, considering interacting sub-processes that operate on objects of different types. Since businesses are hesitant to publicly share such event data that may expose business internals, there is a lack of public real-world object-centric event logs. We propose a novel approach to use publicly accessible game data as a large-scale data source for object-centric event logs. The contributions of this paper include (1) a framework to extract object-centric event logs from real-time strategy game data, (2) an application of that framework for the game Age of Empires II, (3) a published Python library to automatically transform Age of Empires II gameplays into an object-centric event log, (4) a publicly accessible object-centric event log extracted from 325,398 gameplays, and (5) an evaluation of the published aoe2pm library. The evaluation shows that the extracted event logs can be used by state-of-the-art applications to generate relevant behavior insights that require an object-centric perspective. The size and attribute richness of the object-centric event log motivate future research questions in the field of object-centric process mining.

Lukas Liss, Nico Elbert, Christoph M. Flath, Wil M. P. van der Aalst

Open Access

Object-Centric Local Process Models

Process mining is a technology that helps understand, analyze, and improve processes. It has been present for around two decades, and although initially tailored for business processes, the spectrum of analyzed processes is nowadays ever growing. To support more complex and diverse processes, subdisciplines such as object-centric process mining and behavioral pattern mining have emerged. Behavioral patterns allow for analyzing parts of the process in isolation, while object-centric process mining enables combining different perspectives of the process. In this work, we introduce Object-Centric Local Process Models (OCLPMs). OCLPMs are behavioral patterns tailored to analyzing complex processes where no single case notion exists, and we leverage object-centric Petri nets to model them. Additionally, we present a discovery algorithm that starts from object-centric event logs, and implement the proposed approach in the open-source framework ProM. Finally, we demonstrate the applicability of OCLPMs in two case studies and evaluate the approach on various event logs.

Viki Peeva, Marvin Porsil, Wil M. P. van der Aalst

Open Access

Locally Optimized Process Tree Discovery

Business process optimization typically involves discovering models that are fit, precise, sound and simple. Process discovery algorithms automatically obtain these models from event logs, records of past process executions, enabling insights into the underlying process. However, event logs often contain incomplete and infrequent behaviour, which presents significant challenges for these algorithms. To address these issues, we propose a new process discovery technique called OptIMIIst, which guarantees soundness while handling both infrequent and incomplete behaviour and discovering locally optimal process trees. This technique, based on the Inductive Miner framework, operates in two steps. First, it creates candidate mining decisions for each process tree operator and then decides on the optimal decision through a local fitness and precision estimation. An experimental evaluation demonstrates that OptIMIIst produces high-quality process models and offers competitive fitness, precision, and simplicity compared to state-of-the-art techniques, while maintaining soundness.

Calvin Schröder, Jan Niklas van Detten, Sander J. J. Leemans

Open Access

A Framework for Advanced Case Notions in Object-Centric Process Mining

Real-life processes involve interacting business objects of different types. Object-centric event logs capture the execution of activities in such processes. An important step in the analysis of such logs is the identification of sets of objects which characterize an execution of the process, called a case. Given a case notion, visualizations can be constructed to display the relations between the executed activities and the involved business objects. Depending on the utilized case notion, these visualizations can quickly become excessively complex, impeding human analysis, or may oversimplify the underlying process, inducing flawed insights. To combat these issues, new case notions are needed to reduce complexity while representing relevant structures of the underlying business process correctly. In this paper, we propose continuous measures to quantify how correctly an object-centric case notion adheres to a given log and how complex the resulting visualizations are. These measures allow us to conceptualize the search for new object-centric case notions as a joint optimization problem among the two quality dimensions of correctness and simplicity. As a result, we can provide a new case notion that significantly reduces complexity in comparison to existing techniques, while preserving relevant object interactions. To evaluate our approach, we apply it to a range of real-life logs and find that major complexity reductions can be achieved without causing excessive correctness issues.

Jan Niklas van Detten, Pol Schumacher, Sander J. J. Leemans

7th International Workshop on Process-Oriented Data Science for Healthcare (PODS4H 2024)

Frontmatter

Open Access

Predicting Unplanned Hospital Readmissions Using Outcome-Oriented Predictive Process Mining

Many hospitals in the world are under pressure to improve their efficiency and effectiveness so that they can achieve better health outcomes with limited resources. One common measure of performance is the rate of unplanned hospital readmissions (UHRs) within 30 days. Emergency readmissions for the same disease can be assumed to indicate inappropriate discharge or poor planning, are costly, increase patients’ mortality risks and put additional pressure on bed capacity. Data Mining (DM) techniques have been used to predict UHRs based on clinical and demographic features, but these ignore the process perspective. Predictive Process Monitoring (PPM) is a process mining technique that uses completed traces to make predictions for in-progress cases with machine learning (ML) algorithms. Outcome-Oriented PPM (OOPPM) is a sub-technique of PPM that focuses on predicting categorical outcomes of a process. Adoption of OOPPM in healthcare settings has been limited to date. Here, we illustrate how to implement OOPPM in a healthcare context through an application of an OOPPM pipeline to hospital admissions using the open access MIMIC-IV dataset. Clinical, demographic and process features were used to build an extended event log, which was then employed for UHR prediction. Results show that prediction using OOPPM techniques outperformed traditional DM techniques. OOPPM tests using tree-based ML algorithms achieved better results than OOPPM tests using other ML algorithms. Our results suggest OOPPM can make a significant contribution to better understanding of hospital performance.

Abdulaziz Aljebreen, Allan Pang, Marc de Kamps, Owen Johnson
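
The idea of learning from completed traces in order to predict outcomes for in-progress cases can be sketched as follows. This is not the pipeline of the paper above: it shows only prefix extraction with a simple frequency encoding, one common encoding choice in outcome-oriented predictive process monitoring, and the function name `prefix_training_set`, the activity labels, and the binary outcome labels are hypothetical.

```python
from collections import Counter


def prefix_training_set(traces, max_len):
    """Turn completed, labelled traces into training examples.

    traces: list of (activities, outcome) pairs, where activities is the
    ordered list of activity labels of a completed case.
    Every prefix up to max_len events becomes one example, encoded as
    activity counts and labelled with the outcome of the full case.
    """
    features, labels = [], []
    for activities, outcome in traces:
        for n in range(1, min(len(activities), max_len) + 1):
            features.append(dict(Counter(activities[:n])))
            labels.append(outcome)
    return features, labels


completed = [
    (["admit", "lab", "discharge"], 1),  # 1 = readmitted (illustrative label)
    (["admit", "discharge"], 0),         # 0 = not readmitted
]
X, y = prefix_training_set(completed, max_len=2)
```

The resulting feature dictionaries could then be passed to any classifier; an in-progress case is encoded the same way from the events observed so far.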

Open Access

Structural and Semantic Enrichment of Models for the Interactive Discovery of Clinical Processes

Process Mining (PM) is a relatively new field which provides techniques to analyze business processes in different areas. In the field of Medicine, PM seeks to infer clinical processes from the data routinely collected during healthcare activities. In most frameworks, workflows are used to represent the results obtained by PM techniques. A problem with these workflows is that their structure is complex and not always easy to understand and hence to exploit by the clinician. A different problem, related to Clinical Practice Guidelines (CPGs), is that their development is mostly manual. We posit that workflows inferred by PM techniques could be improved and enriched by providing interactive tools to support their structuring and semantic annotation by clinicians. We also postulate that these improved and enriched models can be used to facilitate the development of CPGs. In this paper we describe our approach to the interactive discovery of clinical processes as well as an implementation to support it within the PMApp PM tool.

Jose Luis Bayo-Montón, Begoña Martínez-Salvador, Carlos Fernández-Llatas, Mar Marcos

Open Access

Research Paper: Enhancing Healthcare Decision-Making with Analogy-Based Reasoning

Analogy-based reasoning is often employed in the treatment of hospitalized patients, especially when clinical guidelines or robust evidence bases are unavailable. This approach is based on the assumption that similar patients respond similarly to comparable treatments. Traditionally, this reasoning has relied on the memory and experience of physicians. However, the complexity of managing patient data—such as treatment sequences and responses—presents significant challenges without technological support. In particular, the procedural perspective of comparing patients is especially demanding. To address these challenges, we introduce the MAPI framework, an innovative approach for analogy-based, process-oriented search within patient data. This framework systematically manages treatment data, defines precise similarity measures, and retrieves comparable patient cases using case-based reasoning (CBR). By integrating analogy-based reasoning, MAPI enhances decision-making and improves the explainability of treatment choices, offering a more reliable and transparent tool for clinical practice.

Joscha Grüger, Martin Kuhn, Karim Amri, Ralph Bergmann

Open Access

Analysing Disease Trajectories of Multimorbidity Through Process Mining Techniques: A Case Study

Multimorbidity is a global public health challenge, where an individual has two or more chronic conditions, making it difficult to treat and manage illnesses. Understanding the disease trajectories of multimorbidity is crucial for providing patient-centred care. Previous research has primarily employed regression-based approaches, which don’t consider the specific diseases involved and the order in which they occur. Process mining was recently proposed to address this gap, showing promising results in modelling disease trajectories across the entire spectrum of diseases. However, that study involved admissions to a single hospital, and hence the size of the dataset was much smaller than what is typically used in population-level studies on multimorbidity. In this paper, we present a case study where process mining techniques are applied to a much larger dataset of patients in Scotland. We present the disease trajectories discovered for the entire population, as well as stratified by sex. We also describe temporal patterns of disease trajectories, including trajectories with rapid progression. Finally, we discuss the experience of employing process mining within a trusted research environment, and we reflect on challenges that we faced when mining disease trajectories based on a large and complex dataset. Our main contribution involves providing additional evidence around the feasibility of disease trajectory modelling through process mining techniques, in particular when a much larger health dataset is involved.

Daniel Petrov, Thu Nguyen, Areti Manataki, Colin McCowan

Open Access

Predictive Insights for Personalising Esophagogastric Cancer Treatment Process - A Case Study

For metastatic esophagogastric cancer (EGC), treatments aim to extend survival time, manage symptoms, and enhance the quality of life. However, determining the best treatments for patients with EGC is challenging due to patient variability. Personalised treatments supported by predictive models enable tailoring the treatment process to individuals. Even so, traditional predictive models often neglect the interaction between treatments, limiting their utility in comprehensive planning. State-of-the-art Predictive Process Monitoring shows promising results in predicting the outcome of the treatment process but often lacks transparency. This paper investigates the potential of supporting healthcare experts in personalising the EGC treatment process, using eXplainable Predictive Process Monitoring methods. A real-world case study among 7,090 patients identifies expert needs for helpful explanations and discusses the capabilities and limitations of existing methods, suggesting future research directions. Our findings demonstrate high-quality explanations with strong fidelity, providing insights validated by expert knowledge. While the resulting explanations are not always actionable, experts acknowledged their value for exploratory analysis.

Mozhgan Vazifehdoostirani, Andrei Buliga, Laura Genga, Rob Verhoeven, Remco Dijkman

Open Access

Case Study: Insights on Prostate Cancer Treatment Pathways Using Process Discovery

In this case study, records about prostate cancer patients, provided by the Cancer Registry of Rhineland-Palatinate, Germany, are analyzed. The dataset is comparatively large and cases are rather complete, as they contain events gathered not only from one institution (e.g., a single hospital), but from multiple institutions along the end-to-end patient journey. The analysis, which aims at getting insights on prostate cancer treatment pathways and contributing to state-of-the-art research in the Process Mining for Healthcare (PM4H) field, is powered by methods and techniques from the process mining domain. Since this is a process mining project, the PM² method was followed with its recommended phases in collaboration with the Cancer Registry of Rhineland-Palatinate, Germany. The initial analysis of ~12k cases (~90k events) recorded during 2018–2022, considering only a small number of the potentially available data attributes, already led to barely comprehensible spaghetti models, emphasizing the need for different views of granularity and complexity. This case study also provides results on the regular treatment pathways (such as surgery, or therapies).

Jana Vormann, Jonas Blatt, Flavio Horbach, Nils Herm-Stapelberg, Lukas Mittnacht, Patrick Delfmann, Tobias Walter, Sven Pagel

1st International Workshop on Empirical Research in Process Mining (ERPM 2024)

Frontmatter

Open Access

A Taxonomy for Conformance Checking Visualizations

Conformance checking is a sub-discipline of process mining, which compares process execution data with predefined process models to identify deviations between them. Although recognized as the most important feature of process mining tools, conformance checking is currently not widely applied in practice. One reason for this lack of adoption is the absence of process-mining-specific visualizations, which can effectively communicate conformance checking results to practitioners. Although researchers have identified the need for such visualizations, they have left their development to the tool providers, such that available visualizations are highly different and difficult to compare. This inhibits the opportunities to conduct empirical research on conformance checking visualizations, which would be crucial to understanding user preferences. To address this issue and establish a foundation for future empirical research, this paper provides an overview of the existing breadth of characteristics of conformance checking visualizations in the form of a taxonomy. This taxonomy consists of six dimensions, which highlight in a structured manner what information is displayed in conformance checking visualizations and how this is visualized in different academic and commercial tools. Our research enhances the comprehension of visual analytics in process mining, particularly for conformance checking, and highlights promising avenues for future empirical research.

Marie-Christin Häge, Jana-Rebecca Rehse

Open Access

Structuring Empirical Research on Process Mining at the Individual Level Using the Theory of Effective Use

A growing number of empirical papers on the topic of process mining has been published in recent years. After a first wave of contributions on application scenarios, there has been a second wave aiming to establish theoretical insights into how process mining tools are used and how benefits unfold from this usage. Many of these papers follow an explorative, qualitative, or inductive approach. A weakness of these contributions is their theoretical cohesion and integration. This paper makes an effort to integrate them into a more holistic theory that can eventually provide a foundation for more deductive and quantitative empirical research on process mining. To this end, we build on the theory of effective use and focus on the individual effect on decision makers. We find opportunities for revision and refinement of this theory for process mining. Specifically, we discuss moving from constructs on learning to expertise, and integrating a pragmatic perspective that complements the semantic emphasis of representational fidelity.

Jan Mendling, Mieke Jans, Kristina Sahling

Open Access

Analysing and Improving Business Processes Through Hybrid Simulation Model: A Case Study

The increasing amount of process execution data, i.e., the event logs stored by the company, can be exploited using Business Process Simulation (BPS). BPS serves as a valuable tool for business analysts, enabling them to analyze and compare business processes and identify changes that optimize key performance measures. Especially when evaluating alternative scenarios, it is crucial to start with an accurate simulation of the current process. Recent research in the field of BPS has demonstrated that Hybrid Simulation Model (HSM) approaches reliably replicate business process behaviour, overcoming the unrealistic or oversimplified assumptions often found in traditional discrete event simulators. In this paper, we present a case study conducted in collaboration with EY, where we apply the HSM to a real-life business process log. This study demonstrates the benefits of the HSM for business process analysis and its potential to improve process performance.

Francesca Meneghello, Massimo Coletti, Debora Di Marco, Massimiliano Ronzani, Chiara Di Francescomarino, Chiara Ghidini

Open Access

Leveraging Process Mining on the Shop Floor: An Exploratory Study

This paper explores the potential and limitations of process mining on the shop floor in the manufacturing industry. Despite its increasing popularity, the application of process mining in manufacturing remains under-explored. Through a combination of systematic literature review and interviews with 22 industry experts, academicians, shop floor workers, and production managers, we identify key areas where process mining can be leveraged on the shop floor. Our findings can be grouped into five dimensions: organizational management & human factors, data management & quality, digitalization & technology advancements, process efficiency & optimization, and production & supply chain complexity. The findings offer a comprehensive understanding of how process mining can be leveraged to improve manufacturing processes while also addressing the organizational and technical hurdles that may impede its adoption. This study contributes to the emerging field of process science by combining findings from the literature and collecting voices on and around the shop floor. The paper closes by proposing future research and practice by incorporating organizational and human insights from the shop floor.

Felix Rothhagen, Felix Kerst, Eduard Kant Mandal, Candan Çetin, Carolin Ullrich

Open Access

Using Facial Expressions to Predict Process Mining Task Performance

Process mining analysis is a complex task that presents significant challenges to human analysts. To support analysts during this process, it is essential to identify difficulties as they occur. This study takes an initial step in this direction, by predicting the quality of task performance based on analysts’ facial expressions while they are engaged in a process mining task. Data were collected using participants’ webcams and the iMotions™ cloud application while they performed a process mining task. The data were then utilized to train and evaluate several machine learning classifiers, which classified participants based on the grade given to their task outcome. Our results show the high performance of these classifiers in predicting participants’ success based on facial expressions. We further showed that the chosen outcome classifier could accurately classify additional participants, demonstrating its generalizability. Notably, the classifier was able to predict participants’ success within a very short time frame. These findings could pave the way for developing a near-real-time support system to detect when analysts engaged in process mining may benefit from assistance.

Lital Shalev, Irit Hadar, Rotem Dror, Adir Solomon, Elizaveta Sorokina, Michal Weisman Raymond, Pnina Soffer

Open Access

Using Process Mining with Pre- and Post-intervention Analysis to Improve Digital Service Delivery: A Governmental Case Study

We present a case study of Process Mining (PM) for personnel security screening in the Canadian government. We consider customer (process time) and organizational (cost) perspectives. Furthermore, in contrast to most published case studies, we assess the full process improvement lifecycle: pre-intervention analyses pointed out initial bottlenecks, and post-intervention analyses identified the intervention impact and remaining areas for improvement. Using PM techniques, we identified frequent exceptional scenarios (e.g., applications requiring amendment), time-intensive loops (e.g., employees forgetting tasks), and resource allocation issues (e.g., involvement of non-security personnel). Subsequent process improvement interventions, implemented using a flexible low-code digital platform, reduced security briefing times from around 7 days to 46 h, and overall process time from around 31 days to 26 days, on average. From a cost perspective, the involvement of hiring managers and security screening officers was significantly reduced. These results demonstrate how PM can become part of a broader digital transformation framework to improve public service delivery.

Jacques Trottier, William Van Woensel, Xiaoyang Wang, Kavya Mallur, Najah El-Gharib, Daniel Amyot

Open Access

Towards an Ethogram of Exploratory Process Mining Behavior

Exploratory process mining aims to better understand event logs. However, this is not a clear-cut procedure and relies heavily on the analyst’s cognitive skills. Research has been conducted to better understand the analyst’s behavior, yet an overview of exhibited behaviors during exploratory process mining is lacking. Such an overview would not only facilitate the direct comparison of empirical findings but would also serve as a recording tool for such process mining behavior. Drawing inspiration from the field of (human) ethology, which studies behavior, this paper presents an ethogram of exploratory process mining behavior, i.e., a catalog of behaviors. Via a systematic analysis of published process mining case studies, we developed an ethogram, consisting of 26 distinct behaviors such as “Discover process model”, “Define questions”, and “Explore data”. This ethogram provides insights into analysts’ actions, contributing to a more comprehensive understanding of their role.

Jessica Van Suetendael, Benoît Depaire, Mieke Jans, Niels Martin

1st International Workshop on Generative Artificial Intelligence for Process Mining (GenAI4PM 2024)

Frontmatter

Open Access

Local Large Language Models for Business Process Modeling

Large language models (LLMs) are capable of efficiently understanding natural language by processing large volumes of text data. Natural language is also used in process descriptions, thus LLMs appear to be a suitable candidate to significantly improve business process modeling. Although plenty of third-party LLMs exist, they raise concerns about privacy disclosure, untrustworthiness, and limited generalizability of the results. This paper proposes a pipeline to use a local and fine-tuned LLM that expects a textual process description as input and finally generates a visual process tree representation. We instantiate our pipeline with Llama3 8B and fine-tune the LLM with a training set of 120 self-generated examples. Initial evaluation results of our LLM-based approach for automated business process modeling indicate the usefulness of the approach in terms of process model quality while preserving data privacy.

Kaan Apaydin, Yorck Zisgen

Open Access

PM-LLM-Benchmark: Evaluating Large Language Models on Process Mining Tasks

Large Language Models (LLMs) have the potential to semi-automate some process mining (PM) analyses. While commercial models are already adequate for many analytics tasks, the competitive level of open-source LLMs in PM tasks is unknown. In this paper, we propose PM-LLM-Benchmark, the first comprehensive benchmark for PM focusing on domain knowledge (process-mining-specific and process-specific) and on different implementation strategies. We also focus on the challenges in creating such a benchmark, related to the public availability of the data and to evaluation biases by the LLMs. Overall, we observe that most of the considered LLMs can perform some process mining tasks at a satisfactory level, but tiny models that would run on edge devices are still inadequate. We also conclude that while the proposed benchmark is useful for identifying LLMs that are adequate for process mining tasks, further research is needed to overcome the evaluation biases and perform a more thorough ranking of the “competitive” LLMs.

Alessandro Berti, Humam Kourani, Wil M. P. van der Aalst

Open Access

Terpsichora: A Tool to Generate Synthetic MP-Declare Process Models

Process models play a fundamental role in the Business Process Management lifecycle and are crucial for assessing the robustness of proposed algorithms and conducting benchmarks among different tools. However, public models are limited as they expose strategic knowledge. While some researchers developed public repositories of imperative models, there remains a lack of diverse, publicly available multi-perspective declarative models. Our work aims to bridge this gap by providing a tool for generating synthetic MP-Declare process models. We leverage Large Language Models to generate these models, ensuring coverage of diverse aspects and enhancing the resource pool for the BPM community.

Wesley da Silva Santos, Juliana Rezende Coutinho, Fernanda Baião, Georges Miranda Spyrides, Hélio Côrtes Vieira Lopes

Open Access

Process Modeler vs. Chatbot: Is Generative AI Taking over Process Modeling?

Large language models (LLMs) have become a promising tool for automating complex tasks such as process model generation from text. In order to evaluate the capabilities of LLMs in generating process models, it is crucial to provide means to assess the output quality. A few studies have already provided key performance indicators for assessing aspects such as completeness of the models in a quantitative way. In this paper, we focus on the qualitative assessment of process models generated by LLMs based on a user survey. By analyzing user preferences, we aim to determine whether LLM-generated process models meet the needs and expectations of experts. Our analysis reveals that 60% of users, regardless of their modeling experience, prefer LLM-generated models over human-created ground truth models.

Nataliia Klievtsova, Janik-Vasily Benzin, Juergen Mangler, Timotheus Kampik, Stefanie Rinderle-Ma

Open Access

Skill Learning Using Process Mining for Large Language Model Plan Generation

Large language models (LLMs) hold promise for generating plans for complex tasks, but their effectiveness is limited by sequential execution, lack of control flow models, and difficulties in skill retrieval. Addressing these issues is crucial for improving the efficiency and interpretability of plan generation as LLMs become more central to automation and decision-making. We introduce a novel approach to skill learning in LLMs by integrating process mining techniques, leveraging process discovery for skill acquisition, process models for skill storage, and conformance checking for skill retrieval. Our methods enhance text-based plan generation by enabling flexible skill discovery, parallel execution, and improved interpretability. Experimental results suggest the effectiveness of our approach, with our skill retrieval method surpassing state-of-the-art accuracy baselines under specific conditions.

Andrei Cosmin Redis, Mohammadreza Fani Sani, Bahram Zarrin, Andrea Burattin

Open Access

Providing Domain Knowledge for Process Mining with ReWOO-Based Agents

Process mining practitioners often face the challenge of interpreting complex process data and driving process improvements with limited expertise in process optimization, tools, and the application domain of the process. This study explores the integration of LLM-based agentic frameworks in process mining to bridge this gap and democratize access to process optimization. We developed a Proof-of-Concept that leverages a Reasoning WithOut Observation (ReWOO)-based agent to perform process discovery, problem identification, generate ecosystem domain knowledge, and propose potential process improvements. Our experiments on a range of business processes suggest that LLM-based agent systems can insert meaningful domain knowledge into process mining tool interactions.

Max W. Vogt, Peter van der Putten, Hajo A. Reijers

International Workshop on Stream Management and Analytics for Process Mining (SMA4PM 2024)

Frontmatter

Open Access

Detect and Conquer: Template-Based Analysis of Processes Using Complex Event Processing

Online process analysis aims at identifying behavioral regularities or abnormalities in processes in near-real-time from continuous event streams. Yet, its realization is challenging, due to the requirements in terms of scalability and accuracy imposed by processes in Internet-of-Things environments. Against this background, this paper presents an approach for online process analysis that is based on standard models and systems for complex event processing (CEP). We present the “Detect and Conquer” approach that includes generic process templates to accurately capture behavioral regularities or deviations, which are then mapped to CEP queries to achieve their efficient evaluation. We evaluated our approach against synthetic and real-world datasets. The results demonstrate the feasibility and efficiency of our approach.

Christian Imenkamp, Samira Akili, Matthias Weidlich, Agnes Koschmider

Open Access

Task-Free Continual Learning with Dynamic Loss for Online Next Activity Prediction

Continual learning, also known as lifelong learning, aims at designing learning models that can continuously and autonomously adapt to varying data concepts without forgetting previously collected knowledge. Such concepts are referred to as tasks. Predictive business process monitoring, which predicts future process steps, is crucial in dynamic environments where tasks are not previously specified and processes frequently change or face unpredictability. However, many existing frameworks assume a static setting, ignoring the dynamic nature of processes and their concept drifts. This leads to catastrophic forgetting, where training on new data adversely affects the performance on previously learned tasks. This paper presents TFCLPM, a framework for online next activity prediction that operates without relying on predefined tasks and employs continual learning techniques to reduce catastrophic forgetting. The methodology combines a Single Dense Layer neural network with a continual learning algorithm that is designed to retain challenging historical samples and that includes a regularizer to stabilize model parameters. Extensive experimental evaluations with synthetic and real-world event logs highlight our optimal configurations. The proposed framework’s performance is compared against three existing online next activity prediction methodologies. Results show significant improvements in prediction accuracy, especially in scenarios with gradual or recurrent drifts, highlighting the framework’s robustness and efficiency, even with large datasets.

Tamara Verbeek, Ruozhu Yao, Marwan Hassani

1st International Workshop on Process Mining for Sustainability (PM4S 2024)

Frontmatter

Open Access

Process Mining Guidelines for Greenhouse Gas Emission Management in Production Processes

Despite the urgent need to become more sustainable and to enhance sustainability reporting, induced by, e.g., the Corporate Sustainability Reporting Directive effective from January 2024, there is a lack of research and industry efforts to integrate sustainability metrics into business processes. One particular reporting requirement entails that large EU companies must disclose their sustainability metrics for greenhouse gas (GHG) emissions across their supply chains. To address this challenging task, this paper presents the Process Mining Guidelines for Greenhouse Gas Emission Management (PMG3), helping companies implement process mining to meet GHG emissions targets in production processes. PMG3 provides detailed steps for defining business and data requirements, analyzing inefficiencies, and formulating recommendations to enhance sustainability reporting. To validate PMG3, a detailed demonstration was conducted using real-world data from a business case in the production process within the consumer goods industry. The utility evaluation revealed high approval for PMG3’s usefulness, ease of use, and practitioners’ intention to use it in industry settings. Overall, this paper contributes a structured and applied approach for organizations to report GHG emissions and improve sustainability performance through process mining.

Ioana Costache, Oktay Turetken, Banu Aysolmaz, Karolin Winter

Open Access

Sustainability Analysis Patterns for Process Mining and Process Modelling Approaches

Business Process Management (BPM) has the potential to help companies manage and reduce their activities’ negative social and environmental impacts. However, so far, only limited capabilities for analysing the sustainability impacts of processes have been integrated into established BPM methods and tools. One of the main challenges of existing Sustainable BPM approaches is the lack of a sound conception of sustainability impacts. This paper describes a set of sustainability analysis patterns that integrate BPM concepts with concepts from existing sustainability analysis methods to address this challenge. The patterns provide a framework to evaluate and develop process modelling and process mining approaches for discovering, analysing and improving the sustainability impacts of processes. It is shown how the patterns can be used to evaluate existing process modelling and process mining approaches.

Andreas Fritsch

Open Access

Towards Nudging in BPM: A Human-Centric Approach for Sustainable Business Processes

Business Process Management (BPM) is mostly centered around finding technical solutions. Nudging is an approach from psychology and behavioral economics to guide people’s behavior. In this paper, we show how nudging can be integrated into the different phases of the BPM lifecycle. Further, we outline how nudging can be an alternative strategy for more sustainable business processes. We show how the integration of nudging offers significant opportunities for process mining and business process management in general to be more human-centric. We also discuss challenges that come with the adoption of nudging.

Cielo González Moyano, Finn Klessascheck, Saimir Bala, Stephan A. Fahrenkrog-Petersen, Jan Mendling

Open Access

Extending Genetic Process Discovery to Reveal Unfairness in Processes

Fairness is an essential consideration for most processes in an organization since an equitable treatment of people involved in a process is often mandated by the rules or regulations. It is also desired from a social sustainability perspective. Many processes have a social impact on the actors performing the process activities and on the subjects affected by the process. We focus on the latter case in which a group of process subjects, such as citizens or patients, experiences unfair bias or discrimination during the execution of the process. Obvious instances of such discrimination in processes are negative decisions, but any change in process behavior for a certain group may be a symptom of unfairness. Process mining has been proposed as a method to analyze such unfairness. However, when considering the classical process discovery of a single overall process model, such hidden biases may get disregarded since they are relatively rare occurrences. To address unfairness in processes through process mining, we first need to reveal it in the process model. Towards this goal, we contribute a fairness-aware process discovery approach that extends a genetic algorithm with new quality measures for group fairness. We tested the approach on a set of synthetic but realistic benchmark datasets containing controlled cases of unfairness. The results indicate that in several cases our approach succeeds in revealing hidden biases against certain groups, which would remain hidden in state-of-the-art process discovery. We consider this as an initial step towards a comprehensive analysis of unfairness in processes.

Muskan, Felix Mannhardt, Boudewijn van Dongen

Open Access

Can We Leverage Process Data from ERP Systems for Business Process Sustainability Analyses?

Sustainability is an increasingly important issue, which organisations need to take into account when assessing and improving their business processes. Doing so can contribute to enhancing an organisation’s overall sustainability. Green Business Process Management is a line of research concerned with supporting organisations to integrate a sustainability perspective into their processes. However, existing approaches that assess sustainability on activity and process levels using, for instance, Life-Cycle Assessment (LCA) are often time-consuming and complex. Therefore, this work explores whether Key Ecological Indicators (KEIs) used to assess the sustainability of a business process can be calculated using data already available within an organisation. Following a case study methodology, we analyse nine real-world datasets extracted from a business process analysis system of a large enterprise software vendor. Results indicate that current data availability is insufficient for exact assessments. To address this issue, we introduce a high-level conceptual model and provide recommendations for action based on the observations of the case study.

Dominik Schäfer, Finn Klessascheck, Timotheus Kampik, Luise Pufahl
Backmatter
Metadata
Title
Process Mining Workshops
Edited by
Andrea Delgado
Tijs Slaats
Copyright Year
2025
Electronic ISBN
978-3-031-82225-4
Print ISBN
978-3-031-82224-7
DOI
https://doi.org/10.1007/978-3-031-82225-4