Abstract
Process mining is witnessing a shift from pure control-flow process models to richer, data-aware process models where the acceptable executions are not only characterised by the process control-flow, but also by the data carried by cases, and how they dynamically interact with the process activities and decisions. Even under the typical assumption that the process control-flow is bounded, the presence of data makes the overall state space infinite, a difficulty that transfers to process analysis and process mining tasks. In this article, we review how the challenges posed by data-aware process mining can be effectively tackled through AI techniques grounded in automated reasoning, and more specifically in Satisfiability and Optimisation Modulo Theories (SMT/OMT). To this end, we discuss data-aware logs and traces merging events with a payload. We then recall Data Petri Nets as an expressive process modelling approach matching this multi-perspective setting, and the logic \(\textsf {data-}{\textsf {LTL}_f}\) to formulate correctness properties and to represent declarative process constraints enriched with data. We then summarize how key process analysis and mining tasks can be addressed in this setting using SMT and OMT, with a particular focus on correctness and conformance checking, model repair, and monitoring. We overview the obtained results, presenting them under an integrated lens and providing corresponding pointers to the technical literature. In the conclusion, we outline open challenges.
This research is partially funded by national funds through FCT, Fundação para a Ciência e a Tecnologia, under projects UIDB/50021/2020 (DOI:10.54499/UIDB/50021/2020) and 2024.07385.IACDC. This work is also partially funded by the NextGenerationEU FAIR PE0000013 project MAIPM (CUP C63C22000770006), by the PRIN MIUR project PINPOINT Prot. 2020FNEB27, and by the Free University of Bozen-Bolzano with the ADAPTERS and PRISMA projects. This research is also partly funded by the project eProcess, no. 14375, supported by COMPETE2030-FEDER-00570400.
1 Introduction
Integrating control-flow and data aspects to holistically capture the dynamic evolution of processes is a central problem in AI [16], business process management (BPM) [99], and process mining [88].
Multi-perspective models that reflect this interdependency have consequently emerged [32, 53, 75, 99, 103], in turn calling for analysis and mining techniques able to address such different perspectives. When describing case-centric, data-aware processes, every case comes with a data payload that interacts with the process control-flow in three main ways:
the execution of activities is not only conditioned by the process control-flow, but also by the data payload;
similarly, executing an activity does not only result in updating the control-flow state of the case, but also its data payload;
the data payload can be used to express data-driven routing conditions within decision points.
In this article, we consider classical, case-centric processes where every process execution focuses on the evolution of a main object, indeed called a case.1
Among the process modelling formalisms representing the process control-flow explicitly, Data Petri nets (DPNs) [80, 88] substantiate these three aspects in a concise yet expressive formalism where the control-flow backbone of the process is specified using Petri nets, the data payload consists of a set of variables, and the data-process interplay is reflected in read-write logical conditions over variables, attached to net transitions.
At the same time, within the family of declarative approaches implicitly describing the process control-flow through temporal constraints, data-aware extensions of linear-temporal logic over finite traces have emerged [13, 39, 70], providing logical foundations for multi-perspective variants [31, 86, 104] of the Declare declarative process specification paradigm [42, 94]. In this paradigm, formulae combining temporal operators and conditions over data are employed to simultaneously constrain the evolution of cases and the data they carry.
Notably, which data are actually considered – and which corresponding data conditions can be expressed – depends on the supported datatypes, heavily impacting the expressiveness of the resulting setting. Variables may in fact carry simple values like strings or numbers (to, e.g., store the delivery address and the cost of an order), or more complex data structures such as vectors or full-fledged relational databases (to, e.g., store the items contained in an order) [14, 34, 54].
These advances in the (formal) representation of processes go hand in hand with an increasing shift in process mining techniques towards data-awareness. This closes a gap existing in process mining: on the one hand, event data have always been intrinsically multi-perspective, as events in a log come with a timestamp, a reference activity, and (optionally) a responsible resource and one or more data attributes [2]; on the other hand, process mining techniques typically focus on a single perspective at a time, in the majority of cases only retaining reference activities and the qualitative temporal ordering of events, without considering data attributes at all.
Besides representational challenges related to modelling data-aware processes, moving to data-aware process analysis and mining poses a central, computational problem: even under the classical assumption that the process control-flow is bounded, the presence of data makes the overall state space of the process infinite, a difficulty that transfers to process analysis and process mining tasks – both in terms of decidability/complexity, as well as when it comes to practical feasibility [83, 90].
To face such computational challenges, an effective strategy is to avoid the development of ad-hoc algorithms, and instead compile analysis and mining tasks into tasks that can be natively addressed using artificial intelligence (AI) techniques. Within the broad spectrum of AI techniques, Satisfiability Modulo Theories (SMT) [19] naturally matches data-awareness. From the modelling point of view, SMT provides a unified framework to describe first-order formulae dealing with different theories, where each theory provides the logical counterpart of a specific datatype (such as real numbers and arrays) together with its domain-specific predicates (e.g., predicates to compare two real numbers or to check the length of an array). From the reasoning point of view, solid technology is available to check satisfiability of such formulae, with highly performing dedicated solvers for specific theories [17, 50, 57]. In addition, several solvers also provide support for optimisation, giving rise to so-called Optimisation Modulo Theories (OMT) [102].
In this article, we review how the SMT/OMT paradigm is being effectively exploited to address process mining for data-aware processes. We present the results under an integrated lens, providing pointers to the technical literature.
The article is organised as follows. In Sect. 2, we provide the essential preliminaries regarding SMT/OMT solving. In Sect. 3, we recall case-centric logs and discuss how the data attributes associated to events can be interpreted. In Sects. 4 and 5 we provide a gentle introduction to DPNs and data-aware linear-time logics on finite traces. Section 6 presents the major contribution of this paper, providing a coherent framework where different data-aware process mining tasks, paired with dedicated process analysis tasks, are integrated. We conclude by outlining open challenges.
2 Satisfiability Modulo Theories
When formalising systems in logic, it is often necessary to express formulae involving not only boolean variables, but also variables referring to numbers or more complex data structures such as relations, uninterpreted functions, arrays, and bit-vectors. This, in turn, calls for combined reasoning on the theories underlying these different datatypes. In this respect, Satisfiability Modulo Theories (SMT) [20] extends propositional satisfiability (SAT) to satisfiability of first-order formulae with respect to combinations of background theories. SMT solvers extend SAT solvers with decision procedures customised for the specific theories involved in the formula of interest. SMT solvers are useful both for computer-aided verification, to prove the correctness of software programs against some property of interest, and for synthesis, to generate candidate program fragments.
Well-studied SMT theories are the theory of uninterpreted functions \(\textsf{EUF}\), the theory of bitvectors \(\mathcal {B}\mathcal {V}\) and the theory of arrays \(\mathcal {A}\mathcal {X}\). SMT solvers also support different types of arithmetics for which specific decision procedures are available, like linear arithmetics (\(\textsf{LIA}\) for integers and \(\textsf{LQA}\) for rationals). Notable examples of widely adopted SMT solvers are Yices [4, 57], Z3 [51, 93], CVC5 [1, 17], and MathSAT5 [3, 43]. SMT-LIB [18] is an international initiative with the aims of providing an extensive on-line library of benchmarks and of promoting the adoption of common languages and interfaces for SMT solvers.
All in all, SMT solving relies on a universal, flexible, and highly expressive approach that allows us to exploit a full-fledged declarative framework, where the actual computations are discharged to very competitive SMT solvers.
Finally, it is important to point out that many SMT solvers are also equipped with optimisation functionalities, tailored to solve problems based on Optimisation Modulo Theories (OMT) [101, 102]. OMT extends SMT by singling out, among the different satisfying assignments for the formula of interest, those that are optimal with respect to a given objective function.
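To make this concrete, the following minimal sketch (using the z3py Python bindings of Z3, purely for illustration) first checks satisfiability of a linear integer arithmetic formula, and then solves the corresponding OMT problem, singling out, among the satisfying assignments, one minimising a given objective.

```python
from z3 import Ints, Optimize, Solver, sat

# SMT: check satisfiability of a linear integer arithmetic (LIA) formula.
x, y = Ints("x y")
s = Solver()
s.add(x + y > 5, x < 3, y >= 0)
if s.check() == sat:
    print(s.model())  # one satisfying assignment for x and y

# OMT: among all satisfying assignments, single out one minimising y.
opt = Optimize()
opt.add(x + y > 5, x < 3, y >= 0)
h = opt.minimize(y)
if opt.check() == sat:
    print(opt.model(), "minimum y:", opt.lower(h))  # here, minimum y = 4
```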
3 Data-Aware Logs and Traces
We are interested in process mining over data-aware traces, that is, traces where the data payload carried by events is taken into consideration when solving process analysis and mining tasks. Our focus is on conventional case-centric process mining, where a (data-aware) event log is a set of traces denoting the evolution of cases through the process, each trace is a sequence of events, and each event comes with a data payload that is considered as a first-class citizen in the analysis.
Notably, this is in line with the conventional definition of event log [5], and also with the XES standard for representing event logs [2].
More in detail, we assume that the data payload of each event in a trace refers to the values that a set V of case variables take upon the occurrence of that event. Each variable \(x \in V\) is typed, and each type \(type (x)\) comes with a (typically infinite) data domain \(\Delta _{ type (x)}\), as well as with domain-specific predicates expressing conditions on the values that such a variable can take within its domain. For example:
a rational variable may be subject to comparison conditions, or in a richer setting, also to arithmetic conditions;
a variable pointing to a relation may be subject to database operations for inserting/removing/modifying tuples of that relation.
In this light, we define a trace more abstractly than in classical definitions [5].
Definition 1
(Event, trace) Let \(V\) be a set of variables. An event e over \(V\) is a set of assignments, one per variable in \(V\), mapping that variable to a corresponding value in its domain: \(e = \{x \mapsto d \mid x \in V, d \in \Delta _{ type (x)}\}\) – where we write \(e.x = d\) to indicate that \(x \mapsto d\) in e. A trace over \(V\) is a finite set of events over \(V\).
The following observations are important:
1.
It is customary to define a dedicated variable for timestamps. This is done by assuming that, among the available datatypes, there is one datatype \(\mathbb {T}\) for timestamps, which can be substantiated using different concrete datatypes (such as the integers), with the only requirement that \(\mathbb {T}\) is equipped with a comparison operator \(\prec\) to check whether an event comes before another event.
2.
If a timestamp variable is employed (say, variable \(ts\) with \(type (ts) = \mathbb {T}\)), it is possible to directly reconstruct how events are sequenced in a trace, by comparing two events through the comparison \(\prec\) over their timestamps. Specifically, we can see a trace \(\{e_1,\ldots ,e_n\}\) as a sequence by introducing an ordering \(s: \{1,\ldots ,n\} \rightarrow \{1,\ldots ,n\}\) such that for every \(i,j \in \{1,\ldots ,n\}\), we have \(i < j\) if and only if \(e_{s(i)}.ts \prec e_{s(j)}.ts\) (see the sketch after this list).
3.
Activity names, transactional lifecycle indicators, as well as resources, can all be captured through dedicated variables; this is, again, in line with XES, where all these aspects are captured as extensions of a core representation consisting only of events and their sequencing.
4.
The representation of Definition 1 is general enough to accommodate settings that differ in their logging strategy. For example, if an event does not declare the value of one of the variables, one may assume that, by law of inertia, that value corresponds to that of the previous event (and so, in our representation, the assignment should simply be copied). At the same time, we stress the importance of explicitly pointing out these assumptions, as otherwise the same event log could lead to different interpretations.
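As a small illustration of observation 2, the following sketch (plain Python, with variable names borrowed from the auction example introduced later) represents events as variable-to-value assignments, and reconstructs their sequencing by sorting on the dedicated timestamp variable ts.

```python
# Events as assignments of case variables (o, t, activity a, timestamp ts)
# to values, mirroring Definition 1; the trace itself is an unordered set.
events = [
    {"a": "bid",  "o": 10.0, "t": 4.0, "ts": 2},
    {"a": "init", "o": 0.0,  "t": 5.0, "ts": 1},
    {"a": "bid",  "o": 25.0, "t": 3.0, "ts": 3},
]

# The comparison operator over the timestamp datatype induces the ordering s.
trace = sorted(events, key=lambda e: e["ts"])
print([e["a"] for e in trace])  # ['init', 'bid', 'bid']
```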
Notice that while we follow a case-centric approach, the fact that variables may come with a complex datatype (such as a relation or other types of data structures) makes it quite general, and able to directly accommodate approaches that go beyond XES and tackle events operating over multiple objects. We comment on these so-called object-centric logs in Sect. 7.
4 Data Petri Nets
A process modelling language whose executions naturally match the notion of data-aware trace discussed in Sect. 3 is that of Data Petri nets (DPNs) [80, 88]. We provide a gentle overview of this approach, covering the model, its execution traces, and its execution semantics.
4.1 The DPN Model
A DPN is a Petri net equipped with two additional components:
1.
a finite set \(V\) of case variables, conforming to what has been described before;
2.
a decoration of the net transitions with constraints that read such variables to express data guards, and that write such variables to express controlled updates. To enable this, constraints are expressed as formulae containing conditions over variables in \(V\), mentioned in read or write mode (using r and w as variable superscripts, respectively). These formulae can be formulated over several first-order theories, such as linear arithmetic, EUF, and relational theories [60].
Figure 1a shows a DPN that models a simple auction process. Every auction represents a case, equipped with two variables: \(V = \{o,t\}\), with variable o holding the last offer, and t a timer. Constraint \(t^r\,{>}\,0 \wedge o^w\,{>}\,o^r\) associated with action \(\textsf{bid}\) expresses that a bid can be placed only if the current value of t is positive, and that when executed, it will increase the value of o.
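As an illustration, the following sketch encodes the \(\textsf{bid}\) guard in SMT (here via z3py): read variables are bound to the current values of the case variables, write variables are left free, and satisfiability of the conjunction tells whether the transition is enabled, with the model providing an admissible written value.

```python
from z3 import Reals, Solver, sat

# The bid guard t^r > 0 /\ o^w > o^r, with read variables fixed to the
# current assignment and the write variable left for the solver to choose.
t_r, o_r, o_w = Reals("t_r o_r o_w")
s = Solver()
s.add(t_r > 0, o_w > o_r)   # transition constraint of bid
s.add(t_r == 5, o_r == 0)   # current values of t and o
if s.check() == sat:
    print("bid is enabled; a possible new offer:", s.model()[o_w])
```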
The atomic conditions used within constraints depend on the datatypes of the variables, and on the available predicates. In the auction example, both variables maintain rational values, subject to linear arithmetic conditions. However, more restricted settings may suffice, like those recalled next:
monotonicity constraints (MCs) admit only variable-to-variable/constant comparisons wrt. \(=\), <, \(\le\), \(\ne\) for variables with domain \(\mathbb Q\) or \(\mathbb R\), such as \(x < y\) or \(3 \le x\);
integer periodicity constraints (IPCs) have the form \(x = y\), \(x \odot d\) for \(\odot \in \{=,\ne , <, >\}\), \(x \equiv _k y + d\), or \(x \equiv _k d\), for variables x, y with domain \(\mathbb Z\) and \(k,d\in \mathbb N\);
gap-order constraints (GCs) have the form \(x - y \ge k\) for x, y integer variables or constants, and \(k\,{\in }\,\mathbb N\).
4.2 Executing a DPN
To track the evolution of a case through the process, one needs to capture at once the control-flow state of the net, together with the current values assigned to the case variables. Formally, a state in a DPN \(\mathcal {N}\) is a pair \((M,\alpha )\) of a marking M and an assignment \(\alpha\) that maps V to values (according to the datatype associated to each variable). We fix some initial assignment, and call a run a finite sequence of transition firings in the underlying Petri net that respect the guards. For instance, a possible run for the DPN in Fig. 1 fires \(\textsf{init}\), followed by interleaved \(\textsf{bid}\) and \(\textsf{timer}\) firings, and finally \(\textsf{hammer}\), reaching marking \([p_3]\).
It is easy to see how a DPN run relates to the notion of data-aware trace as of Sect. 3: one just selects the assignments from each state of the run. Logging the name of the last executed activity, which is essential in process mining, can be done by introducing a dedicated case variable a of type string, ensuring that whenever a transition fires, a gets updated with the name of the executed activity. For example, the guard associated to \(\textsf{init}\) in Fig. 1 can be modified according to this principle by inserting a third conjunct of the form \(a^w = \textsf{init}\). Furthermore, a surrogate start event can be inserted to keep track of the initial values for the case variables. With these notions in place, the run above yields a corresponding data-aware trace, where each event carries the current values of the case variables (including the activity-tracking variable a).
The timestamps of executed activities can be tracked by introducing a dedicated, rational case variable ts in the DPN, ensuring that whenever a transition fires, that variable is written to a value strictly greater than the one it currently maintains (so as to respect the flow of time). This is done through an atomic guard of the form \(ts^w > ts^r\).
Fig. 1
Example of a DPN (a), together with its corresponding DDS (b) and constraint graph (c) (from [96])
4.3 Data-Aware Dynamic Systems and DMTs
For process analysis and mining tasks, it is customary to deal with a different formal model of the process, which directly results from the input DPN, under the assumption that concurrency is interpreted by interleaving. This leads to so-called data-aware dynamic systems (DDSs) [81], which are labeled transition systems where transitions are associated with constraints as above.
Although the need for more sophisticated datatypes and first-order theories has recently emerged in process science [60, 75], classical DDSs only support linear arithmetical theories, drastically limiting their expressiveness and their applicability in real BPM scenarios [34, 35]. For this reason, data-aware processes modulo theories (DMTs) have been introduced in [76, 77]: they are general dynamic systems manipulating variables that can store arbitrary types of data, ranging over infinite domains and equipped with domain-specific predicates. DMTs strictly generalize DDSs to support arbitrary first-order theories (in the SMT sense). Although they are intrinsically infinite-state systems whose states are characterized by the content of their global variables, DMTs can be enriched with finitely many explicit control states, as illustrated in Ex. 3 in [77].
Notably, every bounded DPN \(\mathcal {N}\) induces a corresponding DMT \(\mathcal {B}\), straightforwardly defined by considering that each one of the (finitely many) reachable markings in the DPN becomes a control state in the DMT. Figure 1b shows the auction process as a DMT – this specific case only presents arithmetical conditions, so we directly formalize it as a DDS.
4.4 Execution Semantics and Symbolic Representation
The execution semantics of a DMT is an infinite-state transition system, where each state holds the current control-flow configuration (a marking in the DPN case, or the control state in the DMT/DDS case), together with the current assignments of the case variables. Infinity here arises due to the presence of infinitely many distinct assignments for case variables with infinite data domains. In Fig. 1, for example, offers range over all non-negative rational numbers.
To solve analysis and mining tasks with DPNs, it becomes crucial to aim at obtaining finite representations of the infinite-state space (in line with, e.g., [14, 36, 54]) – which cannot be done in general, but can be achieved for certain theories, possibly under suitable restrictions on how the process operates over data variables. A natural way of attacking this problem is by aiming at symbolic representations of the state space, where SMT formulae are used to represent (possibly infinitely many) distinct states that behave equivalently in the context of the process execution. This idea is at the basis of a long series of results in the (symbolic) verification of infinite-state systems. In the context of process science and data-aware processes, such results are surveyed in [75].
When dealing with DMTs, this technique is substantiated through so-called constraint graphs. These are semi-symbolic representations of the state space induced by the input system, where each state contains a reference to the corresponding control state of the DMT, and a formula constraining which assignments are accepted for that state. This has been initially defined for the specific case of DPNs/DDSs with monotonicity constraints [62, 81], then extended to other cases in [66], and finally generalised to richer theories (as in the case of DMTs [77]).
The main idea underlying the constraint graph construction is to traverse the DMT in a forward fashion. While doing so, constraints are accumulated, taking into account the peculiarities of read and write guards, and inserting inertial constraints indicating that untouched variables maintain their value. In this respect, a symbolic state is a control state decorated with a formula for the accumulated constraints, which characterises the set of assignments leading there. Importantly, the same control state may now appear multiple times. Moreover, the construction is in general infinite, though for expressive and relevant classes of processes, finite constraint graphs can be obtained.
At every step, one starts from a symbolic state s, and symbolically fires a transition to get a new candidate symbolic state \(s'\). SMT solving is then applied for a twofold reason. First, it is used to check whether the formula of \(s'\) is satisfiable: only in that case, \(s'\) is indeed a reachable state, and should be considered as part of the constraint graph. Second, in order to prune the state space, the SMT solver can optionally be used to check whether \(s'\) is genuinely new, or a semantically equivalent state \(s''\) has already been inserted in the constraint graph. In the former case, \(s'\) is added to the constraint graph, together with a labelled transition linking s to \(s'\); otherwise, if an equivalent \(s''\) exists, then \(s'\) is discarded, and s is linked to \(s''\). However, the second check is feasible only when it is possible to handle, in some form, the quantifiers involved in the formulae [37, 66, 74].
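The following sketch (z3py, with illustrative formulae over the auction variables) shows the two checks in their simplest, quantifier-free form: a satisfiability test for a candidate symbolic state, and an equivalence test phrased as unsatisfiability of the symmetric difference of two state formulae.

```python
from z3 import Real, Solver, And, Or, Not, unsat

# Illustrative formulae over the auction variables o and t; in general the
# equivalence check may additionally require quantifier handling.
o, t = Real("o"), Real("t")

def satisfiable(phi):
    s = Solver()
    s.add(phi)
    return s.check() != unsat

def equivalent(phi, psi):
    # phi and psi accept the same assignments iff their symmetric
    # difference is unsatisfiable.
    return not satisfiable(Or(And(phi, Not(psi)), And(psi, Not(phi))))

candidate = And(o == 0, t > 0)                    # formula of a candidate s'
print(satisfiable(candidate))                     # keep s' only if True
print(equivalent(candidate, And(t > 0, o == 0)))  # link to existing s'' if True
```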
Figure 1c shows the constraint graph of the auction DPN.
Importantly, the constraint graph should be interpreted existentially, that is, when there is a path in the constraint graph, it means that the corresponding sequence of transitions can be executed for some sequence of variable assignments satisfying the respective state formulae. Consider, for example, the auction process in Fig. 1. The symbolic state \((\{p_1,p_2\}, o = 0)\) can be reached with any possible value for variable t, but only with a positive value for t is it possible to apply the \(\textsf{bid}\) transition therein.
We comment on the finiteness of the constraint graph, and the impact on reasoning over DPNs, in Sect. 6.2.
5 Data-Aware \({\textsf {LTL}_f}\)
\({\textsf {LTL}_f}\) (linear-time logic on finite traces [48]) is one of the most widely employed temporal logics to express properties and constraints of finite-length traces.
Traditionally, temporal logics are used to express (un)desired properties of a dynamic system, while the system itself is specified using a different formalism. For example, one may use \({\textsf {LTL}_f}\) to verify properties of (finite traces of) a Petri net. A less conventional approach is to declaratively describe the dynamic system itself using temporal logics. Within process science, \({\textsf {LTL}_f}\) has seen wide employment in this second acceptation, to provide a formal semantics for declarative process modelling approaches, most prominently Declare [42, 94]. The main idea is to implicitly describe the process control-flow through a declarative specification that consists of a set of temporal constraints, so that a trace belongs/conforms to the specification if and only if it satisfies all the constraints contained in the specification.
No matter which of the two usages is adopted, expressing properties and constraints of data-aware traces (as of Sect. 3) calls for
1.
linear temporal operators to relate different positions (or instances) of the trace;
2.
local data conditions to inspect and express properties of the assignments contained in the trace;
3.
data conditions relating different assignments over time.
These three features are present in first-order variants of \({\textsf {LTL}_f}\), which have been studied either in the classical setting of a single infinite interpretation domain where constants can be compared with (in)equality [13, 39], or the “modulo-theory” setting where multiple datatypes are considered [70]. We provide a bird's-eye view of the latter setting, considering a simplified variant of the \({\textsf {LTL}_f}^{\textsf {MT}}\) logic in [70, 71]. This variant, called \(\textsf {data-}{\textsf {LTL}_f}\), has been originally introduced in its full generality in [77],2 and exactly mirrors the notion of trace defined in Sect. 3.
Definition 2
(Syntax of \(\textsf {data-}{\textsf {LTL}_f}\)) Let \(V\) be a set of variables. A \(\textsf {data-}{\textsf {LTL}_f}\) formula \(\varphi\) over \(V\) is defined according to the following syntax:
$$\varphi \,{:}{:}{=}\, q \mid \lnot \varphi \mid \varphi _1 \wedge \varphi _2 \mid \textsf{X}\,\varphi \mid \varphi _1 \,\textsf{U}\,\varphi _2$$
where q is a condition over \(V\), possibly mentioning primed versions of variables therein (i.e., versions such as \(x'\) and \(x'''\) for \(x \in V\)), and constants in the data domains of the types of variables in V.
Formulae of \(\textsf {data-}{\textsf {LTL}_f}\) are interpreted over a trace as per Definition 1. In the definition:
\(\textsf{X}\,\) and \(\,\textsf{U}\,\) are the next and (strong) until temporal operators, used to move over consecutive instants of the trace. Specifically, \(\textsf{X}\,\varphi\) indicates that the next instant of the trace exists, and \(\varphi\) holds therein; \(\varphi _1 \,\textsf{U}\,\varphi _2\) indicates that \(\varphi _2\) holds in a future instant k of the trace, and \(\varphi _1\) holds in every instant between the current instant (included) and k (excluded).
Conditions only mentioning non-primed variables (i.e., variables from \(V\)) are local conditions checked on the current instant of the trace;
Conditions mentioning primed versions of variables express data conditions over different instants. Specifically, a variable with n primings denotes a lookahead variable that refers to the value the variable holds n instants after the current one. In this respect, x refers to variable x in the current instant, while \(x'''\) has a lookahead of 3, and refers to x three instants after the current one. This can be used to express conditions that relate different assignments over time. For example, condition \(x' > x \wedge x'' = x'\) expresses that the value of x will increase in the next instant, and will then remain constant in the subsequent instant.
Importantly, \(\textsf {data-}{\textsf {LTL}_f}\) (like the richer logic \({\textsf {LTL}_f}^{\textsf {MT}}\)) can only relate variables from instants that are at a bounded distance from each other, the maximum distance being implicitly determined by the largest employed lookahead. This enforces a well-behaved nesting of data conditions within temporal operators [70]. Relaxing such a nesting, which is needed to capture conditions relating variables in arbitrarily distant instants, causes basic reasoning tasks to become highly undecidable, even in the very restricted setting where variables range over an infinite domain whose elements can only be compared by (in)equality [39, 52].
Consider a process operating over two rational variables x and y. The following \(\textsf {data-}{\textsf {LTL}_f}\) formula (with a lookahead of one)
$$\textsf{G}\,(x' > x) \wedge \textsf{F}\,(x = 2)$$
captures that x is strictly increasing (formulated by expressing that it is always the case that the next value of x is greater than the current one), and at some point holds value 2 (here, \(\textsf{G}\) and \(\textsf{F}\) respectively denote the standard always and eventually operators, derivable from \(\,\textsf{U}\,\)).
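For illustration, the following plain-Python sketch evaluates this formula on a concrete finite trace; as an assumption of the sketch, the lookahead condition is treated as vacuously satisfied at the last instant, where no successor exists.

```python
# Evaluating G(x' > x) /\ F(x = 2) on a finite trace, reading x' as the
# value of x one instant ahead.
trace = [{"x": 0.5}, {"x": 1.0}, {"x": 2.0}]

# G(x' > x): at every instant that has a successor, x strictly increases.
always_increasing = all(
    trace[i + 1]["x"] > trace[i]["x"] for i in range(len(trace) - 1)
)
# F(x = 2): at some instant, x holds value 2.
eventually_two = any(e["x"] == 2.0 for e in trace)

print(always_increasing and eventually_two)  # True
```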
As described above, \(\textsf {data-}{\textsf {LTL}_f}\) can be used:
to express properties of (the runs induced by) a DPN, relying on the compatibility between data-aware traces and DPN runs (cf. Sect. 3);
to declaratively specify a data-aware process as a set of temporal constraints.
In the latter setting, \(\textsf {data-}{\textsf {LTL}_f}\) provides logical underpinnings for data-aware extensions of Declare [31, 86, 95, 104]. Binary Declare templates like, for example, the response template (indicating that whenever a source activity A occurs, then a target activity B will occur later), are usually lifted to a data-aware setting by enriching the template with:
a source condition \(\varphi _A\) associated to A;
a target condition \(\varphi _B\) associated to B;
a correlation condition \(\varphi _{AB}\) relating the payload of A with that of B.
Intuitively, in the case of response the meaning of the constraint becomes: for every instant i, if A occurs in i and \(\varphi _A\) is true in i, then there must be a later instant \(j>i\) such that B occurs in j, \(\varphi _B\) is true in j, and \(\varphi _{AB}\) is true once evaluated over i for A’s variables, and j for B’s ones. The case of chain response is handled analogously, with the only difference that \(j=i+1\).
With this approach, one can ground the response template (respectively, the chain response template) to express, for example, that whenever an order is paid by a person with credit card as payment mode, then a receipt must be later (respectively, immediately next) issued to that same person.
Considering the bounded lookahead property of \(\textsf {data-}{\textsf {LTL}_f}\) discussed above, we have that \(\textsf {data-}{\textsf {LTL}_f}\) naturally supports the formalisation of Declare templates with source and target conditions, and covers correlation conditions relating positions with a bounded lookahead. In the order example above, the response version cannot be captured in \(\textsf {data-}{\textsf {LTL}_f}\), while the chain response version can be captured as follows:
$$\textsf{G}\,((a = \textsf {pay} \wedge m = \textsf {cc}) \rightarrow (a' = \textsf {issue-receipt} \wedge p' = p))$$
assuming that a, m, and p are string variables respectively denoting activities, payment modes, and persons.
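The following sketch (plain Python, with an illustrative trace) checks this chain response constraint directly against the semantics given above, treating the lookahead condition as false when no next instant exists.

```python
# Checking G((a = pay /\ m = cc) -> (a' = issue-receipt /\ p' = p)).
trace = [
    {"a": "pay", "m": "cc", "p": "alice"},
    {"a": "issue-receipt", "m": "cc", "p": "alice"},
    {"a": "close", "m": "none", "p": "alice"},
]

def chain_response(trace):
    for i, e in enumerate(trace):
        if e["a"] == "pay" and e["m"] == "cc":
            # a lookahead beyond the trace end makes the condition false
            if i + 1 == len(trace):
                return False
            nxt = trace[i + 1]
            if not (nxt["a"] == "issue-receipt" and nxt["p"] == e["p"]):
                return False
    return True

print(chain_response(trace))  # True
```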
As we will see later in Sect. 6, while the bounded lookahead property is essential in the context of reasoning, it can be lifted when dealing with discovery and conformance checking. Intuitively, this is due to the essential fact that in these tasks there is always a maximum bound on the length of traces to be considered (in discovery, determined by the analysed log, and in conformance checking, by the analysed log and the considered specification).
6 Data-Aware Process Mining Tasks
We are now ready to discuss how the languages and techniques described so far have been employed to solve key data-aware process mining tasks, using SMT/OMT solvers as the core, underlying algorithmic technology.
Fig. 2
Overall schema showing key data-aware process mining tasks. SMT-based tasks are shown in magenta
Figure 2 takes the standard schema describing a process mining framework, in particular the version shown in [6], and adapts and refines it to the setting of data-aware traces and data-aware specifications – formalised using DPNs or \(\textsf {data-}{\textsf {LTL}_f}\). Among the shown tasks, in magenta we report those mainly tackled with SMT in the state of the art.
We next comment on each task in more detail, providing references to the literature.
6.1 Discovery with Loosely-Integrated Techniques
Discovery of data-aware process specifications has been tackled so far through a loosely integrated approach that, in its essence, operates as follows.
1.
a pure control-flow specification is discovered using conventional techniques – for example, producing a Petri net or a Declare/\({\textsf {LTL}_f}\) specification;
2.
elements that are conditioned by data are identified within those specifications – for example, singling out choice points and activities in a Petri net, or activities in a declarative specification;
3.
data attributes corresponding to such elements are extracted, and a decision mining task is cast on them (see the sketch after this list);
4.
the result of decision mining is, per element, integrated into the control-flow specification, yielding a final, data-aware refinement of it.
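As an illustration of step 3, the following sketch (using scikit-learn, on invented data inspired by the auction example) casts decision mining at a single choice point as a classification problem: the data attributes observed at decision time are the features, and the taken branch is the class.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Invented observations: values of (o, t) when the choice point between
# reset and hammer was reached, and which branch was actually taken.
X = [[0.0, 5.0], [30.0, 2.0], [0.0, -1.0], [45.0, 1.0]]  # (o, t) per case
y = ["reset", "hammer", "reset", "hammer"]               # branch taken

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
# The learned split conditions (here, a threshold on o) become candidate
# read guards to be attached in step 4.
print(export_text(tree, feature_names=["o", "t"]))
```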
This approach has been applied to discover DPNs, where decision mining is used to attach read guards to transitions [88, 89], and to discover data-aware Declare specifications, using decision mining first to mine source and target conditions [86], and in subsequent work also addressing correlation conditions [79].
A strong point of this approach is that existing techniques separately dealing with control-flow and decision mining can be seamlessly integrated.
A downside is that there is no guarantee that the combination of the two dimensions – discovered separately and integrated a-posteriori – suitably works. In particular, it has been shown that this form of discovery may produce unsound DPNs – a phenomenon that can also arise when a (sound) discovered DPN is later subject to manual re-modelling [67, 68].
This motivates the insertion of two model-driven analysis tasks within a data-aware process mining pipeline:
correctness checking – to verify whether the (executions of the) DPN satisfy properties of interest, in this case using \(\textsf {data-}{\textsf {LTL}_f}\) to formalise such properties;
model repair – to turn an unsound DPN, or an inconsistent \(\textsf {data-}{\textsf {LTL}_f}\) specification, into a sound/correct one, by introducing minimal changes to the original specification.
6.2 Correctness Checking
Checking correctness of DPNs against full \(\textsf {data-}{\textsf {LTL}_f}\) has been addressed considering linear-time properties in [77]. In the specific setting of arithmetic datatypes, verification has been tackled studying linear-time [66], branching-time [65], and data-aware soundness [67] properties. It is worth noting that, when analysing DPNs, it is irrelevant whether or not the logic supports lookahead. Intuitively, this is because the DPN itself can impose transition constraints that relate variables across time.
Considering the auction example of Fig. 1, there are two main correctness issues. First, the \(\textsf{reset}\) transition is dead: whenever marking \([p_3]\) is reached, variable o must hold a strictly positive value (as witnessed in the bottom symbolic state of the constraint graph). Second, if, after executing \(\textsf{init}\), transition \(\textsf{timer}\) fires writing a value for t that is \(\le 0\), then \(\textsf{bid}\) can never fire: it requires \(t > 0\), but only lower values of t can be written by further firings of \(\textsf{timer}\). This, in turn, indicates that variable o will always keep value 0, making it impossible to fire \(\textsf{hammer}\) and reach the desired marking \([p_3]\).
6.2.1 Linear-Time Properties
In [66], linear-time correctness checking of (the finite runs generated by) a DPN is addressed for \(\textsf {data-}{\textsf {LTL}_f}\) formulae over arithmetic datatypes in an existential way, that is, verifying whether the DPN can produce some run satisfying the given property.
The approach in [66] lifts to an arithmetic setting the classical automata-based model checking approach for linear-time logics [15]: the formula is encoded into an automaton, which is then combined with the transition system under scrutiny to obtain a cross-product that is inspected for non-emptiness to ascertain whether the property holds. Three challenges arise, though:
1.
when constructing the automaton of the formula, one has to consider that edges are not propositions, but instead constraints over the theories mentioned in the formula;
2.
also the cross-product of the formula and of the DPN constraint graph must take into account the fact that edges of the automaton, and symbolic states of the constraint graphs, contain formulae, not mere propositions;
3.
in contrast to the propositional setting, the cross-product must account for the fact that, as we have discussed in Sect. 4.4, even when the transition system of the input DDS is represented symbolically through a constraint graph, it may contain infinitely many states.
The first challenge is tackled by initially dealing with constraints at a purely syntactic level. Specifically, they are treated as propositions when building the automaton of the formula – which is then constructed following the standard approach for propositional \({\textsf {LTL}_f}\) [46, 48]. The resulting automaton has constraints on its edges. In a second phase, the meaning of constraints is taken into account: SMT solving is employed to retain only those edges whose formulae admit satisfying assignments.
The second challenge is handled similarly: when combining the automaton of the formula with the constraint graph of the DPN, SMT solving is used to decide which states/transitions should be kept, and which should instead be filtered out.
The third challenge cannot be solved in the general case, as the correctness checking task is highly undecidable. For example, it is undecidable when employing only a single numeric datatype equipped with increment, decrement, and test for zero [66]. To single out decidable cases, [66] brings forward a semantic property of the input system, called finite history, which guarantees that the constraint graph has finitely many states, in turn yielding a correct and terminating technique for correctness checking. Having a finite history is a semantic property, undecidable to check on its own.
However, it is instrumental to single out concrete settings, defined based on syntactic conditions on the shape of constraints and/or the control-flow structure of the systems, which enjoy finite history, and in turn yield decidable cases. In fact, this approach has been used to reconstruct and generalise decidable cases known from the literature. In particular, [66] identifies the following, well-behaved classes:
DPNs with monotonicity constraints;
DPNs with integer periodicity constraints;
DPNs with gap-order constraints;
DPNs with unrestricted arithmetics, but used in a controlled way, in particular ensuring that the value of each variable depends only on boundedly many arithmetic manipulations;3
DPNs that structurally integrate components falling in two or more of the previous cases.
An important generalisation of such classes has been then achieved in [76, 77], where decidability of correctness checking is obtained for DMTs as well, in case of very general datatypes obeying mild model-theoretic assumptions.
6.2.2 Branching-Time Properties and Soundness
When properties are verified on a DPN, it becomes also relevant to inspect the branching structure of the modelled process. This is studied in [65], which introduces a CTL\(^*\) version of \(\textsf {data-}{\textsf {LTL}_f}\) restricted to arithmetic datatypes. Among the classes recalled above, decidability is shown to only hold for monotonicity and integer periodicity constraints [65].
Notably, the capability of CTL\(^*\) to mix linear- and branching-time operators makes it possible to use it to specify a data-aware variant of the well-studied notion of soundness for Petri nets [9], now carried over to DPNs. This has been initially introduced in [49, 63] to formally characterise soundness of control-flow processes equipped with (DMN) decisions [21], and then generalised in [67] to the case of DPNs with arithmetics.
6.3 Model Repair
As discussed in the previous section, a discovered DPN (possibly subject to consequent manual intervention) may exhibit undesired behaviours. For example, the auction DPN of Fig. 1 is unsound. Unsoundness can be experienced when discovering DPNs using loosely-integrated techniques for the control-flow and data part. Some unsound DPNs from the literature are discussed in [68]. Similar issues may be experienced in the declarative setting, where discovered constraints, each satisfied only by a fraction of log traces, may contradict each other, leading to an overall inconsistent specification [41].
Instead of simply rejecting the faulty specification, one may attempt to repair it, that is, transform it into a “correct” version, obtained from the original one through a minimal set of changes. In the case of an unsound DPN, this requires applying minimal changes to it so as to make it sound. In the case of an inconsistent declarative specification, one may identify minimal unsatisfiable cores (that is, minimal sets of constraints causing inconsistency) [44, 87, 100], and then weaken or remove them.
A first challenge is to define what a minimal set of changes actually means. In fact, one may consider changes at the representation or behavioural level. The first setting considers the effect of changes over the specification itself, such as how many guards are modified when repairing a DPN, or how many constraints are weakened/removed in a Declare specification. The second setting treats the specification as a compact representation of a set of traces, and thus minimality may consider which behaviours are lost/gained when altering the specification. For example, in the case of a DPN, one may consider how many traces disappear, and how many transition firings ending up in a dead state get unblocked upon repairing it.
Model repairs in a data-aware setting have so far been studied only to turn unsound DPNs into sound ones, while to the best of our knowledge no repair strategies have been studied for \(\textsf {data-}{\textsf {LTL}_f}\). More in detail, minimal repairs for unsound DPNs are studied in [108] at the representation level, and in [68] at the behavioural level. Indeed, [108] treats the DPN as a (graphical) representation, and minimality considers how many different guards are altered in the repair, and how. In [68], it is assumed that the control-flow of the DPN (stripping off all the constraints) is sound, and that two types of operations can be done in a repair: removal of dead transitions, or modification of constraints, by either relaxing or tightening them. SMT solving and constraint graphs are used to define minimal overall repairs through extension or restriction, measured in terms of removed/newly inserted traces.
Repair by restriction of the auction example leads to two main changes to the DPN in Fig. 1a:
The dead \(\textsf{reset}\) transition is removed.
The dead state arising when \(\textsf{timer}\) fires before any firing of \(\textsf{bid}\), writing a value for t that is 0 or negative, is cured by tightening the guard attached to \(\textsf{timer}\) into \((t^r> 0) \wedge (t^w < t^r) \wedge (o^r> 0 \vee t^w > 0)\). This new guard adds a third conjunct ensuring that either a bid has been placed (thus making \(o > 0\), so that \(\textsf{hammer}\) can fire), or that the written time is positive (thus leaving \(\textsf{bid}\) enabled). As shown in [68], this modification ensures that all the runs accepted by the original DPN are still accepted by the new one, and that the (partial) runs getting stuck in the dead state described above are now able to progress until the final marking – thus guaranteeing minimality.
6.4 Conformance Checking
Conformance checking in a data-aware setting compares logged traces and model traces (that is, traces conforming to a reference process specification) by considering both the sequencing of events and the payloads they carry. In a nutshell, the conformance checking problem takes as input a log/a trace and a process model, and computes various artifacts linking observed and modeled behaviors. Such artifacts are then used for deviation detection, (process model) quality metrics computation, and the like.
6.4.1 Alignment-Based Conformance Checking
A widely adopted conformance checking artifact for a given log trace is that of alignment, indicating how the log trace relates to the closest model trace(s), and computing a related cost [10, 28]. An alignment is a finite sequence of moves (event-firing pairs) s.t. each (e, f) is (i) a log move, if \(e\in \textsf{E}\,\) and \(f=\gg\) (\(\gg\) is a symbol denoting skipping); (ii) a model move, if \(e=\gg\) and f is a valid transition firing; (iii) a synchronous move otherwise. Given a log trace and a process model, a corresponding alignment is computed by finding a complete process run that is “the best” among all those the process model can generate. This is done by searching for an optimal alignment using a specific cost function, which can be realised as a distance between two finite strings [10, 28]. In case of logs with uncertainties, one looks into alignments obtained for realisations – possible sequentialisations of (a subset of the) events with uncertainty in a given trace so that timestamp uncertainties are resolved [59].
6.4.2 Alignments for DPNs using SMT/OMT
Checking conformance of data-aware traces to DPNs has been initially tackled in [91], defining and computing multi-perspective alignments for DPNs. Intuitively, mismatches between events of a log trace and corresponding transition firings of the DPN may be caused not only when the reference activities are different, but also in case the data disagree.
Better algorithmic bounds and a more general technique based on SMT/OMT have then been proposed in the conformance-checking approach CoCoMoT [58, 60]. The idea behind CoCoMoT stems from SAT-based approaches for the computation of pure control-flow alignments [27], lifting them to the data-aware case.
This is done in a modular way as follows, under the assumption that the input DPN supports at least one run (i.e., in technical terms, is relaxed data sound). The main modules of the encoding are: (i) the executable process model (EPM) module, which realizes the execution semantics of the input DPN by symbolically encoding all possible valid process runs; (ii) the log trace (LT) module, which accounts for the trace realization in case it contains any uncertainties; (iii) the conformance artifacts (CA) module, which accounts for a conformance artifact of interest as well as for its related decision/optimization problem, i.e., the task that needs to be solved (or decided) in order to compute such artifact.
Each module (with the exception of the part of CA related to the decision problem) is represented as a conjunction of SMT constraints. It is important to stress that the EPM module symbolically encodes all runs of the DPN up to a given upper bound in length. This is, however, without loss of generality, as for a given log trace, one can compute an upper bound on the cost of the worst possible alignment [58, 60].
The LT module naturally encodes all the aforementioned uncertainties (if any); in case of uncertain timestamps, it provides an additional encoding to find a suitable ordering of the related events.
All the SMT constraints are then put into one conjunction, which is passed as input to an SMT solver in order to solve the associated decision problem. For the case of alignments, CA encodes a selected distance-based cost function defined as the sum of contributions that only depend on the local discrepancies in the moves of the considered alignment (i.e., inductively), and this is done using a finite set of variables that are later used to extract the optimal alignment solution. However, since there are in principle many possible satisfying solutions, finding the best one amounts to an OMT problem: we seek a satisfying assignment to all the variables in the encoding such that the total cost (represented as a single variable carrying the final result of the aforementioned inductive cost encoding) is minimal. This allows us to find a model run that is the closest one to the trace.
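The following much-simplified sketch (z3py; not the actual CoCoMoT encoding) illustrates the pattern on a single synchronous move: the model guard must hold, and the data penalty, measuring the discrepancy between the logged and the model value, is minimised via OMT.

```python
from z3 import Real, If, Optimize, sat

# One synchronous move: the model guard requires x >= 10, while the log
# records x = 5; the data penalty is |x_model - x_log|, minimised via OMT.
x_model = Real("x_model")
x_log = 5.0

opt = Optimize()
opt.add(x_model >= 10)  # guard of the transition fired in the model run
penalty = If(x_model >= x_log, x_model - x_log, x_log - x_model)
h = opt.minimize(penalty)
if opt.check() == sat:
    print("optimal data penalty:", opt.lower(h))  # 5, with x_model = 10
```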
In the same manner, one can adjust CA by changing the decision problem associated to the specific conformance artifact of interest. Along this line, [60] brings forward encoding modifications to account for multi- and anti-alignments [26, 40]. To capture different results while computing conformance artifacts, one can also modify CA to account for different cost functions. For example, in [59] it is shown that one can employ a generic cost function whose components can be flexibly instantiated to homogeneously account for a variety of concrete situations where uncertainty comes into play.
6.4.3 Dealing with Uncertainty
The approach in [58, 60] has been further extended in [59, 61], dealing with different forms of uncertainty in the input event log, building on [97] and lifting the setting of [69] to the data-aware case. In this extended approach, given a set A of activity labels and a set of attribute names Y, an event is a tuple \((b,\alpha ,\textsf{c},\textsf{LA},\textsf{TS})\), where \(b\in A\), \(\alpha\) is a (partial) payload function associating values to attributes from Y, \(\textsf{c}\in (0,1]\) is a confidence that the event actually happened, \(\textsf{LA}\) is a set of activities with related confidence values, and \(\textsf{TS}\) is a finite set of elements from (or an interval over) a totally ordered set of timestamps. The last three values support the following forms of uncertainty: (i) uncertain events, which come with confidence values describing how certain we are about the fact that recorded events actually happened during the process execution; (ii) uncertain timestamps, which are either coarse or come as intervals, and in turn affect the ordering of the events in the log; (iii) uncertain activities, which are assigned in a bundle (together with corresponding confidence values) to a concrete event when we are not certain which of the activities actually caused it; (iv) uncertain data values which, for an attribute, come as a set or an interval of possible values.
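A minimal sketch of this event structure (plain Python, with assumed field names) is the following:

```python
from dataclasses import dataclass, field

# An uncertain event (b, alpha, c, LA, TS) as described above; the field
# names and the interval encoding of TS are illustrative assumptions.
@dataclass
class UncertainEvent:
    activity: str                 # b: the recorded activity label
    payload: dict                 # alpha: partial map from attributes to values
    confidence: float             # c in (0,1]: did the event happen at all?
    alt_activities: dict = field(default_factory=dict)  # LA: label -> confidence
    timestamps: tuple = (0, 0)    # TS: an interval over ordered timestamps

e = UncertainEvent("bid", {"o": 10.0}, 0.9, {"bid": 0.9, "reset": 0.1}, (2, 4))
print(e.activity, e.timestamps)
```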
6.4.4 Alignments for \(\textsf {data-}{\textsf {LTL}_f}\) Specifications
Conformance checking for data-aware declarative specifications has often been tackled following the seminal approach in [31], which focuses on data-aware variants of Declare, and defines conformance through metrics that count how many times a trace “activates” a constraint of interest, and how many times an activation corresponds to a consequent “satisfying target”. Data-aware alignments have been tackled only in [23], considering the fragment of \(\textsf {data-}{\textsf {LTL}_f}\) formulae restricted to arithmetical datatypes and to variable-to-constant comparison conditions only. The approach builds on the automated planning-based alignment technique for \({\textsf {LTL}_f}\) from [47], dealing with variable-to-constant conditions through an explicit data abstraction technique that propositionalises them in a faithful way. SMT/OMT techniques provide a natural candidate to generalise this approach to richer datatypes and conditions.
6.5 Monitoring Tasks
The general framework in [6] does not only put emphasis on standard, post-mortem process mining tasks conducted on event logs of already concluded process executions. It also includes online tasks of operational decision support, whose goal is to obtain insights on ongoing executions, in turn providing the basis for taking informed countermeasures. While [6] includes predictive process monitoring as an exemplar operational support task, we also include here conformance monitoring, as it naturally connects to the already discussed task of conformance checking, and has been effectively attacked using SMT.
6.5.1 Conformance Monitoring
Conformance checking of declarative process specifications has been extensively studied not only on already completed traces, but also to monitor ongoing executions [85]. In this setting, we speak of conformance monitoring, a task that consists in ascertaining whether the ongoing execution satisfies or violates a given set of temporal constraints.
Interestingly, when such constraints are formalised in \({\textsf {LTL}_f}\), one can deal with a more sophisticated version of this task, known under the name of anticipatory monitoring [82]. The idea of anticipatory monitoring is to verify whether a running trace satisfies a formula, also considering the fact that the trace can continue in any possible way (as we do not yet know what will be its future unfolding). In the field of runtime verification, this led to the definition of different runtime semantics for temporal logics over infinite traces [82], most notably the RV-LTL approach [22], which, given a trace and a formula, refines the two possible outcomes of satisfaction or violation into four possible truth values. Specifically, the verdict of the monitor of a property \(\psi\) at each point in time is one of four possible values: currently satisfied (the formula is satisfied now but may be violated in a future continuation), permanently satisfied (the formula is satisfied now and will necessarily stay so), and the two complementary values of permanent and current violation. This immediately shows that monitoring is as hard as checking satisfiability and validity in the logic of interest.
Anticipatory monitoring has been extensively applied in the context of conformance monitoring of Declare/\({\textsf {LTL}_f}\) [42], exploiting the key advantage that, over finite traces, monitoring is well-defined for every linear temporal formula [46]. This is different from the infinite trace-setting, where there are non-monitorable formulae. In addition, anticipatory monitoring of \({\textsf {LTL}_f}\) can be operationalised in a very elegant way:
first, one constructs the (deterministic) automaton recognising the language of the input formula;
second, every automaton state is assigned to one of the four truth values, depending on whether the state is final or not, and on whether it can reach a non-final or a final state (as sketched below).
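The following sketch (plain Python, over a small hypothetical automaton) illustrates the second step: each state is classified into one of the four truth values via a reachability analysis over final and non-final states.

```python
# Hypothetical deterministic automaton, given as a successor relation.
dfa = {            # state -> set of successor states
    "q0": {"q0", "q1"},
    "q1": {"q1"},  # final sink
    "q2": {"q2"},  # non-final sink (shown for illustration)
}
final = {"q1"}

def reachable(src):
    seen, todo = set(), [src]
    while todo:
        q = todo.pop()
        if q not in seen:
            seen.add(q)
            todo.extend(dfa[q])
    return seen

def verdict(q):
    reach = reachable(q)
    if q in final:
        return "permanently satisfied" if reach <= final else "currently satisfied"
    return "permanently violated" if not (reach & final) else "currently violated"

print({q: verdict(q) for q in dfa})
# {'q0': 'currently violated', 'q1': 'permanently satisfied',
#  'q2': 'permanently violated'}
```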
In [64], this approach has been lifted to the fragment of \(\textsf {data-}{\textsf {LTL}_f}\) that restricts to runs over arithmetic datatypes. To handle this much more convoluted case, [64] extends the construction of monitors to suitably address the arithmetic data conditions contained in the formula of interest. The starting point is the automata construction technique used for correctness checking, as recalled in Sect. 6.2. However, that technique is not enough: for existential verification problems over DPNs, the automaton of the formula need not capture all and only those traces satisfying the formula, as a weaker approximation of this set is enough [76, 81].
Such a weaker approximation is instead not fine-grained enough for anticipatory monitoring, where language-preserving automata are needed. [64] tackles this open problem by first considering formulae without lookahead. In that case, a very similar approach to that used in linear-time verification (cf. Sect. 6.2) can be pursued, yielding solvability of monitoring for all arithmetic theories that have decidable satisfiability. As for properties with lookahead, the situation is more challenging, and runs into even more severe undecidability boundaries than the case of linear-time verification: monitoring solvability is shown to hold for all the cases mentioned in Sect. 6.2, with the exception of gap-order constraints. The full construction is given in [64]. Figure 3 provides an example of a monitor.
Fig. 3 Anticipatory monitor for the \(\textsf {data-}{\textsf {LTL}_f}\) formula \(\psi = (y\,{\ge }\,0) \,\textsf{U}\,(x\,{>}\,y \wedge \textsf{G}\,(x\,{>}\,y))\) (from [96]). \(\Theta\) is a shortcut representing all the constraints used in the automaton. Colors have the following meaning: orange means current violation, red permanent violation, and cyan current satisfaction (there is no state corresponding to permanent satisfaction for this formula)
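At the core of such monitor constructions lies the ability to decide the satisfiability of conjunctions of arithmetic conditions, such as the guards appearing in Fig. 3. The following is a minimal sketch of this building block using the z3 SMT solver [50, 51]; variable names are illustrative.

from z3 import Ints, Solver, sat

x, y = Ints("x y")

def satisfiable(*constraints):
    """Check joint satisfiability of arithmetic conditions via SMT."""
    s = Solver()
    s.add(*constraints)
    return s.check() == sat

# can a payload simultaneously satisfy the guards y >= 0 and x > y?
print(satisfiable(y >= 0, x > y))          # True, e.g. x = 1, y = 0

# is the guard x > y consistent with the observed payload x = 0, y = 3?
print(satisfiable(x > y, x == 0, y == 3))  # False: that transition is disabled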
An interesting line of future work building on the SMT-based anticipatory monitoring framework of [64] is to use it to extend the anticipatory monitoring approach for hybrid process specifications in [11]. There, a hybrid process specification consists of a collection of DPNs enriched with overarching \(\textsf {data-}{\textsf {LTL}_f}\) constraints, but the approach handles data through ad-hoc, faithful data-abstraction techniques defined only for variable-to-constant conditions.
Another open problem is to consider richer datatypes, as done in [77] in the case of verification. We comment on this aspect in Sect. 7.1.
6.5.2 Predictive Monitoring
Predictive process monitoring [55] aims at predicting key indicators about the future of an ongoing process execution, such as its likely outcome, its expected completion time, the next activity to be selected, or even the entire continuation of the trace.
As surveyed in [55], a number of approaches, mainly based on neural networks, have been proposed to tackle this task, considering not only the sequence of activities in a trace, but also the data payload they carry. This makes predictive process monitoring compatible with the notion of data-aware trace given in Sect. 3.
A question that is still open is how models (in the context of this paper, data-aware process specifications) can be employed to improve predictions. A main direction is to use models to encode background knowledge about the process, implicitly defining which traces are possible and which are not. This is then instrumental to constrain predictions, ensuring that they are compatible with the provided background knowledge (e.g., that the predicted suffix conforms to the possible suffixes captured by the background knowledge). However, it is notoriously challenging to integrate hard/logical constraints in neural network architectures, a problem studied under the banner of neurosymbolic artificial intelligence [25, 92].
In the context of process mining, some initial attempts have been made in the direction where background knowledge is expressed using \({\textsf {LTL}_f}\)/Declare. One line of research focuses on teaching neural networks to perform classification [105, 106] or prediction [107] tasks compatibly with the given background constraints. Another line of research deals instead with explainability via counterfactual generation, aiming to produce counterfactual traces that are compatible with background constraints [29, 30]. Dealing with data-aware background knowledge, such as \(\textsf {data-}{\textsf {LTL}_f}\) background constraints, is still an unexplored area.
7 Outlook
As summarised in this article, substantial progress has been made using SMT/OMT solving as a solid foundational and practical basis for tackling key data-aware process mining and analysis tasks: correctness checking and model repair, conformance checking, and (anticipatory) conformance monitoring. Notably, all the techniques overviewed here have been implemented in proof-of-concept prototypes:
https://lindmt.unibz.it/ deals with correctness checking for general DPNs against full \(\textsf {data-}{\textsf {LTL}_f}\) formulae, while https://ltl.adatool.dev and https://ctlstar.adatool.dev respectively tackle \(\textsf {data-}{\textsf {LTL}_f}\) and its CTL\(^*\) extension, restricting the focus to DPNs with arithmetic datatypes.
https://monitoring.adatool.dev tackles anticipatory monitoring for the fragment of \(\textsf {data-}{\textsf {LTL}_f}\) restricted to arithmetic datatypes.
These implementations have been subject to experimental tests mainly aimed at assessing feasibility. A deeper investigation of their performance in real-life settings, as well as further insights into their application, constitutes a natural continuation of this line of research, towards practical adoption.
Also from the foundational point of view, research in data-aware process mining is in its infancy. We close by commenting on six timely lines along which further research is required.
7.1 Dealing with Rich Datatypes
While verification of DPNs has been tackled for very general datatypes with mild model-theoretic assumptions [77], foundational and applied results for conformance checking, model repair, and monitoring are currently limited to arithmetic datatypes. A natural extension is to lift these tasks as well to more general theories. These extensions are challenging and require genuinely novel research: while reasoning with arithmetic datatypes can rely on quantifier elimination techniques, this is not the case for more general theories [37]. A concrete direction to close this gap is to follow the route pursued in [77], in which quantifier elimination methods (previously exploited in [65–67] to deal with arithmetic datatypes) are relaxed to uniform interpolation methods [37, 38].
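For intuition, quantifier elimination rewrites a condition with quantified data variables into an equivalent quantifier-free condition over the remaining variables. The following minimal sketch uses z3's qe tactic; it is purely illustrative and not the construction of [77].

from z3 import Reals, Exists, And, Tactic

x, y = Reals("x y")
# eliminate x from: exists x. (x > 0 and y = 2*x)
phi = Exists([x], And(x > 0, y == 2 * x))
print(Tactic("qe")(phi))  # prints an equivalent quantifier-free condition on y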
7.2 Combined Discovery of Control-Flow and Data
As we have discussed in Sect. 6.1, contemporary techniques for data-aware process discovery first focus on the control-flow and, in a second step, enrich it with data. Developing more tightly integrated techniques, able to discover the control-flow and data perspectives at once, is an interesting avenue of unexplored research, which may lead to discovery algorithms enjoying good properties such as re-discoverability, soundness, and perfect fitness.
7.3 Encodings and True Concurrency
All the presented techniques assume that concurrency is interpreted by interleaving. This raises two open research questions. First, how to enrich the encodings with additional constraints reflecting the concurrency present in the process, which may implicitly break symmetries and, more generally, speed up the approaches when the DPN under scrutiny has a high degree of concurrency (see the sketch below). Second, how to address data-aware process mining while retaining true concurrency.
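As a toy illustration of the first question, a symmetry-breaking constraint can be added directly to the SMT encoding to discard redundant interleavings. The encoding shape and names below are hypothetical, not those of the presented techniques.

from z3 import Ints, Solver, Implies, sat

# fire_0 and fire_1 choose which transition fires at steps 0 and 1;
# transitions 0 and 1 are assumed independent (no shared places or variables)
f0, f1 = Ints("fire_0 fire_1")
s = Solver()
s.add(f0 >= 0, f0 <= 1, f1 >= 0, f1 <= 1)
# of the two equivalent interleavings of the independent transitions,
# keep only the canonical one firing transition 0 first
s.add(Implies(f0 != f1, f0 < f1))
print(s.check() == sat)  # True: the canonical interleaving remains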
7.4 Handling Richer Datatypes in Process Mining
As discussed in this article, an advantage of adopting the SMT approach is that different theories can be seamlessly integrated. In principle, both DPNs and \(\textsf {data-}{\textsf {LTL}_f}\) can indeed be subject to reasoning with datatypes richer than arithmetic, for example dealing with data structures. This is, however, an under-explored area of research in process mining as a whole, starting from the fact that (XES) event logs typically assume that event payloads are specified as simple key-value pairs. In addition, foundational research is missing on how to reassess key process mining tasks, such as conformance checking, in the light of richer datatypes.
7.5 Dealing with Object-Centric Logs
Object-centric process mining is rapidly becoming one of the main novel areas of research in process mining [7, 24]. Some preliminary techniques have been devised to handle discovery [8] and conformance checking [84] for process models equipped with multiple objects related via one-to-many and many-to-many relations, but they are not yet able to deal with essential constructs such as multi-object synchronisation [24], which are supported in formal representations of object-centric processes [12, 73, 98]. We foresee that the SMT/OMT approach can be extremely useful in pushing the boundaries of object-centric mining towards covering these essential constructs, again because SMT/OMT solvers natively support richer datatypes such as uninterpreted functions, arrays, and relations. Such datatypes are natively capable of handling multiple evolving objects and relations, and have already been exploited to capture object-centric [73] and relational data-aware processes [33, 36, 72]. A first step concretely witnessing this potential is [78], where conformance checking of object-centric processes is tackled using SMT/OMT, properly dealing with key modelling constructs that were out of reach before.
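For intuition on why such theories fit the object-centric setting, the following minimal z3 sketch (with purely illustrative names) models a many-to-many relation between orders and items as an uninterpreted Boolean function.

from z3 import Ints, Function, IntSort, BoolSort, Solver, sat

# contains(order, item) holds iff the order contains the item
contains = Function("contains", IntSort(), IntSort(), BoolSort())
o1, o2, i = Ints("o1 o2 i")

s = Solver()
# one shared item related to two distinct orders: a many-to-many snapshot
s.add(contains(o1, i), contains(o2, i), o1 != o2)
print(s.check() == sat)  # True: the theory accommodates shared, evolving objects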
7.6 Neurosymbolic Techniques for Predictions
As pointed out in Sect. 6.5.2, infusing background knowledge captured through data-aware process specifications into predictive monitoring techniques is an interesting, completely open challenge. Neurosymbolic approaches are being increasingly researched in artificial intelligence to provide a solid foundational and algorithmic basis for this type of integration, but only a few approaches deal with event and trace data. A key technical question in this space concerns the logical representation of data-aware processes and their traces, as a common approach to support this integration is to move from a boolean/crisp setting to a fuzzy/probabilistic one. An initial step to realise this shift in a process mining setting is given in [56], where conformance checking of \({\textsf {LTL}_f}\) specifications on fuzzy event logs is studied, without covering the data perspective at all.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Footnotes
In the conclusions, we provide some insights on the connection with so-called object-centric processes, where process executions simultaneously operate over multiple, interrelated objects.
Actually, in [77] \(\textsf {data-}{\textsf {LTL}_f}\) comes with no lookahead variables, which, as explained next, are used to relate different assignments over time. As commented later, while lookahead variables are essential when dealing with monitoring and declarative conformance checking, they are irrelevant in the context of verification, where lookahead can be expressed at the level of the process.
This property is called bounded lookback and generalises the feedback freedom property previously introduced in the context of artifact-centric processes with arithmetic dependencies [45].
References
van der Aalst WMP (2016) Process mining: data science in action. Springer
6. van der Aalst WMP (2022) Process mining: a 360 degree overview. In: van der Aalst WMP, Carmona J (eds) Process mining handbook. Lecture notes in business information processing, vol 448. Springer, pp 3–34
7. van der Aalst WMP (2023) Twin transitions powered by event data - using object-centric process mining to make processes digital and sustainable. In: Gomes L, Leitão P, Lorenz R, van der Werf JMEM, van Zelst SJ (eds) Joint proceedings of the workshop on algorithms & theories for the analysis of event data and the international workshop on petri nets for twin transition. CEUR workshop proceedings, vol 3424. CEUR-WS.org
8. van der Aalst WMP, Berti A (2020) Discovering object-centric petri nets. Fundam Informaticae 175(1–4):1–40
9. van der Aalst WMP, van Hee KM, ter Hofstede AHM, Sidorova N, Verbeek HMW, Voorhoeve M, Wynn MT (2011) Soundness of workflow nets: classification, decidability, and analysis. Formal Aspects Comput 23(3):333–363
10. Adriansyah A (2014) Aligning observed and modeled behavior. Ph.D. thesis, Technische Universiteit Eindhoven
11. Alman A, Maggi FM, Montali M, Patrizi F, Rivkin A (2023) A framework for modeling, executing, and monitoring hybrid multi-process specifications with bounded global-local memory. Inf Syst 119:102271
12. Artale A, Kovtunova A, Montali M, van der Aalst WMP (2019) Modeling and reasoning over declarative data-aware processes with object-centric behavioral constraints. In: Hildebrandt TT, van Dongen BF, Röglinger M, Mendling J (eds) Proceedings of the 17th international conference on business process management (BPM 2019). Lecture notes in computer science, vol 11675. Springer, pp 139–156
13. Artale A, Mazzullo A, Ozaki A (2024) First-order temporal logic on finite traces: semantic properties, decidable fragments, and applications. ACM Trans Comput Log 25(2):13:1–13:43
14. Bagheri Hariri B, Calvanese D, De Giacomo G, Deutsch A, Montali M (2013) Verification of relational data-centric dynamic systems with external services. In: Proceedings of PODS 2013, pp 163–174
15. Baier C, Katoen J (2008) Principles of model checking. MIT Press
16. Baral C, Giacomo GD (2015) Knowledge representation and reasoning: what's hot. In: Bonet B, Koenig S (eds) Proceedings of the twenty-ninth AAAI conference on artificial intelligence (AAAI 2015). AAAI Press, pp 4316–4317
17. Barbosa H, Barrett CW, Brain M, Kremer G, Lachnitt H, Mann M, Mohamed A, Mohamed M, Niemetz A, Nötzli A, Ozdemir A, Preiner M, Reynolds A, Sheng Y, Tinelli C, Zohar Y (2022) cvc5: a versatile and industrial-strength SMT solver. In: Fisman D, Rosu G (eds) Proceedings of the 28th international conference on tools and algorithms for the construction and analysis of systems (TACAS 2022). Lecture notes in computer science, vol 13243. Springer, pp 415–442
Barrett CW, Sebastiani R, Seshia SA, Tinelli C (2021) Satisfiability modulo theories. In: Biere A, Heule M, van Maaren H, Walsh T (eds) Handbook of satisfiability, 2nd edn. Frontiers in artificial intelligence and applications, vol 336. IOS Press, pp 1267–1329
20. Barrett CW, Tinelli C (2018) Satisfiability modulo theories. In: Handbook of model checking. Springer, pp 305–343
21. Batoulis K, Haarmann S, Weske M (2017) Various notions of soundness for decision-aware business processes. In: Mayr HC, Guizzardi G, Ma H, Pastor O (eds) Proceedings of the 36th international conference on conceptual modeling (ER 2017). Lecture notes in computer science, vol 10650. Springer, pp 403–418
Bergami G, Maggi FM, Marrella A, Montali M (2021) Aligning data-aware declarative process models and event logs. In: Polyvyanyy A, Wynn MT, Van Looy A, Reichert M (eds) Proceedings of the 19th international conference on business process management (BPM 2021). Lecture notes in computer science, vol 12875. Springer, pp 235–251
24. Berti A, Montali M, van der Aalst WMP (2023) Advancements and challenges in object-centric process mining: a systematic literature review. CoRR abs/2311.08795. https://doi.org/10.48550/ARXIV.2311.08795
25. Besold TR, d'Avila Garcez AS, Bader S, Bowman H, Domingos PM, Hitzler P, Kühnberger K, Lamb LC, Lima PMV, de Penning L, Pinkas G, Poon H, Zaverucha G (2021) Neural-symbolic learning and reasoning: a survey and interpretation. In: Hitzler P, Sarker MK (eds) Neuro-symbolic artificial intelligence: the state of the art. Frontiers in artificial intelligence and applications, vol 342. IOS Press, pp 1–51
26. Boltenhagen M, Chatain T, Carmona J (2019) Encoding conformance checking artefacts in SAT. In: Proceedings of the business process management workshops 2019. Lecture notes in business information processing, vol 362. Springer, pp 160–171. https://doi.org/10.1007/978-3-030-37453-2_14
Bose RPJC, van der Aalst WMP (2012) Process diagnostics using trace alignment: opportunities, issues, and challenges. Inf Syst 37(2):117–141
29. Buliga A, Di Francescomarino C, Ghidini C, Donadello I, Maggi FM (2024) Guiding the generation of counterfactual explanations through temporal background knowledge for predictive process monitoring. CoRR abs/2403.11642. https://doi.org/10.48550/ARXIV.2403.11642
30. Buliga A, Di Francescomarino C, Ghidini C, Montali M, Ronzani M (2025) Generating counterfactual explanations under temporal constraints. In: Proceedings of the 39th AAAI conference on artificial intelligence (AAAI 2025). AAAI Press. To appear
31. Burattin A, Maggi FM, Sperduti A (2016) Conformance checking based on multi-perspective declarative process models. Expert Syst Appl 65:194–211
32. Calvanese D, de Giacomo G, Montali M (2013) Foundations of data-aware process analysis: a database theory perspective. In: Proceedings of the 32nd PODS, pp 1–12
33. Calvanese D, Ghilardi S, Gianola A, Montali M, Rivkin A (2018) Verification of data-aware processes via array-based systems (extended version). CoRR abs/1806.11459. http://arxiv.org/abs/1806.11459
34. Calvanese D, Ghilardi S, Gianola A, Montali M, Rivkin A (2019) Formal modeling and SMT-based parameterized verification of data-aware BPMN. In: Proceedings of BPM 2019. Lecture notes in computer science, vol 11675. Springer, pp 157–175
35. Calvanese D, Ghilardi S, Gianola A, Montali M, Rivkin A (2019) Verification of data-aware processes: challenges and opportunities for automated reasoning. In: Proceedings of ARCADE 2019. EPTCS, vol 311
Calvanese D, Ghilardi S, Gianola A, Montali M, Rivkin A (2021) Model completeness, uniform interpolants and superposition calculus (with applications to verification of data-aware processes). J Autom Reason 65(7):941–969. https://doi.org/10.1007/S10817-021-09596-X
Calvanese D, Giacomo GD, Montali M, Patrizi F (2022) Verification and monitoring for first-order LTL with persistence-preserving quantification over finite and infinite traces. In: Raedt LD (ed) Proceedings of the 31st international joint conference on artificial intelligence (IJCAI 2022). ijcai.org, pp 2553–2560
40. Chatain T, Carmona J (2016) Anti-alignments in conformance checking - the dark side of process models. In: Proceedings of the 37th Petri Nets. Lecture notes in computer science, vol 9698. Springer, pp 240–258. https://doi.org/10.1007/978-3-319-39086-4_15
41. Ciccio CD, Maggi FM, Montali M, Mendling J (2017) Resolving inconsistencies and redundancies in declarative process models. Inf Syst 64:425–446
42. Ciccio CD, Montali M (2022) Declarative process specifications: reasoning, discovery, monitoring. In: van der Aalst WMP, Carmona J (eds) Process mining handbook. Lecture notes in business information processing, vol 448. Springer, pp 108–152
43. Cimatti A, Griggio A, Schaafsma BJ, Sebastiani R (2013) The MathSAT5 SMT solver. In: Proceedings of TACAS 2013. Lecture notes in computer science, vol 7795. Springer, pp 93–107
44. Corea C, Kuhlmann I, Thimm M, Grant J (2024) Paraconsistent reasoning for inconsistency measurement in declarative process specifications. Inf Syst 122:102347
45. Damaggio E, Deutsch A, Vianu V (2012) Artifact systems with data dependencies and arithmetic. ACM Trans Database Syst 37(3):1–36
46. De Giacomo G, De Masellis R, Maggi FM, Montali M (2022) Monitoring constraints and metaconstraints with temporal logics on finite traces. ACM Trans Softw Eng Methodol 31(4):1–44
47. De Giacomo G, Maggi FM, Marrella A, Patrizi F (2017) On the disruptive effectiveness of automated planning for LTLf-based trace alignment. In: Singh S, Markovitch S (eds) Proceedings of the thirty-first AAAI conference on artificial intelligence (AAAI 2017). AAAI Press, pp 3555–3561
48. de Giacomo G, Vardi MY (2013) Linear temporal logic and linear dynamic logic on finite traces. In: Proceedings of the 23rd IJCAI, pp 854–860
49. de Leoni M, Felli P, Montali M (2018) A holistic approach for soundness verification of decision-aware process models. In: Proceedings of ER 2018. Lecture notes in computer science, vol 11157. Springer, pp 219–235
50. de Moura L, Bjørner N (2008) Z3: an efficient SMT solver. In: Proceedings of the 14th TACAS. Lecture notes in computer science, vol 4963. Springer, pp 337–340
51. de Moura L, Bjørner N (2008) Z3: an efficient SMT solver. In: Proceedings of TACAS 2008. Lecture notes in computer science, vol 4963. Springer, pp 337–340
52. Demri S, Lazic R (2009) LTL with the freeze quantifier and register automata. ACM Trans Comput Log 10(3):1–30
53. Deutsch A, Hull R, Li Y, Vianu V (2018) Automatic verification of database-centric systems. ACM SIGLOG News 5(2):37–56
54. Deutsch A, Li Y, Vianu V (2019) Verification of hierarchical artifact systems. ACM Trans Database Syst 44(3):1–68
55. Di Francescomarino C, Ghidini C (2022) Predictive process monitoring. In: van der Aalst WMP, Carmona J (eds) Process mining handbook. Lecture notes in business information processing, vol 448. Springer, pp 320–346
56. Donadello I, Felli P, Innes C, Maggi FM, Montali M (2024) Conformance checking of fuzzy logs against declarative temporal specifications. In: Marrella A, Resinas M, Jans M, Rosemann M (eds) Proceedings of the 22nd international conference on business process management (BPM 2024). Lecture notes in computer science, vol 14940. Springer, pp 39–56
Felli P, Gianola A, Montali M, Rivkin A, Winkler S (2021) CoCoMoT: conformance checking of multi-perspective processes via SMT. In: Proceedings of BPM 2021. Lecture notes in computer science, vol 12875. Springer, pp 217–234
59. Felli P, Gianola A, Montali M, Rivkin A, Winkler S (2022) Conformance checking with uncertainty via SMT. In: Ciccio CD, Dijkman RM, del-Rio-Ortega A, Rinderle-Ma S (eds) Proceedings of the 20th international conference on business process management (BPM 2022). Lecture notes in computer science, vol 13420. Springer, pp 199–216
Felli P, de Leoni M, Montali M (2019) Soundness verification of decision-aware process models with variable-to-variable conditions. In: Proceedings of ACSD 2019. IEEE, pp 82–91
63. Felli P, de Leoni M, Montali M (2019) Soundness verification of decision-aware process models with variable-to-variable conditions. In: Proceedings of the 19th ACSD. IEEE, pp 82–91
64. Felli P, Montali M, Patrizi F, Winkler S (2023) Monitoring arithmetic temporal properties on finite traces. In: Williams B, Chen Y, Neville J (eds) Proceedings of the 37th AAAI conference on artificial intelligence (AAAI 2023). AAAI Press, pp 6346–6354. https://doi.org/10.1609/AAAI.V37I5.25781
65. Felli P, Montali M, Winkler S (2022) CTL\(^*\) model checking for data-aware dynamic systems with arithmetic. In: Blanchette J, Kovács L, Pattinson D (eds) Proceedings of the 11th international joint conference on automated reasoning (IJCAR 2022). Lecture notes in computer science, vol 13385. Springer, pp 36–56
66. Felli P, Montali M, Winkler S (2022) Linear-time verification of data-aware dynamic systems with arithmetic. In: Proceedings of the 36th AAAI conference on artificial intelligence (AAAI 2022). AAAI Press, pp 5642–5650
67. Felli P, Montali M, Winkler S (2022) Soundness of data-aware processes with arithmetic conditions. In: Franch X, Poels G, Gailly F, Snoeck M (eds) Proceedings of the 34th international conference on advanced information systems engineering (CAiSE 2022). Lecture notes in computer science, vol 13295. Springer, pp 389–406
68. Felli P, Montali M, Winkler S (2023) Repairing soundness properties in data-aware processes. In: Proceedings of the 5th international conference on process mining (ICPM 2023). IEEE, pp 41–48. https://doi.org/10.1109/ICPM60904.2023.10271969
69. Gal A (2023) Everything there is to know about stochastically known logs. In: Proceedings of the 5th international conference on process mining (ICPM 2023). IEEE, pp xvii–xxiii
70. Geatti L, Gianola A, Gigante N (2022) Linear temporal logic modulo theories over finite traces. In: Raedt LD (ed) Proceedings of the 31st international joint conference on artificial intelligence (IJCAI 2022). ijcai.org, pp 2641–2647
71. Geatti L, Gianola A, Gigante N, Winkler S (2023) Decidable fragments of LTL\(_f\) modulo theories. In: Gal K, Nowé A, Nalepa GJ, Fairstein R, Radulescu R (eds) Proceedings of ECAI 2023 - 26th European conference on artificial intelligence. Frontiers in artificial intelligence and applications, vol 372. IOS Press, pp 811–818. https://doi.org/10.3233/FAIA230348
72. Ghilardi S, Gianola A, Montali M, Rivkin A (2021) Delta-BPMN: a concrete language and verifier for data-aware BPMN. In: Proceedings of BPM 2021. Lecture notes in computer science, vol 12875. Springer, pp 179–196
Ghilardi S, Gianola A, Montali M, Rivkin A (2023) Safety verification and universal invariants for relational action bases. In: Proceedings of the 32nd international joint conference on artificial intelligence (IJCAI 2023). ijcai.org, pp 3248–3257
Gianola A, Montali M, Winkler S (2023) Linear-time verification of data-aware processes modulo theories via covers and automata (extended version). CoRR abs/2310.12180. https://doi.org/10.48550/ARXIV.2310.12180
77. Gianola A, Montali M, Winkler S (2024) Linear-time verification of data-aware processes modulo theories via covers and automata. In: Wooldridge MJ, Dy JG, Natarajan S (eds) Proceedings of the 38th AAAI conference on artificial intelligence (AAAI 2024). AAAI Press, pp 10525–10534
78. Gianola A, Montali M, Winkler S (2024) Object-centric conformance alignments with synchronization. In: Guizzardi G, Santoro FM, Mouratidis H, Soffer P (eds) Proceedings of the 36th international conference on advanced information systems engineering (CAiSE 2024). Lecture notes in computer science, vol 14663. Springer, pp 3–19
79. Leno V, Dumas M, Maggi FM, La Rosa M, Polyvyanyy A (2020) Automated discovery of declarative process models with correlated data conditions. Inf Syst 89:101482
80. de Leoni M, Felli P, Montali M (2018) A holistic approach for soundness verification of decision-aware process models. In: Proceedings of the 37th ER. Lecture notes in computer science, vol 11157. Springer, pp 219–235
81. de Leoni M, Felli P, Montali M (2020) Strategy synthesis for data-aware dynamic systems with multiple actors. In: Proceedings of the 17th KR, pp 315–325
Li Y, Deutsch A, Vianu V (2017) VERIFAS: a practical verifier for artifact systems. PVLDB 11(3)
84. Liss L, Adams JN, van der Aalst WMP (2023) Object-centric alignments. In: Almeida JPA, Borbinha J, Guizzardi G, Link S, Zdravkovic J (eds) Proceedings of the 42nd international conference on conceptual modeling (ER 2023). Lecture notes in computer science, vol 14320. Springer, pp 201–219
85. Ly LT, Maggi FM, Montali M, Rinderle-Ma S, van der Aalst WMP (2015) Compliance monitoring in business processes: functionalities, application, and tool-support. Inf Syst 54:209–234
86. Maggi FM, Dumas M, García-Bañuelos L, Montali M (2013) Discovering data-aware declarative process models from event logs. In: Daniel F, Wang J, Weber B (eds) Proceedings of the 11th international conference on business process management (BPM 2013). Lecture notes in computer science, vol 8094. Springer, pp 81–96
87. Maggi FM, Westergaard M, Montali M, van der Aalst WMP (2011) Runtime verification of LTL-based declarative process models. In: Khurshid S, Sen K (eds) Proceedings of the 2nd international conference on runtime verification (RV 2011). Lecture notes in computer science, vol 7186. Springer, pp 131–146
88. Mannhardt F (2018) Multi-perspective process mining. Ph.D. thesis, Technical University of Eindhoven
89. Mannhardt F, de Leoni M, Reijers HA, van der Aalst W (2016) Decision mining revisited: discovering overlapping rules. In: Proceedings of the 28th CAiSE. Lecture notes in computer science, vol 9694. Springer, pp 377–392
90. Mannhardt F, de Leoni M, Reijers H, van der Aalst W (2016) Balanced multi-perspective checking of process conformance. Computing 98(4):407–437
91. Mannhardt F, de Leoni M, Reijers HA, van der Aalst WMP (2016) Balanced multi-perspective checking of process conformance. Computing 98(4):407–437
92. Marra G, Dumancic S, Manhaeve R, De Raedt L (2024) From statistical relational to neurosymbolic artificial intelligence: a survey. Artif Intell 328:104062
Montali M (2024) AI for declarative processes: representation, mining, synthesis. In: Endriss U, Melo FS, Bach K, Bugarín Diz AJ, Alonso-Moral JM, Barro S, Heintz F (eds) Proceedings of the 27th European conference on artificial intelligence (ECAI 2024). Frontiers in artificial intelligence and applications, vol 392. IOS Press, pp 17–24
95. Montali M, Chesani F, Mello P, Maggi FM (2013) Towards data-aware constraints in declare. In: Shin SY, Maldonado JC (eds) Proceedings of the 28th annual ACM symposium on applied computing (SAC '13). ACM, pp 1391–1396
96. Montali M, Winkler S (2024) Relating behaviour of data-aware process models. Inf Syst 102363
Polyvyanyy A, van der Werf JMEM, Overbeek S, Brouwers R (2019) Information systems modeling: language, verification, and tool support. In: Giorgini P, Weber B (eds) Proceedings of the 31st international conference on advanced information systems engineering (CAiSE 2019). Lecture notes in computer science, vol 11483. Springer, pp 194–212. https://doi.org/10.1007/978-3-030-21290-2_13
99. Reichert M (2012) Process and data: two sides of the same coin? In: Proceedings of OTM 2012. Lecture notes in computer science, vol 7565. Springer, pp 2–19
100. Roveri M, Ciccio CD, Di Francescomarino C, Ghidini C (2024) Computing unsatisfiable cores for LTLf specifications. J Artif Intell Res 80:517–558
Steinau S, Marrella A, Andrews K, Leotta F, Mecella M, Reichert M (2019) DALEC: a framework for the systematic evaluation of data-centric approaches to process management software. Softw Syst Model 18(4):2679–2716
104. Sturm C, Schönig S, Ciccio CD (2017) Distributed multi-perspective declare discovery. In: Clarisó R, Leopold H, Mendling J, van der Aalst WMP, Kumar A, Pentland BT, Weske M (eds) Proceedings of the BPM demo track and BPM dissertation award. CEUR workshop proceedings, vol 1920. CEUR-WS.org
105. Umili E, Argenziano F, Capobianco R (2024) Neural reward machines. In: Endriss U, Melo FS, Bach K, Bugarín Diz AJ, Alonso-Moral JM, Barro S, Heintz F (eds) Proceedings of the 27th European conference on artificial intelligence (ECAI 2024). Frontiers in artificial intelligence and applications, vol 392. IOS Press, pp 3055–3062
106. Umili E, Capobianco R, De Giacomo G (2023) Grounding LTLf specifications in image sequences. In: Marquis P, Son TC, Kern-Isberner G (eds) Proceedings of the 20th international conference on principles of knowledge representation and reasoning (KR 2023), pp 668–678
107. Umili E, Licks GP, Patrizi F (2024) Enhancing deep sequence generation with logical temporal knowledge. In: Giacomo GD, Fionda V, Fournier F, Ielo A, Limonad L, Montali M (eds) Proceedings of the 3rd international workshop on process management in the AI era (PMAI 2024). CEUR workshop proceedings, vol 3779. CEUR-WS.org, pp 23–34
108. Zavatteri M, Bresolin D, de Leoni M (2023) Repair of unsound data-aware process models. In: De Weerdt J, Pufahl L (eds) Proceedings of the 2023 business process management workshops. Lecture notes in business information processing, vol 492. Springer, pp 383–395