Predicting process behaviour using deep learning
Introduction
Being able to predict the future behaviour of a business process is an important business capability [1]. As an application of predictive analytics in business process management, process prediction exploits data on past process instances to make predictions about current ones [2]. Example use cases are customer service agents responding to inquiries about the remaining time until a case is resolved, production managers predicting the completion time of a production process for better planning and higher utilization, or case managers identifying likely compliance violations to mitigate business risk.
We present a novel approach to predicting the next process event using deep learning. Although “deep learning” has only recently become a popular research topic, it is essentially an application of neural networks and thus looks back on a long history of research [3]. Recent innovations both in algorithms, allowing novel neural network architectures, and in computing hardware, especially GPU processing, have led to a resurgence of interest in neural networks and popularized the term “deep learning” [4]. Our approach is motivated by applications of neural networks to natural language processing (NLP), more specifically the prediction of the next word in a sentence [5], [6], [7]. By interpreting process event logs as text, process traces as sentences, and process events as words, these techniques can be applied to predict future process events. The contribution of our research is threefold:
1. We improve on the state-of-the-art in process event prediction. Our results show our method has considerably better precision on next-event prediction.
2. We demonstrate that an explicit process model is not necessary for prediction. Deep learning models, where the process structure is only implicitly reflected, can perform as well as explicit process models.
3. We contribute to process management in general by showcasing the useful application of an artificial intelligence approach, illustrating that business process management can benefit from the application of smart approaches.
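The interpretation of event logs as text, traces as sentences, and events as words can be illustrated with a minimal sketch. The toy log, the activity names, the `<eos>` end-of-case marker, and the helper names `traces_from_log`, `build_vocabulary`, and `next_event_pairs` are assumptions made for illustration; they are not taken from the implementation described in this paper.

```python
from collections import defaultdict

# Hypothetical toy event log: (case id, activity) rows in time order.
EVENT_LOG = [
    ("case1", "submit"), ("case2", "submit"), ("case1", "review"),
    ("case1", "approve"), ("case2", "review"), ("case2", "reject"),
]

END_OF_CASE = "<eos>"  # assumed marker for the end of a trace


def traces_from_log(log):
    """Group events by case: each trace plays the role of a sentence."""
    traces = defaultdict(list)
    for case_id, activity in log:
        traces[case_id].append(activity)
    return {case: events + [END_OF_CASE] for case, events in traces.items()}


def build_vocabulary(traces):
    """Map each distinct event (a 'word') to an integer id."""
    words = sorted({event for trace in traces.values() for event in trace})
    return {word: index for index, word in enumerate(words)}


def next_event_pairs(trace, vocabulary):
    """Pair each non-empty prefix of a trace with the event that follows it."""
    ids = [vocabulary[event] for event in trace]
    return [(ids[:i], ids[i]) for i in range(1, len(ids))]


traces = traces_from_log(EVENT_LOG)
vocabulary = build_vocabulary(traces)
pairs = next_event_pairs(traces["case1"], vocabulary)
```

Each pair in `pairs` is one training example for a next-word style predictor: the prefix of a case observed so far, and the event that actually followed it.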
Our research is located at the intersection of business process management, in particular process mining, and artificial intelligence (AI) and machine learning. We bring together historic process data with an AI learning technology to leverage real-time case management, opening new perspectives into process execution, monitoring, and analysis. Extending existing solutions to novel problems (“exaptation”) is a recognized and valid way to make a contribution in design science [8], which is the research approach we apply here. We not only provide a new approach, rooted in AI, to predicting the next process event, but also give a proof-of-concept regarding its feasibility and experimentally explore its efficiency and effectiveness, thus making a valuable contribution to the field of “Smart BPM”.
This paper is a significant extension of earlier work [9], adding more advanced neural network cells, separation of training and validation samples for cross-validation to prevent overfitting, empirical assessment of the effect of different neural network parameters, prediction not only of next events but of case remainders, interpretation and visualization of neural network states, encoding of timing information, and an extended discussion of the similarities and differences between natural language processing and process event prediction.
Related work
Process prediction covers an array of different techniques, objectives, and data sources. It extends process mining from a post-hoc analysis method to operational decision support [10]. Most existing process prediction research focuses on prediction of process outcomes, primarily the remaining time to completion, rather than prediction of the next event in a process, as we do here. Only five approaches are concerned with predicting the next event [2], [11], [12], [13], [14], all of which use an
Implementation
A number of software frameworks for deep learning have become available recently [26]. We implement our approach using TensorFlow, as it offers a suitable level of abstraction, provides RNN-specific functionality, and can be used on high-performance parallel, cluster, and GPU computing platforms. Code, data, and complete results are available from the corresponding author.
Our network features an architecture as in Fig. 1 with two hidden RNN layers using
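To make the idea of two stacked hidden recurrent layers concrete, the following is a minimal forward-pass sketch in plain Python rather than TensorFlow. It rests on several assumptions: the cell type is elided in the text above, so simple tanh (Elman) units stand in for it; events are one-hot encoded; the weights are random and untrained; and all names (`rnn_step`, `predict_next`, `hidden_size`) are hypothetical.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_step(w_in, w_rec, x, h):
    """One simple recurrent step: h' = tanh(W_in x + W_rec h)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_next(event_ids, vocab_size, hidden_size=8, seed=0):
    """Run a trace prefix through two stacked recurrent layers and return
    a probability distribution over the next event (untrained weights)."""
    rng = random.Random(seed)
    w_in1 = make_matrix(hidden_size, vocab_size, rng)
    w_rec1 = make_matrix(hidden_size, hidden_size, rng)
    w_in2 = make_matrix(hidden_size, hidden_size, rng)
    w_rec2 = make_matrix(hidden_size, hidden_size, rng)
    w_out = make_matrix(vocab_size, hidden_size, rng)

    h1 = [0.0] * hidden_size
    h2 = [0.0] * hidden_size
    for event_id in event_ids:
        one_hot = [1.0 if i == event_id else 0.0 for i in range(vocab_size)]
        h1 = rnn_step(w_in1, w_rec1, one_hot, h1)  # first hidden layer
        h2 = rnn_step(w_in2, w_rec2, h1, h2)       # second hidden layer
    return softmax(matvec(w_out, h2))


# Hypothetical prefix of two event ids from a vocabulary of five events.
probabilities = predict_next([4, 3], vocab_size=5)
```

The second layer consumes the hidden state of the first at every step, which is what stacking means here; training would adjust the weight matrices so that the output distribution favours the event that actually follows each prefix.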
Data
To provide a compelling evaluation of our approach, it should be compared to the state-of-the-art in next event prediction. Of the related work discussed in Section 2, only Breuker et al. [2] make an implementation publicly available and use publicly available data, demonstrating “open research” [28]. We contacted all authors of the remaining papers twice, but did not receive software or data to use for comparative evaluation.
We use the same datasets as Breuker et al. [2]. The BPI Challenge
Experimental results
We train the RNN for 100 epochs on each training dataset. Fig. 6 plots training and validation precision for each epoch for a selection of our datasets (averaged across all 10 training folds). The plot shows that 100 training epochs are sufficient for optimal and stable results. The small BPI 2012 A and O datasets converge quickly to a high precision, whereas this occurs more gradually for the BPI 2013 datasets. Moreover, the BPI 2012 A, BPI 2012 O, and BPI 2013 Problem datasets with relatively
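The averaging of per-epoch curves across 10 folds can be sketched as follows. This is only an illustration of the bookkeeping: the helper names `k_fold_splits` and `mean_curve`, the interleaved fold assignment, and the example scores are assumptions, and the scores are made-up numbers rather than results from this paper.

```python
def k_fold_splits(items, k=10):
    """Yield (training, validation) lists; every item is validated exactly once."""
    folds = [items[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, validation


def mean_curve(curves):
    """Average per-epoch scores across folds (all curves have equal length)."""
    return [sum(scores) / len(scores) for scores in zip(*curves)]


# Hypothetical trace identifiers standing in for a dataset of cases.
trace_ids = [f"trace{i}" for i in range(25)]
splits = list(k_fold_splits(trace_ids, k=10))

# Made-up per-epoch validation scores for two folds, three epochs each.
fold_curves = [[0.5, 0.7, 0.8], [0.7, 0.9, 1.0]]
averaged = mean_curve(fold_curves)
```

In a full experiment each fold would contribute one curve with one score per training epoch, and `mean_curve` would produce the averaged line that is plotted per dataset.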
Discussion and conclusion
This paper introduced the use of deep learning for process prediction. Our approach does not rely on explicit process models and can be applied when models do not exist or are difficult to obtain. Our results, surpassing or close to the state-of-the-art and with cross-validated precision in excess of 80% on many problems, demonstrate the feasibility and usefulness of this approach.
While one can perform prediction by mining a model from event logs, and mining decision rules for each process
Acknowledgments
The authors gratefully acknowledge the support of the Memorial University Center for Health Informatics and Analytics, St. John's, Canada and the Hasso-Plattner-Institute at the University of Potsdam, Germany, in providing access to computing resources.
References (41)
Deep learning in neural networks: an overview. Neural Netw. (2015)
BPM-in-the-large - towards a higher level of abstraction in business process management
Comprehensible predictive models for business processes. MIS Q. (2016)
Deep learning. Nature (2015)
Generating text with recurrent neural networks
Generating sequences with recurrent neural networks. CoRR abs/1308. (2013)
Recurrent neural network regularization. CoRR abs/1409. (2014)
Positioning and presenting design science research for maximum impact. MIS Q. (2013)
A deep learning approach for predicting process behaviour at runtime
Beyond process mining: from the past to present and future
Completion time and next activity prediction of processes using sequential pattern mining
A Markov prediction model for data-driven semi-structured business processes. Knowl. Inf. Syst.
A hybrid model for business process event prediction
Leveraging path information to generate predictions for parallel business processes. Knowl. Inf. Syst.
Business Process Intelligence 2012 Challenge Data Set
Business Process Intelligence 2013 Challenge Data Set (Closed Problems)
Business Process Intelligence 2013 Challenge Data Set (Incident Management)
Supervised Sequence Labelling With Recurrent Neural Networks
Long short-term memory. Neural Comput.
Dr. Joerg Evermann received his PhD in Information Systems from the University of British Columbia. Prior to becoming a faculty member at Memorial University, Dr. Evermann was a lecturer in Information Systems with the School of Information Management at the University of Wellington, New Zealand. Dr. Evermann's interests are in business process management, statistical research methods, and information integration. Dr. Evermann has published his research in more than 70 peer-reviewed publications. His work has appeared in high-quality journals, such as IEEE Transactions on Services Computing, IEEE Transactions on Software Engineering, IEEE Transactions on Knowledge and Data Engineering, Organizational Research Methods, Structural Equation Modeling, Journal of the AIS, Information Systems, and Information Systems Journal. Dr. Evermann has presented his work at international conferences and workshops, including ICIS, AMCIS, CAiSE, and ER.
Jana-Rebecca Rehse works as a researcher at the Institute for Information Systems (IWi) at the German Research Center for Artificial Intelligence (DFKI). Before that, she had been a research assistant at the same institute since 2011. Jana obtained a Bachelor's Degree in Information Systems from Saarland University, Saarbrücken, Germany in 2012 and a consecutive Master's Degree in 2015. In 2014, she spent six months as a visiting research scholar at Stevens Institute of Technology in Hoboken, NJ, where she conducted research for her Master's thesis. Jana's research interests include Business Process Management, in particular Reference Modeling, Process Mining and Design Science. She is interested in finding algorithmic solutions to practically relevant problems. The findings from her research have been published in outlets such as the International Journal on Software and Systems Modeling (SoSyM) and in various conference proceedings (e.g. WI, ECIS, and BPM Workshops).
Peter Fettke works as a professor for Business Informatics at Saarland University and is a principal researcher at the German Research Center for Artificial Intelligence (DFKI), both in Saarbrücken, Germany. He is conducting research in the field of Business Informatics/Information Systems, an important, relatively new discipline at the intersection of Computer Science and Business Administration. His research interests focus on business process management and technologies and include business information systems modeling, business engineering, applications, and philosophy of information systems. He uses a broad spectrum of research methods comprising engineering methods/design science and empirical/experimental research approaches. Peter obtained a Master's Degree in Business Informatics from the University of Münster, Germany, a Ph.D. Degree in Business Informatics from the Johannes Gutenberg-University Mainz, Germany, and a Habilitation Degree in Business Informatics from Saarland University, Germany. In 2013 he became a DFKI Research Fellow. Peter has taught and researched previously at the Technical University of Chemnitz and the University of Mainz, Germany.