Detection and prediction of errors in EPCs of the SAP reference model
Introduction
There has been extensive work on formal foundations of conceptual process modeling and respective languages. However, little quantitative research has been reported on the actual use of conceptual modeling in practice [1]. Moreover, literature typically discusses and analyzes languages rather than evaluating enterprise models at a larger scale (i.e., beyond “toy examples”). A fundamental problem in this context is that large enterprise models are in general not accessible for research as they represent valuable company knowledge that enterprises do not want to reveal. In particular, this problem affects research on reference models, i.e., models that capture generic design that is meant to be reused as best practice recommendation in future modeling projects. Accordingly, it is so far neither clear how many errors can be expected in real-life business process models; nor is it clear why modelers introduce errors in process models.
One case of a model that is, at least partially, publicly available is the SAP reference model. It has been described in [2], [3] and is referred to in many research papers (see e.g. [4], [5], [6], [7], [8]). The SAP reference model was meant to be used as a blueprint for roll-out projects of SAP’s ERP system. It reflects Version 4.6 of SAP R/3, which was marketed in 2000. The extensive database of this reference model contains almost 10,000 sub-models, several of them EPC business process models [2], [9], [3]. Building on recently developed techniques to verify the formal correctness of EPC models as reported in [10], we aim to acquire knowledge about how many formal modeling errors can be expected in a large repository of process models in practice, assuming that the SAP reference model can be regarded as a representative example. We will map all EPCs in the SAP reference model onto YAWL models [11] and use the WofYAWL tool [10] to verify their correctness using the relaxed soundness criterion [12], [13]. In a relaxed sound process there is a proper execution sequence for every element, but proper completion is not guaranteed. We have to stress that this analysis yields a lower bound for errors, since there are process models that are relaxed sound but not correct against the more restrictive soundness criterion [14]. To be more precise, our analysis covers only formal control flow errors that affect relaxed soundness. Beyond verification of formal correctness, a process model must also be validated to make sure that all real-world scenarios are handled as expected [15]. Since WofYAWL cannot check whether real-world processes are modeled appropriately, validation is not the subject of our analysis. As a consequence, it has to be expected that there are more errors than those that we actually identify using the WofYAWL verification approach.
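The relaxed soundness criterion can be illustrated with a small sketch. The code below is not WofYAWL and does not operate on YAWL or EPC models; it assumes a plain Petri-net-style encoding (a hypothetical dictionary mapping each transition name to its input and output places, with a single source place `i` and a single sink place `o`) and a bounded state space. It explores all reachable markings and reports the transitions that lie on no execution sequence from the initial marking to the final marking. An empty result means the net is relaxed sound under these assumptions.

```python
from collections import Counter, deque


def freeze(marking):
    """Turn a Counter of tokens into a hashable marking."""
    return frozenset((place, n) for place, n in marking.items() if n > 0)


def uncovered_transitions(net, source="i", sink="o", max_states=10_000):
    """Return the transitions that are on no proper execution sequence.

    net: dict mapping transition name -> (input places, output places).
    This encoding and the state limit are assumptions made for this sketch.
    """
    start = freeze(Counter([source]))
    goal = freeze(Counter([sink]))
    edges = []  # (marking, transition, successor marking)
    seen = {start}
    queue = deque([start])
    while queue:
        marking = queue.popleft()
        tokens = Counter(dict(marking))
        for name, (inputs, outputs) in net.items():
            need = Counter(inputs)
            if any(tokens[p] < k for p, k in need.items()):
                continue  # transition is not enabled in this marking
            successor = freeze(tokens - need + Counter(outputs))
            edges.append((marking, name, successor))
            if successor not in seen:
                if len(seen) >= max_states:
                    raise RuntimeError("state space too large for this sketch")
                seen.add(successor)
                queue.append(successor)

    # Markings from which the final marking [o] can still be reached.
    can_finish = {goal} if goal in seen else set()
    changed = True
    while changed:
        changed = False
        for marking, _, successor in edges:
            if successor in can_finish and marking not in can_finish:
                can_finish.add(marking)
                changed = True

    # A transition is covered if firing it can still lead to proper completion.
    covered = {name for _, name, successor in edges if successor in can_finish}
    return set(net) - covered


# AND-split followed by AND-join: every transition can take part in a proper run.
ok_net = {
    "and_split": (["i"], ["p1", "p2"]),
    "task_a": (["p1"], ["p3"]),
    "task_b": (["p2"], ["p4"]),
    "and_join": (["p3", "p4"], ["o"]),
}

# XOR-split followed by AND-join: the join waits for a token that never arrives.
bad_net = {
    "xor_a": (["i"], ["p1"]),
    "xor_b": (["i"], ["p2"]),
    "and_join": (["p1", "p2"], ["o"]),
}

print(uncovered_transitions(ok_net))
print(uncovered_transitions(bad_net))
```

The second example mirrors a typical control flow mismatch between split and join connectors. Because the check only asks for one good execution per element, it is weaker than soundness, which is why the error counts obtained this way are a lower bound.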
It is a fundamental insight of software engineering that errors should be detected as early as possible in order to minimize development cost (see e.g. [16], [17]). Therefore, it is important to understand why and under which circumstances errors occur. Several studies in software engineering have examined complexity metrics as determinants of errors (see e.g. [18], [19], [20], [21], [22]). A similar hypothesis, namely that complexity is a driver for errors, has recently been formulated in [23] in the context of business process modeling. Yet, there is no evidence to support it. Even the measurement of business process complexity is still poorly understood. We will use the sample of the 604 EPC business process models of the SAP reference model to test whether errors in terms of relaxed soundness can be statistically explained by complexity metrics.
The remainder of this article is organized as follows. Section 2 describes the design of our quantitative study. In particular, we discuss the mapping of EPCs from the SAP reference model to YAWL models, the analysis techniques employed by WofYAWL, and the identification of how the models can be corrected. In Section 3 we focus on the analysis of the EPCs in the SAP reference model. First, we calculate descriptive statistics that allow us to get a comprehensive inventory of errors in the SAP reference model. Secondly, we investigate the hypothesis that more complicated models have more errors. This hypothesis was suggested in [23], and we analyze it using different complexity measures and by testing whether they are able to explain the variance of errors, i.e. how errors are distributed across EPCs with different measures. The results allow us to conclude which complexity metrics are well suited to explain error variance and that the impact of complexity on error probability is significant. Subsequently, we discuss our findings in the light of related research (Section 4) and conclude with a summary of our contribution and its limitations (Section 5).
Detection of errors in EPCs
In this section, we present the way we evaluated the SAP reference model. In Section 2.1, we start with an introduction to EPCs with the help of an example that we also use to illustrate the verification. As an input for the different analysis steps, we use the ARIS XML export of the reference model (see Fig. 1). In a first step, the EPC to YAWL transformation program generates a YAWL XML file for each EPC in the
Prediction of errors in the SAP reference model
Using the approach depicted in Fig. 1 we analyze the SAP reference model. First of all, we locate the parts of the reference model where errors occur most frequently (Section 3.1). Second, in Section 3.2, we formulate hypotheses relating correctness to properties of the EPC (e.g., larger models are more likely to contain errors). Finally, we test these hypotheses using logistic regression (Section 3.3).
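The kind of hypothesis test described here can be sketched as a logistic regression of an error flag on a model property. The example below is only an illustration under stated assumptions: the predictor (model size as a node count), the ten data points, and the plain gradient-ascent fitting routine are all invented for this sketch and are not the study's metrics, data, or estimation procedure.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit P(error) = sigmoid(b0 + b1 * z), with z the standardized predictor."""
    mean = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    zs = [(x - mean) / sd for x in xs]
    b0 = b1 = 0.0
    for _ in range(steps):
        grad0 = grad1 = 0.0
        for z, y in zip(zs, ys):
            residual = y - sigmoid(b0 + b1 * z)
            grad0 += residual
            grad1 += residual * z
        # Gradient ascent on the average log-likelihood.
        b0 += learning_rate * grad0 / len(xs)
        b1 += learning_rate * grad1 / len(xs)
    return b0, b1, mean, sd


def predict(size, params):
    b0, b1, mean, sd = params
    return sigmoid(b0 + b1 * (size - mean) / sd)


# Invented toy data: model size (number of nodes) and whether an error was found.
sizes = [5, 8, 10, 12, 15, 20, 25, 30, 40, 50]
has_error = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

params = fit_logistic(sizes, has_error)
print("slope on standardized size:", params[1])
print("estimated error probability for a 50-node model:", predict(50, params))
```

A positive slope would be read as support for the hypothesis that larger models are more likely to contain errors. In practice a statistics package would also be needed to obtain significance tests and goodness-of-fit measures, which this sketch omits.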
Related research
This section discusses the work most closely related to ours in two research areas: verification (Section 4.1) and quantitative analysis in process modeling (Section 4.2).
Contributions and limitations
In this article, we presented an approach to automatically identify errors in the SAP reference model. This formal analysis builds on a mapping from EPCs to YAWL and the analysis tool WofYAWL. It is one of the few studies using formal methods for quantitative research. We provided an in-depth analysis of errors in the SAP reference model which yields a lower bound for the number of errors (5.6% of the 604 EPCs). As far as we know, this is the first systematic analysis of the EPCs in the SAP
References (52)
- et al. How do practitioners use conceptual modeling in practice? Data & Knowledge Engineering (2006)
- et al. A configurable reference modelling language. Information Systems (2007)
- et al. YAWL: yet another workflow language. Information Systems (2005)
- On the semantics of EPCs: resolving the vicious circle. Data and Knowledge Engineering (2006)
- et al. Workflow support for electronic commerce applications. Decision Support Systems (2002)
- et al. Verification of workflow task structures: a Petri-net-based approach. Information Systems (2000)
- Formalization and verification of event-driven process chains. Information and Software Technology (1999)
- et al. SAP R/3 Business Blueprint: Understanding the Business Process Reference Model (1997)
- et al. SAP(R) R/3 Process Oriented Implementation: Iterative Process Prototyping (1998)
- et al. Classification of reference models – a methodology and its application. Information Systems and e-Business Management (2003)
- Workflow-supported organizational memory systems: an industrial application
- Tool support for the collaborative design of reference models – a business engineering perspective
- Verifying workflows with cancellation regions and or-joins: an approach based on relaxed soundness and invariants. The Computer Journal
- Relaxed soundness of business processes
- Bridging the gap between business models and workflow specifications. International Journal of Cooperative Information Systems
- On the suitability of correctness criteria for business process models
- Research commentary: workflow management issues in e-business. Information Systems Research
- Software Engineering Economics
- Research commentary: information systems and conceptual modeling – a research agenda. Information Systems Research
- A complexity measure. IEEE Transactions on Software Engineering
- Design complexity measurement and testing. Communications of the ACM
- Elements of software science
- Software structure metrics based on information-flow. IEEE Transactions on Software Engineering
- Quantitative analysis of faults and failures in a complex software system. IEEE Transactions on Software Engineering
J. Mendling is a postdoctoral research fellow at the BPM Cluster in the Faculty of Information Technology at Queensland University of Technology, Brisbane, Australia. He received a PhD degree from the Vienna University of Economics and Business Administration, Austria. His research interests include business process management, enterprise modeling, and workflow standardization. He is co-author of the EPC Markup Language (EPML) and co-organizer of the XML4BPM workshop series. He has published several international journal and conference papers, and served in several program committees. He holds a diploma degree both in business computer science and in business administration from the University of Trier, Germany.
H.M.W. Verbeek is a scientific engineer at the Information Systems group of the department of Mathematics and Computer Science at the Technische Universiteit Eindhoven, where he also received his PhD. His research interests include workflow management, business process management, and process verification.
B.F. van Dongen is a postdoctoral researcher in the Information Systems group of the Department of Mathematics and Computer Science of Eindhoven University of Technology, Eindhoven, The Netherlands. He received his Ph.D. in 2007, after successfully defending his thesis entitled “Process Mining and Verification”. Currently, his research interests extend from process mining and process verification to supporting flexible processes and visualization of research results. Furthermore, he plays an important role in the development of the open-source process mining framework ProM, freely available from www.processmining.org.
W.M.P. van der Aalst is a full professor of Information Systems at the Technische Universiteit Eindhoven (TU/e) having a position in both the Department of Mathematics and Computer Science and the department of Technology Management. Currently he is also an adjunct professor at Queensland University of Technology (QUT) working within the BPM group. His research interests include workflow management, process mining, Petri nets, business process management, process modeling, and process analysis. He has published more than 60 journal papers, 10 books (as author or editor), 150 refereed conference publications, and 20 book chapters. He has been a co-chair of many conferences including the International Conference on Cooperative Information Systems, the International conference on the Application and Theory of Petri Nets, and the Business Process Management conference, and is an editor/member of the editorial board of several journals, including the Business Process Management Journal, the International Journal of Business Process Integration and Management, the International Journal on Enterprise Modelling and Information Systems Architectures, and Computers in Industry.
G. Neumann is Chair of Information Systems and New Media at the University of Economics and Business Administration (WU) in Vienna, Austria. Before joining WU he was Chair of the department of Information Systems and Software Techniques at the University of Essen. Gustaf Neumann is a native of Vienna, Austria. He joined the faculty of WU in 1983 as Assistant Professor at the MIS department and served as head of the research group for Logic Programming and Intelligent Information Systems. Before becoming a full professor at the University of Essen, he worked for 5 years as a scientist at IBM’s T.J. Watson Research Center in Yorktown Heights, NY, in the field of deductive databases and object orientation. Gustaf Neumann has received several research awards and published books and papers in the areas of program transformation, data modeling, and information systems technology. He has developed several widely used open source products and is author of the scripting language XOTcl. He is or has been the scientific lead of several EC IST projects and a member of the Steering Board of the Network of Excellence ProLearn. He also heads the Learn@WU project, which is one of the most intensively used e-learning platforms worldwide.