2019 | Book

Advanced Information Systems Engineering

31st International Conference, CAiSE 2019, Rome, Italy, June 3–7, 2019, Proceedings

About this Book

This book constitutes the refereed proceedings of the 31st International Conference on Advanced Information Systems Engineering, CAiSE 2019, held in Rome, Italy, in June 2019.

The 41 full papers presented in this volume were carefully reviewed and selected from 206 submissions. The book also contains one invited talk in full paper length.
The papers were organized in topical sections named: information system engineering; requirements and modeling; data modeling and analysis; business process modeling and engineering; information system security; and learning and mining in information systems.
Abstracts of the CAiSE 2019 tutorials can be found in the back matter of the volume.

Table of Contents

Frontmatter

Invited Keynote Talk

Frontmatter
Direct and Reverse Rewriting in Data Interoperability

Data interoperability refers to the issue of accessing and processing data from multiple sources in order to create more holistic and contextual information for improving data analysis, for better decision-making, and for accountability purposes. In the transition towards a data-driven society, the notion of data interoperability is of paramount importance. Looking at the research work of the last decades, several types of data interoperability scenarios have emerged, including the following.

Maurizio Lenzerini

Information System Engineering

Frontmatter
Efficient Engineering Data Exchange in Multi-disciplinary Systems Engineering

In the parallel engineering of industrial production systems, domain experts from several disciplines need to exchange data efficiently to prevent the divergence of local engineering models. However, data synchronization is hard (a) as it may be unclear what data consumers need and (b) due to the heterogeneity of local engineering artifacts. In this paper, we introduce use cases and a process for efficient Engineering Data Exchange (EDEx) that guides the definition and semantic mapping of data elements for exchange and facilitates the frequent synchronization between domain experts. We identify the main elements of an EDEx information system to automate the EDEx process. We evaluate the effectiveness and effort of the EDEx process and concepts in a feasibility case study with requirements and data from real-world use cases at a large production system engineering company. The domain experts found the EDEx process more effective and the EDEx operation more efficient than the traditional point-to-point process, and found that it provides insight for advanced analyses.

Stefan Biffl, Arndt Lüder, Felix Rinker, Laura Waltersdorfer
Bing-CF-IDF+: A Semantics-Driven News Recommender System

With the ever-growing amount of news on the Web, the need for automatically finding relevant content increases. Semantics-driven news recommender systems suggest unread items to users by matching user profiles, which are based on information found in previously read articles, with emerging news. This paper proposes an extension to the state-of-the-art semantics-driven CF-IDF+ news recommender system, which uses identified news item concepts and their related concepts for constructing user profiles and processing unread news messages. Due to its domain specificity and reliance on knowledge bases, such a concept-based recommender neglects many highly frequent named entities found in news items, which contain relevant information about a news item’s content. Therefore, we extend the CF-IDF+ recommender by adding information found in named entities, through the employment of a Bing-based distance measure. Our Bing-CF-IDF+ recommender outperforms the classic TF-IDF and the concept-based CF-IDF and CF-IDF+ recommenders in terms of the F1-score and the Kappa statistic.

Emma Brocken, Aron Hartveld, Emma de Koning, Thomas van Noort, Frederik Hogenboom, Flavius Frasincar, Tarmo Robal
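
The abstract above does not spell out the Bing-based distance; a minimal sketch in the spirit of the normalized web distance of Cilibrasi and Vitányi, computed from hypothetical search-engine hit counts, might look as follows (the paper's actual measure may differ):

```python
from math import log

def normalized_web_distance(hits_x: int, hits_y: int, hits_xy: int, total_pages: int) -> float:
    """Normalized web distance between two terms from search-engine hit counts.

    hits_x / hits_y: result counts for each term alone (assumed > 0),
    hits_xy: result count for both terms together (assumed > 0),
    total_pages: (estimated) number of indexed pages.
    Returns values near 0 for strongly co-occurring terms, larger for unrelated ones.
    """
    fx, fy, fxy = log(hits_x), log(hits_y), log(hits_xy)
    return (max(fx, fy) - fxy) / (log(total_pages) - min(fx, fy))

# Hypothetical counts: named entities that co-occur often are considered similar.
print(normalized_web_distance(9_000_000, 7_500_000, 3_000_000, 50_000_000_000))
```
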
Methodological Framework to Guide the Development of Continual Evolution Methods

Companies operate in a fast-changing environment that forces them to evolve constantly in order to stay competitive. Such an evolution is carried out through continuous improvement cycles or radical changes, often based on innovation, that concern their products, their processes, their internal organization, etc. We refer to this situation as continual evolution. There are two implications of such continual evolution from our viewpoint: (a) the instillation of the “no end point” philosophy in organizations and (b) the use of methods based (1) on continual evolution cycles (as opposed to project-based approaches that have delimited budgets and dates) and (2) on the autonomous and collective implication of the organization’s actors. This article presents a methodological framework, called the As-Is/As-If framework, to support method engineers in handling such continual evolution. The framework offers a process model and a product meta-model that are both reusable instruments, aiming to guide the construction of continual evolution methods. The process model and product meta-model can be seen as prototypical examples to be adapted to each situation at hand using heuristics proposed as part of the framework. The usefulness of the framework is illustrated through two method adaptations.

Ornela Cela, Mario Cortes-Cornax, Agnès Front, Dominique Rieu
Inter-organizational Integration in the AEC/FM Industry
Exploring the “Addressed” and “Unaddressed” Information Exchange Needs Between Stakeholders

This paper explores how the needs to exchange information across organizational boundaries in the Architecture, Engineering and Construction, and Facilities Management industry in Finland have been satisfied by means of stakeholder integration at the technical, business and socio-organizational levels. We interviewed practitioners about their motivations and goals for inter-organizational integration and observed different discourses. The information exchange needs in the context of individual building projects were often described as “addressed”. These needs focused mainly on managing complex stakeholder relations or handling the variable conditions with other building projects. In the scope of the whole built environment lifecycle, the needs were rather portrayed as ongoing problems still “unaddressed”. Existing information sources remained inadequate when the benefits of inter-organizational integration had not yet been clarified. The process workflow discontinuities demanded better understanding of the value of information beyond design as well as better coordination. The uncertainty of how much data to collect and for what purposes can be mitigated by defining “useful minimum” information exchange between stakeholders.

José Carlos Camposano, Kari Smolander
A Lightweight Framework for Multi-device Integration and Multi-sensor Fusion to Explore Driver Distraction

Driver distraction is a major challenge in road traffic and a major cause of accidents. The vehicle industry dedicates increasing amounts of resources to better quantify the various activities of drivers resulting in distraction. The literature has shown that significant causes of driver distraction are tasks performed by drivers that are not related to driving, like using multimedia interfaces or glancing at co-drivers. One key aspect of the successful implementation of distraction prevention mechanisms is to know when the driver performs such auxiliary tasks. Therefore, capturing these tasks with appropriate measurement equipment is crucial. Especially novel quantification approaches combining data from different sensors and devices are necessary for comprehensively determining causes of driver distraction. However, as a literature review has revealed, there is currently a lack of lightweight frameworks for multi-device integration and multi-sensor fusion that enable cost-effective and minimally obtrusive driver monitoring with respect to scalability and extendibility. This paper presents such a lightweight framework, which has been implemented in a demonstrator and applied in a small real-world study involving ten drivers performing simple distraction tasks. Preliminary results of our analysis indicate a high accuracy of distraction detection for individual distraction tasks and thus the framework’s usefulness. The gained knowledge can be used to develop improved mechanisms for detecting driver distraction through better quantification of distracting tasks.

Gernot Lechner, Michael Fellmann, Andreas Festl, Christian Kaiser, Tahir Emre Kalayci, Michael Spitzer, Alexander Stocker
Exhaustive Simulation and Test Generation Using fUML Activity Diagrams

The quality of the specifications used for test generation plays an important role in the quality of the generated tests. One approach to improve the quality of the UML specification is the use of executable models specified using the Foundational Subset for Executable UML Models (fUML) and the Action language for fUML (Alf). Due to their precise semantics, fUML and Alf models can be simulated or executed using an fUML execution engine. However, in order to execute the models exhaustively, one must provide input data required to reach and cover all essential elements not only in the graphical fUML models, but also in the textual Alf code associated with the graphical models. In this paper, we present an approach for exhaustive simulation and test generation from fUML activity diagrams containing Alf code. The proposed approach translates fUML activity diagrams and associated Alf code into equivalent Java code and then automatically generates: (1) input data needed to cover or execute all paths in the executable fUML and Alf models and (2) test cases and test oracle (expected output) for testing the actual implementation of the system under development. We also present a tool chain and demonstrate our proposed approach with the help of an example.

Junaid Iqbal, Adnan Ashraf, Dragos Truscan, Ivan Porres
A Block-Free Distributed Ledger for P2P Energy Trading: Case with IOTA?

Across the world, the organisation and operation of electricity markets are quickly changing, moving towards decentralised, distributed, renewables-based generation with real-time, data exchange-based solutions. In order to support this change, blockchain-based distributed ledgers have been proposed for the implementation of peer-to-peer energy trading platforms. However, blockchain solutions suffer from scalability problems as well as from delays in transaction confirmation. This paper explores the feasibility of using IOTA’s DAG-based, block-free distributed ledger for the implementation of energy trading platforms. Our agent-based simulation research demonstrates that an IOTA-like DAG-based solution could overcome the constraints that blockchains face in the energy market. However, to be usable for peer-to-peer energy trading, even DAG-based platforms need to consider the specificities of energy trading markets (such as structured trading periods and assured confirmation of transactions for every completed period).

Joon Park, Ruzanna Chitchyan, Anastasia Angelopoulou, Jordan Murkin
Profile Reconciliation Through Dynamic Activities Across Social Networks

Since today’s online social media serve diverse purposes such as social and professional networking, photo sharing, and blogging, it is not uncommon for people to have multiple profiles across different social networks. Finding or reconciling these profiles would allow the creation of a holistic view of the different facets of a person’s life that can be used by recommender systems, human resource management, and marketing activities, and would also raise awareness about the potential threats to a person’s privacy. In this paper, we propose a new approach for reconciling profiles based on their temporal activity (i.e., timestamped posts) shared across similar-scope social networks. The timestamped posts are compared by considering different dynamic attributes originating from what the user shares (geographical data, text, tags, and photos) and static attributes (username and real name). Our evaluation on Flickr and Twitter social network datasets shows that temporal activity is a good predictor of whether or not two profiles refer to the same user.

Suela Isaj, Nacéra Bennacer Seghouani, Gianluca Quercini

Requirements and Modeling

Frontmatter
Towards an Ontology-Based Approach for Eliciting Possible Solutions to Non-Functional Requirements

Requirements Engineering plays a crucial role in the software development process. Many works have pointed out that Non-Functional Requirements (NFRs) are critical to the quality of software systems. NFRs, also known as quality requirements, can be difficult to elicit due to their subjective and diverse nature. In this paper, we introduce the QR Framework, which uses an ontology-based approach to support the collection of knowledge on possible solutions to implement NFRs, semi-automatically connecting related NFRs. Preliminary search mechanisms are provided in a tool to facilitate the identification of possible solutions to an NFR and their consequences for other solutions and/or other NFRs. To evaluate whether our approach aids in eliciting NFRs, we conducted a controlled experiment enacting a software development scenario. Our results suggest that reusing NFR knowledge can help software engineers obtain a more complete set of possible solutions to address quality concerns.

Rodrigo Veleda, Luiz Marcio Cysneiros
Using a Modelling Language to Describe the Quality of Life Goals of People Living with Dementia

Although now well established, our information systems engineering theories and methods are applied only rarely in disciplines beyond systems development. This paper reports the application of the i* goal modelling language to describe the types of and relationships between quality of life goals of people living with dementia. Published social care frameworks to manage and improve the lives of people with dementia were reviewed to synthesize, for the first time, a comprehensive conceptual model of the types of goals of people living with dementia. Although the quality of life goal model was developed in order to construct automated reasoning capabilities in a new digital toolset that people with dementia can use for life planning, the multi-stage modelling exercise provided valuable insights into quality of life and dementia care practices of both researchers and experienced practitioners in the field.

James Lockerbie, Neil Maiden
Multi-platform Chatbot Modeling and Deployment with the Jarvis Framework

Chatbot applications are increasingly adopted in various domains such as e-commerce or customer services as a direct communication channel between companies and end-users. Multiple frameworks have been developed to ease their definition and deployment. They typically rely on existing cloud infrastructures and artificial intelligence techniques to efficiently process user inputs and extract conversation information. While these frameworks are efficient for designing simple chatbot applications, they still require advanced technical knowledge to define complex conversations and interactions. In addition, the deployment of a chatbot application usually requires a deep understanding of the targeted platforms, increasing the development and maintenance costs. In this paper, we introduce the Jarvis framework, which tackles these issues by providing a Domain Specific Language (DSL) to define chatbots in a platform-independent way, and a runtime engine that automatically deploys the chatbot application and manages the defined conversation logic. Jarvis is open source and fully available online.

Gwendal Daniel, Jordi Cabot, Laurent Deruelle, Mustapha Derras
Information Systems Modeling: Language, Verification, and Tool Support

Information and processes are both important aspects of information systems. Nevertheless, most existing languages for modeling information systems focus either on one or the other. Languages that focus on information modeling often neglect the fact that information is manipulated by processes, while languages that focus on processes abstract from the structure of the information. In this paper, we present an approach for the modeling and verification of information systems that combines information models and process models using an automated theorem prover. In our approach, set theory and first-order logic are used to express the structure and constraints of information, while Petri nets of a special kind, called Petri nets with identifiers, are used to capture the dynamic aspects of the systems. The proposed approach exhibits a unique balance between expressiveness and formal foundation: it allows capturing a wide range of information systems, including infinite-state systems, while allowing for automated verification, as it ensures the decidability of the reachability problem. The approach was implemented in a publicly available modeling and simulation tool and used in the teaching of Information Systems students.

Artem Polyvyanyy, Jan Martijn E. M. van der Werf, Sietse Overbeek, Rick Brouwers
Expert2Vec: Experts Representation in Community Question Answering for Question Routing

Communities of Question Answering (CQAs) are rapidly growing communities for exchanging information in the form of questions and answers. They rely on the contributions of users (i.e., members of the community) who have appropriate domain knowledge and can provide helpful answers. In order to deliver the most appropriate and valuable answers, the identification of such users (experts) is critically important. However, a common problem faced in CQAs is that of poor expertise matching, i.e., the routing of questions to inappropriate users. In this paper, we focus on Stack Overflow (a programming CQA) and address this problem by proposing an embedding-based approach that integrates users’ textual content obtained from the community (e.g., answers) and community feedback in a unified framework. Our embedding-based approach finds the most relevant users for a given question by computing the similarity between questions and our user expertise representation. Then, our framework exploits feedback from the community to rank the relevant users according to their expertise. We experimentally evaluate the performance of the proposed approach using Stack Overflow’s dataset, compare it with state-of-the-art models, and demonstrate that it can produce better results than the alternative models.

Sara Mumtaz, Carlos Rodriguez, Boualem Benatallah
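
The matching step described above reduces to ranking users by the similarity between a question embedding and user-expertise vectors, weighted by community feedback; a minimal sketch, in which all names and the multiplicative feedback weighting are illustrative assumptions rather than the paper's exact formulation, follows:

```python
import numpy as np

def rank_experts(question_vec, expert_vecs, feedback_scores):
    """Rank candidate experts for a question.

    question_vec: embedding of the question, shape (d,)
    expert_vecs: dict user -> expertise embedding, shape (d,)
    feedback_scores: dict user -> community feedback weight, assumed in [0, 1]
    """
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    scored = {
        user: cosine(question_vec, vec) * feedback_scores.get(user, 0.5)
        for user, vec in expert_vecs.items()
    }
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)

# Toy data: random embeddings stand in for learned question/user representations.
rng = np.random.default_rng(0)
question = rng.normal(size=16)
experts = {u: rng.normal(size=16) for u in ("alice", "bob", "carol")}
print(rank_experts(question, experts, {"alice": 0.9, "bob": 0.4}))
```
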
A Pattern Language for Value Modeling in ArchiMate

In recent years, there has been a growing interest in modeling value in the context of Enterprise Architecture, which has been driven by a need to align the vision and strategic goals of an enterprise with its business architecture. Nevertheless, the current literature shows that the concept of value is conceptually complex and still causes a lot of confusion. For example, we can find in the literature the concept of value being taken as equivalent to notions as disparate as goals, events, objects and capabilities. As a result, there is still a lack of proper support for modeling all aspects of value as well as its relations to these aforementioned notions. To address this issue, we propose in this paper a pattern language for value modeling in ArchiMate, which is based on the Common Ontology of Value and Risk, a well-founded reference ontology developed following the principles of the Unified Foundational Ontology. This enables us to delineate a clear ontological foundation, which addresses the ambiguous use of the value concept. The design of the Value Pattern Language is guided by the Design Science Research Methodology. More specifically, a first iteration of the build-and-evaluate loop is presented, which includes the development of the pattern language and its demonstration by means of a case study of a low-cost airline.

Tiago Prince Sales, Ben Roelens, Geert Poels, Giancarlo Guizzardi, Nicola Guarino, John Mylopoulos
Paving Ontological Foundation for Social Engineering Analysis

System security analysis has been focusing on technology-based attacks while paying less attention to social perspectives. As a result, social engineering is becoming a more and more serious threat to socio-technical systems, in which humans play important roles. However, due to the interdisciplinary nature of social engineering, there is a lack of consensus on its definition, hindering the further development of this research field. In this paper, we propose a comprehensive and fundamental ontology of social engineering, with the purpose of promoting the development of this field. In particular, we first review and compare existing social engineering taxonomies in order to summarize the core concepts and boundaries of social engineering, as well as identify corresponding research challenges. We then define a comprehensive social engineering ontology, which is embedded with extensive knowledge from psychology and sociology, providing a full picture of social engineering. The ontology is built on top of existing security ontologies in order to align social engineering analysis with typical security analysis as much as possible. By formalizing this ontology using Description Logic, we provide unambiguous definitions for the core concepts of social engineering, serving as a fundamental terminology to facilitate research within this field. Finally, our ontology is evaluated on a collection of existing social engineering attacks, the results of which indicate the good expressiveness of our ontology.

Tong Li, Yeming Ni
Improving Traceability Links Recovery in Process Models Through an Ontological Expansion of Requirements

Often, when requirements are written, parts of the domain knowledge are assumed by the domain experts and not formalized in writing, but nevertheless used to build software artifacts. This issue, known as tacit knowledge, affects the performance of Traceability Links Recovery (TLR). Through this work, we propose LORE, a novel approach that uses Natural Language Processing techniques along with an Ontological Requirements Expansion process to minimize the impact of tacit knowledge on TLR over process models. We evaluated our approach through a real-world industrial case study, comparing its outcomes against those of a baseline. Results show that our approach retrieves improved results for all the measured performance indicators. We studied why this is the case and identified some issues that affect LORE, leaving room for improvement opportunities. We make an open-source implementation of LORE publicly available in order to facilitate its adoption in future studies.

Raúl Lapeña, Francisca Pérez, Carlos Cetina, Óscar Pastor
Requirements Engineering for Cyber Physical Production Systems

Traditional manufacturing and production systems are in the throes of a digital transformation. By blending the real and virtual production worlds, it is now possible to connect all parts of the production process: devices, products, processes, systems and people, in an informational ecosystem. This paper examines the underpinning issues that characterise the challenges of transforming traditional manufacturing into a Cyber Physical Production System. Such a transformation constitutes a major endeavour for requirements engineers, who need to identify, specify and analyse the effects of transforming a multitude of assets into a network of collaborating devices, information sources, and human actors. The paper reports on the e-CORE approach, which is a systematic, analytical and traceable approach to Requirements Engineering, and demonstrates its utility using an industrial-size application. It also considers the effect of Cyber Physical Production Systems on future approaches to requirements in dealing with the dynamic nature of such systems.

Pericles Loucopoulos, Evangelia Kavakli, Natalia Chechina

Data Modeling and Analysis

Frontmatter
A Fourth Normal Form for Uncertain Data

Relational database design addresses applications for data that is certain. Modern applications require the handling of uncertain data; indeed, one dimension of big data is veracity. Ideally, the design of databases helps users quantify their trust in the data. For that purpose, we need to establish a design framework that handles responsibly any knowledge of an organization about the uncertainty in its data. Naturally, such knowledge helps us find database designs that process data more efficiently. In this paper, we apply possibility theory to introduce the class of possibilistic multivalued dependencies, which are a significant source of data redundancy. Redundant data may occur with different degrees, derived from the different degrees of uncertainty in the data. We propose a family of fourth normal forms for uncertain data. We justify our proposal by showing that its members characterize schemata that are free from any redundant data occurrences in any of their instances at the targeted level of uncertainty in the data. We show how to automatically transform any schema into one that satisfies our proposal, without loss of any information. Our results are founded on axiomatic and algorithmic solutions to the implication problem of possibilistic functional and multivalued dependencies, which we also establish.

Ziheng Wei, Sebastian Link
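
For orientation, the classical condition that this family of normal forms generalizes can be stated compactly; the possibilistic reading sketched below, which restricts the condition to dependencies holding with at least a given certainty degree, is a schematic paraphrase rather than the paper's exact definition:

```latex
% Classical 4NF: every nontrivial multivalued dependency implied by \Sigma
% must have a superkey as its left-hand side.
R \text{ is in 4NF w.r.t. } \Sigma \iff
  \forall\, X \twoheadrightarrow Y \in \Sigma^{+} \text{ nontrivial}: \; X \to R \in \Sigma^{+}
% Possibilistic sketch: parameterize by a certainty degree \beta and require
% the same condition only for dependencies holding with certainty at least \beta.
```
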
Revealing the Conceptual Schemas of RDF Datasets

RDF-based datasets, thanks to their semantic richness, variety and fine granularity, are increasingly used by both research and business communities. However, these datasets suffer from a lack of completeness, as their content evolves continuously and data contributors are loosely constrained by the vocabularies and schemes related to the data sources. Conceptual schemas have long been recognized as a key mechanism for understanding and dealing with complex real-world systems. In the context of the Web of Data and user-generated content, the conceptual schema is implicit. In fact, each data contributor has an implicit personal model that is not known by the other contributors. Consequently, revealing a meaningful conceptual schema is a challenging task that should take into account the data and the intended usage. In this paper, we propose a completeness-based approach for revealing the conceptual schemas of RDF data. We combine quality evaluation and data mining approaches to find a conceptual schema for a dataset that meets user expectations regarding data completeness constraints. To achieve this, we propose LOD-CM, a web-based completeness demonstrator for linked datasets.

Subhi Issa, Pierre-Henri Paris, Fayçal Hamdi, Samira Si-Said Cherfi
Modeling and In-Database Management of Relational, Data-Aware Processes

It is known that the engineering of information systems usually requires a huge effort in integrating master data and business processes. Existing approaches, both from academia and the industry, typically come with ad-hoc abstractions to represent and interact with the data component. This has two disadvantages: (i) an existing database (DB) cannot be effortlessly enriched with dynamics; (ii) such approaches generally do not allow for integrated modelling, verification, and enactment. We attack these two challenges by proposing a declarative approach, fully grounded in SQL, that supports the agile modelling of relational data-aware processes directly on top of relational DBs. We show how this approach can be automatically translated into a concrete procedural SQL dialect, executable directly inside any relational DB engine. The translation exploits an in-database representation of process states that, in turn, is used to handle, at once, process enactment with or without logging of the executed instances, as well as process verification. The approach has been implemented in a working prototype.

Diego Calvanese, Marco Montali, Fabio Patrizi, Andrey Rivkin
D2IA: Stream Analytics on User-Defined Event Intervals

Nowadays, modern Big Stream Processing Solutions (e.g. Spark, Flink) are working towards ultimate frameworks for streaming analytics. In order to achieve this goal, they started to offer extensions of SQL that incorporate stream-oriented primitives such as windowing and Complex Event Processing (CEP). The former enables stateful computation on infinite sequences of data items, while the latter focuses on the detection of event patterns. In most cases, data items and events are considered instantaneous, i.e., they are single time points in a discrete temporal domain. Nevertheless, a point-based time semantics does not satisfy the requirements of a number of use-cases. For instance, it is not possible to detect the interval during which the temperature increases until the temperature begins to decrease, nor all the relations this interval subsumes. To tackle this challenge, we present D2IA, a set of novel abstract operators to define analytics on user-defined event intervals based on raw events and to efficiently reason about temporal relationships between intervals and/or point events. We realize the implementation of the concepts of D2IA on top of Esper, a centralized stream processing system, and Flink, a distributed stream processing engine for big data.

Ahmed Awad, Riccardo Tommasini, Mahmoud Kamel, Emanuele Della Valle, Sherif Sakr
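
The temperature example in the abstract, detecting the interval during which a value rises, can be sketched without a stream engine; the following illustrative Python derives such intervals from point events and tests one Allen relation (D2IA itself provides declarative operators on Esper and Flink):

```python
def rising_intervals(points):
    """Collapse timestamped readings into maximal intervals of rising values.

    points: list of (timestamp, value), ordered by timestamp.
    Returns a list of (start_ts, end_ts) intervals.
    """
    intervals, start = [], None
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v1 > v0:
            start = t0 if start is None else start
        elif start is not None:
            intervals.append((start, t0))
            start = None
    if start is not None:
        intervals.append((start, points[-1][0]))
    return intervals

def during(inner, outer):
    """Allen's 'during' relation between two (start, end) intervals."""
    return outer[0] < inner[0] and inner[1] < outer[1]

readings = [(0, 20.1), (1, 20.4), (2, 21.0), (3, 20.8), (4, 21.5)]
print(rising_intervals(readings))  # [(0, 2), (3, 4)]
```
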

Business Process Modeling and Engineering

Frontmatter
Extracting Declarative Process Models from Natural Language

Process models are an important means to capture information on organizational operations and often represent the starting point for process analysis and improvement. Since the manual elicitation and creation of process models is a time-intensive endeavor, a variety of techniques have been developed that automatically derive process models from textual process descriptions. However, these techniques, so far, only focus on the extraction of traditional, imperative process models. The extraction of declarative process models, which can effectively capture complex process behavior in a compact fashion, has not been addressed. In this paper, we close this gap by presenting the first automated approach for the extraction of declarative process models from natural language. To achieve this, we developed tailored Natural Language Processing techniques that identify activities and their inter-relations from textual constraint descriptions. A quantitative evaluation shows that our approach is able to generate constraints that closely resemble those established by humans. Therefore, our approach provides automated support for an otherwise tedious and complex manual endeavor.

Han van der Aa, Claudio Di Ciccio, Henrik Leopold, Hajo A. Reijers
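
To make the extraction idea concrete, a heavily simplified mapping from textual constraint descriptions to Declare-style templates could pattern-match on signal phrases; the toy rules below are illustrative assumptions, not the tailored NLP techniques of the paper:

```python
import re
from typing import Optional

# Toy phrase-to-template rules; the actual approach uses tailored NLP, not regexes.
RULES = [
    (re.compile(r"(?P<b>.+) must be preceded by (?P<a>.+)", re.I), "Precedence({a}, {b})"),
    (re.compile(r"after (?P<a>.+?), (?P<b>.+) must (?:eventually )?follow", re.I), "Response({a}, {b})"),
    (re.compile(r"(?P<a>.+) and (?P<b>.+) never co-occur", re.I), "NotCoExistence({a}, {b})"),
]

def extract_constraint(sentence: str) -> Optional[str]:
    """Map one constraint sentence to a Declare-style template, if a rule matches."""
    for pattern, template in RULES:
        m = pattern.match(sentence.strip().rstrip("."))
        if m:
            return template.format(a=m["a"].strip(), b=m["b"].strip())
    return None

print(extract_constraint("An invoice check must be preceded by an order approval."))
# Precedence(an order approval, An invoice check)
```
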
From Process Models to Chatbots

The effect of digital transformation in organizations needs to go beyond automation, so that human capabilities are also augmented. A possibility in this direction is to make formal representations of processes more accessible for the actors involved. Along this line, this paper presents a methodology to transform a formal process description into a conversational agent, which can guide a process actor through the required steps in a user-friendly conversation. The presented system relies on dialog systems and natural language processing and generation techniques to automatically build a chatbot from a process model. A prototype tool, accessible online, has been developed to transform a process model in BPMN into a chatbot defined in the Artificial Intelligence Markup Language (AIML); it has been evaluated with academic and industrial professionals, showing potential for narrowing the gap between process understanding and execution.

Anselmo López, Josep Sànchez-Ferreres, Josep Carmona, Lluís Padró
Dynamic Role Binding in Blockchain-Based Collaborative Business Processes

Blockchain technology enables the execution of collaborative business processes involving mutually untrusted parties. Existing tools allow such processes to be modeled using high-level notations and compiled into smart contracts that can be deployed on blockchain platforms. However, these tools brush aside the question of who is allowed to execute which tasks in the process, either by deferring the question altogether or by adopting a static approach where all actors are bound to roles upon process instantiation. Yet, a key advantage of blockchains is their ability to support dynamic sets of actors. This paper presents a model for dynamic binding of actors to roles in collaborative processes and an associated binding policy specification language. The proposed language is endowed with a Petri net semantics, thus enabling policy consistency verification. The paper also outlines an approach to compile policy specifications into smart contracts for enforcement. An experimental evaluation shows that the cost of policy enforcement increases linearly with the number of roles and constraints.

Orlenys López-Pintado, Marlon Dumas, Luciano García-Bañuelos, Ingo Weber
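
As a rough illustration of what a binding policy might enforce, consider a toy separation-of-duties check; the policy format and names below are assumptions for illustration, whereas the paper defines a full specification language with Petri net semantics compiled to smart contracts:

```python
def can_bind(actor, role, bindings, policy):
    """Check whether an actor may be bound to a role under simple constraints.

    bindings: dict role -> already-bound actor
    policy: dict role -> list of ("not-same-actor-as", other_role) constraints
    """
    for kind, other_role in policy.get(role, []):
        if kind == "not-same-actor-as" and bindings.get(other_role) == actor:
            return False
    return True

# Toy policy: the auditor of a case must differ from its contractor.
policy = {"auditor": [("not-same-actor-as", "contractor")]}
bindings = {"contractor": "acme"}
print(can_bind("acme", "auditor", bindings, policy))  # False: separation of duties
```
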
3D Virtual World BPM Training Systems: Process Gateway Experimental Results

It is important for companies that their operational employees have profound knowledge of the processes in which their work is embedded. 3D virtual world (VW) environments are promising for learning, especially for complex processes that have deviations from the standard flow. We design a 3D VW process training environment to improve process learning, particularly for complex processes with alternative flows, represented with gateways in process models. We adopt the method of loci, which suggests the mental traversal of routines for improving learning. Our experiment with 145 participants compares the level of knowledge acquired for a sample process with our 3D VW environment and a 2D depiction. We found that the 3D VW environment significantly increases the level of process knowledge acquired across the typical gateways in processes. Our results contribute to our understanding of how individuals learn knowledge of processes via 3D environments. With a low initial investment, practitioners are encouraged to invest in 3D training systems for processes, since these can be set up once and reused multiple times for various employees.

Michael Leyer, Ross Brown, Banu Aysolmaz, Irene Vanderfeesten, Oktay Turetken
Deriving and Combining Mixed Graphs from Regulatory Documents Based on Constraint Relations

Extracting meaningful information from regulatory documents such as the General Data Protection Regulation (GDPR) is of utmost importance for almost any company. Existing approaches pose strict assumptions on the documents and output models containing inconsistencies or redundancies, since relations within and across documents are neglected. To overcome these shortcomings, this work aims at deriving mixed graphs based on paragraph embedding as well as process discovery, and at combining these graphs using constraint relations such as “redundant” or “conflicting” detected by the ConRelMiner method. The approach is implemented and evaluated on two real-world use cases: Austrian energy use cases, with the process models they contain serving as ground truth, and the GDPR. Mixed graphs and their combinations constitute the next step towards an end-to-end solution for extracting process models from text, either from scratch or amending existing ones.

Karolin Winter, Stefanie Rinderle-Ma
A Method to Improve the Early Stages of the Robotic Process Automation Lifecycle

The robotic automation of processes is of much interest to organizations. A common use case is to automate the repetitive manual tasks (or processes) that are currently done by back-office staff through some information system (IS). The lifecycle of any Robotic Process Automation (RPA) project starts with the analysis of the process to automate. This is a very time-consuming phase, which in practical settings often relies on the study of process documentation. Such documentation is typically incomplete or inaccurate, e.g., some documented cases never occur, occurring cases are not documented, or documented cases differ from reality. Deploying robots in a production environment that are designed on such a shaky basis entails a high risk. This paper describes and evaluates a new proposal for the early stages of an RPA project: the analysis of a process and its subsequent design. The idea is to leverage the knowledge of back-office staff, starting by monitoring them in a non-invasive manner. This is done through a screen-mouse-key-logger, i.e., a sequence of images, mouse actions, and key actions is stored along with their timestamps. The log obtained in this way is transformed into a UI log through image-analysis techniques (e.g., fingerprinting or OCR) and then transformed into a process model by the use of process discovery algorithms. We evaluated this method for two real-life, industrial cases. The evaluation shows clear and substantial benefits in terms of accuracy and speed. This paper presents the method, along with a number of limitations that need to be addressed so that it can be applied in wider contexts.

Andres Jimenez-Ramirez, Hajo A. Reijers, Irene Barba, Carmelo Del Valle
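
The pipeline described above ends in standard process discovery; as a minimal illustration of that last step, assume the screen-mouse-key log has already been distilled into a UI log of (case, action) events and compute the directly-follows counts that discovery algorithms typically start from (all names hypothetical):

```python
from collections import Counter, defaultdict

def directly_follows(ui_log):
    """Count directly-follows pairs per case in a UI log.

    ui_log: iterable of (case_id, action) tuples in temporal order.
    """
    traces = defaultdict(list)
    for case_id, action in ui_log:
        traces[case_id].append(action)
    pairs = Counter()
    for actions in traces.values():
        pairs.update(zip(actions, actions[1:]))  # consecutive action pairs
    return pairs

log = [(1, "open_form"), (1, "type_name"), (1, "click_submit"),
       (2, "open_form"), (2, "click_submit")]
print(directly_follows(log))
```
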
Generation and Transformation of Compliant Process Collaboration Models to BPMN

Collaboration is a key factor in successful businesses. To face massive competition, in which SMEs compete with well-established corporates, organizations tend to focus on their core businesses while delegating other tasks to their partners. Lately, blockchain technology has further eased the way companies collaborate in a trust-less environment. As such, interest in researching process collaboration models and techniques has been growing. However, in contrast to BPM research for intra-organizational processes, where a multitude of process model repositories exist as a support for simulation and work evaluation, the lack of such repositories in the context of inter-organizational processes has become an inconvenience. The aim of this paper is to build a repository of collaborative process models that will assist research in this area. A top-down approach is used to automatically generate constrained and compliant choreography models, from which public and private process models are derived. Though the generation is partly random, it complies with a predefined set of compliance rules and parameters specified by the user.

Frederik Bischoff, Walid Fdhila, Stefanie Rinderle-Ma
GameOfFlows: Process Instance Adaptation in Complex, Dynamic and Potentially Adversarial Domains

Business processes often need to be executed in complex settings where a range of environmental factors can conspire to impede the execution of the process. Gou et al. [1] view process execution as an adversarial game between the process player and the environment player. While useful, their approach leaves open the question of the role of the original process design in the story. Process designs encode significant specialist knowledge and have significant investments in process infrastructure associated with them. We provide machinery that involves careful deliberation on when and where to deviate from a process design. We conceive of a process engine that frequently (typically after executing each task) re-considers the next task or sequence of tasks to execute. It performs trade-off analysis by comparing the following: (1) the likelihood of successful completion by conforming to the mandated process design against (2) the likelihood of success if it were to deviate from the design by executing a compensation (i.e., an alternative sequence of tasks that takes the process from the current state to completion).

Yingzhi Gou, Aditya Ghose, Hoa Khanh Dam

Information System Security

Frontmatter
Security Vulnerability Information Service with Natural Language Query Support

The huge data breaches and attacks reported in past years (e.g., the cases of Yahoo and Equifax) have significantly raised concerns about the security of software used and developed by companies for their day-to-day operations. In this context, becoming aware of existing security vulnerabilities and taking preventive actions is of paramount importance for security professionals to help keep software secure. The increasingly large number of vulnerabilities discovered every year and the scattered and heterogeneous nature of vulnerability-related information make this, however, a non-trivial task. This paper aims at mitigating this problem by making security vulnerability information timely available and easily searchable. We propose to enrich and index security vulnerability information collected from publicly available sources on the Web. To make this information easily queryable, we propose a natural language interface that allows users to query this index using plain English. The evaluation results of our proposal demonstrate that our solution can effectively answer questions typically asked in the security vulnerability domain.

Carlos Rodriguez, Shayan Zamanirad, Reza Nouri, Kirtana Darabal, Boualem Benatallah, Mortada Al-Banna
Automated Interpretation and Integration of Security Tools Using Semantic Knowledge

A security orchestration platform aims at integrating the activities performed by multi-vendor security tools to streamline the required incident response process. To make such a platform useful in practice in a Security Operation Center (SOC), we need to address three key challenges: interpretability, interoperability, and automation. In this paper, we propose a novel semantic integration approach to automatically select and integrate security tools with the essential capabilities for the automatic execution of an incident response process in a security orchestration platform. The capabilities of security tools and the activities of the incident response process are formalized using ontologies, which are used in an NLP-based approach to classify the activities of emerging incident response processes. The developed ontologies and NLP approaches underpin an interoperability model for the selection and integration of security tools at runtime for the successful execution of an incident response process. Experimental results demonstrate the feasibility of the classifier and the interoperability model for achieving interpretability, interoperability, and automation of security tools integrated into a security orchestration platform.

Chadni Islam, M. Ali Babar, Surya Nepal
An Assessment Model for Continuous Security Compliance in Large Scale Agile Environments
Exploratory Paper

Compliance with security standards for engineering secure software and hardware products is essential to gain and keep customers’ trust. In particular, industrial control systems (ICS) have a significant need for secure development activities. The standard IEC 62443-4-1 (4-1) is a novel norm that describes activities required to engineer secure products. However, assessing whether the norm is still fulfilled in continuous agile software engineering environments is difficult. It often remains unclear how the agile and the secure development process have to intertwine. This is even more problematic when changes based on assessment results of 4-1 or other secure development activities have to be applied. We contribute a novel assessment model that contains a baseline process for secure agile software engineering compliant with 4-1. Our assessment results show precisely where in the development process activities or artifacts have to be applied. Moreover, the model contains a refinement into goals and metrics that allows the evaluator to present the evaluatee with a precise “shopping list” of where to invest to achieve compliance. Afterwards, management can include precise compliance expenditure estimates in their business models.

Sebastian Dännart, Fabiola Moyón Constante, Kristian Beckers

Learning and Mining in Information Systems

Frontmatter

Open Access

Proactive Process Adaptation Using Deep Learning Ensembles

Proactive process adaptation can prevent and mitigate upcoming problems during process execution. Proactive adaptation decisions are based on predictions about how an ongoing process instance will unfold up to its completion. On the one hand, these predictions must have high accuracy, as, for instance, false negative predictions mean that necessary adaptations are missed. On the other hand, these predictions should be produced early during process execution, as this leaves more time for adaptations, which typically have non-negligible latencies. However, there is an important tradeoff between prediction accuracy and earliness: later predictions typically have a higher accuracy, because more information about the ongoing process instance is available. To address this tradeoff, we use an ensemble of deep learning models that can produce predictions at arbitrary points during process execution and that provides reliability estimates for each prediction. We use these reliability estimates to dynamically determine the earliest prediction with sufficient accuracy, which is used as the basis for proactive adaptation. Experimental results indicate that our dynamic approach may offer cost savings of 27% on average when compared to using a static prediction point.

Andreas Metzger, Adrian Neubauer, Philipp Bohn, Klaus Pohl
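
The core mechanism above, picking the earliest prediction whose reliability estimate is high enough, can be sketched with ensemble agreement serving as the reliability estimate; the voting scheme and threshold below are illustrative assumptions rather than the paper's exact setup:

```python
def earliest_reliable_prediction(ensemble, prefixes, threshold=0.8):
    """Return the earliest binary prediction whose ensemble reliability suffices.

    ensemble: list of models, each with a .predict(prefix) -> 0/1 method
    prefixes: prefixes of the ongoing process instance, shortest first
    Reliability is estimated as the fraction of models agreeing with the majority.
    """
    for step, prefix in enumerate(prefixes):
        votes = [model.predict(prefix) for model in ensemble]
        majority = max(set(votes), key=votes.count)
        reliability = votes.count(majority) / len(votes)
        if reliability >= threshold:
            # Earliest sufficiently reliable prediction: adapt proactively now.
            return step, majority, reliability
    return None  # no sufficiently reliable prediction before completion
```
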
Using Machine Learning Techniques for Evaluating the Similarity of Enterprise Architecture Models
Technical Paper

Enterprise Architectures (EA) are used to coordinate an enterprise’s business visions and strategies successfully and effectively. The practitioners of EA (architects) communicate the architecture to other stakeholders via architecture models. We investigate the scenario where accepted architecture models are stored in a repository. We identified the problem of unnecessary repository expansion through the addition of model components with similar properties or behavior to already existing repository components. The proposed solution aims to find those similar components and to notify the architect about their existence. We present two approaches for defining and combining similarities between EA model components. The similarity measures are calculated on the properties of the components and on the context of their usage. We further investigate the behavior of similar architecture models and search for associations in order to obtain components that might be of interest. Finally, we provide a prototype tool for both generating requests and obtaining results.

Vasil Borozanov, Simon Hacks, Nuno Silva
Efficient Discovery of Compact Maximal Behavioral Patterns from Event Logs

Techniques for process discovery support the analysis of information systems by constructing process models from event logs that are recorded during system execution. In recent years, various algorithms to discover end-to-end process models have been proposed. Yet, they do not cater for domains in which process execution is highly flexible, as the unstructuredness of the resulting models renders them meaningless. It has therefore been suggested to derive insights about flexible processes by mining behavioral patterns, i.e., models of frequently recurring episodes of a process’ behavior. However, existing algorithms to mine such patterns suffer from imprecision and redundancy of the mined patterns and a comparatively high computational effort. In this work, we overcome these limitations with a novel algorithm, coined COBPAM (COmbination based Behavioral Pattern Mining). It exploits a partial order on potential patterns to discover only those that are compact and maximal, i.e., least redundant. Moreover, COBPAM exploits the fact that complex patterns can be characterized as combinations of simpler patterns, which enables pruning of the pattern search space. Efficiency is improved further by evaluating potential patterns solely on parts of an event log. Experiments with real-world data demonstrate how COBPAM improves over the state-of-the-art in behavioral pattern mining.

Mehdi Acheli, Daniela Grigori, Matthias Weidlich
Discovering Responsibilities with Dynamic Condition Response Graphs

Declarative process discovery is the art of using historical data to better understand the responsibilities of an organisation: its governing business rules and goals. These rules and goals can be described using declarative process notations, such as Dynamic Condition Response (DCR) Graphs, which have seen widespread industrial adoption within Denmark, in particular through their integration in a case management solution used by 70% of central government institutions. In this paper, we introduce ParNek: a novel, effective, and extensible miner for the discovery of DCR Graphs. We empirically evaluate ParNek and show that it significantly outperforms the state-of-the-art in DCR discovery and performs at least comparably to the state-of-the-art in Declare discovery. Notably, the miner can be configured to sacrifice relatively little precision in favour of significant gains in simplicity, making it the first miner able to produce understandable DCR Graphs for real-life logs.

Viktorija Nekrasaite, Andrew Tristan Parli, Christoffer Olling Back, Tijs Slaats
Fifty Shades of Green: How Informative is a Compliant Process Trace?

The problem of understanding whether a process trace satisfies a prescriptive model is a fundamental conceptual modeling problem in the context of process-based information systems. In business process management, and in process mining in particular, this amounts to checking whether an event log conforms to a prescriptive process model, i.e., whether the actual traces present in the log are allowed by the behaviors implicitly expressed by the model. The research community has developed a plethora of very sophisticated conformance checking techniques that are particularly effective in the detection of non-conforming traces, and in elaborating on where and how they deviate from the prescribed behaviors. However, they do not provide any insight to distinguish between conforming traces and understand their differences. In this paper, we delve into this rather unexplored area and present a new process mining quality measure, called informativeness, which can be used to compare conforming traces to understand which are more relevant (or informative) than others. We introduce a technique to compute this measure in a very general way, as it can be applied to process models expressed in any language (e.g., Petri nets, Declare, process trees, BPMN) as long as a conformance checking tool is available. We then show the versatility of our approach by demonstrating how it can be meaningfully applied when the activities contained in the process are associated with costs/rewards or linked to strategic goals.

Andrea Burattin, Giancarlo Guizzardi, Fabrizio Maria Maggi, Marco Montali
Solution Patterns for Machine Learning

Despite the hype around machine learning (ML), many organizations are struggling to derive business value from ML capabilities. Design patterns have long been used in software engineering to enhance design effectiveness and to speed up the development process. The contribution of this paper is two-fold. First, it introduces solution patterns as an explicit way of representing generic and well-proven ML designs for commonly-known and recurring business analytics problems. Second, it reports on the feasibility, expressiveness, and usefulness of solution patterns for ML, in collaboration with an industry partner. It provides a prototype architecture for supporting the use of solution patterns in real world scenarios. It presents a proof-of-concept implementation of the architecture and illustrates its feasibility. Findings from the collaboration suggest that solution patterns can have a positive impact on ML design and development efforts.

Soroosh Nalchigar, Eric Yu, Yazan Obeidi, Sebastian Carbajales, John Green, Allen Chan
Managing and Simplifying Cognitive Business Operations Using Process Architecture Models

Enterprises increasingly rely on cognitive capabilities to enhance their core business processes, adopting systems that utilize machine learning and deep learning approaches to support cognitive decisions that aid the humans responsible for business process execution. Unlike conventional information systems, for which design and implementation is a much-studied area, the design of cognitive systems and their integration into existing enterprise business processes is less well understood. This results in long drawn-out implementation and adoption cycles and requires individuals with highly specialized skills. As cognitively-assisted business processes involve human and machine collaboration, non-functional requirements, such as reusability and configurability, that are prominent for software system design must also be addressed at the enterprise level. Supporting processes may emerge and evolve over time to monitor, evaluate, adjust, or modify these cognitively-enhanced business processes. In this paper, we utilize a goal-oriented approach to analyze the requirements for designing cognitive systems for simplified adoption in enterprises, which are then used to guide and inform the design of a process architecture for cognitive business operations.

Zia Babar, Eric Yu, Sebastian Carbajales, Allen Chan
A Constraint Mining Approach to Support Monitoring Cyber-Physical Systems

The full behavior of cyber-physical systems (CPS) emerges during operation only, when the systems interact with their environment. Runtime monitoring approaches are used to detect deviations from the expected behavior. While most monitoring approaches assume that engineers define the expected behavior as constraints, the deep domain knowledge required for this task is often not available. We describe an approach that automatically mines constraint candidates for runtime monitoring from event logs recorded from CPS. Our approach extracts different types of constraints on event occurrence, timing, data, and combinations of these. The approach further presents the mined constraint candidates to users and offers filtering and ranking strategies. We demonstrate the usefulness and scalability of our approach by applying it to event logs from two real-world CPS: a plant automation software system and a system controlling unmanned aerial vehicles. In our experiments, domain experts regarded 74% and 63%, respectively, of the constraints mined for these two systems as useful.

Thomas Krismayer, Rick Rabiser, Paul Grünbacher
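
Of the constraint types named above, timing between event occurrences lends itself to a compact sketch: mine, for each pair of consecutive event types, the largest gap ever observed and propose it as a candidate constraint for ranking and filtering (the candidate format is an assumption for illustration):

```python
from collections import defaultdict

def mine_timing_constraints(log):
    """Mine 'B occurs within <= max_gap after A' candidates from an event log.

    log: list of (timestamp, event_type), ordered by timestamp.
    """
    max_gap = defaultdict(float)
    for (t0, a), (t1, b) in zip(log, log[1:]):
        max_gap[(a, b)] = max(max_gap[(a, b)], t1 - t0)  # widest observed gap
    return [f"{b} within {gap:.1f}s after {a}" for (a, b), gap in sorted(max_gap.items())]

# Toy CPS log; a naive miner also yields noise pairs that experts would filter out.
log = [(0.0, "start_pump"), (1.2, "pressure_ok"), (5.0, "start_pump"), (6.9, "pressure_ok")]
print(mine_timing_constraints(log))
```
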
Behavior-Derived Variability Analysis: Mining Views for Comparison and Evaluation

The large variety of computerized solutions (software and information systems) calls for a systematic approach to their comparison and evaluation. Different methods have been proposed over the years for analyzing the similarity and variability of systems. These methods take as input artifacts, such as requirements, design models, or code, of different systems (commonly in the same domain), identify and calculate their similarities, and represent the variability in models, such as feature diagrams. Most methods rely on implementation considerations of the input systems and generate outcomes based on predefined, fixed strategies of comparison (referred to as variability views). In this paper, we introduce an approach for mining relevant views for comparison and evaluation based on the input artifacts. In particular, we equip SOVA, a Semantic and Ontological Variability Analysis method, with data mining techniques in order to identify relevant views that highlight the variability or similarity of the input artifacts (natural language requirement documents). The comparison is done using entropy and Rand index measures. The method and its outcomes are evaluated on a case of three photo sharing applications.

Iris Reinhartz-Berger, Ilan Shimshoni, Aviva Abdal
Backmatter
Metadata
Title
Advanced Information Systems Engineering
Edited by
Paolo Giorgini
Barbara Weber
Copyright Year
2019
Electronic ISBN
978-3-030-21290-2
Print ISBN
978-3-030-21289-6
DOI
https://doi.org/10.1007/978-3-030-21290-2