2014 | Book

Advanced Information Systems Engineering

26th International Conference, CAiSE 2014, Thessaloniki, Greece, June 16-20, 2014. Proceedings

Editors: Matthias Jarke, John Mylopoulos, Christoph Quix, Colette Rolland, Yannis Manolopoulos, Haralambos Mouratidis, Jennifer Horkoff

Publisher: Springer International Publishing

Book Series: Lecture Notes in Computer Science

About this book

This book constitutes the proceedings of the 26th International Conference on Advanced Information Systems Engineering, CAiSE 2014, held in Thessaloniki, Greece, in June 2014. The 41 papers and 3 keynotes presented were carefully reviewed and selected from 226 submissions. The accepted papers were presented in 13 sessions: clouds and services; requirements; product lines; requirements elicitation; processes; risk and security; process models; data mining and streaming; process mining; models; mining event logs; databases; and software engineering.

Table of Contents

Frontmatter

Keynotes

Information Systems for the Governance of Compliant Service Systems

The traditional role of an Information System (IS) is to support operation and management within an organization. In this presentation, we discuss the specific role that IS can also play in supporting the management aspects related to the governance of an organization regarding its compliance with norms and regulations. In the new service economy, governance issues are no longer limited to a single organization but should be extended to the level of service value networks, i.e. service systems. In such systems, one of the challenges is for each organization to demonstrate its compliance in a transparent way. In this paper, we discuss how to organize a global governance framework and illustrate its use in IT service management. On the basis of research on TIPA, a process reference model that makes it possible to objectively measure the quality of delivered IT services, we illustrate how IS and Enterprise Architecture can effectively support the deployment of such a global governance model.

Eric Dubois
Risk Accelerators in Disasters
Insights from the Typhoon Haiyan Response on Humanitarian Information Management and Decision Support

Modern societies are increasingly threatened by disasters that require rapid response through ad-hoc collaboration among a variety of actors and organizations. The complexity within and across today’s societal, economic and environmental systems defies accurate predictions and assessments of damages, humanitarian needs, and the impact of aid. Yet, decision-makers need to plan, manage and execute aid response under conditions of high uncertainty while being prepared for further disruptions and failures. This paper argues that these challenges require a paradigm shift: instead of seeking optimality and full efficiency of procedures and plans, strategies should be developed that enable an acceptable level of aid under all foreseeable eventualities. We propose a decision- and goal-oriented approach that uses scenarios to systematically explore future developments that may have a major impact on the outcome of a decision. We discuss to what extent this approach supports robust decision-making, particularly if time is short and the availability of experts is limited. We interlace our theoretical findings with insights from experienced humanitarian decision makers we interviewed during a field research trip to the Philippines in the aftermath of Typhoon Haiyan.

Bartel Van de Walle, Tina Comes
Against the Odds: Managing the Unmanageable in a Time of Crisis

Information technology systems at the Greek Ministry of Finance could be the ideal tools for fighting widespread tax evasion, bureaucratic inefficiency, waste, and corruption. Yet making this happen requires battling against protracted procurement processes and implementation schedules, ineffective operations, and rigid management structures. This experience report details some unconventional measures, tools, and techniques that were adopted to sidestep the barriers in a time of crisis. The measures involved meritocracy, IT utilization, and management by objectives. Sadly, this report is also a story (still being written) on the limits of such approaches. On balance, it demonstrates that in any large organization there are ample opportunities to bring about change, even against considerable odds.

Diomidis Spinellis

Clouds and Services

Queue Mining – Predicting Delays in Service Processes

Information systems have been widely adopted to support service processes in various domains, e.g., in the telecommunication, finance, and health sectors. Recently, work on process mining showed how management of these processes, and engineering of supporting systems, can be guided by models extracted from the event logs that are recorded during process operation. In this work, we establish a queueing perspective in operational process mining. We propose to consider queues as first-class citizens and use queueing theory as a basis for queue mining techniques. To demonstrate the value of queue mining, we revisit the specific operational problem of online delay prediction: using event data, we show that queue mining yields accurate online predictions of case delay.

Arik Senderovich, Matthias Weidlich, Avigdor Gal, Avishai Mandelbaum
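
To give a concrete flavour of the queueing perspective (a minimal sketch, not the authors' actual predictors), the Python fragment below predicts the delay of a newly arriving case as the waiting time observed for the last case that entered service, computed from a toy event log; the class and field names are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class Event:
        case_id: str
        kind: str    # "arrive" (join the queue) or "start" (enter service)
        ts: float

    def snapshot_delay_prediction(events):
        """Predict the delay of a case arriving now as the waiting time
        of the last case that entered service (snapshot-style predictor)."""
        arrived, last_wait = {}, None
        for e in sorted(events, key=lambda e: e.ts):
            if e.kind == "arrive":
                arrived[e.case_id] = e.ts
            elif e.kind == "start" and e.case_id in arrived:
                last_wait = e.ts - arrived[e.case_id]
        return last_wait

    log = [Event("a", "arrive", 0.0), Event("a", "start", 4.0),
           Event("b", "arrive", 3.0), Event("b", "start", 9.0)]
    print(snapshot_delay_prediction(log))  # 6.0: case "b" waited 6 time units
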
PO-SAAC: A Purpose-Oriented Situation-Aware Access Control Framework for Software Services

Situation-aware applications need to capture relevant context information and user intention or purpose, to provide situation-specific access to software services. As such, a situation-aware access control approach coupled with purpose-oriented information is of critical importance. Existing approaches are highly domain-specific and control access to services depending on specific types of context information without considering the purpose. To achieve situation-aware access control, in this paper we consider purpose-oriented situations rather than conventional situations (e.g., a user's state). We take situation to mean the states of the entities and the states of the relationships between entities that are relevant to the purpose of a resource access request. We propose a generic framework, Purpose-Oriented Situation-Aware Access Control, that supports access control to software services based on the relevant situations. We develop a software prototype to demonstrate the practical applicability of the framework. In addition, we demonstrate the effectiveness of our framework through a healthcare case study. Experimental results demonstrate the satisfactory performance of our framework.

A. S. M. Kayes, Jun Han, Alan Colman
Optimal Distribution of Applications in the Cloud

The emergence of the cloud computing paradigm introduces a number of challenges and opportunities to application and system developers. The multiplication and proliferation of offerings by cloud service providers, for example, makes the selection of an appropriate solution complex and inefficient. On the other hand, this availability of offerings creates additional possibilities in the way that applications can be engineered or re-engineered to take advantage of, e.g., the elastic nature or the pay-per-use model of cloud computing. This work proposes a formal framework that allows the possibility space of optimally distributing application components across cloud offerings to be explored in an efficient and flexible manner. The proposed approach introduces a set of concepts that are innovative in their use, and demonstrates how this framework can be used in practice by means of a running scenario.

Vasilios Andrikopoulos, Santiago Gómez Sáez, Frank Leymann, Johannes Wettinger
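
As an illustration only (the paper defines a formal framework; this brute-force sketch merely shows the shape of the possibility space), the following Python enumerates all component-to-offering assignments and picks the cheapest feasible one. The catalogue, costs, and constraint are invented.

    from itertools import product

    # Hypothetical monthly costs per component and compatible offering.
    costs = {
        "web": {"A_vm": 40, "B_paas": 25},
        "app": {"A_vm": 60, "B_paas": 55},
        "db":  {"A_vm": 90, "A_managed": 70},
    }

    def optimal_distribution(costs, feasible=lambda a: True):
        """Enumerate every assignment and return the cheapest feasible one."""
        names = list(costs)
        candidates = (dict(zip(names, combo))
                      for combo in product(*(costs[c] for c in names)))
        return min((a for a in candidates if feasible(a)),
                   key=lambda a: sum(costs[c][o] for c, o in a.items()))

    # Example constraint: app and db must run at the same provider ("A"/"B").
    print(optimal_distribution(costs, feasible=lambda a: a["app"][0] == a["db"][0]))

Real frameworks replace the enumeration with solver-backed optimization, since the assignment space grows exponentially with the number of components.
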

Requirements

Identifying Modularity Improvement Opportunities in Goal-Oriented Requirements Models

Goal-oriented Requirements Engineering approaches have become popular in the Requirements Engineering community as they provide expressive model elements for requirements elicitation and analysis. However, as a common challenge, they still struggle when it comes to managing the accidental complexity of their models. In this paper, we provide a set of metrics, which are formally specified and have tool support, to measure and analyze the complexity of goal models, in particular i* models. The aim is to identify refactoring opportunities to improve the modularity of those models, and consequently reduce their complexity. We evaluate these metrics by applying them to a set of well-known case studies from industry and academia. Our results allow the identification of refactoring opportunities in the evaluated models.

Catarina Gralha, Miguel Goulão, João Araújo
Understandability of Goal-Oriented Requirements Engineering Concepts for Enterprise Architects

ArchiMate is a graphical language for modelling business goals and enterprise architecture. In previous work we identified possible understandability issues with the goal-oriented notations in ArchiMate. [Problem] We investigated how understandable the goal-oriented concepts really were in two quasi-experiments with practitioners. [Principal ideas/results] Only three concepts were understood by most or all subjects: the stakeholder concept, the goal concept, and the requirement concept. The other concepts were misunderstood by most of our subjects. We offer explanations for these (mis)understandings. [Contribution] This paper provides new insights into the understandability, and hence usability, of goal-oriented concepts by practicing enterprise architects.

Wilco Engelsman, Roel Wieringa
FACTS: A Framework for Anonymity towards Comparability, Transparency, and Sharing
Exploratory Paper

In past years, many anonymization schemes, anonymity notions, and anonymity measures have been proposed. When designing information systems that feature anonymity, choosing a good approach is a very important design decision. While experiments comparing such approaches are enlightening, carrying them out is a complex and labor-intensive task. To address this issue, we propose FACTS, a framework for the experimental evaluation of anonymization schemes. It lets researchers implement their approaches against interfaces and other standardizations that we have devised. Users can then define benchmark suites that refer to those implementations. FACTS enables comparability, and it includes many useful features, e.g., easy sharing and reproduction of experiments. We evaluate FACTS (a) by specifying and executing a comprehensive benchmark suite for data publishing and (b) by means of a user study. Core results are that FACTS is useful for a broad range of scenarios, that it allows approaches to be compared with ease, and that it lets users share and reproduce experiments.

Clemens Heidinger, Klemens Böhm, Erik Buchmann, Kai Richter

Product Lines

Trust-Aware Decision-Making Methodology for Cloud Sourcing

Cloud sourcing consists of outsourcing data, services, and infrastructure to cloud providers. Although this outsourcing model brings advantages to cloud customers, new threats also arise as sensitive data and critical IT services move beyond customers' control. When an organization considers moving to the cloud, IT decision makers must select a cloud provider and must decide which parts of the organization will be outsourced, and to what extent. This paper proposes a methodology that allows decision makers to evaluate their trust in cloud providers. The methodology provides a systematic way to elicit knowledge about cloud providers, quantify their trust factors and aggregate them into trust values that can assist the decision-making process. The trust model that we propose is based on trust intervals, which allow uncertainty to be captured during the evaluation, and we define an operator for aggregating these trust intervals. The methodology is applied to an eHealth scenario.

Francisco Moyano, Kristian Beckers, Carmen Fernandez-Gago
Analyzing Variability of Software Product Lines Using Semantic and Ontological Considerations

Software Product Line Engineering (SPLE) is an approach to systematically reuse software-related artifacts among different, yet similar, software products. Viewing requirements as drivers of different development methods and activities, several studies have suggested using requirements specifications to identify and analyze commonality and variability of software products. These studies mainly employ semantic text similarity techniques. As a result, they might be limited in their ability to analyze the variability of the expected behaviors of software systems as perceived from an external point of view. Such a view is important when reaching different reuse decisions. In this paper we propose to introduce considerations which reflect the behavior of software products as manifested in requirement statements. To model these behavioral aspects of software requirements we use terms adapted from Bunge's ontological model. The suggested approach automatically extracts the initial state, external events, and final state of software behavior. Then, variability is analyzed based on that view.

Iris Reinhartz-Berger, Nili Itzik, Yair Wand
Similarity Analysis within Product Line Scoping: An Evaluation of a Semi-automatic Approach

Introducing a product line approach in an organization requires a systematic scoping phase to decide what products and features should be included. Product line scoping is a non-trivial activity and traditionally consumes a lot of time and resources. This issue highlights the need to complement traditional scoping activities with semi-automatic approaches that allow the potential for reuse to be estimated with little initial effort. In this paper we present an evaluation of a tool-supported approach that enables the semi-automatic analysis of existing products in order to calculate their similarity. This approach is tailored to the configuration-based systems domain, where we have used it to identify similarity within two types of industrial standard software products. The results of this evaluation show that our approach provides accurate results and leads to time savings compared to manual similarity analysis.

Markus Nöbauer, Norbert Seyff, Iris Groher

Requirements Elicitation

An Exploratory Study of Topic Importance in Requirements Elicitation Interviews

Interviewing stakeholders is a common way to elicit information about the requirements of the system-to-be and the conditions in its operating environment. One difficulty in preparing and conducting interviews is avoiding missing information that may be important for understanding the requirements and environment conditions. Some information may remain implicit throughout the interview if the interviewed stakeholder does not consider it important and the business analyst fails to mention it, or a topic it relates to. We propose the Elicitation Topic Map (ETM), which is intended to help business analysts prepare elicitation interviews. The ETM is a diagram that shows topics that can be discussed during requirements elicitation interviews, together with how likely stakeholders are to discuss each topic spontaneously (as opposed to being explicitly asked questions on that topic by the business analyst). The ETM was produced through a combination of theoretical and empirical research.

Corentin Burnay, Ivan J. Jureta, Stéphane Faulkner
Expert Finding Using Markov Networks in Open Source Communities

Expert finding aims at identifying knowledgeable people to help in decision processes, such as eliciting or analysing requirements in Requirements Engineering. Complementary approaches exist to tackle specific contexts like in forum-based communities, exploiting personal contributions, or in structured organisations like companies, where the social relationships between employees help to identify experts. In this paper, we propose an approach to tackle a hybrid context like an Open Source Software (OSS) community, which involves forums open to contributors, as well as companies providing OSS-related services. By representing and relating stakeholders, their roles, the topics discussed and the terms used, and by applying inference algorithms based on Markov networks, we are able to rank stakeholders by their inferred level of expertise in one topic or more. Two preliminary experiments are presented to illustrate the approach and to show its potential benefit.

Matthieu Vergne, Angelo Susi
Unifying and Extending User Story Models

Within Agile methods, User Stories (US) are mostly used as primary requirements artifacts and units of functionality of the project. The idea is to express requirements at a low level of abstraction using natural language. Most of them are exclusively centered on the final user as the only stakeholder. Over the years, templates (in the form of concepts relating the WHO, WHAT and WHY dimensions into a phrase) have been proposed by agile methods practitioners or academics to guide requirements gathering. Using these templates can be problematic: none of them precisely or formally defines the semantics associated with a particular syntax, leading to various possible interpretations of the concepts. Consequently, these templates are used in an ad-hoc manner, each modeler having idiosyncratic preferences. This can lead to an underuse of representation mechanisms, misunderstanding of a concept's use, and poor communication between stakeholders. This paper studies templates found in the literature in order to reach unification of the concepts' syntax, agreement on their semantics, as well as methodological elements that increase the inherent scalability of US-based projects.

Yves Wautelet, Samedi Heng, Manuel Kolp, Isabelle Mirbel

Processes

How does Quality of Formalized Software Processes Affect Adoption?

Defining software processes allows companies to evaluate and improve them, enhancing development productivity and product quality, as well as enabling certification or evaluation. Formalizing processes also helps eliminate ambiguity and enables tool support for evolution and automatic analysis. But these benefits cannot be fully achieved if practitioners do not adopt the process. Some challenges related to adoption have already been identified. In this paper we analyze the influence of the quality of the specified process on its adoption. Adoption is measured in terms of work products built during projects: work products that were not built, those that were built late in the project, and those that were built in time. We illustrate this analysis by evaluating the adoption of a formalized process in a small Chilean company across five projects. We conclude that certain kinds of errors in a process specification may threaten its adoption and thus its potential benefits.

María Cecilia Bastarrica, Gerardo Matturro, Romain Robbes, Luis Silvestre, René Vidal
Context-Aware Staged Configuration of Process Variants@Runtime

Process-based context-aware applications are becoming increasingly complex and dynamic. Besides the large sets of process variants to be managed in such dynamic systems, process variants need to be context sensitive in order to accommodate new user requirements and intrinsic complexity. This paradigm shift forces us to defer decisions to runtime, where process variants must be customized and executed based on a recognized context. However, existing approaches do not defer the entire configuration and execution of a process variant to runtime so that subsequent variation points can be decided automatically. In this paper, we present a holistic methodology to automatically resolve process variability at runtime. The proposed solution performs a staged configuration considering static and dynamic context data to accomplish effective decision making. We demonstrate our approach by exemplifying a storage operation process in a smart logistics scenario. Our evaluation demonstrates the performance and scalability of our methodology.

Aitor Murguzur, Xabier De Carlos, Salvador Trujillo, Goiuria Sagardui
Prioritizing Business Processes Improvement Initiatives: The Seco Tools Case

Chief Information Officers (CIOs) face great challenges in prioritizing business process improvement initiatives due to limited resources and politics in decision making. We developed a prioritization and categorization method (PCM) to support CIOs' decision-making process. The method was designed in a collaborative research process engaging CIOs, process experts and researchers. In this experience paper, we first present the PCM, and then describe the lessons learned when demonstrating the PCM prototype at a large international company, Seco Tools. The results show that the PCM can produce a holistic analysis of processes by eliciting the "collective intelligence" of process stakeholders and managers. The PCM activities create a top-down social process of process management. By using the PCM the company managed to prioritize business process improvement initiatives in a novel way. This paper contributes to theory and know-how on business process management, and proposes a novel method that can be used by CIOs of large corporations to prioritize process initiatives.

Jens Ohlsson, Shengnan Han, Paul Johannesson, Fredrik Carpenhall, Lazar Rusu

Risk and Security

Cloud Forensics: Identifying the Major Issues and Challenges

One of the most important areas in the developing field of cloud computing is how investigators conduct investigations in order to reveal the ways that a digital crime took place over the cloud. This area is known as cloud forensics. While a great deal of research on digital forensics has been carried out, the current digital forensic models and frameworks used to conduct digital investigations do not meet the requirements and standards demanded in cloud forensics, due to the nature and characteristics of cloud computing. In parallel, the issues and challenges faced in traditional forensics differ from those of cloud forensics. This paper addresses the cloud forensics challenges identified from a review conducted in the area, and proposes a new model that assigns the aforementioned challenges to stages.

Stavros Simou, Christos Kalloniatis, Evangelia Kavakli, Stefanos Gritzalis
Dealing with Security Requirements for Socio-Technical Systems: A Holistic Approach

Security has been a growing concern for most large organizations, especially financial and government institutions, as security breaches in the socio-technical systems they depend on are costing billions. A major reason for these breaches is that socio-technical systems are designed in a piecemeal rather than a holistic fashion that leaves parts of a system vulnerable. To tackle this problem, we propose a three-layer security analysis framework for socio-technical systems involving business processes, applications and physical infrastructure. In our proposal, global security requirements lead to local security requirements that cut across layers and upper-layer security analysis influences analysis at lower layers. Moreover, we propose a set of analytical methods and a systematic process that together drive security requirements analysis throughout the three-layer framework. Our proposal supports analysts who are not security experts by defining transformation rules that guide the corresponding analysis. We use a smart grid example to illustrate our approach.

Tong Li, Jennifer Horkoff
IT Risk Management with Markov Logic Networks

We present a solution for modeling the dependencies of an IT infrastructure and determining the availability of components and services therein using Markov logic networks (MLNs). MLNs offer a single representation of probability and first-order logic and are well suited to modeling dependencies and threats. We identify different kinds of dependency and show how they can be translated into an MLN. The MLN infrastructure model allows us to use marginal inference to predict the availability of IT infrastructure components and services. We demonstrate that our solution is well suited for supporting IT risk management by analyzing the impact of threats and comparing risk mitigation efforts.

Janno von Stülpnagel, Jens Ortmann, Joerg Schoenfisch
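
For readers unfamiliar with MLNs, the toy Python sketch below (an invented availability model, not the authors') computes the marginal availability of a service by brute-force enumeration: each possible world is weighted by exp(sum of the weights of the ground formulas it satisfies), which is exactly the MLN distribution on a tiny domain.

    from itertools import product
    from math import exp

    atoms = ["up(host)", "up(db)", "up(service)"]   # ground atoms (invented)

    # Weighted ground formulas; larger weights mean stronger constraints.
    formulas = [
        (1.5, lambda w: w["up(host)"]),             # host is usually available
        (1.5, lambda w: w["up(db)"]),               # db is usually available
        (3.0, lambda w: w["up(service)"] == (w["up(host)"] and w["up(db)"])),
    ]                                               # service depends on host, db

    def weight(w):
        return exp(sum(wt for wt, f in formulas if f(w)))

    worlds = [dict(zip(atoms, v))
              for v in product([True, False], repeat=len(atoms))]
    z = sum(weight(w) for w in worlds)
    print(sum(weight(w) for w in worlds if w["up(service)"]) / z)  # P(service up)

Enumeration is exponential in the number of atoms; practical MLN engines use approximate marginal inference instead.
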

Process Models

Automating Data Exchange in Process Choreographies

Process choreographies are part of daily business. While the correct ordering of exchanged messages can be modeled and enacted with current choreography techniques, no approach exists to describe and automate the exchange of data between processes in a choreography using messages. This paper describes an entirely model-driven approach for BPMN, introducing a few concepts that suffice to model data retrieval, data transformation, message exchange, and correlation: four aspects of data exchange. For automation, this work utilizes a recent concept to enact data dependencies in internal processes. We present a modeling guideline to derive local process models from a given choreography; their operational semantics allows the entire choreography, including the exchange of data, to be enacted correctly from the derived models alone. We implemented our approach by extending the camunda BPM platform and show its feasibility by realizing all service interaction patterns using only model-based concepts.

Andreas Meyer, Luise Pufahl, Kimon Batoulis, Sebastian Kruse, Thorben Lindhauer, Thomas Stoff, Dirk Fahland, Mathias Weske
Integrating the Goal and Business Process Perspectives in Information System Analysis

There are several motivations to promote investment and scientific effort in the integration of intentional and operational perspectives: organisational reengineering, continuous improvement of business processes, alignment among complementary analysis perspectives, information traceability, etc. In this paper we propose the integration of two modelling languages that support the creation of goal and business process models: the i* goal-oriented modelling method and Communication Analysis, a communication-oriented business process modelling method. We describe the methodological integration of the two modelling methods with the aim of fulfilling several criteria: i) to rely on appropriate theories; ii) to provide abstract and concrete syntaxes; iii) to provide scenarios of application; and iv) to develop tool support. We provide guidelines for using the two modelling methods in a top-down analysis scenario. We also present an illustrative case that demonstrates the feasibility of the approach.

Marcela Ruiz, Dolors Costal, Sergio España, Xavier Franch, Óscar Pastor
Formalization of fUML: An Application to Process Verification

Much research has been done on formalizing UML Activity Diagrams for process modeling, to verify different kinds of soundness properties (deadlock, unreachable activities, and so on) on process models. However, these works focus mainly on the control-flow aspects of the process and make assumptions about the precise execution semantics defined in natural language in the UML specification. In this paper, we define a first-order logic formalization of fUML (Foundational Subset of Executable UML), the official and precise operational semantics of UML, in order to apply model checking techniques and thereby verify the correctness of fUML-based process models. Our formalization covers the control-flow, data-flow, resource, and timing dimensions of processes in a unified way. A working implementation based on the Alloy language has been developed. The implementation showed us that many kinds of behavioral properties that are not commonly supported by other approaches, and that involve multiple dimensions of the process, can be efficiently checked.

Yoann Laurent, Reda Bendraou, Souheib Baarir, Marie-Pierre Gervais
On the Elasticity of Social Compute Units

Advances in human computation make it feasible to utilize human capabilities as services. At the same time, we have witnessed emerging collective adaptive systems that are formed from heterogeneous types of compute units to solve complex problems. The recently introduced Social Compute Units (SCUs) are one type of these systems, with human-based services as their core compute units. While there is related work on forming SCUs and optimizing their performance with adaptation techniques, most of it focuses on static structures of SCUs. To provide better runtime performance and management flexibility for SCUs, we present an elasticity model for SCUs and mechanisms for their elastic management that allow for certain fluctuations in size, structure, performance and quality. We model the states of elastic SCUs, present APIs for managing SCUs as well as metrics for controlling their elasticity, with which it is possible to tailor their performance parameters at runtime within customer-set constraints. We illustrate our contribution with an example algorithm.

Mirela Riveni, Hong-Linh Truong, Schahram Dustdar

Data Mining and Streaming

Open-Source Databases: Within, Outside, or Beyond Lehman’s Laws of Software Evolution?

Lehman’s laws of software evolution are a well-established set of observations (matured over the last forty years) on how typical software systems evolve. However, the applicability of these laws to databases has not been studied so far. To this end, we have performed a thorough, large-scale study on the evolution of databases that are part of larger open source projects, publicly available through open source repositories, and report on the validity of the laws on the grounds of properties like size, growth, and amount of change per version.

Ioannis Skoulis, Panos Vassiliadis, Apostolos Zarras
Schema Independent Reduction of Streaming Log Data

Large software systems comprise different, tightly interconnected components. Such systems utilize heterogeneous monitoring infrastructures that produce log data at high rates, from various sources and in diverse formats. The sheer volume of this data makes real- or near-real-time processing of these system logs almost impossible. In this paper, we present a log-schema-independent approach that allows for the real-time reduction of logged data based on a set of filtering criteria. The approach utilizes a similarity measure between features of the incoming events and a set of filtering features we refer to as beacons. The similarity measure is based on information theory principles and uses caching techniques so that infinite log data streams and log data schema alterations can be handled. The approach has been applied successfully to the KDD-99 intrusion detection benchmark data set.

Theodoros Kalamatianos, Kostas Kontogiannis
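
The following Python sketch conveys the idea under stated assumptions (token-based features and a simple smoothed-frequency information weight); it is in the spirit of the paper's beacon similarity, not its exact measure. An event is kept when the share of its information mass carried by beacon tokens exceeds a threshold, and the frequency cache is pruned so that unbounded streams can be handled.

    from collections import Counter
    from math import log

    class BeaconFilter:
        def __init__(self, beacons, threshold=0.2, cache_size=10_000):
            self.beacons, self.threshold = set(beacons), threshold
            self.cache_size = cache_size
            self.freq, self.seen = Counter(), 0

        def _info(self, token):  # -log of smoothed relative frequency
            return -log((self.freq[token] + 1) / (self.seen + 1))

        def keep(self, tokens):
            self.seen += 1
            hit = sum(self._info(t) for t in tokens if t in self.beacons)
            total = sum(self._info(t) for t in tokens) or 1.0
            self.freq.update(tokens)              # learn from the event afterwards
            if len(self.freq) > self.cache_size:  # bound state on infinite streams
                self.freq = Counter(dict(self.freq.most_common(self.cache_size)))
            return hit / total >= self.threshold

    f = BeaconFilter(beacons={"DENIED", "root"})
    print(f.keep(["login", "root", "DENIED"]))  # True: beacon-heavy event
    print(f.keep(["login", "ok"]))              # False: no beacon tokens
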
Automatization of the Stream Mining Process

The problem this paper addresses is related to Data Stream Mining and its automation within Information Systems. Our aim is to show that the expertise which is usually provided by data and data mining experts, and which is crucial for problems of this kind, can be successfully captured and computerized. To this end we observed data mining experts at work and, in discussion with them, coded their knowledge in the form of an expert system. The evaluation over four different datasets confirms that automation of the stream mining process is possible and can produce results comparable to those achieved by data mining experts.

Lovro Šubelj, Zoran Bosnić, Matjaž Kukar, Marko Bajec
Matching User Profiles Across Social Networks

Social Networking Sites, such as Facebook and LinkedIn, are clear examples of the impact that Web 2.0 has on people around the world, because they target an aspect of life that is extremely important to anyone: social relationships. The key to building a social network is the ability to find people that we know in real life, which, in turn, requires those people to make some personal information publicly available, such as their names, family names, locations and birth dates, to name a few. However, it is not uncommon for individuals to create multiple profiles in several social networks, each containing partially overlapping sets of personal information. Matching those different profiles makes it possible to create a global profile that gives a holistic view of the information about an individual. In this paper, we present an algorithm that uses the network topology and the publicly available personal information to iteratively match profiles across n social networks, based on those individuals who disclose the links to their multiple profiles. The evaluation results, obtained on a real dataset of around 2 million profiles, show that our algorithm achieves high accuracy.

Nacéra Bennacer, Coriane Nana Jipmo, Antonio Penta, Gianluca Quercini
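
A minimal sketch of such iterative matching (assuming toy adjacency maps and name strings; the paper's actual scoring is richer) starts from seed pairs disclosed by users and repeatedly matches the remaining profile pair with the best combined name and neighbourhood score:

    from difflib import SequenceMatcher

    def iterative_match(adj_a, adj_b, names_a, names_b, seeds, threshold=0.5):
        matched = dict(seeds)          # profile in network A -> profile in B
        changed = True
        while changed:
            changed = False
            for a in adj_a:
                if a in matched:
                    continue
                best, best_score = None, threshold
                for b in adj_b:
                    if b in matched.values():
                        continue
                    name_sim = SequenceMatcher(None, names_a[a], names_b[b]).ratio()
                    # share of a's neighbours whose match is a neighbour of b
                    common = sum(1 for n in adj_a[a] if matched.get(n) in adj_b[b])
                    score = 0.5 * name_sim + 0.5 * common / max(len(adj_a[a]), 1)
                    if score > best_score:
                        best, best_score = b, score
                if best is not None:
                    matched[a], changed = best, True
        return matched

    adj_a = {"u1": {"u2"}, "u2": {"u1", "u3"}, "u3": {"u2"}}
    adj_b = {"v1": {"v2"}, "v2": {"v1", "v3"}, "v3": {"v2"}}
    names_a = {"u1": "Ann Lee", "u2": "Bob Roy", "u3": "Cai Sun"}
    names_b = {"v1": "Ann Lee", "v2": "B. Roy", "v3": "Cai Sun"}
    print(iterative_match(adj_a, adj_b, names_a, names_b, seeds={"u1": "v1"}))
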

Process Mining

Indexing and Efficient Instance-Based Retrieval of Process Models Using Untanglings

Process-Aware Information Systems (PAISs) support executions of operational processes that involve people, resources, and software applications on the basis of

process models

. Process models describe vast, often infinite, amounts of

process instances

, e.g., workflows supported by the systems. With the increasing adoption of PAISs, large process model

repositories

emerged in companies and public organizations. These repositories constitute significant information resources. Accurate and efficient retrieval of process models and/or process instances from such repositories is interesting for multiple reasons, e.g., searching for similar models/instances, filtering, reuse, standardization, process compliance checking, verification of formal properties, etc. This paper proposes a technique for

indexing

process models that relies on their alternative representations, called

untanglings

. We show the use of untanglings for retrieval of process models based on process instances that they specify via a solution to the

total executability problem

. Experiments with industrial process models testify that the proposed retrieval approach is up to three orders of magnitude faster than the state of the art.

Artem Polyvyanyy, Marcello La Rosa, Arthur H. M. ter Hofstede
Predictive Monitoring of Business Processes

Modern information systems that support complex business processes generally maintain significant amounts of process execution data, particularly records of events corresponding to the execution of activities (event logs). In this paper, we present an approach to analyze such event logs in order to predictively monitor business constraints during business process execution. At any point during an execution of a process, the user can define business constraints in the form of linear temporal logic rules. When an activity is being executed, the framework identifies input data values that are more (or less) likely to lead to the achievement of each business constraint. Unlike reactive compliance monitoring approaches that detect violations only after they have occurred, our predictive monitoring approach provides early advice so that users can steer ongoing process executions towards the achievement of business constraints. In other words, violations are predicted (and potentially prevented) rather than merely detected. The approach has been implemented in the ProM process mining toolset and validated on a real-life log pertaining to the treatment of cancer patients in a large hospital.

Fabrizio Maria Maggi, Chiara Di Francescomarino, Marlon Dumas, Chiara Ghidini
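
As a rough sketch of the prediction step (assuming scikit-learn and an invented feature encoding; in the framework the labels come from checking the LTL constraints on completed historical cases), one can train a classifier on data snapshots of past case prefixes and score a running case:

    from sklearn.tree import DecisionTreeClassifier

    # One row per historical case prefix at the current activity:
    # hypothetical data attributes observed so far (e.g. lab value, age).
    X_train = [[120, 45], [180, 70], [110, 30], [200, 65]]
    y_train = [1, 0, 1, 0]   # 1 = LTL constraint was eventually fulfilled

    clf = DecisionTreeClassifier(max_depth=2).fit(X_train, y_train)

    running_case = [[150, 50]]
    print(clf.predict_proba(running_case))  # [P(violated), P(fulfilled)]
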
What Shall I Do Next?
Intention Mining for Flexible Process Enactment

Besides the benefits of flexible processes, practical implementations of process-aware information systems have also revealed difficulties encountered by process participants during enactment. Several support and guidance solutions based on process mining have been proposed, but they lack a suitable semantics for human reasoning and decision making as they mainly rely on low-level activities. Applying design science, we created FlexPAISSeer, an intention mining oriented approach, with its component artifacts: 1) IntentMiner, which discovers the intentional model of the executable process in an unsupervised manner; 2) IntentRecommender, which generates recommendations as intentions and confidence factors, based on the mined intentional process model and probabilistic calculus. The artifacts were evaluated in a case study with a Netherlands software company, using a childcare system that allows flexible data-driven process enactment.

Elena V. Epure, Charlotte Hug, Rebecca Deneckère, Sjaak Brinkkemper

Models

Using Reference Domain Ontologies to Define the Real-World Semantics of Domain-Specific Languages

This paper proposes a principled approach to the definition of real-world semantics for declarative domain-specific languages. The approach is based on: (i) the explicit representation of the admissible states of the world through a reference domain ontology (which serves as semantic foundation for the domain-specific language), (ii) a representation of the valid expressions of a domain-specific language (to determine the abstract syntax of the language), and (iii) the rigorous definition of the relation between the abstract syntax and the reference domain ontology (to define the real-world semantics of the language). These three elements of the approach are axiomatized in three corresponding logic theories, enabling a systematic treatment of real-world semantics, including formal tooling to support language design and assessment.

Victorio A. de Carvalho, João Paulo A. Almeida, Giancarlo Guizzardi
Dual Deep Instantiation and Its ConceptBase Implementation

Application integration requires the consideration of instance data and schema data. Instance data in one application may be schema data for another application, which gives rise to multiple instantiation levels. Using deep instantiation, an object may be deeply characterized by representing schema data about objects several instantiation levels below. Deep instantiation still demands a clear separation of instantiation levels: the source and target objects of a relationship must be at the same instantiation level. This separation is inadequate in the context of application integration. Dual deep instantiation (DDI), on the other hand, allows for relationships that connect objects at different instantiation levels. The depth of the characterization may be specified separately for each end of the relationship. In this paper, we present and implement set-theoretic predicates and axioms for the representation of conceptual models with DDI.

Bernd Neumayr, Manfred A. Jeusfeld, Michael Schrefl, Christoph Schütz
An Adapter-Based Approach to Co-evolve Generated SQL in Model-to-Text Transformations

Forward Engineering advocates for code to be generated dynamically through model-to-text transformations that target a specific platform. In this setting, platform evolution can leave the transformation, and hence the generated code, outdated. This issue is exacerbated by the perpetual-beta phenomenon in Web 2.0 platforms, where continuous delta releases are common practice. Here, manual co-evolution becomes cumbersome. This paper looks at how to automate, fully or in part, the synchronization process between the platform and the transformation. To this end, the transformation process is split in two parts: the stable part is coded as a MOFScript transformation, whereas the unstable side is isolated through an adapter that is implicitly called by the transformation at generation time. In this way, platform upgrades impact the adapter but leave the transformation untouched. The work focuses on DB schema evolution and takes MediaWiki as a vivid case study. A first case study indicates that the upfront cost of using the adapter pays off after three MediaWiki upgrades.

Jokin García, Oscar Dìaz, Jordi Cabot

Mining Event Logs

Mining Predictive Process Models out of Low-level Multidimensional Logs

Process mining techniques have been gaining attention, especially as concerns the discovery of predictive process models. Traditionally focused on workflows, they usually assume that process tasks are clearly specified and referred to in the logs. However, this limits their application in many real-life BPM environments (e.g. issue tracking systems) where the traced events do not match any predefined task but keep lots of context data. In order to make the use of predictive process mining on such logs easier and more effective, we devise a new approach combining the discovery of different execution scenarios with the automatic abstraction of log events. The approach has been integrated in a prototype system supporting the discovery, evaluation and reuse of predictive process models. Tests on real-life data show that the approach achieves compelling prediction accuracy w.r.t. state-of-the-art methods and finds interesting descriptions of activities and process variants.

Francesco Folino, Massimo Guarascio, Luigi Pontieri
Mining Event Logs to Assist the Development of Executable Process Variants

Developing process variants is a proven way to flexibly adapt a business process model to different markets. Contemporary research on variant development has focused on conceptual process models. However, process models do not always exist, even when process logs are available in information systems. Moreover, process logs are often more detailed than process models and reflect the behavior of the process more closely. In this paper, we propose an activity recommendation approach that takes process logs into account to assist the development of executable process variants. To this end, we define a notion of neighborhood context for each activity based on logs, which captures order constraints between activities together with their occurrence frequency. The similarity of the neighborhood contexts of activities then provides us with a basis to recommend activities while a new process model is created. The approach has been implemented as a plug-in for ProM. Furthermore, we conducted experiments on a large collection of process logs. The results indicate that our approach is feasible and applicable in real use cases.

Nguyen Ngoc Chan, Karn Yongsiriwit, Walid Gaaloul, Jan Mendling
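
The notion of neighborhood context can be approximated in a few lines of Python (a simplified sketch keeping only direct predecessors/successors with frequencies; the paper's definition is more general):

    from collections import Counter
    from math import sqrt

    def context(traces, activity):
        """Multiset of directly preceding/following activities, with counts."""
        ctx = Counter()
        for t in traces:
            for i, x in enumerate(t):
                if x != activity:
                    continue
                if i > 0:
                    ctx[("pre", t[i - 1])] += 1
                if i + 1 < len(t):
                    ctx[("post", t[i + 1])] += 1
        return ctx

    def cosine(c1, c2):
        dot = sum(c1[k] * c2[k] for k in c1.keys() & c2.keys())
        norm = (sqrt(sum(v * v for v in c1.values()))
                * sqrt(sum(v * v for v in c2.values())))
        return dot / norm if norm else 0.0

    traces = [["receive", "check", "approve", "archive"],
              ["receive", "check", "reject", "archive"]]
    # Identical contexts: where "approve" fits, "reject" is a good candidate.
    print(cosine(context(traces, "approve"), context(traces, "reject")))  # 1.0
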
An Extensible Framework for Analysing Resource Behaviour Using Event Logs

Business processes depend on human resources, and managers must regularly evaluate the performance of their employees based on a number of measures, some of which are subjective in nature. As modern organisations use information systems to automate their business processes and record information about process executions in event logs, it now becomes possible to get objective information about resource behaviour by analysing data recorded in event logs. We present an extensible framework for extracting knowledge from event logs about the behaviour of a human resource and for analysing the dynamics of this behaviour over time. The framework is fully automated and implements a predefined set of behavioural indicators for human resources. It also provides a means for organisations to define their own behavioural indicators, using the conventional Structured Query Language, and a means to analyse the dynamics of these indicators. The framework's applicability is demonstrated using an event log from a German bank.

Anastasiia Pika, Moe T. Wynn, Colin J. Fidge, Arthur H. M. ter Hofstede, Michael Leyer, Wil M. P. van der Aalst
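
Since the framework lets organisations express indicators in plain SQL, a user-defined indicator might look like the following sketch (the event-log schema and the indicator itself are hypothetical; sqlite3 is used only to make the example self-contained):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE event_log (resource TEXT, activity TEXT, "
                 "complete_ts TEXT)")
    conn.executemany("INSERT INTO event_log VALUES (?, ?, ?)",
                     [("alice", "Check application", "2014-01-03"),
                      ("alice", "Approve loan", "2014-01-20"),
                      ("bob", "Check application", "2014-01-05")])

    # Indicator: number of distinct activity types completed per resource
    # and month; tracking it over time reveals changes in work scope.
    rows = conn.execute("""
        SELECT resource, strftime('%Y-%m', complete_ts) AS month,
               COUNT(DISTINCT activity) AS distinct_activities
        FROM event_log
        GROUP BY resource, month
    """).fetchall()
    print(rows)  # [('alice', '2014-01', 2), ('bob', '2014-01', 1)]
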

Databases

Extracting Facets from Lost Fine-Grained Categorizations in Dataspaces

Categorization of instances in dataspaces is a difficult and time consuming task, usually performed by domain experts. In this paper we propose a semi-automatic approach to the extraction of facets for the fine-grained categorization of instances in dataspaces. We focus on the case where instances are categorized under heterogeneous taxonomies in several sources. Our approach leverages Taxonomy Layer Distance, a new metric based on structural analysis of source taxonomies, to support the identification of meaningful candidate facets. Once validated and refined by domain experts, the extracted facets provide a fine-grained classification of dataspace instances. We implemented and evaluated our approach in a real world dataspace in the eCommerce domain. Experimental results show that our approach is capable of extracting meaningful facets and that the new metric we propose for the structural analysis of source taxonomies outperforms other state-of-the-art metrics.

Riccardo Porrini, Matteo Palmonari, Carlo Batini
Towards a Form Based Dynamic Database Schema Creation and Modification System

The traditional approach to relational database design starts with the conceptual design of an application-based schema in a model like the Entity-Relationship model, then maps that to a logical design, eventually representing it as a set of related normalized tables. The project we present has been motivated by the needs of healthcare IT, where small group practices currently need systems that cater to their dynamic requirements without depending on EMR (Electronic Medical Record) systems. It is also relevant for researchers mining huge repositories of data, such as social networks, and creating extracts of data on the fly for data analytics. Based on user characteristics and needs, the data is likely to vary; hence, a dynamic back-end database must be created. This paper addresses a form-based approach to schema creation and modification.

Kunal Malhotra, Shibani Medhekar, Shamkant B. Navathe, M. D. David Laborde
CubeLoad: A Parametric Generator of Realistic OLAP Workloads

Unlike OLTP workloads, OLAP workloads are hardly predictable due to their inherently extemporaneous nature. Besides, obtaining real OLAP workloads by monitoring the queries actually issued in companies and organizations is quite hard. On the other hand, hardware and software benchmarking in the industrial world, as well as comparative evaluation of novel approaches in the research community, both need reference databases and workloads. In this paper we present CubeLoad, a parametric generator of workloads in the form of OLAP sessions, based on a realistic profile-based model. After describing the main features of CubeLoad, we discuss the results of some tests that show how workloads with very different features can be generated.

Stefano Rizzi, Enrico Gallinucci
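
A toy illustration of profile-based session generation in Python (profiles, dimensions, and probabilities invented; CubeLoad's actual model is considerably richer):

    import random

    PROFILES = {"explorer": {"session_len": (5, 12), "drill_prob": 0.7},
                "reporter": {"session_len": (2, 4),  "drill_prob": 0.2}}
    LEVELS = {"time": ["year", "month", "day"],
              "store": ["country", "city"],
              "product": ["category", "item"]}

    def generate_session(profile, rng=random):
        p = PROFILES[profile]
        group_by = {d: 0 for d in LEVELS}          # start at the coarsest levels
        session = []
        for _ in range(rng.randint(*p["session_len"])):
            session.append({d: LEVELS[d][lvl] for d, lvl in group_by.items()})
            if rng.random() < p["drill_prob"]:     # drill down along one dimension
                d = rng.choice(list(LEVELS))
                group_by[d] = min(group_by[d] + 1, len(LEVELS[d]) - 1)
            else:                                  # jump to a fresh query
                group_by = {d: rng.randrange(len(LEVELS[d])) for d in LEVELS}
        return session

    print(generate_session("explorer"))
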

Software Engineering

Task Specification and Reasoning in Dynamically Altered Contexts

Software systems are prone to evolution in order to be kept operational and meet new requirements. However, for large systems such evolution activities cannot occur in a vacuum. Instead, specific action plans must be devised so that evolution goals can be achieved within an acceptable level of deviation, or risk. In this paper we present an approach that allows for the identification of plans, in the form of actions that satisfy a goal model, when the environment is constantly changing. The approach is based on sequences of mutations of an initial solution, using a local search algorithm. Experimental results indicate that, even for medium-size models, the approach outperforms weighted Max-SAT algorithms in execution time, while achieving an almost optimal solution. The approach is demonstrated on an example scenario of re-configuring a dynamically provisioned system.

George Chatzikonstantinou, Michael Athanasopoulos, Kostas Kontogiannis
Finding Optimal Plans for Incremental Method Engineering

Incremental method engineering proposes to evolve the information systems development methods of a software company through a step-wise improvement process. In practice, this approach has proved effective for reducing the risks of failure while introducing method changes. However, little attention has been paid to the important problem of identifying an adequate plan for implementing the changes in the company's context. To overcome this deficiency, we propose an approach that assists analysts by suggesting, via automated reasoning, optimal and quasi-optimal plans for implementing method changes. After formalizing the Process-Deliverable Diagrams language for describing the method changes to implement, we present a planning framework for generating plans that comply with different types of constraints. We also describe an implementation of the modeling and planning components of our approach.

Kevin Vlaanderen, Fabiano Dalpiaz, Sjaak Brinkkemper
On the Effectiveness of Concern Metrics to Detect Code Smells: An Empirical Study

Traditional software metrics have been used to evaluate the maintainability of software programs by supporting the identification of code smells. Recently, concern metrics have also been proposed for this purpose. While traditional metrics quantify properties of software modules, concern metrics quantify concern properties, such as scattering and tangling. Despite being increasingly used in empirical studies, there is a lack of empirical knowledge about the effectiveness of concern metrics for detecting code smells. This paper reports the results of an empirical study investigating whether concern metrics can be useful indicators of three code smells, namely Divergent Change, Shotgun Surgery, and God Class. In this study, 54 subjects from two institutions analyzed traditional and concern metrics aiming to detect instances of these code smells in two information systems. The study results indicate that, in general, concern metrics support developers in detecting code smells. In particular, we observed that (i) the time spent on code smell detection is more relevant than the developers' expertise; (ii) concern metrics are clearly useful for detecting Divergent Change and God Class; and (iii) the concern metric Number of Concerns per Component is a reliable indicator of Divergent Change.

Juliana Padilha, Juliana Pereira, Eduardo Figueiredo, Jussara Almeida, Alessandro Garcia, Cláudio Sant’Anna
Backmatter
Metadata

Title: Advanced Information Systems Engineering
Editors: Matthias Jarke, John Mylopoulos, Christoph Quix, Colette Rolland, Yannis Manolopoulos, Haralambos Mouratidis, Jennifer Horkoff
Copyright Year: 2014
Publisher: Springer International Publishing
Electronic ISBN: 978-3-319-07881-6
Print ISBN: 978-3-319-07880-9
DOI: https://doi.org/10.1007/978-3-319-07881-6
