
## About this book

This book constitutes extended and revised papers from the 19th International Conference on Enterprise Information Systems, ICEIS 2017, held in Porto, Portugal, in April 2017.

The 28 papers presented in this volume were carefully reviewed and selected for inclusion in this book from a total of 318 submissions. They were organized in topical sections named: databases and information systems integration; artificial intelligence and decision support systems; information systems analysis and specification; software agents and internet computing; human-computer interaction; and enterprise architecture.

## Table of contents

### Towards a Framework for Aiding the Collaborative Management of Informal Projects

Abstract
Informal projects, such as schoolwork and social meetings, are managed by groups in a collaborative way. Conventional management approaches are not suited to the dynamism and decentralization required by informal projects. Although commonly used, communication tools alone are not sufficient for managing informal projects, due to a lack of coordination mechanisms and project awareness. Based on recommendations from cooperative work, motivation mechanisms and project management, we propose a framework to aid the collaborative management of informal projects. The framework consists of five guidelines regarding project definition, activity management, responsibility sharing, contribution recognition, and project visibility. We demonstrate that the proposed framework is practical by building a mobile application. Based on two case studies, we show that the framework assists management activities and also fosters participation and recognition. We also discuss the use of the framework to analyze an existing tool in order to identify weaknesses and improvements.
Luma Ferreira, Juliana Bezerra, Celso Hirata

### Using a Time-Based Weighting Criterion to Enhance Link Prediction in Social Networks

Abstract
Recently, the link prediction (LP) problem has attracted much attention from both the scientific and industrial communities. The problem is to predict whether two unlinked nodes in a network will connect in the future. Several approaches have been proposed to solve it. Some of them compute a compatibility degree (link strength) between connected nodes and apply similarity metrics between non-connected nodes in order to identify potential links. However, despite the acknowledged importance of temporal data for the LP problem, few initiatives have investigated the use of this kind of information to represent link strength. In this paper, we propose a weighting criterion that combines the frequency of interactions and temporal information about them in order to define the link strength between pairs of connected nodes. The results of our experiment with weighted and non-weighted similarity metrics in ten co-authorship networks present statistical evidence supporting our hypothesis that weighting links based on temporal information may, in fact, improve link prediction.
Carlos Pedro Muniz, Ronaldo Goldschmidt, Ricardo Choren
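
The abstract above does not give the weighting formula, so the following is only a minimal sketch of the general idea, under stated assumptions: each interaction contributes an exponentially decayed amount to the link weight (so weight grows with both frequency and recency), and unlinked pairs are scored with a weighted common-neighbors metric. The function names, the `decay` parameter, and the exponential form are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def link_weights(interactions, now, decay=0.5):
    """Weight each undirected link by interaction frequency and recency.

    `interactions` is an iterable of (u, v, year) tuples. Every interaction
    adds exp(-decay * age) to its link, so frequent and recent interactions
    yield stronger links. The exponential form is an assumption.
    """
    weights = defaultdict(float)
    for u, v, year in interactions:
        if u == v:
            continue  # ignore self-loops
        weights[frozenset((u, v))] += math.exp(-decay * (now - year))
    return dict(weights)


def adjacency(weights):
    """Build a node -> set-of-neighbors map from the weighted links."""
    adj = defaultdict(set)
    for edge in weights:
        u, v = tuple(edge)
        adj[u].add(v)
        adj[v].add(u)
    return adj


def weighted_common_neighbors(weights, x, y):
    """Score a candidate link (x, y) by summing the weights of the links
    that connect x and y to each of their common neighbors."""
    adj = adjacency(weights)
    common = adj[x] & adj[y]
    return sum(
        weights[frozenset((x, z))] + weights[frozenset((y, z))] for z in common
    )
```

With co-authorship data, `interactions` would hold one tuple per co-authored paper; a pair of authors who recently and repeatedly published with the same third author then receives a higher score than a pair whose shared collaborations are old.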

### Professional Competence Identification Through Formal Concept Analysis

Abstract
As the job market has become increasingly competitive, job seekers have needed help to increase their competence in order to obtain a position. Competence is defined by the set of skills necessary to perform an organizational function. It would therefore be helpful to identify the sets of skills that are necessary to reach job positions. Currently, online professional social networks are attracting interest from people all around the world whose goals are oriented to business relationships. With the amount of information available in this kind of network, it is possible to apply techniques to identify the competencies that people have developed in their careers. In this scenario, the adoption of computational methods to solve this problem has been fundamental. Formal concept analysis (FCA) has been an effective technique for data analysis because it makes it possible to identify conceptual structures in data sets through conceptual lattices and implications. A specific set of implications, known as proper implications, represents the set of conditions to reach a specific goal. In this work, we propose an FCA-based approach to identify and analyze professional competence through proper implications.
Paula R. Silva, Sérgio M. Dias, Wladmir C. Brandão, Mark A. Song, Luis E. Zárate

### Data Quality Problems in TPC-DI Based Data Integration Processes

Abstract
Many data-driven organisations need to integrate data from multiple, distributed and heterogeneous resources for advanced data analysis. A data integration system is an essential component for collecting data into a data warehouse or other data analytics systems. Various data integration systems exist, either created in-house or provided by vendors. Hence, an organisation needs to compare and benchmark them when choosing one that meets its requirements. TPC-DI was recently proposed as the first industrial benchmark for evaluating data integration systems. When using this benchmark, we found some typical data quality problems in the TPC-DI data source, such as multi-meaning attributes and inconsistent data schemas, which could delay the data integration process or even cause it to fail. This paper explains the processes of this benchmark and summarises typical data quality problems identified in the TPC-DI data source. Furthermore, in order to prevent data quality problems and proactively manage data quality, we propose a set of practical guidelines for researchers and practitioners to conduct data quality management when using the TPC-DI benchmark.
Qishan Yang, Mouzhi Ge, Markus Helfert

### Efficient Filter-Based Algorithms for Exact Set Similarity Join on GPUs

Abstract
Set similarity join is a core operation for text data integration, cleaning, and mining. Most state-of-the-art solutions rely on inherently sequential, CPU-based algorithms. In this paper, we propose a parallel algorithm for set similarity joins that harnesses the power of GPU systems through filtering techniques and divide-and-conquer strategies that scale well with data size. Furthermore, we present parallel algorithms for all data pre-processing phases. As a result, we have an end-to-end solution to the set similarity join problem, which receives input text data, outputs pairs of similar strings, and is executed entirely on the GPU. Our experimental results on standard datasets show substantial speedups over the fastest algorithms in the literature.
Rafael David Quirino, Sidney Ribeiro-Junior, Leonardo Andrade Ribeiro, Wellington Santos Martins
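
To make the notion of a filter-based exact set similarity join concrete, here is a minimal sequential sketch of the well-known prefix-filtering technique with Jaccard similarity. It is not the GPU algorithm of the paper above; it only illustrates, on the CPU, the kind of filter-then-verify scheme the abstract refers to. The function names are hypothetical, and input sets are assumed to contain mutually comparable tokens such as strings.

```python
import math
from collections import Counter, defaultdict


def jaccard(a, b):
    """Exact Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def prefix_filter_join(sets, threshold):
    """Self-join: return all index pairs (j, i), j < i, with Jaccard >= threshold.

    Tokens of every set are sorted by a global order (rarest first). Two sets
    can only reach the threshold if their prefixes of length
    |s| - ceil(threshold * |s|) + 1 share at least one token, so only those
    pairs are verified with the exact similarity function.
    """
    freq = Counter(tok for s in sets for tok in s)
    ordered = [sorted(s, key=lambda t: (freq[t], t)) for s in sets]

    index = defaultdict(list)  # token -> indices of sets with token in prefix
    results = []
    for i, toks in enumerate(ordered):
        # Small epsilon guards against floating-point noise in the product.
        min_overlap = math.ceil(threshold * len(toks) - 1e-9)
        prefix_len = len(toks) - min_overlap + 1

        candidates = set()
        for tok in toks[:prefix_len]:
            candidates.update(index[tok])
            index[tok].append(i)

        # Verification phase: compute the exact similarity for candidates only.
        for j in sorted(candidates):
            if jaccard(sets[i], sets[j]) >= threshold:
                results.append((j, i))
    return results
```

Because the filter never discards a qualifying pair, the result is exact; the filter only reduces how many pairs reach the verification step.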

### Experimenting and Assessing a Probabilistic Business Process Deviance Mining Framework Based on Ensemble Learning

Abstract
Business Process Intelligence (BPI) and Process Mining, two very active areas of research, share a great interest in discovering an effective Deviance Detection Model (DDM) computed from log data. The DDM allows us to determine whether new instances of the target business process are deviant or not, making it extremely useful in modern application scenarios such as cybersecurity and fraud detection. In this chapter, we significantly extend our previous line of work, which has produced, over the years, an innovative ensemble-learning framework for mining business process deviances, whose main benefit is the introduction of a kind of multi-view learning scheme. One of the most relevant achievements of this extended work is an alternative meta-learning method for probabilistically combining the predictions of different base DDMs, integrated into a conceptual system architecture oriented to supporting common Business Process Management (BPM) scenarios. In addition, we envisage combining this approach with a deviance explanation methodology that leverages and extends a method we proposed in previous research. Basically, the latter method allows the discovery of accurate and readable deviance-aware trace clusters defined in terms of descriptive rules over both properties and behavioral aspects of the traces. We complement our analytical contributions with a comprehensive experimental assessment and analysis, including a comparison with a state-of-the-art DDM discovery approach. The experimental results confirm the flexibility, reliability and effectiveness of the proposed business process deviance mining framework.
Alfredo Cuzzocrea, Francesco Folino, Massimo Guarascio, Luigi Pontieri

### Triplet Markov Chains Based- Estimation of Nonstationary Latent Variables Hidden with Independent Noise

Abstract
Estimation of hidden variables is among the most challenging tasks in statistical signal processing. In this context, hidden Markov chains have been extensively used due to their ability to recover hidden variables from observed ones even for large data. Such models fail, however, to handle nonstationary data when parameters are unknown. The aim of this paper is to show how the recent triplet Markov chains, strictly more general models exhibiting comparable computational cost, can be used to overcome this shortcoming in two different ways: (i) in a firmly Bayesian context by considering an additional Markov process to model the switches of the hidden variables; and, (ii) by introducing Dempster-Shafer theory to model the lack of precision of prior distributions. Moreover, we analyze both approaches and assess their performance through experiments conducted on sampled data and noised images.
Mohamed El Yazid Boudaren, Emmanuel Monfrini, Kadda Beghdad Bey, Ahmed Habbouchi, Wojciech Pieczynski

### Detection and Explanation of Anomalous Payment Behavior in Real-Time Gross Settlement Systems

Abstract
In this paper, we discuss how to apply an autoencoder to detect anomalies in payment data derived from a Real-Time Gross Settlement system. Moreover, we introduce a drill-down procedure to measure the extent to which the inflow or outflow of a particular bank explains an anomaly. Experimental results on real-world payment data show that our method can detect, with reasonable accuracy, the liquidity problems of a bank that was subject to a bank run.
Ron Triepels, Hennie Daniels, Ronald Heijmans

### Quality of Care Driven Scheduling of Clinical Pathways Under Resource and Ethical Constraints

Abstract
Hospitals currently face a growing number of patients under increasing pressure for profitability. In this context, clinical pathways provide a more efficient organisation of the complex multidisciplinary workflows involved. They also support the timely decision making required for meeting strict treatment constraints for patients within limited hospital resources. However, implementing clinical pathways is challenging because of the variety of constraints that must be met simultaneously for a large pool of patients. In this paper, we propose decision support for driving clinical pathways primarily based on care quality indicators, so that patient health always has priority. We demonstrate how constraint-based local search techniques can (1) support real-world chemotherapy pathways, (2) efficiently react to a variety of adverse events, such as unexpected delays or partial drug deliveries, and (3) address ethical concerns related to the fairness of resource allocation. Our claim is supported by an extensive validation on a series of scenarios of different size, load and complexity.
Christophe Ponsard, Renaud De Landtsheer, Yoann Guyot, François Roucoux, Bernard Lambeau

### Strategies to Foster Software Ecosystem Partnerships – A System Dynamics Analysis

Abstract
Software ecosystems originate in the idea of business ecosystems, as a collection of relationships among actors in an economic community. They represent the current functioning of the IT industry, in which companies co-create innovations and extend their solutions by means of partnerships. This research investigates the driving factors of relationships in a software ecosystem. To address this goal, we performed multiple case studies of two emerging software ecosystems formed by Small-to-Medium Enterprises. Based on evidence from twenty-seven interviews conducted with eight companies, we analysed the main facilitators of and barriers to thriving partnerships. We used the System Dynamics method to identify cause-effect relations among these factors. The resulting dynamic models enabled us to map key factors and interactions and to propose four strategies that promote software ecosystem health. We believe that practitioners can benefit from this synthesis by understanding which facilitators to reinforce and which barriers to restrain, as a means to catalyse the success of their networks. In addition, we demonstrate for researchers the utility of System Dynamics in providing a diagnosis of relevant scenarios.
George Valença, Carina Alves

### Task-oriented Requirements Engineering for Personal Decision Support Systems

Abstract
[Context and motivation] In decision-making, executives are supported by Personal Decision Support Systems (PDSSs), which are information systems providing a decision- and user-specific data presentation. PDSSs operate on current data with predefined queries and provide a rich user interface (UI). Therefore, a Requirements Engineering (RE) method for PDSSs should support the elicitation and specification of detailed requirements for specific decisions. However, existing RE approaches for decision support systems typically focus on ad-hoc decisions in the area of data warehouses. [Question/problem] Task-oriented RE (TORE) emphasizes a comprehensive RE specification which covers stakeholders’ tasks, data, system functions, interactions, and UI. TORE allows for early UI prototyping, which is crucial for PDSSs. Therefore, we explore TORE’s suitability for PDSSs. [Principal ideas/results] Following the Design Science methodology, we assess TORE for its suitability for PDSS specification in a problem investigation. We propose decision-specific adjustments of TORE (DsTORE), which we evaluate in a case study. [Contribution] This paper is an extended version of previously published work. The contribution of this paper is fourfold. First, the suitability of the task-oriented RE method TORE for the specification of a PDSS is examined in a problem investigation. Second, the decision-specific extension of TORE is proposed as the DsTORE method. DsTORE allows identifying and specifying details of decisions to be supported by a PDSS, utilizing a number of artifacts. Third, strategic information management is used as the example task for the evaluation of DsTORE in a case study. All interview questions used in the design science cycle are given. Experiences from the study and the method design are presented. Fourth, the evaluation of the developed system prototype is presented along with the questionnaire we used, showing the utility and acceptance.
Christian Kücherer, Barbara Paech

### Feature Model as a Design-pattern-based Service Contract for the Service Provider in the Service Oriented Architecture

Abstract
In Service Oriented Architecture (SOA), many approaches to feature modeling of the Service Provider (SP) have been proposed, notably the two widely used service contracts WSDL and WADL. By studying these approaches, we found that they suffer from several problems: notably, they only work for specific communication technologies (e.g., SOAP or REST) and they do not explicitly model SOA Design Patterns (DPs) and their compounds. One major benefit of using a DP or a compound DP is to develop SPs with proven design solutions. In this paper, in order to overcome these problems, we propose an approach that integrates Software Product Line (SPL) techniques in the development of SPs. Essentially, we propose a Feature Model (FM), which is the de facto standard for variability modeling in SPL, for the feature modeling of SPs. This FM, named $$FM_{SP}$$, is designed as a DP-based service contract for SPs that models different features, including 16 SOA DPs and their compounds that are related to the service messaging category. Its objective is to enable developers to generate fully functional, valid, DP-based and highly customized SPs for different communication technologies. Through a practical case study and a developed tool, we validate our $$FM_{SP}$$ and demonstrate that it reduces the development costs (effort and time) of SPs.
Akram Kamoun, Mohamed Hadj Kacem, Ahmed Hadj Kacem, Khalil Drira

### A Method and Programming Model for Developing Interacting Cloud Applications Based on the TOSCA Standard

Abstract
Many cloud applications are composed of several interacting components and services. The communication between these components can be enabled, for example, by using standards such as WSDL and workflow technology. In order to wire these components, several endpoints must be exchanged, e.g., the IP addresses of deployed services. However, this exchange of endpoint information is highly dependent on the (i) middleware technologies, (ii) programming languages, and (iii) deployment technology used in a concrete scenario and, thus, increases the complexity of implementing such interacting applications. In this paper, we propose a programming model that eases the implementation of interacting components of automatically deployed TOSCA-based applications. Furthermore, we present a method following our programming model, which describes how such a cloud application can be systematically modeled, developed, and automatically deployed based on the TOSCA standard and how code generation capabilities can be utilized for this. The practical feasibility of the presented approach is validated by a system architecture and a prototypical implementation based on the OpenTOSCA ecosystem. This work is an extension of previous research that we presented at the International Conference on Enterprise Information Systems (ICEIS).
Michael Zimmermann, Uwe Breitenbücher, Frank Leymann

### Security Requirements and Tests for Smart Toys

Abstract
The Internet of Things creates an environment that allows the integration of physical objects into computer-based systems. More recently, smart toys have been introduced in the market as conventional toys equipped with electronic components that enable wireless network communication with mobile devices, which provide services to enhance the toy’s functionalities and data transmission over the Internet. Smart toys provide users with a more sophisticated and personalised experience. To do so, they need to collect a large amount of personal and context data by means of mobile applications, web applications, cameras, microphones and sensors, for instance. All data are processed and stored locally or in cloud servers. Naturally, this raises concerns around information security and child safety, because unauthorised access to confidential information may have many consequences. In fact, several security flaws in smart toys have recently been reported in the news. In this context, this paper presents an analysis of the toy computing environment based on the threat modelling process from the Microsoft Security Development Lifecycle, with the aim of identifying a minimum set of security requirements a smart toy should meet and proposing a general set of security tests to validate the implementation of the security requirements. As a result, we have identified 16 issues to be addressed, 15 threats and 22 security requirements for smart toys. We also propose using source code analysis tools to validate seven of the security requirements; three test classes to validate seven security requirements; and specific alpha and beta tests to validate the remaining requirements.
Luciano Gonçalves de Carvalho, Marcelo Medeiros Eler

### An Approach for Semantically-Enriched Recommendation of Refactorings Based on the Incidence of Code Smells

Abstract
Code smells are symptoms of bad decisions in the design and development of software. The occurrence of code smells in software can lead to costly consequences. Refactorings are considered adequate resources when it comes to reducing or removing the undesirable effects of smells in software. Ontologies and semantics can play a substantial role in reducing the interpretation burden on software engineers, who have to decide on adequate refactorings to mitigate the impact of smells. However, related work has given little attention to associating the recommendation of refactorings with the use of ontologies and semantics. Developers can benefit from the combination of code smell detection with a semantically oriented approach to the recommendation of refactorings. To make this possible, we expand the application of our previous ontology, ONTOlogy for Code smEll ANalysis (ONTOCEAN), to combine it with a new one, Ontology for SOftware REfactoring (OSORE). We also introduce a new tool, our REfactoring REcommender SYStem (RESYS), which is capable of binding our two ontologies. As a result, refactorings are automatically chosen and semantically linked to their respective code smells. We also conducted a preliminary evaluation of our approach in a real usage scenario with four open-source software projects.
Luis Paulo da Silva Carvalho, Renato Lima Novais, Laís do Nascimento Salvador, Manoel Gomes de Mendonça Neto

### A Scoring Method Based on Criteria Matching for Cloud Computing Provider Ranking and Selection

Abstract
Cloud computing has become a successful service model for hosting and elastic on-demand distribution of computing resources all around the world over the Internet. This cornerstone paradigm has not only been adopted by all major IT service providers (e.g., Google, Amazon, etc.), but has also triggered a competitive race to create new companies providing cloud computing services. Although the growing number of companies offering cloud computing services is beneficial to clients, it also challenges their ability to choose the company best suited to their requirements. Therefore, this work proposes a logical/mathematical scoring method to rank several candidate cloud computing providers and select the one most appropriate to the user. The method is based on the analysis of several criteria comprising the performance indicator values required by the user and associated with every cloud computing provider that is able to meet the user’s requirements. The proposed method consists of a three-stage algorithm that evaluates, scores, sorts and selects cloud providers based on the utility of their performance indicators relative to the performance indicator values required by the user. To illustrate the operation of the proposed method, an example of its use is provided.
Lucas Borges de Moraes, Adriano Fiorese
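
The abstract above names the stages (evaluate, score, sort, select) but not the scoring formula, so the sketch below is only one plausible reading, with everything beyond those stage names assumed: providers that miss any required value are eliminated, the remaining ones are scored by the average relative margin by which they exceed the requirements, and the list is sorted by that score. The function name, the data layout, and the margin-based utility are hypothetical; required values are assumed to be positive and at least one requirement must be given.

```python
def rank_providers(providers, requirements):
    """Rank cloud providers against user requirements.

    providers:    {name: {indicator: value}}
    requirements: {indicator: (required_value, higher_is_better)}

    Returns provider names, best first. Providers that lack an indicator or
    fail to meet a required value are excluded.
    """
    scored = []
    for name, indicators in providers.items():
        margins = []
        for key, (required, higher_is_better) in requirements.items():
            value = indicators.get(key)
            if value is None:
                break  # indicator not reported: cannot be evaluated
            if higher_is_better:
                margin = (value - required) / required
            else:
                margin = (required - value) / required
            if margin < 0:
                break  # requirement not met: eliminate the provider
            margins.append(margin)
        else:
            # Reached only if no requirement caused a break.
            scored.append((sum(margins) / len(margins), name))

    # Sort by descending score; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]
```

A user who requires, say, at least 99.0 % uptime and at most 50 ms latency would pass `{"uptime": (99.0, True), "latency_ms": (50, False)}`; the first element of the returned list is then the selected provider under these assumptions.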

### Describing Scenarios and Architectures for Time-Aware Recommender Systems for Learning

Abstract
This work investigates the use of Time-Aware Recommender Systems in e-learning systems. It defines recommender system architectures that take into account how time can be used in recommender systems in the learning domain. For each architecture, the main requirements for using time in a specific way are identified, and some algorithm ideas are presented. Scenarios are presented to illustrate how the proposed architectures can be useful. The results of this work can guide other researchers in the field in applying recommender system techniques in the learning domain.
Eduardo José de Borba, Isabela Gasparini, Daniel Lichtnow

### Towards Generating Spam Queries for Retrieving Spam Accounts in Large-Scale Twitter Data

Abstract
Twitter, as a top microblogging site, has become a valuable source of up-to-date, real-time information for a wide range of social-based research and applications. Intuitively, acceptable performance in such research and applications depends on working with information of adequate quality. However, Twitter has turned out to be a fertile environment for publishing noisy information in different forms. Consequently, maintaining high quality is a serious challenge, requiring great effort from Twitter’s administrators and researchers to address information quality issues. Social spam is a common type of noisy information, created and circulated by ill-intentioned users, so-called social spammers. More precisely, they misuse all possible services provided by Twitter to propagate their spam content, causing considerable information pollution in Twitter’s network. As Twitter’s anti-spam mechanism is neither effective against nor immune to the spam problem, a great deal of research has been dedicated to developing methods that detect and filter out spam accounts and tweets. However, these methods do not scale to large-scale Twitter data. Indeed, they require, as a mandatory step, additional information from Twitter’s servers, which is limited to a small number of requests per 15-minute time window; this is the main barrier to making these methods effective, since handling large-scale Twitter data requires months. Instead of inspecting every account in a given large-scale Twitter data set in a sequential or random fashion, in this paper we explore the applicability of information retrieval (IR) concepts to retrieve a subset of accounts with a high probability of being spam. Specifically, we introduce the design of an unsupervised method that partially processes a large collection of tweets to generate spam queries related to accounts’ attributes. Then, the spam queries are issued to retrieve and rank the most likely spam accounts in the given large-scale collection of Twitter accounts. Our experimental evaluation shows the efficiency of generating spam queries from different attributes to retrieve spam accounts in terms of precision, recall, and normalized discounted cumulative gain at different ranks.
Mahdi Washha, Aziz Qaroush, Manel Mezghani, Florence Sedes
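
The abstract above reports results in terms of normalized discounted cumulative gain (NDCG) at different ranks. As a reminder of what that metric measures, here is a minimal sketch of the standard NDCG@k definition with a base-2 logarithmic discount. It is a generic illustration, not the paper's evaluation code; the function names are hypothetical, and the ideal ordering is computed from the retrieved list itself, which is an assumption about how relevance labels are supplied.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k results.

    `relevances` lists the relevance label of each retrieved item in rank
    order (for example, 1 for a spam account and 0 for a legitimate one).
    """
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances, k):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that places all spam accounts first scores 1.0, and the score drops as spam accounts are pushed further down the list.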

### Statistical Methods for Use in Analysis of Trust-Skyline Sets

Abstract
The volume and veracity of Resource Description Framework (RDF) data on the web are two main issues in managing information. Due to the diversity of RDF data, several researchers have enriched the basic RDF data model with trust information to rate the trustworthiness of the collected data.
This paper is an extension of our previous work, in which we extended Trust-Skyline queries over RDF data. We are interested in analyzing the Trust-Skyline list. In particular, we study the user-defined trust measure ($$\alpha$$) problem, which consists in checking the impact of this measure on the resulting list. To this end, we first distinguish two main categories of Trust-Skyline points: points that enter the final list after the Pareto-dominance check, and points that have trust measures less than $$\alpha$$.
Then, we propose statistical methods to investigate the dependence on trust measures. Specifically, we use measures of central tendency and measures of spread for this analysis. Experiments conducted on the algorithm’s implementations showed promising results.
Amna Abidi, Mohamed Anis Bach Tobji, Allel Hadjali, Boutheina Ben Yaghlane
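
The abstract above does not define the Trust-Skyline semantics precisely, so the following is only a rough sketch under stated assumptions: each point carries a tuple of attribute values (smaller is better) and a trust measure, points with trust below the user-defined threshold are dropped before the Pareto-dominance check, and the trust values of the resulting list are summarized with standard measures of central tendency and spread. The function names and the filter-then-dominate order are hypothetical and may differ from the authors' algorithm.

```python
from statistics import mean, median, pstdev


def dominates(p, q):
    """Pareto dominance for minimization: p is no worse than q in every
    dimension and strictly better in at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def trust_skyline(points, alpha):
    """Return the points with trust >= alpha that no other such point dominates.

    `points` is a list of (values, trust) pairs, where `values` is a tuple.
    """
    trusted = [(v, t) for v, t in points if t >= alpha]
    return [
        (v, t)
        for v, t in trusted
        if not any(dominates(w, v) for w, _ in trusted)
    ]


def trust_summary(skyline):
    """Central tendency and spread of the trust values in a non-empty skyline."""
    trusts = [t for _, t in skyline]
    return {"mean": mean(trusts), "median": median(trusts), "stdev": pstdev(trusts)}
```

Running `trust_skyline` on the same data with several values of `alpha` and comparing the `trust_summary` outputs gives a simple way to observe how the threshold changes the resulting list.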

### Enabling Semantics in Enterprises

Abstract
Nowadays, enterprises generate massive amounts of heterogeneous structured and unstructured data within their factories and attempt to store them in data lakes. However, potential users, such as data scientists, encounter problems when they have to find, analyze and especially understand the data. Existing solutions use ontologies as a data governance technique for establishing a common understanding of data sources. While ontologies build a solid basis for representing knowledge, their construction is a very complex task which requires the knowledge of multiple domain experts. Moreover, in fast and continuously evolving enterprises, a static ontology will quickly become outdated.
To cope with this problem, we developed the information processing platform ESKAPE. With the help of ESKAPE, data publishers annotate their added data sources with semantic models providing additional knowledge which enables later users to process, query and subscribe to heterogeneous data as information products. Instead of solely creating semantic models based on a pre-defined ontology, ESKAPE maintains a knowledge graph which learns from the knowledge provided within the semantic models by data publishers. Based on the semantic models and the evolving knowledge graph, ESKAPE supports enterprises’ data scientists in finding, analyzing and understanding data.
To evaluate ESKAPE’s usability, we conducted an open competitive hackathon where users had to develop mobile applications. The feedback received shows that ESKAPE already reduced the participants’ workload in obtaining the required data and made the available data easier to work with.
André Pomp, Alexander Paulus, Sabina Jeschke, Tobias Meisen

### Behavioral Economics Through the Lens of Persuasion Context Analysis: A Review of Contributions in Leading Information Systems Journals

Abstract
As technology becomes an integral part of our everyday lives, it becomes more crucial to investigate how it can be further harnessed to improve individuals’ wellbeing. This involves studying users’ interactions with technology, how different design techniques influence their use, and the factors that might lead to sub-optimal use of technology. Such factors include decision biases, which are mostly investigated in behavioral economics research. Behavioral economics counters the arguments of standard economic theories and combines psychological theories and economics to study how people actually behave as opposed to how they should behave as rational beings. Thus, this review provides an overview of behavioral economics research in the major IS journals. The aim is to determine the extent of such research within the IS field. An electronic search of the major IS journals was conducted over an 8-year period, and the findings were categorized according to the use, user and technology contexts of the persuasive systems design model. The findings reveal the need for awareness of how various behavioral economic principles (or decision biases) influence decision making in technology-mediated settings and for the development of strategies to mitigate their influence.
Michael Oduor, Harri Oinas-Kukkonen

### YouMake: A Low-Cost, Easy for Prototyping, Didactic and Generic Platform for Acquisition and Conditioning of Biomedical Signals

Abstract
The study of the electrical properties of cells began in the 18th century. Since then, many researchers have focused their studies on biomedical signals, paving the way for today’s high-precision medical technology, which is expensive and used by professionals. However, the emergence of new research fields in the biomedical area, such as human activity monitoring and human-machine interfaces, has created the need to measure biomedical signals with simple devices. In addition, the DIY (do-it-yourself) movement has grown, boosted by prototyping platforms such as Arduino and Raspberry Pi. Thus came the idea of developing YouMake, a low-cost, easy-to-prototype, versatile and generic platform for the acquisition and conditioning of biomedical signals. For evaluation purposes, an experimental study using YouMake was conducted with twenty-four participants divided into two groups, the first consisting of participants with experience in the study area and the second of participants with no experience. The usability and prototyping time of the participants were evaluated in the prototyping of the platform for the acquisition of three biological signals: ECG, EMG and EOG. The results were statistically analyzed using the Shapiro-Wilk, Levene and Student’s t tests, which showed that there was no statistically significant difference between the means of the experienced and the non-experienced groups. This showed that both experienced and inexperienced participants found the platform equally easy to use.
Diego Assis Siqueira Gois, João Paulo Andrade Lima, Marco Túlio Chella, Methanias Colaço Rodrigues Júnior

### A Human-Centered Approach for Interactive Data Processing and Analytics

Abstract
In recent years, the amount of data has increased continuously. With newly emerging paradigms, such as the Internet of Things, this trend will intensify even further in the future. Extracting information and, consequently, knowledge from this large amount of data is challenging. To realize this, proven data analytics approaches and techniques have been applied for many years. However, those approaches are often very static, i.e., they cannot be dynamically controlled. Furthermore, their implementation and modification require deep technical knowledge that only technical experts, such as a company’s IT department, can provide. The special needs of business users are often not fully considered. To cope with these issues, we introduce in this article a human-centered approach for interactive data processing and analytics. By doing so, we put the user in control of data analytics through dynamic interaction. This approach is based on requirements derived from typical case scenarios.
Michael Behringer, Pascal Hirmer, Bernhard Mitschang

### Understanding Governance Mechanisms and Health in Software Ecosystems: A Systematic Literature Review

Abstract
In a software ecosystem, organizations work collaboratively to remain profitable and survive market changes. For the relationship between these organizations to succeed, they must participate in the software ecosystem without violating rules of collaboration or taking advantages that destabilize the general health of the ecosystem. The application of governance mechanisms is essential for achieving this balance. Governance mechanisms are employed to define the level of control, rights of decision and scope of owner versus shared ownership in an ecosystem. By selecting appropriate governance mechanisms, organizations can gain a strategic advantage over others, leading them to better performance and, consequently, better health. In this article, we report a systematic literature review that aggregates definitions of software ecosystem governance and classifies governance mechanisms in three dimensions: value creation, coordination of players, and organizational openness and control. Additionally, we propose a research agenda that addresses relevant topics for researchers and practitioners to explore these issues. Initially, we performed a systematic literature review of 63 primary studies. In this extended article, we have included 26 more studies to analyze the relation between health and governance. In total, we reviewed 89 studies. We identified 52 metrics and classified them into the three health elements (productivity, robustness, niche creation). Our results suggest that software ecosystem governance determines decision rights between platform owners and extension developers, control mechanisms used by the platform owner, and platform ownership. We posit that ecosystem health is under the direct influence of how governance mechanisms are implemented by the ecosystem’s players.
Carina Alves, Joyce Oliveira, Slinger Jansen

### Exploring the Ambidextrous Analysis of Business Processes: A Design Science Research

Abstract
Traditionally, business processes are analyzed in a qualitative or quantitative form with the purpose of exploiting, reducing or eliminating existing problems in the processes, such as bottlenecks, financial or resource waste, long cycle times and manual work. Business process analysis is an important phase of the Business Process Management (BPM) lifecycle because it provides a critical examination of problems and potential improvements of business processes. However, few studies have been conducted to provide novel analysis techniques and methods to explore external and future opportunities, in addition to satisfying clients’ expectations, needs and experience. In this context, we used the Design Science Research approach to build the Ambidextrous Analysis of Business Process (A2BP) method, which enables process analysts to balance exploration and exploitation thinking. We defined the problem and the research questions through a systematic literature mapping. Then, we empirically evaluated the proposed method through an expert opinion survey and an observational case study to assess the usefulness and ease of use of the method. Overall, the participants of the empirical study evaluated the method positively and provided feedback to refine it.
Higor Santos, Carina Alves

### Toward an Understanding of the Tradeoffs of Adopting the MEAN Web Server Stack

Abstract
In the past decade, the performance of web services has been enhanced via scale-up and scale-out methods, which increase available system resources, and also by improvements in database performance. As cloud technology continues to rise in popularity, storage and compute services are reaching unprecedented scale, with great scrutiny being turned to the performance tradeoffs of web application server stacks. In particular, the MEAN (MongoDB, Express.js, AngularJS, and Node.js) web server stack is increasingly popular in the computing industry, yet has largely escaped the focus of formal benchmarking efforts. In this work, we compare MEAN to its more entrenched competitor, the LAMP (Linux, Apache, MySQL, PHP) web server stack, the most widely distributed web platform in production. We describe the design, execution, and results of a number of benchmark tests constructed to facilitate direct comparison between Node.js and Apache/PHP, the web server applications of these stacks. We investigate each web server’s ability to handle heavy static file service, remote database interaction, and common compute-bound tasks. Analysis of our results indicates that Node.js outperforms Apache/PHP by a considerable margin in all single-application web service scenarios, and performs as well as Apache/PHP under heterogeneous server workloads. We extend our understanding of the MEAN stack’s performance potential by exploring the performance and memory tradeoffs of Angularizing the Mongo-Express web application, a database administration dashboard for MongoDB. We find that porting Mongo-Express to MEAN’s AngularJS provides up to a 4x improvement in document read bandwidth and up to almost 2.2x improvement in collection read bandwidth, at a cost of roughly double the client memory footprint.
Steve Kitzes, Eric DeMauro, Adam Kaplan

### Recognition of Business Process Elements in Natural Language Texts

Abstract
Process modeling is a complex and important task in any business process management project. Gathering the information needed to build a process model requires effort from analysts in activities such as interviews and document review. However, this documentation is not always well structured and can be difficult to understand. Thus, techniques that allow the structuring and recognition of process elements in the documentation can help in the understanding of the process and, consequently, in the modeling activity. In this context, this paper proposes an approach to recognize business process elements in natural language texts. We defined a set of 32 mapping rules that recognize business process elements in texts using natural language processing techniques; the rules were identified through an empirical study of texts containing descriptions of a process. Furthermore, a prototype was developed, and it showed promising results. The analyses of 70 texts revealed 73.61% precision, 70.15% recall and 71.82% F-measure. Moreover, two surveys showed that 93.33% of participants agree with the mapping rules and that the approach helps analysts reduce both the time spent and the effort made in the process modeling task. This paper is a reiteration and an evolution of the work presented in Ferreira et al. [1].
Renato César Borges Ferreira, Thanner Soares Silva, Diego Toralles Avila, Lucinéia Heloisa Thom, Marcelo Fantinato

### An Interface Prototype Proposal to a Semiautomatic Process Model Verification Method Based on Process Modeling Guidelines

Abstract
The design of comprehensible process models is a very complex task. In order to obtain them, process analysts usually rely on process modeling guidelines. This is especially true when dealing with collections of up to hundreds of process models, since querying or organizing such a collection is not easy. In this paper, we report a method, presented in an earlier work, to verify whether a process model follows process modeling guidelines. In addition, we propose an interface prototype to display which process models do not follow which guidelines. A collection of 31 process models was used to validate the identification method, and the results show that 23 of these process models contain at least one guideline violation.
Valter Helmuth Goldberg Júnior, Vinicius Stein Dani, Diego Toralles Avila, Lucineia Heloisa Thom, José Palazzo Moreira de Oliveira, Marcelo Fantinato

### Backmatter
