
2019 | Book

On the Move to Meaningful Internet Systems: OTM 2019 Conferences

Confederated International Conferences: CoopIS, ODBASE, C&TC 2019, Rhodes, Greece, October 21–25, 2019, Proceedings

Edited by: Hervé Panetto, Christophe Debruyne, Martin Hepp, Dave Lewis, Dr. Claudio Agostino Ardagna, Robert Meersman

Publisher: Springer International Publishing

Book Series: Lecture Notes in Computer Science


About this Book

This volume, LNCS 11877, constitutes the refereed proceedings of the Confederated International Conferences: Cooperative Information Systems, CoopIS 2019, Ontologies, Databases, and Applications of Semantics, ODBASE 2019, and Cloud and Trusted Computing, C&TC, held as part of OTM 2019 in October 2019 in Rhodes, Greece. The 38 full papers presented together with 8 short papers were carefully reviewed and selected from 156 submissions. Every year, the OTM program covers data and Web semantics, distributed objects, Web services, databases, information systems, enterprise workflow and collaboration, ubiquity, interoperability, mobility, and grid and high-performance computing.

Table of Contents

Frontmatter
Correction to: On the Move to Meaningful Internet Systems

The affiliation of Martin Hepp was incorrect. The correct information is given below: Universität der Bundeswehr München, Munich, Germany. The chapter ‘VoIDext: Vocabulary and Patterns for Enhancing Interoperable Datasets with Virtual Links’ was originally published as non-open access; it has now been converted to open access under a CC BY 4.0 license, and the copyright holder has been updated to ‘The Author(s)’.

Hervé Panetto, Christophe Debruyne, Martin Hepp, Dave Lewis, Claudio Agostino Ardagna, Robert Meersman

International Conference on Cooperative Information Systems (CoopIS) 2019

Frontmatter
Towards a Reference Ontology of Trust

Trust is a key component of relationships in social life. It is commonly argued that trust is the “glue” that holds families, societies, organizations and companies together. In the literature, trust is frequently considered a strategic asset for organizations. Having a clear understanding of the notion of trust and its components is paramount to both trust assessment and trust management. Although much progress has been made in clarifying the ontological nature of trust, the term remains overloaded, and there is not yet a shared, prevailing, and conceptually clear notion of trust. In this paper, we address this issue by means of an in-depth ontological analysis of the notion of trust, grounded in the Unified Foundational Ontology. As a result, we propose a concrete artifact, namely the Reference Ontology for Trust, in which we characterize the general concept of trust and distinguish between two types of trust, namely social trust and institution-based trust. We also represent the emergence of risk from trust relations. In addition, we compare our Reference Ontology with other trust ontologies. To validate and demonstrate the contribution of our approach, we apply it to model two application examples.

Glenda Amaral, Tiago Prince Sales, Giancarlo Guizzardi, Daniele Porello
Personalised Exploration Graphs on Semantic Data Lakes

Recently, organisations operating in the context of Smart Cities have been spending time and resources on turning large amounts of data, collected from heterogeneous sources, into actionable insights, using indicators as powerful tools for meaningful data aggregation and exploration. Data lakes, which follow a schema-on-read approach, allow for storing both structured and unstructured data and have been proposed as flexible repositories for enabling data exploration and analysis over heterogeneous data sources, regardless of their structure. However, indicators are usually computed over centralised data storage, according to a less flexible schema-on-write approach. Furthermore, domain experts, who know the data stored within the data lake, are usually distinct from data analysts, who define indicators, and from users, who exploit indicators to explore data in a personalised way. In this paper, we propose a semantics-based approach for enabling personalised data lake exploration through the conceptualisation of proper indicators. In particular, the approach is structured as follows: (i) at the bottom, heterogeneous data sources within a data lake are enriched with Semantic Models, defined by domain experts using domain ontologies, to provide a semantic data lake representation; (ii) in the middle, a Multi-Dimensional Ontology is used by analysts to define indicators and analysis dimensions, in terms of concepts within Semantic Models and formulas to aggregate them; (iii) at the top, Personalised Exploration Graphs are generated for different categories of users, whose profiles are defined in terms of a set of constraints that limit the indicator instances on which the users may rely to explore data. Benefits and limitations of the approach are discussed through an application in the Smart City domain.

Ada Bagozi, Devis Bianchini, Valeria De Antonellis, Massimiliano Garda, Michele Melchiori
Characterizing Conceptual Modeling Research

The field of conceptual modeling continues to evolve and to be applied to important modeling problems in many domains. With the goal of articulating the breadth and depth of the field, our initial work focused on the many implicit and explicit definitions of conceptual modeling, resulting in the Characterizing Conceptual Modeling (CCM) framework. In this paper, we focus on conceptual modeling research, presenting a Characterizing Conceptual Modeling Research (CCMR) framework and a series of evaluations to assess the coverage and usability of CCMR, the utility and independence of the individual fields in the framework, and the likelihood of consistent use.

Lois M. L. Delcambre, Stephen W. Liddle, Oscar Pastor, Veda C. Storey
MapSDI: A Scaled-Up Semantic Data Integration Framework for Knowledge Graph Creation

Semantic Web technologies have contributed effective solutions for the problems of data integration and knowledge graph creation. However, with the rapid growth of big data in diverse domains, various interoperability issues still need to be addressed, scalability being one of the main challenges. In this paper, we address the problem of knowledge graph creation at scale and provide MapSDI, a mapping-rule-based framework for optimizing semantic data integration into knowledge graphs. MapSDI allows for the efficient semantic enrichment of large, heterogeneous, and potentially low-quality data. The input of MapSDI is a set of data sources and mapping rules expressed in a mapping language such as RML. First, MapSDI pre-processes the sources based on semantic information extracted from the mapping rules by performing basic database operators: it projects out required attributes, eliminates duplicates, and selects relevant entries. All these operators are defined based on the knowledge encoded in the mapping rules, which is then used by the semantification engine (or RDFizer) to produce a knowledge graph. We have empirically studied the impact of MapSDI on existing RDFizers and observed that knowledge graph creation time can be reduced by one order of magnitude on average. It is also shown, theoretically, that the source and rule transformations provided by MapSDI are data-lossless.

Samaneh Jozashoori, Maria-Esther Vidal
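
The pre-processing step described in the MapSDI abstract reduces to three relational operators driven by the mapping rules. The sketch below is our own minimal illustration of that idea, not the authors' implementation; the attribute names and the patients.csv file are hypothetical.

```python
# Illustrative sketch (not the authors' code) of MapSDI-style pre-processing:
# given mapping rules that reference only some attributes of a source,
# project those attributes, drop duplicates, and keep relevant rows
# before handing the data to an RDFizer.

import csv

def preprocess_source(rows, referenced_attributes, row_filter=None):
    """Project, deduplicate, and select rows based on mapping-rule knowledge."""
    seen = set()
    for row in rows:
        if row_filter and not row_filter(row):                    # selection
            continue
        projected = tuple(row[a] for a in referenced_attributes)  # projection
        if projected in seen:                                     # duplicate elimination
            continue
        seen.add(projected)
        yield dict(zip(referenced_attributes, projected))

# hypothetical usage: only 'id' and 'name' appear in the mapping rules
with open("patients.csv") as f:
    reduced = list(preprocess_source(csv.DictReader(f), ["id", "name"]))
```
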
A Contextual Approach to Detecting Synonymous and Polluted Activity Labels in Process Event Logs

Process mining, as a well-established research area, uses algorithms for process-oriented data analysis. As with other types of data analysis, quality issues in the input data will lead to unreliable analysis results (garbage in, garbage out). An important input for process mining is an event log, which is a record of events related to a business process as it is performed through the use of an information system. While addressing quality issues in event logs is necessary, it is usually an ad-hoc and tiresome task. In this paper, we propose an automatic approach for detecting two types of data quality issues related to activities, both critical for the success of process mining studies: synonymous labels (same semantics with different syntax) and polluted labels (same semantics and same label structures). We propose the use of activity context, i.e., control flow, resource, time, and data attributes, to detect semantically identical activity labels. We have implemented our approach and validated it using real-life logs from two hospitals and an insurance company, achieving promising results in detecting frequent imperfect activity labels.

Sareh Sadeghianasl, Arthur H. M. ter Hofstede, Moe T. Wynn, Suriadi Suriadi
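
To make the notion of activity context concrete, here is a minimal sketch (our illustration, not the paper's code) that compares two labels by the multiset of activities directly following them; labels with near-identical contexts are candidates for synonymous or polluted variants. The toy log and labels are hypothetical.

```python
# Context-based label comparison: two labels whose surrounding behaviour
# (here: the multiset of directly-following activities) is similar are
# candidates for being synonymous or polluted variants of one activity.

from collections import Counter

def context_profile(log, label):
    """Multiset of activities that directly follow `label` in each trace."""
    profile = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            if a == label:
                profile[b] += 1
    return profile

def context_similarity(log, label1, label2):
    p, q = context_profile(log, label1), context_profile(log, label2)
    shared = sum((p & q).values())   # multiset intersection
    total = sum((p | q).values())    # multiset union
    return shared / total if total else 0.0

log = [["register", "check blood", "discharge"],
       ["register", "blood check", "discharge"]]
print(context_similarity(log, "check blood", "blood check"))  # 1.0: likely synonyms
```
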
Automated Robotic Process Automation: A Self-Learning Approach

Robotic Process Automation (RPA) has recently gained a lot of attention, in both industry and academia. RPA embodies a collection of tools and techniques that allow business owners to automate repetitive manual tasks. The intrinsic value of RPA is beyond dispute; e.g., automation reduces errors and costs and thus allows us to increase the overall business process performance. However, the adoption of current-generation RPA tools requires manual effort w.r.t. the identification, elicitation and programming of the to-be-automated tasks. At the same time, several techniques exist that allow us to track the exact behavior of users in the front-end, in great detail. Therefore, in this paper, we present a novel end-to-end approach that allows for completely automated, algorithmic RPA-rule deduction on the basis of captured user behavior. Furthermore, our proposed approach is accompanied by a publicly available proof-of-concept implementation.

Junxiong Gao, Sebastiaan J. van Zelst, Xixi Lu, Wil M. P. van der Aalst
iDropout: Leveraging Deep Taylor Decomposition for the Robustness of Deep Neural Networks

In this work, we present iDropout, a new method that adjusts dropout from purely random dropping of inputs to dropping inputs based on a mix of node relevance and randomness. We use Deep Taylor Decomposition to calculate the relevance of the inputs and, based on this, give input nodes with a higher relevance a higher probability of being included than input nodes that appear to have less of an impact. The proposed method not only seems to increase the performance of a neural network, but also seems to make the network more robust to missing data. We evaluated the approach on artificial data with various settings, e.g., noise in the data and the number of informative features, and on real-world datasets from the UCI Machine Learning Repository.

Christian Schreckenberger, Christian Bartelt, Heiner Stuckenschmidt
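
As a rough illustration of the relevance-weighted dropout idea (our reading of the abstract, not the authors' code), the sketch below mixes a per-input relevance score, such as one obtained from Deep Taylor Decomposition, with uniform randomness to obtain per-input keep probabilities; the mix parameter and keep_rate are assumptions.

```python
# Hedged sketch of the iDropout idea: instead of dropping inputs uniformly
# at random, interpolate between uniform dropout and relevance-proportional
# keeping, so more relevant inputs are kept with higher probability.

import numpy as np

def idropout_mask(relevance, keep_rate=0.8, mix=0.5, rng=None):
    """Sample a keep-mask whose per-input keep probability interpolates
    between uniform dropout and relevance-weighted keeping."""
    rng = rng or np.random.default_rng()
    rel = relevance / relevance.sum()                  # normalise relevances
    n = len(relevance)
    p_uniform = np.full(n, keep_rate)                  # classic dropout
    p_relevance = np.clip(rel * n * keep_rate, 0, 1)   # relevance-weighted
    p_keep = mix * p_relevance + (1 - mix) * p_uniform
    return (rng.random(n) < p_keep).astype(np.float32)

mask = idropout_mask(np.array([0.1, 0.5, 0.05, 0.35]))
```
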
A Case Study Lens on Process Mining in Practice

Process mining has a history of over two decades of published research, and case studies started to appear a little over a decade ago. In this paper, we review published process mining case studies to assess the maturity of the field from a practice point of view by considering (i) the diffusion of tools and techniques into practice, and (ii) the thoroughness of the application of process mining methodologies. Diffusion is assessed by analysing the breadth of domains to which process mining has been applied and the variety of tools and techniques employed. We define measures of thoroughness for each of the phases of a generalised process mining methodology and examine case studies identified from a literature search against these measures. We conclude that, despite maturing in terms of diffusion, the application of process mining in practice has not seen an increased maturity over time in terms of thoroughness. One way to redress this situation is to pay more attention to the development of, and adherence to, methodological guidance.

Fahame Emamjome, Robert Andrews, Arthur H. M. ter Hofstede
Initializing k-Means Efficiently: Benefits for Exploratory Cluster Analysis

Data analysis is a highly exploratory task, where various algorithms with different parameters are executed until a solid result is achieved. This is especially evident for cluster analyses, where the number of clusters must be provided prior to the execution of the clustering algorithm. Since this number is rarely known in advance, the algorithm is typically executed several times with varying parameters. Hence, the duration of the exploratory analysis depends heavily on the runtime of each execution of the clustering algorithm. While previous work shows that the initialization of clustering algorithms is crucial for fast and solid results, it focuses solely on a single execution of the clustering algorithm and thereby neglects previous executions. We propose Delta Initialization, an initialization strategy for k-Means in such an exploratory setting. The core idea of this new algorithm is to exploit the clustering results of previous executions in order to enhance the initialization of subsequent executions. We show that this algorithm is well suited for exploratory cluster analysis, as considerable speedups can be achieved while additionally producing superior clustering results compared to state-of-the-art initialization strategies.

Manuel Fritz, Holger Schwarz
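
A minimal sketch of how exploiting earlier executions can work in practice, assuming the strategy is to reuse the previous run's centroids and seed one additional centroid far from them; this is our illustration of the general idea, not the authors' Delta Initialization algorithm. It assumes scikit-learn is available.

```python
# Exploratory series of k-Means runs for k = 2, 3, ...: reuse the previous
# run's centroids and add one new centroid at the point farthest from them,
# instead of initializing each run from scratch.

import numpy as np
from sklearn.cluster import KMeans

def delta_init_series(X, k_max, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), 1)]               # start with one seed
    results = {}
    for k in range(2, k_max + 1):
        # distance of each point to its closest existing centroid
        d = np.min(np.linalg.norm(X[:, None] - centroids[None], axis=2), axis=1)
        new = X[np.argmax(d)][None]                    # farthest point as new seed
        init = np.vstack([centroids, new])
        km = KMeans(n_clusters=k, init=init, n_init=1).fit(X)
        centroids = km.cluster_centers_                # carry over to next k
        results[k] = km
    return results

models = delta_init_series(np.random.rand(200, 2), k_max=6)
```
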
Exploiting EuroVoc’s Hierarchical Structure for Classifying Legal Documents

Multi-label document classification is a challenging problem because of the potentially huge number of classes. Furthermore, real-world datasets often exhibit a strongly varying number of labels per document, and a power-law distribution of those class labels. Multi-label classification of legal documents is additionally complicated by long document texts and domain-specific use of language. In this paper we use different approaches to compare the performance of text classification algorithms on existing datasets and corpora of legal documents, and contrast the results of our experiments with results on general-purpose multi-label text classification datasets. Moreover, for the EUR-Lex legal datasets, we show that exploiting the hierarchy of the EuroVoc thesaurus helps to improve classification performance by reducing the number of potential classes while retaining the informative value of the classification itself.

Erwin Filtz, Sabrina Kirrane, Axel Polleres, Gerhard Wohlgenannt
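
One simple way to exploit a thesaurus hierarchy for class reduction, sketched below under our own assumptions (not necessarily the paper's method), is to replace each fine-grained descriptor with its ancestor at a fixed depth before training; the toy thesaurus fragment is hypothetical.

```python
# Reduce the number of classes by mapping each EuroVoc-style descriptor
# to its ancestor at a chosen depth below the root of the hierarchy.

def ancestor_at_depth(label, parent_of, depth):
    """Walk up the hierarchy and return the ancestor `depth` steps below the root."""
    chain = [label]
    while chain[-1] in parent_of:
        chain.append(parent_of[chain[-1]])
    # chain[-1] is the root; pick the node `depth` steps below it
    return chain[max(0, len(chain) - 1 - depth)]

# hypothetical toy thesaurus fragment
parent_of = {"olive oil": "vegetable oils", "vegetable oils": "foodstuff"}
print(ancestor_at_depth("olive oil", parent_of, depth=1))  # vegetable oils
```
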
Fairness-Aware Process Mining

Process mining is a multi-purpose tool enabling organizations to improve their processes. One of the primary purposes of process mining is finding the root causes of performance or compliance problems in processes. The usual way of doing so is to gather data from the process event log and other sources and then apply data mining and machine learning techniques. However, the results of applying such techniques are not always acceptable: in many situations, this approach is prone to making obvious or unfair diagnoses, and applying them may result in conclusions that are unsurprising or even discriminating. In this paper, we present a solution to this problem by creating a fair classifier for such situations. The undesired effects are removed at the expense of a reduction in the accuracy of the resulting classifier.

Mahnaz Sadat Qafari, Wil van der Aalst
Model-Aware Clustering of Non-conforming Traces

Process deviations are often difficult to deal with, yet they reveal concepts in need of improvement. However, not all deviations are equally interesting, and focusing on each of them alike is rarely the most efficient approach. Our novel approach identifies groups of action sequences, called traces, with similar deviation characteristics. In contrast to trace clustering, we utilize a process model as a baseline (hence the model awareness) to define conformance, and we establish a density-based anomaly aggregation that treats the process model as a map-like ground distance. Dense clusters of non-conforming traces are collected as micro-processes, which might reveal themselves as significant process variants from another perspective. Handling groups of deviating objects, with either countermeasures or process augmentations, is far more effective than dealing with each object individually.

Florian Richter, Ludwig Zellner, Janina Sontheim, Thomas Seidl
Bank Branch Network Optimization Based on Customers Geospatial Profiles

In this study, the bank branch network optimization problem is considered. The problem consists of choosing several branches for closure based on the overall expected level of dissatisfaction of bank customers with the location of the remaining branches. This problem is treated as a black-box optimization problem. We propose a new algorithm for determining the dissatisfaction of customers based on their geospatial profiles. Namely, the following geospatial metrics are used for this purpose: Loyalty and Diversity. We also propose a method for comparing algorithms aimed at solving this problem, in which data on genuinely dissatisfied customers is employed. Using this method, the proposed algorithm was compared with a competitor on a data set from one of the largest regional banks in Russia. It turned out that the new algorithm usually achieves better accuracy.

Oleg Zaikin, Anton Petukhov, Klavdiya Bochenina
Using Maps for Interlinking Geospatial Linked Data

The creation of interlinks between Linked Data datasets is key to the creation of a global database. One can create such interlinks in various ways: manually, semi-automatically, and automatically. While quite a few tools exist to facilitate this process in a (semi-)automatic manner, often with support for geospatial data, it is not uncommon that interlinks need to be created manually, e.g., when interlinks need to be authoritative. In this study, we focus on the manual interlinking of geospatial data using maps. The state of the art uses maps to facilitate the search and visualization of such data. Our contribution is to investigate whether maps are also useful for the creation of interlinks. We designed and developed such a tool and set up an experiment in which 16 participants used the tool to create links between different Linked Data datasets. We not only describe the tool but also analyze the data we gathered. The data suggests that the creation of these interlinks from maps is a viable approach. The data also indicates that people had a harder time dealing with Linked Data principles (e.g., content negotiation) than with the creation of interlinks.

Dieter Roosens, Kris McGlinn, Christophe Debruyne
A Conceptual Modelling Approach to Visualising Linked Data

Increasing numbers of Linked Open Datasets are being published, and many possible data visualisations may be appropriate for a user’s given exploration or analysis task over a dataset. Users may therefore find it difficult to identify visualisations that meet their data exploration or analysis needs. We propose an approach that creates conceptual models of groups of commonly used data visualisations, which can be used to analyse the data and users’ queries so as to automatically generate recommendations of possible visualisations. To our knowledge, this is the first work to propose a conceptual modelling approach to recommending visualisations for Linked Data.

Peter McBrien, Alexandra Poulovassilis
S-RDF: A New RDF Serialization Format for Better Storage Without Losing Human Readability

Nowadays, RDF data is becoming more and more popular on the Web due to the advances of the Semantic Web and the Linked Open Data initiatives. Several works focus on transforming relational databases to RDF by storing related data in the N-Triples serialization format. However, these approaches do not take into account the existing normalization of their databases, since the N-Triples format allows data redundancy and does not enforce any normalization by itself. Moreover, the most widely used and recommended serialization formats, such as RDF/XML, Turtle, and HDT, either offer high human readability but waste storage capacity, or focus on storage efficiency while providing low human readability. To overcome these limitations, we propose a new serialization format, called S-RDF. By considering the structure (graph) and the values of the RDF data separately, S-RDF reduces the duplication of values by using unique identifiers. Results show an important improvement over existing serialization formats in terms of storage (up to 71.66% w.r.t. N-Triples) and human readability.

Irvin Dongo, Richard Chbeir
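
The core storage idea stated in the abstract, separating graph structure from values via unique identifiers, can be illustrated with a short dictionary-encoding sketch; the actual S-RDF format is the authors' design and is not reproduced here.

```python
# Separating structure from values: each distinct term is stored once in a
# dictionary, and the graph structure becomes compact integer ID triples.

def encode_triples(triples):
    """Replace repeated terms by unique integer identifiers."""
    dictionary, ids = [], {}
    def ident(term):
        if term not in ids:
            ids[term] = len(dictionary)
            dictionary.append(term)
        return ids[term]
    structure = [(ident(s), ident(p), ident(o)) for s, p, o in triples]
    return dictionary, structure

triples = [("ex:alice", "foaf:knows", "ex:bob"),
           ("ex:bob", "foaf:knows", "ex:alice")]
dictionary, structure = encode_triples(triples)
# dictionary holds each distinct term once; structure is small integer triples
```
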
A Linked Open Data Approach for Web Service Evolution

Web services are subject to changes during their lifetime, such as updates to data types, operations, and the overall functionality. Such changes may impact the way Web services are discovered, consumed, and recommended. We propose a Linked Open Data (LOD) approach for managing the deployment of new Web services and the updating of existing ones. We propose algorithms, based on semantic LOD similarity measures, to infer composition and substitution relationships for both newly deployed and updated services. We introduce a technique that gathers Web service interactions and user feedback to continuously update service relationships. To improve the accuracy of relationship recommendation, we propose an algorithm that learns new LOD relationships from past Web service interactions. We conduct extensive experiments on real-world Web services to evaluate our approach.

Hamza Labbaci, Nasredine Cheniki, Yacine Sam, Nizar Messai, Brahim Medjahed, Youcef Aklouf
Security Risk Management in Cooperative Intelligent Transportation Systems: A Systematic Literature Review

The automotive industry is maximizing cooperative interactions between vehicular sensors and infrastructure components to make intelligent decisions in its applications (e.g., traffic management, navigation, or autonomous driving services). This cooperative behaviour also extends to security. More connected and cooperative components of vehicular intelligent transportation systems (ITS) result in an increased potential for malicious attacks that can negatively impact security and safety. Security risks in one architecture layer affect the other layers of ITS; thus, cooperation is essential for the secure operation of these systems. This paper presents results from a comprehensive literature review on the state of the art of security risk management in vehicular ITS, evaluating its assets, threats/risks, and countermeasures. We examine these security elements along the dimensions of the perception, network, and application architecture layers of ITS. The study reveals gaps in ITS security risk management research within these architecture layers and provides suggestions for future research.

Abasi-Amefon O. Affia, Raimundas Matulevičius, Alexander Nolte
Data Sharing in Presence of Access Control Policies

In the context of data analysis and data integration, information from different and autonomous sources is very often shared. Sources use their own schemas and their own access control policies. We consider the case where data sources decide to share information by specifying entity matching rules between their contents. A query to a given data source is rewritten to produce queries to the other data sources that share information with it. This entity-matching-oriented and policy-oriented rewriting preserves local data source policies. In this paper, we describe our methodology for data sharing between sources while ensuring the satisfaction of local access control policies.

Juba Agoun, Mohand-Saïd Hacid
Library Usage Detection in Ethereum Smart Contracts

We analyze the usage of the SafeMath library on the blockchain with a data set of 6.9 million bytecodes of Ethereum smart contracts. This library provides safe arithmetic operations for contracts. In order to detect smart contracts that make use of the library from the bytecode alone, we perform the following five steps: download all available library versions, write test contracts that use the library, compile all test contracts with all compatible compiler versions, extract the internal library functions from the compiled bytecode, and search for contracts on the blockchain that use these library function bytecodes. In total, we detect usage of the SafeMath library in 1.34% of all smart contracts on the blockchain and in 27.52% of all distinct contract codes. To evaluate our approach, we use more than 50,000 verified contracts from Etherscan for which both the Solidity source code and the bytecode are available. Our algorithm correctly detects library usage for 86.34% of the smart contracts in the evaluation set.

Alexander Hefele, Ulrich Gallersdörfer, Florian Matthes
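
The final matching step of the five-step pipeline can be pictured as a substring search of known library-function bytecodes inside contract bytecode. The sketch below is our simplification, not the authors' tooling; the bytecode entries are placeholders that would come from the compiled test contracts.

```python
# Flag contracts whose runtime bytecode contains any known SafeMath internal
# function bytecode. The hex strings below are placeholders (not real
# SafeMath bytecode); they would be filled from step 4 of the pipeline.

KNOWN_FUNCTION_BYTECODES = {
    "add-v0.5.0": "0x...",   # placeholder: extracted from compiled test contracts
    "mul-v0.5.0": "0x...",
}

def uses_safemath(contract_bytecode: str) -> bool:
    code = contract_bytecode.lower().removeprefix("0x")
    return any(sig.lower().removeprefix("0x") in code
               for sig in KNOWN_FUNCTION_BYTECODES.values())
```
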
JusticeChain: Using Blockchain to Protect Justice Logs

The auditability of information systems plays an essential role in public administration. Information system accesses are saved in log files so that auditors can later inspect them. However, there are often distinct stakeholders with different roles and different levels of trust, namely the IT department that manages the system and the government ministries that access the logs for auditing. This scenario occurs in the Portuguese judicial system, where stakeholders utilize an information system managed by third parties. This paper proposes using blockchain technology to make the storage of access logs more resilient while supporting such a multi-stakeholder scenario, in which different entities have different access rights to data. This proposal is implemented in the Portuguese judicial system through JusticeChain. JusticeChain comprises blockchain components and a blockchain client component. The blockchain components ensure log integrity and redundancy, while the blockchain client component is responsible for saving logs on behalf of an information system. The client allows end-users to access the blockchain, enabling audits mediated by the blockchain.

Rafael Belchior, Miguel Correia, André Vasconcelos
Triage of IoT Attacks Through Process Mining

The impressive growth of the IoT witnessed in recent years has come together with a surge in cyber attacks targeting it. Factories adhering to digital transformation programs are quickly adopting the IoT paradigm and are thus increasingly exposed to a large number of cyber threats that need to be detected, analyzed and appropriately mitigated. In this scenario, a common approach used in large organizations is to set up an attack triage system, with which security operators can cherry-pick new attack patterns requiring further in-depth investigation from a mass of known attacks that can be managed automatically. In this paper, we propose an attack triage system that helps operators quickly identify attacks with unknown behaviors and later analyze them in detail. The novelty introduced by our solution is the usage of process mining techniques to model known attacks and identify new variants. We demonstrate the feasibility of our approach through an evaluation based on three well-known IoT botnets, BASHLITE, LIGHTAIDRA and MIRAI, and on real current attack patterns collected through an IoT honeypot.

Simone Coltellese, Fabrizio Maria Maggi, Andrea Marrella, Luca Massarelli, Leonardo Querzoni
An Automatic Emotion Recognition System for Annotating Spotify’s Songs

The recognition of emotions for annotating large music datasets is still an open challenge. The problem is that most solutions require the audio of the songs and user/expert intervention during certain phases of the recognition process. In this paper, we propose an automatic solution that overcomes these drawbacks. It consists of a heterogeneous set of machine learning models developed from Spotify’s Web data services and miner tools. In order to improve the accuracy of the resulting annotations, each model is specialized in recognizing a class of emotions. These models have been validated using the AcousticBrainz database and have been exported for integration into a music emotion recognition system, which has been used to emotionally annotate the Spotify music database, composed of more than 30 million songs.

J. García de Quirós, S. Baldassarri, J. R. Beltrán, A. Guiu, P. Álvarez
An Approach for the Automated Generation of Engaging Dashboards

Organizations use Key Performance Indicators (KPIs) to monitor whether they attain their goals. To support organizations in tracking the performance of their business, software vendors offer dashboards to these organizations. For the development of dashboards that engage organizations and enable them to make informed decisions, software vendors leverage dashboard design principles. However, the dashboard design principles available in the literature are expressed as natural language texts. Therefore, software vendors and organizations either do not use them or spend significant effort to internalize and apply them literally in every engaging dashboard development process. We show that engaging dashboards for organizations can be generated automatically by means of automatically visualized KPIs. In this context, we present our novel approach for the automated generation of engaging dashboards for organizations. The approach employs a decision model for visualizing KPIs that is developed based on the dashboard design principles in the literature. We implemented our approach and evaluated its quality in a case study.

Ünal Aksu, Adela del-Río-Ortega, Manuel Resinas, Hajo A. Reijers
Process-Based Quality Management in Care: Adding a Quality Perspective to Pathway Modelling

Care pathways (CPs) are used as a tool to organize complex care processes and to foster quality management in general. However, their quality management potential has not been sufficiently exploited yet, since the development, documentation, and controlling of quality indicators (QIs) for quality management purposes are not fully integrated into the process standards defined by CPs. To support the integration of a quality perspective in CPs, this paper addresses the questions of which quality concepts can be integrated into process documentation, and how, in order to support managers, health service providers, and patients. To this end, we extended the widely accepted modelling language “Business Process Model and Notation” (BPMN) with a quality perspective. The conceptualization is grounded in a systematic literature review on (quality) indicator modelling. Together with previous work on the conceptualization of QIs in health care, it provided the basis for a comprehensive domain requirements analysis. Following a design-oriented research approach, the requirements were evaluated and used to design a BPMN extension by implementing the quality indicator enhancements as a BPMN meta-model extension. All design decisions were evaluated in a feedback workshop with a domain expert experienced in quality management and the certification of cancer centres at the national and international levels. The approach is demonstrated with an example from stroke care. The proposed language extension provides a tool for the governance of care processes based on QIs and for the implementation of a more real-time, pathway-based quality management in health care.

Peggy Richter, Hannes Schlieter
Preference-Based Resource and Task Allocation in Business Process Automation

Preference plays an important role in organisational and human decision making, as it may be a manifestation of proven practices or of individual working styles. The significance of the notion of preference has been recognised in a number of different disciplines. Unfortunately, its potential does not seem to have been fully unlocked in the field of Business Process Automation (BPA), even though resource and task allocations play a pivotal role in process performance and these allocations could be guided by explicit formulations of preferences. In this paper, we examine the state of the art with respect to preference in the field of BPA and use this as the basis for a conceptual model capturing recognised manifestations of preference in the literature. We investigate how preferences may exhibit themselves in process automation through the notion of well-established (workflow) resource patterns. We then show that manifestations of preference occur in real-life process event logs and how these can be extracted through the application of machine learning techniques. The findings from this research contribute towards establishing a rich understanding of preferences in the context of business processes, ways of specifying and deriving these preferences, and their more explicit incorporation in work allocation mechanisms, which can lead to a step change in realising better process performance and more effective work collaboration in today’s organisations.

Reihaneh Bidar, Arthur ter Hofstede, Renuka Sindhgatta, Chun Ouyang
Scenario-Based Prediction of Business Processes Using System Dynamics

Many organizations employ an information system that supports the execution of their business processes. During the execution of these processes, event data are stored in the databases that support the information system. The field of process mining aims to transform such data into actionable insights, which allow business owners to improve their daily operations. For example, a process model describing the actual execution of the process can easily be extracted from the captured event data. Most process mining techniques are “backward-looking”, providing compliance and performance information. Few process mining techniques are “forward-looking”. Therefore, in this paper, we propose a novel scenario-based predictive approach that allows us to assess and predict future behavior in business processes. In particular, we propose to use system dynamics to allow for “what-if” questions. We create a system dynamics model using variables trained on the basis of the past behavior of the process, as captured in the event log. This model is used to explore the effect of changes possibly applied to the process, as well as the roles of external factors, e.g., human behavior. Using real event data, we demonstrate the feasibility of our approach to predict possible consequences of future decisions and policies.

Mahsa Pourbafrani, Sebastiaan J. van Zelst, Wil M. P. van der Aalst
A Three-Layered Approach for Designing Smart Contracts in Collaborative Processes

In collaborative environments, where enterprises interact with each other without a centralised authority that ensures trust among them, the ability to provide cross-organisational services must also be enabled between mutually untrusting participants. Blockchain platforms and smart contracts have been proposed to implement collaborative processes. However, current solutions are platform-dependent and deploy the whole process on-chain, thus increasing the execution costs of smart contracts if deployed on a permissionless blockchain. In this paper, we propose an approach that includes criteria to identify trust-demanding objects and activities in collaborative processes, a model to describe smart contracts in a platform-independent way, and guidelines to deploy them in a blockchain. To this aim, a three-layered model is used to describe: (i) the collaborative process, represented in BPMN, where the business expert is supported in adding annotations that identify trust-demanding objects and activities; (ii) Abstract Smart Contracts, based on trust-demanding objects and activities only and specified by means of descriptors that are independent of any blockchain technology; (iii) Concrete Smart Contracts, which implement abstract ones and are deployed on a specific blockchain, enabling the creation of a repository where a single descriptor is associated with multiple implementations. The flexibility and reduced costs of the approach, due to the smart contract abstraction and the use of the blockchain only when necessary, are discussed with a case study on remote monitoring services in the digital factory.

Ada Bagozi, Devis Bianchini, Valeria De Antonellis, Massimiliano Garda, Michele Melchiori
Towards Green Value Network Modeling: A Case from the Agribusiness Sector in Brazil

The main purpose of a value network model is to explore the sustainability of business strategies. However, much attention has been paid to the economic issues of value modeling, leaving critical environmental and social issues uncovered. Within the environmental scope, this study proposes an ontology for modeling value networks that match Green Computing requirements. The ontology supports the semi-automatic configuration of value network models to help business analysts decide upon alternative value paths to satisfy market segments demanding products or services bundled with green accreditations or certifications. The ontology was built according to the guidelines of Design Science in combination with specific methodologies for Ontology Engineering. For the ontology evaluation process, the acceptance, utility and usability of the ontology were evaluated by means of Technical Action Research (TAR) applied to a real-world case from the Brazilian agribusiness sector. Business expert opinion pointed to the viability of the models produced, from both the economic and environmental perspectives.

Juscimara Gomes Avelino, Patrício de Alencar Silva, Faiza Allah Bukhsh
Business Object Centric Microservices Patterns

A key impediment to maturing microservice architecture conceptions is uncertainty about what it means to design fine-grained functionality for microservices. Under a traditional service-oriented architecture (SOA), the unit of functionality for software components concerns individual business domain objects and encapsulated operations, enabling desirable architectural properties such as high cohesion and loose coupling of its components. However, at present it is not clear how this SOA design strategy should be refined for microservices nor, more generally, how design considerations for different degrees of granularity apply, in a consistent and systematic way, when moving from large SOA systems to smaller microservices. This paper proposes microservice patterns, as a contribution to the maturity of microservice architectures, through the refinement of the functional structure of SOAs. The patterns are derived by considering the splitting of business object (BO) operations and the salient types of BO relationships that influence software structure (as captured in UML): object association, exclusive containment, inclusive containment and specialisation (i.e., subtyping). The viability of the patterns for evolving large SOA systems into microservices is demonstrated through automated microservice discovery algorithms applied to two open-source enterprise systems widely used in practice, Dolibarr and SugarCRM.

Adambarage Anuruddha Chathuranga De Alwis, Alistair Barros, Colin Fidge, Artem Polyvyanyy
Availability and Scalability Optimized Microservice Discovery from Enterprise Systems

Microservices have been introduced to industry as a novel architectural design for software development in cloud-based applications. This development has increased interest in finding new methodologies to migrate existing enterprise systems into microservices to achieve desirable performance characteristics such as high scalability, high availability, high cohesion and low coupling. A key challenge in this context is discovering microserviceable components with promising characteristics from a complex monolithic code base while predicting their resulting characteristics. This paper presents a technique to support such re-engineering of an enterprise system based on the fundamental mechanisms for structuring its architecture, i.e., business objects managed by software functions and their interactions. The technique relies on queuing theory and business object relationship analysis. A prototype for microservice discovery and characteristic analysis was developed using the NSGA-II software clustering and optimization technique and has been validated against two open-source enterprise systems, SugarCRM and ChurchCRM. Our experiments demonstrate that the proposed approach can recommend microservice designs that improve the scalability, availability and execution efficiency of the system while achieving high cohesion and low coupling in software modules.

Adambarage Anuruddha Chathuranga De Alwis, Alistair Barros, Colin Fidge, Artem Polyvyanyy
Discovering Crossing-Workflow Fragments Based on Activity Knowledge Graph

This paper proposes a novel crossing-workflow fragment discovery mechanism, in which an activity knowledge graph (AKG) is constructed to capture partial-ordering relations between activities in scientific workflows, as well as parent-child relations specified between sub-workflows and their corresponding activities. The biterm topic model is adopted to generate topics and quantify the semantic relevance of activities and sub-workflows. Given a requirement specified in terms of a workflow template, individual candidate activities or sub-workflows are discovered by leveraging their semantic relevance and their textual descriptions in short documents. Candidate fragments are generated by exploring the relations in the AKG specified upon candidate activities or sub-workflows, and these fragments are evaluated by balancing their structural and semantic similarities. Evaluation results demonstrate that this technique is accurate and efficient in discovering and recommending appropriate crossing-workflow fragments in comparison with state-of-the-art techniques.

Jinfeng Wen, Zhangbing Zhou, Yasha Wang, Walid Gaaloul, Yucong Duan
History-Aware Dynamic Process Fragmentation for Risk-Aware Resource Allocation

Most Process-Aware Information Systems (PAIS) and resource allocation approaches select the resource to be allocated to a certain process activity at run time, when the activity must be executed. This results in cumulative (activity-per-activity) locally optimal allocations, for which assumptions (e.g., on loop repetitions) are not needed beforehand, but which altogether might incur an increase in cycle time and/or cost. Global optimal allocation approaches take all the process-, organization- and time-related constraints into account at once before process execution, handling the optimization objectives better. However, a number of assumptions must be made upfront about the decisions made at run time; when an assumption does not hold at run time, a resource reallocation must be triggered. Aiming to achieve a compromise between the pros and cons of these two methods, in this paper we introduce a novel approach that fragments the process dynamically for the purpose of risk-aware resource allocation. Given historical execution data and a process fragmentation threshold, our method enhances the feasibility of the resource allocations by dynamically generating the process fragments (i.e., execution horizons) that satisfy the given probabilistic threshold. Our evaluation with simulations demonstrates the advantages in terms of reduced reallocation effort.

Giray Havur, Cristina Cabanillas
Modeling Conversational Agents for Service Systems

Service providers are increasingly exploring the use of conversational agents (CAs) or dialogue-based systems to support end customers, as a CA promises a natural method for users to interact and a convenient channel for customer service. Commercial CAs excel in addressing specific tasks or functions, such as searching for restaurants, providing location directions, or scheduling meetings, with small variations in the user request. Designing a CA for a more complex service system requires sufficient knowledge of its services, such as the service capabilities, their constraints, and their effects, in addition to understanding user utterances. The design of a CA is typically an independent activity, and its linkages to the service system it supports are left to the designers. In this paper, we study existing work on text-based CAs and identify the conceptual elements of a CA. Further, a linkage between the model elements of a CA and the service model of the service system it supports is established and presented. We show that interesting insights can be derived from these linkages that can be useful to CA designers.

Renuka Sindhgatta, Alistair Barros, Alireza Nili
Interactive Modification During the Merge of Graph-Based Business Process Models

Companies are constantly changing their business process models. In team environments, different versions of a process model are created at the same time. These versions need to be merged from time to time to consolidate changes and create a new common version. In this short paper, we propose a solution for modifying a merge result. The goal is to create a meaningful merge result by adding connector nodes to the model at specific locations. This increases the number of possible result models and reduces additional implementation effort.

Jürgen Krauß, Martin Schmollinger

International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE) 2019

Frontmatter
Complex Query Augmentation for Question Answering over Knowledge Graphs

Question answering systems often have a pipeline architecture that consists of multiple components. A key component in the pipeline is the query generator, which aims to generate a formal query that corresponds to the input natural language question. Even if the linked entities and relations to an underlying knowledge graph are given, finding the corresponding query that captures the true intention of the input question remains a challenging task, due to the complexity of sentence structure or the features that need to be extracted. In this work, we focus on the query generation component and introduce techniques to support a wider range of questions that are currently underrepresented in the question answering community.

Abdelrahman Abdelkawi, Hamid Zafar, Maria Maleshkova, Jens Lehmann
A Subjective Logic Based Approach to Handling Inconsistencies in Ontology Merging

Ontologies reflect their creators’ view of the domain at hand and are thus subjective. For specific applications, it may be necessary to combine several of these ontologies into a more comprehensive domain model by merging them. However, due to the subjective nature of the source ontologies, this can result in inconsistencies. Handling these inconsistencies is a challenging task even for modestly sized ontologies. Therefore, in this paper, we propose a Subjective Logic based approach to cope with inconsistencies originating in the ontology merging process. We formulate subjective opinions about the inconsistency-causing axioms based on several pieces of evidence, such as provenance information and structural relevance, by utilizing consensus and conditional deduction operators. This creates an environment that supports the handling of these inconsistencies. It provides the necessary mechanisms to capture the subjective opinion of the different communities represented by the input ontologies on the trustworthiness of each axiom in the merged ontology, identifies the least trustworthy axioms, and suggests remedies for the inconsistencies to the user, e.g., deleting or rewriting axioms. Our experimental results show that with this approach it is possible to overcome the inconsistency problem in ontology merging and that the approach is feasible and effective.

Samira Babalou, Birgitta König-Ries
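
For readers unfamiliar with the operators mentioned above, the consensus (cumulative fusion) operator for binomial opinions has a compact closed form; the sketch below shows the standard textbook formulation, not the paper's specific evidence model.

```python
# Subjective-logic consensus (cumulative fusion) of two binomial opinions,
# each given as (belief b, disbelief d, uncertainty u) with b + d + u = 1.

def consensus(op_a, op_b):
    b1, d1, u1 = op_a
    b2, d2, u2 = op_b
    k = u1 + u2 - u1 * u2
    if k == 0:  # both opinions dogmatic; averaging is one common limit-case choice
        return ((b1 + b2) / 2, (d1 + d2) / 2, 0.0)
    return ((b1 * u2 + b2 * u1) / k,
            (d1 * u2 + d2 * u1) / k,
            (u1 * u2) / k)

# two communities' opinions on the trustworthiness of one axiom
print(consensus((0.7, 0.1, 0.2), (0.2, 0.5, 0.3)))
```
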

Open Access

VoIDext: Vocabulary and Patterns for Enhancing Interoperable Datasets with Virtual Links

Semantic heterogeneity remains a problem when interoperating with data from sources of different scopes and knowledge domains. Causes of this challenge are context-specific requirements (i.e., no “one model fits all”), different data modelling decisions, domain-specific purposes, and technical constraints. Moreover, even if the problem of semantic heterogeneity among different RDF publishers and knowledge domains is solved, querying and accessing the data of distributed RDF datasets on the Web is not straightforward. This is because of the complex and fastidious process needed to understand how these datasets can be related or linked, and consequently, queried. To address this issue, we propose to extend the existing Vocabulary of Interlinked Datasets (VoID) by introducing new terms, such as the Virtual Link Set concept, and data model patterns. A virtual link is a connection between resources such as literals and IRIs (Internationalized Resource Identifiers) with some commonality, where each of these resources is from a different RDF dataset. These links are required in order to understand how to semantically relate datasets. In addition, we describe several benefits of using virtual links to improve interoperability between heterogeneous and independent datasets. Finally, we exemplify and apply our approach to several widely used RDF datasets.

Tarcisio Mendes de Farias, Kurt Stockinger, Christophe Dessimoz
NotaryPedia: A Knowledge Graph of Historical Notarial Manuscripts

The Notarial Archives in Valletta, the capital city of Malta, houses a rich and valuable collection of around twenty thousand notarial manuscripts dating back to the 15th century. The Archive wants to make the contents of this collection easily accessible and searchable for researchers and the general public. Knowledge Graphs have been successfully used to represent similar historical content. Nevertheless, building a Knowledge Graph for the archives is challenging, as these documents are written in medieval Latin and there is currently a lack of information extraction tools that recognise this language. This is, furthermore, compounded by a lack of medieval Latin corpora to train and evaluate machine learning algorithms, as well as a lack of an ontological representation for the contents of notarial manuscripts. In this paper, we present NotaryPedia, a Knowledge Graph for the Notarial Archives. We extend our previous work on entity and keyphrase extraction with relation extraction to populate the Knowledge Graph using an ontological vocabulary for notarial deeds. Furthermore, we perform Knowledge Graph completion using link prediction and inference. Our work was evaluated using different translational-distance and semantic-matching models to predict relations amongst literals (by promoting them to entities) and to infer new knowledge from existing entities. A relation prediction accuracy of 49% was achieved using TransE.

Charlene Ellul, Joel Azzopardi, Charlie Abela
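
Since the evaluation relies on TransE, the following sketch shows the standard TransE scoring and tail-ranking idea in generic form (not NotaryPedia's code); the entity names and embedding dimension are hypothetical.

```python
# TransE treats a triple (h, r, t) as plausible when the tail embedding is
# close to head + relation; lower distance means a better score.

import numpy as np

def transe_score(h, r, t, norm=1):
    """Lower is better: distance between (h + r) and t."""
    return np.linalg.norm(h + r - t, ord=norm)

def rank_tails(h, r, entity_embeddings):
    """Rank all candidate tail entities for a (head, relation) query."""
    scores = {name: transe_score(h, r, e) for name, e in entity_embeddings.items()}
    return sorted(scores, key=scores.get)

rng = np.random.default_rng(0)
entities = {name: rng.normal(size=50) for name in ["notary", "deed", "witness"]}
relation = rng.normal(size=50)
print(rank_tails(entities["notary"], relation, entities))
```
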
Applying a Model-Driven Approach for UML/OCL Constraints: Application to NoSQL Databases

Big Data has received a great deal of attention in recent years. Not only is the amount of data on a completely different level than before, but we also have different types of data, involving factors such as format, structure, and sources. This has definitely changed the tools we need to handle Big Data, giving rise to NoSQL systems. While NoSQL systems have proven their efficiency in handling Big Data, how the automatic storage of Big Data in NoSQL systems could be done is still an unsolved problem. This paper proposes an automatic approach for implementing UML conceptual models in NoSQL systems, including the mapping of the associated OCL constraints to the code required for checking them. In order to demonstrate the practical applicability of our work, we have realized it in a tool supporting four fundamental OCL expressions: iterate-based expressions, OCL predefined operations, the If expression and the Let expression.

Fatma Abdelhadi, Amal Ait Brahim, Gilles Zurfluh
Manhattan Siamese LSTM for Question Retrieval in Community Question Answering

Community Question Answering (cQA) platforms are platforms where users can post their questions, expecting other users to provide them with answers. We focus on the task of question retrieval in cQA, which aims to retrieve previous questions that are similar to new queries. The past answers related to the similar questions can therefore be used to respond to the new queries. The major challenges in this task are the shortness of the questions and the word mismatch problem, as users can formulate the same query using different wording. Although question retrieval has been widely studied over the years, it has received less attention in Arabic and still requires a non-trivial endeavour. In this paper, we focus on this task in both Arabic and English. We propose to use word embeddings, which can capture semantic and syntactic information from contexts, to vectorize the questions. In order to obtain longer sequences, questions are expanded with words having close word vectors. The embedding vectors are fed into a Siamese LSTM model to consider the global context of questions. The similarity between the questions is measured using the Manhattan distance. Experiments on a real-world Yahoo! Answers dataset show the efficiency of the method in Arabic and English.

Nouha Othman, Rim Faiz, Kamel Smaïli
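
The Manhattan similarity at the heart of Siamese LSTM architectures of this kind is a one-liner; the sketch below shows the common MaLSTM formulation, which we assume matches the paper's setup, on hypothetical encoder outputs.

```python
# MaLSTM-style similarity: encode both questions with a shared LSTM and
# score their similarity as exp(-||h1 - h2||_1), which lies in (0, 1].

import numpy as np

def manhattan_similarity(h1, h2):
    return np.exp(-np.abs(h1 - h2).sum())

# hypothetical final LSTM states for two questions
h_q1 = np.array([0.2, -0.1, 0.4])
h_q2 = np.array([0.1, -0.2, 0.5])
print(manhattan_similarity(h_q1, h_q2))  # close to 1 means similar questions
```
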
A Formalisation and a Computational Characterisation of ORM Derivation Rules

Object-Role Modelling (ORM) is a framework for modelling a domain using a rich set of constraints with an intuitive diagrammatic representation, not dissimilar to UML class diagrams. ORM is backed by Microsoft in Visual Studio, and it is used to support the design of large database schemas and/or complex software, easing the workflow for all stakeholders and bridging the gap among them, since every constraint of the diagram is encoded in a language understandable even by non-IT users. Besides the standard constraints, ORM also supports Derivation Rules which, in a way similar to UML/OCL constraints and SQL triggers, are able to express knowledge beyond standard graphic-based ORM capabilities. Although ORM has its own formalisation in the literature, ORM Derivation Rules lack one. The purpose of this paper is to provide a formalisation of ORM Derivation Rules in order to extend automated reasoning to diagrams equipped with Derivation Rules. Automated reasoning is useful for checking the consistency of diagrams and for inferring new knowledge to validate the diagram or to avoid mistakes that could degrade the quality of the system. We provide a formalisation of Derivation Rules with a precise syntax and a semantics grounded on a precise and non-ambiguous encoding in first-order logic. Finally, we also identify an expressive decidable fragment of Derivation Rules by means of an encoding in an expressive Description Logic. A reasoner for this fragment has been implemented in a plugin for Microsoft Visual Studio.

Francesco Sportelli, Enrico Franconi
What Are the Parameters that Affect the Construction of a Knowledge Graph?

A large number of datasets are made publicly available in a wide range of formats. Due to interoperability problems, the construction of RDF-based knowledge graphs (KGs) using declarative mapping languages has emerged with the aim of integrating heterogeneous sources in a uniform way. Although the scientific community has actively contributed several engines to solve the problem of knowledge graph construction, the lack of testbeds has prevented reproducible benchmarking of these engines. In this paper, we tackle the problem of evaluating knowledge graph creation, and we analyze and empirically study a set of variables and configurations that impact the behaviour of these engines (e.g., data size, data distribution, mapping complexity). The evaluation has been conducted on RMLMapper and the SDM-RDFizer, two state-of-the-art engines that interpret the RDF Mapping Language (RML) and transform (semi-)structured data into RDF knowledge graphs. The results allow us to discover previously unknown relations between these engines that cannot be observed in other configurations.

David Chaves-Fraga, Kemele M. Endris, Enrique Iglesias, Oscar Corcho, Maria-Esther Vidal
Creating a Vocabulary for Data Privacy
The First-Year Report of Data Privacy Vocabularies and Controls Community Group (DPVCG)

Managing privacy and understanding the handling of personal data has turned into a fundamental right, at least within the European Union, with the General Data Protection Regulation (GDPR) being enforced since May 25th, 2018. This has led to tools and services that promise GDPR compliance in terms of consent management and keeping track of the personal data being processed. The information recorded within such tools, as well as that required for compliance itself, needs to be interoperable to provide sufficient transparency regarding its usage. Additionally, interoperability is also necessary for addressing the right to data portability under GDPR, as well as for the creation of user-configurable and manageable privacy policies. We argue that such interoperability can be enabled through agreement over vocabularies using linked data principles. The W3C Data Privacy Vocabularies and Controls Community Group (DPVCG) was set up to jointly develop such vocabularies towards interoperability in the context of data privacy. This paper presents the resulting Data Privacy Vocabulary (DPV), along with a discussion of its potential uses and an invitation for feedback and participation.

Harshvardhan J. Pandit, Axel Polleres, Bert Bos, Rob Brennan, Bud Bruegger, Fajar J. Ekaputra, Javier D. Fernández, Roghaiyeh Gachpaz Hamed, Elmar Kiesling, Mark Lizar, Eva Schlehahn, Simon Steyskal, Rigo Wenning

Cloud and Trusted Computing (C&TC) 2019

Frontmatter
Multi-cloud Services Configuration Based on Risk Optimization

Nowadays, risk analysis is becoming critical in the Cloud Computing domain due to the increasing number of threats affecting applications running on cloud infrastructures. Multi-cloud environments allow connecting and migrating services from multiple cloud providers to manage risks. This paper addresses the question of how to model and configure multi-cloud services that can adapt to changes in user preferences and threats on individual and composite services. We propose an approach that combines Product Line (PL) and Machine Learning (ML) techniques to model and find, in a timely manner, optimal configurations of large adaptive systems such as multi-cloud services. A three-layer variability modeling of the domain, user preferences, and adaptation constraints is proposed to configure multi-cloud solutions. ML regression algorithms are used to quantify the risk of the resulting configurations by analyzing how a service was affected by incremental threats over time. An experimental evaluation on a real-life electronic identification and trust multi-cloud service shows the applicability of the proposed approach in predicting the risk of alternative re-configurations of autonomous and decentralized services that continuously change their availability and provision attributes.

Oscar González-Rojas, Juan Tafurth
Modeling a Multi-agent Tourism Recommender System

Today’s design of e-services for tourists means dealing with a large quantity of information and metadata that designers should be able to leverage to generate perceived value for users. In this paper, we review the design choices followed to implement a recommender system, highlighting the data processing and architectural points of view, and finally we propose a multi-agent recommender system.

Valerio Bellandi, Paolo Ceravolo, Eugenio Tacchini
Backmatter
Metadata
Title
On the Move to Meaningful Internet Systems: OTM 2019 Conferences
Edited by
Hervé Panetto
Christophe Debruyne
Martin Hepp
Dave Lewis
Dr. Claudio Agostino Ardagna
Robert Meersman
Copyright Year
2019
Electronic ISBN
978-3-030-33246-4
Print ISBN
978-3-030-33245-7
DOI
https://doi.org/10.1007/978-3-030-33246-4
