
About this Book

This volume constitutes the refereed proceedings of the Confederated International Conferences: Cooperative Information Systems, CoopIS 2014, and Ontologies, Databases, and Applications of Semantics, ODBASE 2014, held as part of OTM 2014 in October 2014 in Amantea, Italy. The 39 full papers presented together with 12 short papers and 5 keynotes were carefully reviewed and selected from a total of 115 submissions. The OTM program covers the following subjects: process design and modeling, process enactment, monitoring and quality assessment, managing similarity, software services, improving alignment, collaboration systems and applications, ontology querying methodologies and paradigms, ontology support for web, XML, and RDF data processing and retrieval, knowledge bases querying and retrieval, social network and collaborative methodologies, ontology-assisted event and stream processing, ontology-assisted warehousing approaches, and ontology-based data representation and management in emerging domains.



Process Design and Modeling


Decomposing Alignment-Based Conformance Checking of Data-Aware Process Models

Process mining techniques relate observed behavior to modeled behavior, e.g., the automatic discovery of a Petri net based on an event log. Process mining is not limited to process discovery and also includes conformance checking. Conformance checking techniques are used to evaluate the quality of discovered process models and to diagnose deviations from some normative model (e.g., to check compliance). Existing conformance checking approaches typically focus on the control-flow and are thus unable to diagnose deviations concerning data. This paper proposes a technique to check the conformance of data-aware process models. We use so-called Petri nets with Data to model data variables, guards, and read/write actions. The data-aware conformance checking problem may be very time consuming and sometimes even intractable when there are many transitions and data variables. Therefore, we propose a technique to decompose large data-aware conformance checking problems into smaller problems that can be solved more efficiently. We provide a general correctness result showing that decomposition does not influence the outcome of conformance checking. The approach is supported through ProM plug-ins, and experimental results show significant performance improvements. Experiments have also been conducted on a real-life case study, showing that the approach is also relevant in real business settings.

Massimiliano de Leoni, Jorge Munoz-Gama, Josep Carmona, Wil M. P. van der Aalst

A Pattern Approach to Conquer the Data Complexity in Simulation Workflow Design

Scientific workflows may be used to enable the collaborative implementation of scientific applications across various domains. Since each domain has its own requirements and solutions for data handling, such workflows often have to deal with a highly heterogeneous data environment. This results in increased complexity of workflow design. As scientists typically design their scientific workflows on their own, this complexity hinders them from concentrating on their core issue, namely the experiments, analyses, or simulations they conduct. In this paper, we present a novel approach to a pattern-based abstraction support for the complex data management in simulation workflows that goes beyond related work in similar research areas. A pattern hierarchy with different abstraction levels enables a separation of concerns according to the skills of the different persons involved in workflow design. The goal is that scientists are no longer obliged to specify low-level details of data management in their workflows. We discuss the advantages of this approach and show to what extent it reduces the complexity of simulation workflow design. Furthermore, we illustrate how to map patterns onto executable workflows. Based on a prototypical implementation of three real-world simulations, we evaluate our approach according to relevant requirements.

Peter Reimann, Holger Schwarz, Bernhard Mitschang

Augmenting and Assisting Model Elicitation Tasks with 3D Virtual World Context Metadata

Accurate process model elicitation continues to be a time consuming task, requiring skill on the part of the interviewer to process information from interviewees. Many errors occur in this stage that would be avoided by better activity recall, more consistent specification and greater engagement by interviewees. Situated cognition theory indicates that 3D representations of real work environments engage and prime viewer cognitive states. In this paper, we augment a previous process elicitation methodology with virtual world context metadata, drawn from a 3D simulation of the workplace. We present a conceptual and formal approach for representing this contextual metadata, integrated into a process similarity measure that provides hints for the business analyst to use in process modelling steps. Finally, we conclude with examples from two use cases to illustrate the potential abilities of this approach.

Ross Brown, Stefanie Rinderle-Ma, Simone Kriglstein, Sonja Kabicher-Fuchs

Deployment of Service-Based Processes in the Cloud Using Petri Net Decomposition

Cloud Computing is a new distributed computing paradigm that consists in provisioning infrastructure, software and platform resources as services. Platform services are limited to proprietary or specific programming frameworks and APIs, which is inadequate for the deployment of service-based processes, as these are likely to be composed of a diverse and heterogeneous set of services. In this paper, we propose a new approach to provision appropriate platform resources in order to deploy service-based processes in existing Cloud platforms. Our approach consists in slicing a given process into a set of elementary services through a Petri net decomposition approach. The source code of the obtained services is generated; the services are then packaged in our previously developed service micro-containers and deployed in any target PaaS. For the slicing, we defined algorithms to slice the corresponding Petri net into a set of dependent WF-nets and to determine the orchestration to follow for their execution. We also provide a proof that the semantics of the initial business process is preserved when executing the WF-nets. To illustrate and show the feasibility of our proposition, we provide a realistic use case scenario, i.e., the deployment of a Shop process in the Cloud Foundry PaaS.

Sami Yangui, Kais Klai, Samir Tata

Process Enactment

Log-Based Understanding of Business Processes through Temporal Logic Query Checking

Process mining is a discipline that aims at discovering, monitoring and improving real-life processes by extracting knowledge from event logs. Process discovery and conformance checking are the two main process mining tasks. Process discovery techniques can be used to learn a process model from example traces in an event log, whereas the goal of conformance checking is to compare the observed behavior in the event log with the modeled behavior. In this paper, we propose an approach based on temporal logic query checking, which sits midway between process discovery and conformance checking. It can be used to discover those LTL-based business rules that are valid in the log, by checking a (user-defined) class of rules against the log. The proposed approach is not limited to providing a boolean answer about the validity of a business rule in the log; rather, it provides valuable diagnostics in terms of traces in which the rule is satisfied (witnesses) and traces in which the rule is violated (counterexamples). We have implemented our approach as a proof of concept and conducted extensive experimentation using both synthetic and real-life logs.
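The witness/counterexample idea described above can be sketched in a few lines. The following is an illustrative toy, not the authors' implementation: it checks a single "response" rule (every occurrence of activity `a` is eventually followed by `b`) over a log of traces and partitions the log accordingly; the activity names are made up.

```python
# Minimal sketch of log-based rule checking with witnesses and
# counterexamples (illustrative only; real LTL query checking covers
# a whole class of rules, not one fixed template).

def satisfies_response(trace, a, b):
    """True iff every occurrence of `a` is later followed by some `b`."""
    pending = False
    for event in trace:
        if event == a:
            pending = True
        elif event == b:
            pending = False
    return not pending

def check_rule(log, a, b):
    """Split a log into (witnesses, counterexamples) for the rule."""
    witnesses = [t for t in log if satisfies_response(t, a, b)]
    counterexamples = [t for t in log if not satisfies_response(t, a, b)]
    return witnesses, counterexamples

log = [
    ["register", "check", "pay"],   # satisfies: `check` is followed by `pay`
    ["register", "pay"],            # satisfies vacuously: no `check`
    ["register", "check"],          # violates: `check` never followed by `pay`
]
witnesses, counterexamples = check_rule(log, "check", "pay")
```

Unlike a plain yes/no compliance verdict, returning the two trace lists gives exactly the diagnostics the abstract describes.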

Margus Räim, Claudio Di Ciccio, Fabrizio Maria Maggi, Massimo Mecella, Jan Mendling

Approach and Refinement Strategies for Flexible Choreography Enactment

Collaborative, Dynamic & Complex (CDC) systems such as adaptive pervasive systems, eScience applications, and complex business systems inherently require modeling and run time flexibility. Since domain problems in CDC systems are expressed as service choreographies and enacted by service orchestrations, we propose an approach introducing placeholder modeling constructs usable both on the level of choreographies and orchestrations, and a classification of strategies for their refinement to executable workflows. These abstract modeling constructs allow deferring the modeling decisions to later points in the life cycle of choreographies. This supports run time scenarios such as incorporating new participants into a choreography after its enactment has started or enhancing the process logic of some of the participants. We provide a prototypical implementation of the approach and evaluate it by means of a case study.

Andreas Weiß, Santiago Gómez Sáez, Michael Hahn, Dimka Karastoyanova

Business Process Fragments Behavioral Merge

In the present work, we propose an approach to merging business process fragments in order to facilitate the reuse of fragments in business process designs. The approach relies on so-called adjacency matrices. Typically used to handle graphs, this concept offers a new way to systematically merge fragments through their corresponding matrices. At the same time, merging fragments must preserve the behavior of the original fragments, consisting of their execution scenarios, and rule out undesirable behaviors that may be generated during the merge, since such behaviors would likely lead to process execution blocking. The proposed approach has been implemented and tested on a collection of fragments, and experimental results are provided.
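The structural core of such a merge can be sketched as a union of nodes and edges. The sketch below is illustrative only: for readability it uses adjacency maps ({node: set of successors}) rather than literal matrices, the activity names are invented, and it deliberately omits the paper's behavior-preservation checks.

```python
# Toy fragment merge: union the nodes and edges of two fragments.
# (A real merge must additionally rule out undesirable behaviors
# introduced by combining the fragments.)

def merge_fragments(frag1, frag2):
    """Merge two fragments given as {node: set(successors)} maps."""
    merged = {}
    for adjacency in (frag1, frag2):
        for node, successors in adjacency.items():
            merged.setdefault(node, set()).update(successors)
    return merged

f1 = {"receive": {"validate"}, "validate": {"archive"}}
f2 = {"receive": {"validate"}, "validate": {"notify"}}
merged = merge_fragments(f1, f2)
```

Note how the merged fragment now allows `validate` to be followed by either `archive` or `notify`; it is precisely such newly created combinations that the behavioral checks described in the abstract must vet.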

Mohamed Anis Zemni, Nejib Ben Hadj-Alouane, Amel Mammar

Monitoring and Quality Assessment

Provenance-Based Quality Assessment and Inference in Data-Centric Workflow Executions

In this article we present a rule-based quality model for data-centric workflows. The goal is to build a tool assisting workflow designers and users in annotating, exploring and improving the quality of data produced by complex media mining workflow executions. Our approach combines an existing fine-grained provenance generation approach [3] with a new quality assessment model for annotating XML fragments with data/application-specific quality values and inferring new values from existing annotations and provenance dependencies. We define the formal semantics using an appropriate fixpoint operator and illustrate how it can be implemented using standard Jena inference rules provided by current semantic web infrastructures.

Clément Caron, Bernd Amann, Camelia Constantin, Patrick Giroux, André Santanchè

Collaborative Building of an Ontology of Key Performance Indicators

In the present paper we propose a logic model for the representation of Key Performance Indicators (KPIs) that supports the construction of a valid reference model (or KPI ontology) by enabling the integration of definitions proposed by different engineers into a minimal and consistent system. In detail, the contribution of the paper is as follows: (i) we combine the descriptive semantics of KPIs with a logical representation of the formula used to calculate a KPI, making the algebraic relationships among indicators explicit; (ii) we discuss how this representation enables reasoning over KPI formulas to check the equivalence of KPIs and the overall consistency of the set of indicators, and present an empirical study on the efficiency of the reasoning; (iii) we present a prototype implementing the approach to collaboratively manage a shared ontology of KPI definitions.

Claudia Diamantini, Laura Genga, Domenico Potena, Emanuele Storti

Aligning Monitoring and Compliance Requirements in Evolving Business Networks

Dynamic business networks (BNs) are intrinsically characterised by change. Compliance requirements management, in this context, may become particularly challenging. Partners in the network may join and leave the collaboration dynamically, and tasks over which compliance requirements are specified may consequently be delegated to new partners or backsourced by network participants. This paper considers the issue of aligning the compliance requirements in a BN with the monitoring requirements they induce on the BN participants when change (or evolution) occurs. We first provide a conceptual model of BNs and their compliance requirements, introducing the concept of monitoring capabilities induced by compliance requirements. Then, we present a set of mechanisms to ensure consistency between the monitoring and compliance requirements when BNs evolve, e.g. tasks are delegated or backsourced in-house. Finally, we discuss a prototype implementation of our framework, which also implements a set of metrics to check the status of a BN with respect to compliance monitorability.

Marco Comuzzi

Managing Similarity

TAGER: Transition-Labeled Graph Edit Distance Similarity Measure on Process Models

Although several approaches have been proposed to compute the similarity between process models, they have various limitations. We propose an approach named TAGER (Transition-lAbeled Graph Edit distance similarity measuRe) to compute the similarity based on the edit distance between coverability graphs. As the coverability graph represents the behavior of a Petri net well, TAGER achieves a high-precision similarity computation. Besides, the T-labeled graphs (graphs isomorphic to the coverability graphs) of models are independent, so TAGER can be used as an index for searching process models in a repository. We evaluate TAGER from the efficiency and quality perspectives. The results show that TAGER meets all seven criteria that a good similarity algorithm should satisfy, as well as the distance metric requirement, and that it balances efficiency and precision well.
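To give a flavor of edit-distance-based similarity between labeled graphs, here is a deliberately crude sketch: it counts the labeled edges that must be inserted or deleted to turn one graph into the other and normalizes the count into [0, 1]. This is only a proxy under strong simplifying assumptions (the node names are invented, and full graph edit distance, let alone TAGER's coverability-graph machinery, is far more involved).

```python
# Toy edge-set similarity as a stand-in for graph edit distance.
# Graphs are sets of (source, label, target) labeled edges.

def edge_similarity(edges1, edges2):
    """1.0 for identical edge sets, 0.0 for disjoint ones."""
    union = edges1 | edges2
    if not union:
        return 1.0
    # symmetric difference = number of edge insertions/deletions needed
    differing = len(edges1 ^ edges2)
    return 1.0 - differing / len(union)

g1 = {("p0", "a", "p1"), ("p1", "b", "p2")}
g2 = {("p0", "a", "p1"), ("p1", "c", "p2")}
score = edge_similarity(g1, g2)  # union has 3 edges, 2 differ
```

Because the measure is symmetric, bounded, and zero only for edge-disjoint graphs, it already hints at why a distance-metric requirement is a natural evaluation criterion for such similarity measures.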

Zixuan Wang, Lijie Wen, Jianmin Wang, Shuhao Wang

CFS: A Behavioral Similarity Algorithm for Process Models Based on Complete Firing Sequences

Similarity measurement of process models is an indispensable task in business process management, widely used in scenarios such as organization merging, user requirement changes and model repository management. This paper presents a behavioral process similarity algorithm named CFS, based on complete firing sequences, which are used to express model behavior. We propose a matching scheme for two sets of complete firing sequences using the A* algorithm with a pruning strategy, and define a new similarity measure. Experiments show that this method yields more rational results than existing behavior-based similarity algorithms.

Zihe Dong, Lijie Wen, Haowei Huang, Jianmin Wang

Efficient Behavioral-Difference Detection between Business Process Models

Recently, business process management has come to play an increasingly important role in the management of organizations. Business process models are used to enhance the efficiency of management and to gain control of (co)operations both within an organization and between business partners. Organizations, especially big enterprises, maintain thousands of process models, and process models of the same category exhibit high similarity. For example, China Mobile Communication Corporation (CMCC) has more than 8000 processes in its office automation systems, and the processes for the same business in different subsidiaries show some variations. Therefore, to analyze and manage these similar process models, techniques are required to detect the differences between them. This paper focuses on detecting behavioral differences between process models. The current technique takes hours to compare the behaviors of two models in worst cases. Therefore, in this paper, an efficient technique is proposed that compares dependency sets and trace sets of features in process models. Experiments show that the technique can detect behavioral differences between two models within seconds in worst cases, at least four orders of magnitude faster than existing techniques.
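The dependency-set idea can be illustrated with a simplified sketch (this is a hypothetical reduction of the paper's technique, not its actual definition of features): derive the directly-follows dependencies of each model from its trace set, then report dependencies present in one model but absent from the other as behavioral differences.

```python
# Toy behavioral-difference detection via directly-follows dependencies.

def dependency_set(traces):
    """All directly-follows pairs observed across a set of traces."""
    deps = set()
    for trace in traces:
        deps.update(zip(trace, trace[1:]))
    return deps

def behavioral_differences(traces1, traces2):
    """Return (deps only in model 1, deps only in model 2)."""
    d1, d2 = dependency_set(traces1), dependency_set(traces2)
    return d1 - d2, d2 - d1

m1 = [["a", "b", "c"], ["a", "c"]]   # model 1 also allows skipping "b"
m2 = [["a", "b", "c"]]
only1, only2 = behavioral_differences(m1, m2)
```

Set operations like these are cheap, which conveys why comparing precomputed dependency and trace sets can be orders of magnitude faster than exhaustive behavioral comparison.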

Zhiqiang Yan, Yuquan Wang, Lijie Wen, Jianmin Wang

Compliance Checking of Data-Aware and Resource-Aware Compliance Requirements

Compliance checking is gaining importance as today’s organizations need to show that their business practices are in accordance with predefined (legal) requirements. Current compliance checking techniques are mostly focused on checking the control-flow perspective of business processes. This paper presents an approach for checking the compliance of observed process executions taking into account data, resources, and control-flow. Unlike the majority of conformance checking approaches, we do not restrict the focus to the ordering of activities (i.e., control-flow). We show a collection of typical data- and resource-aware compliance rules together with some domain-specific rules. Moreover, providing diagnostics and insight about the deviations is often neglected in current compliance checking techniques. We use control-flow and data-flow alignment to check compliance of processes and combine diagnostics obtained from both techniques to show deviations from prescribed behavior. Furthermore, we also indicate the severity of observed deviations. This approach integrates with two existing approaches for control-flow and temporal compliance checking, allowing for multi-perspective diagnostic information in case of compliance violations. We have implemented our techniques and show their feasibility by checking compliance of synthetic and real-life event logs against resource- and data-aware compliance rules.
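A classic example of a resource-aware compliance rule of the kind such approaches check is the four-eyes principle. The sketch below is purely illustrative (the event format and activity names are assumptions, not the authors' notation): it flags any resource that both submitted and approved the same case.

```python
# Toy resource-aware compliance check: the four-eyes principle says
# no single resource may both submit and approve within one case.
# Each event is an (activity, resource) pair.

def four_eyes_violators(trace):
    """Resources that executed both 'submit' and 'approve' in a case."""
    submitters = {res for (act, res) in trace if act == "submit"}
    approvers = {res for (act, res) in trace if act == "approve"}
    return submitters & approvers

ok_case = [("submit", "alice"), ("check", "bob"), ("approve", "bob")]
bad_case = [("submit", "alice"), ("approve", "alice")]
```

Returning the offending resources, rather than a bare pass/fail verdict, mirrors the abstract's emphasis on diagnostics about deviations.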

Elham Ramezani Taghiabadi, Vladimir Gromov, Dirk Fahland, Wil M. P. van der Aalst

Software Services

RelBOSS: A Relationship-Aware Access Control Framework for Software Services


Relationship-awareness is an important aspect of dynamically changing environments, and relationship context information brings new benefits to access control systems. Existing relationship-aware access control approaches are highly domain-specific and consider the expression of access control policies in terms of the relationship context information. However, these approaches are unable to dynamically capture the granularity levels and strengths of the relevant relationships. To this end, in this paper we present a formal Relationship-Aware Access Control (RAAC) model for specifying the relevant relationship context information and the corresponding access control policies. Using the RAAC model, we introduce an ontology-based framework, Relationship-Based access control Ontology for Software Services (RelBOSS). One of the main novelties of the framework is that it dynamically captures the relationship context information (the granularity levels and strengths of the relevant relationships). Experiments with a software prototype confirm the feasibility of our framework.

A. S. M. Kayes, Jun Han, Alan Colman, Md. Saiful Islam

A QoS-Aware, Trust-Based Aggregation Model for Grid Federations

In this work we deal with the issue of optimizing the global Quality of Service (QoS) of a Grid Federation by means of an aggregation model specifically designed for intelligent agents assisting Grid nodes. The proposed model relies on an algorithm, called FGF (Friendship and Group Formation), by which the nodes select their partners with the aim of maximizing the QoS they perceive when a computational task requires the collaboration of several Grid nodes. In the proposed solution, a suitable trust model has been designed to assist the selection of partners. Since jobs sent to Grid Federations hold complex requirements involving well-defined resource sets, trust values are calculated for specific sets of resources. We also provide a theoretical foundation and some experiments to prove that, by adopting the FGF algorithm suitably supported by the proposed trust model, the Grid Capital (which reflects the global QoS) of the Grid Federation is eventually improved.

Antonello Comi, Lidia Fotia, Fabrizio Messina, Domenico Rosaci, Giuseppe M. L. Sarnè

Towards a Formal Specification of SLAs with Compensations

In Cooperative Information Systems, service level agreements (SLAs) can be used to describe the rights and obligations of the parties involved in transactions (typically the service consumer and the service provider); amongst other information, an SLA may define guarantees associated with service level objectives (SLOs), which normally represent key performance indicators of either the consumer or the provider. In case a guarantee is under-fulfilled or over-fulfilled, SLAs may also define compensations (i.e. penalties or rewards). In this context, over the last years there have been important steps towards automating the management of SLAs; however, the formalization of compensations in SLAs remains an important challenge.

In this paper we aim to provide a characterization model to create SLAs with compensations; specifically, the main contributions are twofold: (i) the conceptualization of the Compensation Function to express penalties and rewards consistently, and (ii) a model for Compensable Guarantees that associates SLOs with Compensation Functions. These formal models aim to establish a foundation for tools that provide automated support for the modeling and analysis of SLAs with compensations. Additionally, in order to validate our approach, we model and analyze a set of guarantee terms from three real-world examples of SLAs; our formalization proves useful for detecting mistakes that typically derive from the manual specification of SLAs in natural language.

Carlos Müller, Antonio M. Gutiérrez, Octavio Martín-Díaz, Manuel Resinas, Pablo Fernández, Antonio Ruiz-Cortés

Steady Network Service Design for Seamless Inter-cloud VM Migration

The ability to move virtual machines between physical hosts is a powerful tool that virtualization provides to guarantee on-demand, scalable and survivable networking such as the Cloud. Virtual Machine (VM) migration can be handled seamlessly when a VM is moved between physical machines belonging to the same data center. However, seamless inter-Cloud VM migration (WAN migration) cannot be guaranteed, particularly for today’s delay-sensitive video streaming applications. In this paper, we propose a hybrid virtual network to transparently migrate large-volume virtual machines between data centers. Our proposed network service is designed using control theory concepts, and we propose a Lyapunov-based controller to prove the stability of the hybrid migration network.

Nihed Bahria El Asghar, Omar Cherkaoui, Mounir Frikha, Sami Tabbane

Improving Alignment

Prognosing the Compliance of Declarative Business Processes Using Event Trace Robustness

Several proposals have studied the compliance of executed business process traces with a set of compliance rules. Unfortunately, the detection of a compliance violation (diagnosis) means that the observed events have already violated the compliance rules that describe the model. In turn, detecting a compliance violation before its actual occurrence would prevent misbehaviour of the business processes; this functionality is referred to in the literature as proactive management of compliance violations. However, existing approaches focus on detecting inconsistencies between the compliance rules or on monitoring process instances that are in a violable state. The notion of robustness can help us prognosticate the occurrence of these inconsistent states early, and to determine, depending on the current execution state of the process instance, how “close” the execution is to a possible violation. On top of being able to possibly avoid violations, a robust trace is not sensitive to small changes. In this paper we propose a way to determine, at runtime, whether a process instance is robust against a set of compliance rules during its execution. Thanks to the use of constraint programming and the capabilities of super solutions, a robust trace can be guaranteed.

María Teresa Gómez-López, Luisa Parody, Rafael M. Gasca, Stefanie Rinderle-Ma

Event-Based Real-Time Decomposed Conformance Analysis

Process mining deals with the extraction of knowledge from event logs. One important task within this research field is conformance checking, which aims to diagnose deviations and discrepancies between modeled behavior and real-life, observed behavior. Conformance checking techniques still face several challenges, among them scalability, timeliness and traceability. In this paper, we propose a novel conformance analysis methodology to support the real-time monitoring of event-based data streams, which is shown to be more efficient than related approaches and able to localize deviations in a more fine-grained manner. Our approach can be applied directly in business process contexts where rapid reaction times are crucial; an exhaustive case example is provided to evidence the validity of the approach.

Seppe K. L. M. vanden Broucke, Jorge Munoz-Gama, Josep Carmona, Bart Baesens, Jan Vanthienen

Capitalizing the Designers’ Experience for Improving Web API Selection

Solutions for supporting Web API selection may rely not only on the compliance of available Web APIs with the search request (according to user-specified tags, Web API technical features such as protocols or data formats, and Web API categories), but also on votes assigned to Web APIs by web application designers who used them in the past. A web application designer who is looking for a Web API for his/her own mashup may learn from Web API ratings performed by other designers. Votes should be properly weighted by the experience/skill in application development gained by the designers who assigned them; estimating this experience is therefore crucial for exploiting it in the best way. In this paper, we propose new techniques for estimating this experience by combining several factors, such as the reputation and popularity of the applications developed by the designers in the past. We validated our proposal with preliminary experiments based on the contents of a well-known public Web API repository.

Devis Bianchini, Valeria De Antonellis, Michele Melchiori

Collaboration Systems and Applications

Software Support Requirements for Awareness in Collaborative Modeling

To address issues with traditional modeling tools (installation, model versioning and the lack of model repositories), Axellience has developed GenMyModel, the first online UML modeling tool. In GenMyModel’s beta phase, the most requested feature was collaboration. Supporting collaborative modeling involves addressing classical concerns of CSCW, usually classified through core dimensions like awareness and articulation work. We decided to focus our research on the most important dimension: awareness. Commercial modeling tools and research prototypes provide little support for awareness. To define the importance of awareness in modeling tools, we studied what awareness information is really required in collaborative modeling and assessed its importance according to articulation work types. To do this, we implemented a basic collaboration system without constraints on articulation work. After a few months of use, we identified three articulation work types present in more than 500 collaborative projects. This preliminary study allowed us to define the awareness elements potentially needed for each articulation work type. As these elements differ for each articulation work type, we launched a separate survey for each of them. With these surveys, we have sorted awareness information by relevance according to articulation work type.

Michel Dirix, Xavier Le Pallec, Alexis Muller

Trusted Dynamic Storage for Dunbar-Based P2P Online Social Networks

Online Social Networks (OSNs) are becoming more and more popular on today’s Internet. Distributed Online Social Networks (DOSNs) are OSNs which do not rely on a central server for storing users’ data, enabling users to have more control over their profile content and ensuring a higher level of privacy. The main challenge in DOSNs is guaranteeing the availability of data when the data owner is offline. In this paper we propose a new dynamic P2P approach to the problem of data persistence in DOSNs. Following Dunbar’s approach, our system stores the data of a user only on a restricted number of friends who have regular contacts with him/her. Users in this set are chosen by considering several criteria targeting different goals. Differently from other approaches, the nodes chosen to keep data replicas are not statically defined but change dynamically according to users’ churn. Our dynamic friend selection achieves availability higher than 90% with a maximum of 2 online profile replicas at a time for users with at least 40 friends. Using real Facebook data traces, we show that our approach offers high availability even when the online time of users is low.

Marco Conti, Andrea De Salve, Barbara Guidi, Francesco Pitto, Laura Ricci

A Cooperative Information System for Managing Traffic Incidents with the PAUSETA Protocol

When a serious traffic incident happens, several independent agencies must be coordinated in order to allocate the necessary resources to attend to it. Currently, the coordination among these agencies is done manually, increasing the response time. This paper presents an information system for reaching agreements among agencies, in a distributed and autonomous way, about which agency should provide which resource to manage a traffic incident. The information system is guided by the PAUSETA protocol, which executes a distributed combinatorial auction for resource allocation. Using this protocol, issues around private information about resource features and availability, as well as administrative and legal competences, are managed individually by each agency. The information system has been implemented following a multi-agent software architecture, and the prototype has been tested on a real traffic incident scenario in Castellón (Spain). The results obtained are consistent with those that an optimal centralized system would provide; however, such a centralized system is impossible to apply due to the inherently distributed nature of the problem addressed.

Miguel Prades-Farrón, Luis A. García, Vicente R. Tomás

Short Papers

Mining Business Process Deviance: A Quest for Accuracy

This paper evaluates the suitability of sequence classification techniques for analyzing deviant business process executions based on event logs. Deviant process executions are those that deviate in a negative or positive way with respect to normative or desirable outcomes, such as executions that undershoot or exceed performance targets. We evaluate a range of features and classification methods based on their ability to accurately discriminate between normal and deviant executions. We also analyze the ability of the discovered rules to explain potential causes of observed deviances. The evaluation shows that feature types extracted using pattern mining techniques only slightly outperform those based on individual activity frequency. It also suggests that more complex feature types ought to be explored to achieve higher levels of accuracy.

Hoang Nguyen, Marlon Dumas, Marcello La Rosa, Fabrizio Maria Maggi, Suriadi Suriadi

Multi-paradigm Process Mining: Retrieving Better Models by Combining Rules and Sequences

(Short Paper)

Business process mining is a well-established field of research which focuses on the automatic retrieval and analysis of process flows. The discovery and representation of these models is based on techniques that come in all shapes and forms. Most notably, procedurally-based algorithms such as Heuristics Miner have been used successfully for this purpose. Also, declarative process model miners have been proposed, which give other insights into the model by generating rules that apply to the activities. This paper proposes an integrated approach to combining these paradigms to discover process models that contain the best of both worlds to enrich insights into the event logs under scrutiny.

Johannes De Smedt, Jochen De Weerdt, Jan Vanthienen

Ontology Querying Methodologies and Paradigms


Fuzzy XPath Queries in XQuery

We have recently designed a fuzzy extension of the XPath language which provides ranked answers to flexible queries, exploiting fuzzy variants of the operators for XPath conditions, as well as two structural constraints with which a certain degree of relevance is associated. In this work, we describe how to implement the proposed fuzzy XPath in the XQuery language. Basically, we have defined an XQuery library able to handle XPath expressions in a fuzzy manner, in such a way that our proposed fuzzy XPath can be encoded as XQuery expressions. The advantage of our approach is that any XQuery processor can handle a fuzzy version of XPath by using the library we have implemented.
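
The idea of ranked answers with associated relevance degrees can be pictured with a small, self-contained sketch. The decay factor and the product-style degree computation below are assumptions made purely for illustration; they are not the semantics the paper actually defines for its fuzzy operators.

```python
# Rank candidate answers by a fuzzy relevance degree that decays with
# "structural distance" from an exact match -- an illustrative stand-in
# for the degrees a fuzzy XPath evaluator would attach to answers.

def ranked(candidates, step_penalty=0.9):
    """Each candidate is (value, depth); relevance decays by
    `step_penalty` for every structural step away from an exact match."""
    scored = [(round(step_penalty ** depth, 4), value)
              for value, depth in candidates]
    return sorted(scored, reverse=True)  # highest degree first

print(ranked([("a", 0), ("b", 2), ("c", 1)]))
# → [(1.0, 'a'), (0.9, 'c'), (0.81, 'b')]
```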

Jesús M. Almendros-Jiménez, Alejandro Luna, Ginés Moreno

Flexible Querying for SPARQL

Flexible querying techniques can be used to enhance users’ access to heterogeneous data sets, such as Linked Open Data. This paper extends SPARQL 1.1 with approximation and relaxation operators that can be applied to regular expressions for querying property paths in order to find more answers than would be returned by the exact form of a user query. We specify the semantics of the extended language and we consider the complexity of query answering with the new operators, showing that both data and query complexity are not impacted by our extensions. We present a query evaluation algorithm that returns results incrementally according to their “distance” from the original query. We have implemented this algorithm and have conducted preliminary trials over the YAGO SPARQL endpoint and the Lehigh University Benchmark, showing promising performance for the language extensions.
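
The incremental, distance-ordered evaluation described above can be illustrated with a simplified sketch. Property paths are reduced here to plain predicate sequences, and the `alternatives` table and unit edit costs are invented for the example; the actual language operates on full SPARQL 1.1 property-path regular expressions with its own approximation and relaxation semantics.

```python
# Enumerate relaxed variants of a query path in order of increasing
# "distance" from the original, mimicking incremental result delivery.
from heapq import heappush, heappop

def relaxations(path, alternatives, max_cost=2):
    """Yield (cost, variant) pairs in order of increasing edit distance;
    substituting a predicate by a declared alternative costs 1."""
    seen = {tuple(path)}
    queue = [(0, list(path))]
    while queue:
        cost, current = heappop(queue)
        yield cost, current
        if cost >= max_cost:
            continue
        for i, pred in enumerate(current):
            for alt in alternatives.get(pred, []):
                variant = current[:i] + [alt] + current[i + 1:]
                if tuple(variant) not in seen:
                    seen.add(tuple(variant))
                    heappush(queue, (cost + 1, variant))

alts = {"hasAuthor": ["hasEditor"], "cites": ["extends"]}
for cost, p in relaxations(["hasAuthor", "cites"], alts):
    print(cost, p)
```

Answers to cheaper (more exact) variants would be returned first, which is the incremental behaviour the evaluation algorithm aims for.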

Andrea Calì, Riccardo Frosini, Alexandra Poulovassilis, Peter T. Wood

How Good Is Your SPARQL Endpoint?

A QoS-Aware SPARQL Endpoint Monitoring and Data Source Selection Mechanism for Federated SPARQL Queries

Due to the decentralised and autonomous architecture of the Web of Data, data replication and local deployment of SPARQL endpoints are inevitable. Nowadays, it is common to have multiple copies of the same dataset accessible via various SPARQL endpoints, leading to the problem of selecting the optimal data source for a user query based on data properties and the requirements of the user or the application. Quality of Service (QoS) parameters can play a pivotal role in the selection of optimal data sources according to the user's requirements. QoS parameters have been widely studied in the context of web service selection. However, to the best of our knowledge, the potential of associating QoS parameters with SPARQL endpoints for optimal data source selection has not been investigated.

In this paper, we define various QoS parameters associated with SPARQL endpoints and present a semantic model for these QoS parameters and their evaluation. We present a monitoring service which automatically evaluates the QoS metrics of any given SPARQL endpoint. We demonstrate the utility of our monitoring service by implementing an extension of the SPARQL query language which caters for user requirements based on QoS parameters and selects the optimal data source for a particular user query over federated sources.
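
QoS-based source selection of the kind described above can be sketched as a weighted score over monitored metrics. The endpoint URLs, metric names, and weights below are invented placeholders; the paper's semantic QoS model and SPARQL extension are considerably richer than this plain weighted sum.

```python
# Pick the best of several replicated SPARQL endpoints by a weighted
# score over normalised QoS metrics (all assumed to be "higher is better").

def select_endpoint(endpoints, weights):
    """Return the endpoint URL with the highest weighted QoS score."""
    def score(metrics):
        return sum(weights[m] * metrics[m] for m in weights)
    return max(endpoints, key=lambda e: score(endpoints[e]))

endpoints = {
    "http://example.org/sparql-a": {"availability": 0.99, "throughput": 0.6, "freshness": 0.4},
    "http://example.org/sparql-b": {"availability": 0.95, "throughput": 0.9, "freshness": 0.8},
}
weights = {"availability": 0.5, "throughput": 0.3, "freshness": 0.2}
print(select_endpoint(endpoints, weights))  # → http://example.org/sparql-b
```

The weights would come from the user's stated requirements, so two users querying the same federation can be routed to different replicas.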

Muhammad Intizar Ali, Alessandra Mileo

Embedding OWL Querying and Reasoning into XQuery

In this paper we present a framework called XQOWL that makes it possible to handle XML and RDF/OWL data with XQuery. XQOWL can be considered an extension of the XQuery language that connects XQuery with SPARQL and OWL reasoners. XQOWL embeds SPARQL queries (via the Jena SPARQL engine) into XQuery and enables calls to OWL reasoners (HermiT, Pellet and FaCT++) from XQuery. It permits combining queries against XML and RDF/OWL resources as well as reasoning with RDF/OWL data. Therefore input data can be either XML or RDF/OWL, and output data can be formatted in XML (also using the RDF/OWL XML serialization).

Jesús M. Almendros-Jiménez

Ontology Support for Web, XML and RDF Data Processing and Retrieval

Deriving Folksonomies for Improving Web API Search

Web APIs, that is, software components made available by third parties through web interfaces, can be aggregated to develop web applications, also known as mashups. In this application domain too, tagging performed by other mashup designers, who used the available Web APIs and the mashups composed of them, can be exploited as knowledge that progressively emerges from the community of designers. Web API tagging has some peculiar aspects that are analyzed in this paper. Folksonomies, on the other hand, are Web 2.0 tools for conceptualizing knowledge that emerges bottom-up. In this paper, we discuss the adoption of folksonomy concepts in modeling Web API use for mashup development. We motivate the adoption of folksonomies in this context and present the differences with other models that represent very similar information. Our folksonomy model is meant to be fully compliant with existing and commonly used public Web API repositories. It is not intended to substitute them, but to complement their contents in order to enable advanced Web API search facilities in such a collaborative environment.

Devis Bianchini

Adaptive Similarity of XML Data

In this work we explore the application of XML schema similarity mapping in the area of conceptual modeling of XML schemas. We expand upon our previous efforts to map XML schemas to a common platform-independent schema using similarity evaluation based on the exploitation of a decision tree. In particular, in this paper a more versatile method is implemented and the decision tree is trained using a large set of user-annotated mapping decision samples. Several variations of training that could improve the mapping results are proposed. The approach is implemented within a modeling and evolution management framework, and its variations are evaluated using a wide range of experiments.

Eva Jílková, Marek Polák, Irena Holubová

FAGI: A Framework for Fusing Geospatial RDF Data

In this paper, we present FAGI, a framework for fusing geospatial RDF data. Starting from two interlinked datasets, FAGI handles all the steps of the fusion process, producing an integrated, richer dataset that combines entities and attributes from both initial ones. In contrast to existing approaches and tools, which deal either with RDF fusion or with spatial conflation, FAGI specifically addresses the fusion of geospatial RDF data. We describe the main components of the framework and their functionalities, which include aligning dataset vocabularies, processing geospatial features, applying fusion strategies (either manually or automatically), and recommending link creation or rejection between RDF entities, with emphasis on their geospatial properties.

Giorgos Giannopoulos, Dimitrios Skoutas, Thomas Maroulis, Nikos Karagiannakis, Spiros Athanasiou

Knowledge Bases Querying and Retrieval

Online Reasoning for Ontology-Based Error Detection in Text

Detecting errors in text is a difficult task. Current methods use a domain ontology to identify elements in the text that contradict domain knowledge. Yet these methods require manually defining the types of errors expected in the text before applying them. In this paper we propose a new approach that uses logic reasoning to detect errors in statements from text online. The approach applies Information Extraction to transform text into a set of logic clauses, which are incorporated into the domain ontology to determine whether they contradict it. If a statement contradicts the domain ontology, then the statement is incorrect with respect to the domain knowledge. We have evaluated the proposed method by applying it to a set of written summaries from the domain of Ecosystems. We have found that this approach, although depending on the quality of the Information Extraction output, can identify a significant number of errors. We have also found that how elements of the ontology are modeled (i.e., property domain and range) likewise affects the capability of detecting errors.

Fernando Gutierrez, Dejing Dou, Stephen Fickas, Gina Griffiths

Making Metaquerying Practical for Hi(DL-Lite_R) Knowledge Bases

Hi(DL-Lite_R) is a higher-order Description Logic obtained from DL-Lite_R by adding metamodeling features, and is equipped with a query language that is able to express higher-order queries. We investigate the problem of answering a particular class of such queries, called instance higher-order queries, posed over Hi(DL-Lite_R) knowledge bases (KBs). The only existing algorithm for this problem is based on the idea of reducing the evaluation of a higher-order query Q over a Hi(DL-Lite_R) KB to the evaluation of a union of first-order queries over a DL-Lite_R KB, built from Q by instantiating its metavariables in all possible ways. Although of polynomial time complexity with respect to the size of the KB, this algorithm turns out to be inefficient in practice. In this paper we present a new algorithm, called Smart Binding Planner (SBP), that compiles Q into a program that issues a sequence of first-order conjunctive queries, where each query has the goal of providing the bindings for the metavariables of the next ones, and the last one completes the process by computing the answers to Q. We also illustrate some experiments showing that, in practice, SBP is significantly more efficient than the previous approach.

Maurizio Lenzerini, Lorenzo Lepore, Antonella Poggi

Knowledge-Based Compliance Checking of Business Processes

Discovering discrepancies between actual business processes and the business processes outlined by standards and best practices facilitates the decision making of process owners regarding business process improvement. In semantic compliance management, knowledge related to job roles can serve as a basis for checking compliance. Process ontologies preserve the structure of business processes and bear background knowledge for executing processes, which can be structured in domain ontologies. This paper presents how process and domain ontologies can work together in ontology matching to make a proposal to process owners regarding compliance checking. A case from nursing practice illustrates the applicability of this approach.

Ildikó Szabó, Krisztián Varga

Improved Automatic Maturity Assessment of Wikipedia Medical Articles

(Short Paper)

The Internet is naturally a simple and immediate means to retrieve information. However, not everything one can find is equally accurate and reliable. In this paper, we continue our line of research towards effective techniques for assessing the quality of online content. Focusing on the Wikipedia Medicine Portal, in a previous work we implemented an automatic technique to assess the quality of each article and compared our results to the classification of the articles given by the portal itself, obtaining quite different outcomes. Here, we present a lightweight instantiation of our methodology that reduces both redundant features and those not mentioned by the WikiProject guidelines. What we obtain is a fine-grained assessment and a better discrimination of the articles' quality with respect to previous work. Our proposal could help to automatically evaluate the maturity of Wikipedia medical articles in an efficient way.

Emanuel Marzini, Angelo Spognardi, Ilaria Matteucci, Paolo Mori, Marinella Petrocchi, Riccardo Conti

Social Network and Collaborative Methodologies

Integrity Management in a Trusted Utilitarian Data Exchange Platform

Utilitarian data refers to data elements that can be readily put to use by one or more stakeholders. The utility of a data element is often subjective and intertwined, in the sense that a positive utility for one stakeholder may result in a negative utility for some other stakeholder. Also, the credibility of utilitarian data is often established based on the credibility of its source. For this reason, defining and managing the integrity of utilitarian data exchanges is a non-trivial problem. This paper describes the problem of integrity management in an inter-organizational utilitarian data exchange platform, and introduces a subsystem for managing integrity. Scalability is addressed based on mechanisms of privilege percolation through containment. Formal characteristics of the proposed model are derived based on an approach of adversarial scenario-handling.

Sweety Agrawal, Chinmay Jog, Srinath Srinivasa

A Model to Support Multi-Social-Network Applications

It is not uncommon for people to create multiple profiles in different social networks, spreading their personal information over them. This leads to a multi-social-network scenario where different social networks cannot be viewed as monads, but are strongly correlated to each other. Building a suitable middleware on top of social networks to support internetworking applications is an important challenge, as a global view of the social network world provides very powerful knowledge and opportunities. In this paper, we take a first important step towards this goal by defining and implementing a model aimed at generalizing concepts, actions and relationships of existing social networks.

Francesco Buccafurri, Gianluca Lax, Serena Nicolazzo, Antonino Nocera

A Method for Detecting Behavior-Based User Profiles in Collaborative Ontology Engineering

Ontology engineering is far from trivial, and most collaborative methods and tools start from a predefined set of roles that stakeholders can have in the ontology engineering process. We, however, believe that the different types of user behavior are not known a priori and depend on the ontology engineering project. The detection of such user profiles based on unsupervised learning allows finding roles and responsibilities among peers in a collaborative setting. In this paper, we present a method for the automatic detection of user profiles in a collaborative ontology engineering environment by means of the K-means clustering algorithm, looking only at the types of interaction a user makes. We use the GOSPL ontology engineering tool and method to demonstrate this method. The data used to demonstrate the method stems from two ontology engineering projects involving 42 and 36 users respectively.
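
Profiling users by clustering their interaction counts can be sketched with a minimal K-means over two-dimensional points. The interaction types and counts below are invented for illustration; the paper clusters richer interaction vectors from GOSPL logs.

```python
# Minimal K-means over per-user interaction counts: users who mostly edit
# concepts end up in one cluster, users who mostly vote in another.
import random

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(cluster):
    return tuple(sum(xs) / len(cluster) for xs in zip(*cluster))

def kmeans(points, k, iterations=20, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centroids[c]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster happens to come up empty.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Each point is (#concept edits, #community votes) for one user --
# purely illustrative numbers, not data from the paper.
users = [(20, 1), (18, 2), (1, 15), (2, 17), (19, 0)]
centroids, clusters = kmeans(users, k=2)
```

On these well-separated points the algorithm recovers the two behavioural groups (three "editors" and two "voters") regardless of the random initialisation.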

Sven Van Laere, Ronald Buyl, Marc Nyssen

Defining and Investigating the Scope of Users and Hashtags in Twitter

(Short Paper)

In this paper we aim at analyzing the scope of an entity in Twitter. In particular, we want to define a framework for measuring this scope from multiple viewpoints (e.g., influence, reliability, popularity) simultaneously and for multiple entities (e.g., users, hashtags). In this way, we can compare different properties and/or different entities. This comparison allows the extraction of knowledge patterns (for instance, the presence of anomalies and outliers) that can be exploited in several application domains (for example, information diffusion).

Daniel Leggio, Giuseppe Marra, Domenico Ursino

Ontology-Assisted Event and Stream Processing

Constructing Event Processing Systems of Layered and Heterogeneous Events with SPARQL

SPARQL was originally developed as a derivative of SQL to process queries over finite-length datasets encoded as RDF graphs. Processing of infinite data streams with SPARQL has been approached by using pre-processors dividing streams into finite-length windows based on either time or the number of incoming triples. Recent extensions to SPARQL can support interconnections of queries, enabling event processing applications to be constructed out of multiple incrementally processed collaborating SPARQL update rules. With more elaborate networks of queries it is possible to perform event processing on heterogeneous event formats without strict restrictions on the number of triples per event. Heterogeneous event support combined with the capability to synthesize new events enables the creation of layered event processing systems. In this paper we review the different types of complex event processing building blocks presented in the literature and show their translations to SPARQL update rules through examples, supporting a modular and layered approach. The interconnected examples demonstrate the creation of an elaborate network of SPARQL update rules for solving event processing tasks.

Mikko Rinne, Esko Nuutila

On Efficient Processing of Linked Stream Data

Today, many application areas require continuous processing of data streams in an efficient manner and real-time fashion. Processing these continuous flows of data, integrating dynamic data with other data sources, and providing the required semantics lead to real challenges. Thus, Linked Stream Data (LSD) has been proposed, which combines two concepts: Linked Open Data and Data Stream Processing (DSP). Recently, several LSD engines have been developed, including C-SPARQL and CQELS, which are based on SPARQL extensions for continuous query processing. However, this SPARQL-centric view makes it difficult to express complex processing pipelines. In this paper, we propose an LSD engine based on a more general stream processing approach. Instead of a variant of SPARQL, our engine provides a dataflow specification language which is compiled into native code. The language supports native stream processing operators (e.g., windows, aggregates, and joins), complex event processing, as well as RDF data transformation operators such as tuplifier and triplifier, to efficiently support LSD queries and provide a higher degree of expressiveness. We discuss the main concepts addressing the challenges of LSD processing and describe the usage of these concepts for processing queries from LSBench and SRBench. We show the effectiveness of our system in terms of query execution times through a comparison with existing systems as well as through a detailed performance analysis of our system implementation.

Omran Saleh, Kai-Uwe Sattler

Applying Semantics to Optimize End-User Services in Telecommunication Networks

This paper describes Aesop, a semantically enhanced approach for optimizing telecommunication networks to manage over-the-top end-user services, such as web browsing and video streaming, according to the experience expectations of individual users. Aesop is built using technologies already available in the semantic ecosystem. In this paper, we explain our semantic approach to this problem domain and describe how we applied semantic technologies in the design and implementation of our system. We give an overview of an implementation and evaluation of Aesop. We describe our experiences in using semantic technologies, and explore their potential and limitations in the telecommunication management domain.

Liam Fallon, John Keeney, Declan O’Sullivan

Ontology-Assisted Warehousing Approaches

An Ontology-Based Data Exploration Tool for Key Performance Indicators

This paper describes the main functionalities of an ontology-based data explorer for Key Performance Indicators (KPIs), aimed at supporting users in the extraction of KPI values from a shared repository. Data produced by partners of a Virtual Enterprise are semantically annotated through a domain ontology in which KPIs are described together with their mathematical formulas. Based on this model and on reasoning capabilities, the tool provides functionalities for the dynamic aggregation of data and the computation of KPI values through their formulas. In this way, besides the usual drill-down, a novel mode of data exploration is enabled, based on the expansion of a KPI into its components.
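
Expanding a KPI into its components and computing its value from a formula can be illustrated with a tiny recursive evaluator. The KPI name, formula, and raw values below are invented for the example and are not taken from the paper's ontology.

```python
# Expand a KPI through its formula down to raw component values, the
# way an ontology-based explorer might compute a derived indicator.

FORMULAS = {
    # KPI name -> (operator, component KPIs); leaves live in RAW_VALUES.
    "on_time_delivery_rate": ("/", ["on_time_deliveries", "total_deliveries"]),
}
RAW_VALUES = {"on_time_deliveries": 90.0, "total_deliveries": 120.0}

def compute(kpi):
    """Recursively resolve a KPI: raw values are returned directly,
    derived KPIs are computed from their component KPIs."""
    if kpi in RAW_VALUES:
        return RAW_VALUES[kpi]
    op, parts = FORMULAS[kpi]
    values = [compute(p) for p in parts]
    if op == "/":
        return values[0] / values[1]
    if op == "+":
        return sum(values)
    raise ValueError(f"unsupported operator {op!r}")

print(compute("on_time_delivery_rate"))  # → 0.75
```

The same expansion also supports the exploration mode the paper describes: instead of only returning 0.75, a tool can show the user the component KPIs and values that produced it.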

Claudia Diamantini, Domenico Potena, Emanuele Storti, Haotian Zhang

Arabic-English Domain Terminology Extraction from Aligned Corpora

The rapid growth of information sources has produced a large amount of electronically stored documents evolving every day. The development of Information Retrieval Systems (IRSs) is a response to this growth, which aims to help the user identify relevant information. Recent IRSs propose to guide the user by providing domain knowledge in the form of controlled vocabularies or terminologies and, thus, domain ontologies. In this context, it is necessary to develop multilingual termino-ontological resources. This paper proposes a new approach for bilingual domain terminology extraction in the Arabic and English languages as a first step in bilingual domain ontology building, to be exploited in terminological search. The approach uses Arabic vocalized texts to reduce ambiguities and an alignment process to extract the English translations. To the best of our knowledge, the process implemented in our approach (morphological analysis, Arabic terminology extraction, alignment and extraction of English translations) is the first work in the field of Arabic-English bilingual domain terminology extraction. The results of experiments are encouraging, showing rates of relevant term extraction from multiple domains, together with their translations, exceeding 89%.

Wiem Lahbib, Ibrahim Bounhas, Bilel Elayeb

Towards a Configurable Database Design: A Case of Semantic Data Warehouses

Many modern software systems are designed to be highly configurable. Configuration contributes to managing evolving software and controls the cost involved in making changes to software. Several standards exist for software configuration management (IEEE 828 and IEEE 1042). Unfortunately, making databases configurable has not flourished in the same way as it has for software, even though a database can be seen as a software product. Nowadays, we are witnessing an explosion of new deployment layouts and platforms. This situation has pushed the database community to admit the slogan: "one size no longer fits all". This motivates us to study how to make database design configurable. To satisfy this objective, we need to perform the following three tasks: (i) a deep understanding of the database design life-cycle, (ii) a formalization of each phase, and (iii) an identification of the interactions between these phases. In this paper, we detail these tasks by considering the case of designing semantic data warehouses.

Selma Khouri, Ladjel Bellatreche

Ontology-Based Data Representation and Management in Emerging Domains

Describing Research Data: A Case Study for Archaeology

The growth of the digital resources produced by research activities demands the development of e-Infrastructures in which researchers can access remote facilities, select and re-use huge volumes of data and services, run complex experimental processes and share results. Data registries aim to describe the data of e-Infrastructures uniformly, contributing to the re-usability and interoperability of big scientific data. However, the current situation requires the development of powerful resource integration mechanisms that step beyond the principles guaranteed by data registry standards. This paper proposes a conceptual model for describing data resources and services and extends the existing specifications for the development of data registries. The model has been implemented in the context of ARIADNE, an EU-funded project that focuses on the integration of archaeological digital resources all over Europe.

Nicola Aloia, Christos Papatheodorou, Dimitris Gavrilis, Franca Debole, Carlo Meghini

Enriching Semantically Web Service Descriptions

Service Oriented Computing (SOC) has incrementally been adopted as the preferred programming paradigm for the development, integration and interoperation of large and complex information systems. However, despite its increasing popularity, SOC has not yet achieved its full potential. This is mainly due to the lack of supporting tools to semantically enrich and represent Web service descriptions. This paper describes a solution approach for the automatic representation of Web service descriptions and their further semantic enrichment, relating operation names based on the calculation of four semantic similarity measures. The enrichment approach is accurate because the final decision is made through a voting scheme; in the case of inconsistent results, these are not asserted into the ontology. Experimentation shows that although few similarity relationships are found and asserted, they represent an important step towards the automatic discovery of information that was previously unknown.

Maricela Bravo, José Rodríguez, Alejandro Reyes

Parameterized Algorithms for Matching and Ranking Web Services

(Short Paper)

The paper presents two parameterized and customizable algorithms for matching and ranking Web services. Given a user query and a set of available Web services, the matching algorithm performs logic-based semantic matchmaking to select services that functionally match the query and retains those which fully satisfy the constraints specified by the user. The ranking algorithm takes the matching Web services as input, assigns each one a score in the range 0-1, and finally ranks them based on the score values. The algorithms have been implemented, evaluated and compared to iSEM and SPARQLent. Results show that the algorithms perform globally well in comparison with these frameworks.
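
The two-stage match-then-rank scheme described above can be sketched as follows. The service records, the constraint check, and the 0-1 overlap score are all invented placeholders; the paper's algorithms use logic-based semantic matchmaking rather than plain set overlap.

```python
# Stage 1: keep services that functionally match the query and satisfy
# every user constraint. Stage 2: score the survivors in [0, 1] and sort.

def match(services, required_inputs, constraints):
    """Keep services whose inputs cover the query and that satisfy
    every user-supplied constraint predicate."""
    return [s for s in services
            if required_inputs <= set(s["inputs"])
            and all(c(s) for c in constraints)]

def rank(services, query_outputs):
    """Score each match by its output overlap with the query, then
    return (score, name) pairs sorted best-first."""
    def score(s):
        return len(set(s["outputs"]) & query_outputs) / len(query_outputs)
    return sorted(((score(s), s["name"]) for s in services), reverse=True)

services = [
    {"name": "WeatherSvc", "inputs": ["city"], "outputs": ["temp", "wind"]},
    {"name": "TempSvc", "inputs": ["city"], "outputs": ["temp"]},
]
matches = match(services, {"city"}, [lambda s: len(s["outputs"]) >= 1])
print(rank(matches, {"temp", "wind"}))
# → [(1.0, 'WeatherSvc'), (0.5, 'TempSvc')]
```

Separating matching from ranking is what makes both stages independently parameterizable, which is the customizability the paper emphasizes.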

Fatma Ezzahra Gmati, Nadia Yacoubi-Ayadi, Salem Chakhar

Arabic Domain Terminology Extraction: A Literature Review

(Short Paper)

Domain terminology extraction is an important step in many applications such as ontology building and information retrieval. Analyzing a corpus to automatically extract key terms is a difficult task, especially in the case of the Arabic language. The complexity of the spelling, morphology and semantics of Arabic makes natural language processing tasks quite difficult. In addition to the complexity of Arabic, the challenges related to domain terminology extraction stem from the inherent difficulty of determining whether or not a word or phrase is representative of a given text. These problems have not prevented a multitude of Arabic terminology extraction approaches from being proposed for the ontology building process. This article therefore presents a literature review of the field of Arabic terminology extraction, focusing on the specificities of this language.

Ibrahim Bounhas, Wiem Lahbib, Bilel Elayeb

Erratum: Flexible Querying for SPARQL

The acknowledgment of the paper starting on page 490 of this volume is missing. It should read:


Andrea Calì acknowledges support by the EPSRC project “Logic-Based Integration and Querying of Unindexed Data” (EP/E010865/1).

Andrea Calì, Riccardo Frosini, Alexandra Poulovassilis, Peter T. Wood

