
2015 | Book

The Semantic Web: ESWC 2015 Satellite Events

ESWC 2015 Satellite Events, Portorož, Slovenia, May 31 – June 4, 2015, Revised Selected Papers

Edited by: Fabien Gandon, Christophe Guéret, Serena Villata, John Breslin, Catherine Faron-Zucker, Antoine Zimmermann

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science


About this book

This book constitutes the thoroughly refereed post-conference proceedings of the Satellite Events of the 12th International Conference on the Semantic Web, ESWC 2015, held in Portorož, Slovenia, in May/June 2015. The volume contains 12 poster and 22 demonstration papers, selected from 50 submissions, as well as 22 best workshop papers selected from 140 papers presented at the workshops at ESWC 2015. The papers cover various aspects of the Semantic Web.

Table of Contents

Frontmatter

Demo and Poster Papers

Frontmatter
3XL News: A Cross-lingual News Aggregator and Reader

We present 3XL News, a multilingual news aggregation application for iPad that provides real-time, comprehensive, global and multilingual news coverage. Using methods developed within the XLike project for semantic data extraction from news articles and linking of news stories, we are able to construct a concise yet in-depth view of current news stories and their semantic relations. This enables users to monitor current global events in real time, analyze diverse reporting in different languages, and navigate across related news stories.

Evgenia Belyaeva, Jan Berčič, Katja Berčič, Flavio Fuart, Aljaž Košmerlj, Andrej Muhič, Aljoša Rehar, Jan Rupnik, Mitja Trampuš
Towards Scalable Visual Exploration of Very Large RDF Graphs

In this paper, we outline our work on developing a disk-based infrastructure for efficient visualization and graph exploration operations over very large graphs. The proposed platform, called graphVizdb, is based on a novel technique for indexing and storing the graph. In particular, the graph layout is indexed with a spatial data structure, i.e., an R-tree, and stored in a database. At runtime, user operations are translated into efficient spatial operations (i.e., window queries) in the backend.
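
To illustrate the indexing idea, the sketch below indexes pre-computed node coordinates with the Python rtree package and answers a viewport request as a window query; graphVizdb itself keeps the layout in a relational database, so the package, coordinates and identifiers here are assumptions of this example rather than the authors' implementation.

    # Sketch: index pre-computed node positions and answer a viewport ("window") query.
    # Uses the Python `rtree` package; graphVizdb itself keeps the layout in a
    # relational database with a spatial index, so this is only an illustration.
    from rtree import index

    # node_id -> (x, y) coordinates produced by some layout algorithm (made up here)
    layout = {1: (0.0, 0.0), 2: (5.0, 2.0), 3: (50.0, 40.0), 4: (7.0, 1.5)}

    idx = index.Index()
    for node_id, (x, y) in layout.items():
        idx.insert(node_id, (x, y, x, y))      # a point is a degenerate rectangle

    # The user's current viewport becomes a window query against the index.
    viewport = (-1.0, -1.0, 10.0, 10.0)        # (min_x, min_y, max_x, max_y)
    visible_nodes = sorted(idx.intersection(viewport))
    print(visible_nodes)                       # -> [1, 2, 4]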

Nikos Bikakis, John Liagouris, Maria Kromida, George Papastefanatos, Timos Sellis
SmartKeepers: A Decentralized, Secure, and Flexible Social Platform for Coworkers

Coworking (a style of work that involves a shared working environment and networking) has emerged as an attractive model for organizations. It relies on highly dynamic collaboration among different partners. Traditional centralized social platforms lack fundamental requirements for such collaboration: management of the dynamic topology of these professional networks, privacy, data exchange, and ownership. In this paper, we present SmartKeepers, a decentralized and secure environment for coworking activities. Each user physically owns a node that they plug in and out as they move from one collaborative space to another. The system supports a large variety of network topologies and is fully interoperable with W3C-compliant solutions. We showcase the cogency of the Semantic Web for building decentralized and secure services while keeping every user at the core of the data ownership process.

Romain Blin, Charline Berthot, Julien Subercaze, Christophe Gravier, Frederique Laforest, Antoine Boutet
How to Stay Ontop of Your Data: Databases, Ontologies and More

Ontop is an Ontology Based Data Access system allowing users to access a relational database through a conceptual layer provided by an ontology. In this demo, we use the recently developed NPD benchmark (+4 billion triples) to demonstrate the features of Ontop. First we use Ontop as a SPARQL end-point to load the ontology and mappings, and answer SPARQL queries. Then, we show how to use Ontop to check inconsistencies and exploit SWRL ontologies.
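
Once the ontology and mappings are loaded, the endpoint can be queried like any other SPARQL service. A minimal sketch with SPARQLWrapper follows; the endpoint URL and the query vocabulary are placeholders, not the actual NPD benchmark setup.

    # Sketch: query an Ontop-backed SPARQL endpoint over HTTP.
    # The endpoint URL and the vocabulary are placeholders, not the real NPD setup.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("http://localhost:8080/sparql")   # hypothetical endpoint
    sparql.setQuery("""
        PREFIX ex: <http://example.org/npd#>
        SELECT ?well ?name WHERE {
            ?well a ex:Wellbore ;
                  ex:name ?name .
        } LIMIT 10
    """)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()

    for binding in results["results"]["bindings"]:
        print(binding["well"]["value"], binding["name"]["value"])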

Diego Calvanese, Benjamin Cogrel, Sarah Komla-Ebri, Davide Lanti, Martin Rezk, Guohui Xiao
This ‘Paper’ is a Demo

This ‘paper’, when viewed on the Web, is the demo itself, since the interactive and semantic features can be directly observed while reading and consuming it. The demo showcases how scholarly communication can adapt to the audience, whether the content is read on a screen or printed on paper, listened to with a screen reader, watched as a movie, shown as a presentation, or even interacted with in the document. To experience the described features, please open this document in your Web browser under its canonical URI: http://csarven.ca/this-paper-is-a-demo.

Sarven Capadisli, Sören Auer, Reinhard Riedl
Improving Semantic Relatedness in Paths for Storytelling with Linked Data on the Web

Algorithmic storytelling over Linked Data on the Web is a challenging task in which many graph-based pathfinding approaches experience issues with consistency regarding the resulting path that leads to a story. In order to mitigate arbitrariness and increase consistency, we propose to improve the semantic relatedness of concepts mentioned in a story by increasing the relevance of links between nodes through additional domain delineation and refinement steps. On top of this, we propose the implementation of an optimized algorithm controlling the pathfinding process to obtain a more homogeneous search domain and retrieve more links between adjacent hops in each path. Preliminary results indicate the potential of the proposal.

Laurens De Vocht, Christian Beecks, Ruben Verborgh, Thomas Seidl, Erik Mannens, Rik Van de Walle
Dataset Summary Visualization with LODSight

We present a web-based tool that shows a summary of an RDF dataset as a visualization of a graph formed from the classes, datatypes and predicates used in the dataset. The visualization allows users to quickly and easily find out what kind of data the dataset contains and how it is structured. It also shows how vocabularies are used in the dataset.

Marek Dudáš, Vojtěch Svátek, Jindřich Mynarz
The ProtégéLOV Plugin: Ontology Access and Reuse for Everyone

Developing ontologies by reusing already available and well-known ontologies is commonly acknowledged to play a crucial role in facilitating the inclusion and expansion of the Web of Data. Some recommendations exist to guide ontologists in ontology engineering, but they do not provide guidelines on how to reuse vocabularies at a fine-grained level, i.e., reusing specific classes and properties. Moreover, it is still hard to find a tool that provides users with an environment to reuse terms. This paper presents ProtégéLOV, a plugin for the ontology editor Protégé that combines access to the Linked Open Vocabularies (LOV) with ontology modeling. It allows users to search a term in LOV and provides three actions if the term exists: (i) replace the selected term in the current ontology; (ii) add an rdfs:subClassOf or rdfs:subPropertyOf axiom between the selected term and the local term; or (iii) add an owl:equivalentClass or owl:equivalentProperty axiom between the selected term and the local term. Results from a preliminary user study indicate that ProtégéLOV does provide intuitive access to and reuse of terms in external vocabularies.
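
Actions (ii) and (iii) amount to asserting standard RDFS/OWL axioms between the local term and the reused LOV term, as the rdflib sketch below shows; the IRIs are illustrative only, not terms taken from LOV's actual index.

    # Sketch: the kinds of axioms ProtégéLOV adds, expressed with rdflib.
    # The local term and the reused term below are illustrative IRIs only.
    from rdflib import Graph, Namespace
    from rdflib.namespace import OWL, RDFS

    EX = Namespace("http://example.org/myonto#")     # the ontology being edited
    FOAF = Namespace("http://xmlns.com/foaf/0.1/")   # a vocabulary found via LOV

    g = Graph()
    local_term, reused_term = EX.Contributor, FOAF.Person

    g.add((local_term, RDFS.subClassOf, reused_term))      # action (ii): subsume the local term
    g.add((local_term, OWL.equivalentClass, reused_term))  # action (iii): or state equivalence instead

    print(g.serialize(format="turtle"))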

Nuria García-Santa, Ghislain Auguste Atemezing, Boris Villazón-Terrazas
Controlling and Monitoring Crisis

Nowadays there is an increasing interest in using social media content for detecting and helping during natural disasters. That interest is mainly based on the successful use of this type of data in the different phases of a disaster management process, which range from early detection to efficient communication during the management of disasters. This paper focuses on the first phases of disaster management and presents a system that allows analyzing, enriching, and detecting needs in real time from a set of web sources. By using the concept of the crowd as a sensor, this application can help, jointly with other traditional systems, to improve the currently applied procedures during the initial phases of a disaster and to detect demanded needs.

Nuria García-Santa, Esteban García-Cuesta, Boris Villazón-Terrazas
FAGI-gis: A Tool for Fusing Geospatial RDF Data

In this demonstration, we present FAGI-gis, a tool for fusing geospatial RDF data. FAGI-gis is the core component of the FAGI framework, which handles all the steps of the fusion process of two interlinked RDF datasets in order to produce an integrated, aligned and richer dataset that combines data and metadata from both initial datasets. In the demonstration, we showcase how a user can use FAGI-gis's map-based UI to perform several fusion actions on linked geospatial entities, considering both their spatial and non-spatial properties.

Giorgos Giannopoulos, Nick Vitsas, Nikos Karagiannakis, Dimitrios Skoutas, Spiros Athanasiou
A Semantic, Task-Centered Collaborative Framework for Science

This paper gives an overview of the Organic Data Science framework, a new approach for scientific collaboration that opens the science process and exposes information about shared tasks, participants, and other relevant entities. The framework enables scientists to formulate new tasks and contribute to tasks posed by others. The framework is currently in use by a science community studying the age of water, and is beginning to be used by others.

Yolanda Gil, Felix Michel, Varun Ratnakar, Matheus Hauder
QueryVOWL: Visual Composition of SPARQL Queries

In order to make SPARQL queries more accessible to users, we have developed the visual query language QueryVOWL. It defines SPARQL mappings for graphical elements of the ontology visualization VOWL. In this demo, we present a web-based prototype that supports the creation, modification, and evaluation of QueryVOWL graphs. Based on the selected SPARQL endpoint, it provides suggestions for extending the query, and retrieves IRIs and literals according to the selections in the QueryVOWL graph. In contrast to related work, SPARQL queries can be created entirely with visual elements.

Florian Haag, Steffen Lohmann, Stephan Siek, Thomas Ertl
Merging and Enriching DCAT Feeds to Improve Discoverability of Datasets

Data Catalog Vocabulary (DCAT) is a W3C specification to describe datasets published on the Web. However, these catalogs are not easily discoverable based on a user's needs. In this paper, we introduce the Node.js module 'dcat-merger', which allows a user agent to download and semantically merge different DCAT feeds from the Web into one DCAT feed, which can be republished. Merging the input feeds is followed by enriching them. Besides determining the subjects of the datasets using DBpedia Spotlight, two extensions were built: one categorizes the datasets according to a taxonomy, and the other adds spatial properties to the datasets. These extensions require information available in DBpedia's SPARQL endpoint; because public SPARQL endpoints often suffer from low availability, its Triple Pattern Fragments alternative is used. The need for DCAT Merger also sparks the discussion for more high-level functionality to improve a catalog's discoverability.
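
Before any enrichment, the merging step amounts to parsing the individual DCAT feeds into one RDF graph and republishing it. A minimal Python/rdflib sketch follows; dcat-merger itself is a Node.js module, and the feed URLs below are placeholders.

    # Sketch: merge several DCAT feeds into one RDF graph and republish it.
    # Feed URLs are placeholders; the actual dcat-merger is a Node.js module.
    from rdflib import Graph

    feed_urls = [
        "http://example.org/portal-a/catalog.ttl",
        "http://example.org/portal-b/catalog.ttl",
    ]

    merged = Graph()
    for url in feed_urls:
        merged.parse(url, format="turtle")   # rdflib adds the triples to one graph

    merged.serialize(destination="merged-catalog.ttl", format="turtle")
    print(len(merged), "triples in the merged DCAT feed")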

Pieter Heyvaert, Pieter Colpaert, Ruben Verborgh, Erik Mannens, Rik Van de Walle
Minimally Supervised Instance Matching: An Alternate Approach

Instance matching concerns identifying pairs of instances that refer to the same underlying entity. Current state-of-the-art instance matchers use machine learning methods. Supervised learning systems achieve good performance by training on significant amounts of manually labeled samples. To alleviate the labeling effort, this poster (the work presented herein is also being published as a full conference paper at ESWC 2015; this poster provides a more high-level overview and discusses supplemental experimental findings beyond the scope of the material in the full paper) presents a minimally supervised instance matching approach that is able to deliver competitive performance using only 2% training data. As a first step, a committee of base classifiers is trained in an ensemble setting using boosting. Iterative semi-supervised learning is used to improve the performance of the ensemble classifier even further, by self-training it on the most confident samples labeled in the current iteration. Empirical evaluations on real-world data show that, using a multilayer perceptron as base classifier, the system is able to achieve an average F-Measure that is within 2.5% of that of state-of-the-art supervised systems.
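
The self-training loop can be illustrated in a few lines of scikit-learn. The sketch below uses a single multilayer perceptron on synthetic data and omits the boosted ensemble, so it mirrors only the general idea of pseudo-labeling the most confident predictions, not the paper's full system.

    # Sketch: iterative self-training on the most confident pseudo-labels.
    # A single MLP on synthetic data stands in for the paper's boosted ensemble.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    labeled = np.zeros(len(y), dtype=bool)
    labeled[:40] = True                        # roughly 2% labeled data

    clf = MLPClassifier(max_iter=500, random_state=0)
    for _ in range(5):                         # a few self-training iterations
        clf.fit(X[labeled], y[labeled])
        proba = clf.predict_proba(X[~labeled])
        confident = proba.max(axis=1) > 0.95   # keep only very confident predictions
        idx = np.flatnonzero(~labeled)[confident]
        if len(idx) == 0:
            break
        y[idx] = clf.predict(X[idx])           # pseudo-label (true labels unknown in practice)
        labeled[idx] = True

    print("labeled after self-training:", int(labeled.sum()))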

Mayank Kejriwal, Daniel P. Miranker
Discovering Types in RDF Datasets

An increasing number of linked datasets are published on the Web, expressed in RDF(S)/OWL. Interlinking, matching or querying these datasets requires some knowledge about the types and properties they contain. This work presents an approach, relying on a clustering algorithm, which provides the types describing a dataset when this information is incomplete or missing.

Kenza Kellou-Menouer, Zoubida Kedad
Supporting Real-Time Monitoring in Criminal Investigations

Being able to analyze information collected from streams of data, generated by different types of sensors, is becoming increasingly important in many domains. This paper presents an approach for creating a decoupled, semantically enabled event processing system, which leverages existing Semantic Web technologies. By implementing the actor model, we show how we can create flexible and robust event processing systems, which can leverage different technologies in the same general workflow. We argue that in this context RSP systems can be viewed as generic systems for creating semantically enabled event processing agents. In the demonstration scenario we show how real-time monitoring can be used to support criminal intelligence analysis, and describe how the actor model can be leveraged further to support scalability.

Robin Keskisärkkä, Eva Blomqvist
FOODpedia: Russian Food Products as a Linked Data Dataset

Open and efficient sharing of information about food products and their ingredients is important for all parties of the chain, ranging from manufacturers to consumers. There exists a public catalogue of some Russian food products (http://goodsmatrix.ru/) that is used by some manufacturers and consumers. Although the information is open, there are many difficulties in using the site, e.g., interoperability, querying and linking, that could be mitigated by Semantic Web technologies. This paper presents an approach and a project for extracting and publishing information about food products and also linking it to existing datasets in the Linked Open Data Cloud.

Maxim Kolchin, Alexander Chistyakov, Maxim Lapaev, Rezeda Khaydarova
SentiML++: An Extension of the SentiML Sentiment Annotation Scheme

In this paper, we propose SentiML++, an extension of SentiML with a focus on annotating opinions answering aspects of the general question “who has what opinion about whom in which context?”. A detailed comparison with SentiML and other existing annotation schemes is also presented. The data collection annotated with SentiML has also been annotated with SentiML++ and is available for download for research purposes.

Malik M. Saad Missen, Mohammed Attik, Mickaël Coustaty, Antoine Doucet, Cyril Faucher
Analysis of Companies’ Non-financial Disclosures: Ontology Learning by Topic Modeling

Prior studies highlight the merits of integrating Linked Data to aid investors’ analyses of company financial disclosures. Non-financial disclosures, including reporting on a company’s environmental footprint (corporate sustainability), remain an unexplored area of research. One reason cited by investors is the need for earth science knowledge to interpret such disclosures. To address this challenge, we propose an automated system which employs Latent Dirichlet Allocation (LDA) for the discovery of earth science topics in corporate sustainability text. The LDA model is seeded with a vocabulary generated by terms retrieved via a SPARQL endpoint; the terms are seeded as lexical priors into the LDA model. An ensemble tree combines the resulting topic probabilities and classifies the quality of sustainability disclosures using domain expert ratings published by Google Finance. From an application stance, our results may be of interest to investors seeking to integrate corporate sustainability considerations into their investment decisions.

Andy Moniz, Franciska de Jong
Curating a Document Collection via Crowdsourcing with Pundit 2.0

Pundit 2.0 is a semantic web annotation system that supports users in creating structured data on top of web pages. Annotations in Pundit are RDF triples that users build starting from web page elements, such as text or images. Annotations can be made public, and developers can access and combine them into RDF knowledge graphs, while authorship of each triple is always retrievable. In this demo we showcase Pundit 2.0 and demonstrate how it can be used to enhance a digital library by providing a data crowdsourcing platform. Pundit enables users to annotate different kinds of entities and to contribute to the collaborative creation of a knowledge graph. This, in turn, refines in real time the exploration functionalities of the library’s faceted search, providing an immediate added value out of the annotation effort. Ad-hoc configurations can be used to drive specific visualisations, like the timeline-map shown in this demo.

Christian Morbidoni, Alessio Piccioli
SemNaaS: Add Semantic Dimension to the Network as a Service

Cloud Computing has several provision models, e.g. Infrastructure as a Service (IaaS). However, cloud users (tenants) have limited or no control over the underlying network resources. Network as a Service (NaaS) is emerging as a novel model to fill this gap. However, NaaS requires an approach capable of modeling the underlying network resource capabilities in an abstracted and vendor-independent form. In this paper we elaborate on SemNaaS, a Semantic Web based approach for developing and supporting operations of NaaS systems. SemNaaS can work with any NaaS provider. We integrated it with the existing OpenNaaS framework. We propose the Network Markup Language (NML) as the ontology for describing networking infrastructures. Based on that ontology, we develop a network modeling system and integrate it with OpenNaaS. Furthermore, we demonstrate the capabilities that the Semantic Web can add to the NaaS paradigm by applying SemNaaS operations to a specific NaaS use case.

Mohamed Morsey, Hao Zhu, Isart Canyameres, Paola Grosso
LIDSEARCH: A SPARQL-Driven Framework for Searching Linked Data and Semantic Web Services

The Linked Open Data (LOD) cloud is a massive source of data in different domains. However, these data might be incomplete or outdated. Furthermore, there are still a lot of data that are not published as static linked data, such as sensor data, on-demand data, and data with limited access patterns, which are in general available through web services. In order to use web services as complementary sources of data, we introduce LIDSEARCH (Linked Data and Services Search), a SPARQL-driven framework for searching linked data and relevant semantic web services with a single user query.

Mohamed Lamine Mouhoub, Daniela Grigori, Maude Manouvrier
The Russian Museum Culture Cloud

We present an architecture and approach to publishing open linked data in the cultural heritage domain. We demonstrate our approach for building a system both for data publishing and consumption and show how user benefits can be achieved with semantic technologies. For domain knowledge representation the CIDOC-CRM ontology is used. As a main source of trusted data, we use the data of the web portal of the Russian Museum. For data enrichment we selected DBpedia and the published Linked Data of the British Museum. Our work can be reached at www.culturecloud.ru.

Dmitry Mouromtsev, Peter Haase, Eugene Cherny, Dmitry Pavlov, Alexey Andreev, Anna Spiridonova
DataOps: Seamless End-to-End Anything-to-RDF Data Integration

While individual components for semantic data integration are commonly available, end-to-end solutions are rare.

We demonstrate DataOps, a seamless Anything-to-RDF semantic data integration toolkit. DataOps supports the integration of both semantic and non-semantic data from an extensible host of different formats. Setting up data sources end-to-end works in three steps: (1) accessing the data from arbitrary locations in different formats, (2) specifying mappings depending on the data format (e.g., R2RML for relational data), and (3) consolidating new data with existing data instances (e.g., by establishing owl:sameAs links). All steps are supported through a fully integrated Web interface with configuration forms and different mapping editors. Visitors of the demo will be able to perform all three steps of the integration process.
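
Step (3), consolidating new instances with existing ones, ultimately produces owl:sameAs links. The rdflib sketch below shows a naive label-based variant of that step; the graphs, IRIs and matching rule are invented and do not reflect how DataOps itself performs the consolidation.

    # Sketch: a naive consolidation step that links incoming instances to existing
    # ones by exact label match and records the result as owl:sameAs (step 3).
    # Graphs, IRIs and the matching rule are invented for illustration.
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import OWL, RDFS

    NEW = Namespace("http://example.org/import/")
    KB = Namespace("http://example.org/kb/")

    existing = Graph()
    existing.add((KB.AcmeCorp, RDFS.label, Literal("ACME Corporation")))

    incoming = Graph()
    incoming.add((NEW.customer42, RDFS.label, Literal("ACME Corporation")))

    links = Graph()
    by_label = {str(o): s for s, _, o in existing.triples((None, RDFS.label, None))}
    for s, _, o in incoming.triples((None, RDFS.label, None)):
        if str(o) in by_label:
            links.add((s, OWL.sameAs, by_label[str(o)]))

    print(links.serialize(format="turtle"))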

Christoph Pinkel, Andreas Schwarte, Johannes Trame, Andriy Nikolov, Ana Sasa Bastinos, Tobias Zeuch
ABSTAT: Linked Data Summaries with ABstraction and STATistics

While much work has focused on continuously publishing Linked Open Data, little work considers how to help consumers better understand existing datasets. The ABSTAT framework aims at providing a better understanding of big and complex datasets by extracting summaries of linked data sets based on an ontology-driven data abstraction model. Our ABSTAT framework takes as input a data set and an ontology and returns an ontology-driven data summary as output. The summary is exported into RDF and then made accessible through a SPARQL endpoint and a web interface to support navigation.

Matteo Palmonari, Anisa Rula, Riccardo Porrini, Andrea Maurino, Blerina Spahiu, Vincenzo Ferme
DaCENA: Serendipitous News Reading with Data Contexts

DaCENA (Data Context for News Articles) is a web application that showcases a new approach to reading online news articles with the support of a data context built from interlinked facts available on the Web of Data. Given a source article, a set of facts that are estimated to be most interesting for the readers are extracted from the Web and presented using tailored information visualization methods and an interactive user interface. By looking at this background factual knowledge, the reader is supported in the interpretation of the news content and is offered connections to related topics that he/she can further explore.

Matteo Palmonari, Giorgio Uboldi, Marco Cremaschi, Daniele Ciminieri, Federico Bianchi
Visual Analysis of Statistical Data on Maps Using Linked Open Data

When analyzing statistical data, one of the most basic and at the same time widely used techniques is analyzing correlations. As shown in previous works, Linked Open Data is a rich resource for discovering such correlations. In this demo, we show how statistical analysis and visualization on maps can be combined to facilitate a deeper understanding of the statistical findings.

Petar Ristoski, Heiko Paulheim
Keyword Search on RDF Graphs: It Is More Than Just Searching for Keywords

In this paper, we propose a model for enabling users to search RDF data via keywords, thus, allowing them to discover relevant information without using complicated queries or knowing the underlying ontology or vocabulary. We aim at exploiting the characteristics of the RDF data to increase the quality of the ranked query results. We consider different dimensions for evaluating the value of results and achieving relevance, personalization and diversity.

Kostas Stefanidis, Irini Fundulaki
Collaborative Development of Multilingual Thesauri with VocBench (System Description and Demonstrator)

VocBench is an open source web application for editing of SKOS and SKOS-XL thesauri, with a strong focus on collaboration, supported by workflow management for content validation and publication. Dedicated user roles provide a clean separation of competences, addressing different specificities ranging from management aspects to vertical competences on content editing, such as conceptualization versus terminology editing. Extensive support for scheme management allows editors to fully exploit the possibilities of the SKOS model, as well as to fulfill its integrity constraints. We describe here the main features of VocBench, which will be shown along the demo held at the ESWC15 conference.

Armando Stellato, Sachit Rajbhandari, Andrea Turbati, Manuel Fiorelli, Caterina Caracciolo, Tiziano Lorenzetti, Johannes Keizer, Maria Teresa Pazienza
Distributed Linked Data Business Communication Networks: The LUCID Endpoint

With the LUCID Endpoint, we demonstrate how companies can utilize Linked Data technology to provide major data items for their business partners in a timely manner, machine readable and with open and extensible schemata. The main idea is to provide a Linked Data infrastructure which enables all partners to fetch, as well as to clone and to synchronize, datasets from other partners over the network. This concept allows for building networks of business partners, much like a social network, but in a distributed manner. It furthermore provides a technical infrastructure for business communication acts such as supply chain communication or master data management.

Sebastian Tramp, Ruben Navarro Piris, Timofey Ermilov, Niklas Petersen, Marvin Frommhold, Sören Auer
Evaluating Entity Annotators Using GERBIL

The need to bridge between the unstructured data on the Document Web and the structured data on the Web of Data has led to the development of a considerable number of annotation tools. However, these tools are hard to compare due to the diversity of data sets and measures used for evaluation. We will demonstrate GERBIL, an evaluation framework for semantic entity annotation that provides developers, end users and researchers with easy-to-use interfaces for the agile, fine-grained and uniform evaluation of annotation tools on 11 different data sets within 6 different experimental settings on 6 different measures.

Ricardo Usbeck, Michael Röder, Axel-Cyrille Ngonga Ngomo
Interactive Comparison of Triple Pattern Fragments Query Approaches

In order to reduce the server-side cost of publishing queryable Linked Data, Triple Pattern Fragments (TPF) were introduced as a simple interface to RDF triples. They allow for SPARQL query execution at low server cost, by partially shifting the load from servers to clients. The previously proposed query execution algorithm provides a solution that is highly inefficient, often requiring an amount of HTTP calls that is magnitudes larger than the optimal solution. We have proposed a new query execution algorithm with the aim to solve this problem. Our solution significantly improves on the current work by maintaining a complete overview of the query instead of just looking at local optima. In this paper, we describe a demo that allows a user to easily compare the results of both implementations. We show both query results and the number of executed HTTP calls, providing a clear picture of the difference between the two algorithms.
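
A single triple-pattern request against a TPF server is just an HTTP call, as the sketch below shows; the fragment URL and the subject/predicate/object parameter names follow the common TPF convention and should be treated as assumptions of this illustration.

    # Sketch: fetch a single Triple Pattern Fragment over plain HTTP.
    # The fragment URL and the subject/predicate/object parameters follow the usual
    # TPF convention; treat both as assumptions of this illustration.
    import requests

    fragment = "http://fragments.dbpedia.org/2015/en"    # assumed TPF entry point
    params = {
        "subject": "http://dbpedia.org/resource/Ljubljana",
        "predicate": "",                                  # left unbound
        "object": "",
    }
    resp = requests.get(fragment, params=params, headers={"Accept": "text/turtle"})
    print(resp.status_code)
    print(resp.text[:500])   # matching triples plus hypermedia controls and counts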

Joachim Van Herwegen, Ruben Verborgh, Erik Mannens, Rik Van de Walle
Rubya: A Tool for Generating Rules for Incremental Maintenance of RDF Views

We present Rubya, a tool that automatically generates the RDF view defined on top of relational data and all rules required for the incremental maintenance of the RDF view. Our approach relies on the designer to specify a mapping between the relational schema and a target ontology, and results in a specification of how to represent relational schema concepts in terms of RDF classes and properties of the designer's choice. Based on this mapping, the rules for incremental maintenance of the RDF view are generated.

Vânia M. P. Vidal, Marco A. Casanova, Valéria M. Pequeno, Narciso Arruda, Diego Sá, José M. Monteiro
Time-Aware Entity Search in DBpedia

Searching for entities is a common user activity on the Web. There is an increasing effort in developing entity search techniques in the research community. Existing approaches are usually based on static measures that do not reflect time-awareness, a factor that should be taken into account in entity search. In this paper, we propose a novel approach to time-aware entity search in DBpedia, which takes into account both the popularity and the temporality of entities. The experimental results show that our approach can significantly improve the performance of entity search with temporal focus compared with the baselines.

Lei Zhang, Wentao Chen, Thanh Tran, Achim Rettinger

ESWC2015 Developers Workshop

Frontmatter
Templating the Semantic Web via RSLT

In this paper we introduce RSLT, a simple transformation language for RDF data. RSLT organises the rendering of RDF statements as transformation templates associated to properties or resource types and producing HTML. A prototype based on AngularJS is presented, and we also discuss some implementation details and examples.

Silvio Peroni, Fabio Vitali
Developing a Sustainable Platform for Entity Annotation Benchmarks

The existing entity annotation systems that drive the extraction of RDF from unstructured data are hard to compare as their evaluation relies on different data sets and measures. We developed GERBIL, an evaluation framework for semantic entity annotation that provides developers, end users and researchers with easy-to-use interfaces for the agile, fine-grained and uniform evaluation of 9 annotation tools on 11 different data sets within 6 different experimental settings on 6 different measures. In this paper, we present the developed interfaces, data flows and data structures. Moreover, we show how GERBIL supports a better reproducibility and archiving of experimental results.

Michael Röder, Ricardo Usbeck, Axel-Cyrille Ngonga Ngomo

Managing the Evolution and Preservation of the Data Web - First Diachron Workshop

Frontmatter
A Diagnosis and Repair Framework for $DL-Lite_\mathcal{A}$ KBs

Several logical formalisms have been proposed in the literature for expressing structural and semantic integrity constraints of Linked Open Data (LOD). Still, the integrity of the datasets published in the LOD cloud needs to be improved, as published data often violate such constraints, jeopardising the value of applications consuming linked data in an automatic way. In this work, we propose a novel, fully automatic framework for detecting and repairing violations of integrity constraints, by considering both explicit and implicit ontological knowledge. Our framework relies on the ontology language $DL-Lite_\mathcal{A}$ for expressing several useful types of constraints, while maintaining good computational properties. The experimental evaluation shows that our framework is scalable for large datasets and numbers of invalidities exhibited in reality by reference linked datasets (e.g., DBpedia).

Michalis Chortis, Giorgos Flouris

4th Workshop on Knowledge Discovery and Data Mining Meets Linked Open Data (Know@LOD)

Frontmatter
Sorted Neighborhood for Schema-Free RDF Data

Entity Resolution (ER) concerns identifying pairs of entities that refer to the same underlying entity. To avoid $O(n^2)$ pairwise comparison of $n$ entities, blocking methods are used. Sorted Neighborhood is an established blocking method for Relational Databases. It has not been applied to schema-free Resource Description Framework (RDF) data sources widely prevalent in the Linked Data ecosystem. This paper presents a Sorted Neighborhood workflow that may be applied to schema-free RDF data. The workflow is modular and makes minimal assumptions about its inputs. Empirical evaluations of the proposed algorithm on five real-world benchmarks demonstrate its utility compared to two state-of-the-art blocking baselines.
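
The core of Sorted Neighborhood is easy to state: sort entities by a blocking key and compare only those that fall within a sliding window, as in the sketch below. The schema-free key used here (the first alphabetical tokens of any property value) is an illustrative choice, not the paper's actual key construction.

    # Sketch: Sorted Neighborhood blocking. Entities are sorted by a blocking key and
    # only pairs inside a sliding window of size w become candidate matches.
    # The key below (first alphabetical tokens of any property value) is an
    # illustrative choice for schema-free data, not the paper's actual key.
    from itertools import combinations

    entities = {
        "ex:e1": {"label": "Barack Obama", "birthPlace": "Honolulu"},
        "ex:e2": {"name": "Obama, Barack"},
        "ex:e3": {"label": "Angela Merkel", "birthPlace": "Hamburg"},
        "ex:e4": {"name": "Merkel, A."},
    }

    def blocking_key(props):
        tokens = sorted(t.lower() for v in props.values()
                        for t in v.replace(",", " ").split())
        return " ".join(tokens[:3])

    w = 2                                            # window size
    ordered = sorted(entities, key=lambda e: blocking_key(entities[e]))
    candidates = set()
    for i in range(len(ordered) - w + 1):
        candidates.update(combinations(ordered[i:i + w], 2))   # pairs in the window

    print(sorted(candidates))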

Mayank Kejriwal, Daniel P. Miranker
The Triplex Approach for Recognizing Semantic Relations from Noun Phrases, Appositions, and Adjectives

Discovering knowledge from textual sources and subsequently expanding the coverage of knowledge bases like DBpedia or Freebase currently requires either extensive manual work or carefully designed information extractors. Information extractors capture triples from textual sentences. Each triple consists of a subject, a predicate/property, and an object. Triples can be mediated via verbs, nouns, adjectives, and appositions. We propose Triplex, an information extractor that complements previous efforts, concentrating on noun-mediated triples related to nouns, adjectives, and appositions. Triplex automatically constructs templates expressing noun-mediated triples from a bootstrapping set. The bootstrapping set is constructed without manual intervention by creating templates that include syntactic, semantic, and lexical constraints. We report on an automatic evaluation method to examine the output of information extractors both with and without the Triplex approach. Our experimental study indicates that Triplex is a promising approach for extracting noun-mediated triples.

Seyed Iman Mirrezaei, Bruno Martins, Isabel F. Cruz

LDQ: 2nd Workshop on Linked Data Quality

Frontmatter
What’s up LOD Cloud?
Observing the State of Linked Open Data Cloud Metadata

Linked Open Data (LOD) has emerged as one of the largest collections of interlinked datasets on the web. In order to benefit from this mine of data, one needs access to descriptive information about each dataset (or metadata). However, the heterogeneous nature of data sources reflects directly on the data quality, as these sources often contain inconsistent as well as misinterpreted and incomplete metadata information. Considering the significant variation in size, the languages used and the freshness of the data, one realizes that finding useful datasets without prior knowledge is increasingly complicated. We have developed Roomba, a tool that enables users to validate, correct and generate dataset metadata. In this paper, we present the results of running this tool on parts of the LOD cloud accessible via the datahub.io API. The results demonstrate that the general state of the datasets needs more attention, as most of them suffer from bad-quality metadata and lack some informative metrics that are needed to facilitate dataset search. We also show that the automatic corrections done by Roomba increase the overall quality of the dataset metadata, and we highlight the need for manual efforts to correct some important missing information.

Ahmad Assaf, Raphaël Troncy, Aline Senart

2015 Workshop on Legal Domain and Semantic Web Applications

Frontmatter
A Bottom-Up Approach for Licences Classification and Selection

Licences are a crucial aspect of the information publishing process in the web of (linked) data. Recent work on modeling of policies with semantic web languages (RDF, ODRL) gives the opportunity to formally describe licences and reason upon them. However, choosing the right licence is still challenging. In particular, understanding the many features - permissions, prohibitions and obligations - constitutes a steep learning process for the data provider, who has to check them individually and compare the licences in order to pick the one that best fits her needs. The objective of the work presented in this paper is to reduce the effort required for licence selection. We argue that an ontology of licences, organized by their relevant features, can help provide support to the user. Developing an ontology with a bottom-up approach based on Formal Concept Analysis, we show how the process of licence selection can be simplified significantly and reduced to answering an average of three to five key questions.

Enrico Daga, Mathieu d’Aquin, Enrico Motta, Aldo Gangemi

4th Workshop on the Multilingual Semantic Web

Frontmatter
One Ontology to Bind Them All: The META-SHARE OWL Ontology for the Interoperability of Linguistic Datasets on the Web

META-SHARE is an infrastructure for sharing Language Resources (LRs) where significant effort has been made into providing carefully curated metadata about LRs. However, in the face of the flood of data that is used in computational linguistics, a manual approach cannot suffice. We present the development of the META-SHARE ontology, which transforms the metadata schema used by META-SHARE into an ontology in the Web Ontology Language (OWL) that can better handle the diversity of metadata found in legacy and crowd-sourced resources. We show how this model can interface with other, more general-purpose vocabularies for online datasets and licensing, and apply this model to the CLARIN VLO, a large source of legacy metadata about LRs. Furthermore, we demonstrate the usefulness of this approach in two public metadata portals for information about language resources.

John P. McCrae, Penny Labropoulou, Jorge Gracia, Marta Villegas, Víctor Rodríguez-Doncel, Philipp Cimiano
Applying the OntoLex Model to a Multilingual Terminological Resource

Terminesp is a multilingual terminological resource with terms from a range of specialized domains. Along with definitions, notes, scientific denominations and provenance information, it includes translations from Spanish into a variety of languages. A linked data resource with these features would represent a potentially relevant source of knowledge for NLP-based applications. In this contribution we show that Terminesp constitutes an appropriate validating test bench for OntoLex and its vartrans module, a newly developed model which evolves the lemon model to represent the lexicon-ontology interface. We present a first showcase of this module to account for variation across entries, while highlighting the modeling problems we encountered in this effort. Furthermore, we extend the resource with part-of-speech and syntactic information which was not explicitly declared in the original data, with the aim of exploring its future use in NLP applications.

Julia Bosque-Gil, Jorge Gracia, Guadalupe Aguado-de-Cea, Elena Montiel-Ponsoda

NoISE: Workshop on Negative or Inconclusive rEsults in Semantic Web

Frontmatter
What SPARQL Query Logs Tell and Do Not Tell About Semantic Relatedness in LOD
Or: The Unsuccessful Attempt to Improve the Browsing Experience of DBpedia by Exploiting Query Logs

Linked Open Data browsers nowadays usually list facts about entities, but they typically do not respect the relatedness of those facts. At the same time, query logs from LOD datasets hold information about which facts are typically queried in conjunction, and should thus provide a notion of intra-fact relatedness. In this paper, we examine the hypothesis that query logs can be used to improve the display of information from DBpedia, by grouping presumably related facts together. The basic assumption is that properties which frequently co-occur in SPARQL queries are highly semantically related, so that co-occurrence in query logs can be used for visual grouping of statements in a Linked Data browser. A user study, however, shows that the grouped display is not significantly better than simple baselines, such as the alphabetical ordering used by the standard DBpedia Linked Data interface. A deeper analysis shows that the basic assumption can be proven wrong, i.e., co-occurrence in query logs is actually not a good proxy for semantic relatedness of statements.
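
The underlying statistic is simply how often two properties appear in the same query. The toy sketch below shows that counting step; the property sets are hand-written stand-ins, whereas the study parses full SPARQL queries from DBpedia query logs.

    # Sketch: count how often two properties co-occur in the same query.
    # The property sets are hand-written stand-ins for parsed SPARQL queries.
    from collections import Counter
    from itertools import combinations

    queries = [
        {"dbo:birthPlace", "dbo:birthDate"},
        {"dbo:birthPlace", "dbo:deathPlace", "dbo:birthDate"},
        {"dbo:populationTotal", "dbo:areaTotal"},
    ]

    cooc = Counter()
    for props in queries:
        cooc.update(frozenset(pair) for pair in combinations(sorted(props), 2))

    for pair, count in cooc.most_common(3):
        print(sorted(pair), count)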

Jochen Huelss, Heiko Paulheim

PhiloWeb 2015

Frontmatter
The “Peer-to-Peer” Economy and Social Ontology: Legal Issues and Theoretical Perspectives

Several business models based on the use of web platforms have recently become more widespread. These are generally called “peer-to-peer” models, and are much disputed because of their impact on the traditional economy. In this paper, an analysis of the legal concerns - which are briefly presented by assessing a recent Italian court case - introduces the main problem, which is the manipulation of economic and social processes through the control of the information generated by these models. The definition of these issues within a philosophical framework - given by the contrast between a “realistic” perspective and a “naturalistic” vision of “social ontology” - allows directions for future research to be suggested.

Federico Costantini

PROFILES'15: 2nd International Workshop on Dataset PROFIling and fEderated Search for Linked Data

Frontmatter
Roomba: An Extensible Framework to Validate and Build Dataset Profiles

Linked Open Data (LOD) has emerged as one of the largest collections of interlinked datasets on the web. In order to benefit from this mine of data, one needs access to descriptive information about each dataset (or metadata). This information can be used to delay data entropy, enhance dataset discovery, exploration and reuse, as well as to help data portal administrators in detecting and eliminating spam. However, such metadata information is currently very limited to a few data portals, where it is usually provided manually, thus being often incomplete and inconsistent in terms of quality. To address these issues, we propose a scalable automatic approach for extracting, validating, correcting and generating descriptive linked dataset profiles. This approach applies several techniques in order to check the validity of the metadata provided and to generate descriptive and statistical information for a particular dataset or for an entire data portal.

Ahmad Assaf, Raphaël Troncy, Aline Senart

RDF Stream Processing Workshop

Frontmatter
The Role of RDF Stream Processing in a Smart City ICT Infrastructure - The Aspern Smart City Use Case

In this paper we discuss the opportunities of adopting RDF stream processing in the context of smart cities. As a concrete example we take the Aspern Smart City Research project - one of the largest smart city projects in Europe - which aims at overcoming silos in smart grid and smart building domains. We present the envisioned smart ICT infrastructure and identify how RDF Stream processing can be explored in the different interactions among data sources, storage centers and applications/services.

Josiane Xavier Parreira, Deepak Dhungana, Gerhard Engelbrecht
Towards a Unified Language for RDF Stream Query Processing

In recent years, several RDF Stream Processing (RSP) systems have emerged, which allow querying RDF streams using extensions of SPARQL that include operators to take into account the velocity of this data. These systems are heterogeneous in terms of syntax, capabilities and evaluation semantics. Recently, the W3C RSP Group started to work on a common model for representing and querying RDF streams. The emergence of such a model and its accompanying query language is expected to take the most representative, significant and important features of previous efforts, but will also require a careful design and definition of its semantics. In this work, we present a proposal for the query semantics of the W3C RSP query language, and we discuss how it can capture the semantics of existing engines (CQELS, C-SPARQL, SPARQLStream), explaining and motivating their differences. Then, we use RSP-QL to analyze the current version of the W3C RSP Query Language proposal.

Daniele Dell’Aglio, Jean-Paul Calbimonte, Emanuele Della Valle, Oscar Corcho

SALAD: Services and Applications over Linked APIs and Data

Frontmatter
Web API Management Meets the Internet of Things

In this paper we outline the challenges of Web API management in Internet of Things (IoT) projects. Web API management is a key aspect of service-oriented systems that includes the following elements: metadata publishing, access control and key management, monitoring and monetization of interactions, as well as usage control and throttling. We look at how Web API management principles, including some of the above elements, translate into a world of connected devices (IoT). In particular, we present and evaluate a prototype that addresses the issue of managing authentication with millions of insecure low-power devices communicating with non-HTTP protocols. With this first step, we are only beginning to investigate IoT API management, therefore we also discuss necessary future work.

Paul Fremantle, Jacek Kopecký, Benjamin Aziz
A RESTful Approach for Developing Medical Decision Support Systems

Current developments in the medical sector are witnessing the growing digitalization of data in terms of patient tests, records and trials, the use of sensors for monitoring and recording procedures, and the employment of digital imagery. It has been shown that, despite the increasing number of published guidelines and studies, clinicians are often unable to observe these guidelines correctly during the actual care process [1]; this provides the foundation for this paper. We tackle these problems by developing a medical assistance system which processes the gathered and integrated data from different sources, and assists physicians in making decisions, preparing treatment plans, and even guiding surgeons during invasive procedures. In this paper we demonstrate how a RESTful architecture, combined with applying Linked Data principles for data storage and exchange, can effectively be used for developing medical decision support systems. We propose different autonomous subsystems that automatically process data relevant to their purpose. These so-called “Cognitive Apps” provide RESTful interfaces and perform tasks such as converting and uploading data and deducing medical knowledge by using inference rules. The result is an adaptive decision support system, based on distributed, decoupled Cognitive Apps, which can preprocess data in advance but also support real-time scenarios. We demonstrate the practical applicability of our approach by providing an implementation of a system for processing patients with liver tumors. Finally, we evaluate the system in terms of knowledge deduction and performance.

Tobias Weller, Maria Maleshkova, Keno März, Lena Maier-Hein

3rd International Workshop on Human Semantic Web Interaction (HSWI)

Frontmatter
QueryVOWL: A Visual Query Notation for Linked Data

In order to enable users without any knowledge of RDF and SPARQL to query Linked Data, visual approaches can be helpful by providing graphical support for query building. We present QueryVOWL, a visual query language that is based upon the ontology visualization VOWL and defines mappings to SPARQL. We aim for a language that is intuitive and easy to use, while remaining flexible and preserving most of the expressiveness of SPARQL. In contrast to related work, the queries can be created entirely with visual elements, taking into account RDFS and OWL concepts often used to structure Linked Data. This paper is a revised version of a workshop paper where we first introduced QueryVOWL. We present the query notation, some example queries, and two prototypical implementations of QueryVOWL. Also, we report on a qualitative user study that indicates lay users are able to construct and interpret QueryVOWL graphs.

Florian Haag, Steffen Lohmann, Stephan Siek, Thomas Ertl

Semantic Web for Scientific Heritage

Frontmatter
Studying the History of Pre-modern Zoology by Extracting Linked Zoological Data from Mediaeval Texts and Reasoning on It

In this paper we first present the international multidisciplinary research network Zoomathia, which aims at studying the transmission of zoological knowledge from Antiquity to the Middle Ages through varied resources, and considers especially textual information, including compilation literature such as encyclopaedias. We then present a preliminary work in the context of Zoomathia consisting in (i) extracting pertinent knowledge from mediaeval texts using Natural Language Processing (NLP) methods, (ii) semantically enriching semi-structured zoological data and publishing it as an RDF dataset and its vocabulary, linked to other relevant Linked Data sources, and (iii) reasoning on this linked RDF data to help epistemologists, historians and philologists in their analysis of these ancient texts. This paper is an extended and updated version of [13].

Molka Tounsi, Catherine Faron Zucker, Arnaud Zucker, Serena Villata, Elena Cabrio
SemanticHPST: Applying Semantic Web Principles and Technologies to the History and Philosophy of Science and Technology

SemanticHPST is a project in which ICT (especially the Semantic Web) interacts with the history and philosophy of science and technology (HPST). The main difficulties in HPST are the large diversity of sources and points of view and a large volume of data, so HPST scholars need new tools devoted to digital humanities based on the Semantic Web. To ensure a certain level of genericity, this project is initially based on three sub-projects: the first one is devoted to the port-arsenal of Brest, the second one to the correspondence of Henri Poincaré and the third one to the concept of energy. The aim of this paper is to present the project, its issues and goals, and the first results and objectives in the fields of harvesting distributed corpora and advanced search in HPST corpora. Finally, we point out some epistemological issues raised by this project.

Olivier Bruneau, Serge Garlatti, Muriel Guedj, Sylvain Laubé, Jean Lieber

5th International USEWOD Workshop: Using the Web in the Age of Data

Frontmatter
DBpedia's Triple Pattern Fragments: Usage Patterns and Insights

Queryable Linked Data is published through several interfaces, including SPARQL endpoints and Linked Data documents. In October 2014, the DBpedia Association announced an official Triple Pattern Fragments interface to its popular DBpedia dataset. This interface proposes to improve the availability of live queryable data by dividing query execution between clients and servers. In this paper, we present a usage analysis between November 2014 and July 2015. In 9 months time, the interface had an average availability of 99.99%, handling 16,776,170 requests, 43.0% of which were served from cache. These numbers provide promising evidence that low-cost Triple Pattern Fragments interfaces provide a viable strategy for live applications on top of public, queryable datasets.

Ruben Verborgh

WaSABi: 3rd Workshop on Semantic Web Enterprise Adoption and Best Practice

Frontmatter
Applying Semantic Technology to Film Production

Film production is an information- and knowledge-intensive industrial process which is undergoing dramatic changes in response to evolving digital technology. The Deep Film Access Project (DFAP) has been researching the potential role of semantic technology in film production, focussing on how a semantic infrastructure could contribute to the integration of the data and metadata generated during the film production lifecycle. This paper reports on the preliminary development of a knowledge framework to support the automatic management of feature film digital assets, based on a workflow analysis supported by an OWL ontology. We discuss the challenges of building on previous work and present examples of ontological modelling of key film production concepts in a semantically rich hybrid ontological framework.

Jos Lehmann, Sarah Atkinson, Roger Evans

4th International Workshop on Detection, Representation, and Exploitation of Events in the Semantic Web (DeRiVE 2015)

Frontmatter
Reactive Processing of RDF Streams of Events

Events on the Web are increasingly being produced in the form of data streams, and are present in many different scenarios and applications such as health monitoring, environmental sensing or social networks. The heterogeneity of event streams has raised the challenges of integrating, interpreting and processing them coherently. Semantic technologies have shown to provide both a formal and practical framework to address some of these challenges, producing standards for representation and querying, such as RDF and SPARQL. However, these standards are not suitable for dealing with streams of events, as they do not include the concepts of streaming and continuous processing. The idea of RDF stream processing (RSP) has emerged in recent years to fill this gap, and the research community has produced prototype engines that cover aspects including complex event processing and stream reasoning to varying degrees. However, these existing prototypes often overlook key principles of reactive systems regarding event-driven processing, responsiveness, resiliency and scalability. In this paper we present a reactive model for implementing RSP systems, based on the Actor model, which relies on asynchronous message passing of events. Furthermore, we study the responsiveness property of RSP systems, in particular for the delivery of streaming results.
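
The actor style of event processing described here can be sketched with asyncio: each actor owns a mailbox and processes one message at a time, and stream operators forward events to downstream actors. The example below is a generic illustration under those assumptions, not the paper's implementation.

    # Sketch: actor-style event processing with asyncio. Each actor owns a mailbox
    # and handles one message at a time; a filter actor forwards matching events
    # downstream. Event structure and names are invented for illustration.
    import asyncio

    class Actor:
        def __init__(self):
            self.inbox = asyncio.Queue()

        async def send(self, msg):
            await self.inbox.put(msg)

        async def run(self):
            while True:
                msg = await self.inbox.get()
                if msg is None:              # poison pill stops the actor
                    return
                await self.handle(msg)

    class FilterActor(Actor):
        def __init__(self, wanted_type, downstream):
            super().__init__()
            self.wanted_type, self.downstream = wanted_type, downstream

        async def handle(self, event):
            if event["type"] == self.wanted_type:
                await self.downstream.send(event)

    class PrintActor(Actor):
        async def handle(self, event):
            print("delivered:", event)

    async def main():
        sink = PrintActor()
        flt = FilterActor("TemperatureReading", sink)
        sink_task = asyncio.create_task(sink.run())
        flt_task = asyncio.create_task(flt.run())
        for value in (18.5, 25.1, 30.2):
            await flt.send({"type": "TemperatureReading", "value": value})
        await flt.send(None)   # stop the filter once its mailbox is drained
        await flt_task         # forwarded events are now in the sink's mailbox
        await sink.send(None)
        await sink_task

    asyncio.run(main())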

Jean-Paul Calbimonte, Karl Aberer
Backmatter
Metadata
Title
The Semantic Web: ESWC 2015 Satellite Events
Edited by
Fabien Gandon
Christophe Guéret
Serena Villata
John Breslin
Catherine Faron-Zucker
Antoine Zimmermann
Copyright year
2015
Electronic ISBN
978-3-319-25639-9
Print ISBN
978-3-319-25638-2
DOI
https://doi.org/10.1007/978-3-319-25639-9
