
2017 | Book

The Semantic Web – ISWC 2017

16th International Semantic Web Conference, Vienna, Austria, October 21-25, 2017, Proceedings, Part II

Edited by: Claudia d'Amato, Miriam Fernandez, Valentina Tamma, Freddy Lecue, Philippe Cudré-Mauroux, Juan Sequeda, Christoph Lange, Jeff Heflin

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science


About this Book

The two-volume set LNCS 10587 and 10588 constitutes the refereed proceedings of the 16th International Semantic Web Conference, ISWC 2017, held in Vienna, Austria, in October 2017. ISWC is the premier international forum for the Semantic Web and Linked Data community. The 55 full and 21 short papers presented in this volume were carefully reviewed and selected from 300 submissions. They are organized according to the tracks that were held: Research Track, Resource Track, and In-Use Track.

Table of Contents

Frontmatter

Resource Track

Frontmatter
Diefficiency Metrics: Measuring the Continuous Efficiency of Query Processing Approaches

During empirical evaluations of query processing techniques, metrics like execution time, time to first answer, and throughput are usually reported. Albeit informative, these metrics are unable to quantify and evaluate the efficiency of a query engine over a certain time period – or diefficiency – thus hampering the distinction of cutting-edge engines able to exhibit high performance gradually. We tackle this issue and devise two experimental metrics named dief@t and dief@k, which measure the diefficiency during an elapsed time period t or while k answers are produced, respectively. The dief@t and dief@k measurement methods rely on computing the area under the curve of answer traces, thus capturing the concentration of answers over a time interval. We report experimental results of evaluating the behavior of a generic SPARQL query engine using both metrics. The observed results suggest that dief@t and dief@k are able to measure the performance of SPARQL query engines based on both the number of answers produced by an engine and the time required to generate them.
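
As a rough, hedged illustration of the metrics (not code from the paper): dief@t and dief@k can be approximated as the area under a piecewise-linear answer trace, here computed with NumPy; the function names and trace format are assumptions made for this sketch.

```python
# Hedged sketch: dief@t / dief@k as the area under an answer trace, i.e. the
# curve of (elapsed time, cumulative answers). Not the authors' reference code.
import numpy as np

def dief_at_t(answer_timestamps, t):
    """Area under the answer-trace curve up to elapsed time t."""
    times = np.array([ts for ts in sorted(answer_timestamps) if ts <= t])
    counts = np.arange(1, len(times) + 1)            # cumulative answer count
    times = np.append(times, t)                      # close the trace at t
    counts = np.append(counts, counts[-1] if len(counts) else 0)
    return np.trapz(counts, times)

def dief_at_k(answer_timestamps, k):
    """Area under the answer-trace curve until the k-th answer is produced."""
    times = np.array(sorted(answer_timestamps)[:k])
    counts = np.arange(1, len(times) + 1)
    return np.trapz(counts, times)

# An engine that streams answers steadily accumulates more area (higher dief@t)
print(dief_at_t([0.1, 0.2, 0.3, 0.4], t=0.5))
```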

Maribel Acosta, Maria-Esther Vidal, York Sure-Vetter
CodeOntology: RDF-ization of Source Code

In this paper, we leverage advances in the Semantic Web area, including data modeling (RDF) and data management and querying (Jena and SPARQL), to develop CodeOntology, a community-shared software framework supporting expressive queries over source code. The project consists of two main contributions: an ontology that provides a formal representation of object-oriented programming languages, and a parser that analyzes Java source code and serializes it into RDF triples. The parser has been successfully applied to the source code of OpenJDK 8, yielding a structured dataset of more than 2 million RDF triples. CodeOntology makes it possible to generate Linked Data from any Java project, thereby enabling the execution of highly expressive queries over source code by means of a powerful language like SPARQL.
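
A minimal sketch of the kind of query such a framework enables, using rdflib over a local RDF dump; the file name and the class/property names (woc:Method, woc:hasName) are assumptions for illustration, not necessarily the published vocabulary.

```python
# Hedged example: querying a CodeOntology-style RDF dump of Java code.
from rdflib import Graph

g = Graph()
g.parse("codeontology-openjdk8.ttl", format="turtle")   # hypothetical dump

QUERY = """
PREFIX woc: <http://rdf.webofcode.org/woc/>
SELECT ?method ?name WHERE {
    ?method a woc:Method ;          # illustrative class name
            woc:hasName ?name .     # illustrative property name
} LIMIT 10
"""
for method, name in g.query(QUERY):
    print(method, name)
```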

Mattia Atzeni, Maurizio Atzori
Linked Data Publication of Live Music Archives and Analyses

We describe the publication of a linked data set exposing metadata from the Internet Archive Live Music Archive along with detailed feature analysis data of the audio files contained in the archive. The collection is linked to existing musical and geographical resources, allowing for the extraction of useful or interesting subsets of data using additional metadata. The collection is published using a ‘layered’ approach, aggregating the original information with links and specialised analyses, and forms a valuable resource for those investigating or developing audio analysis tools and workflows.

Sean Bechhofer, Kevin Page, David M. Weigl, György Fazekas, Thomas Wilmering
The MedRed Ontology for Representing Clinical Data Acquisition Metadata

Electronic Data Capture (EDC) software solutions are progressively being adopted for conducting clinical trials and studies, carried out by biomedical, pharmaceutical and health-care research teams. In this paper we present the MedRed Ontology, whose goal is to represent the metadata of these studies, using well-established standards, and reusing related vocabularies to describe essential aspects, such as validation rules, composability, or provenance. The paper describes the design principles behind the ontology and how it relates to existing models and formats used in the industry. We also reuse well-known vocabularies and W3C recommendations. Furthermore, we have validated the ontology with existing clinical studies in the context of the MedRed project, as well as a collection of metadata of well-known studies. Finally, we have made the ontology available publicly following best practices and vocabulary sharing guidelines.

Jean-Paul Calbimonte, Fabien Dubosson, Roger Hilfiker, Alexandre Cotting, Michael Schumacher
Iguana: A Generic Framework for Benchmarking the Read-Write Performance of Triple Stores

The performance of triple stores is crucial for applications driven by RDF. Several benchmarks have been proposed that assess the performance of triple stores. However, no integrated benchmark-independent execution framework for these benchmarks has yet been provided. We propose a novel SPARQL benchmark execution framework called Iguana. Our framework complements benchmarks by providing an execution environment which can measure the performance of triple stores during data loading and data updates as well as under different loads and parallel requests. Moreover, it allows a uniform comparison of results across different benchmarks. We execute the FEASIBLE and DBPSB benchmarks using the Iguana framework and measure the performance of popular triple stores under updates and parallel user requests. We compare our results (see https://doi.org/10.6084/m9.figshare.c.3767501.v1) with state-of-the-art benchmarking results and show that our benchmark execution framework can unveil new insights pertaining to the performance of triple stores.
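
The following is not Iguana itself but a toy Python illustration of the kind of measurement the framework automates: issuing parallel SPARQL requests against an endpoint (the URL is a placeholder) and reporting throughput.

```python
# Toy illustration of parallel-request benchmarking; Iguana automates and
# standardizes this kind of measurement across benchmarks and triple stores.
import time
from concurrent.futures import ThreadPoolExecutor
import requests

ENDPOINT = "http://localhost:3030/ds/sparql"               # placeholder
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"

def run_query(_):
    r = requests.get(ENDPOINT, params={"query": QUERY},
                     headers={"Accept": "application/sparql-results+json"})
    return r.status_code == 200

start = time.time()
with ThreadPoolExecutor(max_workers=8) as pool:            # 8 simulated clients
    results = list(pool.map(run_query, range(200)))
elapsed = time.time() - start
print(f"{sum(results)} successful queries, {len(results) / elapsed:.1f} q/s")
```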

Felix Conrads, Jens Lehmann, Muhammad Saleem, Mohamed Morsey, Axel-Cyrille Ngonga Ngomo
Ireland's Authoritative Geospatial Linked Data

Data.geohive.ie aims to provide an authoritative service for serving Ireland's national geospatial data as Linked Data. The service currently provides information on Irish administrative boundaries and the boundaries used for the Irish 2011 census. The service is designed to support two use cases: serving boundary data of geographic features at various levels of detail and capturing the evolution of administrative boundaries. In this paper, we report on the development of the service and elaborate on some of the informed decisions concerning the URI strategy and the use of named graphs to support the aforementioned use cases, relating those to similar initiatives. While clear insights on how the data is being used are still being gathered, we provide examples of how and where this geospatial Linked Data dataset is used.

Christophe Debruyne, Alan Meehan, Éamonn Clinton, Lorraine McNerney, Atul Nautiyal, Peter Lavin, Declan O’Sullivan
LOD-a-lot
A Queryable Dump of the LOD Cloud

LOD-a-lot democratizes access to the Linked Open Data (LOD) Cloud by serving more than 28 billion unique triples from 650 K datasets over a single self-indexed file. This corpus can be queried online with a sustainable Linked Data Fragments interface, or downloaded and consumed locally: LOD-a-lot is easy to deploy and demands affordable resources (524 GB of disk space and 15.7 GB of RAM), enabling Web-scale repeatable experimentation and research even on standard laptops.
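
A hedged sketch of local consumption, assuming the pyHDT package and a downloaded copy of the file (the file name and example subject are illustrative):

```python
# Querying the self-indexed HDT file locally; search_triples streams matches.
from hdt import HDTDocument

doc = HDTDocument("LOD_a_lot.hdt")                      # local copy (~524 GB)
triples, cardinality = doc.search_triples(
    "http://dbpedia.org/resource/Vienna", "", "")       # all triples about Vienna
print("matching triples:", cardinality)
for s, p, o in triples:
    print(s, p, o)
```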

Javier D. Fernández, Wouter Beek, Miguel A. Martínez-Prieto, Mario Arias
IMGpedia: A Linked Dataset with Content-Based Analysis of Wikimedia Images

IMGpedia is a large-scale linked dataset that incorporates visual information of the images from the Wikimedia Commons dataset: it brings together descriptors of the visual content of 15 million images, 450 million visual-similarity relations between those images, links to image metadata from DBpedia Commons, and links to the DBpedia resources associated with individual images. In this paper we describe the creation of the IMGpedia dataset, provide an overview of its schema and statistics of its contents, offer example queries that combine semantic and visual information of images, and discuss other envisaged use-cases for the dataset.
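
A hedged example of the kind of query the abstract mentions, combining visual-similarity links with DBpedia links; the endpoint URL and the vocabulary terms (imo:associatedWith, imo:similar) are assumptions, not verified against the dataset documentation.

```python
# Illustrative query mixing semantic (DBpedia) and visual-similarity information.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://imgpedia.dcc.uchile.cl/sparql")   # assumed endpoint
sparql.setQuery("""
PREFIX imo: <http://imgpedia.dcc.uchile.cl/ontology#>
SELECT ?img ?similar WHERE {
    ?img imo:associatedWith <http://dbpedia.org/resource/Vienna> ;  # illustrative terms
         imo:similar ?similar .
} LIMIT 10
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["img"]["value"], "~", row["similar"]["value"])
```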

Sebastián Ferrada, Benjamin Bustos, Aidan Hogan
WIDOCO: A Wizard for Documenting Ontologies

In this paper we describe WIDOCO, a WIzard for DOCumenting Ontologies that guides users through the documentation process of their vocabularies. Given an RDF vocabulary, WIDOCO detects missing vocabulary metadata and creates documentation with diagrams, human-readable descriptions of the ontology terms, and a summary of changes with respect to previous versions of the ontology. The documentation consists of a set of linked, enriched HTML pages that can be further extended by end users. WIDOCO is open source and builds on well-established Semantic Web tools. So far, WIDOCO has been used to document more than one hundred ontologies in different domains.

Daniel Garijo
The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata that Describe Scientific Experiments

The Center for Expanded Data Annotation and Retrieval (CEDAR) aims to revolutionize the way that metadata describing scientific experiments are authored. The software we have developed—the CEDAR Workbench—is a suite of Web-based tools and REST APIs that allows users to construct metadata templates, to fill in templates to generate high-quality metadata, and to share and manage these resources. The CEDAR Workbench provides a versatile, REST-based environment for authoring metadata that are enriched with terms from ontologies. The metadata are available as JSON, JSON-LD, or RDF for easy integration in scientific applications and reusability on the Web. Users can leverage our APIs for validating and submitting metadata to external repositories. The CEDAR Workbench is freely available and open-source.

Rafael S. Gonçalves, Martin J. O’Connor, Marcos Martínez-Romero, Attila L. Egyedi, Debra Willrett, John Graybeal, Mark A. Musen
WebIsALOD: Providing Hypernymy Relations Extracted from the Web as Linked Open Data

Hypernymy relations are an important asset in many applications, and a central ingredient of Semantic Web ontologies. The IsA database is a large collection of such hypernymy relations extracted from the Common Crawl. In this paper, we introduce WebIsALOD, a Linked Open Data release of the IsA database containing 400M hypernymy relations, each provided with rich provenance information. As the original dataset contained more than 80% wrong, noisy extractions, we ran a machine learning algorithm to assign confidence scores to the individual statements. Furthermore, 2.5M links to DBpedia and 23.7k links to the YAGO class hierarchy were created at a precision of 97%. In total, the dataset contains 5.4B triples.
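
A hedged sketch of filtering the released statements by the confidence scores described above; the endpoint URL and property names are assumptions made for illustration.

```python
# Illustrative query for high-confidence hypernymy statements.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://webisa.webdatacommons.org/sparql")   # assumed endpoint
sparql.setQuery("""
PREFIX wisa: <http://webisa.webdatacommons.org/ontology#>
SELECT ?hypo ?hyper ?conf WHERE {
    ?stmt wisa:hasInstance ?hypo ;      # property names are illustrative
          wisa:hasClass ?hyper ;
          wisa:hasConfidence ?conf .
    FILTER(?conf > 0.8)                 # keep only high-confidence extractions
} LIMIT 20
""")
sparql.setReturnFormat(JSON)
for b in sparql.query().convert()["results"]["bindings"]:
    print(b["hypo"]["value"], "isA", b["hyper"]["value"], b["conf"]["value"])
```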

Sven Hertling, Heiko Paulheim
Ontology-Based Data Access to Slegge

We report on our experience in ontology-based data access to the Slegge database at Statoil and share the resources employed in this use case: end-user information needs (in natural language), their translations into SPARQL, the Subsurface Exploration Ontology, the schema of the Slegge database with integrity constraints, and the mappings connecting the ontology and the schema.

Dag Hovland, Roman Kontchakov, Martin G. Skjæveland, Arild Waaler, M. Zakharyaschev
BiOnIC: A Catalog of User Interactions with Biomedical Ontologies

BiOnIC is a catalog of aggregated statistics of user clicks, queries, and reuse counts for access to over 200 biomedical ontologies. BiOnIC also provides anonymized sequences of classes accessed by users over a period of four years. To generate the statistics, we processed the access logs of BioPortal, a large open biomedical ontology repository. We publish the BiOnIC data using DCAT and SKOS metadata standards. The BiOnIC catalog has a wide range of applicability, which we demonstrate through its use in three different types of applications. To our knowledge, this type of interaction data stemming from a real-world, large-scale application has not been published before. We expect that the catalog will become an important resource for researchers and developers in the Semantic Web community by providing novel insights into how ontologies are explored, queried and reused. The BiOnIC catalog may ultimately assist in the more informed development of intelligent user interfaces for semantic resources through interface customization, prediction of user browsing and querying behavior, and ontology summarization. The BiOnIC catalog is available at: http://onto-apps.stanford.edu/bionic.

Maulik R. Kamdar, Simon Walk, Tania Tudorache, Mark A. Musen
Neural Embeddings for Populated Geonames Locations

The application of neural embedding algorithms (based on architectures like skip-grams) to large knowledge bases like Wikipedia and the Google News Corpus has tremendously benefited multiple communities in applications as diverse as sentiment analysis, named entity recognition and text classification. In this paper, we present a similar resource for geospatial applications. We systematically construct a weighted network that spans all populated places in Geonames. Using a network embedding algorithm that was recently found to achieve excellent results and is based on the skip-gram model, we embed each populated place into a 100-dimensional vector space, in a similar vein as the GloVe embeddings released for Wikipedia. We demonstrate potential applications of this dataset resource, which we release under a public license.
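
A minimal sketch of consuming the released vectors, assuming they are distributed in word2vec text format and keyed by Geonames identifiers (both assumptions; the file name and example identifier are illustrative):

```python
# Nearest-neighbour lookup over the 100-dimensional place embeddings.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("geonames_populated_100d.txt")
vienna = "2761369"                      # Geonames id used here for illustration
for place_id, score in vectors.most_similar(vienna, topn=5):
    print(place_id, round(score, 3))
```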

Mayank Kejriwal, Pedro Szekely
Distributed Semantic Analytics Using the SANSA Stack

A major research challenge is to perform scalable analysis of large-scale knowledge graphs to facilitate applications like link prediction, knowledge base completion and reasoning. Analytics methods which exploit expressive structures usually do not scale well to very large knowledge bases, and most analytics approaches which do scale horizontally (i.e., can be executed in a distributed environment) work on simple feature-vector-based input. This software framework paper describes the ongoing Semantic Analytics Stack (SANSA) project, which supports expressive and scalable semantic analytics by providing functionality for distributed computing on RDF data.
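
SANSA itself is implemented in Scala on Apache Spark; the following hedged PySpark sketch only illustrates the kind of horizontally scalable RDF analytics the stack provides (the file path is a placeholder, and the line splitting is deliberately naive).

```python
# Distributed predicate-frequency count over an N-Triples file with PySpark.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdf-analytics-sketch").getOrCreate()
lines = spark.sparkContext.textFile("hdfs:///data/dataset.nt")

def predicate_of(line):
    # naive N-Triples split; a real parser (as in SANSA) handles literals etc.
    parts = line.split(" ", 2)
    return parts[1] if len(parts) > 2 else None

counts = (lines.map(predicate_of)
               .filter(lambda p: p is not None)
               .map(lambda p: (p, 1))
               .reduceByKey(lambda a, b: a + b))
for pred, n in counts.top(10, key=lambda kv: kv[1]):
    print(pred, n)
```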

Jens Lehmann, Gezim Sejdiu, Lorenz Bühmann, Patrick Westphal, Claus Stadler, Ivan Ermilov, Simon Bin, Nilesh Chakraborty, Muhammad Saleem, Axel-Cyrille Ngonga Ngomo, Hajira Jabeen
The MIDI Linked Data Cloud

The study of music is highly interdisciplinary, and thus requires the combination of datasets from multiple musical domains, such as catalog metadata (authors, song titles, dates), industrial records (labels, producers, sales), and music notation (scores). While today an abundance of music metadata exists on the Linked Open Data cloud, linked datasets containing interoperable symbolic descriptions of music itself, i.e. music notation with note and instrument level information, are scarce. In this paper, we describe the MIDI Linked Data Cloud dataset, which represents multiple collections of digital music in the MIDI standard format as Linked Data using the novel midi2rdf algorithm. At the time of writing, our proposed dataset comprises 10,215,557,355 triples of 308,443 interconnected MIDI files, and provides Web-compatible descriptions of their MIDI events. We provide a comprehensive description of the dataset, and reflect on its applications for research in the Semantic Web and Music Information Retrieval communities.

Albert Meroño-Peñuela, Rinke Hoekstra, Aldo Gangemi, Peter Bloem, Reinier de Valk, Bas Stringer, Berit Janssen, Victor de Boer, Alo Allik, Stefan Schlobach, Kevin Page
SocialLink: Linking DBpedia Entities to Corresponding Twitter Accounts

We present SocialLink, a publicly available Linked Open Data dataset that matches social media accounts on Twitter to the corresponding entities in multiple language chapters of DBpedia. By effectively bridging the Twitter social media world and the Linked Open Data cloud, SocialLink enables knowledge transfer between the two: on the one hand, it supports Semantic Web practitioners in better harvesting the vast amounts of valuable, up-to-date information available in Twitter; on the other hand, it permits Social Media researchers to leverage DBpedia data when processing the noisy, semi-structured data of Twitter. SocialLink is automatically updated with periodic releases, and the code, along with the gold standard dataset used for its training, is made available as an open source project.

Yaroslav Nechaev, Francesco Corcoglioniti, Claudio Giuliano
UNDO: The United Nations System Document Ontology

Akoma Ntoso is an OASIS Committee Specification Draft standard for the electronic representation of parliamentary, normative and judicial documents in XML. Recently, it has been officially adopted by the United Nations (UN) as the main electronic format for making UN documents machine-processable. However, Akoma Ntoso neither enforces nor defines a formal ontology for describing the real-world objects, concepts and relations mentioned in documents. In order to address this gap, in this paper we introduce the United Nations System Document Ontology (UNDO), an OWL 2 DL ontology developed and adopted by the United Nations that aims at providing a framework for the formal description of all these entities.

Silvio Peroni, Monica Palmirani, Fabio Vitali
One Year of the OpenCitations Corpus
Releasing RDF-Based Scholarly Citation Data into the Public Domain

Reference lists from academic articles are core elements of scholarly communication that permit the attribution of credit and integrate our independent research endeavours. Hitherto, however, they have not been freely available in an appropriate machine-readable format such as RDF and in aggregate for use by scholars. To address this issue, one year ago we started ingesting citation data from the Open Access literature into the OpenCitations Corpus (OCC), creating an RDF dataset of scholarly citation data that is open to all. In this paper we introduce the OCC and we discuss its outcomes and uses after the first year of life.

Silvio Peroni, David Shotton, Fabio Vitali
An Entity Relatedness Test Dataset

A knowledge base stores descriptions of entities and their relationships, often in the form of a very large RDF graph, such as DBpedia or Wikidata. The entity relatedness problem refers to the question of computing the relationship paths that best capture the connectivity between a given entity pair. This paper describes a dataset created to support the evaluation of approaches that address the entity relatedness problem. The dataset covers two familiar domains, music and movies, and uses data available in IMDb and last.fm, which are popular reference datasets in these domains. The paper describes in detail how sets of entity pairs from each of these domains were selected and, for each entity pair, how a ranked list of relationship paths was obtained.
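
As a hedged illustration of the task the dataset evaluates (not the authors' method): enumerating relationship paths between an entity pair in a small RDF graph, using rdflib and networkx; the file name and entity URIs are placeholders.

```python
# Enumerate short relationship paths between two entities of an RDF graph.
import networkx as nx
from rdflib import Graph

rdf = Graph()
rdf.parse("movies_sample.ttl", format="turtle")          # placeholder file

g = nx.MultiDiGraph()
for s, p, o in rdf:
    g.add_edge(str(s), str(o), predicate=str(p))         # one edge per triple

source = "http://example.org/Actor_A"                    # placeholder entity pair
target = "http://example.org/Movie_B"
for path in nx.all_simple_paths(g.to_undirected(), source, target, cutoff=3):
    print(" -> ".join(path))
```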

José Eduardo Talavera Herrera, Marco Antonio Casanova, Bernardo Pereira Nunes, Luiz André P. Paes Leme, Giseli Rabello Lopes
RSPLab: RDF Stream Processing Benchmarking Made Easy

In Stream Reasoning (SR), empirical research on RDF Stream Processing (RSP) is attracting growing attention. The SR community has proposed methodologies and benchmarks to investigate the RSP solution space and improve existing approaches. In this paper, we present RSPLab, an infrastructure that reduces the effort required to design and execute reproducible experiments as well as to share their results. RSPLab integrates two existing RSP benchmarks (LSBench and CityBench) and two RSP engines (C-SPARQL engine and CQELS). It provides a programmatic environment to deploy RDF streams and RSP engines in the cloud, interact with them using TripleWave and RSP Services, and continuously monitor their performance and collect statistics. RSPLab is released as open source under the Apache 2.0 license.

Riccardo Tommasini, Emanuele Della Valle, Andrea Mauri, Marco Brambilla
LC-QuAD: A Corpus for Complex Question Answering over Knowledge Graphs

Being able to access knowledge bases in an intuitive way has been an active area of research over the past years. In particular, several question answering (QA) approaches which allow RDF datasets to be queried in natural language have been developed, as they let end users access knowledge without needing to learn the schema of a knowledge base or a formal query language. To foster this research area, several training datasets have been created, e.g. in the QALD (Question Answering over Linked Data) initiative. However, existing datasets are insufficient in terms of size, variety or complexity to apply and evaluate a range of machine learning based QA approaches for learning complex SPARQL queries. With the provision of the Large-Scale Complex Question Answering Dataset (LC-QuAD), we close this gap by providing a dataset with 5000 questions and their corresponding SPARQL queries over the DBpedia dataset. In this article, we describe the dataset creation process and how we ensure a high variety of questions, which should make it possible to assess the robustness and accuracy of the next generation of QA systems for knowledge graphs.
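
A small sketch of iterating over the question/SPARQL pairs, assuming a local copy of the JSON release; the field names used here are assumptions about the distribution format.

```python
# Print a few LC-QuAD question/query pairs from a hypothetical local copy.
import json

with open("lcquad_train.json") as f:
    dataset = json.load(f)

for item in dataset[:3]:
    print("Q:", item.get("corrected_question"))     # field names are assumed
    print("SPARQL:", item.get("sparql_query"))
    print()
```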

Priyansh Trivedi, Gaurav Maheshwari, Mohnish Dubey, Jens Lehmann
PDD Graph: Bridging Electronic Medical Records and Biomedical Knowledge Graphs via Entity Linking

Electronic medical records (EMRs) contain multi-format electronic medical data that comprise an abundance of medical knowledge. Faced with a patient's symptoms, experienced caregivers make the right medical decisions based on professional knowledge that accurately grasps the relationships between symptoms, diagnoses, and corresponding treatments. In this paper, we aim to capture these relationships by constructing a large and high-quality heterogeneous graph linking patients, diseases, and drugs (PDD) in EMRs. Specifically, we propose a novel framework to extract important medical entities from MIMIC-III (Medical Information Mart for Intensive Care III) and automatically link them with existing biomedical knowledge graphs, including the ICD-9 ontology and DrugBank. The PDD graph presented in this paper is accessible on the Web via a SPARQL endpoint, and provides a pathway for medical discovery and applications, such as effective treatment recommendations.

Meng Wang, Jiaheng Zhang, Jun Liu, Wei Hu, Sen Wang, Xue Li, Wenqiang Liu

In-Use Track

Frontmatter
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

Traditional approaches to ontology development involve a long lapse between the time when a user of the ontology finds a need to extend it and the time when it actually gets extended. For scientists, this delay can be weeks or months and can be a significant barrier to adoption. We present a new approach to ontology development and data annotation that enables users to add new metadata properties on the fly as they describe their datasets, creating terms that can be immediately adopted by others and eventually become standardized. This approach combines a traditional, consensus-based approach to ontology development with a crowdsourced approach where expert users (the crowd) can dynamically add terms as needed to support their work. We have implemented this approach as a socio-technical system that includes: (1) a crowdsourcing platform to support metadata annotation and the addition of new terms, (2) a range of social editorial processes to make standardization decisions for those new terms, and (3) a framework for ontology revision and updates to the metadata created with the previous version of the ontology. We present a prototype implementation for the Paleoclimate community, the Linked Earth Framework, currently containing 700 datasets and engaging over 50 active contributors. Users exploit the platform to do science while extending the metadata vocabulary, thereby producing useful and practical metadata.

Yolanda Gil, Daniel Garijo, Varun Ratnakar, Deborah Khider, Julien Emile-Geay, Nicholas McKay
An Investigative Search Engine for the Human Trafficking Domain

Enabling intelligent search systems that can navigate and facet on entities, classes and relationships, rather than plain text, to answer questions in complex domains is a longstanding aspect of the Semantic Web vision. This paper presents an investigative search engine that meets some of these challenges, at scale, for a variety of complex queries in the human trafficking domain. The engine provides a real-world case study of synergy between technology derived from research communities as diverse as the Semantic Web (investigative ontologies, SPARQL-inspired querying, Linked Data), Natural Language Processing (knowledge graph construction, word embeddings) and Information Retrieval (fast, user-driven relevance querying). The search engine has been rigorously prototyped as part of the DARPA MEMEX program and has been integrated into the latest version of the Domain-specific Insight Graph (DIG) architecture, currently used by hundreds of US law enforcement agencies for investigating human trafficking. Over a hundred million ads have been indexed. The engine is also being extended to other challenging illicit domains, such as securities and penny stock fraud, illegal firearm sales, and patent trolling, with promising results.

Mayank Kejriwal, Pedro Szekely
Lessons Learned in Building Linked Data for the American Art Collaborative

Linked Data has emerged as the preferred method for publishing and sharing cultural heritage data. One of the main challenges for museums is that the de facto standard ontology (CIDOC CRM) is complex and museums lack expertise in Semantic Web technologies. In this paper we describe the methodology and tools we used to create 5-star Linked Data for 14 American art museums with a team of 12 computer science students and 30 representatives from the museums who mostly lacked expertise in Semantic Web technologies. The project was completed over a period of 18 months and generated 99 mapping files and 9,357 artist links, producing a total of 2,714 R2RML rules and 9.7M triples. More importantly, the project produced a number of open source tools for generating high-quality linked data and resulted in a set of lessons learned that can be applied in future projects.

Craig A. Knoblock, Pedro Szekely, Eleanor Fink, Duane Degler, David Newbury, Robert Sanderson, Kate Blanch, Sara Snyder, Nilay Chheda, Nimesh Jain, Ravi Raju Krishna, Nikhila Begur Sreekanth, Yixiang Yao
Modeling and Using an Actor Ontology of Second World War Military Units and Personnel

This paper presents a model for representing historical military personnel and army units, based on large datasets about World War II in Finland. The model is in use in the WarSampo data service and semantic portal, which has had tens of thousands of distinct visitors. A key challenge is how to represent ontological changes, since the ranks and units of military personnel, as well as the names and structures of army units, change rapidly in wars. This leads to serious problems in both search and data linking due to the ambiguity and homonymy of names. In our solution, actors are represented in terms of the events they participated in, which facilitates disambiguation of personnel and units in different spatio-temporal contexts. The linked data in the WarSampo Linked Open Data cloud and service has ca. 9 million triples, including actor datasets of ca. 100 000 soldiers and ca. 16 100 army units. To test the model in practice, an application for semantic search and recommendation based on data linking was created, where the spatio-temporal life stories of individual soldiers can be reassembled dynamically by linking data from different datasets. An evaluation is presented showing promising results in terms of linking precision.

Petri Leskinen, Mikko Koho, Erkki Heino, Minna Tamper, Esko Ikkala, Jouni Tuominen, Eetu Mäkelä, Eero Hyvönen
Sustainable Linked Data Generation: The Case of DBpedia

The DBpedia EF, the generation framework behind one of the Linked Open Data cloud's central interlinking hubs, has limitations with regard to quality, coverage and sustainability of the generated dataset. DBpedia can be further improved both on the schema and on the data level. Errors and inconsistencies can be addressed by amending (i) the DBpedia EF; (ii) the DBpedia mapping rules; or (iii) Wikipedia itself, from which it extracts information. However, even though the DBpedia EF and mapping rules are continuously evolving and several changes have been applied to both of them, there have been no significant improvements to the DBpedia dataset since its limitations were identified. To address these shortcomings, we propose a different, semantics-driven approach that decouples, in a declarative manner, the extraction, transformation and mapping rule execution. In this paper, we provide details regarding the new DBpedia EF, its architecture, technical implementation and extraction results. This way, we achieve an enhanced data generation process, which can be broadly adopted, and which improves the quality, coverage and sustainability of the dataset.

Wouter Maroy, Anastasia Dimou, Dimitris Kontokostas, Ben De Meester, Ruben Verborgh, Jens Lehmann, Erik Mannens, Sebastian Hellmann
Semantic Rule-Based Equipment Diagnostics

Industrial rule-based diagnostic systems are often data-dependent in the sense that they rely on specific characteristics of individual pieces of equipment. This dependence poses significant challenges for rule authoring, reuse, and maintenance by engineers. In this work we address these problems by relying on Ontology-Based Data Access: we use ontologies to mediate between the equipment and the rules. We propose a semantic rule language, sigRL, where sensor signals are first-class citizens. Our language offers a balance of expressive power, usability, and efficiency: it captures most of Siemens' data-driven diagnostic rules, significantly simplifies the authoring of diagnostic tasks, and allows semantic rules to be efficiently rewritten from ontologies to data and executed over the data. We implemented our approach in a semantic diagnostic system, deployed it at Siemens, and conducted experiments to demonstrate both usability and efficiency.

Gulnar Mehdi, E. Kharlamov, Ognjen Savković, G. Xiao, E. Güzel Kalaycı, S. Brandt, I. Horrocks, Mikhail Roshchin, Thomas Runkler
Automatic Query-Centric API for Routine Access to Linked Data

Despite the advantages of Linked Data as a data integration paradigm, accessing and consuming Linked Data is still a cumbersome task. Linked Data applications need to use technologies such as RDF and SPARQL that, despite their expressive power, belong to the data integration stack. As a result, applications and data cannot be cleanly separated: SPARQL queries, endpoint addresses, namespaces, and URIs end up as part of the application code. Many publishers address these problems by building RESTful APIs around their Linked Data. However, this solution has two pitfalls: these APIs are costly to maintain, and they black-box functionality by hiding the queries they use. In this paper we describe grlc, a gateway between Linked Data applications and the LOD cloud that offers a RESTful, reusable and uniform means of routinely accessing any Linked Data. It generates an OpenAPI-compatible API by using parametrized queries shared on the Web. The resulting APIs require no coding, rely on low-cost external query storage and versioning services, contain abundant provenance information, and integrate access to different publishing paradigms into a single API. We evaluate grlc qualitatively, by describing the value reported by current users, and quantitatively, by measuring the overhead added when generating API specifications and answering calls.
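
On the consumption side, a grlc-generated API is called like any other REST endpoint; the URL pattern and parameter name below are placeholders rather than a documented deployment.

```python
# Calling an endpoint generated by grlc from a parametrized query shared on the Web.
import requests

resp = requests.get(
    "http://grlc.example.org/api/some-user/some-repo/list_resources",  # placeholder
    params={"type": "http://dbpedia.org/ontology/Museum"},             # query parameter
    headers={"Accept": "application/json"},
)
resp.raise_for_status()
print(resp.json())
```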

Albert Meroño-Peñuela, Rinke Hoekstra
Realizing an RDF-Based Information Model for a Manufacturing Company – A Case Study

The digitization of the industry requires information models describing assets and information sources of companies to enable the semantic integration and interoperable exchange of data. We report on a case study in which we realized such an information model for a global manufacturing company using semantic technologies. The information model is centered around machine data and describes all relevant assets, key terms and relations in a structured way, making use of existing as well as newly developed RDF vocabularies. In addition, it comprises numerous RML mappings that link different data sources required for integrated data access and querying via SPARQL. The technical infrastructure and methodology used to develop and maintain the information model is based on a Git repository and utilizes the development environment VoCol as well as the Ontop framework for Ontology Based Data Access. Two use cases demonstrate the benefits and opportunities provided by the information model. We evaluated the approach with stakeholders and report on lessons learned from the case study.

Niklas Petersen, Lavdim Halilaj, Irlán Grangel-González, Steffen Lohmann, Christoph Lange, Sören Auer
Personalizing Actions in Context for Risk Management Using Semantic Web Technologies

The process of managing the risks of client contracts is manual and resource-consuming, particularly so for Fortune 500 companies. As an example, Accenture assesses the risk of eighty thousand contracts every year. For each contract, different types of data are consolidated from many sources and used to compute its risk tier. For high-risk tier contracts, a Quality Assurance Director (QAD) is assigned to mitigate or even prevent the risk. The QAD gathers and selects the recommended actions during regular portfolio review meetings to enable leadership to take the appropriate actions. In this paper, we propose to automatically personalize and contextualize actions to improve their efficacy. Our approach integrates enterprise and external data into a knowledge graph and interprets actions based on QADs' profiles through semantic reasoning over this knowledge graph. User studies showed that QADs could efficiently select actions that better mitigate the risk than the existing approach.

Jiewen Wu, Freddy Lécué, Christophe Gueret, Jer Hayes, Sara van de Moosdijk, Gemma Gallagher, Peter McCanney, Eugene Eichelberger
Backmatter
Metadata
Title
The Semantic Web – ISWC 2017
Edited by
Claudia d'Amato
Miriam Fernandez
Valentina Tamma
Freddy Lecue
Philippe Cudré-Mauroux
Juan Sequeda
Christoph Lange
Jeff Heflin
Copyright year
2017
Electronic ISBN
978-3-319-68204-4
Print ISBN
978-3-319-68203-7
DOI
https://doi.org/10.1007/978-3-319-68204-4
