2011 | Book

The Semantic Web: Research and Applications

8th Extended Semantic Web Conference, ESWC 2011, Heraklion, Crete, Greece, May 29 – June 2, 2011, Proceedings, Part II

Edited by: Grigoris Antoniou, Marko Grobelnik, Elena Simperl, Bijan Parsia, Dimitris Plexousakis, Pieter De Leenheer, Jeff Pan

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science

About this book

The books (LNCS 6643 and 6644) constitute the refereed proceedings of the 8th Extended Semantic Web Conference, ESWC 2011, held in Heraklion, Crete, Greece, in May/June 2011. The 57 revised full papers of the research track presented together with 7 PhD symposium papers and 14 demo papers were carefully reviewed and selected from 291 submissions. The papers are organized in topical sections on digital libraries track; inductive and probabilistic approaches track; linked open data track; mobile web track; natural language processing track; ontologies track; and reasoning track (part I); semantic data management track; semantic web in use track; sensor web track; software, services, processes and cloud computing track; social web and web science track; demo track, PhD symposium (part II).

Table of contents

Frontmatter

Semantic Data Management Track

Semantics and Optimization of the SPARQL 1.1 Federation Extension

The W3C SPARQL working group is defining the new SPARQL 1.1 query language. The current working draft of SPARQL 1.1 focuses mainly on the description of the language. In this paper, we provide a formalization of the syntax and semantics of the SPARQL 1.1 federation extension, an important fragment of the language that has not yet received much attention. Besides, we propose optimization techniques for this fragment, provide an implementation of the fragment including these techniques, and carry out a series of experiments that show that our optimization procedures significantly speed up the query evaluation process.

Carlos Buil-Aranda, Marcelo Arenas, Oscar Corcho
Grr: Generating Random RDF

This paper presents Grr, a powerful system for generating random RDF data, which can be used to test Semantic Web applications. Grr has a SPARQL-like syntax, which allows the system to be both powerful and convenient. It is shown that Grr can easily be used to produce intricate datasets, such as the LUBM benchmark. Optimization techniques are employed, which make the generation process efficient and scalable.

Daniel Blum, Sara Cohen
High-Performance Computing Applied to Semantic Databases

To date, the application of high-performance computing resources to Semantic Web data has largely focused on commodity hardware and distributed memory platforms. In this paper we make the case that more specialized hardware can offer superior scaling and close to an order of magnitude improvement in performance. In particular, we examine the Cray XMT. Its key characteristics, a large, global shared memory and processors with a memory-latency-tolerant design, offer an environment conducive to programming for the Semantic Web and have engendered results that far surpass the current state of the art. We examine three fundamental pieces requisite for a fully functioning semantic database: dictionary encoding, RDFS inference, and query processing. We show scaling up to 512 processors (the largest configuration we had available), and the ability to process 20 billion triples completely in-memory.
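
As an illustration of the first of those three pieces, dictionary encoding replaces each RDF term with a compact integer identifier. The following is a minimal single-process sketch in Python, not the parallel Cray XMT implementation the paper describes; the function name `encode` and the sample triples are assumptions made for this example.

```python
def encode(triples):
    """Map every distinct term to an integer ID and rewrite the triples.

    Illustrative only: IDs are assigned in order of first appearance.
    """
    term_to_id = {}
    encoded = []
    for triple in triples:
        # setdefault assigns the next free ID the first time a term is seen
        ids = tuple(term_to_id.setdefault(term, len(term_to_id)) for term in triple)
        encoded.append(ids)
    return term_to_id, encoded


# Hypothetical sample data
triples = [("alice", "knows", "bob"), ("bob", "knows", "alice")]
term_to_id, encoded = encode(triples)
print(term_to_id)
print(encoded)
```

Working on integer tuples rather than strings is what makes later steps such as inference and query processing cheaper to store and compare.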

Eric L. Goodman, Edward Jimenez, David Mizell, Sinan al-Saffar, Bob Adolf, David Haglin
An Intermediate Algebra for Optimizing RDF Graph Pattern Matching on MapReduce

Existing MapReduce systems support relational style join operators which translate multi-join query plans into several Map-Reduce cycles. This leads to high I/O and communication costs due to the multiple data transfer steps between map and reduce phases. SPARQL graph pattern matching is dominated by join operations, and is unlikely to be efficiently processed using existing techniques. This cost is prohibitive for RDF graph pattern matching queries which typically involve several join operations. In this paper, we propose an approach for optimizing graph pattern matching by reinterpreting certain join tree structures as grouping operations. This enables a greater degree of parallelism in join processing resulting in more “bushy” like query execution plans with fewer Map-Reduce cycles. This approach requires that the intermediate results are managed as sets of groups of triples or TripleGroups. We therefore propose a data model and algebra, the Nested TripleGroup Algebra (NTGA), for capturing and manipulating TripleGroups. The relationship with the traditional relational style algebra used in Apache Pig is discussed. A comparative performance evaluation of the traditional Pig approach and RAPID+ (Pig extended with NTGA) for graph pattern matching queries on the BSBM benchmark dataset is presented. Results show up to 60% performance improvement of our approach over traditional Pig for some tasks.
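
To make the grouping idea concrete, here is a minimal Python sketch. It is not the paper's NTGA or its MapReduce implementation: it only shows how grouping triples by subject lets a star-shaped pattern (several properties on one subject) be matched in a single grouping pass instead of a chain of joins. The function names and sample data are illustrative assumptions.

```python
from collections import defaultdict


def group_by_subject(triples):
    """Collect (property, object) pairs per subject: one 'triple group' each."""
    groups = defaultdict(list)
    for s, p, o in triples:
        groups[s].append((p, o))
    return dict(groups)


def match_star(groups, required_props):
    """Return the subjects whose group contains every required property."""
    required = set(required_props)
    return sorted(
        s for s, pairs in groups.items()
        if required <= {p for p, _ in pairs}
    )


# Hypothetical sample data
triples = [
    ("product1", "label", "Widget"),
    ("product1", "price", "10"),
    ("product2", "label", "Gadget"),
]
groups = group_by_subject(triples)
matches = match_star(groups, ["label", "price"])
print(matches)
```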

Padmashree Ravindra, HyeongSik Kim, Kemafor Anyanwu
Query Relaxation for Entity-Relationship Search

Entity-relationship-structured data is becoming more important on the Web. For example, large knowledge bases have been automatically constructed by information extraction from Wikipedia and other Web sources. Entities and relationships can be represented by subject-property-object triples in the RDF model, and can then be precisely searched by structured query languages like SPARQL. Because of their Boolean-match semantics, such queries often return too few or even no results. To improve recall, it is thus desirable to support users by automatically relaxing or reformulating queries in such a way that the intention of the original user query is preserved while returning a sufficient number of ranked results.

In this paper we describe comprehensive methods to relax SPARQL-like triple-pattern queries in a fully automated manner. Our framework produces a set of relaxations by means of statistical language models for structured RDF data and queries. The query processing algorithms merge the results of different relaxations into a unified result list, with ranking based on any ranking function for structured queries over RDF-data. Our experimental evaluation, with two different datasets about movies and books, shows the effectiveness of the automatically generated relaxations and the improved quality of query results based on assessments collected on the Amazon Mechanical Turk platform.

Shady Elbassuoni, Maya Ramanath, Gerhard Weikum
Optimizing Query Shortcuts in RDF Databases

The emergence of the Semantic Web has led to the creation of large semantic knowledge bases, often in the form of RDF databases. Improving the performance of RDF databases necessitates the development of specialized data management techniques, such as the use of shortcuts in the place of path queries. In this paper we deal with the problem of selecting the most beneficial shortcuts that reduce the execution cost of path queries in RDF databases given a space constraint. We first demonstrate that this problem is an instance of the quadratic knapsack problem. Given the computational complexity of solving such problems, we then develop an alternative formulation based on a bi-criterion linear relaxation, which essentially seeks to minimize a weighted sum of the query cost and of the required space consumption. As we demonstrate in this paper, this relaxation leads to very efficient classes of linear programming solutions. We utilize this bi-criterion linear relaxation in an algorithm that selects a subset of shortcuts to materialize. This shortcut selection algorithm is extensively evaluated and compared with a greedy algorithm that we developed in prior work. The reported experiments show that the linear relaxation algorithm manages to significantly reduce the query execution times, while also outperforming the greedy solution.
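
The weighted-sum trade-off between query cost and space can be illustrated with a deliberately simplified Python sketch. It treats candidate shortcuts as independent, which the quadratic knapsack formulation in the paper does not, and it is neither the authors' linear programming algorithm nor their greedy baseline. The function name, the `weight` parameter and the sample numbers are assumptions made for this example.

```python
def select_shortcuts(candidates, weight):
    """Keep a shortcut when its query-cost saving exceeds its weighted space use.

    candidates: list of (name, cost_saving, space) tuples (hypothetical data).
    weight: trade-off factor between query cost and space (assumed positive).
    """
    return [name for name, saving, space in candidates
            if saving > weight * space]


# Hypothetical candidate shortcuts: (name, cost saving, space needed)
candidates = [("a->c", 50.0, 10.0), ("b->d", 5.0, 10.0), ("a->e", 30.0, 20.0)]
chosen = select_shortcuts(candidates, weight=1.0)
chosen_strict = select_shortcuts(candidates, weight=2.0)
print(chosen, chosen_strict)
```

Raising `weight` penalises space more heavily, so fewer shortcuts are materialised.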

Vicky Dritsou, Panos Constantopoulos, Antonios Deligiannakis, Yannis Kotidis
RDFS Update: From Theory to Practice

There is a comprehensive body of theory studying updates and schema evolution of knowledge bases, ontologies, and in particular of RDFS. In this paper we turn these ideas into practice by presenting a feasible and practical procedure for updating RDFS. Along the lines of ontology evolution, we treat schema and instance updates separately, showing that RDFS instance updates are not only feasible, but also deterministic. For RDFS schema update, known to be intractable in the general abstract case, we show that it becomes feasible in real world datasets. For both instance and schema update, we present simple and feasible algorithms.

Claudio Gutierrez, Carlos Hurtado, Alejandro Vaisman
Benchmarking Matching Applications on the Semantic Web

The evaluation of matching applications is becoming a major issue in the semantic web and it requires a suitable methodological approach as well as appropriate benchmarks. In particular, in order to evaluate a matching application under different experimental conditions, it is crucial to provide a test dataset characterized by a controlled variety of different heterogeneities among data that rarely occurs in real data repositories. In this paper, we propose SWING (Semantic Web INstance Generation), a disciplined approach to the semi-automatic generation of benchmarks to be used for the evaluation of matching applications.

Alfio Ferrara, Stefano Montanelli, Jan Noessner, Heiner Stuckenschmidt
Efficiently Evaluating Skyline Queries on RDF Databases

Skyline queries are a class of preference queries that compute the Pareto-optimal tuples from a set of tuples and are valuable for multi-criteria decision making scenarios. While this problem has received significant attention in the context of a single relational table, skyline queries over joins of multiple tables, which are typical of storage models for RDF data, have received much less attention. A naïve approach such as a join-first-skyline-later strategy splits the join and skyline computation phases, which limits opportunities for optimization. Other existing techniques for multi-relational skyline queries assume storage and indexing techniques that are not typically used with RDF, which would require a preprocessing step for data transformation. In this paper, we present an approach for optimizing skyline queries over RDF data stored using a vertically partitioned schema model. It is based on the concept of a “Header Point” which maintains a concise summary of the already visited regions of the data space. This summary allows some fraction of non-skyline tuples to be pruned from advancing to the skyline processing phase, thus reducing the overall cost of expensive dominance checks required in the skyline phase. We further present more aggressive pruning rules that result in the computation of near-complete skylines in significantly less time than the complete algorithm. A comprehensive performance evaluation of different algorithms is presented using datasets with different types of data distributions generated by a benchmark data generator.
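
For readers unfamiliar with skylines, the dominance check mentioned above can be sketched in a few lines of Python. This is the textbook quadratic definition, not the Header Point technique of the paper, and it assumes that lower values are better in every dimension. The sample tuples are hypothetical.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every dimension and strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) check)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (price, distance) tuples; lower is better for both
hotels = [(50, 8), (60, 2), (80, 1), (70, 5), (90, 9)]
result = skyline(hotels)
print(result)
```

Every call to `dominates` is one of the dominance checks whose number the pruning described in the abstract aims to reduce.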

Ling Chen, Sidan Gao, Kemafor Anyanwu
The Design and Implementation of Minimal RDFS Backward Reasoning in 4store

This paper describes the design and implementation of Minimal RDFS semantics based on a backward chaining approach and implemented on a clustered RDF triple store. The system presented, called 4sr, uses 4store as base infrastructure. In order to achieve a highly scalable system we implemented the reasoning at the lowest level of the quad store, the bind operation. The bind operation runs concurrently in all the data slices allowing the reasoning to be processed in parallel among the cluster. Throughout this paper we provide detailed descriptions of the architecture, reasoning algorithms, and a scalability evaluation with the LUBM benchmark. 4sr is a stable tool available under a GNU GPL3 license and can be freely used and extended by the community.
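
The difference between backward chaining and materialising inferences up front can be shown with a small Python sketch. It answers a type query by walking subclass links at query time and stores no inferred triples. It is an illustration of the general idea only and says nothing about how 4sr's bind operation is actually implemented; all names and data below are assumptions.

```python
def superclasses(cls, subclass_of):
    """All classes reachable from cls via subclass links, including cls itself."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def has_type(instance, cls, asserted_types, subclass_of):
    """Answer 'instance has type cls' at query time, without stored inferences."""
    return any(cls in superclasses(t, subclass_of)
               for t in asserted_types.get(instance, ()))


# Hypothetical schema and instance data
subclass_of = {"GraduateStudent": ["Student"], "Student": ["Person"]}
asserted_types = {"alice": ["GraduateStudent"]}
is_person = has_type("alice", "Person", asserted_types, subclass_of)
is_professor = has_type("alice", "Professor", asserted_types, subclass_of)
print(is_person, is_professor)
```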

Manuel Salvadores, Gianluca Correndo, Steve Harris, Nick Gibbins, Nigel Shadbolt

Semantic Web in Use Track

miKrow: Semantic Intra-enterprise Micro-Knowledge Management System

Knowledge Management systems are one of the key strategies that allow companies to fully tap into their collective knowledge. However, two main entry barriers currently limit the potential of this approach: i) the hurdles employees encounter, which discourage them from strong and active participation (knowledge providing), and ii) the lack of truly evolved intelligent technologies that allow those employees to easily benefit from the global knowledge provided by them and other users (knowledge consuming). These two needs can require opposite approaches, so current solutions tend to be either not user-friendly enough to encourage strong participation or not intelligent enough to be useful. In this paper, a lightweight framework for Knowledge Management is proposed based on the combination of two layers that cater to each need: a microblogging layer that simplifies how users interact with the whole system and a semantic powered engine that performs all the intelligent heavy lifting by combining semantic indexing and search of messages and users. Different mechanisms are also presented as extensions that can be plugged in on demand and help expand the capabilities of the whole system.

Víctor Penela, Guillermo Álvaro, Carlos Ruiz, Carmen Córdoba, Francesco Carbone, Michelangelo Castagnone, José Manuel Gómez-Pérez, Jesús Contreras
A Faceted Ontology for a Semantic Geo-Catalogue

Geo-spatial applications need to provide powerful search capabilities to support users in their daily activities. However, discovery services are often limited by only syntactically matching user terminology to metadata describing geographical resources. We report our work on the implementation of a geographical catalogue, and corresponding semantic extension, for the spatial data infrastructure (SDI) of the Autonomous Province of Trento (PAT) in Italy. We focus in particular on the semantic extension, which is based on the adoption of the S-Match semantic matching tool and on the use of a faceted ontology codifying geographical domain specific knowledge. We finally report our experience in the integration of the faceted ontology with the multi-lingual geo-spatial ontology GeoWordNet.

Feroz Farazi, Vincenzo Maltese, Fausto Giunchiglia, Alexander Ivanyukovich
SoKNOS – Using Semantic Technologies in Disaster Management Software

Disaster management software deals with supporting staff in large catastrophic incidents such as earthquakes or floods, e.g., by providing relevant information, facilitating task and resource planning, and managing communication with all involved parties. In this paper, we introduce the SoKNOS support system, which is a functional prototype for such software using semantic technologies for various purposes. Ontologies are used for creating a mutual understanding between developers and end users from different organizations. Information sources and services are annotated with ontologies for improving the provision of the right information at the right time, for connecting existing systems and databases to the SoKNOS system, and for providing an ontology-based visualization. Furthermore, the users’ actions are constantly supervised, and errors are avoided by employing ontology-based consistency checking. We show how the pervasive and holistic use of semantic technologies leads to a significant improvement of both the development and the usability of disaster management software, and present some key lessons learned from employing semantic technologies in a large-scale software project.

Grigori Babitski, Simon Bergweiler, Olaf Grebner, Daniel Oberle, Heiko Paulheim, Florian Probst
Semantic Technologies for Describing Measurement Data in Databases

Exploration and analysis of vast empirical data is a cornerstone of the development and assessment of driver assistance systems. A common challenge is to apply the domain specific knowledge to the (mechanised) data handling, pre-processing and analysis process.

Ontologies can describe domain specific knowledge in a structured way that is manageable for both humans and algorithms. This paper outlines an architecture to support an ontology based analysis process for data stored in databases. Building on these concepts and this architecture, a prototype that handles semantic data annotations is presented. Finally, the concept is demonstrated in a realistic example. The usage of exchangeable ontologies generally allows the presented methods to be adapted to different domains.

Ulf Noyer, Dirk Beckmann, Frank Köster
Ontology-Driven Guidance for Requirements Elicitation

Requirements managers aim at keeping their sets of requirements well-defined, consistent and up to date throughout a project’s life cycle. Semantic web technologies have found many valuable applications in the field of requirements engineering, with most of them focusing on requirements analysis. However, the usability of results originating from such requirements analyses strongly depends on the quality of the original requirements, which often are defined using natural language expressions without meaningful structures. In this work we present the prototypical implementation of a semantic guidance system used to assist requirements engineers with capturing requirements using a semi-formal representation. The semantic guidance system uses concepts, relations and axioms of a domain ontology to provide a list of suggestions the requirements engineer can build on to define requirements. The semantic guidance system is evaluated based on a domain ontology and a set of requirements from the aerospace domain. The evaluation results show that the semantic guidance system effectively supports requirements engineers in defining well-structured requirements.

Stefan Farfeleder, Thomas Moser, Andreas Krall, Tor Stålhane, Inah Omoronyia, Herbert Zojer
The Semantic Public Service Portal (S-PSP)

One of government’s responsibilities is the provision of public services to its citizens, for example, education, health, transportation, and social services. Additionally, with the explosion of the Internet in the past 20 years, many citizens have moved online as their main method of communication and learning. Therefore, a logical step for governments is to move the provision of public services online. However, public services have a complex structure and may span multiple, disparate public agencies. Moreover, the legislation that governs a public service is usually difficult for a layman to understand. Despite this, governments have created online portals to enable citizens to find out about and utilise specific public services. While this is positive, most portals fail to engage citizens because they do not manage to hide the complexity of public services from users. Many also fail to address the specific needs of users, providing instead only information about the most general use-case. In this paper we present the Semantic Public Service Portal (S-PSP), which structures and stores detailed public-services semantically, so that they may be presented to citizens on-demand in a relevant, yet uncomplicated, manner. This ontology-based approach enables automated and logical decision-making to take place semantically in the application layer of the portal, while the user remains blissfully unaware of its complexities. An additional benefit of this approach is that the eligibility of a citizen for a particular public service may be identified early. The S-PSP provides a rich, structured and personalised public service description to the citizen, with which he/she can consume the public service as directed. In this paper, a use-case of the S-PSP in a rural community in Greece is described, demonstrating how its use can directly reduce the administrative burden on a citizen, in this case a rural Small and Medium Enterprise (SME).

Nikolaos Loutas, Deirdre Lee, Fadi Maali, Vassilios Peristeras, Konstantinos Tarabanis
DataFinland—A Semantic Portal for Open and Linked Datasets

The number of open datasets available on the web is increasing rapidly with the rise of the Linked Open Data (LOD) cloud and various governmental efforts for releasing public data in different formats, not only in RDF. The aim in releasing open datasets is for developers to use them in innovative applications, but the datasets need to be found first and metadata available is often minimal, heterogeneous, and distributed making the search for the right dataset often problematic. To address the problem, we present DataFinland, a semantic portal featuring a distributed content creation model and tools for annotating and publishing metadata about LOD and non-RDF datasets on the web. The metadata schema for DataFinland is based on a modified version of the voiD vocabulary for describing linked RDF datasets, and annotations are done using an online metadata editor SAHA connected to ONKI ontology services providing a controlled set of annotation concepts. The content is published instantly on an integrated faceted search and browsing engine HAKO for human users, and as a SPARQL endpoint and a source file for machines. As a proof of concept, the system has been applied to LOD and Finnish governmental datasets.

Matias Frosterus, Eero Hyvönen, Joonas Laitio
Biological Names and Taxonomies on the Semantic Web – Managing the Change in Scientific Conception 

Biodiversity management requires the usage of heterogeneous biological information from multiple sources. Indexing, aggregating, and finding such information is based on names and taxonomic knowledge of organisms. However, taxonomies change in time due to new scientific findings, opinions of authorities, and changes in our conception about life forms. Furthermore, organism names and their meaning change in time, different authorities use different scientific names for the same taxon in different times, and various vernacular names are in use in different languages. This makes data integration and information retrieval difficult without detailed biological information. This paper introduces a meta-ontology for managing the names and taxonomies of organisms, and presents three applications for it: 1) publishing biological species lists as ontology services (ca. 20 taxonomies including more than 80,000 names), 2) collaborative management of the vernacular names of vascular plants (ca. 26,000 taxa), and 3) management of individual scientific name changes based on research results, covering a group of beetles. The applications are based on the databases of the Finnish Museum of Natural History and are used in a living lab environment on the web.

Jouni Tuominen, Nina Laurenne, Eero Hyvönen
An Approach for More Efficient Energy Consumption Based on Real-Time Situational Awareness

In this paper we present a novel approach for achieving energy efficiency in public buildings (especially sensor-enabled offices) based on the application of intelligent complex event processing and semantic technologies. At the core of the approach is an efficient method for realizing real-time situational awareness, which helps in recognizing situations where more efficient energy consumption is possible and in reacting to those opportunities promptly. Semantics allows a proper contextualization of the sensor data (i.e. its abstract interpretation), whereas complex event processing enables the efficient real-time processing of sensor data, and its logic-based nature supports a declarative definition of the situations of interest. The approach has been implemented in the iCEP framework for intelligent Complex Event Reasoning. The results from a preliminary evaluation study are very promising: the approach enables a very precise real-time detection of the office occupancy situations that limit the operation of the lighting system based on the actual use of the space.

Yongchun Xu, Nenad Stojanovic, Ljiljana Stojanovic, Darko Anicic, Rudi Studer

Sensor Web Track

Ontology-Driven Complex Event Processing in Heterogeneous Sensor Networks

Modern scientific applications of sensor networks are driving the development of technologies to make heterogeneous sensor networks easier to deploy, program and use in multiple application contexts. One key requirement, addressed by this work, is the need for methods to detect events in real time that arise from complex correlations of measurements made by independent sensing devices. Because the mapping of such complex events to direct sensor measurements may be poorly understood, such methods must support experimental and frequent specification of the events of interest. This means that the event specification method must be embedded in the problem domain of the end-user, must support the user to discover observable properties of interest, and must provide automatic and efficient enaction of the specification.

This paper proposes the use of ontologies to specify and recognise complex events that arise as selections and correlations (including temporal correlations) of structured digital messages, typically streamed from multiple sensor networks. Ontologies are used as a basis for the definition of contextualised complex events of interest which are translated to selections and temporal combinations of streamed messages. Supported by description logic reasoning, the event descriptions are translated to the native language of a commercial Complex Event Processor (CEP), and executed under the control of the CEP.

The software is currently deployed for micro-climate monitoring of experimental food crop plants, where precise knowledge and control of growing conditions is needed to map phenotypical traits to the plant genome.

Kerry Taylor, Lucas Leidinger
A Semantically Enabled Service Architecture for Mashups over Streaming and Stored Data

Sensing devices are increasingly being deployed to monitor the physical world around us. One class of application for which sensor data is pertinent is environmental decision support systems, e.g. flood emergency response. However, in order to interpret the readings from the sensors, the data needs to be put in context through correlation with other sensor readings, sensor data histories, and stored data, as well as juxtaposing with maps and forecast models. In this paper we use a flood emergency response planning application to identify requirements for a semantic sensor web. We propose a generic service architecture to satisfy the requirements that uses semantic annotations to support well-informed interactions between the services. We present the SemSor-Grid4Env realisation of the architecture and illustrate its capabilities in the context of the example application.

Alasdair J. G. Gray, Raúl García-Castro, Kostis Kyzirakos, Manos Karpathiotakis, Jean-Paul Calbimonte, Kevin Page, Jason Sadler, Alex Frazer, Ixent Galpin, Alvaro A. A. Fernandes, Norman W. Paton, Oscar Corcho, Manolis Koubarakis, David De Roure, Kirk Martinez, Asunción Gómez-Pérez

Software, Services, Processes and Cloud Computing Track

Zhi# – OWL Aware Compilation

The usefulness of the Web Ontology Language to describe domains of discourse and to facilitate automatic reasoning services has been widely acknowledged. However, the programmability of ontological knowledge bases is severely impaired by the different conceptual bases of statically typed object-oriented programming languages such as Java and C# and ontology languages such as the Web Ontology Language (OWL). In this work, a novel programming language is presented that integrates OWL and XSD data types with C#. The Zhi# programming language is the first solution of its kind to make XSD data types and OWL class descriptions first-class citizens of a widely-used programming language. The Zhi# programming language eases the development of Semantic Web applications and facilitates the use and reuse of knowledge in form of ontologies. The presented approach was successfully validated to reduce the number of possible runtime errors compared to the use of XML and OWL APIs.

Alexander Paar, Denny Vrandečić
Lightweight Semantic Annotation of Geospatial RESTful Services

RESTful services are increasingly gaining traction over WS-* ones. As with WS-* services, their semantic annotation can provide benefits in tasks related to their discovery, composition and mediation. In this paper we present an approach to automate the semantic annotation of RESTful services using a cross-domain ontology like DBpedia, domain ontologies like GeoNames, and additional external resources (suggestion and synonym services). We also present a preliminary evaluation in the geospatial domain that proves the feasibility of our approach in a domain where RESTful services are increasingly appearing and highlights that it is possible to carry out this semantic annotation with satisfactory results.

Víctor Saquicela, Luis M. Vilches-Blazquez, Oscar Corcho
Towards Custom Cloud Services: Using Semantic Technology to Optimize Resource Configuration

In today’s highly dynamic economy, businesses have to adapt quickly to market changes, be they customer-, competition- or regulation-driven. Cloud computing promises to be a solution to the ever changing computing demand of businesses. Current SaaS, PaaS and IaaS services are often found to be too inflexible to meet the diverse customer requirements regarding service composition and Quality-of-Service. We therefore propose an ontology-based optimization framework allowing Cloud providers to find the best-suited resource composition based on an abstract request for a custom service. Our contribution is three-fold. First, we describe an OWL/SWRL based ontology framework for describing resources (hard- and software) along with their dependencies, interoperability constraints and meta information. Second, we provide an algorithm that makes use of some reasoning queries to derive a graph over all feasible resource compositions based on the abstract request. Third, we show how the graph can be transformed into an integer program, which allows the optimal solution to be found from a profit maximizing perspective.

Steffen Haak, Stephan Grimm

Social Web and Web Science Track

One Tag to Bind Them All: Measuring Term Abstractness in Social Metadata

Recent research has demonstrated how the widespread adoption of collaborative tagging systems yields emergent semantics. In recent years, much has been learned about how to harvest the data produced by taggers for engineering light-weight ontologies. For example, existing measures of tag similarity and tag relatedness have proven crucial stepping stones for making latent semantic relations in tagging systems explicit. However, little progress has been made on other issues, such as understanding the different levels of tag generality (or tag abstractness), which is essential for, among others, identifying hierarchical relationships between concepts. In this paper we aim to address this gap. Starting from a review of linguistic definitions of word abstractness, we first use several large-scale ontologies and taxonomies as grounded measures of word generality, including Yago, Wordnet, DMOZ and WikiTaxonomy. Then, we introduce and apply several folksonomy-based methods to measure the level of generality of given tags. We evaluate these methods by comparing them with the grounded measures. Our results suggest that the generality of tags in social tagging systems can be approximated with simple measures. Our work has implications for a number of problems related to social tagging systems, including search, tag recommendation, and the acquisition of light-weight ontologies from tagging data.

Dominik Benz, Christian Körner, Andreas Hotho, Gerd Stumme, Markus Strohmaier
Semantic Enrichment of Twitter Posts for User Profile Construction on the Social Web

Twitter is the most popular microblogging platform, and the vast amount of content on it is constantly growing, so that retrieving relevant information (streams) becomes more difficult every day. Representing the semantics of individual Twitter activities and modeling the interests of Twitter users would allow for personalization and thereby counteract the information overload. Given the variety and recency of topics people discuss on Twitter, semantic user profiles generated from Twitter posts also promise to be beneficial for other applications on the Social Web. However, automatically inferring the semantic meaning of Twitter posts is a non-trivial problem.

In this paper we investigate semantic user modeling based on Twitter posts. We introduce and analyze methods for linking Twitter posts with related news articles in order to contextualize Twitter activities. We then propose and compare strategies that exploit the semantics extracted from both tweets and related news articles to represent individual Twitter activities in a semantically meaningful way. A large-scale evaluation validates the benefits of our approach and shows that our methods relate tweets to news articles with high precision and coverage, clearly enrich the semantics of tweets, and have a strong impact on the construction of semantic user profiles for the Social Web.

Fabian Abel, Qi Gao, Geert-Jan Houben, Ke Tao
Improving Categorisation in Social Media Using Hyperlinks to Structured Data Sources

Social media presents unique challenges for topic classification, including the brevity of posts, the informal nature of conversations, and the frequent reliance on external hyperlinks to give context to a conversation. In this paper we investigate the usefulness of these external hyperlinks for categorising the topic of individual posts. We focus our analysis on objects that have related metadata available on the Web, either via APIs or as Linked Data. Our experiments show that the inclusion of metadata from hyperlinked objects in addition to the original post content significantly improved classifier performance on two disparate datasets. We found that including selected metadata from APIs and Linked Data gave better results than including text from HTML pages. We investigate how this improvement varies across different topics. We also make use of the structure of the data to compare the usefulness of different types of external metadata for topic classification in a social media dataset.

Sheila Kinsella, Mengjiao Wang, John G. Breslin, Conor Hayes
Predicting Discussions on the Social Semantic Web

Social Web platforms are quickly becoming the natural place for people to engage in discussing current events, topics, and policies. Analysing such discussions is of high value to analysts who are interested in assessing up-to-the-minute public opinion, consensus, and trends. However, we have a limited understanding of how content and user features can influence the amount of response that posts (e.g., Twitter messages) receive, and how this can impact the growth of discussion threads. Understanding these dynamics can help users to issue better posts, and enable analysts to make timely predictions on which discussion threads will evolve into active ones and which are likely to wither too quickly. In this paper we present an approach for predicting discussions on the Social Web, by (a) identifying seed posts, then (b) making predictions on the level of discussion that such posts will generate. We explore the use of post-content and user features and their subsequent effects on predictions. Our experiments produced an optimum F1 score of 0.848 for identifying seed posts, and an average measure of 0.673 for Normalised Discounted Cumulative Gain when predicting discussion levels.
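For readers unfamiliar with the two evaluation measures named in this abstract, the following minimal Python sketch shows how an F1 score and a Normalised Discounted Cumulative Gain (NDCG) value are commonly computed. It is an illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, and the DCG variant used here (linear gain with a log2 rank discount) is one common choice, since the abstract does not say which variant the paper uses.

```python
import math


def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall for a binary classifier."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def dcg(relevances):
    """Discounted cumulative gain with linear gain and log2 discount (assumed variant)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances_in_predicted_order, k=None):
    """DCG of the predicted ranking divided by the DCG of the ideal ranking."""
    ranked = relevances_in_predicted_order[:k]
    ideal = sorted(relevances_in_predicted_order, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical numbers: 8 seed posts found, 2 false alarms, 2 missed.
seed_post_f1 = f1_score(8, 2, 2)

# Hypothetical discussion levels (e.g. reply counts) listed in predicted order.
perfect_ranking = ndcg([3, 2, 1])
reversed_ranking = ndcg([1, 2, 3])
```

An NDCG of 1.0 means the predicted order matches the ideal order; any misordering of posts with different discussion levels lowers the value.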

Matthew Rowe, Sofia Angeletou, Harith Alani
Mining for Reengineering: An Application to Semantic Wikis Using Formal and Relational Concept Analysis

Semantic wikis enable collaboration between human agents for creating knowledge systems. The data embedded in semantic wikis can be mined, and the resulting knowledge patterns can be reused to extend and improve the structure of wikis. This paper proposes a method for guiding the reengineering and improving the structure of a semantic wiki. This method suggests the creation of categories and relations between categories using Formal Concept Analysis (FCA) and Relational Concept Analysis (RCA). FCA allows the design of a concept lattice while RCA provides relational attributes completing the content of formal concepts. The originality of the approach is to consider the wiki content from the FCA and RCA points of view and to extract knowledge units from this content, allowing a factorization and a reengineering of the wiki structure. This method is general, does not depend on any domain, and can be generalized to every kind of semantic wiki. Examples are studied throughout the paper, and experiments show substantial results.
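To make the FCA step mentioned above concrete, here is a minimal Python sketch that enumerates the formal concepts of a tiny formal context. It is only an illustration: the wiki pages and property names are invented, the function names are hypothetical, and the brute-force enumeration over all object subsets is a textbook construction, not the algorithm used in the paper. A formal concept is a pair (extent, intent) in which the extent is exactly the set of objects sharing all attributes of the intent, and vice versa.

```python
from itertools import combinations


def common_attributes(objects, context):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared


def common_objects(attributes, context):
    """Objects that carry every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of objects.

    Exponential in the number of objects, so suitable for toy contexts only.
    """
    concepts = set()
    objects = list(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context)
            extent = common_objects(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


# Hypothetical context: wiki pages (objects) and the properties they use (attributes).
example_context = {
    "Paris": {"hasCountry", "hasPopulation"},
    "Berlin": {"hasCountry", "hasPopulation"},
    "France": {"hasCapital", "hasPopulation"},
}
concepts = formal_concepts(example_context)
```

In this toy context, the concept grouping Paris and Berlin under the shared properties hasCountry and hasPopulation is the kind of unit that could suggest a new wiki category.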

Lian Shi, Yannick Toussaint, Amedeo Napoli, Alexandre Blansché

Demo Track

SmartLink: A Web-Based Editor and Search Environment for Linked Services

Despite considerable research dedicated to Semantic Web Services (SWS), structured semantics are still not used significantly to annotate Web services and APIs. This is due to the complexity of comprehensive SWS models and has led to the emergence of a new approach dubbed Linked Services. Linked Services adopt Linked Data principles to produce simplified, RDF-based service descriptions that are easier to create and interpret. However, current Linked Services editors assume the existence of service documentation in the form of HTML or WSDL files. Therefore, we introduce SmartLink, a Web-based editor and search environment for Linked Services. Based on an easy-to-use Web form and a RESTful API, SmartLink allows both humans and machines to produce light-weight service descriptions from scratch.

Stefan Dietze, Hong Qing Yu, Carlos Pedrinaci, Dong Liu, John Domingue
ViziQuer: A Tool to Explore and Query SPARQL Endpoints

The presented tool uses a novel approach to explore and query a SPARQL endpoint. The tool is simple to use, as the user only needs to enter the address of a SPARQL endpoint of interest. The tool then extracts and graphically visualizes the data schema of the endpoint. The user can get an overview of the data schema and use it to construct a SPARQL query that conforms to it. The tool can be downloaded from http://viziquer.lumii.lv, where additional information and help on how to use it in practice are also available.

Martins Zviedris, Guntis Barzdins
EasyApp: Goal-Driven Service Flow Generator with Semantic Web Service Technologies

EasyApp is a goal-driven service flow generator based on semantic web service annotation and discovery technologies. The purpose of EasyApp is to provide an application creation environment in which software programmers can build new applications semi-automatically through the semantic composition of web services on the Web. In this demo, we introduce the key technologies of EasyApp that overcome the problems of previous work on semantic web service technologies. A demonstration of the ‘hiring process’ use case shows that EasyApp helps software developers easily create a service flow with its key technologies: ontology-based goal analysis, semantic service annotation, semantic service discovery, and goal-driven service flow generation.

Yoo-mi Park, Yuchul Jung, HyunKyung Yoo, Hyunjoo Bae, Hwa-Sung Kim
Who’s Who – A Linked Data Visualisation Tool for Mobile Environments

Reduced size in hand-held devices imposes significant usability and visualisation challenges. Semantic adaptation to specific usage contexts is a key feature for overcoming usability and display limitations on mobile devices. We demonstrate a novel application which: (i) links the physical world with the semantic web, facilitating context-based information access, (ii) enhances the processing of semantically enriched, linked data on mobile devices, (iii) provides an intuitive interface for mobile devices, reducing information overload.

A. Elizabeth Cano, Aba-Sah Dadzie, Melanie Hartmann
OntosFeeder – A Versatile Semantic Context Provider for Web Content Authoring

The amount of structured information available on the Web as Linked Data has reached a respectable size. However, the question arises of how this information can be operationalised in order to boost productivity. A clear improvement over keyword-based document retrieval, as well as over the manual aggregation and compilation of facts, is the provision of contextual information in an integrated fashion. In this demo, we present the Ontos Feeder, a system serving as a context information provider that can be integrated into Content Management Systems in order to support authors by supplying additional information on the fly. During the creation of text, relevant entities are highlighted and contextually disambiguated; facts from trusted sources such as DBpedia or Freebase are shown to the author. Productivity is increased because the author does not have to leave her working environment to research facts, and thus media breaks are avoided. Additionally, the author can choose to annotate the created content with RDFa or Microformats, thus making it “semantic-ready” for indexing by the new generation of search engines. The presented system is available as Open Source and was adapted for WordPress and Drupal.

Alex Klebeck, Sebastian Hellmann, Christian Ehrlich, Sören Auer
wayOU – Linked Data-Based Social Location Tracking in a Large, Distributed Organisation

While the publication of linked open data has gained momentum in large organisations, the way for users of these organisations to engage with these data is still unclear. Here, we demonstrate a mobile application called wayOU (where are you at the Open University), which relies on the data published by The Open University (under data.open.ac.uk) to provide social, location-based services to its students and members of staff. An interesting aspect of this application is that it not only consumes linked data produced by the University from various repositories, but also contributes to it by creating new connections between people, places and other types of resources.

Mathieu d’Aquin, Fouad Zablith, Enrico Motta
SeaFish: A Game for Collaborative and Visual Image Annotation and Interlinking

Many tasks in semantic content creation, from building and aligning vocabularies to annotation or data interlinking, still require human intervention. Even though automatic methods addressing the aforementioned challenges have reached a certain level of maturity, user input is still required at many ends of these processes. The idea of human computation is to rely on the human user for problems that are impossible to solve for computers. However, users need clear incentives in order to dedicate their time and manual labor to tasks. The OntoGame series uses games to hide abstract tasks behind entertaining user interfaces and gaming experiences in order to collect knowledge. SeaFish is a game for collaborative image annotation and interlinking without text. In this latest release of the OntoGame series, players have to select images that are related to a concept that is represented by an image (from DBpedia) from a collection of images (produced by querying flickr™ wrappr with the respective concept). The data collected by SeaFish is published as Linked Data on the Web. In this paper we outline the SeaFish game and demo.

Stefan Thaler, Katharina Siorpaes, David Mear, Elena Simperl, Carl Goodman
The Planetary System: Executable Science, Technology, Engineering and Math Papers

Executable scientific papers contain more than laid-out text for reading. They contain, or link to, machine-comprehensible representations of the scientific findings or experiments they describe. Client-side players can thus enable readers to “check, manipulate and explore the result space” [1]. We have realized executable papers in the STEM domain with the Planetary system. Semantic annotations associate the papers with a content commons holding the background ontology, the annotations are exposed as Linked Data, and a frontend player application hooks modular interactive services into the semantic annotations.

Christoph Lange, Michael Kohlhase, Catalin David, Deyan Ginev, Andrea Kohlhase, Bogdan Matican, Stefan Mirea, Vyacheslav Zholudev
Semantic Annotation of Images on Flickr

In this paper we introduce an application that gives its users explicit control over the meaning of the tags they use when uploading photos to Flickr. The application provides users with an improved interface with which they can add concepts to photos instead of simple free-text tags. They can thus directly provide semantic tags for their photos, which can then be used to improve services such as search.

Pierre Andrews, Sergey Kanshin, Juan Pane, Ilya Zaihrayeu
FedX: A Federation Layer for Distributed Query Processing on Linked Open Data

Driven by the success of the Linked Open Data initiative, today’s Semantic Web is best characterized as a Web of interlinked datasets. Hand in hand with this structure, new challenges for query processing are arising. In particular, queries for which more than one data source can contribute results require advanced optimization and evaluation approaches, the major challenge lying in the nature of distribution: heterogeneous data sources have to be integrated into a federation so that they globally appear as a single repository. On the query level, though, techniques have to be developed to meet the requirements of efficient query computation in the distributed setting. We present FedX, a project which extends the Sesame Framework with a federation layer that enables efficient query processing on distributed Linked Open Data sources. We discuss key insights into its architecture and summarize our optimization techniques for the federated setting. The practicability of our system will be demonstrated in various scenarios using the Information Workbench.

Andreas Schwarte, Peter Haase, Katja Hose, Ralf Schenkel, Michael Schmidt

PhD Symposium

Reasoning in Expressive Extensions of the RDF Semantics

The research proposed here deals with reasoning in expressive semantic extensions of the RDF Semantics specification, up to the level of OWL 2 Full. The work aims to conduct an in-depth study of the distinctive features and the degree of implementability of OWL Full reasoning. This paper describes the core problem, presents the proposed approach, reports on initial results, and lists planned future tasks.

Michael Schneider
Personal Semantics: Personal Information Management in the Web with Semantic Technologies

Every web user has several online profiles through which personal information is exchanged with many service providers. This exchange of personal information happens at a pace that is difficult to fully comprehend and manage without a global view and control, with obvious consequences for data control, ownership and, of course, privacy. To tackle issues associated with current service-centric approaches, we propose a user-centric architecture where the interaction between a user and other agents is managed based on a global profile for the user, maintained in a profile management system and controlled by the user herself. In this PhD, we will investigate research issues and challenges in realizing such a system based on semantic technologies.

Salman Elahi
Reasoning with Noisy Semantic Data

Based on URIs, HTTP and RDF, the Linked Data project [3] aims to expose, share and connect related data from diverse sources on the Semantic Web. Linked Open Data (LOD) is a community effort to apply the Linked Data principles to data published under open licenses. With this effort, a large number of LOD datasets have been gathered in the LOD cloud, such as DBpedia, Freebase and FOAF profiles. These datasets are connected by links such as owl:sameAs. LOD has progressed rapidly and is still growing constantly. By May 2009, there were 4.7 billion RDF triples and around 142 million RDF links [3]. The total then increased to 16 billion triples in March 2010, and another 14 billion triples have been published by the AIFB according to [17].

Qiu Ji, Zhiqiang Gao, Zhisheng Huang
Extracting and Modeling Historical Events to Enhance Searching and Browsing of Digital Cultural Heritage Collections

Currently, cultural heritage portals limit their users to searching only for individual objects and not for objects related to some historical narrative. Typically, most museums select objects for an exhibition based on the story they want to tell the public, but in digital collections this context currently cannot be made explicit, as the historical context is not part of the object annotations.

Roxane Segers
Enriching Ontologies by Learned Negation
Or How to Teach Ontologies Vegetarianism

Ontologies form the basis of the semantic web by providing knowledge on concepts, relations and instances. Unfortunately, the manual creation of ontologies is a time-intensive and hence expensive task. This leads to the so-called knowledge acquisition bottleneck, which is a major obstacle to a more widespread adoption of the semantic web. Ontology learning tries to widen the bottleneck by supporting human knowledge engineers in creating ontologies. For this purpose, knowledge is extracted from existing data sources and is transformed into ontologies. So far, most ontology learning approaches are limited to very basic types of ontologies consisting of concept hierarchies and relations, and leave much of the expressivity that ontologies provide unused.

Daniel Fleischhacker
Optimizing Query Answering over OWL Ontologies

Query answering is a key reasoning task for many ontology-based applications in the Semantic Web. Unfortunately for OWL, the worst-case complexity of query answering is very high. That is why, when the schema of an ontology is written in a highly expressive language like OWL 2 DL, currently used query answering systems do not find all answers to queries posed over the ontology, i.e., they are incomplete. In this paper, optimizations are discussed that may make query answering over expressive languages feasible in practice. These optimizations mostly focus on the use of traditional database techniques that will be adapted to be applicable to knowledge bases. Moreover, caching techniques and a form of progressive query answering are also explored.

Ilianna Kollia
Hybrid Search Ranking for Structured and Unstructured Data

A growing amount of structured data is published on the Web and complements the textual content. Searching the textual content is performed primarily by means of keyword queries and Information Retrieval methods. Structured data allow database-like queries for retrieval. Since structured and unstructured data often occur in combination, are embedded in each other, or are complementary, the question arises of how search can take advantage of this hybrid data setting. Of particular interest is the question of how ranking, as the algorithmic decision of what information is relevant for a given query, can take structured and unstructured data into account by also allowing hybrid queries consisting of structured elements combined with keywords. I propose to investigate this question in the course of my PhD thesis.

Daniel M. Herzig
Backmatter
Metadata
Title
The Semantic Web: Research and Applications
Edited by
Grigoris Antoniou
Marko Grobelnik
Elena Simperl
Bijan Parsia
Dimitris Plexousakis
Pieter De Leenheer
Jeff Pan
Copyright year
2011
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-21064-8
Print ISBN
978-3-642-21063-1
DOI
https://doi.org/10.1007/978-3-642-21064-8