About this book

This book constitutes the thoroughly refereed post-conference proceedings of the 4th edition of the Semantic Web Evaluation Challenge, SemWebEval 2018, co-located with the 15th Extended Semantic Web Conference, held in Heraklion, Greece, in June 2018.
This book includes the descriptions of all methods and tools that competed at SemWebEval 2018, together with a detailed description of the tasks, evaluation procedures and datasets. The 18 revised full papers presented in this volume were carefully reviewed and selected from 24 submissions. The contributions are grouped into the following areas: the mighty storage challenge; open knowledge extraction challenge; question answering over linked data challenge; semantic sentiment analysis.

Table of Contents

Frontmatter

The Mighty Storage Challenge

Frontmatter

MOCHA2018: The Mighty Storage Challenge at ESWC 2018

Abstract
The aim of the Mighty Storage Challenge (MOCHA) at ESWC 2018 was to test the performance of solutions for SPARQL processing in aspects that are relevant for modern applications. These include ingesting data, answering queries on large datasets and serving as a backend for applications driven by Linked Data. The challenge tested the systems against data derived from real applications and with realistic loads. An emphasis was put on dealing with data in the form of streams or updates.
Kleanthi Georgala, Mirko Spasić, Milos Jovanovik, Vassilis Papakonstantinou, Claus Stadler, Michael Röder, Axel-Cyrille Ngonga Ngomo

Versioned Querying with OSTRICH and Comunica in MOCHA 2018

Abstract
In order to exploit the value of historical information in Linked Datasets, we need to be able to store and query different versions of such datasets efficiently. The 2018 edition of the Mighty Storage Challenge (MOCHA) is organized to discover the efficiency of such Linked Data stores and to detect their bottlenecks. One task in this challenge focuses on the storage and querying of versioned datasets, in which we participated by combining the OSTRICH triple store and the Comunica SPARQL engine. In this article, we briefly introduce our system for the versioning task of this challenge. We present the evaluation results, which show that our system achieves fast query times for the supported queries, but that not all queries are supported by Comunica at the time of writing. The results of this challenge will serve as a guideline for further improvements to our system.
Ruben Taelman, Miel Vander Sande, Ruben Verborgh

Benchmarking Virtuoso 8 at the Mighty Storage Challenge 2018: Challenge Results

Abstract
Following the success of Virtuoso at last year’s Mighty Storage Challenge (MOCHA 2017), we decided to participate once again and test the latest Virtuoso version against the new tasks which comprise the MOCHA 2018 challenge. The aim of the challenge is to test the performance of solutions for SPARQL processing in aspects relevant for modern applications: ingesting data, answering queries on large datasets and serving as a backend for applications driven by Linked Data. The challenge tests the systems against data derived from real applications and with realistic loads, with an emphasis on dealing with changing data in the form of streams or updates. Virtuoso, by OpenLink Software, is a modern enterprise-grade solution for data access, integration, and relational database management, which provides a scalable RDF Quad Store. In this paper, we present the final challenge results from MOCHA 2018 for Virtuoso v8.0, compared to the other participating systems. Based on these results, Virtuoso v8.0 was declared the overall winner of MOCHA 2018.
Milos Jovanovik, Mirko Spasić

Open Knowledge Extraction Challenge

Frontmatter

Open Knowledge Extraction Challenge 2018

Abstract
The fourth edition of the Open Knowledge Extraction Challenge took place at the 15th Extended Semantic Web Conference in 2018. The aim of the challenge was to bring together researchers and practitioners from academia as well as industry to compete in advancing the state of the art in knowledge extraction from text for the Semantic Web. This year, the challenge reused two tasks from the former challenge and defined two new tasks. Thus, the challenge consisted of tasks such as Named Entity Identification, Named Entity Disambiguation and Linking as well as Relation Extraction. To ensure an objective evaluation of the performance of participating systems, the challenge ran on a version of the FAIR benchmarking platform Gerbil integrated in the HOBBIT platform. The performance was measured on manually curated gold standard datasets with Precision, Recall, F1-measure and the runtime of participating systems.
René Speck, Michael Röder, Felix Conrads, Hyndavi Rebba, Catherine Camilla Romiyo, Gurudevi Salakki, Rutuja Suryawanshi, Danish Ahmed, Nikit Srivastava, Mohit Mahajan, Axel-Cyrille Ngonga Ngomo

Relation Extraction for Knowledge Base Completion: A Supervised Approach

Abstract
This paper outlines our approach to the extraction of predefined relations from unstructured data (OKE Challenge 2018: Task 3). Our solution uses a deep learning classifier that receives raw sentences and a pair of entities as input. Expert rules are then applied to the output of the classifier to delete known erroneous relations. To train the system, we gathered data by aligning DBpedia relations and Wikipedia pages. This process was mainly automatic, with some filters applied under human supervision to refine the training records. The final results show that combining a powerful classifier model with expert knowledge has a beneficial effect on the final performance of the system.
Héctor Cerezo-Costas, Manuela Martín-Vicente

The Scalable Question Answering Over Linked Data Challenge

Frontmatter

The Scalable Question Answering Over Linked Data (SQA) Challenge 2018

Abstract
Question answering (QA) systems, which source answers to natural language questions from Semantic Web data, have recently shifted from the research realm to become commercially viable products. Increasing investments have refined an interaction paradigm that allows end users to profit from the expressive power of Semantic Web standards, while at the same time hiding their complexity behind intuitive and easy-to-use interfaces. Not surprisingly, after the first excitement we did not witness a cooling-down phenomenon: regular interactions with question answering systems have become more and more natural. As consumers' expectations around the capabilities of systems able to answer questions formulated in natural language keep growing, so does the availability of such systems in various settings, devices and languages. Increasing usage in real (non-experimental) settings has boosted the demand for resilient systems, which can cope with high-volume demand.
Giulio Napolitano, Ricardo Usbeck, Axel-Cyrille Ngonga Ngomo

On the scalability of the QA System WDAqua-core1

Abstract
Scalability is an important problem for Question Answering (QA) systems over Knowledge Bases (KBs). Current KBs easily contain hundreds of millions of triples and all these triples can potentially contain the information requested by the user.
In this publication, we describe how the QA system WDAqua-core1 deals with the scalability issue. Moreover, we compare the scalability of WDAqua-core1 with existing approaches.
Dennis Diefenbach, Kamal Singh, Pierre Maret

GQA: Grammatical Question Answering for RDF Data

Abstract
Nowadays, we observe a rapid increase in the volume of RDF knowledge bases (KBs) and a need for functionalities that will help users access them in natural language without knowing the features of the KBs and structured query languages, such as SPARQL. This paper introduces Grammatical Question Answering (GQA), a system for answering questions in the English language over DBpedia, which involves parsing of questions by means of Grammatical Framework and further analysis of grammar components. We built an abstract conceptual grammar and a concrete English grammar, so that the system can handle complex syntactic constructions that are the focus of the SQA2018 challenge. The parses are further analysed and transformed into SPARQL queries that can be used to retrieve the answers for the users’ questions.
Elizaveta Zimina, Jyrki Nummenmaa, Kalervo Järvelin, Jaakko Peltonen, Kostas Stefanidis, Heikki Hyyrö

A Language Adaptive Method for Question Answering on French and English

Abstract
The LAMA (Language Adaptive Method for question Answering) system focuses on answering natural language questions using an RDF knowledge base within a reasonable time. Originally designed to process queries written in French, the system has been redesigned to also function on the English language. Overall, we propose a set of lexico-syntactic patterns for entity and property extraction to create a semantic representation of natural language requests. This semantic representation is then used to generate SPARQL queries able to answer users’ requests. The paper also describes a method for decomposing complex queries into a series of simpler queries. The use of preprocessed data and parallelization methods helps improve individual answer times.
Nikolay Radoev, Amal Zouaq, Mathieu Tremblay, Michel Gagnon

Semantic Sentiment Analysis Challenge

Frontmatter

Semantic Sentiment Analysis Challenge at ESWC2018

Abstract
Sentiment Analysis is a widely studied field in both research and industry, and there are different approaches for addressing sentiment analysis related tasks. Sentiment Analysis engines implement approaches ranging from lexicon-based techniques to machine learning and syntactical rule analysis. Such systems are already evaluated in international research challenges. However, Semantic Sentiment Analysis approaches, which take into account or also rely on large semantic knowledge bases and implement Semantic Web best practices, are not specifically evaluated or compared in other international challenges. Such approaches may potentially deliver higher performance, since they are also able to analyze the implicit, semantic features associated with natural language concepts. In this paper, we present the fifth edition of the Semantic Sentiment Analysis Challenge, in which systems implementing or relying on semantic features are evaluated in a competition involving large test sets, and on different sentiment tasks. Systems based merely on syntax/word-count or just lexicon-based approaches have been excluded from the evaluation. Then, we present the results of the evaluation for each task.
Mauro Dragoni, Erik Cambria

Domain-Aware Sentiment Classification with GRUs and CNNs

Abstract
In this paper, we describe a deep neural network architecture for the domain-aware sentiment classification task, aimed at classifying the sentiment of product reviews in different domains and at evaluating nine pre-trained embeddings provided by the semantic sentiment classification challenge at the 15th Extended Semantic Web Conference. The proposed approach feeds the domain together with the sequence of word embeddings of the summary or text of each review into Gated Recurrent Units (GRUs), which produce a corresponding sequence of embeddings that is aware of the domain and the previous words. Afterwards, it extracts local features using Convolutional Neural Networks (CNNs) from the output of the GRU layer. The two sets of local features extracted from the domain-aware summary and text of a review are concatenated into a single vector, which is used for classifying the sentiment of the review. Our approach obtained an F1-score of 0.9643 on the test set and achieved 1st place in the first task of the Semantic Sentiment Analysis Challenge at the 15th Extended Semantic Web Conference.
Guangyuan Piao, John G. Breslin

Fine-Tuning of Word Embeddings for Semantic Sentiment Analysis

Abstract
In this paper, we present a state-of-the-art deep-learning approach for sentiment polarity classification. Our approach is based on a 2-layer bidirectional Long Short-Term Memory network, equipped with a neural attention mechanism to detect the most informative words in a natural language text. We test different pre-trained word embeddings, initially keeping these features frozen during the first epochs of the training process. Next, we allow the neural network to perform a fine-tuning of the word embeddings for the sentiment polarity classification task. This allows projecting the pre-trained embeddings in a new space which takes into account information about the polarity of each word, thereby being more suitable for semantic sentiment analysis. Experimental results are promising and show that the fine-tuning of the embeddings with a neural attention mechanism allows boosting the performance of the classifier.
Mattia Atzeni, Diego Reforgiato Recupero

The KABSA System at ESWC-2018 Challenge on Semantic Sentiment Analysis

Abstract
In the last decade, the focus of the Opinion Mining field has moved to the detection of “aspect-polarity” pairs instead of limiting approaches to the computation of the general polarity of a text. In this work, we propose an aspect-based opinion mining system based on the use of semantic resources for the extraction of the aspects from a text and for the computation of their polarities. The proposed system participated in the third edition of the Semantic Sentiment Analysis (SSA) challenge, which took place during ESWC 2018, achieving the runner-up place in Task #2 concerning aspect-based sentiment analysis. Moreover, a further evaluation performed on the SemEval 2015 benchmarks demonstrated the feasibility of the proposed approach.
Marco Federici, Mauro Dragoni

The IRMUDOSA System at ESWC-2018 Challenge on Semantic Sentiment Analysis

Abstract
Multi-domain opinion mining consists in estimating the polarity of a document by exploiting domain-specific information. One of the main issues of the approaches discussed in the literature is their poor capability of being applied to domains that have not been used for building the opinion model. In this paper, we present an approach exploiting the linguistic overlap between domains for building models enabling the estimation of polarities for documents belonging to any other domain. The system implementing such an approach has been presented at the third edition of the Semantic Sentiment Analysis Challenge co-located with ESWC 2018. A fuzzy representation of feature polarity supports the modeling of the information uncertainty learned from the training set and is integrated with knowledge extracted from two well-known resources used in the opinion mining field, namely Sentic.Net and the General Inquirer. The proposed technique has been validated on a multi-domain dataset and the results demonstrated the effectiveness of the proposed approach by setting a plausible starting point for future work.
Giulio Petrucci, Mauro Dragoni

The CLAUSY System at ESWC-2018 Challenge on Semantic Sentiment Analysis

Abstract
On different social media and commercial platforms, users express their opinion about products in textual form. Automatically extracting the polarity (i.e., whether the opinion is positive or negative) of a user's opinion can be useful for both actors: the online platform, which can incorporate the feedback to improve its product, as well as the client, who might get recommendations according to his or her preferences. Different approaches for tackling the problem have been suggested, mainly using syntactic features. The “Challenge on Semantic Sentiment Analysis” aims to go beyond word-level analysis by using semantic information. In this paper we propose a novel approach that employs the semantic information of a grammatical unit called a proposition. We try to derive the target of the review from the summary information, which serves as an input to identify the proposition in it. Our implementation relies on the hypothesis that the proposition expressing the target of the summary usually contains the main polarity information.
Andi Rexha, Mark Kröll, Mauro Dragoni, Roman Kern

The NeuroSent System at ESWC-2018 Challenge on Semantic Sentiment Analysis

Abstract
Multi-domain sentiment analysis consists in estimating the polarity of a given text by exploiting domain-specific information. One of the main issues common to the approaches discussed in the literature is their poor capability of being applied to domains which are different from those used for building the opinion model. In this paper, we present an approach exploiting the linguistic overlap between domains to build sentiment models supporting polarity inference for documents belonging to every domain. Word embeddings together with a deep learning architecture have been implemented to enable the building of a multi-domain sentiment model. The proposed technique is validated by following the Dranziera protocol in order to ease the repeatability of the experiments and the comparison of the results. The outcomes demonstrate the effectiveness of the proposed approach and also set a plausible starting point for future work.
Mauro Dragoni

The FeatureSent System at ESWC-2018 Challenge on Semantic Sentiment Analysis

Abstract
The approach described in this paper explores the use of a semantic structured representation of sentences extracted from texts for multi-domain sentiment analysis purposes. The presented algorithm is built upon a domain-based supervised approach using an index-like structure for representing information extracted from text. The algorithm extracts dependency parse relationships from the sentences contained in a training set. Then, such relationships are aggregated in a semantic structure together with both polarity and domain information. Such information is exploited in order to have a more fine-grained representation of the learned sentiment information. When the polarity of a new text has to be computed, the text is converted into the same semantic representation, which is used (i) for detecting the domain to which the text belongs, and then (ii), once the domain is assigned to the text, for extracting the polarity from the index-like structure. First experiments performed by using the Blitzer dataset for training the system demonstrated the feasibility of the proposed approach.
Mauro Dragoni

Evaluating Quality of Word Embeddings with Sentiment Polarity Identification Task

Abstract
Neural word embeddings have been widely used in modern NLP applications as they provide vector representations of words and capture the semantic properties of words and the linguistic relationships between them. Many research groups have released their own versions of word embeddings. However, they are trained on generic corpora, which limits their direct use for domain-specific tasks. In this paper, we evaluate a set of pretrained word embeddings, which were provided to us, on a standard NLP task: the Sentiment Polarity Identification Task.
Vijayasaradhi Indurthi, Subba Reddy Oota

Backmatter
