
2019 | Book

Knowledge Graphs and Semantic Web

First Iberoamerican Conference, KGSWC 2019, Villa Clara, Cuba, June 23-30, 2019, Proceedings


About this book

This book constitutes the thoroughly refereed proceedings of the First Iberoamerican Knowledge Graphs and Semantic Web Conference, KGSWC 2019, held in Villa Clara, Cuba, in June 2019.

The 14 full papers and 1 short paper presented were carefully reviewed and selected from 33 submissions. The papers cover a wide range of research fields, including artificial intelligence; knowledge representation and reasoning; ontology engineering; natural language processing; description logics; information systems; query languages; the World Wide Web; Semantic Web description languages; and information retrieval.

Table of Contents

Frontmatter
A Model for Language Annotations on the Web
Abstract
Several annotation models have been proposed to enable a multilingual Semantic Web. Such models home in on the word and its morphology and assume that the language tag and URI come from external resources. These resources, such as ISO 639 and Glottolog, have limited coverage of the world’s languages and have, at best, a very limited thesaurus-like structure, which hampers language annotation and hence constrains research in Digital Humanities and other fields. To resolve this ‘outsourced’ task of the current models, we developed a model for representing information about languages, the Model for Language Annotation (MoLA), such that basic language information can be recorded consistently and therewith also queried and analyzed. This includes the various types of languages, families, and the relations among them. MoLA is formalized in OWL so that it can integrate with Linguistic Linked Data resources. Sufficient coverage of MoLA is demonstrated with the use case of French.
Frances Gillis-Webber, Sabine Tittel, C. Maria Keet
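
To give a flavour of the kind of OWL-based language record MoLA targets, the sketch below declares French as a language belonging to a family, using rdflib; the mola: namespace and all class and property names are hypothetical placeholders, not the published vocabulary.

    # Hypothetical sketch of a MoLA-style language record in rdflib.
    # The mola: namespace and term names are placeholders, NOT the published vocabulary.
    from rdflib import Graph, Namespace, Literal, RDF, RDFS

    MOLA = Namespace("http://example.org/mola#")   # placeholder namespace
    g = Graph()
    g.bind("mola", MOLA)

    g.add((MOLA.French, RDF.type, MOLA.Language))
    g.add((MOLA.French, RDFS.label, Literal("French", lang="en")))
    g.add((MOLA.Romance, RDF.type, MOLA.LanguageFamily))
    g.add((MOLA.French, MOLA.memberOf, MOLA.Romance))
    g.add((MOLA.French, MOLA.hasISO639Code, Literal("fr")))

    print(g.serialize(format="turtle"))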
A Description Logic for Unifying Different Points of View
Abstract
Multilevel modelling is the conceptual modelling problem of having concepts that can be instances of other concepts. It is a relevant problem for many areas and in particular for ontology design. We motivate our work with a real-world case study in the accounting domain, in which the points of view of expert and operator users are conceptualized as two knowledge levels. In this paper we address theoretical aspects of extending the tableau algorithm for a description logic that enables unifying different user perspectives in multilevel knowledge modelling, following a Henkin semantics.
Paula Severi, Edelweis Rohrer, Regina Motz
OceanGraph: Some Initial Steps Toward an Oceanographic Knowledge Graph
Abstract
Increasing ocean temperatures severely affect marine species and ecosystems. Among other things, rising temperatures cause coral bleaching and the loss of breeding grounds for marine fish and mammals. Motivated by the need to better understand these global problems, researchers from all over the world have generated huge amounts of oceanographic data in recent years. However, most of these data remain isolated in their own silos. One approach to making these silos safely accessible is to map local, often database-specific, identifiers to shared global identifiers. This mapping can then be used to build interoperable knowledge graphs (KGs), where entities such as publications, people, places, specimens, environmental variables and institutions are all part of a single, shared knowledge space. This short paper describes one such effort, the OceanGraph KG, including the modeling and publication processes, and the current and prospective uses of the dataset.
Marcos Zárate, Pablo Rosales, Germán Braun, Mirtha Lewis, Pablo Rubén Fillottrani, Claudio Delrieux
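
A minimal sketch of the identifier-mapping idea described in the abstract, using rdflib: a database-local specimen record is linked to a shared global identifier with owl:sameAs so that it can participate in a wider knowledge graph. All URIs are invented examples, not actual OceanGraph identifiers.

    # Illustrative sketch: link a database-local identifier to a shared global
    # identifier with owl:sameAs.  All URIs below are made-up examples.
    from rdflib import Graph, Namespace, URIRef, Literal, RDF, RDFS
    from rdflib.namespace import OWL

    LOCAL = Namespace("http://example.org/local-db/specimen/")
    g = Graph()
    g.bind("owl", OWL)

    local_specimen = LOCAL["12345"]                       # database-specific identifier
    global_specimen = URIRef("http://example.org/global/specimen/ABCD-0001")

    g.add((local_specimen, RDF.type, URIRef("http://rs.tdwg.org/dwc/terms/Occurrence")))
    g.add((local_specimen, RDFS.label, Literal("Southern right whale sighting")))
    g.add((local_specimen, OWL.sameAs, global_specimen))

    print(g.serialize(format="turtle"))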
Digital Repositories and Linked Data: Lessons Learned and Challenges
Abstract
Digital repositories have been used by universities and libraries to store their bibliographic, scientific, and/or institutional content and to make the corresponding metadata publicly available on the web through the OAI-PMH protocol. However, such metadata is not descriptive enough for a document to be easily discoverable. Although the emergence of Semantic Web technologies has drawn the interest of digital repository providers to publishing and enriching their content using Linked Data (LD) technologies, these institutions have used different generation approaches, and in some cases ad-hoc solutions for particular use cases, but none of them has compared existing approaches to determine which one is the better solution prior to applying it. To address this question, we performed a benchmark study that compares two commonly used generation approaches, and we describe our experience, lessons learned, and challenges encountered while publishing a DSpace digital repository as LD. Results show that the most straightforward method for extracting data from a digital repository is the standard OAI-PMH protocol, whose execution time is much shorter than that of the database approach, while the additional data-cleaning effort is minimal.
Santiago Gonzalez-Toral, Mauricio Espinoza-Mejia, Victor Saquicela
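
To make the OAI-PMH extraction route concrete, here is a minimal harvesting sketch that uses only the standard protocol verbs over HTTP; the endpoint URL is a placeholder, and paging through resumption tokens and error handling are omitted.

    # Minimal OAI-PMH harvesting sketch (ListRecords with Dublin Core metadata).
    # The endpoint URL is a placeholder; a DSpace repository typically exposes
    # a comparable endpoint under <base-url>/oai/request.
    import requests
    import xml.etree.ElementTree as ET

    ENDPOINT = "https://repository.example.org/oai/request"   # placeholder
    NS = {
        "oai": "http://www.openarchives.org/OAI/2.0/",
        "dc": "http://purl.org/dc/elements/1.1/",
    }

    resp = requests.get(ENDPOINT, params={"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    root = ET.fromstring(resp.content)

    for record in root.findall(".//oai:record", NS):
        identifier = record.findtext(".//oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        print(identifier, "->", title)

    # A resumptionToken element, if present, points to the next page of records.
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)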
Author-Topic Classification Based on Semantic Knowledge
Abstract
We propose a novel unsupervised two-phase classification model that leverages Semantic Web technologies to discover common research fields between researchers, based on information available in a bibliographic repository and external resources. The first phase performs coarse-grained classification by knowledge discipline, using the disciplines defined in the UNESCO thesaurus as a reference. The second phase provides a fine-grained classification by means of a clustering approach combined with external resources. The methodology was applied to the REDI (Semantic Repository of Ecuadorian Researchers) project with remarkable results, proving to be a valuable tool for one of REDI’s main goals: discovering Ecuadorian authors who share research interests, in order to foster collaborative research.
José Segarra, Xavier Sumba, José Ortiz, Ronald Gualán, Mauricio Espinoza-Mejia, Víctor Saquicela
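
The two-phase idea can be pictured with a small stand-in sketch: a coarse pass assigns a discipline by keyword overlap with a toy vocabulary (standing in for the UNESCO thesaurus), and a fine-grained pass clusters TF-IDF vectors with k-means; the vocabulary, data and clustering choice are illustrative assumptions, not the paper's actual setup.

    # Illustrative two-phase sketch: (1) coarse discipline assignment by keyword
    # overlap with a toy vocabulary, (2) fine-grained k-means clustering of
    # TF-IDF vectors of publication titles.  The tiny vocabulary is a stand-in
    # for the UNESCO thesaurus.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    disciplines = {
        "Computer Science": {"ontology", "semantic", "software", "data"},
        "Biology": {"species", "genome", "ecosystem", "cell"},
    }
    researcher_docs = {
        "author1": "ontology alignment for semantic data integration",
        "author2": "semantic software repositories and linked data",
        "author3": "genome analysis of marine species",
    }

    # Phase 1: coarse classification by keyword overlap.
    coarse = {}
    for author, doc in researcher_docs.items():
        tokens = set(doc.split())
        coarse[author] = max(disciplines, key=lambda d: len(disciplines[d] & tokens))

    # Phase 2: fine-grained clustering inside one discipline (toy: k = 2).
    cs_authors = [a for a, d in coarse.items() if d == "Computer Science"]
    X = TfidfVectorizer().fit_transform([researcher_docs[a] for a in cs_authors])
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    print(coarse)
    print(dict(zip(cs_authors, labels)))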
A General Process for the Semantic Annotation and Enrichment of Electronic Program Guides
Abstract
Electronic Program Guides (EPGs) are common resources intended to inform the audience about the programming transmitted by TV stations and cable/satellite TV providers. However, they only provide basic metadata about TV programs, while users may want to obtain additional information related to the content they are currently watching. This paper proposes a general process for the semantic annotation and subsequent enrichment of EPGs using external knowledge bases and natural language processing techniques, with the aim of tackling the lack of immediately available information related to TV programs. Additionally, we define an evaluation approach based on a distributed representation of words that enables TV content providers to verify the effectiveness of the system and to run the enrichment process automatically. We test our proposal on a real-world dataset and demonstrate its effectiveness using different knowledge bases, word representation models, and similarity measures. Results show that the DBpedia and Google Knowledge Graph knowledge bases return the most relevant content during the enrichment process, while word2vec and fastText models combined with Word Mover’s Distance as the similarity function can be used to validate the effectiveness of the retrieval task.
Santiago Gonzalez-Toral, Mauricio Espinoza-Mejia, Kenneth Palacio-Baus, Victor Saquicela
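
A small sketch of the validation step mentioned above: word embeddings (trained here on a throwaway corpus instead of the pretrained word2vec/fastText models) and Word Mover's Distance comparing an EPG description with candidate enrichments. It assumes gensim plus a WMD backend (pyemd or POT, depending on the gensim version).

    # Toy sketch of validating an enrichment result with Word Mover's Distance.
    # A real setup would load pretrained word2vec/fastText vectors; a tiny
    # Word2Vec model is trained here only to keep the example self-contained.
    from gensim.models import Word2Vec

    corpus = [
        "documentary about ocean wildlife and marine species".split(),
        "news program about politics and economy".split(),
        "film about whales and ocean ecosystems".split(),
    ]
    model = Word2Vec(corpus, vector_size=50, min_count=1, seed=1)

    epg_description = "documentary about ocean wildlife".split()
    candidate = "film about whales and ocean ecosystems".split()
    unrelated = "news program about politics".split()

    # Lower distance = more similar; the related candidate should score lower.
    print(model.wv.wmdistance(epg_description, candidate))
    print(model.wv.wmdistance(epg_description, unrelated))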
Extraction of RDF Statements from Text
Abstract
The vision of the Semantic Web is to give information a defined meaning so that computers and people can work collaboratively. In this sense, the RDF model provides such a definition by linking and representing resources and descriptions through defined schemes and vocabularies. However, much of the information that could be represented is contained in plain text, which makes it infeasible for humans to annotate large-scale data sources such as the Web. Therefore, this paper presents a strategy for the extraction and representation of RDF statements from text. The idea is to provide an architecture that receives sentences and returns triples whose elements are linked to resources and vocabularies of the Semantic Web. The results demonstrate the feasibility of representing RDF statements extracted from text through an implementation that follows the proposed strategy.
Jose L. Martinez-Rodriguez, Ivan Lopez-Arevalo, Ana B. Rios-Alvarado, Julio Hernandez, Edwin Aldana-Bobadilla
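
To illustrate the kind of sentence-to-triple pipeline such an architecture involves, the sketch below extracts a subject-verb-object pattern with spaCy's dependency parse and emits an rdflib triple; the heuristic and the example namespace are simplifications, and the linking of terms to Semantic Web resources that the paper's strategy performs is not shown.

    # Minimal sentence-to-triple sketch: extract a subject-verb-object pattern
    # with spaCy and emit it as an RDF triple with rdflib.  Requires the
    # en_core_web_sm model (python -m spacy download en_core_web_sm).
    # Linking terms to external vocabularies is not shown.
    import spacy
    from rdflib import Graph, Namespace

    nlp = spacy.load("en_core_web_sm")
    EX = Namespace("http://example.org/")   # placeholder namespace

    def sentence_to_triple(sentence):
        doc = nlp(sentence)
        for token in doc:
            if token.pos_ == "VERB":
                subjects = [c for c in token.children if c.dep_ == "nsubj"]
                objects = [c for c in token.children if c.dep_ in ("dobj", "obj")]
                if subjects and objects:
                    return (EX[subjects[0].lemma_], EX[token.lemma_], EX[objects[0].lemma_])
        return None

    g = Graph()
    triple = sentence_to_triple("Marie Curie discovered polonium")
    if triple:
        g.add(triple)
    print(g.serialize(format="turtle"))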
Meta-Modelling Ontology Design Pattern
Abstract
In the last decades, the meta-modelling problem has received increasing attention in the conceptual modelling and Semantic Web communities. We have proposed a solution to this problem in the context of ontological modelling, which consists of extending a fragment of the Web Ontology Language OWL with a meta-modelling constructor that equates instances to classes. Even though there are methodologies and patterns that help the ontology engineer conceptualize a domain using ontologies, such guides are lacking for the meta-modelling approaches that extend OWL. In this work we introduce a design pattern that guides the conceptualization of domains for which there are requirements at different knowledge levels, in particular for different user perspectives.
Edelweis Rohrer, Paula Severi, Regina Motz
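
The paper's constructor extends OWL itself, but the underlying need, i.e. using the same name as both a class and an instance, can be previewed with OWL 2 punning, a standard mechanism shown here only for comparison; the eagle/species example is a common illustration, not taken from the paper.

    # OWL 2 punning sketch (for comparison only; the paper proposes a dedicated
    # meta-modelling constructor rather than punning).  The IRI ex:Eagle is used
    # both as a class (with an individual member) and as an instance of ex:Species.
    from rdflib import Graph, Namespace, RDF
    from rdflib.namespace import OWL

    EX = Namespace("http://example.org/")
    g = Graph()
    g.bind("owl", OWL)

    g.add((EX.Eagle, RDF.type, OWL.Class))       # Eagle as a class ...
    g.add((EX.harry, RDF.type, EX.Eagle))        # ... with an individual member
    g.add((EX.Species, RDF.type, OWL.Class))
    g.add((EX.Eagle, RDF.type, EX.Species))      # Eagle also as an instance of Species

    print(g.serialize(format="turtle"))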
Automated Large Geographic Ontologies Generation Method from Spatial Databases
Abstract
Ontologies have emerged as an important component of Information Systems and, specifically, of Geographic Information Systems, where they play a key role. However, creating and maintaining geographic ontologies can become exhausting work due to the rapid growth and availability of spatial data, which are most often provided through relational databases. For this reason, there has been increasing interest in recent years in the automatic generation of geographic ontologies from relational databases. This work describes an automatic method to generate a geographic ontology from the spatial data provided by a relational database. The importance and originality of this study lie in its ability to model two main aspects of a spatial database in the generated ontology: (1) the three main types of spatial data (point, line and polygon) are modelled as data properties rather than object properties; (2) four data integrity constraints: First Normal Form, Not Null, Unique and Primary Key. Another contribution of our proposal is its support for generating large ontologies, which are not usually supported by traditional ontology-engineering tools such as Protégé or the OWL API. Finally, some experiments were conducted to show the effectiveness of the proposed method.
Manuel E. Puebla-Martínez, José M. Perea-Ortega, Alfredo Simón-Cuevas, Francisco P. Romero, José A. Olivas Varela
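
As an illustration of modelling geometry as a data property rather than an object property, the sketch below turns a mocked spatial-table row into an OWL individual whose point geometry is a GeoSPARQL WKT literal; the table contents, column names and class IRI are invented, and the handling of integrity constraints described in the paper is omitted.

    # Illustrative sketch: a row from a spatial table (mocked as a dict standing
    # in for a database query result) becomes an individual whose point geometry
    # is a WKT *data property* value, using the GeoSPARQL vocabulary.
    from rdflib import Graph, Namespace, Literal, RDF, RDFS

    GEO = Namespace("http://www.opengis.net/ont/geosparql#")
    EX = Namespace("http://example.org/geo-ontology#")

    row = {"id": 7, "name": "Villa Clara weather station", "geom_wkt": "POINT(-79.97 22.40)"}

    g = Graph()
    g.bind("geo", GEO)
    station = EX[f"Station_{row['id']}"]
    g.add((station, RDF.type, EX.WeatherStation))
    g.add((station, RDFS.label, Literal(row["name"])))
    # Geometry modelled as a data property with a typed WKT literal.
    g.add((station, GEO.asWKT, Literal(row["geom_wkt"], datatype=GEO.wktLiteral)))

    print(g.serialize(format="turtle"))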
Towards the Semantic Enrichment of Existing Online 3D Building Geometry to Publish Linked Building Data
Abstract
Currently, existing online 3D databases each have their own structure according to their own needs. Additionally, the majority of online content has only limited semantics. With the advent of Semantic Web technologies, the opportunity arises to semantically enrich the information in these databases and make it widely accessible and queryable. The goal is to investigate whether online 3D content from different repositories can be processed by a single algorithm to produce the desired semantics. The emphasis of this work is on extracting building components from generic 3D building geometry and publishing them as Linked Building Data.
An interpretation framework is proposed that takes any building mesh as input and outputs its components. More specifically, we use pretrained Support Vector Machines to classify the separate meshes derived from each 3D model. As a preliminary test case, realistic examples from several repositories are processed. The test results show that, even though the building content originates from different sources and was not modeled according to any standards, it can be processed by a single machine learning application. As a result, building geometry in online repositories can be semantically enriched with component information according to classes from Linked Data ontologies such as BOT and PRODUCT. This is an important step towards making the implicit content of geometric models queryable and linkable over the Web.
Maarten Bassier, Mathias Bonduel, Jens Derdaele, Maarten Vergauwen
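
A compact sketch of the classify-then-publish flow: a Support Vector Machine assigns a component label to per-mesh feature vectors, and each labelled mesh is published as an instance of a Linked Building Data class; the three-number feature vectors and the wall/floor labels are made-up placeholders for the geometric features and BOT/PRODUCT terms actually used.

    # Illustrative classify-then-publish sketch: an SVM labels per-mesh feature
    # vectors, and each labelled mesh is published as a bot:Element.  The feature
    # vectors [area, height, verticality] are made-up placeholders.
    from sklearn.svm import SVC
    from rdflib import Graph, Namespace, Literal, RDF, RDFS

    X_train = [[12.0, 2.8, 0.95], [11.5, 2.7, 0.92], [25.0, 0.2, 0.05], [24.0, 0.3, 0.04]]
    y_train = ["wall", "wall", "floor", "floor"]
    clf = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)

    new_meshes = {"mesh_001": [13.1, 2.9, 0.97], "mesh_002": [22.0, 0.25, 0.06]}

    BOT = Namespace("https://w3id.org/bot#")
    EX = Namespace("http://example.org/building/")
    g = Graph()
    g.bind("bot", BOT)

    for mesh_id, features in new_meshes.items():
        label = clf.predict([features])[0]
        element = EX[mesh_id]
        g.add((element, RDF.type, BOT.Element))      # generic BOT building-component class
        g.add((element, RDFS.label, Literal(label)))

    print(g.serialize(format="turtle"))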
A Method for Automatically Generating Schema Diagrams for OWL Ontologies
Abstract
Interest in Semantic Web technologies, including knowledge graphs and ontologies, is increasing rapidly in industry and academia. In order to support ontology engineers and domain experts, it is necessary to provide them with robust tools that facilitate the ontology engineering process. Often, the schema diagram of an ontology is the most important tool for quickly conveying its overall purpose. In this paper, we present a method for programmatically generating a schema diagram from an OWL file. We evaluate its ability to generate schema diagrams similar to manually drawn ones and show that it outperforms VOWL and OWLGrEd. In addition, we provide a prototype implementation of the tool.
Cogan Shimizu, Aaron Eberhart, Nazifa Karima, Quinn Hirt, Adila Krisnadhi, Pascal Hitzler
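
The core of such a generator can be sketched in a few lines: parse the OWL file with rdflib, collect each object property's rdfs:domain/rdfs:range pair as a labelled edge, and emit Graphviz DOT. Real ontologies need far more care (anonymous class expressions, axiom patterns, layout), which is what the paper's method addresses; the file path below is a placeholder.

    # Minimal schema-diagram sketch: turn each object property's rdfs:domain /
    # rdfs:range pair into a labelled edge and print Graphviz DOT.  Anonymous
    # class expressions, unions, restrictions and layout are ignored here.
    from rdflib import Graph, RDF, RDFS
    from rdflib.namespace import OWL

    g = Graph()
    g.parse("ontology.owl", format="xml")   # placeholder path to an OWL/RDF file

    def local_name(uri):
        s = str(uri)
        return s.rsplit("#", 1)[-1].rsplit("/", 1)[-1]

    print("digraph schema {")
    for prop in g.subjects(RDF.type, OWL.ObjectProperty):
        for domain in g.objects(prop, RDFS.domain):
            for rng in g.objects(prop, RDFS.range):
                print(f'  "{local_name(domain)}" -> "{local_name(rng)}" [label="{local_name(prop)}"];')
    print("}")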
Conformance Test Cases for the RDF Mapping Language (RML)
Abstract
Knowledge graphs are often generated using rules that apply semantic annotations to data sources. Software tools then execute these rules and generate or virtualize the corresponding RDF-based knowledge graph. RML is an extension of the W3C-recommended R2RML language, extending support from relational databases to other data sources, such as data in CSV, XML, and JSON format. As part of the R2RML standardization process, a set of test cases was created to assess tool conformance with the specification. In this work, we generated an initial set of reusable test cases to assess RML conformance. These test cases are based on the R2RML test cases and can be used by any tool, regardless of its programming language. We tested the conformance of two RML processors: the RMLMapper and CARML. The results show that the RMLMapper passes all CSV, XML, and JSON test cases, and most test cases for relational databases, while CARML passes most CSV, XML, and JSON test cases. Developers can thus determine the degree of conformance of their tools, and users can rely on the conformance results to choose the most suitable tool for their use cases.
Pieter Heyvaert, David Chaves-Fraga, Freddy Priyatna, Oscar Corcho, Erik Mannens, Ruben Verborgh, Anastasia Dimou
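
The comparison at the heart of a conformance test case can be sketched with rdflib's graph comparison utilities: the output produced by an RML processor is loaded alongside the expected output shipped with the test case and checked for isomorphism; the file names are placeholders, and the real test cases also cover expected failures and database-backed sources.

    # Sketch of the core check in a conformance test case: compare the RDF
    # produced by an RML processor with the expected RDF, as graphs (so blank
    # node labels do not matter).  File names are placeholders.
    from rdflib import Graph
    from rdflib.compare import isomorphic, graph_diff

    produced = Graph().parse("processor-output.ttl", format="turtle")
    expected = Graph().parse("expected-output.ttl", format="turtle")

    if isomorphic(produced, expected):
        print("PASS: output matches the expected graph")
    else:
        _, only_in_produced, only_in_expected = graph_diff(produced, expected)
        print("FAIL")
        print("unexpected triples:", len(only_in_produced))
        print("missing triples:  ", len(only_in_expected))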
Expressive Context Modeling with Description Logics
Abstract
Modeling and verification of context-aware systems have proven to be a challenging task, mainly because expressive modeling implies, in most cases, expensive verification algorithms. On the other hand, description logics have been successfully applied as a modeling and verification framework in many settings, such as the Semantic Web and bioinformatics, to mention just a few. The main factor in this success is the delicate balance in description logics between expressiveness and the computational cost of the corresponding algorithms. In the current work, we propose the use of an expressive description logic to model the consistency of context-aware systems. We show that this expressive modeling language is capable of succinctly expressing complex properties, such as temporal ones.
Rolando Ramírez-Rueda, Everardo Bárcenas, Carmen Mezura-Godoy, Guillermo Molero-Castillo
Dimensions Affecting Representation Styles in Ontologies
Abstract
There are different ways to formalise roughly the same knowledge, which negatively affects ontology reuse and alignment, as well as other tasks such as automatically formalising competency questions. We aim to shed light on, and make more precise, the intuitive notion of such ‘representation styles’ by characterising their inherent features and the dimensions along which styles may differ. This has led to a total of 28 different traits partitioned over 10 dimensions. Operationalisability was assessed through an evaluation of 30 ontologies on those dimensions and their applicable values. It showed that it is feasible to use the dimensions and values, resulting in three easily recognisable types of ontologies. Most ontologies clearly exhibited one trait or the other, whereas some were inherently mixed due to the inclusion of different and conflicting design decisions.
Pablo Rubén Fillottrani, C. Maria Keet
Backmatter
Metadata

Title
Knowledge Graphs and Semantic Web
Edited by
Boris Villazón-Terrazas
Yusniel Hidalgo-Delgado
Copyright Year
2019
Electronic ISBN
978-3-030-21395-4
Print ISBN
978-3-030-21394-7
DOI
https://doi.org/10.1007/978-3-030-21395-4