
About this Book

This book constitutes the thoroughly refereed proceedings of the 12th International Conference on Metadata and Semantic Research, MTSR 2018, held in Limassol, Cyprus, on October 23–26, 2018.
The 19 full and 16 short papers presented were carefully reviewed and selected from 77 submissions. The papers are organized in topical sections on metadata, linked data, semantics, ontologies and SKOS; digital libraries, information retrieval, big, linked, social and open data; cultural collections and applications; Knowledge IT Artifacts (KITA) in professional communities and aggregations; Digital Humanities and Digital Curation (DHC); European and national projects; agriculture, food and environment; open repositories, research information systems and data infrastructures.

Table of Contents

Frontmatter

Metadata, Linked Data, Semantics, Ontologies and SKOS

Frontmatter

A Semantic Web SKOS Vocabulary Service for Open Knowledge Organization Systems

In this article, the Basel Register of Thesauri, Ontologies & Classifications (BARTOC.org) is introduced to raise awareness of an integrated, comprehensive terminology registry for knowledge organization systems. Recently, researchers have shown an increased interest in such a single access point for controlled vocabularies. The paper outlines BARTOC’s technical implementation, system architecture, and services in the light of semantic technologies. Its central thesis is that if the KOS community agreed on BARTOC as one of their main terminology registries, all involved parties would benefit from linked open knowledge organization systems.

Jonas Waeber, Andreas Ledl

Document Based RDF Storage Method for Efficient Parallel Query Processing

In this paper, we investigate the problem of efficiently evaluating SPARQL queries over large amounts of linked data using a distributed NoSQL system. We propose an efficient approach for partitioning large linked data graphs using distributed frameworks (MapReduce), as well as an effective data model for storing linked data in a document database with a maximum replication factor of 2 (i.e., in the worst case, the data graph doubles in storage size). The proposed model and partitioning approach ensure high-performance query evaluation and horizontal scaling for generalized star queries (i.e., queries allowing both subject-object and object-subject edges from a central node), since no join operations over multiple datasets are required to evaluate them. Furthermore, we present an implementation of our approach using MongoDB and an algorithm, based on the proposed data model, for translating generalized star queries into the MongoDB query language.

Eleftherios Kalogeros, Manolis Gergatsoulis, Matthew Damigos
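The document model described in the abstract can be reduced to a minimal sketch (plain Python dictionaries standing in for MongoDB documents; the paper's actual schema, partitioning and query translation are far richer): each graph node gets one document holding both its subject-object and object-subject edges, so a generalized star query is answered per document, without joins, at the cost of storing each triple at most twice.

```python
def build_documents(triples):
    """Group triples into one document per graph node, co-locating its
    subject-object ("out") and object-subject ("in") edges; each triple
    is stored at most twice (replication factor 2)."""
    docs = {}
    for s, p, o in triples:
        docs.setdefault(s, {"out": [], "in": []})["out"].append((p, o))
        docs.setdefault(o, {"out": [], "in": []})["in"].append((p, s))
    return docs

def star_query(docs, out_req=(), in_req=()):
    """Centers whose document contains all required outgoing and incoming
    predicates; each document is checked in isolation, i.e. no joins."""
    hits = []
    for node, doc in docs.items():
        if set(out_req) <= {p for p, _ in doc["out"]} \
                and set(in_req) <= {p for p, _ in doc["in"]}:
            hits.append(node)
    return sorted(hits)
```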

Legal Entity Identifier Blockchained by a Hyperledger Indy Implementation of GraphChain

The main idea behind GraphChain is to use blockchain mechanisms on top of abstract RDF graphs. This paper presents an implementation of GraphChain in the Hyperledger Indy framework. The whole setting is shown to be applied to the RDF graphs containing information about Legal Entity Identifiers (LEIs). The blockchain based data management system presented in the paper preserves all the benefits of using RDF data model for the representation of LEI system reference data, including powerful querying mechanisms, explicit semantics and data model extensibility with the security and non-repudiation of LEIs as the digital identifiers for legal entities.

Mirek Sopek, Przemysław Grądzki, Dominik Kuziński, Rafał Trójczak, Robert Trypuz

Query Translation for Cross-Lingual Search in the Academic Search Engine PubPsych

We describe a lexical resource-based process for query translation in PubPsych, a domain-specific, multilingual academic search engine for psychology. PubPsych queries are diverse in language, with a high proportion of informational queries and technical terminology. We present an approach for translating queries into English, German, French, and Spanish. We build a quadrilingual lexicon with aligned terms in the four languages using MeSH, Wikipedia and Apertium as our main resources. Our results show that using the quadlexicon together with some simple translation rules, we can automatically translate 85% of translatable tokens in PubPsych queries, with mean adequacy over all translatable text of 1.4 on a 3-point scale [0, 1, 2].

Cristina España-Bonet, Juliane Stiller, Roland Ramthun, Josef van Genabith, Vivien Petras
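As a rough illustration of the approach (not the authors' code; the lexicon entries below are invented), a term-aligned quadrilingual lexicon plus a simple rule that passes unknown tokens through unchanged might look like this:

```python
# Hypothetical miniature of lexicon-based query translation. The real
# quadlexicon is built from MeSH, Wikipedia and Apertium; these two
# entries are made up for illustration only.
LEXICON = {
    "angst":    {"en": "anxiety", "de": "angst", "fr": "anxiété", "es": "ansiedad"},
    "therapie": {"en": "therapy", "de": "therapie", "fr": "thérapie", "es": "terapia"},
}

def translate_query(query, target):
    """Translate a query token by token into the target language,
    keeping untranslatable tokens as-is (a simple fallback rule)."""
    out = []
    for token in query.lower().split():
        entry = LEXICON.get(token)
        out.append(entry[target] if entry else token)
    return " ".join(out)
```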

ViziQuer: A Visual Notation for RDF Data Analysis Queries

Visual SPARQL query notations aim at easing the RDF data querying task. At the current state of the art there is still no generally accepted visual graph-based notation suitable for describing RDF data analysis queries that involve aggregation and subqueries. In this paper we present a visual diagram-centered notation for SPARQL select query formulation, capable of handling aggregate/statistics queries and hierarchic queries with a subquery structure. The notation is supported by a web-based prototype tool. We present examples of the notation, describe its syntax and semantics, and report on studies with prospective end users, involving both IT and medicine students.

Kārlis Čerāns, Agris Šostaks, Uldis Bojārs, Juris Bārzdiņš, Jūlija Ovčiņņikova, Lelde Lāce, Mikus Grasmanis, Artūrs Sproģis
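The kind of aggregate-plus-subquery SPARQL that such a diagrammatic notation must ultimately produce can be sketched with a toy generator (hypothetical code, not ViziQuer's; the class and property URIs are invented, and the real translation is considerably more general):

```python
def aggregate_query(cls, prop, agg="COUNT"):
    """Build a SPARQL select that aggregates property values per
    instance of a class, using a nested subquery as in hierarchic
    query notations."""
    return (
        "SELECT ?x ?n WHERE {\n"
        f"  ?x a <{cls}> .\n"
        "  {\n"
        f"    SELECT ?x ({agg}(?v) AS ?n) WHERE {{ ?x <{prop}> ?v }}\n"
        "    GROUP BY ?x\n"
        "  }\n"
        "}"
    )
```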

A Systematic Approach to Review Legacy Schemas Based on Ontological Analysis

Usually, data schemas are the only documentation available for legacy data. Information Technology (IT) artifacts, such as conceptual schemas, if existent, are often outdated. This leads to inconsistencies and ambiguities, as well as difficulties in reusing data. This work proposes an approach for reviewing data schemas based on ontological analysis, which considers each concept according to its nature, capturing its essence more precisely and generally improving semantic richness and precision. The idea is to provide a systematic procedure for annotating legacy data, starting with its conceptual schema, and thus to contribute to generating more consistent conceptual modeling artifacts. To illustrate the proposed procedure, the Unified Foundational Ontology (UFO) is used as a theoretical reference for annotating a real data schema in the legal domain.

Raquel Lima Façanha, Maria Cláudia Cavalcanti, Maria Luiza Machado Campos

Towards a Holistic Documentation and Wider Use of Digital Cultural Heritage

This paper reviews work currently undertaken and planned to develop a more holistic approach to the e-documentation of Cultural Heritage, thereby addressing the needs of a wider range of existing and potential audiences in the digital sphere. Building on the work of the ViMM Coordination and Support Action, funded under Horizon 2020, the Digital Heritage Research Laboratory (DHRLab) at Cyprus University of Technology (CUT) has committed its research agenda for the years to come to the development of these approaches, setting this vital process in train through three main mechanisms, with the aim of creating a holistic framework for DCH by carrying out the wide range of collaborative and multidisciplinary research needed within an overall construct of advanced documentation: (1) the Europeana Task Force on Advanced Documentation of 3D digital assets; (2) the UNESCO Chair on Digital Heritage; and (3) the Mnemosyne European Research Area Chair on Digital Heritage (Horizon 2020).

Marinos Ioannides, Robert Davies

Graph Matching Based Semantic Search Engine

The explosive growth of the Web has made searching Web data a challenging task for information retrieval systems. Semantic search systems that go beyond shallow keyword matching and map words to their conceptual meaning representations offer better results to users. At the same time, many representation formats have been specified for representing Web data semantically. We propose a search engine for Web data represented in UNL (Universal Networking Language). UNL has numerous attractive features to support semantic search; one of the main ones is that UNL does not depend on a domain ontology. Our proposed search engine is based on semantic graph matching. It includes semantic expansion for graph nodes and relation matching based on relation meaning. The search results are ranked according to the semantic similarity between the user query and the retrieved documents. We developed a prototype implementing the proposed semantic search engine, and our evaluations demonstrate its effectiveness across a wide range of semantic search tasks.

Mamdouh Farouk, Mitsuru Ishizuka, Danushka Bollegala
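An extremely reduced illustration of graph-based matching with node expansion (the UNL system in the paper is far richer; the synonym table and edge labels here are invented): queries and documents are edge sets, query nodes are expanded with related terms, and documents are ranked by the fraction of query edges they satisfy.

```python
# Invented miniature synonym table standing in for semantic expansion.
SYNONYMS = {"buy": {"buy", "purchase"}, "car": {"car", "automobile"}}

def expand(node):
    """Semantic expansion of a node: the node plus its related terms."""
    return SYNONYMS.get(node, {node})

def edge_matches(q_edge, d_edge):
    """A document edge matches a query edge if both endpoints fall in
    the query nodes' expansions and the relation labels agree."""
    (qs, qr, qo), (ds, dr, do) = q_edge, d_edge
    return ds in expand(qs) and dr == qr and do in expand(qo)

def score(query_edges, doc_edges):
    """Fraction of query edges matched somewhere in the document graph."""
    hit = sum(any(edge_matches(q, d) for d in doc_edges) for q in query_edges)
    return hit / len(query_edges)

def rank(query_edges, docs):
    """Rank document names by descending similarity to the query graph."""
    return sorted(docs, key=lambda name: -score(query_edges, docs[name]))
```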

SKOS-Based Concept Expansion for LOD-Enabled Recommender Systems

This paper presents a concept expansion strategy for Linked Open Data-enabled recommender systems (LDRS). This strategy is based on annotations from Simple Knowledge Organization System (SKOS) vocabularies. To date, the knowledge structures of SKOS graphs have not been thoroughly explored for item similarity calculation in content-based recommender systems (RS). While some researchers have already performed an unweighted concept expansion on skos:broader links, the quantification of the relatedness of concepts from SKOS graphs with quality issues, such as the DBpedia category system, should be further investigated to improve recommendation results. For this purpose, we apply our approach in conjunction with a suitable concept-to-concept similarity metric and test it on three different LDRS datasets from the multimedia domain (i.e., movie, music and book RS). The results show that our approach has a diversifying effect on result lists, while providing at least the same level of accuracy as a system running in non-expansion mode.

Lisa Wenige, Geraldine Berger, Johannes Ruhland
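The step from unweighted to weighted expansion along skos:broader links can be sketched as follows (an illustrative sketch, not the authors' metric; the tiny category graph and the exponential decay are invented for the example): each step up the hierarchy contributes a concept with a decayed weight, and item similarity becomes a weighted overlap of expanded concept sets.

```python
# Invented miniature skos:broader hierarchy (concept -> broader concepts).
BROADER = {
    "jazz_albums": ["albums_by_genre"],
    "albums_by_genre": ["albums"],
}

def expand_weighted(concept, decay=0.5, depth=2):
    """Return {concept: weight}, decaying with distance in the hierarchy."""
    weights = {concept: 1.0}
    frontier = [concept]
    for step in range(1, depth + 1):
        frontier = [b for c in frontier for b in BROADER.get(c, [])]
        for b in frontier:
            weights[b] = max(weights.get(b, 0.0), decay ** step)
    return weights

def similarity(item_a, item_b, **kw):
    """Weighted overlap of the expanded concept sets of two items."""
    wa, wb = {}, {}
    for concepts, w in ((item_a, wa), (item_b, wb)):
        for c in concepts:
            for k, v in expand_weighted(c, **kw).items():
                w[k] = max(w.get(k, 0.0), v)
    return sum(min(wa[k], wb[k]) for k in set(wa) & set(wb))
```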

Navigating OWL 2 Ontologies Through Graph Projection

Ontologies are powerful, yet often complex, assets for representing, exchanging, and reasoning over data. In particular, OWL 2 ontologies have been key for constructing semantic knowledge graphs. The ability to navigate ontologies is essential for supporting various knowledge engineering tasks such as querying and domain exploration. To this end, in this short paper, we describe an approach for projecting the non-hierarchical topology of an OWL 2 ontology into a graph. The approach has been implemented in two tools, one for visual query formulation and one for faceted search, and evaluated under different use cases.

Ahmet Soylu, Evgeny Kharlamov

Linked Data Live Exploration with Complete Results

Linked Data is one of the emerging ways to publish and link structured, machine-processable data on the Web. However, existing techniques for live querying of Linked Data are based on a recursive URI look-up process. These techniques are limited for query patterns in which the subject is unbound and the object is a foreign URI: in such cases, a live query produces no answers, because the querying process cannot be initiated without a subject in the triple pattern. In this paper, we make use of backlinking to extract and store foreign URIs and use this information to execute live queries whose subject is unbound.

Samita Bai, Sharaf Hussain, Shakeel Khoja
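The backlinking idea can be illustrated with a small sketch (assumptions: the URIs are invented and the in-memory index stands in for whatever store the system actually uses): while processing triples, every foreign object URI is recorded together with the subject that references it, so a pattern like (?s, p, &lt;foreignURI&gt;) can be answered even though URI look-up cannot start from the unbound subject.

```python
def is_foreign(uri, local_prefix):
    """A URI is foreign if it does not live under the local namespace."""
    return not uri.startswith(local_prefix)

def build_backlinks(triples, local_prefix):
    """Index foreign object URIs -> [(subject, predicate), ...]."""
    index = {}
    for s, p, o in triples:
        if is_foreign(o, local_prefix):
            index.setdefault(o, []).append((s, p))
    return index

def query_unbound_subject(index, predicate, obj):
    """Answer the pattern (?s, predicate, obj) from the backlink index."""
    return sorted(s for s, p in index.get(obj, []) if p == predicate)
```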

The Genesis of EngMeta - A Metadata Model for Research Data in Computational Engineering

In computational engineering, numerical simulations produce huge amounts of data. To keep this research data findable, accessible, interoperable and reusable, a structured description of the data is indispensable. This paper outlines the genesis of EngMeta – a metadata model designed to describe engineering simulation data with a focus on thermodynamics and aerodynamics. The metadata model, developed in close collaboration with engineers, is based on existing standards and adds discipline-specific information as its main contribution. Characteristics of the observed system offer researchers important search criteria. Information on the hardware and software used and the processing steps involved helps to understand and replicate the data. Such metadata are crucial to keeping the data FAIR and bridging the gap to sustainable research data management in computational engineering.

Björn Schembera, Dorothea Iglezakis

Digital Libraries, Information Retrieval, Big, Linked, Social and Open Data

Frontmatter

Analysing and Visualising Open Data Within the Data and Analytics Framework

The principles of open data and the five-star model allow companies to develop low-cost services and Public Administrations (PA) to improve efficiency. However, implementing open data models and principles is not easy unless it is supported by an appropriate technology platform. Today there is a huge number of technological platforms, each of which promises to be the ideal solution for opening data, yet current solutions (commercial or free) provide users with neither easy access to data nor tools for analysing and displaying it. In this paper, we discuss the potential of the DAF (Data Analytics Framework), a project based on big data, created by the Italian government in 2017, which fosters the integration and standardisation of data and provides three powerful tools for data analysis and visualisation. The paper then illustrates a concrete case of dashboard development within the DAF, released at an important hackathon organised by the Italian PA sector in October 2017. The project serves as a use case of DAF implementation, in which its analytical tools are used for data analysis and visualisation, translating large amounts of data into simple representations expressed in clear and effective language.

Francesca Fallucchi, Michele Petito, Ernesto William De Luca

Formalizing Enrichment Mechanisms for Bibliographic Ontologies in the Semantic Web

This paper presents an analysis of current limitations to the reuse of bibliographic data in the Semantic Web and a research proposal towards solutions to overcome them. The limitations identified derive from the insufficient convergence between existing bibliographic ontologies and the principles and techniques of linked open data (LOD); the lack of a common conceptual framework for a diversity of standards often used together; the reduced use of links to external vocabularies and the absence of Semantic Web mechanisms to formalize relationships between vocabularies; and the limitations of Semantic Web languages with respect to the requirements of bibliographic data interoperability. A proposal is advanced to investigate the hypothesis of creating a reference model and specifying a superontology to overcome the misalignments found, as well as the use of SHACL (Shapes Constraint Language) to solve current limitations of RDF languages.

Helena Simões Patrício, Maria Inês Cordeiro, Pedro Nogueira Ramos

GLOBDEF: A Framework for Dynamic Pipelines of Semantic Data Enrichment Tools

Semantic data enrichment adds information to raw data to allow computational reasoning based on the meaning of the data. With the introduction of Linked Data, much work has gone into combining existing tools for specific enhancement needs into multi-domain, reusable enhancement pipelines. As part of the GloBIG project, we are working on the development of a framework for data enhancement that attempts to be domain-agnostic and dynamically configurable. It works with pluggable enhancement modules, which are dynamically activated to create on-the-fly pipelines for data enhancement. Our research goal is to find a way to process large amounts of data and enhance it automatically while leveraging a variety of domain knowledge sources and tools, selecting and using the most suitable ones for the data at hand. In this paper we present our proof-of-concept implementation of the so-called GLOBDEF framework and discuss the challenges and next steps in its development.

Maria Nisheva-Pavlova, Asen Alexandrov

Ontologies for Data Science: On Its Application to Data Pipelines

Ontologies are usually applied to drive intelligent applications and as a resource for integrating or extracting information, as in Natural Language Processing (NLP) tasks. Further, ontologies such as the Gene Ontology (GO) are used as artifacts for very specific research aims. However, the value of ontologies for data analysis tasks may also extend beyond these uses to supporting the reuse and composition of data acquisition, integration and fusion code. This requires that both data and code artifacts support meta-descriptions using shared conceptualizations. In this paper, we discuss the different concerns in semantically describing data pipelines as key reusable artifacts that could be retrieved, compared and reused with a degree of automation if semantically consistent descriptions are provided. Concretely, we propose attaching semantic descriptions for data and analytic transformations to current backend-independent distributed processing frameworks such as Apache Beam, as these already abstract away the specifics of the supporting execution engines.

Miguel-Ángel Sicilia, Elena García-Barriocanal, Salvador Sánchez-Alonso, Marçal Mora-Cantallops, Juan-José Cuadrado

Relating Legal Entities via Open Information Extraction

Concepts and relations within existing ontologies usually represent limited, subjective and application-oriented views of a domain of interest. Reusing such resources and fine-grained conceptualizations is therefore often challenging and requires significant manual adaptation to fit new usages. In this paper, we present a system that makes use of recent Open Information Extraction technologies to unravel and explore corpus-centered unknown relations in the legal domain.

Giovanni Siragusa, Rohan Nanda, Valeria De Paiva, Luigi Di Caro

Ontology-Based Information Retrieval: Development of a Semantic-Based Tool for the Media Industry

This paper describes the creation of an RDF ontology designed to support the information retrieval needs of journalists and media professionals. The purpose of the ontology is to support the automated expansion of query terms by using the relationships between the concepts and terms registered in the ontology. Using this ontology, end users can identify additional concepts related to the selected topic and incorporate new terms into the query that is later run against a full-text indexer based on SOLR. The ontology focuses on politics and has been successfully tested in collaboration with a large Spanish media company. The ontology contributes to better recall of search results.

Ricardo Eito-Brun

Cultural Collections and Applications

Frontmatter

Evaluating Data Quality in Europeana: Metrics for Multilinguality

Europeana.eu aggregates metadata describing more than 50 million cultural heritage objects from libraries, museums, archives and audiovisual archives across Europe. The need for quality of metadata is particularly motivated by its impact on user experience, information retrieval and data re-use in other contexts. One of the key goals of Europeana is to enable users to retrieve cultural heritage resources irrespective of their origin and the material’s metadata language. The presence of multilingual metadata descriptions is therefore essential for successful cross-language retrieval. Quantitatively determining Europeana’s cross-lingual reach is a prerequisite for enhancing the quality of metadata in various languages. Capturing multilingual aspects of the data requires us to take into account the full lifecycle of data aggregation including data enhancement processes such as automatic data enrichment. The paper presents an approach for assessing multilinguality as part of data quality dimensions, namely completeness, consistency, conformity and accessibility. We describe the measures defined and implemented, and provide initial results and recommendations.

Péter Király, Juliane Stiller, Valentine Charles, Werner Bailer, Nuno Freire
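Two of the simpler measurements mentioned above, completeness and a language-tag ingredient of multilinguality, can be sketched as follows (an illustrative sketch only; the record structure, field names and sample values are invented, and Europeana's actual metrics cover far more dimensions):

```python
# Hypothetical expected fields for a metadata record.
FIELDS = ["title", "description", "creator"]

def completeness(record):
    """Fraction of expected fields that are present and non-empty."""
    return sum(1 for f in FIELDS if record.get(f)) / len(FIELDS)

def language_coverage(record):
    """Fraction of present literal values carrying a language tag,
    where each value is a (text, language-tag-or-None) pair."""
    values = [v for f in FIELDS for v in record.get(f, [])]
    if not values:
        return 0.0
    return sum(1 for _, lang in values if lang) / len(values)
```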

The Benefits of Linking Metadata for Internal and External Users of an Audiovisual Archive

Like other heritage institutions, audiovisual archives adopt structured vocabularies for their metadata management. With Semantic Web and Linked Data now becoming stable and commonplace technologies, organizations are now looking at linking these vocabularies to external sources, for example Wikidata, DBpedia or GeoNames. However, the benefits of such endeavors to the organizations are generally underexplored. In this paper, we present an in-depth case study of the benefits of linking the “Common Thesaurus for Audiovisual Archives” (GTAA) and the general-purpose dataset Wikidata. We do this by identifying various use cases for user groups both internal and external to the organization. We describe the use cases and various proof-of-concept prototypes that address them.

Victor de Boer, Tim de Bruyn, John Brooks, Jesse de Vos

Authify: The Reconciliation of Entities at Scale

Libraries’ shift to the semantic web has been underway for a number of years. Mellon-funded projects such as Linked Data for Production (LD4P) [1] or the BIBFRAME European Workshop 2018 in Florence [2] show the commitment of national, public, and academic libraries, as well as vendors, to this transition. Libraries worldwide, however, are enmeshed in hundreds of millions of metadata records communicated through flat files (the MARC formats) [3]. The shift to linked data will require the conversion of these flat files to a semantically expressive model such as the Resource Description Framework (RDF) [4]. Converting such large amounts of semantically inexpressive data to semantically rich data will require automated enhancements in the conversion process. Data hidden within the flat files, such as roles (author, illustrator, composer, etc.), can greatly aid the reconciliation of entities within those files. Authify is one of the first tools available to libraries both to convert their metadata to linked data and to enrich the reconciliation process with semantic data hidden within the MARC fields. As libraries look to convert their legacy data to linked data, Authify can help them move their data to the Web in as semantically rich a way as possible.

Philip E. Schreur, Tiziana Possemato

Assessing the Preservation of Derivative Relationships in Mappings from FRBR to BIBFRAME

Support of the exploration user task demands the explicit representation of bibliographic families and of content relationships. Seamless navigation through differently modelled bibliographic datasets presumes the existence of mappings. Semantic interoperability through mappings will be evaluated using a testbed. This paper starts with the fine-tuning of a testbed for mappings from FRBR to BIBFRAME. Two gold standard datasets have been created, along with a mechanism for mapping the core entities and the derivation relationship from FRBR to BIBFRAME. This first attempt has revealed that derivations expressed at the FRBR Expression level are mapped to BIBFRAME more adequately than those expressed at the FRBR Work level.

Sofia Zapounidou, Michalis Sfakakis, Christos Papatheodorou

Metadata Standards for Palm Leaf Manuscripts in Asia

The goal of this research is to facilitate, as effectively as possible, user access to and use of the knowledge recorded on palm leaf manuscripts (PLMs). At the same time, the schema should serve as a standard information structure for the management of PLMs and other digitized ancient documents. This will also make it possible to link Asian cultural heritage and wisdom with those of other countries in the region via the internet. Accordingly, this research aims to develop a metadata schema for the management of PLM collections to increase efficiency in search, access, use, and management. There are four parts to this study: (1) the current state of PLM management in Asia and the use of PLM metadata schemas in working projects were investigated; (2) the elements were analyzed and grouped by function; (3) the core elements were matched to KKUPLMMs 2012, 2015 and the IFLA LRM User Tasks; and (4) a focus group was set up to evaluate the framework.

Nisachol Chamnongsri

Knowledge IT Artifacts (KITA) in Professional Communities and Aggregations

Frontmatter

Knowledge Artifacts for the Health: The PERCIVAL Project

The quality of life (QoL) of patients affected by chronic diseases and of their caregivers is a very important and interdisciplinary research topic. Recent literature highlights the need for new methodologies capable of reducing the impact of chronic disorders on the everyday life of affected people and their relatives, especially when they are geographically far from care centers. The PERsonal Care Instructor and VALuator (PERCIVAL) project, a collaboration between the REDS Lab and Educational Factory srl, is a first attempt to build an integrated environment to promote the sharing, deliberation and monitoring of decisions about different aspects of chronic diseases among all the actors involved.

Fabio Sartori, Riccardo Melen, Matteo Lombardi, Davide Maggiotto

Artfacts - A Platform for Making Sense of and Telling Stories with Cultural Objects

This paper presents the conceptualization, implementation, and evaluation of a fast-speed IT platform called Artfacts, designed within the context of the two-speed IT infrastructure, in which a foundational, stable, and slow infrastructure is complemented by a creative, experimental, and agile one capable of promptly responding to the needs of communities. The platform is an attempt to digitally incorporate strategies for making sense of and reusing digital collections, and to mitigate the problem that profiting from the affordances of data repositories as a creative material requires specialized knowledge. Through the cartography of information, the platform aims at widening the participation of individuals with no technical background in the development and maintenance of interpretive applications, whether within cultural institutions or at events such as hackathons for cultural heritage. Artfacts mediates the reinterpretation of cultural datasets and the fabrication of interpretive applications by means of a flexible, general, and interoperable data model able to adapt to the demands of storytellers, and an open-ended object-oriented UI that enables analysis and experimentation by arranging and rearranging data elements into digital narratives.

Leonardo de Araújo

A Semantic-Based Metadata Schema to Handle System and Software Configuration

Configuration management is a key process in system and software engineering. Product integrity is a key requirement in the development of software-based solutions, and configuration management defines the set of practices aimed at ensuring the consistency and coherence of the product during its full life cycle. Although different standards establish the principles of configuration management, complex projects that require the interaction of several entities and companies need a more precise, detailed specification of how to report, compare and handle configuration data. This constitutes an interesting opportunity for metadata management professionals. This contribution presents the development of a tool for managing configuration management data. The case study – developed in the context of a systems engineering company – makes use of semantic web languages (RDF, OWL) and technologies to support engineers in the registration, analysis, reporting and auditing of product configurations. The solution defines the different metadata used to handle configuration status, and a technical solution to handle them.

Ricardo Eito-Brun

Digital Humanities and Digital Curation (DHC)

Frontmatter

Connecting and Mapping LOD and CMDI Through Knowledge Organization

This paper explains the connection and mapping of knowledge representations between RDF and CMDI. The challenge is to create a bridge between Linked Open Data (LOD) and the Component MetaData Infrastructure (CMDI) so that the limits of the two paradigms are compensated for, yielding a new hybrid approach. On the one hand, CMDI is easier to use for modelling purposes, but its metadata is not descriptive enough for a document to be easily discoverable using Linked Data (LD) technologies for publishing and enriching the document’s content. On the other hand, the explicit semantics and high interoperability of LOD have many advantages, but its modelling process is too complex for non-expert users. Here we show how knowledge organization plays a crucial role in this issue.

Francesca Fallucchi, Ernesto William De Luca

Creating CMDI-Profiles for Textbook Resources

This paper analyses the establishment of a common infrastructure standard covering metadata, content, and inferred knowledge to allow collaborative work between researchers in the humanities. Interoperability between heterogeneous resources and services is the key for a properly functioning infrastructure. In this paper, we present a digital infrastructure of our textbook-related services and data, which are available and open for researchers worldwide. In this process we adhere to established standards and provide APIs for other services. In order to integrate our resources and tools into the CLARIN infrastructure and make them discoverable in the VLO (Virtual Language Observatory), we decided to use CMDI (Component MetaData Infrastructure). We focus in this paper on the creation process for a CMDI metadata profile which fulfils the needs of our projects.

Francesca Fallucchi, Hennicke Steffen, Ernesto William De Luca

European and National Projects

Frontmatter

Towards a Knowledge Graph Based Platform for Public Procurement

Procurement affects virtually all sectors and organizations, particularly in times of slow economic recovery and enhanced transparency. Public spending alone will soon exceed EUR 2 trillion per annum in the EU. There is therefore a pressing need for better insight into, and management of, government spending. In the absence of data and tools to analyse and oversee this complex process, too little consideration is given to the development of vibrant, competitive economies when buying decisions are made. To this end, in this short paper, we report our ongoing work on enabling procurement data value chains through a knowledge graph based platform with data management, analytics, and interaction.

Elena Simperl, Oscar Corcho, Marko Grobelnik, Dumitru Roman, Ahmet Soylu, María Jesús Fernández Ruíz, Stefano Gatti, Chris Taggart, Urška Skok Klima, Annie Ferrari Uliana, Ian Makgill, Till Christopher Lech

Metadata for Large-Scale Research Instruments

This work outlines the diverse efforts of several initiatives on metadata and attribution mechanisms for large-scale instruments hosted by shared research facilities. Specifically, the role of persistent identifiers and associated metadata is considered, in relation to cases where references to large-scale instruments can support research impact studies and the Open Science agenda. A few routes for the adoption of large-scale instrument metadata are outlined, with an indication of their advantages and limitations.

Vasily Bunakov

Agriculture, Food and Environment

Frontmatter

Identification and Exchange of Regulated Information on Chemicals: From Metadata to Core Vocabulary

Regulatory bodies perform risk assessments of chemicals and produce regulatory outcomes: evaluations of and decisions on chemicals and the conditions of their use. Access to scientifically proven and already regulated information is crucial for their efficient work and consistent decisions. Exchanging and reusing information relies on a common understanding of the main concepts, and here lies the challenge: even though regulations and industry standards provide definitions of a chemical substance, their interpretation poses issues. This paper introduces the concept of a Regulated Substance and aims to highlight the complexity of implementing semantic interoperability for regulated information between different parties. The regulatory activities of the European Chemicals Agency (ECHA) overlap with, follow or trigger activities performed by other authorities. The capability to exchange information and to access shared databases can increase regulatory benefits. A common initiative of the European Commission, the Publications Office, and EU agencies is looking into possibilities for exchanging such information. One of the tools promoted by the Publications Office – the Core Vocabularies – is meant to facilitate interoperability between authorities. The initiative will lay the foundations for access to public repositories of regulated information and non-confidential scientific data for academia and researchers via Linked Open Data.

Alicja Agnieszka Dys

Semantics for Data in Agriculture: A Community-Based Wish List

The paper reports on activities carried out within the Agrisemantics Working Group of the Research Data Alliance (RDA). The group investigated the current problems that researchers and practitioners experience in their work with semantic resources for agricultural data and elaborated the list of requirements that is the subject of this paper. The main findings include: the need to broaden the usability of tools so as to make them useful and available to the variety of profiles typically involved in working with semantic resources; the need for online platforms that relieve users of the burden of local installation; and the need for services that can be integrated into workflows. We further analyze the requirements concerning tools and services and provide details about the process followed to gather evidence from the community.

Caterina Caracciolo, Sophie Aubin, Brandon Whitehead, Panagiotis Zervas

Development of Methodologies and Standardized Services for Supporting Forest Economics

In the Mediterranean region, many types of forests are non-productive or degraded, although they could contribute substantially to the growth of local economies. In Greece, 30% of the total area is covered by forests; however, their contribution to GDP is almost non-existent. An example is chestnut production in the Thessaly region of Greece, and especially in the Mouzaki municipality, which has been almost abandoned due to insufficient agricultural policies concerning the establishment of alternative crops, leading to a loss of potential income for the rural economy. The ARTEMIS project, funded by the Greek Secretariat for Research and Technology, aims to deliver an innovative information platform that systematically provides high-quality Earth Observation-based products and services for monitoring forest health and, ultimately, supporting the growth of the forestry-related economy and market. The architecture of the proposed platform will incorporate new OGC/ISO technologies, while the applicability of existing metadata standards for the management of geospatial datasets will be evaluated. A pilot implementation of the developed system will be conducted in a selected area of the Thessaly region of Greece.

Thomas Katagis, Nikolaos Grammalidis, Evangelos Maltezos, Vasiliki Charalampopoulou, Ioannis Z. Gitas

Open Repositories, Research Information Systems and Data Infrastructures

Frontmatter

Open Citation Content Data

There are several projects in the research community to make the citation data extracted from research papers more reusable. This paper presents results from the CyrCitEc project to create a publicly available source of open citation content data extracted from PDF papers available in a research information system. To this end, the project team has created four outputs: (1) open source software to parse papers' metadata and full-text PDFs; (2) an open service to process papers' PDFs and extract citation data; (3) a dataset of citation data, including citation contexts (currently mostly for papers in Cyrillic); and (4) a visualization tool that gives users insight into the citation data extraction process and some control over the quality of citation data parsing.

Mikhail Kogalovsky, Thomas Krichel, Victor Lyapunov, Oxana Medvedeva, Sergey Parinov, Varvara Sergeeva

The Case for Ontologies in Expressing Decisions in Decentralized Energy Systems

Advances in technologies for the decentralization of applications have enabled micro-grid energy systems that do not rely on central control and optimization but are instead controlled by their owners. This may eventually enable consumers or intermediaries to specify concrete and diverse conditions on the supply that concern not only throughput, price, and stability, but also aspects such as provenance (e.g., that energy is produced from renewable sources) or locality, among others. Blockchain technologies have emerged as a possible solution for integrating the stream of events generated by smart meters and networks, providing tamper-proof ledgers for offerings, transactions, and traces. However, this requires languages for expressing conditions that may become complex and must be executed locally. In this paper, we review the state of decentralization in micro-grids and its requirements, and discuss the role of ontologies as support for expressing constraints in those networks.

Elena García-Barriocanal, Miguel-Ángel Sicilia, Salvador Sánchez-Alonso

Backmatter
