
2018 | Book

Semantic Applications

Methodology, Technology, Corporate Use

Edited by: Dr. Thomas Hoppe, Prof. Dr. Bernhard Humm, Anatol Reibold

Publisher: Springer Berlin Heidelberg


About this Book

This book describes methodologies for developing semantic applications. Semantic applications are software applications which explicitly or implicitly use the semantics, i.e. the meaning of domain terminology, to improve usability, correctness, and completeness. An example is semantic search, where synonyms and related terms are used to enrich the results of a simple text-based search. Ontologies, thesauri or controlled vocabularies are the centerpiece of semantic applications.
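As a minimal illustration of the semantic search example above, the following Python sketch expands a query with synonyms before matching documents. The vocabulary and documents are invented for illustration and are not taken from the book.

```python
# Minimal sketch of semantic query expansion: a text query is enriched
# with synonyms and related terms before matching (illustrative data).

SYNONYMS = {
    "car": {"automobile", "vehicle"},
    "notebook": {"laptop"},
}

DOCUMENTS = [
    "Leasing offers for a new automobile",
    "Lightweight laptop with long battery life",
    "Gardening tools on sale",
]

def expand(query: str) -> set[str]:
    """Expand each query term with its synonyms."""
    terms = set(query.lower().split())
    for term in list(terms):
        terms |= SYNONYMS.get(term, set())
    return terms

def search(query: str) -> list[str]:
    """Return documents containing any expanded query term."""
    terms = expand(query)
    return [d for d in DOCUMENTS if terms & set(d.lower().split())]

print(search("car"))       # finds the 'automobile' document
print(search("notebook"))  # finds the 'laptop' document
```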

The book includes technological and architectural best practices for corporate use. The authors are experts from industry and academia with experience in developing semantic applications.

Table of Contents

Frontmatter
1. Introduction to Semantic Applications
Abstract
Semantic applications today provide benefits to numerous organisations in business sectors such as health care, finance, industry, and the public sector. These applications use the semantics of a domain in order to improve usability, correctness, and completeness. Developing semantic applications requires methodological skills, e.g., ontology engineering, quality assurance for ontologies, and licence management. Various technologies are available for implementing semantic applications, e.g., data integration, semantic search, machine learning, and complex event processing. This chapter gives an overview of methodologies, technologies, and corporate use of semantic applications.
Wolfram Bartussek, Hermann Bense, Thomas Hoppe, Bernhard G. Humm, Anatol Reibold, Ulrich Schade, Melanie Siegel, Paul Walsh
2. Guide for Pragmatical Modelling of Ontologies in Corporate Settings
Abstract
The application of semantic technologies in a corporation sometimes requires modelling specialized ontologies under the requirements imposed by the corporate setting. Often, a proof of concept needs to demonstrate the usefulness of a semantic application before additional investments are made in the technology. An initial ontology therefore needs to be developed quickly and with limited resources. This chapter describes a practical, pragmatical approach for resource-limited modelling of ontologies. Guided by rules, this modelling approach starts with the modelling of a thesaurus, which can later be extended to a full-fledged ontology. Starting with a thesaurus allows a proof of concept to be developed first and the semantic application to be put into productive use early, in order to acquire additional insights and information about its usage.
Thomas Hoppe, Robert Tolksdorf
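A minimal sketch of this thesaurus-first idea, using SKOS via the rdflib library (an assumed toolkit; the chapter does not prescribe one, and the concepts are invented): the thesaurus starts with preferred labels, synonyms and broader/narrower relations, and can later be refined into a full ontology.

```python
# Thesaurus-first modelling with SKOS (requires `pip install rdflib`).
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

EX = Namespace("http://example.org/thesaurus/")

g = Graph()
g.bind("skos", SKOS)

# Start lightweight: concepts with preferred labels, synonyms,
# and a broader/narrower hierarchy.
vehicle = EX["vehicle"]
car = EX["car"]
for concept in (vehicle, car):
    g.add((concept, RDF.type, SKOS.Concept))

g.add((vehicle, SKOS.prefLabel, Literal("vehicle", lang="en")))
g.add((car, SKOS.prefLabel, Literal("car", lang="en")))
g.add((car, SKOS.altLabel, Literal("automobile", lang="en")))  # synonym
g.add((car, SKOS.broader, vehicle))  # extensible to a full ontology later

print(g.serialize(format="turtle"))
```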
3. Compliance Using Metadata
Abstract
Everybody talks about the data economy. Data is collected, stored, processed and re-used. In the EU, the GDPR creates a framework with conditions (e.g. consent) for the processing of personal data. But there are also other legal provisions containing requirements and conditions for the processing of data. Even today, most of those are hard-coded into workflows or database schemas, if at all. Data lakes are polluted with unusable data because nobody knows about usage rights or data quality. The approach presented here makes the data lake intelligent: it remembers usage limitations and promises made to the data subject or the contractual partner. Data can be used to the extent that its risk can be assessed. Such a system reacts easily to new requirements. If processing activities are recorded back into the data lake, this information makes it possible to prove compliance, which can be shown to authorities on demand as an audit trail. The concept is best exemplified by the SPECIAL project https://specialprivacy.eu (Scalable Policy-aware Linked Data Architecture For Privacy, Transparency and Compliance). SPECIAL has several use cases, but the basic framework is applicable beyond those cases.
Rigo Wenning, Sabrina Kirrane
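The following sketch illustrates the general idea of policy-aware data use with invented field names; it is not the SPECIAL vocabulary or architecture. Each data set carries machine-readable usage metadata, and every access decision is logged back as an audit trail.

```python
# Illustrative sketch: processing is only permitted if purpose and
# consent match the policy attached to the data set (assumed schema).
from dataclasses import dataclass, field

@dataclass
class UsagePolicy:
    purposes: set[str]            # purposes the data subject consented to
    expires: int                  # year until which consent is valid
    log: list[str] = field(default_factory=list)

def may_process(policy: UsagePolicy, purpose: str, year: int) -> bool:
    """Check a processing request against the attached policy."""
    allowed = purpose in policy.purposes and year <= policy.expires
    # Record the decision back into the lake as an audit trail.
    policy.log.append(f"{year}: {purpose} -> {'granted' if allowed else 'denied'}")
    return allowed

policy = UsagePolicy(purposes={"billing", "support"}, expires=2020)
print(may_process(policy, "billing", 2019))    # True
print(may_process(policy, "marketing", 2019))  # False: no consent
print(policy.log)
```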
4. Variety Management for Big Data
Abstract
Of the core challenges originally associated with Big Data, namely Volume, Velocity, and Variety, the Variety aspect is the one least addressed by standard analytics architectures. In this chapter, we analyze types and sources of variety and describe data and metadata management principles for organizing data lakes. We discuss how semantic metadata can help describe and manage variety in structure, provenance, visibility and permitted use. Moreover, ontologies and metadata catalogs can aid discovery, navigation, exploration, and interpretation of heterogeneous data lakes, lift data quality, and simplify the integration of multiple data sets. We present an application of these principles in a data architecture for the Law Enforcement domain in Australia.
Wolfgang Mayer, Georg Grossmann, Matt Selway, Jan Stanek, Markus Stumptner
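To make the catalog idea concrete, here is a hypothetical metadata record and discovery query; all field names and data sets are assumptions for illustration, not taken from the chapter.

```python
# Hypothetical data-lake catalog: semantic annotations describe structure,
# provenance, visibility and permitted use, enabling discovery.
CATALOG = [
    {
        "name": "incident_reports_2017",
        "schema": {"id": "string", "date": "date", "category": "string"},
        "provenance": "police_division_a",
        "visibility": "restricted",
        "permitted_use": {"investigation"},
        "tags": {"law-enforcement", "incident"},
    },
    {
        "name": "open_crime_statistics",
        "schema": {"region": "string", "year": "int", "count": "int"},
        "provenance": "public_statistics_office",
        "visibility": "public",
        "permitted_use": {"research", "reporting"},
        "tags": {"statistics", "crime"},
    },
]

def discover(tag: str, use: str) -> list[str]:
    """Find data sets carrying a tag that may be used for a given purpose."""
    return [e["name"] for e in CATALOG
            if tag in e["tags"] and use in e["permitted_use"]]

print(discover("crime", "research"))  # ['open_crime_statistics']
```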
5. Text Mining in Economics
Abstract
Annual business reports – containing consolidated financial statements and management reports, as well as other information – are statutory instruments of financial accounting in Germany. They are an important source of information for business analysts. The information in management reports is mostly unstructured text, and is therefore difficult for an algorithm to analyze. For some analysis questions related to economic research, we have identified techniques from speech technology that can effectively support the analysis, and we have implemented these techniques in a prototype. It became clear that an approach based on semantic analysis and ontological information is useful for this purpose. Natural Language Processing (NLP) techniques are used to help build an ontology database.
Melanie Siegel
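As a minimal sketch of one step mentioned in the abstract, the following extracts frequent candidate terms from (invented) management-report text as raw input for an ontology database; real systems would use full NLP pipelines.

```python
# Frequency-based candidate term extraction (illustrative data).
from collections import Counter
import re

REPORT = """
Revenue increased due to strong demand. The risk management system
was extended. Risk management remains a priority. Currency risk and
interest rate risk were hedged.
"""

STOPWORDS = {"the", "a", "and", "was", "were", "due", "to", "remains"}

def candidate_terms(text: str) -> Counter:
    """Count unigrams and bigrams, skipping stopwords."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    bigrams = [" ".join(p) for p in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)

# Frequent terms such as 'risk' and 'risk management' become candidate
# concepts for the domain ontology.
print(candidate_terms(REPORT).most_common(5))
```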
6. Generating Natural Language Texts
Abstract
This chapter is about automatically generated natural language texts. We discuss why these texts are needed and which domains profit from them. During recent years, NLG technology has matured; millions of news articles are generated daily. However, there is potential for higher quality and more elaborate styles. For this purpose, new techniques have to be developed, and these techniques need to be semantically driven. As examples, we discuss (a) the use and retrieval of background information by following paths in knowledge graphs, (b) how to calculate and exploit information structure, and (c) how to hyper-personalise automatically generated news.
Hermann Bense, Ulrich Schade, Michael Dembach
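A minimal sketch of point (a) above: background information is retrieved by following a path of relations through a knowledge graph and inserted into a text template. The graph content and relation names are invented for illustration.

```python
# Template-based generation with knowledge-graph path traversal.
GRAPH = {
    ("borussia", "plays_in"): "stadium_x",
    ("stadium_x", "capacity"): "81,000",
    ("borussia", "coach"): "coach_y",
}

def follow(start: str, *relations: str) -> str:
    """Follow a path of relations through the graph."""
    node = start
    for rel in relations:
        node = GRAPH[(node, rel)]
    return node

def generate(team: str) -> str:
    # Two-step path: team -> stadium -> capacity, used as background info.
    capacity = follow(team, "plays_in", "capacity")
    return f"{team.title()} won yesterday's home match in front of up to {capacity} fans."

print(generate("borussia"))
```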
7. The Role of Ontologies in Sentiment Analysis
Abstract
In recent years, it has become standard behaviour for consumers to read and write evaluations by other consumers before buying a product or a service. For providers (such as hotels, authors, or producers), this is a great opportunity to learn what is important to their customers and what they dislike. However, this is only possible if they can quickly extract the information from the opinions expressed by customers – sentiment analysis – which requires automatic data processing in the case of large data volumes. Sentiment analysis depends heavily on words: sentiment words, negations, amplifiers, and words for the product or its aspects. If sentiment analysis is to achieve more than just classifying a sentence as positive or negative, and if it needs to identify the liked or hated attributes of a product and the scope of negation, it needs linguistic and ontological knowledge.
Melanie Siegel
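The following minimal, lexicon-based sketch shows the ingredients named in the abstract: sentiment words, negations and amplifiers. The lexicon and scoring scheme are illustrative; identifying product aspects would additionally require the ontological knowledge the chapter discusses.

```python
# Lexicon-based sentiment scoring with negation and amplification.
SENTIMENT = {"good": 1, "great": 2, "bad": -1, "terrible": -2}
NEGATIONS = {"not", "no", "never"}
AMPLIFIERS = {"very": 2.0, "really": 1.5}

def score(sentence: str) -> float:
    """Score a sentence, letting negations flip and amplifiers scale
    the polarity of the following sentiment word."""
    total, flip, amp = 0.0, 1, 1.0
    for token in sentence.lower().split():
        if token in NEGATIONS:
            flip = -1
        elif token in AMPLIFIERS:
            amp = AMPLIFIERS[token]
        elif token in SENTIMENT:
            total += flip * amp * SENTIMENT[token]
            flip, amp = 1, 1.0  # scope ends at the sentiment word
    return total

print(score("the room was very good"))  # 2.0
print(score("the food was not good"))   # -1.0
```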
8. Building Concise Text Corpora from Web Contents
Abstract
This is a report on ongoing work in a research project for Small and Medium-sized Enterprises (SMEs), funded by the German Federal Ministry of Education and Research (Funding ID: 01IS15056D; project duration: Jan 2016 – Dec 2017). The project, named OntoPMS, targets post market surveillance (PMS) of medical devices as required by the Medical Device Regulation (Medical Device Regulation (EU) 2017/745 of the European Parliament and of the Council of 5 April 2017 on medical devices, OJ. L, pp 1–175, 2017), which entered into force following formal publication in May 2017. Being a regulation, it is immediately legally binding in all member states of the European Union. The project aims at providing both technical support and assisting procedures to satisfy article 4 of the MDR: “Key elements of the existing regulatory approach, such as the supervision of notified bodies, conformity assessment procedures, clinical investigations and clinical evaluation, vigilance and market surveillance should be significantly reinforced, whilst provisions ensuring transparency and traceability regarding medical devices should be introduced, to improve health and safety.” This chapter focuses on one component of the software system under development, the corpus builder. This component retrieves scientific publications of interest from the web and other sources, checks them for relevance, and transfers them to a linguistic corpus and, in parallel, to a search engine based on the open source package Elasticsearch. The challenge here was not to take everything one can get hold of (whole-web crawling) but to find and take only those publications that really belong to the domain of interest and are relevant with respect to surveillance aspects. The dictum, then, was to build comprehensive yet minimal corpora for the purposes at hand. Although the software has been developed in the context of medical device PMS, its use is not limited to this specific application area.
Wolfram Bartussek
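A sketch of the relevance gate described in the abstract: only publications whose text matches a domain vocabulary well enough enter the corpus and the search index. The threshold, vocabulary, and index name are assumptions, and the Elasticsearch call is shown commented out since it requires a running server.

```python
# Domain-relevance filtering for a concise corpus (illustrative data).
DOMAIN_TERMS = {"implant", "catheter", "adverse", "recall", "sterilization"}

def relevance(text: str) -> float:
    """Fraction of domain terms occurring in the document."""
    tokens = set(text.lower().split())
    return len(DOMAIN_TERMS & tokens) / len(DOMAIN_TERMS)

def accept(doc: dict, threshold: float = 0.4) -> bool:
    return relevance(doc["text"]) >= threshold

docs = [
    {"id": 1, "text": "Recall of a catheter after adverse events with the implant"},
    {"id": 2, "text": "Quarterly revenue grew in the consumer electronics segment"},
]

corpus = [d for d in docs if accept(d)]
print([d["id"] for d in corpus])  # [1]

# Indexing accepted documents (assumes a local Elasticsearch instance):
# from elasticsearch import Elasticsearch
# es = Elasticsearch("http://localhost:9200")
# for d in corpus:
#     es.index(index="pms-corpus", id=d["id"], document=d)
```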
9. Ontology-Based Modelling of Web Content: Example Leipzig Health Atlas
Abstract
The realisation of a complex web portal, including the modelling of content, is a challenging process. The contents describe different interconnected entities that form a complex structure. The entities and their relations have to be systematically analysed, and the content has to be specified and integrated into a content management system (CMS). Ontologies provide a suitable solution for modelling and specifying complex entities and their relations. However, the functionality for automated import of ontologies is not available in current content management systems.
In order to describe the content of a web portal, we developed an ontology. Based on this ontology, we implemented a pipeline that allows the specification of the portal’s content and its import into the CMS Drupal. Our method is generic. It enables the development of web portals with a focus on a suitable representation of structured knowledge (entities, their properties and relations). Furthermore, it makes it possible to represent existing ontologies in such a way that their content can be understood by users without knowledge of ontologies and their semantics.
Our approach has successfully been applied in building the LHA (Leipzig Health Atlas) portal, which provides access to metadata, data, publications and methods from various research projects at the University of Leipzig.
Alexandr Uciteli, Christoph Beger, Katja Rillich, Frank A. Meineke, Markus Loeffler, Heinrich Herre
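A minimal sketch of the pipeline idea: entities and relations specified in an ontology are exported as flat records a CMS importer can consume. The ontology content and field names are invented, and the actual LHA pipeline targeting Drupal is considerably more elaborate.

```python
# Exporting ontology entities as CMS-importable records (rdflib assumed).
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/portal/")
g = Graph()
g.add((EX.project1, RDF.type, EX.Project))
g.add((EX.project1, RDFS.label, Literal("Genome Study A")))
g.add((EX.project1, EX.hasPublication, EX.pub1))
g.add((EX.pub1, RDFS.label, Literal("Results of Genome Study A")))

def export_nodes(graph: Graph) -> list[dict]:
    """Turn each typed entity into a flat record for the CMS importer."""
    nodes = []
    for subject, _, kind in graph.triples((None, RDF.type, None)):
        label = graph.value(subject, RDFS.label)
        links = [str(o) for o in graph.objects(subject, EX.hasPublication)]
        nodes.append({"id": str(subject), "type": str(kind),
                      "title": str(label), "publications": links})
    return nodes

print(export_nodes(g))
```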
10. Personalised Clinical Decision Support for Cancer Care
Abstract
Medical consultants face increasing challenges in keeping up to date with the rapid development of new treatments and medications. Information providers offer evidence-based medical information services, continuously taking into account new publications and medical developments. This article describes a personalised clinical decision support system for cancer care. Data from evidence-based medical knowledge services are semantically linked with electronic health records and presented to consultants at the point of care.
Bernhard G. Humm, Paul Walsh
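An illustrative sketch of the linking step: entries from a medical knowledge service are matched to a patient record via shared concept codes, so relevant guidance surfaces at the point of care. The codes and entries are invented and do not describe the system in the chapter.

```python
# Concept-based linking of knowledge-service entries to a patient record.
PATIENT_RECORD = {"id": "p42", "diagnoses": {"C50.9"}, "medications": {"tamoxifen"}}

KNOWLEDGE_SERVICE = [
    {"title": "Updated therapy guideline", "concepts": {"C50.9"}},
    {"title": "Drug interaction warning", "concepts": {"tamoxifen", "warfarin"}},
    {"title": "Screening recommendation", "concepts": {"Z12.31"}},
]

def relevant_entries(record: dict) -> list[str]:
    """Knowledge entries sharing at least one concept with the record."""
    patient_concepts = record["diagnoses"] | record["medications"]
    return [e["title"] for e in KNOWLEDGE_SERVICE
            if e["concepts"] & patient_concepts]

print(relevant_entries(PATIENT_RECORD))
```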
11. Applications of Temporal Conceptual Semantic Systems
Abstract
The challenging problem of describing and understanding multidimensional temporal data in practice can often be solved by employing Formal Concept Analysis (FCA) and its temporal extension, Temporal Concept Analysis (TCA). These mathematical theories are based on a formal representation of the philosophical notion of concept. Using concept lattices constructed from (temporal) data, a general notion of the state of a temporal object is introduced. This notion is granularity-dependent, so that factorisations of temporal systems with their state spaces and trajectories of temporal objects can be generated easily. This is demonstrated by an example in the chemical industry, where the behavior of a distillation column is made semantically understandable by graphically representing seven variables simultaneously.
Karl Erich Wolff
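The following tiny FCA sketch shows the two derivation operators on a formal context and the formal concept generated by a set of objects; the context (objects × attributes) is an invented toy example, not the distillation-column data.

```python
# Formal Concept Analysis: derivation operators on a formal context.
CONTEXT = {
    "measurement1": {"high_temperature", "high_pressure"},
    "measurement2": {"high_temperature", "low_pressure"},
    "measurement3": {"high_temperature", "high_pressure"},
}

def intent(objects: set[str]) -> set[str]:
    """Attributes shared by all given objects."""
    attrs = [CONTEXT[o] for o in objects]
    return set.intersection(*attrs) if attrs else set()

def extent(attributes: set[str]) -> set[str]:
    """Objects having all given attributes."""
    return {o for o, a in CONTEXT.items() if attributes <= a}

# A formal concept is a pair (extent, intent) closed under both operators;
# in TCA, such concepts serve as granularity-dependent states.
A = {"measurement1"}
B = intent(A)
print(extent(B), B)  # measurement1 and measurement3 share the same state
```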
12. Context-Aware Documentation in the Smart Factory
Abstract
In every factory environment, errors and maintenance situations may occur, and they must be handled quickly and accurately. This article describes a semantic application that automatically retrieves technical documentation for fixing such errors and presents it to factory personnel. For this, raw machine data is collected and semantically enriched using Complex Event Processing (CEP). Semantic events are mapped to technical documentation via an ontology. Particular focus is placed on the user experience of the semantic application.
Ulrich Beez, Lukas Kaupp, Tilman Deuschel, Bernhard G. Humm, Fabienne Schumann, Jürgen Bock, Jens Hülsmann
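A minimal sketch of the chain described in the abstract: raw machine readings are enriched into a semantic event by a simple CEP-style windowed rule, and the event type is mapped to documentation via an ontology-like lookup. Thresholds, event names and document paths are assumptions.

```python
# CEP-style enrichment: raw readings -> semantic event -> documentation.
RAW_EVENTS = [
    {"sensor": "spindle_temp", "value": 71},
    {"sensor": "spindle_temp", "value": 84},
    {"sensor": "spindle_temp", "value": 92},
]

EVENT_TO_DOC = {  # stands in for the ontology mapping
    "SpindleOverheating": "manual/section-7-cooling-system.html",
}

def detect(events: list[dict], threshold: int = 80, window: int = 2) -> str | None:
    """Fire a semantic event if `window` consecutive readings exceed the threshold."""
    run = 0
    for e in events:
        run = run + 1 if e["value"] > threshold else 0
        if run >= window:
            return "SpindleOverheating"
    return None

event = detect(RAW_EVENTS)
if event:
    print(f"{event}: show {EVENT_TO_DOC[event]} to the operator")
```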
13. Knowledge-Based Production Planning for Industry 4.0
Abstract
Today and tomorrow – in the era of Digital Production Environments and Industry 4.0 – the production planning and manufacturing of a new product take place in various partial steps, mostly in different locations, potentially distributed all over the world. In this application context, Collaborative Adaptive Production Process Planning can be supported by semantic product data management approaches that enable production-knowledge representation and management as well as knowledge sharing, access, and reuse in a flexible and efficient way. To support such scenarios, semantic representations of production knowledge integrated into a machine-readable process formalization are a key enabling factor for sharing such explicit knowledge resources in cloud-based knowledge repositories. We will introduce such a method and provide a corresponding prototypical Proof-of-Concept implementation called Knowledge-Based Production Planning (KPP).
Furthermore, the ProSTEP iViP Association recently published a White Paper entitled “Modern Production Planning Processes” that is based on the currently emerging ISO/DIS 18828-2 Standard. This recommendation represents a formal end-to-end reference process that can be adapted to individual needs, the so-called Reference Planning Process (RPP).
In this chapter, we will explain KPP in detail. Further, as a basis for evaluation and validation, we use the KPP approach as a possible reference implementation of RPP. We will also demonstrate the usability and interoperability of the Proof-of-Concept implementation of KPP, which includes an integrated visual process editor supporting direct manipulation. Moreover, we will illustrate the first prototype of the KPP Mediator Architecture, including a user-friendly query library based on the KPP ontology.
Benjamin Gernhardt, Tobias Vogel, Matthias Hemmje
14. Automated Rights Clearance Using Semantic Web Technologies: The DALICC Framework
Abstract
The creation of derivative data works, e.g. for purposes such as content creation, service delivery or process automation, is often accompanied by legal uncertainty about usage rights and high costs in the clearance of licensing issues. DALICC stands for Data Licenses Clearance Center. It supports legal experts, innovation managers and application developers in the legally secure reutilization of third-party data and software. DALICC is a Semantic Web enabled software framework that allows licenses to be attached to a specific asset in a machine-readable format and supports the clearance of rights by providing the user with information about the equivalence, similarity and compatibility of licenses used in combination in a derivative work. In essence, DALICC helps determine which information can be shared with whom, to what extent and under which conditions, thus lowering the costs of rights clearance and stimulating the data economy.
Tassilo Pellegrini, Victor Mireles, Simon Steyskal, Oleksandra Panasiuk, Anna Fensel, Sabrina Kirrane
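A naive illustration of license compatibility checking, not the DALICC model itself: each license is reduced to sets of permissions and duties, permissions intersect across a combination, and duties accumulate. The simplified permission sets are assumptions.

```python
# Naive license combination check (illustrative simplification).
LICENSES = {
    "CC-BY-4.0":    {"permits": {"share", "modify", "commercial"},
                     "duties": {"attribution"}},
    "CC-BY-NC-4.0": {"permits": {"share", "modify"},
                     "duties": {"attribution"}},
}

def combined_permissions(a: str, b: str) -> set[str]:
    """Actions permitted for a derivative work under both licenses."""
    return LICENSES[a]["permits"] & LICENSES[b]["permits"]

def combined_duties(a: str, b: str) -> set[str]:
    """Duties accumulate: the derivative must satisfy both."""
    return LICENSES[a]["duties"] | LICENSES[b]["duties"]

perms = combined_permissions("CC-BY-4.0", "CC-BY-NC-4.0")
print("commercial" in perms)  # False: the NC license restricts the combination
print(combined_duties("CC-BY-4.0", "CC-BY-NC-4.0"))  # {'attribution'}
```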
15. Managing Cultural Assets: Challenges for Implementing Typical Cultural Heritage Archive’s Usage Scenarios
Abstract
In the domain of cultural heritage, many archives and data collections already exist. One of the main challenges for curators and archivists in this domain is therefore exchanging data with other archives and integrating similar collections. Typical usage scenarios in this area can only be realized by semantically integrating all available data sources, so the main task of semantic integration consists of bridging the different levels of heterogeneity. Semantic Web technologies might be a solution for these challenges. Ontology matching, for example, is already applied successfully for bridging some of the heterogeneity types. However, it is not suitable for resolving heterogeneity conflicts at all levels, due to unsatisfactory matching quality. Still, matching is crucial for the success of semantically integrating the distributed data sources. As a result, a major part of semantic integration is still done manually by domain experts. It would be preferable to support their work with at least semi-automatic techniques.
In addition, a huge number of commonly used standard vocabularies and taxonomies already exist in the domain of cultural heritage. Automatically matching them, and resolving the heterogeneities between them through appropriate mappings, could facilitate semantic integration; selecting and applying fitting, prevalent vocabularies and taxonomies could ease these efforts further. Furthermore, standards for archives exist even on the conceptual level, and their application supports digital archives in their main tasks.
This chapter describes typical usage scenarios in the domain of cultural heritage archives and discusses the use of Semantic Web technologies to implement these scenarios. In addition, commonly used standards and vocabularies in this area are presented, and ways to integrate these standards are discussed.
Kerstin Diwisch, Felix Engel, Jason Watkins, Matthias Hemmje
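A minimal sketch of label-based ontology matching as a semi-automatic aid: concept labels from two vocabularies are compared with a string similarity measure, and candidate mappings above a threshold are proposed to the domain expert for confirmation. The labels and threshold are invented.

```python
# Candidate-mapping proposal via string similarity (illustrative labels).
from difflib import SequenceMatcher

VOCAB_A = ["painting", "sculpture", "photograph"]
VOCAB_B = ["paintings", "sculptural work", "photography", "manuscript"]

def propose_mappings(threshold: float = 0.7) -> list[tuple[str, str, float]]:
    """Candidate equivalences between the two vocabularies."""
    candidates = []
    for a in VOCAB_A:
        for b in VOCAB_B:
            sim = SequenceMatcher(None, a, b).ratio()
            if sim >= threshold:
                candidates.append((a, b, round(sim, 2)))
    return sorted(candidates, key=lambda c: -c[2])

# The domain expert reviews and confirms or rejects each proposal.
for a, b, sim in propose_mappings():
    print(f"{a}  ~  {b}   (similarity {sim})")
```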
16. The Semantic Process Filter Bubble
Abstract
Business process models form a relevant element of enterprise knowledge in general and can be used to navigate through enterprise knowledge spaces. Tagging models and model elements, and identifying users and their roles, helps recommend pertinent documents, guidelines and other types of information. Thus, a filter bubble is created that supports model readers in accessing relevant information.
Christian Fillies, Frauke Weichhardt, Henrik Strauß
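A small sketch of this idea: process model elements carry tags, users have roles, and documents are ranked by tag overlap with both the current model element and the user's role. All tags, roles and documents are invented for illustration.

```python
# Tag- and role-based document recommendation (illustrative data).
DOCUMENTS = [
    {"title": "Invoice approval guideline", "tags": {"invoice", "approval", "finance"}},
    {"title": "Onboarding checklist", "tags": {"hr", "onboarding"}},
    {"title": "Payment run manual", "tags": {"invoice", "payment", "finance"}},
]

ROLE_TAGS = {"accountant": {"finance", "invoice"}, "hr_manager": {"hr"}}

def recommend(element_tags: set[str], role: str) -> list[str]:
    """Rank documents by tag overlap with the model element and the role."""
    interest = element_tags | ROLE_TAGS.get(role, set())
    scored = [(len(d["tags"] & interest), d["title"]) for d in DOCUMENTS]
    return [title for score, title in sorted(scored, reverse=True) if score > 0]

# An accountant viewing the 'check invoice' activity:
print(recommend({"invoice", "approval"}, "accountant"))
```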
17. Domain-Specific Semantic Search Applications: Example SoftwareFinder
Abstract
Domain-specific semantic search applications extend traditional full-text search by semantic application logic, supporting a specific use case. This article describes SoftwareFinder, a semantic search application for software components. Its features are semantic faceted search, semantic auto-suggest, and similar product recommendations. Software architecture rationales, as well as a methodology for ontology development, are presented.
Bernhard G. Humm, Hesam Ossanloo
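A small sketch of semantic auto-suggest as named in the abstract: user input is matched not only against preferred concept names but also against synonyms from the vocabulary, and the canonical concept is suggested. The software-component vocabulary is invented for illustration.

```python
# Synonym-aware auto-suggest over a controlled vocabulary.
VOCABULARY = {
    "relational database": {"rdbms", "sql database"},
    "message queue": {"message broker", "mq"},
    "web framework": {"http framework"},
}

def suggest(prefix: str) -> list[str]:
    """Suggest canonical concepts whose name or any synonym matches the prefix."""
    prefix = prefix.lower()
    hits = []
    for concept, synonyms in VOCABULARY.items():
        if any(label.startswith(prefix) for label in {concept, *synonyms}):
            hits.append(concept)
    return sorted(hits)

print(suggest("rdb"))  # ['relational database']  (matched via synonym)
print(suggest("mes"))  # ['message queue']
```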
Backmatter
Metadata
Title
Semantic Applications
Edited by
Dr. Thomas Hoppe
Prof. Dr. Bernhard Humm
Anatol Reibold
Copyright Year
2018
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-662-55433-3
Print ISBN
978-3-662-55432-6
DOI
https://doi.org/10.1007/978-3-662-55433-3
