
2009 | Book

Journal on Data Semantics XIII

edited by: Stefano Spaccapietra, Esteban Zimányi, Il-Yeol Song

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

The LNCS Journal on Data Semantics is devoted to the presentation of notable work that, in one way or another, addresses research and development on issues related to data semantics. The scope of the journal ranges from theories supporting the formal definition of semantic content to innovative domain-specific applications of semantic knowledge. The journal addresses researchers and advanced practitioners working on the semantic web, interoperability, mobile information services, data warehousing, knowledge representation and reasoning, conceptual database modeling, ontologies, and artificial intelligence.

Volume XIII constitutes a special issue on semantic data warehouses. The papers in this volume address several topics within this relatively new domain, providing different insights into the multiple benefits that can be gained by envisioning data warehouses from a semantic perspective. These papers broach many new ideas to be addressed in future work.

Table of Contents

Frontmatter
Multidimensional Integrated Ontologies: A Framework for Designing Semantic Data Warehouses
Abstract
The Semantic Web enables organizations to attach semantic annotations taken from domain and application ontologies to the information they generate. The concepts in these ontologies could describe the facts, dimensions and categories implied in the analysis subjects of a data warehouse. In this paper we propose the Semantic Data Warehouse to be a repository of ontologies and semantically annotated data resources. We also propose an ontology-driven framework to design multidimensional analysis models for Semantic Data Warehouses. This framework provides means for building a Multidimensional Integrated Ontology (MIO) including the classes, relationships and instances that represent interesting analysis dimensions, and it can also be used to check the properties required by current multidimensional databases (e.g., dimension orthogonality and category satisfiability). In this paper we also sketch how the instance data of a MIO can be translated into OLAP cubes for analysis purposes. Finally, some implementation issues of the overall framework are discussed.
Victoria Nebot, Rafael Berlanga, Juan Manuel Pérez, María José Aramburu, Torben Bach Pedersen
A Unified Object Constraint Model for Designing and Implementing Multidimensional Systems
Abstract
Models for representing multidimensional systems usually consider that facts and dimensions are two different things. In this paper we propose a model based on UML which unifies the representations of fact and of dimension members. Since a given element can play the role of a fact or of a dimension member, this model allows for more flexibility in the design and the implementation of multidimensional systems. Moreover this model offers the possibility to express various constraints to guarantee desirable properties for data. We then show that this model is able to handle most of the hierarchies which have been suggested to take real situations into account and to characterize certain properties of summarizability. Using this model we propose a complete development cycle of a multidimensional system. It appears that this cycle can be partially automated and that an end user can control the design and the implementation of his system himself.
François Pinet, Michel Schneider
Modeling Data Warehouse Schema Evolution over Extended Hierarchy Semantics
Abstract
Models for conceptual design of data warehouse schemas have been proposed, but few researchers have addressed schema evolution in a formal way and none have presented software tools for enforcing the correctness of multidimensional schema evolution operators. We generalize the core features typically found in data warehouse data models, along with modeling extended hierarchy semantics. The advanced features include multiple hierarchies, non-covering hierarchies, non-onto hierarchies, and non-strict hierarchies. We model the constructs in the Uni-level Description Language (ULD) as well as using a multilevel dictionary definition (MDD) approach. The ULD representation provides a formal foundation to specify transformation rules for the semantics of schema evolution operators. The MDD gives a basis for direct implementation in a relational database system; we define model constraints and then use the constraints to maintain integrity when schema evolution operators are applied. This paper contributes a formalism for representing data warehouse schemas and determining the validity of schema evolution operators applied to a schema. We describe a software tool that allows for visualization of the impact of schema evolution through the use of triggers and stored procedures.
Sandipto Banerjee, Karen C. Davis
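As a rough illustration of the extended hierarchy semantics named in the abstract above, the sketch below checks a toy dimension for non-strict, non-covering, and non-onto members. It is not the paper's ULD or MDD formalism. The level names, member names, and the functions `non_strict`, `non_covering`, and `non_onto` are all hypothetical, and the definitions assumed are the common ones: a hierarchy is non-strict when a member has more than one parent, non-covering when a roll-up edge skips a level, and non-onto when a member above the bottom level has no children.

```python
from collections import defaultdict

# Hypothetical toy dimension; levels are ordered from finest to coarsest.
LEVELS = ["City", "State", "Country"]

# Hypothetical members and the level each one belongs to.
LEVEL_OF = {
    "Cincinnati": "City", "Kansas City": "City", "Monaco-Ville": "City",
    "Ohio": "State", "Missouri": "State", "Kansas": "State", "Alaska": "State",
    "USA": "Country", "Monaco": "Country",
}

# Roll-up edges (child, parent), invented for illustration only.
ROLLUP = [
    ("Cincinnati", "Ohio"),
    ("Kansas City", "Missouri"),
    ("Kansas City", "Kansas"),      # second parent: non-strict
    ("Monaco-Ville", "Monaco"),     # skips the State level: non-covering
    ("Ohio", "USA"), ("Missouri", "USA"),
    ("Kansas", "USA"), ("Alaska", "USA"),  # Alaska has no cities: non-onto
]


def non_strict(rollup):
    """Members that roll up to more than one parent."""
    parents = defaultdict(set)
    for child, parent in rollup:
        parents[child].add(parent)
    return sorted(c for c, ps in parents.items() if len(ps) > 1)


def non_covering(rollup, level_of, levels):
    """Roll-up edges that skip at least one intermediate level."""
    rank = {name: i for i, name in enumerate(levels)}
    return sorted(
        (c, p) for c, p in rollup
        if rank[level_of[p]] - rank[level_of[c]] > 1
    )


def non_onto(rollup, level_of, levels):
    """Members above the bottom level that have no children."""
    has_child = {parent for _, parent in rollup}
    bottom = levels[0]
    return sorted(
        m for m, lvl in level_of.items()
        if lvl != bottom and m not in has_child
    )


strict_violations = non_strict(ROLLUP)
covering_violations = non_covering(ROLLUP, LEVEL_OF, LEVELS)
onto_violations = non_onto(ROLLUP, LEVEL_OF, LEVELS)
```

In a relational implementation, checks of this kind would more likely be expressed as constraints or triggers, which is the direction the abstract describes.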
An ETL Process for OLAP Using RDF/OWL Ontologies
Abstract
In this paper, we present an advanced method for on-demand construction of OLAP cubes for ROLAP systems. The method contains the steps from cube design to ETL but focuses on ETL. Actual data analysis can then be done using the tools and methods of the OLAP software at hand. The method is based on RDF/OWL ontologies and design tools. The ontology serves as a basis for designing and creating the OLAP schema, its corresponding database tables, and finally populating the database.
Our starting point is heterogeneous and distributed data sources that are eventually used to populate the OLAP cubes. Mapping between the source data and its OLAP form is done by converting the data first to RDF using ontology maps. Then the data are extracted from their RDF form by queries that are generated using the ontology of the OLAP schema. Finally, the extracted data are stored in the database tables and analysed using OLAP software. Algorithms and examples are provided for all these steps.
In our tests, we have used an open source OLAP implementation and a database server. The performance of the system was found to be satisfactory when testing with a data source of 450 000 RDF statements. We also propose an ontology-based tool that will work as a user interface to the system, from design to actual analysis.
Marko Niinimäki, Tapio Niemi
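The three steps outlined in the abstract above (convert source data to RDF, extract it by queries, load it into database tables) can be sketched very loosely as follows. This is not the authors' algorithm: plain Python tuples stand in for RDF statements, a dictionary stands in for an ontology map, and SQLite stands in for the ROLAP database. The records, the `ex:` property names, the `sales_fact` table, and all function names are hypothetical.

```python
import sqlite3

# Hypothetical source records, invented for illustration.
SOURCE = [
    {"id": "s1", "shop": "Helsinki", "year": 2008, "amount": 120.0},
    {"id": "s2", "shop": "Helsinki", "year": 2009, "amount": 80.0},
    {"id": "s3", "shop": "Geneva", "year": 2009, "amount": 50.0},
]

# Hypothetical stand-in for an ontology map: source field -> property name.
FIELD_TO_PROPERTY = {
    "shop": "ex:soldAt",
    "year": "ex:inYear",
    "amount": "ex:amount",
}


def to_triples(records, mapping):
    """Step 1: turn each record into (subject, property, value) statements."""
    triples = []
    for rec in records:
        subject = "ex:" + rec["id"]
        for field, prop in mapping.items():
            triples.append((subject, prop, rec[field]))
    return triples


def extract_rows(triples, properties):
    """Step 2: collect one row per subject that has every requested property."""
    by_subject = {}
    for subject, prop, value in triples:
        by_subject.setdefault(subject, {})[prop] = value
    return [
        tuple(props[p] for p in properties)
        for _, props in sorted(by_subject.items())
        if all(p in props for p in properties)
    ]


def load_and_aggregate(rows):
    """Step 3: load rows into a fact table and run a simple roll-up query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_fact (shop TEXT, year INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    totals = conn.execute(
        "SELECT shop, SUM(amount) FROM sales_fact GROUP BY shop ORDER BY shop"
    ).fetchall()
    conn.close()
    return totals


triples = to_triples(SOURCE, FIELD_TO_PROPERTY)
rows = extract_rows(triples, ["ex:soldAt", "ex:inYear", "ex:amount"])
totals = load_and_aggregate(rows)
```

A real pipeline would use an RDF store and generated queries in place of the in-memory tuple scan; the sketch only shows how the stages hand data to each other.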
Ontology-Driven Conceptual Design of ETL Processes Using Graph Transformations
Abstract
One of the main tasks during the early steps of a data warehouse project is the identification of the appropriate transformations and the specification of inter-schema mappings from the source to the target data stores. This is a challenging task, requiring firstly the semantic and secondly the structural reconciliation of the information provided by the available sources. This task is a part of the Extract-Transform-Load (ETL) process, which is responsible for the population of the data warehouse. In this paper, we propose a customizable and extensible ontology-driven approach for the conceptual design of ETL processes. A graph-based representation is used as a conceptual model for the source and target data stores. We then present a method for devising flows of ETL operations by means of graph transformations. In particular, the operations comprising the ETL process are derived through graph transformation rules, the choice and applicability of which are determined by the semantics of the data with respect to an attached domain ontology. Finally, we present our experimental findings that demonstrate the applicability of our approach.
Dimitrios Skoutas, Alkis Simitsis, Timos Sellis
Policy-Regulated Management of ETL Evolution
Abstract
In this paper, we discuss the problem of performing impact prediction for changes that occur in the schema/structure of the data warehouse sources. We abstract Extract-Transform-Load (ETL) activities as queries and sequences of views. ETL activities and their sources are uniformly modeled as a graph that is annotated with policies for the management of evolution events. Given a change at an element of the graph, our method detects the parts of the graph that are affected by this change and highlights the way they are tuned to respond to it. For many cases of ETL source evolution, we present rules so that both syntactical and semantic correctness of activities are retained. Finally, we experiment with the evaluation of our approach over real-world ETL workflows used in the Greek public sector.
George Papastefanatos, Panos Vassiliadis, Alkis Simitsis, Yannis Vassiliou
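The idea of a policy-annotated graph in the abstract above can be illustrated with a small sketch. It assumes a much simpler model than the paper's: each node lists the nodes that depend on it, and each node carries one of two hypothetical policies, `"propagate"` (pass the change on to its dependents) or `"block"` (absorb the change). The node names and the function `predict_impact` are invented for illustration.

```python
from collections import deque

# Hypothetical ETL graph: node -> nodes that consume its output.
DEPENDENTS = {
    "source.customers": ["etl.clean_customers"],
    "etl.clean_customers": ["etl.join_orders", "view.customer_report"],
    "etl.join_orders": ["dw.sales_fact"],
    "view.customer_report": [],
    "dw.sales_fact": [],
}

# Hypothetical policies; nodes not listed default to "propagate".
POLICY = {"etl.join_orders": "block"}


def predict_impact(changed, dependents, policy):
    """Return {affected node: its reaction} for a change at `changed`.

    The walk is breadth-first. A node that blocks the event is reported
    as affected, but the event does not travel past it.
    """
    affected = {}
    seen = {changed}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for nxt in dependents.get(node, []):
            if nxt in seen:
                continue
            seen.add(nxt)
            reaction = policy.get(nxt, "propagate")
            affected[nxt] = reaction
            if reaction == "propagate":
                queue.append(nxt)
    return affected


impact = predict_impact("source.customers", DEPENDENTS, POLICY)
```

Here `dw.sales_fact` is absent from the result because the only path to it runs through a node that blocks the event.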
Backmatter
Metadata
Title
Journal on Data Semantics XIII
Edited by
Stefano Spaccapietra
Esteban Zimányi
Il-Yeol Song
Copyright year
2009
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-03098-7
Print ISBN
978-3-642-03097-0
DOI
https://doi.org/10.1007/978-3-642-03098-7
