2008 | Book

Advanced Data Warehouse Design

From Conventional to Spatial and Temporal Applications

Authors: Elzbieta Malinowski, Esteban Zimányi

Publisher: Springer Berlin Heidelberg

Book series: Data-Centric Systems and Applications

About this book

A data warehouse stores large volumes of historical data required for analytical purposes. This data is extracted from operational databases; transformed into a coherent whole using a multidimensional model that includes measures, dimensions, and hierarchies; and loaded into a data warehouse during the extraction-transformation-loading (ETL) process.

Malinowski and Zimányi explain conventional data warehouse design in detail, with particular attention to complex hierarchy modeling. They also address two innovative domains recently introduced to extend the capabilities of data warehouse systems: the management of spatial and of temporal information. Their presentation covers the different phases of the design process: requirements specification, conceptual design, logical design, and physical design. They include three approaches to requirements specification, depending on whether users, operational data sources, or both are the driving force in the requirements gathering process, and they show how each approach leads to the creation of a conceptual multidimensional model. Throughout the book the concepts are illustrated with many real-world examples and complemented by sample implementations for Microsoft's Analysis Services 2005 and Oracle 10g with the OLAP and the Spatial extensions.

For researchers, this book serves as an introduction to the state of the art in data warehouse design, with many references to more detailed sources. Because it provides a clear and concise presentation of the major concepts and results of data warehouse design, it can also be used as the basis of a graduate or advanced undergraduate course. The book may help experienced data warehouse designers to broaden their analysis possibilities by incorporating spatial and temporal information. Finally, experts in spatial databases or in geographical information systems could benefit from the data warehouse vision for building innovative spatial analytical applications.

Table of Contents

Frontmatter
1. Introduction
Organizations today are facing increasingly complex challenges in terms of management and problem solving in order to achieve their operational goals. This situation compels people in those organizations to utilize analysis tools that will better support their decisions. Decision support systems provide assistance to managers at various organizational levels for analyzing strategic information. These systems collect vast amounts of data and reduce them to a form that can be used to analyze organizational behavior [54].
Elzbieta Malinowski, Esteban Zimányi
2. Introduction to Databases and Data Warehouses
This chapter introduces the basic concepts of databases and data warehouses. It compares the two fields and stresses the differences and complementarities between them. The aim of this chapter is to define the terminology and the framework used in the rest of the book, not to provide an extensive coverage of these fields. The outline of this chapter is as follows.
Elzbieta Malinowski, Esteban Zimányi
3. Conventional Data Warehouses
The advantages of using conceptual models for designing applications are well known. In particular, conceptual models facilitate communication between users and designers, since they do not require knowledge about specific features of the underlying implementation platform. Further, schemas developed using conceptual models can be mapped to various logical models, such as relational, object-relational, or object-oriented models, thus simplifying responses to changes in the technology used. Moreover, conceptual models facilitate the maintenance and evolution of applications, since they focus on users’ requirements; as a consequence, they provide better support for subsequent changes in the logical and implementation schemas.
Elzbieta Malinowski, Esteban Zimányi
4. Spatial Data Warehouses
It is estimated that about 80% of the data stored in databases has a spatial or location component [260]. Therefore, the location dimension has been widely used in data warehouse and OLAP systems. However, this dimension is usually represented in an alphanumeric, nonspatial manner (i.e., using solely the place name), since these systems are not able to manipulate spatial data. Nevertheless, it is well known that including spatial data in the analysis process can help to reveal patterns that are difficult to discover otherwise.
Elzbieta Malinowski, Esteban Zimányi
5. Temporal Data Warehouses
Current data warehouse and OLAP models include a time dimension that, like other dimensions, is used for grouping purposes (using the roll-up operation) or in a predicate role (using the slice-and-dice operation). The time dimension also indicates the time frame for measures (for example, in order to know how many units of a product were sold in March 2007). However, the time dimension cannot be used to keep track of changes in other dimensions, for example, when a product changes its ingredients or its packaging. Consequently, the “nonvolatile” and “time-varying” features included in the definition of a data warehouse (Sect. 2.5) apply only to measures, and this situation leaves to applications the responsibility of representing changes in dimensions. Kimball et al. [147] proposed several solutions for this problem in the context of relational databases, known as slowly changing dimensions. Nevertheless, these solutions are not satisfactory, since they either do not preserve the entire history of the data or are difficult to implement. Further, they do not take into account all the research that has been done in the field of temporal databases.
Elzbieta Malinowski, Esteban Zimányi
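The trade-off mentioned in the Chapter 5 abstract, where slowly changing dimension solutions either lose history or are harder to implement, can be illustrated with a minimal Python sketch. It is not taken from the book: the `ProductRow` class, the column names, and the sample product are hypothetical, and "Type 1" and "Type 2" are the labels commonly used for the overwrite and add-a-row techniques.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ProductRow:
    """One version of a product in a hypothetical product dimension table."""
    surrogate_key: int
    product_id: str                   # business key from the operational source
    packaging: str
    valid_from: date
    valid_to: Optional[date] = None   # None marks the current version


def apply_type1(rows: List[ProductRow], product_id: str,
                packaging: str) -> List[ProductRow]:
    """Type 1: overwrite the attribute in place; the old value is lost."""
    return [
        replace(r, packaging=packaging) if r.product_id == product_id else r
        for r in rows
    ]


def apply_type2(rows: List[ProductRow], product_id: str,
                packaging: str, change_date: date) -> List[ProductRow]:
    """Type 2: close the current version and append a new versioned row."""
    updated = []
    for r in rows:
        if r.product_id == product_id and r.valid_to is None:
            updated.append(replace(r, valid_to=change_date))
        else:
            updated.append(r)
    next_key = max((r.surrogate_key for r in rows), default=0) + 1
    updated.append(ProductRow(next_key, product_id, packaging, change_date))
    return updated


# Hypothetical sample data: one product whose packaging changes in March 2007.
dimension = [ProductRow(1, "P-100", "glass bottle", date(2006, 1, 1))]
type1 = apply_type1(dimension, "P-100", "plastic bottle")
type2 = apply_type2(dimension, "P-100", "plastic bottle", date(2007, 3, 1))
```

In this sketch `type1` still holds a single row and no longer records the glass bottle, while `type2` holds two rows with validity dates, at the cost of surrogate keys and interval bookkeeping that every query must then respect.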
6. Designing Conventional Data Warehouses
The development of a data warehouse is a complex and costly endeavor. A data warehouse project is similar in many respects to any software development project and requires the definition of the various activities that must be performed, which are related to requirements gathering, design, and implementation on an operational platform, among other things. Even though there is abundant literature in the area of software development (e.g., [48, 248, 282]), few publications have been devoted to the development of data warehouses. Some of these publications [15, 114, 119, 146, 242] have been written by practitioners and are based on their experience in building data warehouses. On the other hand, the scientific community has proposed a variety of approaches for developing data warehouses [28, 29, 35, 42, 47, 79, 82, 114, 167, 203, 221, 237, 246]. Nevertheless, many of these approaches target a specific conceptual model and are often too complex to be used in real-world environments. As a consequence, there is still a lack of a methodological framework that could guide developers in the various stages of the data warehouse development process. This situation results from the fact that the need to build data warehouse systems arose before the definition of formal approaches to data warehouse development, as was the case for operational databases [166].
Elzbieta Malinowski, Esteban Zimányi
7. Designing Spatial and Temporal Data Warehouses
Although spatial and temporal data warehouses have been investigated for several years, there is still a lack of a methodological framework for their design. This situation makes the task of developing spatial and temporal data warehouses more difficult, since designers and implementers do not have any indication about when and how spatial and temporal support may be included. In response to this necessity, in this chapter we propose methods for the design of spatial and temporal data warehouses.
Elzbieta Malinowski, Esteban Zimányi
8. Conclusions and Future Work
Today, many organizations use data warehouse and online analytical processing (OLAP) systems to support their decision-making processes. These systems use a multidimensional model to express users’ analysis requirements. A multidimensional model includes measures that represent the focus of analysis, dimensions used to analyze measures according to various viewpoints, and hierarchies that provide the possibility to consider measures at different levels of detail.
Elzbieta Malinowski, Esteban Zimányi
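As a rough illustration of the multidimensional model described in the Chapter 8 abstract, the following Python sketch treats a sales quantity as the measure, a store location and a month as dimensions, and a city-to-country mapping as a hierarchy. All names and figures are hypothetical and are not taken from the book.

```python
from collections import defaultdict

# Hypothetical hierarchy in a store dimension: city -> country.
CITY_TO_COUNTRY = {
    "Brussels": "Belgium",
    "Antwerp": "Belgium",
    "San José": "Costa Rica",
}

# Hypothetical fact rows: (city, month, quantity sold).
# The city and month are dimension members; the quantity is the measure.
SALES = [
    ("Brussels", "2007-03", 10),
    ("Antwerp", "2007-03", 5),
    ("San José", "2007-03", 7),
    ("Brussels", "2007-04", 3),
]


def roll_up_to_country(facts):
    """Aggregate the measure from the city level to the country level."""
    totals = defaultdict(int)
    for city, month, quantity in facts:
        totals[(CITY_TO_COUNTRY[city], month)] += quantity
    return dict(totals)


def slice_by_month(facts, month):
    """Keep only the facts for one member of the time dimension."""
    return [fact for fact in facts if fact[1] == month]


by_country = roll_up_to_country(SALES)
march_only = slice_by_month(SALES, "2007-03")
```

Here `by_country` shows the measure at a coarser level of detail, and `march_only` restricts the analysis to a single month, mirroring the roll-up and slice operations that OLAP systems provide over such a model.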
Backmatter
Metadata
Title
Advanced Data Warehouse Design
Authors
Elzbieta Malinowski
Esteban Zimányi
Copyright year
2008
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-74405-4
Print ISBN
978-3-540-74404-7
DOI
https://doi.org/10.1007/978-3-540-74405-4