
About this Book

Despite the significant ongoing work in the development of new database systems, many of the basic architectural and performance tradeoffs involved in their design have not previously been explored in a systematic manner. The designers of the various systems have adopted a wide range of strategies in areas such as process structure, client-server interaction, concurrency control, transaction management, and memory management.
This monograph investigates several fundamental aspects of the emerging generation of database systems. It describes and evaluates implementation techniques that provide high performance and scalability while maintaining the transaction semantics, reliability, and availability associated with more traditional database architectures. The common theme of the techniques developed here is the exploitation of client resources through caching-based data replication.
Client Data Caching: A Foundation for High Performance Object Database Systems should be of value to anyone interested in the performance and architecture of distributed information systems in general and Object-based Database Management Systems in particular. It provides useful information for designers of such systems, as well as for practitioners who need to understand the inherent tradeoffs among the architectural alternatives in order to evaluate existing systems. Furthermore, many of the issues addressed in this book are relevant to other systems beyond the ODBMS domain. Such systems include shared-disk parallel database systems, distributed file systems, and distributed virtual memory systems. The presentation is suitable for practitioners and advanced students in all of these areas, although a basic understanding of database transaction semantics and techniques is assumed.

Table of Contents

Frontmatter

1. Introduction

Abstract
In recent years, powerful technological and market forces have combined to effect a major shift in the nature of computing and data management. These forces have had a profound effect on the requirements for Database Management Systems (DBMSs) and hence, on the way such systems are designed and built. The widespread adoption of client-server architectures has made distributed computing the conventional mode of operation for many application domains. At the same time, new classes of applications and new programming paradigms have placed additional demands on database systems, resulting in an emerging generation of object-based database systems. The combination of these factors gives rise to significant challenges and performance opportunities in the design of modern DBMSs. This book proposes and examines a range of techniques to provide high performance and scalability for these new database systems while maintaining the transaction semantics, reliability, and availability associated with more traditional centralized and distributed DBMSs. The common theme of the techniques developed here is the utilization of client resources through caching-based data replication.
Michael J. Franklin

2. Client-Server Database Systems

Abstract
Client-server DBMS architectures can be categorized according to the unit of interaction between client and server processes. In general, clients can send data requests to the server as queries or as requests for specific data items. Systems of the former type are referred to as query-shipping systems and those of the latter type as data-shipping systems. These two alternatives are shown in Figure 2.1. In query-shipping systems, clients send a query to the server; the server then processes the query and sends the results back to the client. Queries may be sent as plain text (e.g., SQL), in a compiled representation, or as calls to precompiled queries that are stored at the server. In contrast, data-shipping systems perform the bulk of the work of query processing at the clients, and as a result, much more DBMS functionality is placed at the clients (see Figure 2.1). Data-shipping systems can be further categorized as page servers, which interact using physical units of data (e.g., individual pages or groups of pages such as segments), and object servers, which interact using logical units of data (e.g., tuples or objects). In a page server system, the client sends requests for particular database pages to the server. The server returns each requested page (and possibly others) to the client. The client is responsible for mapping between objects and pages. In an object server system, the client requests specific objects from the server; the server is responsible for mapping between objects and pages.
Michael J. Franklin
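To make the page server versus object server distinction concrete, here is a minimal, hypothetical sketch in Python; the class and method names are illustrative and not taken from the book. A page server ships whole pages and leaves the object-to-page mapping to the client, while an object server keeps that mapping at the server and ships individual objects.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    page_id: int
    objects: dict = field(default_factory=dict)  # object_id -> value


class PageServer:
    """Ships whole pages; the client maps objects to pages itself."""

    def __init__(self, pages):
        self.pages = {p.page_id: p for p in pages}

    def fetch_page(self, page_id):
        return self.pages[page_id]


class ObjectServer:
    """Ships individual objects; the object-to-page mapping stays at the server."""

    def __init__(self, pages):
        self.index = {}  # object_id -> (page_id, value)
        for p in pages:
            for oid, value in p.objects.items():
                self.index[oid] = (p.page_id, value)

    def fetch_object(self, object_id):
        _, value = self.index[object_id]
        return value


if __name__ == "__main__":
    db = [Page(1, {"a": 10, "b": 20}), Page(2, {"c": 30})]

    # Page server: the client receives page 1 and extracts object "a" itself.
    page = PageServer(db).fetch_page(1)
    print(page.objects["a"])                    # -> 10

    # Object server: the client asks for object "a" directly.
    print(ObjectServer(db).fetch_object("a"))   # -> 10
```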

3. Modeling a Page Server DBMS

Abstract
The performance analyses that are presented in the following chapters were obtained using a flexible and detailed simulation model. This chapter describes how the model captures the client-server execution, database, physical resources, and workloads of a page server DBMS, and discusses the methodology used during the experiments.
Michael J. Franklin
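As a rough illustration of what such a simulation model parameterizes, the sketch below defines a hypothetical configuration object covering the system, resource, and workload dimensions mentioned above; the parameter names and default values are purely illustrative and are not the book's actual settings.

```python
from dataclasses import dataclass


@dataclass
class SimulationConfig:
    # Illustrative parameters only, not the settings used in the book's experiments.
    num_clients: int = 25            # client workstations in the simulated system
    client_buffer_pages: int = 125   # per-client memory cache size (pages)
    server_buffer_pages: int = 250   # server memory cache size (pages)
    db_size_pages: int = 1250        # database size (pages)
    network_mbps: float = 8.0        # modeled network bandwidth
    disk_access_ms: float = 17.0     # modeled disk access time
    hot_region_pages: int = 50       # per-client "hot" region for skewed workloads
    write_prob: float = 0.2          # probability that an accessed page is updated


if __name__ == "__main__":
    print(SimulationConfig())
```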

4. Client Cache Consistency

Abstract
As discussed in the preceding chapters, client data caching is a fundamental technique for obtaining high performance and scalability in a page server DBMS. Since caching is a form of replication, a protocol is required to ensure that cached copies do not cause transaction semantics to be violated. In this chapter, we first describe the problem of cache consistency maintenance. We then provide a detailed taxonomy that organizes the design space for cache consistency maintenance algorithms. Finally, three families of algorithms are described in greater detail. The performance of these algorithms is explored in the subsequent chapter.
Michael J. Franklin
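The sketch below is a hypothetical, heavily simplified illustration of an avoidance-based, invalidation-style scheme in the spirit of callback locking, one of the families covered in the chapter: the server remembers which clients cache each page and calls back remote copies before a write is permitted. All names are illustrative and none of this is the book's actual protocol code.

```python
class ConsistencyServer:
    def __init__(self):
        self.copies = {}  # page_id -> set of client ids caching the page

    def register_read(self, client_id, page_id):
        """Client fetched the page; remember that it holds a cached copy."""
        self.copies.setdefault(page_id, set()).add(client_id)

    def request_write(self, client_id, page_id):
        """Before granting write permission, invalidate all other cached copies."""
        others = self.copies.get(page_id, set()) - {client_id}
        for other in others:
            self._callback_invalidate(other, page_id)
        self.copies[page_id] = {client_id}

    def _callback_invalidate(self, client_id, page_id):
        # In a real system this would be a message to the remote client,
        # which would purge (or downgrade) its cached copy before replying.
        print(f"callback: client {client_id} invalidates page {page_id}")


if __name__ == "__main__":
    server = ConsistencyServer()
    server.register_read(1, 42)
    server.register_read(2, 42)
    server.request_write(1, 42)   # triggers a callback to client 2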

5. Performance of Cache Consistency Algorithms

Abstract
This chapter presents a performance analysis of the three families of cache consistency maintenance algorithms described in Chapter 4, namely, Server-based Two-Phase Locking (S2PL), Optimistic Two-Phase Locking (O2PL), and Callback Locking (CB). Seven algorithms are studied under a range of workloads and system configurations. In addition to measuring the performance of these specific algorithms, the experiments presented here also provide insight into many of the tradeoffs involved in cache maintenance, including: optimism vs. pessimism, detection vs. avoidance, and invalidation vs. propagation. The analysis is performed using the simulation model described in Chapter 3. The parameter settings and workloads that are used in the study are described in the following section. The results are then presented in the two subsequent sections. Performance results for the server-based 2PL and optimistic 2PL algorithms are presented first. The performance of the callback locking algorithms is then studied and compared to the better of the S2PL and O2PL algorithms.
Michael J. Franklin
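One of the tradeoffs named above, invalidation versus propagation, can be illustrated with a small hypothetical sketch: when an update commits, remote cached copies are either purged (and must be re-fetched on the next access) or refreshed in place. The function and variable names are illustrative only.

```python
def commit_update(page_id, new_value, remote_caches, policy):
    """Apply a committed update to remote client caches under the given policy."""
    for cache in remote_caches:
        if page_id not in cache:
            continue
        if policy == "invalidate":
            # Remote client must re-fetch the page if it is accessed again.
            del cache[page_id]
        elif policy == "propagate":
            # Remote copy is refreshed in place; useful if the page stays hot there.
            cache[page_id] = new_value


if __name__ == "__main__":
    caches = [{7: "old"}, {7: "old"}, {}]
    commit_update(7, "new", caches, policy="invalidate")
    print(caches)   # -> [{}, {}, {}]

    caches = [{7: "old"}, {7: "old"}, {}]
    commit_update(7, "new", caches, policy="propagate")
    print(caches)   # -> [{7: 'new'}, {7: 'new'}, {}]
```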

6. Global Memory Management

Abstract
The preceding two chapters were concerned with exploiting client memory by caching in order to reduce dependence on the server. This chapter takes client caching a step further by treating the aggregate memory of the clients in the system as a global cache.
Michael J. Franklin
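A hypothetical sketch of the global-memory idea follows: a page missing from the requesting client's cache and from the server's buffer may still be served from another client's memory before falling back to the server disk. The names are illustrative and do not correspond to the book's algorithms.

```python
class GlobalMemoryServer:
    def __init__(self, on_disk):
        self.server_buffer = {}     # page_id -> page data cached in server memory
        self.client_caches = {}     # client_id -> {page_id: page data}
        self.on_disk = on_disk      # page_id -> page data on the server disk

    def fetch(self, requester, page_id):
        # 1. Server memory hit.
        if page_id in self.server_buffer:
            return self.server_buffer[page_id], "server memory"
        # 2. Forward the request to a peer client that caches the page.
        for client_id, cache in self.client_caches.items():
            if client_id != requester and page_id in cache:
                return cache[page_id], f"client {client_id} memory"
        # 3. Fall back to the server disk.
        return self.on_disk[page_id], "server disk"


if __name__ == "__main__":
    server = GlobalMemoryServer(on_disk={1: "p1", 2: "p2"})
    server.client_caches = {"A": {1: "p1"}, "B": {}}
    print(server.fetch("B", 1))   # ('p1', 'client A memory') -- no disk I/O needed
    print(server.fetch("B", 2))   # ('p2', 'server disk')
```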

7. Local Disk Caching

Abstract
This chapter explores the use of client disks as an additional resource that can be used to improve system performance and scalability. As was stated in Section 2.1, most current ODBMSs exploit client processor resources through the use of a data-shipping architecture, which enables much of the work of data manipulation to be performed at clients. Furthermore, as described in the preceding chapters, systems can exploit client memory resources through the use of intra- and inter-transaction caching and global memory management techniques. In contrast, existing systems provide only limited support for exploiting client disk resources. This omission is potentially costly, as client disks represent a valuable addition to the storage hierarchy of a client-server ODBMS due to their capacity and non-volatility. This chapter addresses this issue by investigating the performance gains that can be realized by adding client disks to the storage hierarchy of page server DBMSs.
Michael J. Franklin
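The sketch below illustrates, with hypothetical names, the extended client storage hierarchy the chapter studies: a page lookup tries client memory, then a cache on the client's local disk, and only then contacts the server. Consistency maintenance for the disk-resident copies is ignored here, although the chapter's algorithms must provide it.

```python
class ClientStorageHierarchy:
    def __init__(self, fetch_from_server):
        self.memory_cache = {}               # fast, small, volatile
        self.disk_cache = {}                 # larger, slower, non-volatile
        self.fetch_from_server = fetch_from_server

    def read_page(self, page_id):
        if page_id in self.memory_cache:
            return self.memory_cache[page_id], "client memory"
        if page_id in self.disk_cache:
            page = self.disk_cache[page_id]
            self.memory_cache[page_id] = page      # promote to memory
            return page, "client disk"
        page = self.fetch_from_server(page_id)     # network round trip
        self.memory_cache[page_id] = page
        self.disk_cache[page_id] = page            # keep a non-volatile copy
        return page, "server"


if __name__ == "__main__":
    client = ClientStorageHierarchy(fetch_from_server=lambda pid: f"page-{pid}")
    print(client.read_page(3))   # ('page-3', 'server')
    print(client.read_page(3))   # ('page-3', 'client memory')
```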

8. Towards a Flexible Distributed DBMS Architecture

Abstract
The work presented in the previous chapters has focused on the traditional page server architecture (as described in Section 2.2). This chapter briefly summarizes more recent work on extending the page server approach in order to provide efficient support for a wider range of application domains. Three types of extensions are discussed, all of which provide a more flexible architecture on which to build object database systems in a distributed, workstation/server environment. These extensions are: 1) support for multiple data granularities, 2) a peer-to-peer architecture, and 3) the integration of data-shipping and query-shipping.
Michael J. Franklin
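As a rough, hypothetical illustration of the third extension, the sketch below shows a client that chooses per request between query shipping (the server evaluates the query and returns results) and data shipping (the client fetches or reuses cached pages and evaluates locally). All names are illustrative and not drawn from the book.

```python
class ToyServer:
    def __init__(self, pages):
        self.pages = pages   # page_id -> list of rows

    def run_query(self, text):
        # Query shipping: the server evaluates the query itself.
        return f"server result for: {text}"

    def fetch_page(self, pid):
        return self.pages[pid]


class HybridClient:
    def __init__(self, server):
        self.server = server
        self.page_cache = {}

    def run(self, request):
        if request["kind"] == "query":
            return self.server.run_query(request["text"])
        # Data shipping: fetch (or reuse cached) pages and evaluate locally.
        pages = []
        for pid in request["page_ids"]:
            if pid not in self.page_cache:
                self.page_cache[pid] = self.server.fetch_page(pid)
            pages.append(self.page_cache[pid])
        return [row for page in pages for row in page]


if __name__ == "__main__":
    client = HybridClient(ToyServer({1: ["r1", "r2"], 2: ["r3"]}))
    print(client.run({"kind": "query", "text": "SELECT * FROM t"}))
    print(client.run({"kind": "data", "page_ids": [1, 2]}))   # ['r1', 'r2', 'r3']
```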

9. Conclusions

Abstract
The confluence of two trends has raised a new set of performance opportunities and challenges for the design of workstation-based database systems. First, the rapid improvement in the price/performance characteristics of workstations, servers, and local-area networks has enabled the migration of sophisticated database function from machine rooms to desktops. As a result, networks of high-performance workstations and servers have become an important target environment for the current generation of commercial and prototype database systems. At the same time, the demands of non-traditional application environments have resulted in the development of a new class of object-oriented database systems. This book has investigated several of the fundamental architectural and algorithmic issues that must be understood before high performance, scalable object-oriented database systems can be constructed in a client-server environment.
Michael J. Franklin

Backmatter

Further Information