nach oben

2018 | Buch

Kapitel lesen Erstes Kapitel lesen

Fault-Tolerant Message-Passing Distributed Systems

An Algorithmic Approach

verfasst von: Prof. Michel Raynal

Verlag: Springer International Publishing

Enthalten in: Springer Professional "Wirtschaft+Technik" , Springer Professional "Technik" , Springer Professional "Wirtschaft"

Einloggen, um Zugang zu erhalten

Über dieses Buch

This book presents the most important fault-tolerant distributed programming abstractions and their associated distributed algorithms, in particular in terms of reliable communication and agreement, which lie at the heart of nearly all distributed applications. These programming abstractions, distributed objects or services, allow software designers and programmers to cope with asynchrony and the most important types of failures such as process crashes, message losses, and malicious behaviors of computing entities, widely known under the term "Byzantine fault-tolerance". The author introduces these notions in an incremental manner, starting from a clear specification, followed by algorithms which are first described intuitively and then proved correct.

The book also presents impossibility results in classic distributed computing models, along with strategies, mainly failure detectors and randomization, that allow us to enrich these models. In this sense, the book constitutes an introduction to the science of distributed computing, with applications in all domains of distributed systems, such as cloud computing and blockchains. Each chapter comes with exercises and bibliographic notes to help the reader approach, understand, and master the fascinating field of fault-tolerant distributed computing.

Inhaltsverzeichnis

Frontmatter

Introductory Chapter

Frontmatter

Chapter 1. A Few Definitions and Two Introductory Examples

Abstract

This chapter introduces basic definitions and basic computing models associated with fault-tolerant message-passing distributed systems. It also presents two simple distributed computing problems, whose aim is to give a first intuition of what can be done and what cannot be done in message-passing systems prone to failures. Consequently, this chapter must be considered as an introductory warm-up chapter.

Michel Raynal

The Reliable Broadcast Communication Abstraction

Frontmatter

Chapter 2. Reliable Broadcast in the Presence of Process Crash Failures

Abstract

This chapter focuses on the uniform reliable broadcast (URB) communication abstraction and its implementation in an asynchronous message-passing system prone to process crashes. This communication abstraction is central in the design and implementation of fault-tolerant distributed systems, as many non-trivial fault-tolerant distributed applications require communication with provable guarantees on message deliveries.

After defining the URB abstraction, the chapter presents a construction of it in an asynchronous message passing system prone to process crashes but with reliable channels (i.e., in the system model CAMP_n,t\([\emptyset]\)). The chapter then considers two properties (related to the quality of service) that can be added to URB without requiring enrichment of the system model with additional assumptions. These properties concern the message delivery order, namely “first in first out” (FIFO) message delivery and “causal order” (CO) message delivery.

Michel Raynal

Chapter 3. Reliable Broadcast in the Presence of Process Crashes and Unreliable Channels

Abstract

The previous chapter presented several constructions for the uniform reliable broadcast (URB) abstraction. These constructions considered the asynchronous underlying system model CAMP\([\emptyset]\) in which processes may crash and channels are reliable. These constructions differ in the quality of service they provide to the application processes, this quality being defined with respect to the order in which the messages are delivered (namely, FIFO or CO order). This order restricts message asynchrony.

This chapter introduces constructions of URB-broadcast suited to asynchronous systems prone to process crashes and unreliable channels, i.e., asynchronous system models weaker than CAMP_n,t\([\emptyset]\).

Michel Raynal

Chapter 4. Reliable Broadcast in the Presence of Byzantine Processes

Abstract

This chapter presents two broadcast communication abstractions suited to the asynchronous systems prone to process Byzantine failures (basic model BAMP_n,t\([\emptyset]\) appropriately enriched). The first of these broadcast abstractions is called no-duplicity broadcast, while the second one is the classic non-uniform reliable broadcast adapted to Byzantine failures. (Let us notice that, as a Byzantine process may behave arbitrarily, it is meaningless to force a correct process to deliver a message only because it was delivered by a Byzantine process.) An algorithm implementing no-duplicity broadcast, and two algorithms implementing Byzantine reliable broadcast are presented.

Michel Raynal

The Read/Write Register Communication Abstraction

Frontmatter

Chapter 5. The Read/Write Register Abstraction

Abstract

The read/write register is the most basic object of sequential computing. This chapter introduces it in a concurrency context, and considers three associated consistency conditions: regularity, atomicity (also called linearizability), and sequential consistency. Atomicity and sequential consistency define the family of strong consistency conditions, namely, they require all processes to agree on the same total order in which they see the read and write operations applied to the registers. After a formalization of these notions, the chapter shows that atomic read/write registers compose for free while sequentially consistent registers do not.

Michel Raynal

Chapter 6. Building Read/Write Registers Despite Asynchrony and Less than Half of Processes Crash (t < n/2)

Abstract

This chapter is on the construction of multi-writer multi-reader registers in asynchronous messagepassing systems prone to the crash of a minority of processes (system model CAMP_n,t[t < n/2]). It first considers atomic registers for which it adopts an incremental presentation, with three constructions, each one extending the previous one. The first one builds a single-writer multi-reader (SWMR) regular register, which is extended by the second construction to obtain a single-writer multi-reader (SWMR) atomic register. The third one consists in a simple extension of the second one to obtain a multi-writer multi-reader (MWMR) atomic register. The chapter then addresses the construction of sequentially consistent registers.

Michel Raynal

Chapter 7. Circumventing the t < n/2 Read/Write Register Impossibility: the Failure Detector Approach

Abstract

This chapter presents the failure detector class (denoted Σ) that allows us to circumvent the impossibility of building an atomic read/write register in an asynchronous message-passing system in which half or more processes may commit crash failures (system model CAMP_n,t[t ≥ n/2]). (The reader is referred to Section 3.3 for formal definitions related to failure detectors.) This chapter first introduces the class Σ, and shows how it allows us to implement an atomic register for any value of t. Then, it shows that Σ is the failure detector class that provides us with the weakest information on failures that allows an atomic read/write register to be built despite asynchrony and any number of process crashes.

Michel Raynal

Chapter 8. A Broadcast Abstraction Suited to the Family of Read/Write Implementable Objects

Abstract

Chapter 6 presented algorithms constructing atomic and sequentially consistent read/write registers in the system model CAMP_n,t[t < n/2] (which, from a t-resilience point of view, is the weakest system model in which such read/write registers can be built). All these algorithms rely directly on the unreliable macro-operation denoted broadcast(), i.e., on the send() and receive() operations, which are “machine/network” low level operations.

Michel Raynal

Chapter 9. Atomic Read/Write Registers in the Presence of Byzantine Processes

Abstract

Theorem 18 (stated and proved in Section 5.4) has shown that t < n/2 is an upper bound on the resilience parameter t to build atomic read/write registers in the asynchronous crash process model CAMP_n,t\([\emptyset]\). Section 6.3 and Section 6.4 then presented an incremental construction of Single-Writer Multi-Reader (SWMR) and Multi-Writer Multi-Reader (MW-MR) atomic registers.

Michel Raynal

Agreement in Synchronous Systems

Frontmatter

Chapter 10. Consensus and Interactive Consistency in Synchronous Systems Prone to Process Crash Failures

Abstract

This first chapter on agreement in synchronous systems focuses on the consensus and interactive consistency (also called vector consensus) agreement abstractions. It first defines these abstractions, and presents algorithms that build them in the presence of any number of process crashes in the system model CSMP_n,t\([\emptyset]\).

Michel Raynal

Chapter 11. Expediting Decision in Synchronous Systems Prone to Process Crash Failures

Abstract

The last section of the previous chapter showed that there is no synchronous round-based consensus (or interactive consistency) algorithm that can cope with t process crashes and allows the processes to always decide in less than (t + 1) rounds (i.e., whatever the failure pattern).

Michel Raynal

Chapter 12. Consensus Variants: Simultaneous Consensus and k-Set Agreement

Abstract

Considering the classic system model CSMP_n,t\([\emptyset]\), this chapter presents two “variants” of the consensus agreement abstraction. One is a strengthening of the agreement property, the other one a weakening. Hence, consensus lies in between.

Michel Raynal

Chapter 13. Non-blocking Atomic Commitment in the Presence of Process Crash Failures

Abstract

The non-blocking atomic commitment (NBAC) agreement abstraction originated in databases, and is now pervasive in many distributed applications. It is a basic distributed agreement abstraction. Let us consider a job that is split into n independent parts, each executed by a process. When each process terminated the part assigned to it, the set of processes have to agree on the fate of the full job. They have to commit it if everything went well at each of them (and then each process makes its local results permanent) or abort it if something went wrong at one or several of them (and each process then discards its result). To this end, the processes starts a non-blocking atomic commitment algorithm. If locally everything went well, a process votes yes, otherwise it votes no. The idea is that if all processes voted yes, they have to commit their local computation, and if a process voted no, they have to abort them.

Michel Raynal

Chapter 14. Consensus in Synchronous Systems Prone to Byzantine Process Failures

Abstract

This chapter addresses the interactive consistency and consensus agreement abstractions in the system model BSMP_n,t\([\emptyset]\), i.e., in synchronous systems where up to t processes can be Byzantine. Let us remember that a Byzantine process is a process that behaves in an arbitrary way.

Michel Raynal

Agreement in Asynchronous Systems

Frontmatter

Chapter 15. Implementable Agreement Abstractions Despite Asynchrony and a Minority of Process Crashes

Abstract

This chapter addresses the implementation of agreement abstractions in asynchronous systems where the processes communicate by reading and writing atomic registers. We have seen in Chap. 5 that atomic registers can be built in asynchronous message-passing systems only if t < n/2. Implementations of read/write registers in the system model CAMP_n,t[t < n/2] have been presented in Chap. 6 and Chap. 8.

Michel Raynal

Chapter 16. Consensus: Power and Implementability Limit in Crash-Prone Asynchronous Systems

Abstract

This chapter first presents the TO-broadcast communication abstraction, the state machine replication paradigm, and the ledger object, and shows that they all are computationally equivalent. It also shows that any object (abstraction) defined by a sequential specification (sequential state machine, or ledger) can be implemented in CAMP_n,t[CONS] (CAMP_n,t\([\emptyset]\) enriched with consensus). In this sense the consensus agreement abstraction is universal. It provides the computability power needed to implement any object – defined by a sequential specification – despite asynchrony and the crash of any minority of processes.

Michel Raynal

Chapter 17. Implementing Consensus in Enriched Crash-Prone Asynchronous Systems

Abstract

The previous chapter focused on the consensus agreement abstraction. It showed its universality power for implementing objects whose consensus number is greater than 1, and its implementability limit (namely, the impossibility to implement consensus in the basic system model CAMP_n,t[t < n/2]).

Michel Raynal

Chapter 18. Implementing Oracles in Asynchronous Systems Prone to Process Crash Failures

Abstract

The notion of a failure detector has been introduced in Section 3.3. Considering a communication or agreement abstraction which is impossible to solve in the basic model CAMP_n,t\([\emptyset]\), an appropriate failure detector provides the processes with additional computability power, which allows this communication or agreement abstraction to be implemented in the corresponding enriched model. Various failure detectors have been presented and used in previous chapters (in Chap. 3 to implement URBbroadcast for any value of t despite fair channels, in Chap. 7 to implement a read/write register for any value of t, and in Chap. 17 to implement consensus despite asynchrony and process crashes).

Michel Raynal

Chapter 19. Implementing Consensus in Enriched Byzantine Asynchronous Systems

Abstract

This chapter is on the implementation of the consensus abstraction in the Byzantine system model BAMP_n,t\([\emptyset]\) enriched with appropriate additional assumptions. All the algorithms it presents assume that the network is not controlled by the adversary, and are optimal with respect to the model resilience parameter t (namely, t < n/3).

Michel Raynal

Appendix

Frontmatter

Chapter 20. Quorum, Signatures, and Overlays

Abstract

While all the notions described in this appendix are not specific to distributed computing, they are briefly presented here for completeness, and allow the reader to have a better understanding of them. They concern the notion of quorums, digital signatures, and network overlays.

Michel Raynal

Backmatter

Titel: Fault-Tolerant Message-Passing Distributed Systems
verfasst von: Prof. Michel Raynal
Verlag: Springer International Publishing
Electronic ISBN: 978-3-319-94141-7
Print ISBN: 978-3-319-94140-0
DOI: https://doi.org/10.1007/978-3-319-94141-7