Skip to main content

2018 | Buch

Fault-Tolerant Message-Passing Distributed Systems

An Algorithmic Approach

insite
SUCHEN

Über dieses Buch

This book presents the most important fault-tolerant distributed programming abstractions and their associated distributed algorithms, in particular in terms of reliable communication and agreement, which lie at the heart of nearly all distributed applications. These programming abstractions, distributed objects or services, allow software designers and programmers to cope with asynchrony and the most important types of failures such as process crashes, message losses, and malicious behaviors of computing entities, widely known under the term "Byzantine fault-tolerance". The author introduces these notions in an incremental manner, starting from a clear specification, followed by algorithms which are first described intuitively and then proved correct.

The book also presents impossibility results in classic distributed computing models, along with strategies, mainly failure detectors and randomization, that allow us to enrich these models. In this sense, the book constitutes an introduction to the science of distributed computing, with applications in all domains of distributed systems, such as cloud computing and blockchains. Each chapter comes with exercises and bibliographic notes to help the reader approach, understand, and master the fascinating field of fault-tolerant distributed computing.

Inhaltsverzeichnis

Frontmatter

Introductory Chapter

Frontmatter
Chapter 1. A Few Definitions and Two Introductory Examples
Abstract
This chapter introduces basic definitions and basic computing models associated with fault-tolerant message-passing distributed systems. It also presents two simple distributed computing problems, whose aim is to give a first intuition of what can be done and what cannot be done in message-passing systems prone to failures. Consequently, this chapter must be considered as an introductory warm-up chapter.
Michel Raynal

The Reliable Broadcast Communication Abstraction

Frontmatter
Chapter 2. Reliable Broadcast in the Presence of Process Crash Failures
Abstract
This chapter focuses on the uniform reliable broadcast (URB) communication abstraction and its implementation in an asynchronous message-passing system prone to process crashes. This communication abstraction is central in the design and implementation of fault-tolerant distributed systems, as many non-trivial fault-tolerant distributed applications require communication with provable guarantees on message deliveries.
After defining the URB abstraction, the chapter presents a construction of it in an asynchronous message passing system prone to process crashes but with reliable channels (i.e., in the system model CAMPn,t\([\emptyset]\)). The chapter then considers two properties (related to the quality of service) that can be added to URB without requiring enrichment of the system model with additional assumptions. These properties concern the message delivery order, namely “first in first out” (FIFO) message delivery and “causal order” (CO) message delivery.
Michel Raynal
Chapter 3. Reliable Broadcast in the Presence of Process Crashes and Unreliable Channels
Abstract
The previous chapter presented several constructions for the uniform reliable broadcast (URB) abstraction. These constructions considered the asynchronous underlying system model CAMP\([\emptyset]\) in which processes may crash and channels are reliable. These constructions differ in the quality of service they provide to the application processes, this quality being defined with respect to the order in which the messages are delivered (namely, FIFO or CO order). This order restricts message asynchrony.
This chapter introduces constructions of URB-broadcast suited to asynchronous systems prone to process crashes and unreliable channels, i.e., asynchronous system models weaker than CAMPn,t\([\emptyset]\).
Michel Raynal
Chapter 4. Reliable Broadcast in the Presence of Byzantine Processes
Abstract
This chapter presents two broadcast communication abstractions suited to the asynchronous systems prone to process Byzantine failures (basic model BAMPn,t\([\emptyset]\) appropriately enriched). The first of these broadcast abstractions is called no-duplicity broadcast, while the second one is the classic non-uniform reliable broadcast adapted to Byzantine failures. (Let us notice that, as a Byzantine process may behave arbitrarily, it is meaningless to force a correct process to deliver a message only because it was delivered by a Byzantine process.) An algorithm implementing no-duplicity broadcast, and two algorithms implementing Byzantine reliable broadcast are presented.
Michel Raynal

The Read/Write Register Communication Abstraction

Frontmatter
Chapter 5. The Read/Write Register Abstraction
Abstract
The read/write register is the most basic object of sequential computing. This chapter introduces it in a concurrency context, and considers three associated consistency conditions: regularity, atomicity (also called linearizability), and sequential consistency. Atomicity and sequential consistency define the family of strong consistency conditions, namely, they require all processes to agree on the same total order in which they see the read and write operations applied to the registers. After a formalization of these notions, the chapter shows that atomic read/write registers compose for free while sequentially consistent registers do not.
Michel Raynal
Chapter 6. Building Read/Write Registers Despite Asynchrony and Less than Half of Processes Crash (t < n/2)
Abstract
This chapter is on the construction of multi-writer multi-reader registers in asynchronous messagepassing systems prone to the crash of a minority of processes (system model CAMPn,t[t < n/2]). It first considers atomic registers for which it adopts an incremental presentation, with three constructions, each one extending the previous one. The first one builds a single-writer multi-reader (SWMR) regular register, which is extended by the second construction to obtain a single-writer multi-reader (SWMR) atomic register. The third one consists in a simple extension of the second one to obtain a multi-writer multi-reader (MWMR) atomic register. The chapter then addresses the construction of sequentially consistent registers.
Michel Raynal
Chapter 7. Circumventing the t < n/2 Read/Write Register Impossibility: the Failure Detector Approach
Abstract
This chapter presents the failure detector class (denoted Σ) that allows us to circumvent the impossibility of building an atomic read/write register in an asynchronous message-passing system in which half or more processes may commit crash failures (system model CAMPn,t[tn/2]). (The reader is referred to Section 3.3 for formal definitions related to failure detectors.) This chapter first introduces the class Σ, and shows how it allows us to implement an atomic register for any value of t. Then, it shows that Σ is the failure detector class that provides us with the weakest information on failures that allows an atomic read/write register to be built despite asynchrony and any number of process crashes.
Michel Raynal
Chapter 8. A Broadcast Abstraction Suited to the Family of Read/Write Implementable Objects
Abstract
Chapter 6 presented algorithms constructing atomic and sequentially consistent read/write registers in the system model CAMPn,t[t < n/2] (which, from a t-resilience point of view, is the weakest system model in which such read/write registers can be built). All these algorithms rely directly on the unreliable macro-operation denoted broadcast(), i.e., on the send() and receive() operations, which are “machine/network” low level operations.
Michel Raynal
Chapter 9. Atomic Read/Write Registers in the Presence of Byzantine Processes
Abstract
Theorem 18 (stated and proved in Section 5.4) has shown that t < n/2 is an upper bound on the resilience parameter t to build atomic read/write registers in the asynchronous crash process model CAMPn,t\([\emptyset]\). Section 6.3 and Section 6.4 then presented an incremental construction of Single-Writer Multi-Reader (SWMR) and Multi-Writer Multi-Reader (MW-MR) atomic registers.
Michel Raynal

Agreement in Synchronous Systems

Frontmatter
Chapter 10. Consensus and Interactive Consistency in Synchronous Systems Prone to Process Crash Failures
Abstract
This first chapter on agreement in synchronous systems focuses on the consensus and interactive consistency (also called vector consensus) agreement abstractions. It first defines these abstractions, and presents algorithms that build them in the presence of any number of process crashes in the system model CSMPn,t\([\emptyset]\).
Michel Raynal
Chapter 11. Expediting Decision in Synchronous Systems Prone to Process Crash Failures
Abstract
The last section of the previous chapter showed that there is no synchronous round-based consensus (or interactive consistency) algorithm that can cope with t process crashes and allows the processes to always decide in less than (t + 1) rounds (i.e., whatever the failure pattern).
Michel Raynal
Chapter 12. Consensus Variants: Simultaneous Consensus and k-Set Agreement
Abstract
Considering the classic system model CSMPn,t\([\emptyset]\), this chapter presents two “variants” of the consensus agreement abstraction. One is a strengthening of the agreement property, the other one a weakening. Hence, consensus lies in between.
Michel Raynal
Chapter 13. Non-blocking Atomic Commitment in the Presence of Process Crash Failures
Abstract
The non-blocking atomic commitment (NBAC) agreement abstraction originated in databases, and is now pervasive in many distributed applications. It is a basic distributed agreement abstraction. Let us consider a job that is split into n independent parts, each executed by a process. When each process terminated the part assigned to it, the set of processes have to agree on the fate of the full job. They have to commit it if everything went well at each of them (and then each process makes its local results permanent) or abort it if something went wrong at one or several of them (and each process then discards its result). To this end, the processes starts a non-blocking atomic commitment algorithm. If locally everything went well, a process votes yes, otherwise it votes no. The idea is that if all processes voted yes, they have to commit their local computation, and if a process voted no, they have to abort them.
Michel Raynal
Chapter 14. Consensus in Synchronous Systems Prone to Byzantine Process Failures
Abstract
This chapter addresses the interactive consistency and consensus agreement abstractions in the system model BSMPn,t\([\emptyset]\), i.e., in synchronous systems where up to t processes can be Byzantine. Let us remember that a Byzantine process is a process that behaves in an arbitrary way.
Michel Raynal

Agreement in Asynchronous Systems

Frontmatter
Chapter 15. Implementable Agreement Abstractions Despite Asynchrony and a Minority of Process Crashes
Abstract
This chapter addresses the implementation of agreement abstractions in asynchronous systems where the processes communicate by reading and writing atomic registers. We have seen in Chap. 5 that atomic registers can be built in asynchronous message-passing systems only if t < n/2. Implementations of read/write registers in the system model CAMPn,t[t < n/2] have been presented in Chap. 6 and Chap. 8.
Michel Raynal
Chapter 16. Consensus: Power and Implementability Limit in Crash-Prone Asynchronous Systems
Abstract
This chapter first presents the TO-broadcast communication abstraction, the state machine replication paradigm, and the ledger object, and shows that they all are computationally equivalent. It also shows that any object (abstraction) defined by a sequential specification (sequential state machine, or ledger) can be implemented in CAMPn,t[CONS] (CAMPn,t\([\emptyset]\) enriched with consensus). In this sense the consensus agreement abstraction is universal. It provides the computability power needed to implement any object – defined by a sequential specification – despite asynchrony and the crash of any minority of processes.
Michel Raynal
Chapter 17. Implementing Consensus in Enriched Crash-Prone Asynchronous Systems
Abstract
The previous chapter focused on the consensus agreement abstraction. It showed its universality power for implementing objects whose consensus number is greater than 1, and its implementability limit (namely, the impossibility to implement consensus in the basic system model CAMPn,t[t < n/2]).
Michel Raynal
Chapter 18. Implementing Oracles in Asynchronous Systems Prone to Process Crash Failures
Abstract
The notion of a failure detector has been introduced in Section 3.3. Considering a communication or agreement abstraction which is impossible to solve in the basic model CAMPn,t\([\emptyset]\), an appropriate failure detector provides the processes with additional computability power, which allows this communication or agreement abstraction to be implemented in the corresponding enriched model. Various failure detectors have been presented and used in previous chapters (in Chap. 3 to implement URBbroadcast for any value of t despite fair channels, in Chap. 7 to implement a read/write register for any value of t, and in Chap. 17 to implement consensus despite asynchrony and process crashes).
Michel Raynal
Chapter 19. Implementing Consensus in Enriched Byzantine Asynchronous Systems
Abstract
This chapter is on the implementation of the consensus abstraction in the Byzantine system model BAMPn,t\([\emptyset]\) enriched with appropriate additional assumptions. All the algorithms it presents assume that the network is not controlled by the adversary, and are optimal with respect to the model resilience parameter t (namely, t < n/3).
Michel Raynal

Appendix

Frontmatter
Chapter 20. Quorum, Signatures, and Overlays
Abstract
While all the notions described in this appendix are not specific to distributed computing, they are briefly presented here for completeness, and allow the reader to have a better understanding of them. They concern the notion of quorums, digital signatures, and network overlays.
Michel Raynal
Backmatter
Metadaten
Titel
Fault-Tolerant Message-Passing Distributed Systems
verfasst von
Prof. Michel Raynal
Copyright-Jahr
2018
Electronic ISBN
978-3-319-94141-7
Print ISBN
978-3-319-94140-0
DOI
https://doi.org/10.1007/978-3-319-94141-7