Skip to main content

2013 | Buch

Distributed Algorithms for Message-Passing Systems

insite
SUCHEN

Über dieses Buch

Distributed computing is at the heart of many applications. It arises as soon as one has to solve a problem in terms of entities -- such as processes, peers, processors, nodes, or agents -- that individually have only a partial knowledge of the many input parameters associated with the problem. In particular each entity cooperating towards the common goal cannot have an instantaneous knowledge of the current state of the other entities. Whereas parallel computing is mainly concerned with 'efficiency', and real-time computing is mainly concerned with 'on-time computing', distributed computing is mainly concerned with 'mastering uncertainty' created by issues such as the multiplicity of control flows, asynchronous communication, unstable behaviors, mobility, and dynamicity.

While some distributed algorithms consist of a few lines only, their behavior can be difficult to understand and their properties hard to state and prove. The aim of this book is to present in a comprehensive way the basic notions, concepts, and algorithms of distributed computing when the distributed entities cooperate by sending and receiving messages on top of an asynchronous network. The book is composed of seventeen chapters structured into six parts: distributed graph algorithms, in particular what makes them different from sequential or parallel algorithms; logical time and global states, the core of the book; mutual exclusion and resource allocation; high-level communication abstractions; distributed detection of properties; and distributed shared memory. The author establishes clear objectives per chapter and the content is supported throughout with illustrative examples, summaries, exercises, and annotated bibliographies.

This book constitutes an introduction to distributed computing and is suitable for advanced undergraduate students or graduate students in computer science and computer engineering, graduate students in mathematics interested in distributed computing, and practitioners and engineers involved in the design and implementation of distributed applications. The reader should have a basic knowledge of algorithms and operating systems.

Inhaltsverzeichnis

Frontmatter

Distributed Graph Algorithms

Frontmatter
Chapter 1. Basic Definitions and Network Traversal Algorithms
Abstract
This chapter first introduces basic definitions related to distributed algorithms. Then, considering a distributed system as a graph whose vertices are the processes and whose edges are the communication channels, it presents distributed algorithms for graph traversals, namely, parallel traversal, breadth-first traversal, and depth-first traversal. It also shows how spanning trees or rings can be constructed from these distributed graph traversal algorithms. These trees and rings can, in turn, be used to easily implement broadcast and convergecast algorithms.
As the reader will see, the distributed graph traversal techniques are different from their sequential counterparts in their underlying principles, behaviors, and complexities. This come from the fact that, in a distributed context, the same type of traversal can usually be realized in distinct ways, each with its own tradeoff between its time complexity and message complexity.
Michel Raynal
Chapter 2. Distributed Graph Algorithms
Abstract
This chapter addresses three basic graph problems encountered in the context of distributed systems. These problems are (a) the computation of the shortest paths between a pair of processes where a positive length (or weight) is attached to each communication channel, (b) the coloring of the vertices (processes) of a graph in Δ+1 colors (where Δ is the maximal number of neighbors of a process, i.e., the maximal degree of a vertex when using the graph terminology), and (c) the detection of knots and cycles in a graph. As for the previous chapter devoted to graph traversal algorithms, an aim of this chapter is not only to present specific distributed graph algorithms, but also to show that their design is not always obtained from a simple extension of their sequential counterparts.
Michel Raynal
Chapter 3. An Algorithmic Framework to Compute Global Functions on a Process Graph
Abstract
This chapter is devoted to distributed graph algorithms that compute a function or a predicate whose inputs are disseminated at the processes of a network. The function (or the predicate) is global because the output at each process depends on the inputs at all the processes. It follows that the processes have to communicate in order to compute their results.
A general algorithmic framework is presented which allows global functions to be computed. This distributed framework is (a) symmetric in the sense that all processes obey the same rules of behavior, and (b) does not require the processes to exchange more information than needed. The computation of shortest distances and the determination of a cut vertex in a graph are used to illustrate the framework. The framework is then improved to allow for a reduction of the size and the number of messages that are exchanged. Finally, the chapter analyzes the particular case of regular networks (networks in which all the processes have the same number of neighbors).
Michel Raynal
Chapter 4. Leader Election Algorithms
Abstract
This chapter is on the leader election problem. Electing a leader consists for the processes of a distributed system in selecting one of them. Usually, once elected, the leader process is required to play a special role for coordination or control purposes.
Leader election is a form of symmetry breaking in a distributed system. After showing that no leader can be elected in anonymous regular networks (such as rings), this chapter presents several leader election algorithms with a special focus on non-anonymous ring networks.
Michel Raynal
Chapter 5. Mobile Objects Navigating a Network
Abstract
A mobile object is an object that, according to requests issued by user processes, travels from process to process. This chapter is on algorithms that allow a mobile object to navigate a network. It presents three distributed navigation algorithms with different properties. All these algorithms ensure both that the object remains always consistent (i.e., it is never present simultaneously at several processes), and that any process that requires the object eventually obtains it.
Michel Raynal

Logical Time and Global States in Distributed Systems

Frontmatter
Chapter 6. Nature of Distributed Computations and the Concept of a Global State
Abstract
A sequential execution can be represented by the sequence (trace) of consecutive local states it has produced, or, given its initial state, by the sequence of statements that have been executed. Hence, a question that comes naturally to mind is the following one: How do we model a distributed execution?
This chapter answers first this question. To that end, it gives basic definitions, and presents three ways to model a distributed execution, namely, a partial order on a set of events, a partial order on a set of local states, and a lattice of global states. While these three types of models are equivalent, it appears that each one is more appropriate than the others to analyze and understand specific features of distributed executions.
The chapter then focuses on a fundamental notion of distributed computing, namely, the notion of a global state. The chapter analyzes global states and presents several distributed algorithms, which compute on the fly global states of a distributed application. These algorithms are observation algorithms (they have to observe an execution without modifying its behavior). It is shown that the best that can be done is the computation of a global state in which a distributed execution has passed or could have passed. This means that no process can know if the distributed execution has passed or has not passed through the global state which is returned as a result. This noteworthy feature illustrates the relativistic nature of the observation of distributed computations. Despite this relativistic feature, the computation of such global states allows distributed computing problems to be solved (such as the detection of stable properties).
Both the terms “global state” and “snapshot” are used with the same meaning in the literature. They have to be considered as synonyms. This chapter uses the term global state.
Michel Raynal
Chapter 7. Logical Time in Asynchronous Distributed Systems
Abstract
This chapter is on the association of consistent dates with events, local states, or global states of a distributed computation. Consistency means that the dates generated by a dating system have to be in agreement with the “causality” generated by the considered distributed execution. According to the view of a distributed execution we are interested in, this causality is the causal precedence order on events (relation \(\stackrel{ev}{\longrightarrow}\)), the causal precedence order on local states (relation \(\stackrel{\sigma}{\longrightarrow}\)), or the reachability relation in the lattice of global states (relation \(\stackrel{\varSigma }{\longrightarrow}\)), all introduced in the previous chapter. In all cases, this means that the date of a “cause” has to be earlier than the date of any of its “effects”. As we consider time-free asynchronous distributed systems, these dates cannot be physical dates. (Moreover, even if processes were given access to a global physical clock, the clock granularity should be small enough to always allow for a consistent dating.)
Three types of logical time are presented, namely, scalar (or linear) time, vector time, and matrix time. Each type of time is defined, its properties are stated, and illustrations showing how to use it are presented.
Michel Raynal
Chapter 8. Asynchronous Distributed Checkpointing
Abstract
This chapter is devoted to checkpointing in asynchronous message-passing systems. It first presents the notions of local and global checkpoints and a theorem stating a necessary and sufficient condition for a set of local checkpoints to belong to the same consistent global checkpoint.
Then, the chapter considers two consistency conditions, which can be associated with a distributed computation enriched with local checkpoints (the corresponding execution is called a communication and checkpoint pattern). The first consistency condition (called z-cycle-freedom) ensures that any local checkpoint, which has been taken by a process, belongs to a consistent global checkpoint. The second consistency condition (called rollback-dependency trackability) is stronger. It states that a consistent global checkpoint can be associated on the fly with each local checkpoint (i.e., without additional communication).
The chapter discusses these consistency conditions and presents algorithms that, once superimposed on a distributed execution, ensure that the corresponding consistency condition is satisfied. It also presents a message logging algorithm suited to uncoordinated checkpointing.
Michel Raynal
Chapter 9. Simulating Synchrony on Top of Asynchronous Systems
Abstract
Synchronous distributed algorithms are easier to design and analyze than their asynchronous counterparts. Unfortunately, they do not work when executed in an asynchronous system. Hence, the idea to simulate synchronous systems on top of an asynchronous one. Such a simulation algorithm is called a synchronizer. First, this chapter presents several synchronizers in the context of fully asynchronous systems. It is important to notice that, as the underlying system is asynchronous, the synchronous algorithms simulated on top of it cannot consider physical time as a programming object they could use (e.g., to measure physical duration). The only notion of time they can manipulate is a logical time associated with the concept of a round. Then, the chapter presents synchronizers suited to partially synchronous systems. Partial synchrony means here that message delays are bounded but the clocks of the processes (processors) are not synchronized (some private local area networks have such characteristics).
Michel Raynal

Mutual Exclusion and Resource Allocation

Frontmatter
Chapter 10. Permission-Based Mutual Exclusion Algorithms
Abstract
This chapter is on one of the most important synchronization problems, namely mutual exclusion. This problem (whose name is usually shortened to “mutex”) consists of ensuring that at most one process at a time is allowed to access some resource (which can be a physical or a virtual resource).
After having defined the problem, the chapter presents two approaches which allow us to solve it. Both are based on permissions given by processes to other processes. The algorithms of the first approach are based on individual permissions, while the algorithms of the second approach are based on arbiter permissions (arbiter-based algorithms are also called quorum-based algorithms).
Michel Raynal
Chapter 11. Distributed Resource Allocation
Abstract
This chapter is on resource allocation in distributed systems. It first considers the case where there are M instances of the same resource, and a process may request several instances of it. The corresponding resource allocation problem is called k-out-of-M problem (where k, 1≤kM, stands for the—dynamically defined—number of instances requested by a process). Then, the chapter addresses the case where there are several resources, each with a single or several instances.
The multiplicity of resources may generate deadlocks if resources are arbitrarily allocated to processes. Hence, the chapter visits deadlock prevention techniques suited to resource allocation. It also introduces the notion of a conflict graph among processes. Such a graph is a conceptual tool, which captures the possible conflicts among processes, when each resource can be accessed by a subset of processes. Finally, the chapter considers two distinct cases, according to the fact that the subset of resources required by a process is always the same, or may vary from one resource session to another one.
Michel Raynal

High-Level Communication Abstractions

Frontmatter
Chapter 12. Order Constraints on Message Delivery
Abstract
High-level communication abstractions offer communication operations which ensure order properties on message delivery. The simplest (and best known) order property is the first in first out (FIFO) property, which ensures that, on each channel, the messages are received in their sending order. Another order property on message delivery is captured by the total order broadcast abstraction, which was presented in Sect. 7.​1.​4. This communication abstraction ensures that all the messages are delivered in the same order at each process, and this order complies with their causal sending order.
This chapter focuses first on causal message delivery. It defines the corresponding message delivery property and presents several algorithms that implement it, both for point-to-point communication and broadcast communication. Then the chapter presents new algorithms that implement the total order broadcast abstraction. Finally, the chapter plays with a channel by considering four order properties which can be associated with each channel taken individually.
When discussing a communication abstraction, it is assumed that all the messages sent at the application level are sent with the communication operation provided by this communication abstraction. Hence, there is no hidden relation on messages that will be unknown by the algorithms implementing these abstractions.
Michel Raynal
Chapter 13. Rendezvous (Synchronous) Communication
Abstract
While the previous chapter was devoted to communication abstractions on message ordering, this chapter is on synchronous communication (also called logically instantaneous communication, or rendezvous, or interaction). This abstraction adds synchronization to communication. More precisely, it requires that, for a message to be sent by a process, the receiver has to be ready to receive it. From an external observer point view, the message transmission looks instantaneous: The sending and the reception of a message appear as a single event (and the sense of the communication could have been in the other direction). From an operational point of view, we have the following: For each pair of processes, the first process that wants to communicate—be it the sender or the receiver—has to wait until the other process is ready to communicate.
This chapter first defines synchronous communication and introduces a characterization based on a specific message pattern called a crown. It then presents several implementations of this communication abstraction, each suited to a specific context. It also describes implementations for real-time rendezvous in the context of synchronous message-passing systems. In this case, each process is required to associate a deadline with each of its rendezvous.
Michel Raynal

Detection of Properties on Distributed Executions

Frontmatter
Chapter 14. Distributed Termination Detection
Abstract
This chapter is on the detection of the termination of a distributed computation. This problem was posed and solved for the first time in the early 1980s independently by E.W. Dijkstra and C.S. Scholten (1980) and N. Francez (1980). This is a non-trivial problem. While, in sequential computing, the termination of the only process indicates that the computation has terminated, this is no longer true in distributed computing. Even if we were able to observe simultaneously all the processes, observing all of them passive could not allow us to conclude that the distributed execution has terminated. This is because some messages can still be in transit, which will reactivate their destination processes when they arrive, and these re-activations will, in turn, entail the sending of new messages, etc.
This chapter presents several models of asynchronous computations and observation/detection algorithms suited to termination detection in each of them. As in other chapters, the underlying channels are not required to be FIFO. Moreover, while channels are bidirectional, the term “output” channels (resp., “input” channels) is used when considering message send (resp., message reception).
Michel Raynal
Chapter 15. Distributed Deadlock Detection
Abstract
This chapter addresses the deadlock detection problem. After having introduced the AND deadlock model and the OR deadlock model, it presents distributed algorithms that detect their occurrence. Let us recall that the property “there is a deadlock” is a stable property (once deadlocked, a set of processes remain deadlocked until an external agent—the underlying system—resolves it). Hence, as seen in Sect. 6.​5, algorithms computing global states of a computation can be used to detect deadlocks. Differently, the algorithms presented in this chapter are specific to deadlock detection. For simplicity, they all assume FIFO channels.
Michel Raynal

Distributed Shared Memory

Frontmatter
Chapter 16. Atomic Consistency (Linearizability)
Abstract
This chapter is on the strongest consistency condition for concurrent objects. This condition is called atomicity when considering shared registers, and linearizability when considering more sophisticated types of objects. In the following, these two terms are considered as synonyms.
The chapter first introduces the notion of a distributed shared memory. It then defines formally the atomicity concept, and presents its main composability property, and several implementations on top of a message-passing system.
Michel Raynal
Chapter 17. Sequential Consistency
Abstract
This chapter is on sequential consistency, a consistency condition for distributed shared memory, which is weaker than atomicity (linearizability). After having defined sequential consistency, this chapter shows that it is not a local property. Then, it presents two theorems which are of fundamental importance when one has to implement sequential consistency on top of asynchronous message-passing systems. Finally, the chapter presents and proves correct several distributed algorithms that implement sequential consistency. Sequential consistency was introduced by L. Lamport (1979).
Michel Raynal
Backmatter
Metadaten
Titel
Distributed Algorithms for Message-Passing Systems
verfasst von
Michel Raynal
Copyright-Jahr
2013
Verlag
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-38123-2
Print ISBN
978-3-642-38122-5
DOI
https://doi.org/10.1007/978-3-642-38123-2