The 2009 Edsger W. Dijkstra Prize in Distributed Computing

The Edsger W. Dijkstra Prize in Distributed Computing is awarded for an outstanding paper on the principles of distributed computing, whose significance and impact on the theory and/or practice of distributed computing has been evident for at least a decade.

Lorenzo Alvisi Chair, Rachid Guerraoui, Prasad Jayanti, Idit Keidar, Shay Kutten, Jennifer Welch

Michel Raynal and Shmuel Zaks 60th Birthday Symposium

Computing, Observing, Controlling, Checkpointing: Symbiosis Is Even Better Than Agreement!

This talk is a tribute to Michel Raynal, and more precisely to the nice work I had the great pleasure to share with him during twenty years of close collaboration (1985-2004).

Why this title “

Symbiosis is better than Agreement

”? Despite the famous FLP (1985) impossibility result, Michel and I often succeeded to reach agreement. Perhaps none of us was faulty? Perhaps our context was not as asynchronous as it could appear? Well, I don’t believe so! The deep reason lies rather in the symbiosis that prevailed between Michel (a tree) and me (a mushroom). This symbiosis not only allowed us to reach agreement, but, more interestingly, to obtain important and fundamental results in several fields of distributed computing. From “old” problems or paradigms - e.g. Network traversal, Detection of stable properties, Election of a leader,Mutual exclusion, Distributed evaluation - to advances in new ones - e.g., related to fault tolerance such as Checkpointing -, we have always strived to bring out design principles and to obtain generic solutions.

Jean-Michel Hélary

What Agreement Problems Owe Michel

Agreement problems are at the heart of the design of dependable and reliable distributed services. Distributed systems that run such services may experience unpredictable processing and communication delays, and some of their components can fail in various ways. It has been proved that in such settings, the consensus problem, the most popular and fundamental of the agreement problems has no deterministic solution.

Therefore, researchers started investigating ways of circumventing the impossibility result. Two main directions were explored: relaxing the requirements of the consensus problem, and strengthening the assumptions on the system.At least two ways of relaxing the consensus requirements have been investigated: randomization (termination is achieved only with high probability) and approximate agreement. Also, at least two ways of strengthening the assumptions on the system have been considered: adding synchrony assumptions to the system and abstracting the details of how a processor suspects a failure has occurred, without referring to particular synchrony assumptions by the mean of the

Unreliable Failure Detectors

that provides processes with a list of processes suspected to have crashed.

Achour Mostefaoui

Shmuel Zaks - The Early Years: A Combinatorialist in Distributed Computing

Celebrating Shmuels Zaks’ 60th birthday and his remarkable career, the focus of this talk is on his early contributions to Distributed Computing. In particular, in this talk I examine how this young combinatorialist/graph theorist, upon discovering the beauty and fun of distributed algorithms, was so captured by the area that he never left it. In these early explorations, his research contributions have been many, some very important (e.g. lower bound for election in complete graphs) and some very beautiful (e.g. guessing games in synchronous networks). In this talk, a few of these research results are described and commented, and some of his other contributions to the Distributed Computing community during those years are highlighted.

Nicola Santoro

Shmuel Zaks - The Mathematician, Computer Scientist and Personality

Shmuel Zaks received his BSc (cum laude) and MSc degrees in Mathematics from the Technion, Haifa, Israel, in 1971 and 1972, respectively, and his PhD degree in Computer Science from the University of Illinois at Urbana-Champaign in 1979. He is a full professor at the Department of Computer Science at the Technion, where he has been since 1979. He is an author over 100 journal and conference papers, which span his research interests, including Distributed Computing, ATM networks, Optical Networks,Graph and Combinatorial Algorithms, and Discrete Mathematics.

I had the honor to be supervised by Shmuel Zaks (and jointly of Shlomo Moran) during my MSc studies in years 1982-1985. At that period the main research interest of Shmuel was Distributed Algorithms, and my MSc thesis was on this subject. His numerous contributions to this field are subject of another talk.

In the first half of the 1990’s his major contributions was in the field of ATM networks. In part of this talk I will describe a beautiful result from [CGZ96].

He has numerous contributions in Optical Networks. I had the opportunity to collaborate with him in part of these works, which are mostly approximation algorithms to NP-hard optimization problems. In my talk will describe one of these results ([FMSZ08]).

Shmuel is a father of 4 children and grandfather of 4 grandchildren. It is impossible to talk about him without mentioning that he is an exceptional family man and enjoys helping people at every possible occasion and in every possible way.

Mordechai Shalom

Award Nominees (Session 2B)

The Disagreement Power of an Adversary

At the heart of distributed computing lies the fundamental result that the level of agreement that can be obtained in an asynchronous shared memory model where

t

processes can crash is exactly

t

+ 1. In other words, an adversary that can crash any subset of size at most

t

can prevent the processes from agreeing on

t

values. But what about the remaining (

$2^{2^n} -n$

) adversaries that might crash certain combination of processes and not others?

This paper presents a precise way to characterize such adversaries by introducing the notion of

disagreement power

: the biggest integer

k

for which the adversary can prevent processes from agreeing on

k

values. We show how to compute the disagreement power of an adversary and how this notion enables to derive

n

equivalence

classes of adversaries.

Carole Delporte-Gallet, Hugues Fauconnier, Rachid Guerraoui, Andreas Tielmann

New Bounds for the Controller Problem

(Extended Abstract)

The (

M

,

W

)

-controller

, originally studied by Afek, Awerbuch, Plotkin, and Saks, is a basic distributed tool that provides an abstraction for managing the consumption of a global resource in a distributed dynamic network. The input to the controller arrives online in the form of

requests

presented at arbitrary nodes. A request presented at node

u

corresponds to the “desire” of some entity to consume one unit of the global resource at

u

and the controller should handle this request within finite time by either

granting

it with a

permit

or

denying

it. Initially,

M

permits (corresponding to

M

units of the global resource) are stored at a designated

root

node. Throughout the execution permits can be transported from place to place along the network’s links so that they can be granted to requests presented at various nodes; when a permit is granted to some request, it is eliminated from the network. The fundamental rule of an (

M

,

W

)-controller is that a request should not be denied unless it is certain that at least

M

−

W

permits are eventually granted. The most efficient (

M

,

W

)-controller known to date has message complexity

$ O (N \log^{2} N \log \frac{M}{W + 1}) $

, where

N

is the number of nodes that ever existed in the network (the dynamic network may undergo node insertions and deletions).

In this paper we establish two new lower bounds on the message complexity of the controller problem. We first prove a simple lower bound stating that any (

M

,

W

)-controller must send

$ {\it \Omega} (N \log \frac{M}{W + 1}) $

messages. Second, for the important case when

W

is proportional to

M

(this is the common case in most applications), we use a surprising reduction from the (centralized)

monotonic labeling problem

to show that any (

M

,

W

)-controller must send

$ {\it \Omega} (N \log N) $

messages. In fact, under a long lasting conjecture regarding the complexity of the monotonic labeling problem, this lower bound is improved to a tight

$ {\it \Omega} (N \log^{2} N) $

. The proof of this lower bound requires that

N

=

O

(

M

) which turns out to be somewhat inevitable due to a new construction of an (

M

,

M

/ 2) -controller with message complexity

O

(

N

log

2

M

) .

Yuval Emek, Amos Korman

On Set Consensus Numbers

We propose a complete characterization of a large class of distributed tasks, with respect to a weakened solvability notion called

weak termination

. A task is weak-termination solvable if there is an algorithm by which at least one process outputs.

The proposed categorization of tasks is based on the weakest failure detectors needed to solve them. We show that every task

$\mathcal{T}$

in the considered class is equivalent (in the failure detector sense) to some form of set agreement, and thus its solvability with weak termination is completely characterized by its

set consensus number

: the maximal integer

k

such that

$\mathcal{T}$

can be (weak-termination) solved using read-write registers and

k

-set agreement objects.

The characterization goes through showing that

$\neg{\it \Omega}_k$

, recently shown to be the weakest failure detector for the task of

k

-set agreement, is necessary to solve

any

task that is

k

-resilient impossible.

Eli Gafni, Petr Kuznetsov

The Abstract MAC Layer

A diversity of possible communication assumptions complicates the study of algorithms and lower bounds for radio networks. We address this problem by defining an Abstract MAC Layer. This service provides reliable local broadcast communication, with timing guarantees stated in terms of a collection of abstract

delay functions

applied to the relevant contention. Algorithm designers can analyze their algorithms in terms of these functions, independently of specific channel behavior. Concrete implementations of the Abstract MAC Layer over basic radio network models generate concrete definitions for these delay functions, automatically adapting bounds proven for the abstract service to bounds for the specific radio network under consideration. To illustrate this approach, we use the Abstract MAC Layer to study the new problem of

Multi-Message Broadcast

, a generalization of standard single-message broadcast, in which any number of messages arrive at any processes at any times. We present and analyze two algorithms for Multi-Message Broadcast in static networks: a simple greedy algorithm and one that uses regional leaders. We then indicate how these results can be extended to mobile networks.

Fabian Kuhn, Nancy Lynch, Calvin Newport

Randomization Can Be a Healer: Consensus with Dynamic Omission Failures

Wireless ad-hoc networks are being increasingly used in diverse contexts, ranging from casual meetings to disaster recovery operations. A promising approach is to model these networks as distributed systems prone to dynamic communication failures. This captures transitory disconnections in communication due to phenomena like interference and collisions, and permits an efficient use of the wireless broadcasting medium. This model, however, is bound by the impossibility result of Santoro and Widmayer, which states that, even with strong synchrony assumptions, there is no deterministic solution to any non-trivial form of agreement if

n

− 1 or more messages can be lost per communication round in a system with

n

processes. In this paper we propose a novel way to circumvent this impossibility result by employing randomization. We present a consensus protocol that ensures safety in the presence of an unrestricted number of omission faults, and guarantees progress in rounds where such faults are bounded by

$f \leq \lceil \frac{n}{2} \rceil (n-k)+k-2$

, where

k

is the number of processes required to decide, eventually assuring termination with probability 1.

Henrique Moniz, Nuno Ferreira Neves, Miguel Correia, Paulo Veríssimo

Transactional Memory (Session 1A)

Interrupting Snapshots and the Java $^{\mbox{\tiny TM}}$ Size() Method

The Java

$^{\mbox{\tiny TM}}$

developers kit requires a

size()

operation for all objects. Unfortunately, the best known solution, available in the Java concurrency package, has a blocking concurrent implementation that does not scale. This paper presents a highly scalable wait-free implementation of a concurrent

size()

operation based on a new lock-free

interrupting snapshots

algorithm for the classical atomic snapshot problem. This is perhaps the first example of the potential benefit from using atomic snapshots in real industrial code (the concurrency package is currently deployed on over 10 million desktops).

The key idea behind the new algorithm is to allow snapshot scans to interrupt each other until they agree on a shared linearization point with respect to updates, rather than trying, as was done in the past, to have them coordinate the collecting of a shared global view. As we show, the new algorithm scales well, significantly outperforming existing implementations.

Yehuda Afek, Nir Shavit, Moran Tzafrir

Elastic Transactions

This paper presents

elastic transactions

, a variant of the transactional model. Upon conflict detection, an elastic transaction might drop what it did so far within a separate transaction that immediately commits, and initiate a new transaction which might itself be elastic. Elastic transactions are a complementary alternative to traditional transactions, particularly appealing when implementing search structures. Elastic transactions can be safely composed with normal ones, but significantly improve performance if used instead.

Pascal Felber, Vincent Gramoli, Rachid Guerraoui

Brief Announcement: Transactional Scheduling for Read-Dominated Workloads

A promising approach to programming concurrent applications is provided by

transactional synchronization

: a

transaction

aggregates a sequence of resource accesses that should be executed atomically by a single thread. A transaction ends either by

committing

, in which case, all of its updates take effect, or by

aborting

, in which case, no update is effective.

The transactional approach to contention management [6,8] guarantees consistency by making sure that whenever there is a conflict, one of the transactions involved is aborted. When aborted, a transaction is later

restarted

from its beginning. Two overlapping transactions

T

1

and

T

2

conflict, if

T

1

reads a resource

X

and

T

2

executes a writing access to

X

while

T

1

is still pending, or

T

1

executed a writing access to

X

and

T

2

accesses

X

while

T

1

is still pending. Note that a conflict does not mean that consistency is violated, for example, two overlapping transactions [

read

(

X

),

write

(

Y

)] and [

write

(

X

),

read

(

Z

)] can be serialized, despite having a conflict.

Hagit Attiya, Alessia Milani

Shared Memory (Session 1B)

Tight Group Renaming on Groups of Size g Is Equivalent to g-Consensus

We address two problems, the

g

-tight group renaming task and what we call,

safe

-consensus task, and show the relations between them. We show that any

g

-tight group renaming task, the first problem, implements

g

processes consensus. We show this by introducing an intermediate task, the

safe

-consensus task, the second problem, and showing that

g

-tight group renaming implements

g

-safe-consensus and that the latter implements

g

-consensus. It is known that with

g

-consensus

g

-tight group renaming is solvable, making the two problems equivalent.

The

safe

-consensus task, is of independent interest. In it the validity condition of consensus is weakened as follows: if the first processor to invoke the task returns before any other processor invokes, i.e., it runs in solo, then it outputs its input; Otherwise the consensus output can be arbitrary, not even the input of any process. We show the equivalence between safe-(set-)consensus and (set-)consensus.

Yehuda Afek, Eli Gafni, Opher Lieber

The RedBlue Adaptive Universal Constructions

We present the family of

RedBlue

algorithms, a collection of universal wait-free constructions for linearizable shared objects in an asynchronous shared-memory distributed system with

n

processes. The algorithms are adaptive and improve upon previous algorithms in terms of their time and/or space complexity.

The first of the algorithms achieves better time complexity than all previously presented algorithms but it is impractical since it uses large

LL

/

SC

registers. This algorithm comprises the keystone for the design of the other

RedBlue

algorithms which are of practical interest. The second algorithm significantly reduces the size of the required registers and it is therefore practical in many cases. The last two algorithms work efficiently for large objects improving previous universal constructions for large objects presented by Anderson and Moir (PODC 1995).

Panagiota Fatourou, Nikolaos D. Kallimanis

Help When Needed, But No More: Efficient Read/Write Partial Snapshot

An atomic snapshot object is an object that can be concurrently accessed by asynchronous processes prone to crash. It is made of

m

components (base atomic registers) and is defined by two operations: an update operation that allows a process to atomically assign a new value to a component and a snapshot operation that atomically reads and returns the values of all the components. To cope with the net effect of concurrency, asynchrony and failures, the algorithm implementing the update operation has to help concurrent snapshot operations so that they always terminate.

This paper is on

partial snapshot

objects. Such an object provides a snapshot operation that can take any subset of the components as input parameter, and atomically reads and returns the values of this subset of components. The paper has two contributions. The first is the introduction of two properties for partial snapshot object algorithms, called

help-locality

and

freshness

. Help-locality requires that an update operation helps only the concurrent partial snapshot operations that read the component it writes. When an update of a component

r

helps a partial snapshot, freshness requires that the update provides the partial snapshot with a value of the component

r

that is at least as recent as the value it writes into that component. (No snapshot algorithm proposed so far satisfies these properties.) The second contribution consists of an update and a partial snapshot algorithms that are wait-free, linearizable and satisfy the previous efficiency properties. Interestingly, the principle that underlies the proposed algorithms is different from the one used so far, namely, it is based on the “write first, and help later” strategy. An improvement of the previous algorithms is also presented. Based on LL/SC atomic registers (instead of read/write registers) this improvement decreases the number of base registers from

O

(

n

2

) to

O

(

n

). This shows an interesting tradeoff relating the synchronization power of the base operations and the number of base atomic registers when using the “write first, and help later” strategy.

Damien Imbs, Michel Raynal

Contention-Sensitive Data Structures and Algorithms

A contention-sensitive data structure is a concurrent data structure in which the overhead introduced by locking is eliminated in the common cases, when there is no contention, or when processes with non-interfering operations access it concurrently. When a process invokes an operation on a contention-sensitive data structure, in the absence of contention or interference, the process must be able to complete its operation in a small number of steps and without using locks. Using locks is permitted only when there is interference. We formally define the notion of contention-sensitive data structures, propose four general transformations that facilitate devising such data structures, and illustrate the benefits of the approach by implementing a contention-sensitive consensus algorithm, a contention-sensitive double-ended queue data structure, and a contention-sensitive election algorithm. Finally, we generalize the result to enable to avoid locking also when contention is low.

Gadi Taubenfeld

Brief Announcement: Acceleration by Contention for Shared Memory Mutual Exclusion Algorithms

This paper is exploring a possibility of designing distributed algorithms accelerated by high contention. We propose a mutual exclusion algorithm with such a property for asynchronous read/write shared memory systems with

N

processes. In a mutual exclusion algorithm, each process executes its

entry

and

exit

sections to enter its

critical

section, where

mutual exclusion:

at most one process executes its critical section at any time, and

starvation freedom:

each process that executes its entry section eventually executes its critical section, are required.

We propose an efficient mutual exclusion algorithm with respect to remote memory reference (RMR) complexity. Yang et al. [1] proposed an algorithm with the worst case RMR complexity of

O

(log

N

) and Attiya et al. [2] proved the lower bound of

${\it \Omega}(\log N)$

. Though our algorithm has the worst case RMR complexity of

O

(log

N

), it becomes efficient with increasing the number of processes executing concurrently. We show the efficiency using queuing theory and simulation.

Michiko Inoue, Tsuyoshi Suzuki, Hideo Fujiwara

Brief Announcement: Incremental Component-Based Modeling, Verification, and Performance Evaluation of Distributed Reset

Design and implementation of distributed algorithms often involve many subtleties due to their complex structure, nondeterminism, and low atomicity as well as occurrence of unanticipated physical events such as faults. Thus, constructing correct distributed systems has always been a challenge and often subjects to serious errors. This is essentially due to the fact that we currently lack disciplined methods for the rigorous design and correct implementation of distributed systems, mainly for two reasons: (1) formal methods are not easy to use by designers and developers; and (2) there is a wide gap between modeling formalisms and automated verification tools on one side, and practical development and deployment tools on the other side.

Ananda Basu, Borzoo Bonakdarpour, Marius Bozga, Joseph Sifakis

Distributed and Local Graph Algorithms (Session 1C)

Local Computation of Nearly Additive Spanners

An (

α

,

β

)-spanner of a graph

G

is a subgraph

H

that approximates distances in

G

within a multiplicative factor

α

and an additive error

β

, ensuring that for any two nodes

u

,

v

,

d

H

(

u

,

v

) ≤

α

·

d

G

(

u

,

v

) +

β

. This paper concerns algorithms for the distributed deterministic construction of a sparse (

α

,

β

)-spanner

H

for a given graph

G

and distortion parameters

α

and

β

. It first presents a generic distributed algorithm that in constant number of rounds constructs, for every

n

-node graph and integer

k

≥ 1, an (

α

,

β

)-spanner of

O

(

βn

1 + 1/

k

) edges, where

α

and

β

are constants depending on

k

. For suitable parameters, this algorithm provides a (2

k

− 1,0)-spanner of at most

k

n

1 + 1/

k

edges in

k

rounds, matching the performances of the best known distributed algorithm by Derbel et al. (PODC ’08). For

k

= 2 and constant

ε

> 0, it can also produce a (1 +

ε

,2 −

ε

)-spanner of

O

(

n

3/2

) edges in constant time. More interestingly, for every integer

k

> 1, it can construct in constant time a (1 +

ε

,

O

(1/

ε

)

k

− 2

)-spanner of

O

(

ε

−

k

+ 1

n

1 + 1/

k

) edges. Such deterministic construction was not previously known. The paper also presents a second generic deterministic and distributed algorithm based on the construction of small dominating sets and maximal independent sets. After computing such sets in sub-polynomial time, it constructs at its best a (1 +

ε

,

β

)-spanner with

O

(

βn

1 + 1/

k

) edges, where

β

=

k

log(log

k

/

ε

) +

O

(1)

. For

k

= 3, it provides a (1 +

ε

, 6 −

ε

)-spanner with

O

(

ε

− 1

n

4/3

) edges. The additive terms

β

=

β

(

k

,

ε

) in the stretch of our constructions yield the best trade-off currently known between

k

and

ε

, due to Elkin and Peleg (STOC ’01). Our distributed algorithms are rather short, and can be viewed as a unification and simplification of previous constructions.

Bilel Derbel, Cyril Gavoille, David Peleg, Laurent Viennot

A Local 2-Approximation Algorithm for the Vertex Cover Problem

We present a distributed 2-approximation algorithm for the minimum vertex cover problem. The algorithm is deterministic, and it runs in

$({\it \Delta}+1)^2$

synchronous communication rounds, where

${\it \Delta}$

is the maximum degree of the graph. For

${\it \Delta}=3$

, we give a 2-approximation algorithm also for the weighted version of the problem.

Matti Åstrand, Patrik Floréen, Valentin Polishchuk, Joel Rybicki, Jukka Suomela, Jara Uitto

Distributed Discovery of Large Near-Cliques

Given an undirected graph and 0 ≤

ε

≤ 1, a set of nodes is called

ε

-near clique if all but an

ε

fraction of the pairs of nodes in the set have a link between them. In this paper we present a fast synchronous network algorithm that uses small messages and finds a near-clique. Specifically, we present a constant-time algorithm that finds, with constant probability of success, a linear size

ε

-near clique if there exists an

ε

3

-near clique of linear size in the graph. The algorithm uses messages of

O

(log

n

) bits. The failure probability can be reduced to

$n^{-{\it \Omega}(1)}$

in

O

(log

n

) time factor, and the algorithm also works if the graph contains a clique of size

${\it \Omega}(n/\log^{\alpha}\log n)$

for some

α

∈ (0,1). Our approach is based on a new idea of adapting property testing algorithms to the distributed setting.

Zvika Brakerski, Boaz Patt-Shamir

Distributed Fractional Packing and Maximum Weighted b-Matching via Tail-Recursive Duality

We present efficient distributed

δ

-approximation algorithms for

fractional packing

and

maximum weighted

b-

matching

in hypergraphs, where

δ

is the maximum number of packing constraints in which a variable appears (for

maximum weighted

b-

matching

δ

is the maximum edge degree — for graphs

δ

= 2). (a) For

δ

= 2 the algorithm runs in

O

(log

m

) rounds in expectation and with high probability. (b) For general

δ

, the algorithm runs in

O

(log

2

m

) rounds in expectation and with high probability.

Christos Koufogiannakis, Neal E. Young

Brief Announcement: Decidable Graph Languages by Mediated Population Protocols

We work on an extension of the Population Protocol model of Angluin et al. [1] that allows edges of the communication graph,

G

, to have

states

that belong to a

constant size set

. In this extension, the so called Mediated Population Protocol model (MPP) [2,3], both

uniformity

and

anonymity

are preserved. We here study a simplified version of MPP, the Graph Decision Mediated Population Protocol model (GDM), in order to capture MPP’s ability to

decide

graph languages

. We also prove some first impossibility results both for weakly connected and possibly disconnected communication graphs.

Ioannis Chatzigiannakis, Othon Michail, Paul G. Spirakis

Brief Announcement: Towards Secured Distributed Polling in Social Networks

Social networks are growing exponentially, and one of the most celebrated examples, Facebook, currently boasts more than 250 million users. A particularly important task in such networks is

polling

, such as the recent one about the terms of service of Facebook [1]. A defining characteristic of such networks is the one to one mapping between social network identities and real ones (as opposed to virtual world platforms such as SecondLife). Participants in social networks are

respectable

, that is they do care about their

reputation

: information related to a user is considered to reflect intimately on the associated

real

person. We claim that leveraging the fact that users of social networks are concerned over their reputation, we can achieve polling in a distributed manner in the presence of malicious users without the use of heavyweight cryptography.

Rachid Guerraoui, Kévin Huguenin, Anne-Marie Kermarrec, Maxime Monod

Modeling Issues (Session 1D)

What Can Be Observed Locally?

Round-Based Models for Quantum Distributed Computing

We consider the question of

locality

in distributed computing in the context of quantum information. Specifically, we focus on the round complexity of quantum distributed algorithms, with no bounds imposed on local computational power or on the bit size of messages. Linial’s

$\mathcal{LOCAL}$

model of a distributed system is augmented through two types of quantum extensions: (1) initialization of the system in a quantum entangled state, and/or (2) application of quantum communication channels. For both types of extensions, we discuss proof-of-concept examples of distributed problems whose round complexity is in fact reduced through genuinely quantum effects. Nevertheless, we show that even such quantum variants of the

$\mathcal{LOCAL}$

model have non-trivial limitations, captured by a very simple (purely probabilistic) notion which we call “physical locality” (

$\varphi\textrm{\upshape -}\mathcal{LOCAL}$

). While this is strictly weaker than the “computational locality” of the classical

$\mathcal{LOCAL}$

model, it nevertheless leads to a generic view-based analysis technique for constructing lower bounds on round complexity. It turns out that the best currently known lower time bounds for many distributed combinatorial optimization problems, such as

Maximal Independent Set

, bounds cannot be broken by applying quantum processing, in any conceivable way.

Cyril Gavoille, Adrian Kosowski, Marcin Markiewicz

At-Most-Once Semantics in Asynchronous Shared Memory

At-most-once semantics is one of the standard models for object access in decentralized systems. Accessing an object, such as altering the state of the object by means of direct access, method invocation, or remote procedure call, with at-most-once semantics guarantees that the access is not repeated more-than-once, enabling one to reason about the safety properties of the object. This paper investigates implementations of at-most-once access semantics in a model where a set of such actions is to be performed by a set of failure-prone, asynchronous shared-memory processes. We introduce a definition of the

at-most-once

problem for performing a set of

n

jobs using

m

processors and we introduce a notion of efficiency for such protocols, called

effectiveness

, used to classify algorithms. Effectiveness measures the number of jobs safely completed by an implementation, as a function of the overall number of jobs

n

, the number of participating processes

m

, and the number of process crashes

f

in the presence of an adversary. We prove a lower bound of

n

−

f

on the effectiveness of any algorithm. We then prove that this lower bound can be matched in the two process setting by presenting two algorithms that offer a tradeoff between time and space complexity. Finally, we generalize our two-process solution in the multi-process setting with a hierarchical algorithm that achieves effectiveness of

n

− log

m

·

o

(

n

), coming reasonably close, asymptotically, to the corresponding lower bound.

Sotirios Kentros, Aggelos Kiayias, Nicolas Nicolaou, Alexander A. Shvartsman

Nonblocking Algorithms and Backward Simulation

Optimistic and nonblocking concurrent algorithms are increasingly finding their way into practical use; an important example is software transactional memory implementations. Such algorithms are notoriously difficult to design and verify as correct, and we believe complete, formal, and machine-checked correctness proofs for such algorithms are critical. We have been studying the use of automated tools such as the PVS theorem proving system to model algorithms and their specifications using formalisms such as I/O automata, and using simulation proof techniques to show the algorithms implement their specifications. While it has been relatively rare in the past, optimistic and nonblocking algorithms often require a special flavour of simulation proof, known as

backward simulation

. In this paper, we present what we believe is by far the most challenging backward simulation proof achieved to date; this proof was developed and completely checked using PVS.

Simon Doherty, Mark Moir

Brief Announcement: Efficient Model Checking of Fault-Tolerant Distributed Protocols Using Symmetry Reduction

Motivation.

Fault-tolerant (FT) distributed protocols represent fundamental building blocks behind many practical systems. A rigorous design of these protocols is desired given the complexity of manual proofs. The application of

model checking

(MC) [2] for protocol verification is attractive with its full automation and rich property language. However, being an exhaustive exploration method, its scalability is limited by the number of different system states. Although

FT distributed protocols

usually display a high degree of symmetry which stems from permuting different processes, MC efforts targeting their automated verification often disregard this symmetry. Therefore, we propose to leverage the framework of symmetry reduction [6] and improve on existing applications of it. Our secondary contribution is to define a high-level description language (called FTDP) to ease the symmetry-aware specification of FT distributed protocols.

Péter Bokor, Marco Serafini, Neeraj Suri, Helmut Veith

Brief Announcement: Dynamic FTSS in Asynchronous Systems: The Case of Unison

Context.

The advent of ubiquitous large-scale distributed systems advocates that tolerance to various kinds of faults and hazards must be included from the very early design of such systems.

Self-stabilization

[1] is a versatile technique that permits forward recovery from any kind of

transient

fault, while

Fault-tolerance

[2] is traditionally used to mask the effect of a limited number of

permanent

faults. The seminal works of [3,4] define

FTSS

protocols as protocols that are both

Fault Tolerant and Self-Stabilizing

,

i.e.

able to tolerate a few crash faults as well as arbitrary initial memory corruption. In [3], some impossibility results in asynchronous systems are presented. In [4], a general transformer is presented for synchronous systems. The transformer of [4] was proved impossible to transpose to asynchronous systems in [5] due to the impossibility of tight synchronization in the FTSS context. It turns out that FTSS possibility results in fully

asynchronous

systems known to date are restricted to

static

tasks,

i.e.

tasks that require eventual convergence to some global fixed point (tasks such as naming or vertex coloring fall in this category).

Swan Dubois, Maria Gradinariu Potop-Butucaru, Sébastien Tixeuil

Game Theory (Session 2A)

Dynamics in Network Interaction Games

We study the convergence times of dynamics in games involving graphical relationships of players. Our model of local interaction games generalizes a variety of recently studied games in game theory and distributed computing. In a local interaction game each agent is a node embedded in a graph and plays the same 2-player game with each neighbor. He can choose his strategy only once and must apply his choice in each game he is involved in. This represents a fundamental model of decision making with local interaction and distributed control. Furthermore, we introduce a generalization called 2-type interaction games, in which one 2-player game is played on edges and possibly another game is played on non-edges. For the popular case with symmetric 2 ×2 games, we show that several dynamics converge in polynomial time. This includes arbitrary sequential better response dynamics, as well as concurrent dynamics resulting from a distributed protocol that does not rely on global knowledge. We supplement these results with an experimental comparison of sequential and concurrent dynamics.

Martin Hoefer, Siddharth Suri

Brief Announcement: Cloud Computing Games: Pricing Services of Large Data Centers

Organizations opt to reduce costs by contracting their day-to-day computing needs to service providers who offer large-scale data centers and cloud computing services. Like other computing commodities, data centers provide paid services that require careful pricing. Using a Stackelberg game formulation, we present a demand-based pricing model for maximizing revenue of data center providers that serve clients who aim to maximize their utilities.

Ashraf Al Daoud, Sachin Agarwal, Tansu Alpcan

Failure Detectors (Session 2C)

On the Existence of Weakest Failure Detectors for Mutual Exclusion and k-Exclusion

(Extended Abstract)

Research over the past two decades has identified the weakest failure detectors for several important problems in fault-tolerant distributed computing. A recent work has shown that, for a certain definition of the term “problem,” every problem that is solvable using failure detectors has a weakest failure detector. In sharp contrast to these results, we prove that a fundamental problem in concurrent computing—FCFS Mutual Exclusion—is solvable using failure detectors, but has no weakest failure detector in the shared memory model. To the best of our knowledge, this is the first problem that is proved not to have a weakest failure detector. We also show that, if the FCFS requirement is dropped, the mutual exclusion problem has a weakest failure detector. In fact, we present the weakest failure detector for the more general problem of starvation-free

k

-exclusion, for any

k

.

Vibhor Bhatt, Prasad Jayanti

Crash-Quiescent Failure Detection

A distributed algorithm is

crash quiescent

if it eventually stops sending messages to crashed processes. An algorithm can be made crash quiescent by providing it with either a crash notification service or a reliable communication service. Both services can be implemented in practical environments with

failure detectors

. Therefore, crash-quiescent failure detection is fundamental to system-wide crash quiescence. We establish necessary and sufficient conditions for crash-quiescent failure detection in partially synchronous environments where a bounded, but unknown, number of consecutive messages can be arbitrarily late or lost. Without a correct majority of processes, not even the weakest oracle for fault-tolerant consensus,

$\Diamond\mathcal{W}$

, can be implemented crash quiescently. With a correct majority, however, the eventually perfect failure detector,

$\Diamond\mathcal{P}$

, is possible. Our

$\Diamond\mathcal{P}$

algorithm is correct in all runs, but improves performance via crash quiescence in any run with a correct majority. We also present a refinement of our

$\Diamond\mathcal{P}$

algorithm to mitigate the overhead of achieving crash quiescence; the resulting bit complexity per utilized link is asymptotically better than or equal to that of non-crash-quiescent counterparts.

Srikanth Sastry, Scott M. Pike, Jennifer L. Welch

The Price of Anonymity: Optimal Consensus Despite Asynchrony, Crash and Anonymity

This paper addresses the consensus problem in asynchronous systems prone to process crashes, where additionally the processes are anonymous (they cannot be distinguished one from the other: they have no name and execute the same code). To circumvent the three computational adversaries (asynchrony, failures and anonymity) each process is provided with a failure detector of a class denoted

ψ

, that gives it an upper bound on the number of processes that are currently alive (in a non-anonymous system, the classes

ψ

and

$\cal P$

-the class of perfect failure detectors- are equivalent).

The paper first presents a simple

ψ

-based consensus algorithm where the processes decide in 2

t

+ 1 asynchronous rounds (where

t

is an upper bound on the number of faulty processes). It then shows one of its main results, namely, 2

t

+ 1 is a lower bound for consensus in the anonymous systems equipped with

ψ

. The second contribution addresses early-decision. The paper presents and proves correct an early-deciding algorithm where the processes decide in min (2

f

+ 2,2

t

+ 1) asynchronous rounds (where

f

is the actual number of process failures). This leads to think that anonymity doubles the cost (wrt synchronous systems) and it is conjectured that min (2

f

+ 2,2

t

+ 1) is the corresponding lower bound.

The paper finally considers the

k

-set agreement problem in anonymous systems. It first shows that the previous

ψ

-based consensus algorithm solves the

k

-set agreement problem in

$R_t=2 \left\lfloor \frac{t}{k}\right\rfloor +1$

asynchronous rounds. Then, considering a family of failure detector classes {

ψ

ℓ

}

0 ≤ ℓ<

k

that generalizes the class

ψ

( =

ψ

0

), the paper presents an algorithm that solves the

k

-set agreement in

$R_{t,\ell}=2 \left\lfloor \frac{t}{k-\ell}\right\rfloor +1$

asynchronous rounds. This last formula relates the cost (

R

t

,ℓ

), the coordination degree of the problem (

k

), the maximum number of failures (

t

) and the the strength (ℓ) of the underlying failure detector. Finally the paper concludes by presenting problems that remain open.

François Bonnet, Michel Raynal

Brief Announcement: On Implementing Omega Efficiently in the Crash-Recovery Model

This work focuses on implementing Omega in the crash-recovery model. Previously proposed algorithms either use stable storage or have a permanent all-to-all communication pattern. We propose a more efficient algorithm which does not use stable storage, and in which eventually, among correct processes, only one keeps sending messages.

Mikel Larrea, Cristian Martín

Brief Announcement: The Minimum Failure Detector for Non-Local Tasks in Message-Passing Systems

This paper defines the basic notions of local and non-local tasks, and determines the minimum information about failures that is necessary to solve any non-local task in message-passing systems. It also introduces a natural weakening of the well-known set agreement task, and show that, in some precise sense, it is the weakest non-local task in message-passing systems.

Carole Delporte-Gallet, Hugues Fauconnier, Sam Toueg

Brief Announcement: Weak Synchrony Models and Failure Detectors for Message Passing (k-)Set Agreement

Motivation.

In recent years, the quest for weak system assumptions, which add just enough synchrony resp. failure information to purely asynchronous systems to circumvent impossibility results, has been an active research topic in distributed computing. Most work in this area has been devoted to (1) identifying weak(est) failure detectors (FDs), and (2) identifying synchrony assumptions just strong enough to implement these weak FDs.

Due to the FLP impossibility result [1], the first focus of this research has been the

consensus

problem. More recently,

k-set agreement

(termed set agreement for

k

=

n

−1) has been identified as a promising target for further exploring the solvability border in asynchronous systems. As for (1),

anti-

${\it \Omega}$

[2] was shown to be the weakest FD for set agreement in shared memory systems: Whereas

${\it \Omega}$

(the weakest FD for solving consensus [3]) outputs the id of one

correct

process infinitely often, anti-

${\it \Omega}$

outputs the id of one

correct

process only finitely often. Subsequently, a generalization called anti-

${\it \Omega}_k$

that returns

n

−

k

processes has been shown to be the weakest FD for

k

-set agreement in shared memory system [4,5]. For message passing systems, only the weakest FD for set agreement is known, namely, the “loneliness” failure detector

$\cal{L}$

[6]. Concerning (2), a class of shared memory models for implementing anti-

${\it \Omega}_k$

was presented in [7].

Martin Biely, Peter Robinson, Ulrich Schmid

From Theory to Practice (Session 3A)

Brief Announcement Zab: A Practical Totally Ordered Broadcast Protocol

At Yahoo!, we have developed a fault-tolerant coordination service called

ZooKeeper

[4] that allows large scale applications to implement coordination tasks such as leader election, status propagation, and rendezvous. ZooKeeper forgoes locks [2] and instead implements simple wait-free data objects [3] along with a consistency model that guarantees linearizable updates and FIFO order for client operations. We have found the service to be flexible with performance that meets the production demands of the Web-scale applications of Yahoo!.

The ZooKeeper service comprises

n

ZooKeeper replicas (

n

≥ 2

f

+ 1,

f

is a threshold on the number of faulty replicas). Among these replicas, there is a distinguished, elected replica: the

leader

. The remaining replicas are

followers

. Clients of the ZooKeeper service can connect and submit requests through any ZooKeeper replica. If this request reads the state of ZooKeeper, the replica serves this request locally. Otherwise, it forwards the request to the leader. The leader receives ZooKeeper requests and transforms them into

idempotent transactions

. The transformation corresponds to generating the state modifications for the given request, as with primary-backup protocols [1]. The leader then sends transactions as messages using atomic broadcast. As a leader can crash, there must be an additional leadership election protocol. To elect a leader, ZooKeeper requires at least ⌈(

n

+ 1)/2⌉ non-faulty replicas.

Flavio P. Junqueira, Benjamin C. Reed

Graph Algorithms and Routing (Session 3B)

Compact Multicast Routing

In a distributed network, a

compact multicast scheme

is a routing scheme that allows any source to send messages to any set of targets. We study the trade-off between the

space

used to store the routing table on each node and the

stretch

factor of the multicast scheme – the maximum ratio over all sets of nodes between the cost of the multicast route induced by the scheme and the cost of a steiner tree between the same set of target nodes. We obtain results in several variants of the problem:

labeled

– in which the designer can choose polylogarithmic node names,

name-independent

– in which nodes have arbitrarily chosen names,

dynamic

– an online version of the problem in which nodes dynamically join and leave the multicast service and the goal is to minimize both the cost of the multicast tree at each stage and the total cost of control messages needed to update the tree.

Ittai Abraham, Dahlia Malkhi, David Ratajczak

Compact Routing in Power-Law Graphs

We adapt the compact routing scheme by Thorup and Zwick to optimize it for power-law graphs. We analyze our adapted routing scheme based on the theory of unweighted random power-law graphs with fixed expected degree sequence by Aiello, Chung, and Lu. Our result is the first theoretical bound coupled to the parameter of the power-law graph model for a compact routing scheme. In particular, we prove that, for stretch 3, instead of routing tables with

$\tilde{O}(n^{1/2})$

bits as in the general scheme by Thorup and Zwick, expected sizes of

O

(

n

γ

log

n

) bits are sufficient, and that all the routing tables can be constructed at once in expected time

O

(

n

1 +

γ

log

n

), with

$\gamma=\frac{\tau-2}{2\tau-3}+\varepsilon$

, where

τ

∈ (2,3) is the power-law exponent and

ε

> 0. Both bounds also hold with probability at least 1 − 1/

n

(independent of

ε

). The routing scheme is a labeled scheme, requiring a stretch-5 handshaking step and using addresses and message headers with

O

(log

n

loglog

n

) bits, with probability at least 1 −

o

(1). We further demonstrate the effectiveness of our scheme by simulations on real-world graphs as well as synthetic power-law graphs. With the same techniques as for the compact routing scheme, we also adapt the approximate distance oracle by Thorup and Zwick for stretch 3 and obtain a new upper bound of expected

$\tilde{O}(n^{1+\gamma})$

for space and preprocessing.

Wei Chen, Christian Sommer, Shang-Hua Teng, Yajun Wang

Virtual Ring Routing Trends

Virtual Ring Routing (VRR) schemes were introduced in the context of wireless ad hoc networks and Internet anycast overlays. They build a network-routing layer using ideas from distributed hash table design, utilizing randomized virtual identities along a ring. This makes maintenance practical when nodes may enter or leave.

Previously, VRR was evaluated over a small wireless network and through medium-scale simulations, exhibiting remarkably good performance. In this paper, we provide a formal analysis of a family of VRR-like schemes. The analysis provides insight into a variety of issues, e.g., how well does VRR perform compared with brute force shortest paths routing? What properties of an underlying network topology make VRR work well?

Our analysis is backed by extensive simulation over a variety of topologies. Whereas previous works evaluated VRR over fairly small networks (up to 200 nodes), we are interested in scaling the simulations so as to exhibit asymptotic trends. Simulating network sizes beyond 2

20

results in a memory explosion: In some of the topologies of interest, such as a 2-dimensional plane, the total memory taken up by routing tables is

${\it \Omega}(N^{3/2})$

for an

N

-node network. We devise a simulation strategy that builds necessary information on the fly using a Luby and Rackoff pseudo-random permutation, leading to simulations at a scale of 2

32

nodes.

Dahlia Malkhi, Siddhartha Sen, Kunal Talwar, Renato F. Werneck, Udi Wieder

A New Self-stabilizing Minimum Spanning Tree Construction with Loop-Free Property

The minimum spanning tree (MST) construction is a classical problem in Distributed Computing for creating a globally minimized structure distributedly. Self-stabilization is versatile technique for forward recovery that permits to handle any kind of transient faults in a unified manner. The loop-free property provides interesting safety assurance in dynamic networks where edge-cost changes during operation of the protocol.

We present a new self-stabilizing MST protocol that improves on previous known approaches in several ways. First, it makes fewer system hypotheses as the size of the network (or an upper bound on the size) need

not

be known to the participants. Second, it is loop-free in the sense that it guarantees that a spanning tree structure is always preserved while edge costs change dynamically and the protocol adjusts to a new MST. Finally, time complexity matches the best known results, while space complexity results show that this protocol is the most efficient to date.

Lélia Blin, Maria Potop-Butucaru, Stephane Rovedakis, Sébastien Tixeuil

Euler Tour Lock-In Problem in the Rotor-Router Model

I Choose Pointers and You Choose Port Numbers

The

rotor-router model

, also called the

Propp machine

, was first considered as a deterministic alternative to the random walk. It is known that the route in an undirected graph

G

= (

V

,

E

), where |

V

| =

n

and |

E

| =

m

, adopted by an agent controlled by the rotor-router mechanism forms eventually an Euler tour based on arcs obtained via replacing each edge in

G

by two arcs with opposite direction. The process of ushering the agent to an Euler tour is referred to as the

lock-in problem

. In recent work [11] Yanovski et al. proved that independently of the initial configuration of the rotor-router mechanism in

G

the agent locks-in in time bounded by 2

mD

, where

D

is the diameter of

G

.

In this paper we examine the dependence of the lock-in time on the initial configuration of the rotor-router mechanism. The case study is performed in the form of a game between a player

$\cal P$

intending to lock-in the agent in an Euler tour as quickly as possible and its adversary

$\cal A$

with the counter objective. First, we observe that in certain (easy) cases the lock-in can be achieved in time

O

(

m

). On the other hand we show that if adversary

$\cal A$

is solely responsible for the assignment of ports and pointers, the lock-in time Ω(

m

·

D

) can be enforced in any graph with

m

edges and diameter

D

. Furthermore, we show that if

$\cal A$

provides its own port numbering after the initial setup of pointers by

$\cal P$

, the complexity of the lock-in problem is bounded by

O

(

m

· min {log

m

,

D

}). We also propose a class of graphs in which the lock-in requires time Ω(

m

·log

m

). In the remaining two cases we show that the lock-in requires time Ω(

m

·

D

) in graphs with the worst-case topology. In addition, however, we present non-trivial classes of graphs with a large diameter in which the lock-in time is

O

(

m

).

Evangelos Bampas, Leszek Gąsieniec, Nicolas Hanusse, David Ilcinkas, Ralf Klasing, Adrian Kosowski

Consensus and Byzantine Agreement (Session 3C)

Optimum Simultaneous Consensus for General Omissions Is Equivalent to an NP Oracle

The general omissions failure model, in which a faulty process may omit both to send

and

to receive messages is inherently more complex than the more popular sending omissions model. This fact is exemplified in tasks involving simultaneous decisions, such as the simultaneous consensus (SC) problem. While efficient polynomial protocols for SC that are optimal in all runs are known for the sending omissions model, they do not exists for general omissions. It has been shown that such a protocol must perform at least NP-hard computations (in the number of processes

n

) between rounds. In fact, the best previously known SC protocol that is optimal in all runs in this model performs PSPACE (in

n

) computations between rounds. The current paper closes this twenty-year old gap by presenting such an optimal SC protocol that performs

P

NP

computations (polynomial-time computations using an oracle for NP; in fact, a constant number of accesses to the oracle are needed per round.) The result is based on a new characterization of common knowledge in the general omissions failure model.

Yoram Moses

On the Number of Synchronous Rounds Sufficient for Authenticated Byzantine Agreement

Byzantine agreement is typically considered with respect to either a fully synchronous network or a fully asynchronous one. In the synchronous case,

t

+ 1 communication rounds are necessary for deterministic protocols whereas all known probabilistic protocols require an

expected

large number of rounds. In this paper we examine the question of how many initial synchronous rounds are required for Byzantine agreement in the

worst case

if we allow to switch to asynchronous operation afterward. Let

n

=

h

+

t

be the number of parties where

h

are honest and

t

are corrupted. As the main result we show that, in the model with a public-key infrastructure and signatures (aka authenticated Byzantine agreement),

d

+

O

(1) deterministic synchronous rounds are sufficient where

d

is the minimal integer such that

n

−

d

> 3(

t

−

d

). This improves over the

t

+ 1 necessary deterministic rounds for almost all cases, and over the exact expected number of rounds in the non-deterministic case for many cases.

Matthias Fitzi, Jesper Buus Nielsen

From Almost Everywhere to Everywhere: Byzantine Agreement with $\tilde{O}(n^{3/2})$ Bits

We address the problem of designing distributed algorithms for large scale networks that are robust to Byzantine faults. We consider a message passing, full information synchronous model: the adversary is malicious, controls a constant fraction of processors, and can view all messages in a round before sending out its own messages for that round. Furthermore, each corrupt processor may send an unlimited number of messages. The only constraint on the adversary is that it must choose its corrupt processors at the start, without knowledge of the processors’ private random bits. To the authors’ best knowledge, there have been no protocols for such a model that compute Byzantine agreement without all-to-all communication, even if private channels or cryptography are assumed, unless corrupt processors’ messages are limited.

In this paper, we give a polylogarithmic time algorithm to agree on a small representative committee of processors using only

$\tilde{O}(n^{3/2})$

total bits which succeeds with high probability. This representative set can then be used to efficiently solve Byzantine agreement, leader election, or other problems. This work extends the authors’ work on scalable almost everywhere agreement.

Valerie King, Jared Saia

Brief Announcement: A Leader-free Byzantine Consensus Algorithm

We consider the consensus problem in a partially synchronous system with Byzantine faults. In a distributed system of

n

processes, where each process has an initial value, Byzantine consensus is the problem of agreeing on a common value, even though some of the processes may fail in arbitrary, even malicious, ways. It is shown in [11] that — in a synchronous system — 3

t

+ 1 processes are needed to solve the Byzantine consensus problem without signatures, where

t

is the maximum number of Byzantine processes. In an asynchronous system, Fischer, Lynch and Peterson [7] proved that no deterministic asynchronous consensus protocol can tolerate even a single non-Byzantine (= crash) failure. The problem can however be solved using randomization for benign and Byzantine faults. For Byzantine faults, Ben-Or [2] and Rabin [12] showed that this requires 5

t

+ 1 processes. Later, Bracha [3] increased the resiliency of the randomized algorithm to 3

t

+ 1.

Fatemeh Borran, André Schiper

Radio Networks (Session 3D)

Efficient k-Shot Broadcasting in Radio Networks

The paper concerns time-efficient

k

-shot broadcasting in undirected radio networks. In a

k-shot

broadcasting algorithm, each node in the network is allowed to transmit at most

k

times. Both known and unknown topology models are considered. For the known topology model, the problem has been studied before by Ga̧sieniec et al. [14], who established an upper bound of

D

+

O

(

kn

1/(

k

− 2)

log

2

n

) and a lower bound of

$D+{\it \Omega}((n-D)^{1/2k})$

on the length of

k

-shot broadcasting schedules for

n

-node graphs of diameter

D

. We improve both the upper and the lower bound, providing a randomized algorithm for constructing a

k

-shot broadcasting schedule of length

D

+

O

(

kn

1/2

k

log

2 + 1/

k

n

) on undirected graphs, and a lower bound of

$D+{\it \Omega}(k\cdot(n-D)^{1/2k})$

, which almost closes the gap between these bounds. For the unknown topology model, we provide the first

k

-shot broadcasting algorithm. Assuming that each node knows only the network size

n

(or a linear upper bound on it), our randomized

k

-shot broadcasting algorithm completes broadcasting in

O

((

D

+ min {

D

·

k

,log

n

}) ·

n

1/(

k

− 1)

log

n

) rounds with high probability. Moreover, we present an

${\it \Theta}(\log n )$

-shot broadcasting algorithm that completes broadcasting in at most

O

(

D

log

n

+ log

2

n

) rounds with high probability. This algorithm matches the broadcasting time of the algorithm of Bar-Yehuda et al. [3], which assumes no limitation on the maximum number of transmissions per node.

Erez Kantor, David Peleg

Keeping Mobile Robot Swarms Connected

Designing robust algorithms for mobile agents with reliable communication is difficult due to the distributed nature of computation, in mobile ad hoc networks (MANETs) the matter is exacerbated by the need to ensure connectivity. Existing distributed algorithms provide coordination but typically assume connectivity is ensured by other means. We present a connectivity service that encapsulates an arbitrary motion planner and can refine any plan to preserve connectivity (the graph of agents remains connected) and ensure progress (the agents advance towards their goal). The service is realized by a distributed algorithm that is

modular

in that it makes no assumptions of the motion-planning mechanism except the ability for an agent to query its position and intended goal position,

local

in that it uses 1-hop broadcast to communicate with nearby agents but doesn’t need any network routing infrastructure, and

oblivious

in that it does not depend on previous computations.

We prove the progress of the algorithm in one round is at least

${\it \Omega}(\min(d,r))$

, where

d

is the minimum distance between an agent and its target and

r

is the communication radius. We characterize the worst case configuration and show that when

d

≥

r

this bound is tight and the algorithm is optimal, since no algorithm can guarantee greater progress. Finally we show all agents get

ε

-close to their targets within

$O(D_0/r+n^2/\varepsilon)$

rounds where

n

is the number of agents and

D

0

is the sum of the initial distances to the targets.

Alejandro Cornejo, Fabian Kuhn, Ruy Ley-Wild, Nancy Lynch

Consensus and Mutual Exclusion in a Multiple Access Channel

We consider deterministic feasibility and time complexity of two fundamental tasks in distributed computing: consensus and mutual exclusion. Processes have different labels and communicate through a multiple access channel. The adversary wakes up some processes in possibly different rounds. In any round every awake process either listens or transmits. The message of a process

i

is heard by all other awake processes, if

i

is the only process to transmit in a given round. If more than one process transmits simultaneously, there is a collision and no message is heard. We consider three characteristics that may or may not exist in the channel: collision detection (listening processes can distinguish collision from silence), the availablity of a global clock showing the round number, and the knowledge of the number

n

of all processes.

If none of the above three characteristics is available in the channel, we prove that consensus and mutual exclusion are infeasible; if at least one of them is available, both tasks are feasible and we study their time complexity. Collision detection is shown to cause an exponential gap in complexity: if it is available, both tasks can be performed in time logarithmic in

n

, which is optimal, and without collision detection both tasks require linear time. We then investigate both consensus and mutual exclusion in the absence of collision detection, but under alternative presence of the two other features. With global clock, we give an algorithm whose time complexity linearly depends on

n

and on the wake-up time, and an algorithm whose complexity does not depend on the wake-up time and differs from the linear lower bound only by a factor

O

(log

2

n

). If

n

is known, we also show an algorithm whose complexity differs from the linear lower bound only by a factor

O

(log

2

n

).

Jurek Czyzowicz, Leszek Gąsieniec, Dariusz R. Kowalski, Andrzej Pelc

Brief Announcement: Efficient Utilization of Multiple Interfaces in Wireless Ad Hoc Networks

A wireless ad hoc network is composed of devices that are capable of communicating directly with their neighbors (roughly speaking, nodes that are nearby). Many such devices are battery-operated, e.g., laptops, smart-phones and PDAs. Thus, their operational life-time before the battery should be recharged or replaced is limited. Among all subsystems operating inside these devices, wireless communication is accounted for the major consumption of power [1,2]. Additionally, platforms enabled with multiple wireless communication interfaces are becoming quite common. This turns the problem of efficient power usage by the wireless communication subsystem even more acute.

Roy Friedman, Alex Kogan

Brief Announcement: The Speed of Broadcasting in Random Networks – Density Does Not Matter

We consider the problem of spreading information in large random networks with small average degree. Randomized broadcasting is among the most fundamental and well-studied communication primitives in distributed computing, and has also applications in several other disciplines, like e.g. in mathematical theories of epidemics. A particularly popular example [1] is the maintenance of consistency in a distributed database, which is replicated at many hundreds or thousands of sites in a large, heterogeneous network. Obviously, efficient broadcasting algorithms are crucial in order to ensure that all copies of the database converge quickly and effectively to the same content.

Nikolaos Fountoulakis, Anna Huber, Konstantinos Panagiotou

Springer Professional

About this book

Table of Contents

Frontmatter

The 2009 Edsger W. Dijkstra Prize in Distributed Computing

The 2009 Edsger W. Dijkstra Prize in Distributed Computing

Michel Raynal and Shmuel Zaks 60th Birthday Symposium

Computing, Observing, Controlling, Checkpointing: Symbiosis Is Even Better Than Agreement!

What Agreement Problems Owe Michel

Shmuel Zaks - The Early Years: A Combinatorialist in Distributed Computing

Shmuel Zaks - The Mathematician, Computer Scientist and Personality

Award Nominees (Session 2B)

The Disagreement Power of an Adversary

New Bounds for the Controller Problem

On Set Consensus Numbers

The Abstract MAC Layer

Randomization Can Be a Healer: Consensus with Dynamic Omission Failures

Transactional Memory (Session 1A)

Interrupting Snapshots and the Java $^{\mbox{\tiny TM}}$ Size() Method

Elastic Transactions

Brief Announcement: Transactional Scheduling for Read-Dominated Workloads

Shared Memory (Session 1B)

Tight Group Renaming on Groups of Size g Is Equivalent to g-Consensus

The RedBlue Adaptive Universal Constructions

Help When Needed, But No More: Efficient Read/Write Partial Snapshot

Contention-Sensitive Data Structures and Algorithms

Brief Announcement: Acceleration by Contention for Shared Memory Mutual Exclusion Algorithms

Brief Announcement: Incremental Component-Based Modeling, Verification, and Performance Evaluation of Distributed Reset

Distributed and Local Graph Algorithms (Session 1C)

Local Computation of Nearly Additive Spanners

A Local 2-Approximation Algorithm for the Vertex Cover Problem

Distributed Discovery of Large Near-Cliques

Distributed Fractional Packing and Maximum Weighted b-Matching via Tail-Recursive Duality

Brief Announcement: Decidable Graph Languages by Mediated Population Protocols

Brief Announcement: Towards Secured Distributed Polling in Social Networks

Modeling Issues (Session 1D)

What Can Be Observed Locally?

At-Most-Once Semantics in Asynchronous Shared Memory

Nonblocking Algorithms and Backward Simulation

Brief Announcement: Efficient Model Checking of Fault-Tolerant Distributed Protocols Using Symmetry Reduction

Brief Announcement: Dynamic FTSS in Asynchronous Systems: The Case of Unison

Game Theory (Session 2A)

Dynamics in Network Interaction Games

Brief Announcement: Cloud Computing Games: Pricing Services of Large Data Centers

Failure Detectors (Session 2C)

On the Existence of Weakest Failure Detectors for Mutual Exclusion and k-Exclusion

Crash-Quiescent Failure Detection

The Price of Anonymity: Optimal Consensus Despite Asynchrony, Crash and Anonymity

Brief Announcement: On Implementing Omega Efficiently in the Crash-Recovery Model

Brief Announcement: The Minimum Failure Detector for Non-Local Tasks in Message-Passing Systems

Brief Announcement: Weak Synchrony Models and Failure Detectors for Message Passing (k-)Set Agreement

From Theory to Practice (Session 3A)

Brief Announcement Zab: A Practical Totally Ordered Broadcast Protocol

Graph Algorithms and Routing (Session 3B)

Compact Multicast Routing

Compact Routing in Power-Law Graphs

Virtual Ring Routing Trends

A New Self-stabilizing Minimum Spanning Tree Construction with Loop-Free Property

Euler Tour Lock-In Problem in the Rotor-Router Model

Consensus and Byzantine Agreement (Session 3C)

Optimum Simultaneous Consensus for General Omissions Is Equivalent to an NP Oracle

On the Number of Synchronous Rounds Sufficient for Authenticated Byzantine Agreement

From Almost Everywhere to Everywhere: Byzantine Agreement with $\tilde{O}(n^{3/2})$ Bits

Brief Announcement: A Leader-free Byzantine Consensus Algorithm

Radio Networks (Session 3D)

Efficient k-Shot Broadcasting in Radio Networks

Keeping Mobile Robot Swarms Connected

Consensus and Mutual Exclusion in a Multiple Access Channel

Brief Announcement: Efficient Utilization of Multiple Interfaces in Wireless Ad Hoc Networks

Brief Announcement: The Speed of Broadcasting in Random Networks – Density Does Not Matter

Backmatter

Premium Partner