Self-stabilizing clock synchronization in the presence of Byzantine faults

Authors:
Shlomi Dolev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
Jennifer L. Welch, Texas A&M University, College Station, Texas

Journal of the ACM, Volume 51, Issue 5, pp. 780–799. https://doi.org/10.1145/1017460.1017463

Published: 01 September 2004

Abstract

We initiate a study of bounded clock synchronization under a more severe fault model than that proposed by Lamport and Melliar-Smith [1985]. Realistic aspects of the problem of synchronizing clocks in the presence of faults are considered. One aspect is that clock synchronization is an on-going task, thus the assumption that some of the processors never fail is too optimistic. To cope with this reality, we suggest self-stabilizing protocols that stabilize in any (long enough) period in which less than a third of the processors are faulty. Another aspect is that the clock value of each processor is bounded. A single transient fault may cause the clock to reach the upper bound. Therefore, we suggest a bounded clock that wraps around when appropriate.

We present two randomized self-stabilizing protocols for synchronizing bounded clocks in the presence of Byzantine processor failures. The first protocol assumes that processors have a common pulse, while the second protocol does not. A new type of distributed counter based on the Chinese remainder theorem is used as part of the first protocol.
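The abstract does not describe how the counter is constructed. As a rough illustration of the underlying arithmetic only, and not of the authors' protocol, the following Python sketch keeps a bounded, wrapping counter as a tuple of residues modulo pairwise-coprime numbers and recovers its value with the Chinese remainder theorem. The moduli and the names `MODULI`, `tick`, `to_int`, and `from_int` are assumptions made for this example.

```python
from math import prod

# Hypothetical pairwise-coprime moduli; the paper's actual choice is not given here.
MODULI = (2, 3, 5, 7)
PERIOD = prod(MODULI)  # the counter wraps around after 210 ticks


def from_int(value):
    """Represent a counter value as one small residue per modulus."""
    return tuple(value % m for m in MODULI)


def tick(residues):
    """Advance the counter by one: each residue wraps at its own modulus."""
    return tuple((r + 1) % m for r, m in zip(residues, MODULI))


def to_int(residues):
    """Recover the value in [0, PERIOD) using the Chinese remainder theorem."""
    total = 0
    for r, m in zip(residues, MODULI):
        partial = PERIOD // m
        # pow(x, -1, m) computes the modular inverse (Python 3.8+).
        total += r * partial * pow(partial, -1, m)
    return total % PERIOD


state = from_int(PERIOD - 1)
print(to_int(tick(state)))  # wraps around to 0
```

Each component is a small bounded counter in its own right, which is what makes this representation a natural fit for bounded clocks that must wrap around.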

References

Arora, A., Dolev, S., and Gouda, M. 1991. Maintaining digital clocks in step. Parall. Proc. Lett. 1, 1, 11–18.
Daliot, A., Dolev, D., and Parnas, H. 2002. Self-stabilizing pulse synchronization inspired by biological pacemaker. In Proceedings of the 6th International Symposium on Self-Stabilizing Systems. Lecture Notes in Computer Science, Vol. 2704. Springer-Verlag, New York, 32–48.
Dijkstra, E. W. 1974. Self stabilizing systems in spite of distributed control. Commun. ACM 17, 643–644.
Dolev, D., Halpern, J. Y., and Strong, H. R. 1986. On the possibility and impossibility of achieving clock synchronization. J. Comput. Syst. Sci. 32, 2, 230–250.
Dolev, S., Israeli, A., and Moran, S. 1991. Uniform dynamic self stabilizing leader election. In Proceedings of the 5th International Workshop on Distributed Algorithms. 167–180.
Dolev, S., Israeli, A., and Moran, S. 1995. Analyzing expected time by scheduler-luck games. IEEE Trans. Softw. Eng. 21, 5 (May).
Dolev, D., Lynch, N. A., Pinter, S. S., Stark, E. W., and Weihl, W. E. 1986. Reaching approximate agreement in the presence of faults. J. ACM 33, 499–516.
Dolev, S. and Welch, J. L. 1993. Wait-free clock synchronization. In Proceedings of the 12th ACM Symposium on Principles of Distributed Computing. ACM, New York, 97–108. Also appeared in Algorithmica 18, 1997, 486–511.
Cristian, F. 1989. Probabilistic clock synchronization. Distrib. Comput. 3, 146–158.
Gopal, A. S. and Perry, K. J. 1993. Unifying self-stabilization and fault-tolerance. In Proceedings of the 12th ACM Symposium on Principles of Distributed Computing. ACM, New York, 195–206.
Halpern, J., Simons, B., Strong, R., and Dolev, D. 1984. Fault-tolerant clock synchronization. In Proceedings of the 3rd ACM Symposium on Principles of Distributed Computing. ACM, New York, 89–102.
Knuth, D. E. 1981. The Art of Computer Programming, Vol. 2, 2nd ed. Addison-Wesley, Reading, Mass.
Lamport, L. and Melliar-Smith, P. M. 1985. Synchronizing clocks in the presence of faults. J. ACM 32, 1, 1–36.
Lamport, L., Shostak, R., and Pease, M. 1982. The Byzantine generals problem. ACM Trans. Prog. Lang. Syst. 4, 3 (July), 382–401.
Lundelius, J. and Lynch, N. 1984. An upper and lower bound for clock synchronization. Info. Cont. 62, 190–204.
Mahaney, S. and Schneider, F. 1985. Inexact agreement: Accuracy, precision and graceful degradation. In Proceedings of the 4th ACM Symposium on Principles of Distributed Computing. ACM, New York, 237–249.
Ramanathan, P., Shin, K. G., and Butler, R. W. 1990. Fault-tolerant clock synchronization in distributed systems. IEEE Comput. (Oct), 33–42.
Srikanth, T. K. and Toueg, S. 1987. Optimal clock synchronization. J. ACM 34, 3, 626–645.
Szabo, S. and Tanaka, R. I. 1967. Residue Arithmetic and its Applications to Computer Technology. McGraw-Hill, New York.
Welch, J. L. and Lynch, N. 1988. A new fault-tolerant algorithm for clock synchronization. Info. Comput. 77, 1, 1–36.
Wensley, J. H., Lamport, L., Goldberg, J., Green, M. W., Levitt, K. N., Melliar-Smith, P. M., Shostak, R. E., and Weinstock, C. B. 1978. SIFT: Design and analysis of fault-tolerant computer for aircraft control. Proc. IEEE 66, 10, 1240–1255.

Reviews

Reviewer: Bayard Kohlhepp

The devil is in the details. Twenty-five million Europeans died from the bubonic plague, the "Black Death," in the Middle Ages. The European outbreaks have been traced to a single Italian merchant ship that picked up goods from China during an outbreak there; it was unknown in Europe beforehand. Mighty Athens fell to the same plague centuries earlier, when its citizens crowded together in the city during a Spartan siege. What triggered these outbreaks? Fleas, which transmitted the plague from rat hosts to humans. We now have computer systems reaching around the globe, spreading themselves farther and thinner (as in cell phones and other handhelds). Is there some small detail that could bring them crashing down?

The integrity of every transactional system, whether it's a small database or the largest business-to-business (B2B) network, depends on clocks. If the order of transactions cannot be determined, a transactional system can't function. Imagine transactions being sent to a bank account. They can produce very different real-world results if you vary the order in which you apply them. Synchronized clocks are the heartbeat of distributed systems.

This paper was published in the Proceedings of the 14th Annual ACM Symposium on Principles of Distributed Computing, in August 1995, as "Self-stabilizing clock synchronization with Byzantine faults." I could only find a one-page summary when I looked it up in the ACM Digital Library today, but I used the entire paper a few years ago to design some middleware.

Dolev and Welch build on the work of Lamport and Melliar-Smith [1]. They make several distinct contributions. First, they expand the robustness of the recovery protocol, so that it functions with up to one-third of the processors known or suspected to be faulty. Second, they remove the specification of unbounded clocks that was present in previous work, which makes their protocols usable on real-world systems. Third, they make the protocol self-stabilizing, so that no initial state has to be globally asserted, and so that resynchronization (recovery) is possible after transient faults, such as more than a third of the processors becoming temporarily faulty. Finally, they present two protocols, one of which is dependent on a shared pulse. The article is 19 pages long, very readable, and, as mentioned above, was used to create real software.

Online Computing Reviews Service
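The reviewer's bank-account remark can be made concrete with a small Python sketch. The account rule used here (a withdrawal larger than the balance is rejected) and the function name `apply` are assumptions chosen for illustration; they do not come from the paper or the review.

```python
def apply(balance, transactions):
    """Apply (kind, amount) transactions in the given order."""
    for kind, amount in transactions:
        if kind == "deposit":
            balance += amount
        elif kind == "withdraw" and amount <= balance:
            balance -= amount
        # A withdrawal larger than the balance is rejected and changes nothing.
    return balance


deposit = ("deposit", 100)
withdraw = ("withdraw", 150)

print(apply(100, [deposit, withdraw]))  # 50: the withdrawal succeeds
print(apply(100, [withdraw, deposit]))  # 200: the withdrawal is rejected
```

The same two transactions leave the account at 50 or at 200 depending only on their order, which is why a transactional system needs timestamps that order events consistently.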

Reviewer: Veronica Lagrange

Dolev and Welch describe two self-stabilizing randomized protocols to synchronize distributed clocks in a fault-tolerant environment. The protocols will synchronize, in bounded time, a distributed system in any initial state, as long as less than one-third of the processors are faulty. The emphasis of the paper is on bringing the system back to a synchronous, reliable state from a faulty, unsynchronized one. The first protocol, called synchronous, assumes the existence of a common pulse that propagates throughout all processors at a constant interval. This pulse is the trigger that each processor uses to compare its clock with all the others and to execute the synchronization steps, if necessary. The second protocol, called semi-synchronous, drops the assumption of a common pulse. This second protocol has a rather complicated proof, again mostly aimed at showing that the protocol will synchronize all nonfaulty processors after any initially faulty scenario. One important feature of fault-tolerant environments, apparently overlooked in the second protocol, is that the clock of a nonfaulty processor should not be moved backward; doing so causes the timestamp of a later event to be smaller than the timestamp of an earlier event.

Online Computing Reviews Service
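Two points in this review lend themselves to a short Python sketch: the fault bound (fewer than one-third of n processors faulty, that is, 3f < n) and the timestamp inversion that follows when a clock is set backward. The names `within_fault_bound` and `SettableClock`, and the specific clock values, are assumptions for illustration and are not taken from the paper.

```python
def within_fault_bound(n, f):
    """True when fewer than one-third of n processors are faulty (3f < n)."""
    return 3 * f < n


class SettableClock:
    """A clock that a synchronization step may set to an arbitrary value."""

    def __init__(self, value=0):
        self.value = value

    def tick(self):
        """Advance by one and return the new value as an event timestamp."""
        self.value += 1
        return self.value

    def set(self, value):
        self.value = value


print(within_fault_bound(4, 1))  # True: one fault among four processors
print(within_fault_bound(3, 1))  # False: exactly one-third is too many

clock = SettableClock(10)
first = clock.tick()   # earlier event, timestamp 11
clock.set(7)           # hypothetical resynchronization moves the clock backward
second = clock.tick()  # later event, timestamp 8
print(second < first)  # True: the later event carries the smaller timestamp
```

The inversion in the last line is the hazard the reviewer describes: any component that orders events by timestamp would place the second event before the first.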

Published in

Journal of the ACM Volume 51, Issue 5
September 2004
151 pages
ISSN:0004-5411
EISSN:1557-735X
DOI:10.1145/1017460
Copyright © 2004 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Byzantine failures
clock synchronization
self-stabilization