skip to main content
article
Free Access

The Totem single-ring ordering and membership protocol

Published:01 November 1995Publication History
Skip Abstract Section

Abstract

Fault-tolerant distributed systems are becoming more important, but in existing systems, maintaining the consistency of replicated data is quite expensive. The Totem single-ring protocol supports consistent concurrent operations by placing a total order on broadcast messages. This total order is derived from the sequence number in a token that circulates around a logical ring imposed on a set of processors in a broadcast domain. The protocol handles reconfiguration of the system when processors fail and restart or when the network partitions and remerges. Extended virtual synchrony ensures that processors deliver messages and configuration changes to the application in a consistent, systemwide total order. An effective flow control mechanism enables the Totem single-ring protocol to achieve message-ordering rates significantly higher than the best prior total-ordering protocols.

References

  1. AMiR, Y, DOLEV, D., KRAMER, S., AND MALKI, D. 1992a. Membership algorithms in broadcast domains In Proceedings of the 6th Internattonal Workshop on Distributed Algorithms (Haifa, Israel). Spnnger-Verlag, Berlin, 292-312. Google ScholarGoogle Scholar
  2. AMm, Y, DOLEV, D., KRAIvlER, S., AND MALKI, D. 1992b. Transis: A commumcation subsystem for high avafiabfilty. In Proceedings of the IEEE 22nd Annual Internatmnal Svmposiiim on Fault-Tolerant Computing (Boston, Mass). IEEE, New York, 76-84Google ScholarGoogle Scholar
  3. A~nR, Y, MOSER, L E., MELLIAR-SMITH, P. M., AGARWAL, D. A., AND CIARFELLA, P. 1993 Fast message ordering and membership using a logical token-passing ring. In Proceedings of the IEEE 13th International Conference on Distributed Computing Systems (Pittsburgh, Pa) IEEE, New York, 551-560.Google ScholarGoogle Scholar
  4. A~nn, Y, MOSER, L. E, MELLIAR-SMITH, P. M., AGARWAL, D. A, A~'4O CIAnFELLA, P. 1994. The Totem single-ring ordering and membership protocol. Tech Rep. 94-19, Dept of Electrical and Computer Engnneermg, Umv. of California, Santa Barbara, Calif. Aug.Google ScholarGoogle Scholar
  5. BIRMAN K. P AND VAN RENESSE, R. 1994 Reliable Distributed Computing w~th the Isis Toolkit. IEEE Computer Somety Press, Los Alamltos, Calif Google ScholarGoogle Scholar
  6. BOXMA, O. J., LEVY J., AND WESTRATE, J.A. 1990. Optimization of polling systems. In Performance '90, Proceedings of the 14th IFIP WG 7.3 International Symposium on Computer Perforrnarice Modelhng, Measurement and Evaluation (Edinburg, U.K ). North-Holland, Amsterdam, 349 361. Google ScholarGoogle Scholar
  7. CHANG, J. M. AND Mg24EMCHUK, N F. 1984. Reliable broadcast protocols ACM Trans Comput Syst. 2, 3 (Aug.), 251 273. Google ScholarGoogle Scholar
  8. FISCHER, M. J., LYNCH, N. A., AND PATERSON, M S. 1985. Impossibility of distributed consensus with one faulty process J. ACM 32, 2 (Apr), 374-382 Google ScholarGoogle Scholar
  9. K~SHOEK, M F. ANn TANENnAU~, A. S. 1991. Group communicatmn m the Amoeba distnbuted operating system. In Proceedings of the IEEE lltb International Conference on Distributed Computing Systems (Arlington, Tax ). IEEE, New York, 882 891.Google ScholarGoogle Scholar
  10. LAMPORT, L 1978. Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21, 7 (July), 558-565. Google ScholarGoogle Scholar
  11. MELLIAR-SMITH, P. M., MOSER, L. E., AND AGARWAL, D.A. 1991. Ring-based ordering protocols. In Proceedings of the IEE International Conference on Information Engineering (Singapore) IEE, Stevenage, Harts, U.K, 882 891Google ScholarGoogle Scholar
  12. MELLIAn-SMITn, P. M., MOSE~, L. E., ANi~ A(n~AWALA, V. 1990. Broadcast protocols for d~stributed sytems. IEEE Trans. Parallel D*strib. Syst. 1, 1 (Jan.), 17-25. Google ScholarGoogle Scholar
  13. MISHRA, S., PETERSON, L. L., AND SCHLICHTING, R.D. 1991. A membership protocol based on partml order. In Procee&ngs of the 2rid IFIP WG 10.4 Internatzonal Working Conference on Dependable Computing for Cr~tzcal Appl~catmns (Tucson, Ariz.). Springer-Verlag, Wien, Austria, 309-331.Google ScholarGoogle Scholar
  14. MosE~, L. E ANn MELLIAR-SMITH, P.M. 1994. Probablhstm bounds on message delivery for the Totem single-rang protocol. In Proceedings of the IEEE 15th Real-T~me Systems Symposium (San Juan, Puerto Rico). IEEE, New York, 238-248.Google ScholarGoogle Scholar
  15. MOSEn, L. E., AMIn, Y., MnLLIAn-SMITn, P M., AND AOARWAL, D.A. 1994a. Extended virtual synchrony. In Proceedings of the IEEE 14th International Conference on Dzstrlbuted Computing Systems (Posnan, Poland). IEEE, New York, 56-65.Google ScholarGoogle Scholar
  16. MOSER, L. E., MELL~AR-SMn'H, P. M., AND AGRAWALA, V. 1994b. Processor membership in asynchronous distributed systems IEEE Trans Parallel Dlstrzb. Syst. 5, 5 (May), 459-473. Google ScholarGoogle Scholar
  17. PETERSON, L. L,, BUCHHOLZ, N. C., AND 8CHLICHTING, R.D. 1989. Preaerving and using context information in interprocess communication. ACM Trans Comput. Syst 7, 3 (Aug.), 217-246. Google ScholarGoogle Scholar
  18. RAJAGOPALAN, B. AND McKINLEY, P. K. 1989. A token-based protocol for reliable, ordered multicast communication In Proceedings of the IEEE 8th Symposium on Reliable Distributed Systems (Seattle, Wash ). IEEE, New York, 84-93.Google ScholarGoogle Scholar
  19. VAN RENESSE, R., HICKEY, T. M., AND BIRMAN, K. P 1994. Design and performance of Horus: A lightwmght group communications system Tech Rep. 94-1442, Dept. of Computer Science, Cornell Univ., Ithaca, N.Y. Aug. Google ScholarGoogle Scholar

Index Terms

  1. The Totem single-ring ordering and membership protocol

          Recommendations

          Reviews

          Valentin Cristea

          Totem is a single-ring protocol for high-performance, fault-tolerant distributed systems that must continue to operate despite network partitioning and re-merging and despite processor failure and restart. Totem provides totally ordered message delivery with good performance using a logical token-passing ring imposed on a broadcast domain. After an introductory section, the authors present related work and highlight the differences of the Totem protocol. Significant literature on the subject is analyzed. Section 3 is dedicated to the distributed system model used in the Totem protocol design. Several terms related to protocol functioning are defined. The objective of Totem is to provide the application with reliable message delivery and membership services. These services are described in section 4 of the paper. Section 5 is devoted to the total ordering protocol with the assumptions that the token is never lost; processor failures do not occur; and the network does not become partitioned. In section 6, the conditions are relaxed, and the protocol to handle token loss, processor failure and restart, and network partitioning and re-merging is presented. The protocol is described using a finite-state machine model. Data structures used, as well as pseudocode for the work performed by processors during different states of the model, are also given. Sections 7 and 8 present the recovery protocol that maintains extended virtual synchrony during recovery after a failure, and the flow control mechanism that avoids message loss due to buffer overflow. Section 9 addresses implementation and performance. Future work is mentioned at the end of the paper. The paper is well structured, but the presentation is not uniform, some aspects being described in great detail, while others are quickly summarized. The important works on the subject are included as references.

          Access critical reviews of Computing literature here

          Become a reviewer for Computing Reviews.

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in

          Full Access

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader