ABSTRACT
A simple and general design uses message-based communication to provide software tolerance of single-point hardware failures. By delivering all interprocess messages to inactive backups for both the sender and the destination, both backups are kept in a state in which they can take over for their primaries.
An implementation for the Auragen 4000 series of M68000-based systems is described. The operating system, Auros™, is a distributed version of UNIX*. Major goals have been transparency of fault tolerance and efficient execution in the absence of failure.
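The three-way delivery described above can be illustrated with a minimal sketch. It is not the Auros implementation: the class names, the `send` and `take_over` helpers, and the recovery policy (the promoted backup replays its saved messages and suppresses sends its primary already made) are assumptions introduced only for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Backup:
    """Inactive backup: it only records traffic until it takes over."""
    saved_inbox: deque = field(default_factory=deque)  # copies of messages sent to its primary
    sends_seen: int = 0                                # count of messages its primary has sent


@dataclass
class Primary:
    name: str
    backup: Backup
    inbox: deque = field(default_factory=deque)
    skip_sends: int = 0  # assumed policy: sends to suppress while re-executing after takeover


def send(sender: Primary, dest: Primary, payload: str) -> None:
    """Hypothetical helper: deliver one message to three places."""
    if sender.skip_sends > 0:
        # The failed primary already sent this message; do not duplicate it.
        sender.skip_sends -= 1
        return
    dest.inbox.append(payload)               # 1. the primary destination
    dest.backup.saved_inbox.append(payload)  # 2. the destination's backup
    sender.backup.sends_seen += 1            # 3. the sender's backup notes the send


def take_over(failed: Primary) -> Primary:
    """Promote the backup of a failed primary (assumed recovery policy)."""
    b = failed.backup
    return Primary(failed.name, Backup(), deque(b.saved_inbox), b.sends_seen)


# Usage: A sends two requests to B, then both primaries fail in turn.
a = Primary("A", Backup())
b = Primary("B", Backup())
send(a, b, "req-1")
send(a, b, "req-2")

b2 = take_over(b)       # B's backup holds every message B was sent
a2 = take_over(a)       # A's backup knows A already sent two messages
send(a2, b2, "req-1")   # re-executed send is suppressed, not delivered twice
```

In this sketch the destination's backup can resume with the same input its primary received, and the sender's backup can re-execute without emitting duplicates, which is the sense in which both backups stay ready to take over.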
*UNIX is a trademark of Bell Laboratories.