2012 | OriginalPaper | Book chapter
Fault Tolerance in Distributed Computing
Author: Christian Storm
Published in: Specification and Analytical Evaluation of Heterogeneous Dynamic Quorum-Based Data Replication Schemes
Publisher: Vieweg+Teubner Verlag
A distributed system consists of several independent processing components that interact with each other over a communication network, which is itself built from communication components. Distributed computing refers to the algorithmic control of the system's processing components by a distributed program in order to reach a collective goal, that is, to provide a certain service. Unfortunately, the components of any real system are imperfect and therefore prone to failures that may render the system unable to provide the service. To tolerate the failure of some components, that is, to keep the service available despite these failures, the system must be equipped with redundancy in space and time. Redundancy in space refers to redundant components that take over the role of failed components. Redundancy in time refers to the additional overhead required to manage these redundant components. Fault-tolerant distributed computing, then, refers to the algorithmic control of the distributed system's components so that the desired service is provided despite the presence of certain failures, by exploiting redundancy in space and time.
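The interplay of the two kinds of redundancy can be illustrated with a minimal Python sketch. It is not taken from the chapter: the names `ComponentFailure`, `call_with_failover`, `healthy`, `failed`, and `availability` are hypothetical, the number of attempts is used only as a rough stand-in for the management overhead (redundancy in time), and the availability formula assumes that components fail independently with the same per-component availability `p`.

```python
from typing import Callable, Sequence, Tuple


class ComponentFailure(Exception):
    """Raised by a (hypothetical) processing component that has failed."""


def call_with_failover(
    replicas: Sequence[Callable[[str], str]], request: str
) -> Tuple[str, int]:
    """Try each redundant replica in turn and return (response, attempts).

    Redundancy in space: a later replica takes over when an earlier one fails.
    Redundancy in time: every extra attempt is overhead paid to mask the failure.
    """
    attempts = 0
    for replica in replicas:
        attempts += 1
        try:
            return replica(request), attempts
        except ComponentFailure:
            continue  # this replica is down; fall through to the next one
    raise ComponentFailure("all replicas failed; service unavailable")


def healthy(request: str) -> str:
    """A replica that answers normally."""
    return f"ok:{request}"


def failed(request: str) -> str:
    """A replica that is currently down."""
    raise ComponentFailure("replica down")


def availability(p: float, n: int) -> float:
    """Probability that at least one of n independent replicas is up.

    Assumes independent failures and identical per-replica availability p.
    """
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    response, attempts = call_with_failover([failed, failed, healthy], "read")
    print(response, attempts)
    print(availability(0.9, 3))
```

In this sketch the service survives two failed replicas at the cost of two wasted attempts, and under the independence assumption three replicas with availability 0.9 each raise the availability of the service to roughly 0.999. Real failures are often correlated, so the formula should be read as an optimistic bound rather than a prediction.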