2011 | OriginalPaper | Book chapter
A Dependable and Efficient Scheduling Model and Fault Tolerance Service for Critical Applications on Grid Systems
Written by: Bahman Arasteh, Mohammad Javad Hosseini
Published in: Future Information Technology
Publisher: Springer Berlin Heidelberg
A grid system is a framework of heterogeneous remote resources operating in a hazardous environment. Reliability and performance must therefore be treated as major criteria when executing safety-critical applications on the grid. This paper proposes a job scheduling model and a fault tolerance service for the grid that improve dependability with respect to economic efficiency. The dynamic architecture of the scheduling model reduces resource consumption. The proposed fault tolerance service consists of failure detection and failure recovery. A three-layered detection service is proposed to improve failure coverage and to reduce the probability of false negative and false positive states. Checkpointing with an appropriate granularity is proposed as the recovery service to attain a tradeoff between failure detection latency and performance overhead. An analytical (Markov) approach is used to analyze the reliability, safety, and economic efficiency of the proposed model in the presence of permanent and transient faults.
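The checkpoint granularity tradeoff mentioned in the abstract can be illustrated with a small sketch. This is not the authors' model. It assumes a textbook simplification: failures arrive as a Poisson process, are detected immediately, and roll execution back to the last checkpoint. The function names and parameters (`expected_runtime`, `best_segment_count`, `checkpoint_cost`, `failure_rate`) are hypothetical and chosen only for illustration.

```python
import math


def expected_runtime(work, n_segments, checkpoint_cost, failure_rate):
    """Expected completion time when `work` is split into equal segments.

    Assumptions (not taken from the paper): Poisson failures with rate
    `failure_rate`, instant detection, and rollback to the start of the
    current segment, i.e. the most recent checkpoint.
    """
    # Each segment is its share of the work plus one checkpoint write.
    segment = work / n_segments + checkpoint_cost
    if failure_rate == 0:
        return n_segments * segment
    # Expected time to finish `segment` units of uninterrupted work under
    # restart-on-failure: (exp(rate * segment) - 1) / rate.
    return n_segments * math.expm1(failure_rate * segment) / failure_rate


def best_segment_count(work, checkpoint_cost, failure_rate, max_segments=1000):
    """Brute-force search for the segment count with the lowest expected runtime."""
    return min(
        range(1, max_segments + 1),
        key=lambda n: expected_runtime(work, n, checkpoint_cost, failure_rate),
    )


if __name__ == "__main__":
    n = best_segment_count(work=100.0, checkpoint_cost=1.0, failure_rate=0.01)
    print(n, expected_runtime(100.0, n, 1.0, 0.01))
```

Under these assumptions, too few checkpoints make each rollback expensive, while too many let the checkpoint overhead dominate. The search picks a point between the two extremes.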