2011 | OriginalPaper | Chapter
A Dependable and Efficient Scheduling Model and Fault Tolerance Service for Critical Applications on Grid Systems
Authors: Bahman Arasteh, Mohammad Javad Hosseini
Published in: Future Information Technology
Publisher: Springer Berlin Heidelberg
A grid system is a framework of heterogeneous remote resources operating in a hazardous environment. Hence, reliability and performance must be treated as major criteria when executing safety-critical applications in the grid. This paper proposes a model for job scheduling and a fault tolerance service in the grid that improve dependability with respect to economic efficiency. The dynamic architecture of the scheduling model reduces resource consumption. The proposed fault tolerance service consists of failure detection and failure recovery. A three-layered detection service is proposed to improve failure coverage and reduce the probability of false negative and false positive states. A checkpointing technique with an appropriate grain size is proposed as the recovery service to attain a tradeoff between failure detection latency and performance overhead. An analytical (Markov) approach is used to analyze the reliability, safety, and economic efficiency of the proposed model in the presence of permanent and transient faults.
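To make the Markov-style analysis mentioned in the abstract more concrete, the following is a minimal sketch of a discrete-time Markov reliability model. It is not the authors' model: the three states (UP, RECOVERING, FAILED), the function names `build_chain` and `reliability`, and all numeric probabilities are assumptions chosen only for illustration. It assumes that a detected transient fault leads to checkpoint-based recovery, while an undetected transient fault or a permanent fault leads to failure.

```python
# Hypothetical three-state chain; states and parameters are illustrative assumptions.
UP, RECOVERING, FAILED = 0, 1, 2


def build_chain(p_transient, p_permanent, coverage, p_recover):
    """Build a 3x3 transition matrix (each row sums to 1).

    p_transient: per-step probability of a transient fault while UP
    p_permanent: per-step probability of a permanent fault while UP
    coverage:    probability that a transient fault is detected
    p_recover:   per-step probability that rollback to a checkpoint completes
    """
    to_recovering = p_transient * coverage
    to_failed = p_transient * (1.0 - coverage) + p_permanent
    return [
        [1.0 - to_recovering - to_failed, to_recovering, to_failed],  # from UP
        [p_recover, 1.0 - p_recover, 0.0],                            # from RECOVERING
        [0.0, 0.0, 1.0],                                              # FAILED is absorbing
    ]


def step(dist, matrix):
    """Advance the state distribution by one time step (dist times matrix)."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def reliability(matrix, steps):
    """Probability of not having reached FAILED after `steps` steps, starting UP."""
    dist = [1.0, 0.0, 0.0]
    for _ in range(steps):
        dist = step(dist, matrix)
    return 1.0 - dist[FAILED]


if __name__ == "__main__":
    # Compare two assumed detection coverages over the same horizon.
    for cov in (0.50, 0.99):
        chain = build_chain(0.01, 0.001, cov, 0.5)
        print(f"coverage={cov:.2f} reliability after 100 steps: "
              f"{reliability(chain, 100):.4f}")
```

Under these assumptions, raising the detection coverage moves probability mass from the FAILED transition to the RECOVERING transition, which is the qualitative effect the layered detection service is meant to achieve. A safety analysis would need additional states (for example, distinguishing safe from unsafe failures), which this sketch leaves out.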