2005 | Original Paper | Book Chapter
A Novel Checkpoint Mechanism Based on Job Progress Description for Computational Grid
Authors: Chunjiang Lia, Xuejun Yang, Nong Xiao
Published in: Parallel and Distributed Processing and Applications
Publisher: Springer Berlin Heidelberg
In this paper, we argue that application-level uncoordinated checkpointing with user-defined checkpoint data is the favorable approach in grid environments, where heterogeneity is pervasive. We present a novel application-level uncoordinated checkpoint protocol based on a Job Progress Description (JPD), which is composed of a Job Progress Record Object and a group of Job Progress State Objects. These two kinds of objects act as checkpoint data for the job, and their methods can be used as checkpoint APIs. Extended with sender-based message logging, the protocol can be used by message-passing applications in a computational grid. Emulation with a class of master-worker message-passing applications shows that this checkpointing protocol can dramatically reduce the application's wall-clock time when a failure occurs.
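The abstract does not give the JPD interface, so the following is only a minimal Python sketch of the general idea: one record object that tracks job progress, plus per-stage state objects holding user-defined checkpoint data, with the objects' methods serving as the checkpoint API. All names here (`JobProgressRecord`, `JobProgressState`, `checkpoint`, `restore`, `run_job`) and the JSON file layout are assumptions made for illustration, not the authors' design. Sender-based message logging is not modeled.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field


@dataclass
class JobProgressState:
    """Hypothetical state object: user-defined data saved for one job stage."""
    stage: str
    data: dict = field(default_factory=dict)


@dataclass
class JobProgressRecord:
    """Hypothetical record object: which stages are done, and their saved states."""
    job_id: str
    completed: list = field(default_factory=list)
    states: dict = field(default_factory=dict)

    def checkpoint(self, state, path):
        """Record a finished stage and persist the whole description to disk."""
        self.states[state.stage] = asdict(state)
        if state.stage not in self.completed:
            self.completed.append(state.stage)
        # Write to a temporary file, then rename, so that a crash during the
        # write cannot leave a truncated checkpoint behind.
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as handle:
            json.dump(asdict(self), handle)
        os.replace(tmp_path, path)

    @classmethod
    def restore(cls, path, job_id):
        """Load a saved description, or start a fresh one if none exists."""
        if not os.path.exists(path):
            return cls(job_id=job_id)
        with open(path) as handle:
            return cls(**json.load(handle))


def run_job(stages, path, job_id="demo", fail_at=None):
    """Run (name, callable) stages in order, skipping stages already recorded.

    `fail_at` injects a failure before the named stage, for demonstration only.
    Returns the names of the stages executed during this call.
    """
    record = JobProgressRecord.restore(path, job_id)
    executed = []
    for name, work in stages:
        if name in record.completed:
            continue  # finished before the failure; do not recompute
        if name == fail_at:
            raise RuntimeError(f"injected failure before stage {name!r}")
        result = work()
        record.checkpoint(JobProgressState(name, {"result": result}), path)
        executed.append(name)
    return executed
```

Because the checkpoint data is defined at the application level and stored in a portable text format, a restarted job in this sketch resumes from the last recorded stage regardless of the machine it runs on, which is the property the abstract emphasizes for heterogeneous grids.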