Job replication on multiserver systems

Yusik Kim; Rhonda Righter; Ronald Wolff

doi:10.1239/aap/1246886623



Published online by Cambridge University Press: 01 July 2016


Affiliation (all authors): University of California, Berkeley
Postal address: Department of Industrial Engineering and Operations Research, 4141 Etcheverry Hall, Berkeley, CA 94720, USA.


Abstract


Parallel processing uses resources efficiently by processing several jobs simultaneously on different servers. In a well-controlled environment, where the status of the servers and the jobs is well known, everything is nearly deterministic and replicating jobs on different servers is a waste of resources. However, in a poorly controlled environment, where the servers are unreliable and/or their capacity is highly variable, it is desirable to design a system that is robust in the sense that it is not affected by poorly performing servers. By replicating jobs and assigning the copies to several different servers simultaneously, we not only achieve robustness but can also, under certain conditions, make the system more efficient, so that jobs are processed at a faster overall rate. In this paper we consider the option of replicating jobs and study how different ‘degrees’ of replication, ranging from no replication to full replication, affect the performance of a system of parallel servers.
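The contrast drawn in the abstract can be illustrated with a small calculation. The sketch below is not the model analysed in the paper. It assumes an unlimited backlog of jobs, service times that are independent and identically distributed across servers and replicas, and instantaneous cancellation of the remaining copies when one finishes. All function names and parameter values are hypothetical. Under these assumptions, n servers without replication complete n / E[X] jobs per unit time, while full replication completes 1 / E[min(X_1, ..., X_n)].

```python
from math import comb

def mean_min_hyperexp(n, p, a, b):
    """Mean of the minimum of n i.i.d. hyperexponential service times.

    Each time is Exp(a) with probability p and Exp(b) otherwise, so its
    survival function is S(t) = p*exp(-a*t) + (1-p)*exp(-b*t).
    Expanding S(t)**n with the binomial theorem and integrating term by
    term gives the closed form below.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) / (k * a + (n - k) * b)
        for k in range(n + 1)
    )

def throughput_no_replication(n, mean_service):
    # Every server works on a different job.
    return n / mean_service

def throughput_full_replication(mean_min):
    # All servers work on copies of the same job; the first to finish wins.
    return 1.0 / mean_min

n = 4  # hypothetical number of servers

# Deterministic service time 1: the fastest of n copies still takes 1.
det_none = throughput_no_replication(n, 1.0)
det_full = throughput_full_replication(1.0)

# Exponential with rate 1 (hyperexponential with a == b).
exp_none = throughput_no_replication(n, 1.0)
exp_full = throughput_full_replication(mean_min_hyperexp(n, 0.5, 1.0, 1.0))

# Highly variable service: fast (rate 10) or slow (rate 0.1), equally likely.
p, a, b = 0.5, 10.0, 0.1
hyp_mean = p / a + (1 - p) / b
hyp_none = throughput_no_replication(n, hyp_mean)
hyp_full = throughput_full_replication(mean_min_hyperexp(n, p, a, b))

print(f"deterministic   : none={det_none:.3f}  full={det_full:.3f}")
print(f"exponential     : none={exp_none:.3f}  full={exp_full:.3f}")
print(f"hyperexponential: none={hyp_none:.3f}  full={hyp_full:.3f}")
```

In this toy setting, replication wastes capacity when service is deterministic, makes no difference for exponential service, and raises throughput when service times are highly variable, which matches the qualitative claim in the abstract.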

Keywords

Stochastic scheduling; grid computing; job replication

MSC classification

Primary: 68M20: Performance evaluation; queueing; scheduling

Secondary: 90B36: Scheduling theory, stochastic; 90B22: Queues and service

Type: General Applied Probability
Information: Advances in Applied Probability, Volume 41, Issue 2, June 2009, pp. 546-575

DOI: https://doi.org/10.1239/aap/1246886623

