Replica Placement Strategies in Data Grid

Journal of Grid Computing

Abstract

Replication is a technique used in Data Grid environments to reduce access latency and network bandwidth utilization. It also increases data availability, thereby enhancing system reliability. This research addresses the problem of replication in Data Grid environments by investigating a set of highly decentralized dynamic replica placement algorithms. The algorithms are based on heuristics that consider both network latency and user requests to select the best candidate sites for placing replicas. Because of the dynamic nature of the Grid, the sites that currently hold replicas may not be the best sites from which to fetch replicas in subsequent periods. Therefore, a replica maintenance algorithm is proposed that relocates replicas to different sites if the performance metric degrades significantly. We study our replica placement algorithms using a model of the EU Data Grid Testbed 1 [Bell et al. Int. J. High Perform. Comput. Appl., 17(4), 2003] sites and their associated network geometry. We evaluate the algorithms in terms of total file transfer time, the number of local file accesses, and the number of remote file accesses.
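To make the idea in the abstract concrete, the following is a minimal sketch of a placement heuristic that weighs network latency by user requests, together with a maintenance step that relocates replicas only when the cost degrades past a threshold. It is not the authors' algorithm: the greedy selection rule, the function names (`weighted_cost`, `greedy_placement`, `maintain`), the `tolerance` parameter, and the sample latency matrix and request counts are all assumptions made for illustration.

```python
from typing import List, Sequence


def weighted_cost(replicas: Sequence[int],
                  latency: Sequence[Sequence[float]],
                  requests: Sequence[int]) -> float:
    """Request-weighted latency when every site reads from its nearest replica."""
    if not replicas:
        raise ValueError("at least one replica site is required")
    return sum(
        count * min(latency[site][r] for r in replicas)
        for site, count in enumerate(requests)
    )


def greedy_placement(latency: Sequence[Sequence[float]],
                     requests: Sequence[int],
                     k: int) -> List[int]:
    """Pick up to k sites, each step adding the site that lowers the cost most.

    Assumption: a simple greedy rule stands in for the paper's heuristics.
    """
    chosen: List[int] = []
    candidates = set(range(len(requests)))
    for _ in range(min(k, len(requests))):
        best = min(
            sorted(candidates),  # sorted so ties break on the lowest site index
            key=lambda s: weighted_cost(chosen + [s], latency, requests),
        )
        chosen.append(best)
        candidates.remove(best)
    return chosen


def maintain(replicas: Sequence[int],
             latency: Sequence[Sequence[float]],
             requests: Sequence[int],
             baseline_cost: float,
             tolerance: float = 0.2) -> List[int]:
    """Relocate replicas only if cost exceeds the baseline by more than tolerance.

    Assumption: 'degrades significantly' is modelled as a relative threshold.
    """
    current = weighted_cost(replicas, latency, requests)
    if current <= baseline_cost * (1 + tolerance):
        return list(replicas)
    return greedy_placement(latency, requests, len(replicas))


# Hypothetical four-site grid: latency[i][j] is the latency from site i to site j.
latency = [
    [0, 10, 20, 30],
    [10, 0, 15, 25],
    [20, 15, 0, 10],
    [30, 25, 10, 0],
]
period_1 = [50, 5, 5, 10]   # requests per site in the first period
period_2 = [5, 5, 10, 50]   # demand shifts toward site 3 in the next period

placement = greedy_placement(latency, period_1, k=1)
baseline = weighted_cost(placement, latency, period_1)
relocated = maintain(placement, latency, period_2, baseline)
```

In this sample the single replica starts at the site with the heaviest demand and moves once the request pattern shifts. A relative threshold keeps small fluctuations from triggering relocation, which matters because moving a replica has its own transfer cost that this sketch does not model.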

References

  1. Allcock, B., Bester, J., Bresnahan, J., Chervenak, A.L., Foster, I., Kesselman, C., Meder, S., Nefedova, V., Quesnel, D., Tuecke, S.: Secure, efficient data transport and replica management for high performance data-intensive computing. IEEE Mass Storage Conference (2001)

  2. Allcock, B., Bester, J., Bresnahan, J., Chervenak, A., Foster, I., Kesselman, C., Meder, S., Nefedova, V., Quesnel, D., Tuecke, S.: Data management and transfer in high performance computational Grid environments. Parallel Comput. J. 28(5), 749–771 (2002)

  3. Buyya, R., Abramson, D., Giddy, J.: Nimrod/G: An architecture of a resource management and scheduling system in a global computational Grid, HPC Asia 2000, May 14–17, 2000, pp 283–289, Beijing, China

  4. Bell, W., Cameron, D.G., Capozza, L., Millar, A.P., Stockinger, K., Zini, F.: OptorSim – a Grid simulator for studying dynamic data replication strategies. Int. J. High Perform. Comput. Appl. 17(4), (2003)

  5. Cai, M., Chervenak, A., Frank, M.: A peer-to-peer replica location service based on a distributed hash table. Proceedings of the Supercomputing Conference, pp. 56–68 (2004)

  6. Cohon, J.L.: Multiobjective Programming and Planning. Academic, New York (1978)

  7. Drezner, Z., Hamacher, H.W.: Facility Location Application and Theory. Springer, Berlin (2002)

  8. Daskin, M.S.: Network and Discrete Location Models: Algorithms and Applications. Wiley, New York (1995)

  9. Foster, I.: Internet Computing and the Emerging Grid, Nature Web Matters (2000)

  10. Fisher, M.L.: The Lagrangian relaxation method for solving integer programming problems. Manag. Sci. 27, 1–18 (1981)

  11. Foster, I., Kesselman, C.: Globus: A metacomputing infrastructure toolkit. Int. J. Supercomput. Appl. 11(2), 115–128 (1997)

  12. Foster, I., Kesselman, C.: Globus: A toolkit-based Grid architecture. In: Foster, I., Kesselman, C. (eds.) The Grid: Blueprint for a New Computing Infrastructure, pp. 259–278. Morgan Kaufmann, San Mateo, CA (1999)

  13. Foster, I., Kesselman, C., Tuecke, S.: The Anatomy of the Grid: Enabling Scalable Virtual Organizations. Int. J. Supercomput. Appl. 15(3) (2001)

  14. Huffman, B.T., et al.: The CDF/D0 UK GridPP Project. CDF Internal Note CDF/DOC/COMP_UPG/5858, February 2002

  15. Hakimi, S.: Optimum location of switching centers and the absolute centers and medians of a graph. Oper. Res. 12, 450–459

  16. High Energy Physics Experiment Website, http://www.hep.net

  17. Howes, T.A., Smith, M.: A scalable, deployable directory service framework for the internet. Technical report, Center for Information Technology Integration, University of Michigan

  18. Kavitha, R., Foster, I.: Design and Evaluation of Replication Strategies for a High Performance Data Grid, in Computing and High Energy and Nuclear Physics 2001 (CHEP’01) Conference

  19. Kavitha, R., Foster, I.: Decoupling Computation and Data Scheduling in Distributed Data-Intensive Applications. Proceedings of 11th. IEEE International Symposium on High Performance Distributed Computing Edinburgh, Scotland, July 2002

  20. Kavitha, R., Iamnitchi, A., Foster, I.: Improving Data Availability through Dynamic Model Driven Replication in Large Peer-to-Peer Communities. Proceedings of Global and Peer-to-Peer Computing on Large Scale Distributed Systems Workshop, Berlin, Germany, May 2002

  21. The GriPhyN Project, http://www.griphyn.org

  22. Reeves, C.R. (ed.): Modern Heuristic Techniques for Combinatorial Problems. Blackwell Scientific Publications, Oxford, UK (1993)

  23. Rahman, R.M., Barker, K., Alhajj, R.: Replica Placement Design with Static Optimality and Dynamic Maintainability, Proceedings of the IEEE/ACM International Symposium on Cluster Computing and Grid (CCGrid 06), Singapore, May, 2006

  24. Rahman, R.M., Barker, K., Alhajj, R.: Replica Placement on Data Grid: Considering Utility and Risk. IEEE International Conference on Coding and Computing (ITCC), April, 2005

  25. Toregas, C., Swain, R., Revelle, C., Bergman, L.: The location of emergency service facilities. Oper. Res. 19, 1363–1373 (1971)

  26. Wesolowsky, G., Truscott, W.: The multiperiod location-allocation problem with relocation of facilities. Manag. Sci. 22, 57–64 (1975)

Author information

Corresponding author

Correspondence to Rashedur M. Rahman.

About this article

Cite this article

Rahman, R.M., Barker, K. & Alhajj, R. Replica Placement Strategies in Data Grid. J Grid Computing 6, 103–123 (2008). https://doi.org/10.1007/s10723-007-9090-8

  • DOI: https://doi.org/10.1007/s10723-007-9090-8
