Published in: The Journal of Supercomputing 1/2013

1 April 2013

Job scheduling and dynamic data replication in data grid environment

Authors: Najme Mansouri, Gholam Hosein Dastghaibyfard

Abstract

A Data Grid is a geographically distributed environment for large-scale, data-intensive applications. Effective scheduling in a Grid can reduce the amount of data transferred among nodes by submitting each job to a node where most of the requested data files are already available. Data replication is another key optimization technique: by placing copies of data wisely, it reduces access latency and helps manage large data sets. This paper proposes two algorithms. The first is a novel job scheduling algorithm, the Combined Scheduling Strategy (CSS), which considers the number of jobs waiting in the queue, the location of the data required by the job, and computational capability. The second is a dynamic data replication strategy, the Dynamic Hierarchical Replication Algorithm (DHRA), which improves file access time. DHRA stores each replica at an appropriate site, namely the site in the requested region that has the highest number of accesses to that particular replica. It can also minimize access latency by selecting the best replica when several sites hold replicas of a dataset. Simulation results show that the proposed replication and scheduling strategies perform better than the algorithms they are compared against.
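
The abstract names the inputs to both decisions but not the formulas, so the following is only a minimal sketch of the general idea, not the CSS or DHRA algorithms themselves. It assumes a simple additive cost (transfer time for missing files, plus queue wait, plus run time) for choosing a site, and it places a replica at the site in a region with the most recorded accesses to the file. The `Site` class, the function names, the cost model, and all units are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    # All fields are illustrative assumptions, not taken from the paper.
    name: str
    region: str
    queued_jobs: int                                    # jobs waiting in the queue
    mips: float                                         # computational capability
    files: set = field(default_factory=set)             # files stored locally
    access_counts: dict = field(default_factory=dict)   # file -> number of accesses


def pick_site(sites, required, sizes, job_length, bandwidth):
    """Pick the site with the lowest estimated cost for a job.

    Assumed cost = time to fetch missing files + queue wait + run time.
    Queued jobs are assumed to have the same length as the new job.
    """
    def cost(site):
        missing = sum(sizes[f] for f in required if f not in site.files)
        transfer = missing / bandwidth
        wait = site.queued_jobs * job_length / site.mips
        run = job_length / site.mips
        return transfer + wait + run

    return min(sites, key=cost)


def pick_replica_host(sites, region, filename):
    """Pick the site in `region` with the most accesses to `filename`.

    Returns None if the region has no sites.
    """
    candidates = [s for s in sites if s.region == region]
    return max(
        candidates,
        key=lambda s: s.access_counts.get(filename, 0),
        default=None,
    )
```

With this cost model, a site that already holds every required file can still lose to an idle site that has to fetch them, which is the kind of trade-off a scheduler combining queue length, data location, and computational capability has to weigh.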

Metadata
Title
Job scheduling and dynamic data replication in data grid environment
Authors
Najme Mansouri
Gholam Hosein Dastghaibyfard
Publication date
1 April 2013
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 1/2013
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-012-0850-2
