The Journal of Supercomputing

23.02.2021

Performance and energy task migration model for heterogeneous clusters

Authors: Esteban Stafford, José Luis Bosque

Published in: The Journal of Supercomputing | Issue 9/2021

Abstract

This article presents a set of linear regression models that predict the impact of task migration on different objectives, such as performance and energy consumption. The models make it possible to establish whether, at a given moment, migrating a task is profitable in terms of performance or energy consumption. They can also be used to determine the best node to migrate a task to, depending on the objective. The models use a small set of easily measurable parameters. They have been validated on a small heterogeneous cluster using the Slurm resource manager. The models capture the tendencies observed in the experimental results, with average relative errors below 3.5% in execution time and 2.5% in energy consumption.
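The abstract does not give the form of the models or their parameters, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes one single-variable linear model per node that maps remaining work to a predicted cost (execution time or energy), plus a fixed migration cost. The names `LinearModel`, `fit_linear`, `best_node`, `remaining_work` and `migration_cost` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class LinearModel:
    """Hypothetical per-node model: cost = intercept + slope * remaining_work."""
    intercept: float
    slope: float

    def predict(self, x: float) -> float:
        return self.intercept + self.slope * x


def fit_linear(xs, ys) -> LinearModel:
    """Fit y = a + b*x by ordinary least squares (closed form, one variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return LinearModel(mean_y - slope * mean_x, slope)


def best_node(remaining_work, current, candidates, migration_cost):
    """Compare staying on the current node with migrating to each candidate.

    Returns (name, predicted_cost); the name "stay" means migration is
    not predicted to be profitable. The cost unit (seconds or joules)
    depends on which measurements the models were fitted to.
    """
    best_name = "stay"
    best_cost = current.predict(remaining_work)
    for name, model in candidates.items():
        cost = migration_cost + model.predict(remaining_work)
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost


# Illustrative measurements only (work units vs. seconds), not from the paper.
slow = fit_linear([1, 2, 3], [2, 4, 6])
fast = fit_linear([1, 2, 3], [1, 2, 3])
choice = best_node(10, slow, {"fast": fast}, migration_cost=3)
```

Fitting the same structure to energy measurements instead of execution times would let the identical comparison serve an energy objective, which is the kind of objective-dependent choice the abstract describes.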

The code is available at https://github.com/dmtcp.

Title: Performance and energy task migration model for heterogeneous clusters
Authors: Esteban Stafford, José Luis Bosque
Publication date: 23.02.2021
Publisher: Springer US
Published in: The Journal of Supercomputing / Issue 9/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-021-03663-1
