2019 | Original Paper | Book Chapter

Fine-Grained MPI+OpenMP Plasma Simulations: Communication Overlap with Dependent Tasks

Authors: Jérôme Richard, Guillaume Latu, Julien Bigot, Thierry Gautier

Published in: Euro-Par 2019: Parallel Processing

Publisher: Springer International Publishing

Abstract

This paper demonstrates how OpenMP 4.5 tasks can be used to efficiently overlap computations and MPI communications based on a case-study conducted on multi-core and many-core architectures. It focuses on task granularity, dependencies and priorities, and also identifies some limitations of OpenMP. Results on 64 Skylake nodes show that while 64% of the wall-clock time is spent in MPI communications, 60% of the cores are busy in computations, which is a good result. Indeed, the chosen dataset is small enough to be a challenging case in terms of overlap and thus useful to assess worst-case scenarios in future simulations.
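As an illustration of the idea of overlapping communication with computation through dependent tasks, here is a minimal sketch. It is not the authors' code: `fake_exchange` is a hypothetical stand-in for an MPI exchange (it only copies a buffer so the example runs without MPI), and the names `overlap_step`, `halo` and `interior` are assumptions made for this example. The communication task gets a higher `priority` so that the runtime may start it early, while an independent computation task can run at the same time; a third task waits on both through `depend` clauses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a communication step (e.g. an MPI exchange).
   It only copies a buffer so that the sketch runs without MPI. */
static void fake_exchange(const double *send, double *recv, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        recv[i] = send[i];
}

/* One step with three tasks: a "communication" task, an independent
   computation task, and a task that needs the results of both. */
double overlap_step(const double *send, double *halo, double *interior, size_t n)
{
    assert(n > 0);
    double boundary_sum = 0.0;

    #pragma omp parallel
    #pragma omp single
    {
        /* "Communication" task. The priority clause is only a hint, and it
           has an effect only if OMP_MAX_TASK_PRIORITY is set high enough. */
        #pragma omp task depend(out: halo[0:n]) priority(10)
        {
            fake_exchange(send, halo, n);
        }

        /* Independent computation: no dependence on halo, so it may run
           while the exchange is in flight. */
        #pragma omp task depend(inout: interior[0:n])
        {
            for (size_t i = 0; i < n; ++i)
                interior[i] *= 2.0;
        }

        /* Dependent computation: starts only after both tasks above. */
        #pragma omp task depend(in: halo[0:n]) depend(in: interior[0:n]) shared(boundary_sum)
        {
            for (size_t i = 0; i < n; ++i)
                boundary_sum += halo[i] + interior[i];
        }
    } /* implicit barrier: all tasks have completed here */

    return boundary_sum;
}
```

Compiled without OpenMP support, the pragmas are ignored and the three blocks simply run in order, which gives the same result without any overlap.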

Two key features were identified: by using task priority we improved the performance by 5.7% (mainly due to an improved overlap), and with recursive tasks we shortened the execution time by 9.7%. We also illustrate the need to have access to tools for task tracing and task visualization. These tools allowed a fine understanding and a performance increase for this task-based OpenMP+MPI code.
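The abstract does not define "recursive tasks"; one common reading is tasks that themselves create child tasks. The sketch below illustrates that pattern under this assumption, using a divide-and-conquer traversal of an array. The function names and the `leaf` cutoff parameter are hypothetical and chosen only for this example; they are not taken from the paper.

```c
#include <assert.h>
#include <stddef.h>

/* Scale data[begin..end) by factor. Ranges larger than `leaf` are split
   in two halves, each handled by a child task that may split again. */
static void scale_range(double *data, size_t begin, size_t end,
                        double factor, size_t leaf)
{
    if (end - begin <= leaf) {
        /* Leaf: small enough to process directly. */
        for (size_t i = begin; i < end; ++i)
            data[i] *= factor;
        return;
    }

    size_t mid = begin + (end - begin) / 2;

    #pragma omp task firstprivate(data, begin, mid, factor, leaf)
    scale_range(data, begin, mid, factor, leaf);

    #pragma omp task firstprivate(data, mid, end, factor, leaf)
    scale_range(data, mid, end, factor, leaf);

    /* Wait for the two child tasks before returning to the caller. */
    #pragma omp taskwait
}

/* Entry point: one thread creates the root of the task tree. */
void scale_all(double *data, size_t n, double factor, size_t leaf)
{
    assert(leaf > 0); /* a zero cutoff would never stop the recursion */

    #pragma omp parallel
    #pragma omp single
    scale_range(data, 0, n, factor, leaf);
}
```

With this shape, task creation is spread over the threads that execute the intermediate tasks instead of being done by a single producer, and the `leaf` value controls the task granularity.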

See [10] for an advanced tutorial about these points.

In practice, the Poisson-Ampere equations [7] are solved instead of the Poisson equation. For the sake of clarity, Poisson-Ampere is not detailed here, as the algorithm and performance are very close.

This construct specifies that the iterations of one or more loops are executed in parallel using (independent) tasks. Unless the user specifies otherwise, the runtime chooses the granularity and performs a final synchronization.
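The footnote does not name the construct; the description matches the OpenMP 4.5 `taskloop` construct, which is what the following minimal sketch assumes. The function name `axpy_taskloop` and its parameters are hypothetical. The `grainsize` clause shows the user-specified granularity; omitting it leaves the choice to the runtime. Without the `nogroup` clause, `taskloop` behaves as if enclosed in a `taskgroup`, which provides the final synchronization.

```c
#include <assert.h>
#include <stddef.h>

/* Compute y[i] += a * x[i] with the loop iterations split into tasks. */
void axpy_taskloop(double a, const double *x, double *y, long n, long grain)
{
    assert(n >= 0 && grain > 0);

    #pragma omp parallel
    #pragma omp single
    {
        /* grainsize(grain) requests roughly `grain` iterations per task. */
        #pragma omp taskloop grainsize(grain)
        for (long i = 0; i < n; ++i)
            y[i] += a * x[i];
        /* All tasks generated by the taskloop have completed here. */
    }
}
```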

http://www.hpc.cineca.it/hardware/marconi.

The mode cannot be configured by the user on the selected computing machines.

The latest available versions on the computing machines during the experiments.

The idle time includes periods where threads are busy waiting for ready tasks to be executed and thread synchronization periods, and the runtime overhead includes scheduling and task submission costs.

This time could be shortened if the task graph could be stored and resubmitted from one timestep to the next, as in [2].

The runtime could also lower the cost of managing dependencies, provided dedicated studies are carried out along this line.

1. Augonnet, C., Aumage, O., Furmento, N., Namyst, R., Thibault, S.: StarPU-MPI: task programming over clusters of machines enhanced with accelerators. In: Träff, J.L., Benkner, S., Dongarra, J.J. (eds.) EuroMPI 2012. LNCS, vol. 7490, pp. 298–299. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33518-1_40

2. Besseron, X., Gautier, T.: Impact of over-decomposition on coordinated checkpoint/rollback protocol. In: Alexander, M., et al. (eds.) Euro-Par 2011. LNCS, vol. 7156, pp. 322–332. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-29740-3_36

3. Bouzat, N., Rozar, F., Latu, G., Roman, J.: A new parallelization scheme for the Hermite interpolation based gyroaverage operator. In: 2017 16th ISPDC (2017)

4. Bouzat, N., et al.: Targeting realistic geometry in Tokamak code Gysela. ESAIM Proc. Surv. 63, 179–207 (2018)

5. Broquedis, F., Gautier, T., Danjean, V.: libKOMP, an efficient OpenMP runtime system for both fork-join and data flow paradigms. In: Chapman, B.M., Massaioli, F., Müller, M.S., Rorro, M. (eds.) IWOMP 2012. LNCS, vol. 7312, pp. 102–115. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-30961-8_8

6. Bueno, J., et al.: Productive cluster programming with OmpSs. In: Jeannot, E., Namyst, R., Roman, J. (eds.) Euro-Par 2011. LNCS, vol. 6852, pp. 555–566. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-23400-2_52

7. Crouseilles, N., Latu, G., Sonnendrücker, E.: Hermite spline interpolation on patches for parallelly solving the Vlasov-Poisson equation. IJAMCS 17(3), 335–349 (2007)

8. Diaz, J., Muñoz-Caro, C., Niño, A.: A survey of parallel programming models and tools in the multi and many-core era. IEEE TPDS 23(8), 1369–1386 (2012)

9. Gautier, T., Pérez, C., Richard, J.: On the impact of OpenMP task granularity. In: de Supinski, B.R., Valero-Lara, P., Martorell, X., Mateo Bellido, S., Labarta, J. (eds.) IWOMP 2018. LNCS, vol. 11128, pp. 205–221. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-98521-3_14

10. Martorell, X., Teruel, X., Klemm, M.: Advanced OpenMP Tutorial (2018). https://openmpcon.org/wp-content/uploads/2018_Tutorial3_Martorell_Teruel_Klemm.pdf

11. OpenMP Architecture Review Board: OpenMP Application Programming Interface Version 4.5, November 2015. http://www.openmp.org

12. OpenMP Architecture Review Board: OpenMP Application Programming Interface Version 5.0, November 2018. http://www.openmp.org

13. Pérache, M., Jourdren, H., Namyst, R.: MPC: a unified parallel runtime for clusters of NUMA machines. In: Luque, E., Margalef, T., Benítez, D. (eds.) Euro-Par 2008. LNCS, vol. 5168, pp. 78–88. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85451-7_9

14. Perez, J.M., Beltran, V., Labarta, J., Ayguadé, E.: Improving the integration of task nesting and dependencies in OpenMP. In: IPDPS 2017. IEEE (2017)

15. Sala, K., et al.: Improving the interoperability between MPI and task-based programming models. In: Proceedings of EuroMPI 2018, pp. 6:1–6:11. ACM (2018)

16. Song, F., YarKhan, A., Dongarra, J.: Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems. In: Proceedings of the Conference on HPC Networking, Storage and Analysis, SC 2009. ACM (2009)

17. Sonnendrücker, E., et al.: The semi-Lagrangian method for the numerical resolution of the Vlasov equation. J. Comput. Phys. 149(2), 201–220 (1999)

Title: Fine-Grained MPI+OpenMP Plasma Simulations: Communication Overlap with Dependent Tasks
Authors: Jérôme Richard, Guillaume Latu, Julien Bigot, Thierry Gautier
Publisher: Springer International Publishing
Book: Euro-Par 2019: Parallel Processing
Print ISBN: 978-3-030-29399-4
Electronic ISBN: 978-3-030-29400-7
Copyright year: 2019
DOI: https://doi.org/10.1007/978-3-030-29400-7_30
