
Published in: The Journal of Supercomputing

8 January 2021

High-performance dataflow computing in hybrid memory systems with UPC++ DepSpawn

Authors: Basilio B. Fraguela, Diego Andrade

Published in: The Journal of Supercomputing | Issue 7/2021


Abstract

Dataflow computing is a very attractive paradigm for high-performance computing, given its ability to trigger computations as soon as their inputs are available. UPC++ DepSpawn is a novel task-based library that supports this model in hybrid shared/distributed memory systems on top of a Partitioned Global Address Space environment. While the initial version of the library provided good results, it suffered from a key restriction that heavily limited its performance and scalability: each process had to consider all the tasks in the application rather than only those of interest to it, an overhead that naturally grows with both the number of processes and the number of tasks in the system. In this paper, this restriction is lifted, enabling our library to provide higher levels of performance. In experiments using 768 cores, performance improved by up to 40.1%, with an average improvement of 16.1%.
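The firing rule mentioned in the abstract (a computation is triggered as soon as its inputs are available) can be illustrated with a minimal, single-process sketch. This is not the UPC++ DepSpawn API and does not reflect how the library is implemented. The function `run_dataflow`, the `(name, inputs, output, fn)` task format and the example tasks are all hypothetical, and the sketch assumes that each data item is produced by exactly one task.

```python
from collections import deque


def run_dataflow(tasks, initial_data):
    """Run tasks in dataflow order (hypothetical illustration).

    tasks: list of (name, inputs, output, fn) tuples.
    initial_data: dict of data items available before any task runs.
    Returns (data, order), where order lists task names as they fired.
    """
    data = dict(initial_data)
    by_name = {task[0]: task for task in tasks}
    waiting = {}    # task name -> set of inputs not yet available
    consumers = {}  # data item -> names of tasks waiting for it
    ready = deque()

    for name, inputs, _output, _fn in tasks:
        waiting[name] = {item for item in inputs if item not in data}
        for item in waiting[name]:
            consumers.setdefault(item, []).append(name)
        if not waiting[name]:
            ready.append(name)  # all inputs already available

    order = []
    while ready:
        name = ready.popleft()
        _, inputs, output, fn = by_name[name]
        data[output] = fn(*(data[item] for item in inputs))
        order.append(name)
        # The new output may complete the inputs of waiting tasks.
        for consumer in consumers.pop(output, []):
            waiting[consumer].discard(output)
            if not waiting[consumer]:
                ready.append(consumer)
    return data, order


# "sum" is listed first but can only fire after "double" and "square".
example_tasks = [
    ("sum", ["c", "d"], "e", lambda c, d: c + d),
    ("double", ["a"], "c", lambda a: 2 * a),
    ("square", ["b"], "d", lambda b: b * b),
]
result, fired = run_dataflow(example_tasks, {"a": 3, "b": 4})
print(fired, result["e"])
```

In this sketch the execution order follows data availability rather than the order in which the tasks were listed. A distributed runtime additionally has to decide which tasks each process needs to track, which is the overhead the abstract refers to; the sketch does not model that aspect.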



Title: High-performance dataflow computing in hybrid memory systems with UPC++ DepSpawn
Authors: Basilio B. Fraguela, Diego Andrade
Publication date: 8 January 2021
Publisher: Springer US
Published in: The Journal of Supercomputing / Issue 7/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-020-03607-1
