Published in: The Journal of Supercomputing 7/2021

8 January 2021

High-performance dataflow computing in hybrid memory systems with UPC++ DepSpawn

Authors: Basilio B. Fraguela, Diego Andrade

Abstract

Dataflow computing is an attractive paradigm for high-performance computing because it can trigger computations as soon as their inputs are available. UPC++ DepSpawn is a novel task-based library that supports this model in hybrid shared/distributed memory systems on top of a Partitioned Global Address Space environment. While the initial version of the library provided good results, it suffered from a key restriction that heavily limited its performance and scalability: each process had to consider all the tasks in the application rather than only those of interest to it, an overhead that naturally grows with both the number of processes and the number of tasks in the system. In this paper, this restriction is lifted, enabling our library to provide higher levels of performance. In experiments using 768 cores, performance improved by up to 40.1%, with an average improvement of 16.1%.
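
The dataflow idea mentioned in the abstract, running a task as soon as all of its inputs exist, can be illustrated with a minimal single-process sketch. This is not the UPC++ DepSpawn API and not the authors' implementation: the function `run_dataflow`, the task names, and the dependency-counting scheme are all assumptions chosen for illustration, and the sketch runs sequentially rather than across processes.

```python
from collections import defaultdict, deque


def run_dataflow(tasks):
    """Hypothetical helper: run each task once all of its inputs are ready.

    tasks maps a task name to (function, list of input task names).
    Returns (results by task name, order in which tasks were executed).
    """
    # Number of inputs each task is still waiting for.
    pending = {name: len(inputs) for name, (_, inputs) in tasks.items()}

    # Reverse edges: which tasks consume the output of a given task.
    consumers = defaultdict(list)
    for name, (_, inputs) in tasks.items():
        for dep in inputs:
            consumers[dep].append(name)

    # Tasks without inputs can start immediately.
    ready = deque(name for name, count in pending.items() if count == 0)
    results, order = {}, []

    while ready:
        name = ready.popleft()
        func, inputs = tasks[name]
        results[name] = func(*(results[dep] for dep in inputs))
        order.append(name)
        # Producing this result may make some consumers ready.
        for consumer in consumers[name]:
            pending[consumer] -= 1
            if pending[consumer] == 0:
                ready.append(consumer)

    if len(order) != len(tasks):
        raise ValueError("cyclic or unsatisfiable dependencies")
    return results, order


# Illustrative task graph (assumed, not from the paper): sq depends on sum,
# which depends on a and b.
tasks = {
    "a": (lambda: 2, []),
    "b": (lambda: 3, []),
    "sum": (lambda x, y: x + y, ["a", "b"]),
    "sq": (lambda s: s * s, ["sum"]),
}
results, order = run_dataflow(tasks)
print(results["sq"], order)
```

In this sketch every task in the graph is visible to the single scheduler; the restriction the abstract describes concerns a distributed setting where each process had to examine all tasks, which this toy example does not model.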

Metadata
Title
High-performance dataflow computing in hybrid memory systems with UPC++ DepSpawn
Authors
Basilio B. Fraguela
Diego Andrade
Publication date
8 January 2021
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 7/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-020-03607-1
