Published in: International Journal of Parallel Programming 1/2018

13.03.2017

High-Level Programming for Many-Cores Using C++14 and the STL

Authors: Michael Haidl, Sergei Gorlatch


Abstract

Programming many-core systems with accelerators (e.g., GPUs) remains a challenging task, even for expert programmers. The current low-level approaches, OpenCL and CUDA, employ two distinct programming models: the host code for the CPU is written in C/C++ with a restricted memory model, while the device code for the accelerator is written using a device-dependent model of CUDA or OpenCL. The programmer is responsible for explicitly specifying parallelism, memory transfers, and synchronization, and also for configuring the program and optimizing its performance for a particular many-core system. This leads to long, poorly structured, and error-prone code, often with suboptimal performance. We present PACXX, an alternative, unified programming approach for accelerators. In PACXX, both host and device programs are written in the same programming language, the newest C++14 standard with the Standard Template Library (STL), including all modern features: type inference (auto), variadic templates, generic lambda expressions, and the newly proposed parallel extensions of the STL. PACXX includes an easy-to-use and type-safe API for multi-stage programming, which allows for aggressive runtime compiler optimizations. We implement PACXX by developing a custom compiler (based on the Clang and LLVM frameworks) and a runtime system that together perform memory management and synchronization automatically and transparently for the programmer. We evaluate our approach by comparing it to OpenCL regarding program size and target performance.
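To make the C++14 features named in the abstract concrete, the following is a minimal sketch in plain, sequential standard C++. It does not use the PACXX API, which this page does not show; the function name `saxpy` and its signature are hypothetical and chosen only for illustration. It combines type inference (`auto`), a generic lambda expression, and an STL algorithm in a single source language.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not taken from the paper): computes a * x[i] + y[i]
// for every element, using only standard C++14 and the STL.
template <typename T>
std::vector<T> saxpy(T a, const std::vector<T>& x, const std::vector<T>& y) {
    assert(x.size() == y.size());  // both inputs must have the same length

    std::vector<T> result(x.size());

    // Generic lambda (C++14): the parameter types are deduced at each call.
    auto axpy = [a](auto xi, auto yi) { return a * xi + yi; };

    // Sequential STL algorithm; a parallel STL extension would typically
    // accept an additional execution policy argument.
    std::transform(x.begin(), x.end(), y.begin(), result.begin(), axpy);
    return result;
}
```

Calling `saxpy(2.0f, x, y)` on two equally sized `std::vector<float>` objects returns a new vector holding the element-wise results. How such code would be offloaded to an accelerator is the subject of the paper itself and is not reproduced here.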


Metadata
Title
High-Level Programming for Many-Cores Using C++14 and the STL
Authors
Michael Haidl
Sergei Gorlatch
Publication date
13.03.2017
Publisher
Springer US
Published in
International Journal of Parallel Programming / Issue 1/2018
Print ISSN: 0885-7458
Electronic ISSN: 1573-7640
DOI
https://doi.org/10.1007/s10766-017-0497-y
