Skip to main content
Top
Published in: The Journal of Supercomputing 11/2018

22-06-2016

Assessing and discovering parallelism in C\(++\) code for heterogeneous platforms

Authors: David del Rio Astorga, Rafael Sotomayor, Luis Miguel Sanchez, Javier Garcia Blas, Alejandro Calderon, Javier Fernandez

Published in: The Journal of Supercomputing | Issue 11/2018

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Massively parallel architectures are mainly based on a parallel heterogeneous setup. They are composed by different computing devices that speed up specific code regions, named kernels. These kernels are usually executed offline in the corresponding devices. Porting applications to a specific heterogeneous platform is a costly task in terms of time and human resources. The key points in the porting process are the manual analysis of the source code and kernel detection. Each device of these heterogeneous platforms has their own restrictions, such as the memory allocation support. Kernels must be mapped with suitable computing devices. We introduced AKI as an automatic kernel identification and annotation tool that aims to identify potential kernels on C\(++\) sequential applications. AKI identifies those kernels that can be offlined on heterogeneous computing devices. To annotate these kernels, REPARA C++ attributes have been defined. This annotation mechanism can aid future automatic source-to-source transformation tools to facilitate the work for parallel heterogeneous platforms. AKI has been evaluated over all benchmarks included in the NAS suite. The benchmark suite incorporates a big set of realistic high performance applications. The evaluation results demonstrate that AKI is a competitive solution for identifying and annotating parallel code fragments (aka kernels).

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Athavale A, Ranadive P, Babu MN, Pawar P, Sah S, Vaidya V, Rajguru C (2012) Automatic sequential to parallel code conversion the S2P tool and performance analysis. GSTF J Comput (JoC) 1(4) Athavale A, Ranadive P, Babu MN, Pawar P, Sah S, Vaidya V, Rajguru C (2012) Automatic sequential to parallel code conversion the S2P tool and performance analysis. GSTF J Comput (JoC) 1(4)
2.
go back to reference Brown C, Janjic V, Hammond K, Schner H, Idrees K, Glass C (2014) Agricultural reform: more efficient farming using advanced parallel refactoring tools. In: 22nd Euromicro international conference on parallel, distributed, and network-based processing Brown C, Janjic V, Hammond K, Schner H, Idrees K, Glass C (2014) Agricultural reform: more efficient farming using advanced parallel refactoring tools. In: 22nd Euromicro international conference on parallel, distributed, and network-based processing
3.
go back to reference Binkley D (2007) Source code analysis: a road map. In: Future of software engineering, FOSE ’07. IEEE Computer Society, Washington, DC, pp 104–119 Binkley D (2007) Source code analysis: a road map. In: Future of software engineering, FOSE ’07. IEEE Computer Society, Washington, DC, pp 104–119
4.
go back to reference Bozó I, Fordós V, Horvath Z, Tóth M, Horpácsi D, Kozsik T, Köszegi J, Barwell A, Brown C, Hammond K (2014) Discovering parallel pattern candidates in Erlang. In: 13th ACM SIGPLAN workshop on Erlang, Erlang ’14. ACM, New York, pp 13–23 Bozó I, Fordós V, Horvath Z, Tóth M, Horpácsi D, Kozsik T, Köszegi J, Barwell A, Brown C, Hammond K (2014) Discovering parallel pattern candidates in Erlang. In: 13th ACM SIGPLAN workshop on Erlang, Erlang ’14. ACM, New York, pp 13–23
5.
go back to reference Brown C, Hammond K, Danelutto M, Kilpatrick P, Schöner H, Breddin T (2013) Paraphrasing: generating parallel programs using refactoring. In: Beckert B, Damiani F, de Boer F, Bonsangue MM (eds) Formal methods for components and objects, LNCS, vol 7542. Springer, Berlin, pp 237–256CrossRef Brown C, Hammond K, Danelutto M, Kilpatrick P, Schöner H, Breddin T (2013) Paraphrasing: generating parallel programs using refactoring. In: Beckert B, Damiani F, de Boer F, Bonsangue MM (eds) Formal methods for components and objects, LNCS, vol 7542. Springer, Berlin, pp 237–256CrossRef
6.
go back to reference Castro PDO, Akel C, Petit E, Popov M, Jalby W (2015) CERE: LLVM-based Codelet Extractor and REplayer for piecewise benchmarking and optimization. ACM Trans Archit Code Optim 12:6:1–6:24CrossRef Castro PDO, Akel C, Petit E, Popov M, Jalby W (2015) CERE: LLVM-based Codelet Extractor and REplayer for piecewise benchmarking and optimization. ACM Trans Archit Code Optim 12:6:1–6:24CrossRef
8.
go back to reference Göhringer D, Tepelmann J (2014) An interactive tool based on polly for detection and parallelization of loops. In: Workshop on parallel programming and run-time management techniques for many-core architectures and design tools and architectures for multicore embedded computing platforms, PARMA-DITAM ’14. ACM, New York, pp 1:1–1:6 Göhringer D, Tepelmann J (2014) An interactive tool based on polly for detection and parallelization of loops. In: Workshop on parallel programming and run-time management techniques for many-core architectures and design tools and architectures for multicore embedded computing platforms, PARMA-DITAM ’14. ACM, New York, pp 1:1–1:6
9.
go back to reference González-Vélez H, Leyton M (2010) A survey of algorithmic skeleton frameworks: high-level structured parallel programming enablers. Softw Pract Exp 40(12):1135–1160CrossRef González-Vélez H, Leyton M (2010) A survey of algorithmic skeleton frameworks: high-level structured parallel programming enablers. Softw Pract Exp 40(12):1135–1160CrossRef
11.
go back to reference ISO/IEC (2011) Information technology—programming languages—C++. In: International standard ISO/IEC 14882:20111. ISO/IEC, Geneva ISO/IEC (2011) Information technology—programming languages—C++. In: International standard ISO/IEC 14882:20111. ISO/IEC, Geneva
12.
go back to reference Jin H, Jost G, Yan J, Ayguade E, Gonzalez M, Martorell X (2003) Automatic multilevel parallelization using OpenMP. Sci Program 11(2):177–190 Jin H, Jost G, Yan J, Ayguade E, Gonzalez M, Martorell X (2003) Automatic multilevel parallelization using OpenMP. Sci Program 11(2):177–190
13.
go back to reference Kevin H, Allan P, Nick C, Hartmut K, Malony Allen D, Thomas S, Rob F (2015) An autonomic performance environment for exascale. In: Supercomputing frontiers and innovations, pp 49–66 Kevin H, Allan P, Nick C, Hartmut K, Malony Allen D, Thomas S, Rob F (2015) An autonomic performance environment for exascale. In: Supercomputing frontiers and innovations, pp 49–66
14.
go back to reference Lee S, Vetter JS (2014) OpenARC: Open accelerator research compiler for directive-based, efficient heterogeneous computing. In: 23rd international symposium on high-performance parallel and distributed computing, HPDC ’14. ACM, New York, pp 115–120 Lee S, Vetter JS (2014) OpenARC: Open accelerator research compiler for directive-based, efficient heterogeneous computing. In: 23rd international symposium on high-performance parallel and distributed computing, HPDC ’14. ACM, New York, pp 115–120
15.
go back to reference Li Z, Atre R, Ul-Huda Z, Jannesari A, Wolf F (2015) Discopop: a profiling tool to identify parallelization opportunities. In: Tools for high performance computing 2014, chap 3. Springer, New York, , pp 37–54CrossRef Li Z, Atre R, Ul-Huda Z, Jannesari A, Wolf F (2015) Discopop: a profiling tool to identify parallelization opportunities. In: Tools for high performance computing 2014, chap 3. Springer, New York, , pp 37–54CrossRef
16.
go back to reference Lattner C (2008) LLVM and Clang: next generation compiler technology. In: The BSD conference, pp 1–2 Lattner C (2008) LLVM and Clang: next generation compiler technology. In: The BSD conference, pp 1–2
17.
go back to reference McCool M, Reinders J, Robison A (2012) Structured parallel programming: patterns for efficient computation, 1st edn. Morgan Kaufmann, San Francisco McCool M, Reinders J, Robison A (2012) Structured parallel programming: patterns for efficient computation, 1st edn. Morgan Kaufmann, San Francisco
19.
go back to reference Sotomayor R, Sanchez LM, Garcia Blas J, Calderon A, Fernandez J (2015) AKI: automatic kernel identification and annotation tool based on C\(++\) attributes. In: IEEE TrustCom-BigDataSE-ISPA 2015, pp 148–156 Sotomayor R, Sanchez LM, Garcia Blas J, Calderon A, Fernandez J (2015) AKI: automatic kernel identification and annotation tool based on C\(++\) attributes. In: IEEE TrustCom-BigDataSE-ISPA 2015, pp 148–156
20.
go back to reference Seo S, Jo G, Lee J (2011) Performance characterization of the NAS parallel benchmarks in OpenCL. In: 2011 IEEE international symposium on workload characterization (IISWC), pp 137–148 Seo S, Jo G, Lee J (2011) Performance characterization of the NAS parallel benchmarks in OpenCL. In: 2011 IEEE international symposium on workload characterization (IISWC), pp 137–148
21.
go back to reference Torquati M, Vanneschi M, Amini M, Guelton S, Keryell R, Lanore V, Pasquier FX, Barreteau M, Barrère R, Petrisor CT et al (2012) An innovative compilation tool-chain for embedded multi-core architectures. In: Embedded world conference Torquati M, Vanneschi M, Amini M, Guelton S, Keryell R, Lanore V, Pasquier FX, Barreteau M, Barrère R, Petrisor CT et al (2012) An innovative compilation tool-chain for embedded multi-core architectures. In: Embedded world conference
22.
go back to reference Tournavitis G, Franke B (2010) Semi-automatic extraction and exploitation of hierarchical pipeline parallelism using profiling information. PACT 2010:377–388 Tournavitis G, Franke B (2010) Semi-automatic extraction and exploitation of hierarchical pipeline parallelism using profiling information. PACT 2010:377–388
23.
go back to reference Vandierendonck H, Rul S, De Bosschere K (2010) The paralax infrastructure: automatic parallelization with a helping hand. PACT 2010:389–400MATH Vandierendonck H, Rul S, De Bosschere K (2010) The paralax infrastructure: automatic parallelization with a helping hand. PACT 2010:389–400MATH
Metadata
Title
Assessing and discovering parallelism in C code for heterogeneous platforms
Authors
David del Rio Astorga
Rafael Sotomayor
Luis Miguel Sanchez
Javier Garcia Blas
Alejandro Calderon
Javier Fernandez
Publication date
22-06-2016
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 11/2018
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-016-1794-8

Other articles of this Issue 11/2018

The Journal of Supercomputing 11/2018 Go to the issue

Premium Partner