
2018 | OriginalPaper | Book chapter

Impact of Compiler Phase Ordering When Targeting GPUs

Written by: Ricardo Nobre, Luís Reis, João M. P. Cardoso

Published in: Euro-Par 2017: Parallel Processing Workshops

Publisher: Springer International Publishing


Abstract

Research in compiler pass phase ordering (i.e., selection of compiler analysis/transformation passes and their order of execution) has been mostly performed in the context of CPUs and, in a small number of cases, FPGAs. In this paper we present experiments regarding compiler pass phase ordering specialization of OpenCL kernels targeting NVIDIA GPUs using Clang/LLVM 3.9 and the libclc OpenCL library. More specifically, we analyze the impact of using specialized compiler phase orders on the performance of 15 PolyBench/GPU OpenCL benchmarks. In addition, we analyze the final NVIDIA PTX assembly code generated by the different compilation flows in order to identify the main reasons for the cases with significant performance improvements. Using specialized compiler phase orders, we were able to achieve performance improvements over the CUDA version and OpenCL compiled with the NVIDIA driver. Compared to CUDA, we were able to achieve geometric mean improvements of \(1.54\times \) (up to \(5.48\times \)). Compared to the OpenCL driver version, we were able to achieve geometric mean improvements of \(1.65\times \) (up to \(5.70\times \)).
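The abstract summarizes its results as geometric mean improvements over a baseline. As a minimal sketch of how such a summary figure can be computed, the following assumes each benchmark has one baseline time and one time for the specialized phase order. The kernel names and timings are hypothetical placeholders, not measurements from the paper.

```python
import math


def speedups(baseline_times, tuned_times):
    """Per-benchmark speedup: baseline time divided by tuned time."""
    return {name: baseline_times[name] / tuned_times[name]
            for name in baseline_times}


def geometric_mean(values):
    """Geometric mean of positive values, computed in log space."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical execution times in seconds (illustration only).
baseline = {"kernel_a": 2.0, "kernel_b": 9.0, "kernel_c": 4.0}
tuned = {"kernel_a": 1.0, "kernel_b": 3.0, "kernel_c": 4.0}

per_kernel = speedups(baseline, tuned)
geo = geometric_mean(per_kernel.values())
best = max(per_kernel.values())

print(f"geometric mean speedup: {geo:.2f}x (up to {best:.2f}x)")
```

The geometric mean is the usual choice for averaging speedup ratios because it treats a 2x gain and a 2x loss symmetrically, whereas an arithmetic mean would overweight large individual gains.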


Metadata
Title
Impact of Compiler Phase Ordering When Targeting GPUs
Written by
Ricardo Nobre
Luís Reis
João M. P. Cardoso
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-75178-8_35
