Skip to main content

2019 | OriginalPaper | Buchkapitel

Research on Matrix Multiplication Based on the Combination of OpenACC and CUDA

verfasst von : Yuexing Wang

Erschienen in: Geo-informatics in Sustainable Ecosystem and Society

Verlag: Springer Singapore

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

With the improvement of GPU’s general computing capacity, the use of parallel computing to solve some difficult problems with large amount of data and intensive computing tasks has become the trend of the times. In GPU general computing, CUDA and OpenCL have been widely used and studied. However, the two parallel programming models generally exist the weakness that whose API is too close to the underlying hardware, which makes programming inefficient and is not suitable for the large-scale parallel tasks that require rapid implementation. OpenACC is a relatively advanced and simple programming language, which can achieve rapid parallelization, but the computing effect of the program is relatively low (generally lower than CUDA). Therefore, this paper tries to combine CUDA and OpenACC for mixed parallelization. This way not only greatly reduces the workload of code conversion, but also has a computing performance no less than a pure CUDA program.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Harris, M., et al.: GPGPU: general purpose computation on graphics hardware. In: ACM SIGGRAPH 2004 Course Notes, p. 33. ACM (2004) Harris, M., et al.: GPGPU: general purpose computation on graphics hardware. In: ACM SIGGRAPH 2004 Course Notes, p. 33. ACM (2004)
2.
Zurück zum Zitat Yang, Y., et al.: An optimizing compiler for GPGPU programs with input-data sharing. In: ACM Sigplan Symposium on Principles & Practice of Parallel Programming, pp. 343–344. ACM (2010) Yang, Y., et al.: An optimizing compiler for GPGPU programs with input-data sharing. In: ACM Sigplan Symposium on Principles & Practice of Parallel Programming, pp. 343–344. ACM (2010)
4.
Zurück zum Zitat Lee, S., Min, S.J., Eigenmann, R.: OpenMP to GPGPU: a compiler framework for automatic translation and optimization. In: ACM Sigplan Symposium on Principles and Practice of Parallel Programming, pp. 101–110. ACM (2009) Lee, S., Min, S.J., Eigenmann, R.: OpenMP to GPGPU: a compiler framework for automatic translation and optimization. In: ACM Sigplan Symposium on Principles and Practice of Parallel Programming, pp. 101–110. ACM (2009)
5.
Zurück zum Zitat Han, T.D., Abdelrahman, T.S.: hiCUDA: high-Level GPGPU programming. IEEE Trans. Parallel Distrib. Syst. 22(1), 78–90 (2010)CrossRef Han, T.D., Abdelrahman, T.S.: hiCUDA: high-Level GPGPU programming. IEEE Trans. Parallel Distrib. Syst. 22(1), 78–90 (2010)CrossRef
6.
Zurück zum Zitat Kessler, C., et al.: Benchmarking OpenCL, OpenACC, OpenMP, and CUDA: programming productivity, performance, and energy consumption. In: The Workshop on Adaptive Resource Management & Scheduling for Cloud Computing, pp. 1–6. ACM (2017) Kessler, C., et al.: Benchmarking OpenCL, OpenACC, OpenMP, and CUDA: programming productivity, performance, and energy consumption. In: The Workshop on Adaptive Resource Management & Scheduling for Cloud Computing, pp. 1–6. ACM (2017)
7.
Zurück zum Zitat Komatsu, K., et al.: Translation of large-scale simulation codes for an OpenACC platform using the xevolver framework. Int. J. Networking Comput. 6(2), 167–180 (2017)CrossRef Komatsu, K., et al.: Translation of large-scale simulation codes for an OpenACC platform using the xevolver framework. Int. J. Networking Comput. 6(2), 167–180 (2017)CrossRef
8.
Zurück zum Zitat Rostami, R.M., Ghaffari-Miab, M.: Fast computation of finite difference generated time-domain Green’s functions of layered media using OpenAcc on graphics processors. In: Iranian Conference on Electrical Engineering (2017) Rostami, R.M., Ghaffari-Miab, M.: Fast computation of finite difference generated time-domain Green’s functions of layered media using OpenAcc on graphics processors. In: Iranian Conference on Electrical Engineering (2017)
9.
Zurück zum Zitat Pereira, A.D., et al.: Enabling efficient stencil code generation in OpenACC. Procedia Comput. Sci. 108, 2333–2337 (2017)CrossRef Pereira, A.D., et al.: Enabling efficient stencil code generation in OpenACC. Procedia Comput. Sci. 108, 2333–2337 (2017)CrossRef
11.
Zurück zum Zitat Feki, S., Al-Jarro, A., Bagci, H.: Multi-GPU-based acceleration of the explicit time domain volume integral equation solver using MPI-OpenACC. Radio Science Meeting, p. 90. IEEE (2013) Feki, S., Al-Jarro, A., Bagci, H.: Multi-GPU-based acceleration of the explicit time domain volume integral equation solver using MPI-OpenACC. Radio Science Meeting, p. 90. IEEE (2013)
Metadaten
Titel
Research on Matrix Multiplication Based on the Combination of OpenACC and CUDA
verfasst von
Yuexing Wang
Copyright-Jahr
2019
Verlag
Springer Singapore
DOI
https://doi.org/10.1007/978-981-13-7025-0_10

Premium Partner