Published in: The Journal of Supercomputing 7/2021

4 January 2021

A 3D graphics rendering pipeline implementation based on the openCL massively parallel processing

Authors: Mingyu Kim, Nakhoon Baek

Abstract

Massively parallel computing libraries and devices have recently come into wide use alongside traditional 3D graphics systems. In this paper, we present a full 3D fixed-function graphics pipeline based on OpenCL, one of the most widely used massively parallel computing libraries. Full 3D graphics features, including WebGL, Web3D, and others, can be implemented on top of massively parallel computation without underlying 3D graphics hardware support. Many previous works focused on CUDA, another massively parallel system, whose drawback is limited availability. In contrast, we designed and implemented a new architecture with OpenCL, which is now available on a variety of computing devices, including most CPUs and GPUs and, at least theoretically, special-purpose embedded FPGAs. Our work provides full 3D graphics features on OpenCL-capable systems without dedicated 3D graphics hardware, with the aim of making 3D graphics features ubiquitous. Technically, rendering proceeds top-down, from the whole screen to individual pixels. At each stage, we tuned our OpenCL implementations together with their global and local parameter spaces. We present the details of our design and the final results of our implementation, and show its correctness and efficiency.
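
The top-down traversal mentioned in the abstract, from the whole screen down to individual pixels, can be illustrated with a minimal sketch. It is written in plain Python rather than OpenCL and is not the authors' implementation: the tile size, the names `edge` and `rasterize`, and the use of edge functions for the coverage test are all assumptions chosen for illustration.

```python
import math

# Illustrative sketch only: a coarse-to-fine ("screen -> tiles -> pixels")
# triangle coverage test. TILE, edge(), and rasterize() are hypothetical
# names and choices, not taken from the paper.

TILE = 8  # assumed tile edge length in pixels


def edge(a, b, p):
    """Edge function: positive when p lies on the left of the edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(tri, width, height, tile=TILE):
    """Return the set of (x, y) pixels whose centers the triangle covers."""
    v0, v1, v2 = tri
    # Enforce a consistent winding so interior points give non-negative values.
    if edge(v0, v1, v2) < 0:
        v1, v2 = v2, v1

    # Stage 1 (whole screen): clamp the triangle's bounding box to the screen.
    xs = [v[0] for v in (v0, v1, v2)]
    ys = [v[1] for v in (v0, v1, v2)]
    x_min = max(math.floor(min(xs)), 0)
    x_max = min(math.floor(max(xs)), width - 1)
    y_min = max(math.floor(min(ys)), 0)
    y_max = min(math.floor(max(ys)), height - 1)

    covered = set()
    # Stage 2 (tiles): visit only the tiles that overlap the bounding box.
    for ty in range(y_min // tile, y_max // tile + 1):
        for tx in range(x_min // tile, x_max // tile + 1):
            # Stage 3 (pixels): test each pixel center inside this tile.
            y_lo, y_hi = max(ty * tile, y_min), min((ty + 1) * tile, y_max + 1)
            x_lo, x_hi = max(tx * tile, x_min), min((tx + 1) * tile, x_max + 1)
            for y in range(y_lo, y_hi):
                for x in range(x_lo, x_hi):
                    p = (x + 0.5, y + 0.5)
                    if (edge(v0, v1, p) >= 0
                            and edge(v1, v2, p) >= 0
                            and edge(v2, v0, p) >= 0):
                        covered.add((x, y))
    return covered
```

In a parallel version, the tile loops would plausibly map to work-groups and the pixel loops to work-items, so the tile size would become one of the parameters to tune. That mapping is an assumption here, not something the abstract states.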

Metadata
Title
A 3D graphics rendering pipeline implementation based on the openCL massively parallel processing
Authors
Mingyu Kim
Nakhoon Baek
Publication date
4 January 2021
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 7/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-020-03581-8
