Published in: The Journal of Supercomputing 6/2018

12-12-2017

Real-time parallel image processing applications on multicore CPUs with OpenMP and GPGPU with CUDA

Authors: Semra Aydin, Refik Samet, Omer Faruk Bay


Abstract

This paper presents real-time image processing applications using multicore and multiprocessing technologies. To this end, parallel image segmentation was performed on many images covering the entire surface of the same moving metallic, cylindrical objects. Experimental results on a multicore CPU with the OpenMP platform showed that increasing the chunk size reduces the execution time by approximately a factor of four compared with serial computing. The same experiments were implemented on GPGPU using four techniques: (1) single image transmission with single pixel processing; (2) single image transmission with multiple pixel processing; (3) multiple image transmission with single pixel processing; and (4) multiple image transmission with multiple pixel processing. All techniques were implemented on GeForce, Tesla K20 and Tesla K40 GPUs. Experimental results on GPUs with the CUDA platform showed that speedup increases with the number of cores. The Tesla K40 gave the best results, with speedups over serial computing, without and with data transmission time respectively, of 35 and 12 times (first technique), 36 and 13 times (second technique), 54 and 16 times (third technique), and 71 and 17 times (fourth technique). Consequently, the Tesla K40 GPU combined with multiple image transmission with multiple pixel processing is recommended for maximum performance.
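As a rough illustration of the OpenMP chunk-size experiment mentioned in the abstract, the following C++ sketch parallelises a per-pixel segmentation loop and exposes the chunk size as a parameter. The abstract does not state the segmentation algorithm, the scheduling policy, or any function names. Simple threshold segmentation, `schedule(static, chunk)`, and the helper `segment_threshold` are assumptions made here for illustration only and are not taken from the paper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the paper): binarise 8-bit grayscale pixels.
// Pixels >= threshold become 255 (foreground); all others become 0.
// `chunk` is the number of consecutive pixels assigned to a thread at a time.
std::vector<std::uint8_t> segment_threshold(const std::vector<std::uint8_t>& pixels,
                                            std::uint8_t threshold,
                                            int chunk) {
    assert(chunk > 0);
    std::vector<std::uint8_t> mask(pixels.size());
    const long long n = static_cast<long long>(pixels.size());

    // Each iteration writes a distinct element, so no synchronisation is needed.
    // If OpenMP is not enabled, the pragma is ignored and the loop runs serially.
    #pragma omp parallel for schedule(static, chunk)
    for (long long i = 0; i < n; ++i) {
        const std::size_t k = static_cast<std::size_t>(i);
        mask[k] = pixels[k] >= threshold ? 255 : 0;
    }
    return mask;
}
```

Built with an OpenMP flag such as `-fopenmp` on GCC or Clang, a call like `segment_threshold(image, 100, 4096)` distributes the pixels in blocks of 4096 per thread. Timing the same call with different `chunk` values is one way to reproduce the kind of chunk-size comparison the abstract describes, though the actual measurement setup of the paper is not given here.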


Metadata
Title
Real-time parallel image processing applications on multicore CPUs with OpenMP and GPGPU with CUDA
Authors
Semra Aydin
Refik Samet
Omer Faruk Bay
Publication date
12-12-2017
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 6/2018
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-017-2168-6
