Published in: Design Automation for Embedded Systems, Issue 3/2020

7 March 2020

Deep learning parallel computing and evaluation for embedded system clustering architecture processor

Written by: Yue Zu

Abstract

In the era of intelligent systems, processing large volumes of information and running a wide range of intelligent applications increasingly depends on embedded devices, which gives machine learning algorithms a growing role. High-performance embedded computing is an effective way to address the limited computing power of embedded devices. New intelligent embedded applications based on machine learning require more computation than traditional embedded systems can readily deliver. To address this problem, this paper studies parallel optimization and implementation techniques for convolutional neural networks on the Parallella platform. It first presents a parallel optimization strategy for convolutional neural networks on the clustering architecture processor of a heterogeneous multi-core system. It then studies a high-performance implementation of a convolutional neural network on the Parallella platform and implements the functions of a convolutional neural network system. A set of performance evaluation methods for embedded parallel processors is also proposed. From the application perspective of the S698P, the eCos operating system is selected as the platform; single-core and multi-core modes are compared on the GRSIM simulator, and a parallel performance evaluation is given. Experiments show that the efficiency of deep learning tasks improves significantly compared with traditional parallel methods.
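
The abstract does not state which metrics the single-core versus multi-core comparison uses. As a minimal sketch, assuming the conventional definitions of speedup (single-core time divided by multi-core time) and parallel efficiency (speedup divided by core count), such a comparison could be computed as follows. The function names and the timings are hypothetical and are not taken from the paper.

```python
def speedup(t_single: float, t_multi: float) -> float:
    """Conventional speedup: single-core time divided by multi-core time."""
    if t_single <= 0 or t_multi <= 0:
        raise ValueError("execution times must be positive")
    return t_single / t_multi


def parallel_efficiency(t_single: float, t_multi: float, cores: int) -> float:
    """Conventional parallel efficiency: speedup divided by the core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return speedup(t_single, t_multi) / cores


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    t1, t4 = 10.0, 4.0
    print(f"speedup on 4 cores: {speedup(t1, t4):.2f}")
    print(f"efficiency on 4 cores: {parallel_efficiency(t1, t4, 4):.3f}")
```

An efficiency close to 1 would indicate that the added cores are well used; values well below 1 usually point to communication or synchronization overhead.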

Metadata
Title
Deep learning parallel computing and evaluation for embedded system clustering architecture processor
Written by
Yue Zu
Publication date
7 March 2020
Publisher
Springer US
Published in
Design Automation for Embedded Systems / Issue 3/2020
Print ISSN: 0929-5585
Electronic ISSN: 1572-8080
DOI
https://doi.org/10.1007/s10617-020-09235-5