
2018 | OriginalPaper | Chapter

Research on Parallel Acceleration for Deep Learning Inference Based on Many-Core ARM Platform

Authors : Keqian Zhu, Jingfei Jiang

Published in: Advanced Computer Architecture

Publisher: Springer Singapore


Abstract

Deep learning is one of the most active research directions in artificial intelligence and has achieved results that surpass those of traditional methods. However, its demand for computing power from the underlying hardware platform keeps increasing. Academia and industry mainly rely on heterogeneous GPUs to accelerate computation, whereas ARM is a comparatively more open platform. The purpose of this paper is to study the performance of the ThunderX high-performance many-core ARM chip, and the acceleration techniques it requires, under large-scale inference workloads. To evaluate the computational performance of the target platform objectively, several deep models are adapted for acceleration. Through the selection of computational libraries, the adjustment of parallel strategies, and the application of various performance optimization techniques, we thoroughly exploit the computing capability of the many-core ARM platform. The experimental results show that a single-chip ThunderX matches the performance of an Intel i7-7700K, and the dual-chip configuration reaches 1.77 times the performance of the latter. In terms of energy efficiency the ThunderX is inferior to the i7-7700K; a stronger cooling system or less effective power management may account for the higher power consumption. Overall, high-performance ARM chips can be deployed in the cloud to handle large-scale deep learning inference tasks that require high throughput.
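The abstract mentions adjusting parallel strategies on the many-core ARM platform but the page carries no code. As an illustrative sketch only (not the authors' implementation), the following C/OpenMP fragment shows the kind of batch-level parallelism such inference acceleration typically relies on: independent samples in a batch are distributed across cores, with a GEMM-like inner kernel per sample. The function name fc_layer_batched, the layer sizes, and the static scheduling choice are assumptions made for this example.

```c
/*
 * Illustrative sketch: thread-parallel fully-connected inference layer.
 * Not taken from the paper; names and sizes are assumed for demonstration.
 */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

/* y[b][o] = relu( sum_i x[b][i] * W[o][i] + bias[o] ) for a batch of inputs */
static void fc_layer_batched(const float *x, const float *W, const float *bias,
                             float *y, int batch, int in_dim, int out_dim)
{
    /* Parallelize across the batch: each sample's output row is independent,
       which maps naturally onto the many ARM cores. */
    #pragma omp parallel for schedule(static)
    for (int b = 0; b < batch; ++b) {
        for (int o = 0; o < out_dim; ++o) {
            float acc = bias[o];
            for (int i = 0; i < in_dim; ++i)
                acc += x[(size_t)b * in_dim + i] * W[(size_t)o * in_dim + i];
            y[(size_t)b * out_dim + o] = acc > 0.0f ? acc : 0.0f;  /* ReLU */
        }
    }
}

int main(void)
{
    const int batch = 256, in_dim = 1024, out_dim = 1000;
    float *x    = malloc((size_t)batch * in_dim   * sizeof(float));
    float *W    = malloc((size_t)out_dim * in_dim * sizeof(float));
    float *bias = malloc((size_t)out_dim * sizeof(float));
    float *y    = malloc((size_t)batch * out_dim  * sizeof(float));
    if (!x || !W || !bias || !y) return 1;

    /* Fill with dummy data so the example runs end to end. */
    for (size_t i = 0; i < (size_t)batch * in_dim;   ++i) x[i] = 0.01f;
    for (size_t i = 0; i < (size_t)out_dim * in_dim; ++i) W[i] = 0.02f;
    for (int i = 0; i < out_dim; ++i) bias[i] = 0.1f;

    double t0 = omp_get_wtime();
    fc_layer_batched(x, W, bias, y, batch, in_dim, out_dim);
    double t1 = omp_get_wtime();

    printf("threads=%d  time=%.3f s  y[0]=%f\n",
           omp_get_max_threads(), t1 - t0, y[0]);
    free(x); free(W); free(bias); free(y);
    return 0;
}
```

Compiled with, for example, `gcc -O3 -fopenmp`, the thread count can be matched to the physical cores of the target chip via OMP_NUM_THREADS; in practice the inner loops would be replaced by a tuned BLAS/GEMM call from the computational library selected for the platform.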


Metadata
Title
Research on Parallel Acceleration for Deep Learning Inference Based on Many-Core ARM Platform
Authors
Keqian Zhu
Jingfei Jiang
Copyright Year
2018
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-13-2423-9_3