
2018 | OriginalPaper | Chapter

Balancing Convolutional Neural Networks Pipeline in FPGAs

Authors : Mark Cappello Ferreira de Sousa, Miguel Angelo de Abreu de Sousa, Emilio Del-Moral-Hernandez

Published in: Artificial Neural Networks and Machine Learning – ICANN 2018

Publisher: Springer International Publishing

Abstract

Convolutional Neural Networks (CNNs) have achieved excellent performance in image classification and have been successfully applied in a wide range of domains. However, their processing demands pose a challenge for embedded real-time applications. To tackle this problem, this work focuses on FPGA acceleration of the convolutional layers, since they account for about 90% of the overall computational load. We implemented buffers to reduce feature-map storage, which in turn allows all kernel weights to be allocated in Block-RAMs (BRAMs). Moreover, we used 8-bit kernel weights, rounded from an already trained CNN, to further reduce memory requirements, storing them across multiple BRAMs to increase kernel-loading throughput. To balance the pipeline of convolutions across the convolutional layers, we adjusted the amount of parallel computation performed in the convolution step of each layer. We adopted the AlexNet CNN architecture to run our experiments and compare the results. We were able to run the inference of the convolutional layers in 3.9 ms at a maximum operating frequency of 76.9 MHz.
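The pipeline-balancing idea described in the abstract — matching per-layer throughput by varying the degree of parallelism in each convolutional layer — can be sketched as follows. The MAC counts and the proportional-allocation heuristic below are illustrative assumptions for a toy workload, not the paper's exact sizing method:

```python
# Sketch: balance a layer-pipelined CNN accelerator by allocating
# parallel multiply-accumulate (MAC) units to each convolutional
# layer in proportion to its workload, so that all pipeline stages
# take roughly the same number of cycles.

def balance_parallelism(macs_per_layer, total_units):
    """Return (units, cycles), where units[i] is the number of parallel
    MAC units assigned to layer i and cycles[i] its resulting latency."""
    total_macs = sum(macs_per_layer)
    units = [max(1, round(m / total_macs * total_units))
             for m in macs_per_layer]
    cycles = [m / u for m, u in zip(macs_per_layer, units)]
    return units, cycles

# Toy workload: three layers with MAC counts in a 1:2:3 ratio
# and a budget of 60 multipliers.
units, cycles = balance_parallelism([100_000, 200_000, 300_000], 60)
# units  -> [10, 20, 30]: parallelism tracks the workload,
# cycles -> [10000.0, 10000.0, 10000.0]: stages are balanced.
```

Since the slowest stage sets the throughput of the whole pipeline, allocating units proportionally to each layer's MAC count minimizes the worst-case stage latency for a fixed multiplier budget (up to rounding).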


DOI: https://doi.org/10.1007/978-3-030-01418-6_17