
2018 | OriginalPaper | Chapter

Balancing Convolutional Neural Networks Pipeline in FPGAs

Authors : Mark Cappello Ferreira de Sousa, Miguel Angelo de Abreu de Sousa, Emilio Del-Moral-Hernandez

Published in: Artificial Neural Networks and Machine Learning – ICANN 2018

Publisher: Springer International Publishing

Abstract

Convolutional Neural Networks (CNNs) have achieved excellent performance in image classification and have been successfully applied in a wide range of domains. However, their processing demands pose a challenge for embedded real-time applications. To tackle this problem, this work focuses on FPGA acceleration of the convolutional layers, since they account for about 90% of the overall computational load. We implemented buffers to reduce feature-map storage, which in turn allows all kernel weights to be allocated in Block-RAMs (BRAMs). Moreover, we used 8-bit kernel weights, rounded from an already trained CNN, to further reduce memory requirements, storing them across multiple BRAMs to increase kernel-loading throughput. To balance the pipeline of convolutions across the convolutional layers, we adjusted the amount of parallel computation performed in the convolution step of each layer. We adopted the AlexNet CNN architecture to run our experiments and compare the results. We were able to run the inference of the convolutional layers in 3.9 ms at a maximum operating frequency of 76.9 MHz.
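The pipeline-balancing idea described in the abstract — matching per-layer throughput by varying the degree of parallelism in each convolutional layer — can be sketched as follows. The MAC counts and the proportional-allocation heuristic below are illustrative assumptions for a toy workload, not the paper's exact sizing method:

```python
# Sketch: balance a layer-pipelined CNN accelerator by allocating
# parallel multiply-accumulate (MAC) units to each convolutional
# layer in proportion to its workload, so that all pipeline stages
# take roughly the same number of cycles.

def balance_parallelism(macs_per_layer, total_units):
    """Return (units, cycles), where units[i] is the number of parallel
    MAC units assigned to layer i and cycles[i] its resulting latency."""
    total_macs = sum(macs_per_layer)
    units = [max(1, round(m / total_macs * total_units))
             for m in macs_per_layer]
    cycles = [m / u for m, u in zip(macs_per_layer, units)]
    return units, cycles

# Toy workload: three layers with MAC counts in a 1:2:3 ratio
# and a budget of 60 multipliers.
units, cycles = balance_parallelism([100_000, 200_000, 300_000], 60)
# units  -> [10, 20, 30]: parallelism tracks the workload,
# cycles -> [10000.0, 10000.0, 10000.0]: stages are balanced.
```

Since the slowest stage sets the throughput of the whole pipeline, allocating units proportionally to each layer's MAC count minimizes the worst-case stage latency for a fixed multiplier budget (up to rounding).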


DOI: https://doi.org/10.1007/978-3-030-01418-6_17