2020 | OriginalPaper | Book chapter

Improving Performance Estimation for FPGA-Based Accelerators for Convolutional Neural Networks

Authors: Martin Ferianc, Hongxiang Fan, Ringo S. W. Chu, Jakub Stano, Wayne Luk

Published in: Applied Reconfigurable Computing. Architectures, Tools, and Applications

Publisher: Springer International Publishing

Abstract

Field-programmable gate array (FPGA) based accelerators are widely used to accelerate convolutional neural networks (CNNs) because of their potential to improve performance and their reconfigurability for specific application instances. Determining the optimal configuration of an FPGA-based accelerator requires exploring the design space, and accurate performance prediction plays an important role in that exploration. This work introduces a novel method for fast and accurate latency estimation based on a Gaussian process parametrised by an analytic approximation and coupled with runtime data. Experiments on three different CNNs running on an FPGA-based accelerator on an Intel Arria 10 GX 1150 demonstrated a 30.7% improvement in accuracy, measured by mean absolute error, over a standard analytic method in leave-one-out cross-validation.
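The idea in the abstract can be illustrated with a minimal sketch: a Gaussian process whose prior mean is an analytic latency model, so that the process only has to learn the residual between the analytic estimate and measured runtimes, evaluated with leave-one-out cross-validation. The abstract does not give the analytic model, the kernel, the hyperparameters or the data, so everything below is an assumption made for illustration: the `analytic_latency` formula, the squared-exponential kernel, the fixed noise level and the synthetic "measurements" are all hypothetical and not taken from the chapter.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def analytic_latency(x):
    """Hypothetical analytic model: amount of work divided by parallelism."""
    work, parallelism = x[:, 0], x[:, 1]
    return work / parallelism


def gp_predict(x_train, y_train, x_test, noise=1e-2):
    """GP posterior mean that uses the analytic model as its prior mean."""
    # The GP is fitted to what the analytic model fails to explain.
    residual = y_train - analytic_latency(x_train)
    k_train = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_cross = rbf_kernel(x_test, x_train)
    correction = k_cross @ np.linalg.solve(k_train, residual)
    return analytic_latency(x_test) + correction


def analytic_only(x_train, y_train, x_test):
    """Baseline predictor that ignores the runtime data entirely."""
    return analytic_latency(x_test)


def loo_mae(x, y, predictor):
    """Mean absolute error under leave-one-out cross-validation."""
    errors = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        prediction = predictor(x[keep], y[keep], x[i:i + 1])
        errors.append(abs(prediction[0] - y[i]))
    return float(np.mean(errors))


# Synthetic stand-in for measured latencies (not data from the chapter):
# the analytic term plus a systematic overhead the analytic model misses.
rng = np.random.default_rng(0)
x = rng.uniform([1.0, 1.0], [4.0, 4.0], size=(30, 2))
y = analytic_latency(x) + 0.3 * np.sin(x[:, 0]) + 0.2

analytic_mae = loo_mae(x, y, analytic_only)
gp_mae = loo_mae(x, y, gp_predict)
print(f"analytic MAE: {analytic_mae:.4f}, GP-corrected MAE: {gp_mae:.4f}")
```

In this toy setting the analytic model carries the bulk of the prediction, and the Gaussian process corrects the systematic error using a small number of measurements. In practice the kernel hyperparameters would be fitted to the data rather than fixed, for example with a library such as GPflow [13] or scikit-learn [14].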

Footnotes
1
Tutorial code is available at https://git.io/Jv31c.

2
For a detailed derivation please refer to [15].

References
2.
Dai, S., Zhou, Y., Zhang, H., Ustun, E., Young, E.F., Zhang, Z.: Fast and accurate estimation of quality of results in high-level synthesis with machine learning. In: Proceedings of the 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), pp. 129–132. IEEE, Boulder (2018)
4.
Fan, H., et al.: A real-time object detection accelerator with compressed SSDLite on FPGA. In: Proceedings of the 2018 International Conference on Field-Programmable Technology (FPT), pp. 14–21. IEEE, Sakura (2018)
5.
Fan, H., et al.: F-E3D: FPGA-based acceleration of an efficient 3D convolutional neural network for human action recognition. In: Proceedings of the 2019 IEEE 30th International Conference on Application-Specific Systems, Architectures and Processors (ASAP), vol. 2160, pp. 1–8. IEEE, New York (2019)
6.
8.
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2016, pp. 770–778. IEEE, Las Vegas (2016)
9.
Holland, B., George, A.D., Lam, H., Smith, M.C.: An analytical model for multilevel performance prediction of multi-FPGA systems. ACM Trans. Reconfig. Technol. Syst. (TRETS) 4(3), 27 (2011)
11.
Lian, X., Liu, Z., Song, Z., Dai, J., Zhou, W., Ji, X.: High-performance FPGA-based CNN accelerator with block-floating-point arithmetic. IEEE Trans. Very Large Scale Integr. VLSI Syst. 27, 1874–1885 (2019)
13.
Matthews, D.G., et al.: GPflow: a Gaussian process library using TensorFlow. J. Mach. Learn. Res. 18, 1299–1304 (2017)
14.
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
15.
Rasmussen, C.E.: Gaussian Processes in Machine Learning. The MIT Press, Cambridge (2005)
16.
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 3, pp. 779–788. IEEE, Las Vegas (2016)
17.
Venieris, S., Kouris, A., Bouganis, C.S.: Toolflows for mapping convolutional neural networks on FPGAs: a survey and future directions. ACM Comput. Surv. (CSUR) 51, 1–39 (2018)
18.
Williams, C.K., Rasmussen, C.E.: Gaussian processes for regression. In: Advances in Neural Information Processing Systems, pp. 514–520 (1996)
19.
Yasudo, R., Coutinho, J., Varbanescu, A., Luk, W., Amano, H., Becker, T.: Performance estimation for exascale reconfigurable dataflow platforms. In: Proceedings of the 2018 International Conference on Field-Programmable Technology (FPT), pp. 314–317. IEEE, Sakura (2018)
Metadata
Title
Improving Performance Estimation for FPGA-Based Accelerators for Convolutional Neural Networks
Authors
Martin Ferianc
Hongxiang Fan
Ringo S. W. Chu
Jakub Stano
Wayne Luk
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-44534-8_1
