
2019 | Original Paper | Book Chapter

5. ENVISION: Energy-Scalable Sparse Convolutional Neural Network Processing

By: Bert Moons, Daniel Bankman, Marian Verhelst

Published in: Embedded Deep Learning

Publisher: Springer International Publishing


Abstract

This chapter focuses on Envision: two generations of energy-scalable sparse convolutional neural network processors. They achieve state-of-the-art (SotA) energy efficiency by leveraging the three key CNN characteristics discussed in Chap. 3. (a) Inherent CNN parallelism is exploited through a highly parallelized processor architecture that minimizes internal memory bandwidth. (b) Inherent network sparsity, in pruned networks and in ReLU-activated feature maps, is leveraged by compressing sparse IO data streams and skipping unnecessary computations. (c) The inherent fault tolerance of CNNs is exploited by making this architecture DVAS- or DVAFS-compatible in Envision V1 and V2, respectively, following the theory discussed in Chap. 4. This capability minimizes energy consumption for any CNN, at any computational precision requirement up to 16b fixed-point. Through its energy scalability and high energy efficiency, Envision lends itself perfectly to the hierarchical applications discussed in Chap. 2, enabling CNN processing in always-on embedded applications.
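The sparsity exploitation in point (b) can be illustrated with a toy sketch. This is not the actual Envision compression scheme (which operates in hardware on IO streams); it only demonstrates the underlying idea that ReLU produces exact zeros, that a MAC array may skip the corresponding multiplies without changing the result, and that an (index, value) stream is smaller than the dense vector:

```python
import numpy as np

rng = np.random.default_rng(0)

# ReLU-activated feature vector: roughly half the entries are exact zeros
acts = np.maximum(rng.standard_normal(64), 0.0)
weights = rng.standard_normal(64)

# Dense MAC: every multiply is performed, including those with a zero input
dense = float(acts @ weights)

# Zero-skipping MAC: only nonzero activations reach the multiplier, so the
# multiplies with zero operands (and their switching energy) are avoided
nz = np.flatnonzero(acts)
sparse = float(acts[nz] @ weights[nz])

# An (index, value) stream, akin to a CSR row, also shrinks IO bandwidth
compressed = list(zip(nz.tolist(), acts[nz].tolist()))

assert abs(dense - sparse) < 1e-9
print(f"{len(nz)} of {acts.size} multiplies needed")
```

With a standard-normal input, about half the activations are zeroed by ReLU, so roughly half the multiplies and half the activation traffic disappear, which is the source of the energy savings the chapter quantifies.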


Metadata
Title
ENVISION: Energy-Scalable Sparse Convolutional Neural Network Processing
By
Bert Moons
Daniel Bankman
Marian Verhelst
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-319-99223-5_5
