
2024 | OriginalPaper | Chapter

4. Optimized Neural Network Processor Based on Frequency-Domain Compression Algorithm

Author: Jinshan Yue

Published in: High Energy Efficiency Neural Network Processor with Combined Digital and Computing-in-Memory Architecture

Publisher: Springer Nature Singapore


Abstract

This chapter introduces a neural network processor that improves energy efficiency through a frequency-domain compression algorithm. It first analyzes the significant power and area overhead that NN processors based on irregular sparse compression incur in order to support sparsity, and then introduces the structured frequency-domain compression algorithm adopted in this work.
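As background, structured frequency-domain compression is commonly realized with block-circulant weight matrices: the weight matrix is partitioned into b×b circulant blocks, each fully determined by b values, and each block's matrix-vector product collapses to an elementwise multiplication in the FFT domain. The sketch below illustrates that general idea in NumPy; the function names and block layout are illustrative assumptions, not the chapter's implementation.

```python
import numpy as np

def circulant_matvec(c, x):
    # C @ x where C is the b x b circulant matrix whose first COLUMn is c,
    # i.e. C[i, j] = c[(i - j) mod b].  Uses the FFT identity
    # C @ x = IFFT(FFT(c) * FFT(x)), costing O(b log b) instead of O(b^2).
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

def block_circulant_matvec(blocks, x, b):
    # blocks[i][j] stores only the first column of the (i, j) circulant
    # block, so an (m*b) x (n*b) weight matrix needs m*n*b values instead
    # of m*n*b*b -- a b-fold, fully regular compression (no sparse indices).
    m, n = len(blocks), len(blocks[0])
    y = np.zeros(m * b)
    for i in range(m):
        for j in range(n):
            y[i * b:(i + 1) * b] += circulant_matvec(blocks[i][j],
                                                     x[j * b:(j + 1) * b])
    return y
```

Because the compression pattern is regular, the hardware needs no index decoding logic, which is the overhead the abstract attributes to irregular sparse compression.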


DOI
https://doi.org/10.1007/978-981-97-3477-1_4