
17-10-2019 | Review Article

Implementation of DNNs on IoT devices

Authors: Zhichao Zhang, Abbas Z. Kouzani

Published in: Neural Computing and Applications | Issue 5/2020


Abstract

Driven by the recent growth in the fields of internet of things (IoT) and deep neural networks (DNNs), DNN-powered IoT devices are expected to transform a variety of industrial applications. DNNs, however, involve many parameters and operations for processing the data generated by IoT devices, which results in high data-processing latency and energy consumption. New approaches are thus being sought to tackle these issues and deploy real-time DNNs on resource-limited IoT devices. This paper presents a comprehensive review of hardware and software co-design approaches developed to implement DNNs on low-resource hardware platforms. These approaches explore the trade-offs among energy consumption, speed, classification accuracy, and model size. First, an overview of DNNs is given. Next, available tools for implementing DNNs on low-resource hardware platforms are described. Then, memory hierarchy designs together with dataflow mapping strategies are presented. Furthermore, various model optimization approaches, including pruning and quantization, are discussed. In addition, case studies are given to demonstrate the feasibility of implementing DNNs for IoT applications. Finally, detailed discussions, research gaps, and future directions are provided. The presented review can guide the design and implementation of the next generation of hardware and software solutions for real-world IoT applications.
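
The abstract names pruning and quantization among the surveyed model optimization approaches. As a rough illustration only (this sketch is not taken from the paper; the function names, sparsity target, and toy layer are hypothetical), the following Python/NumPy snippet shows what magnitude-based weight pruning followed by uniform 8-bit quantization might look like:

    import numpy as np

    def prune_by_magnitude(weights, sparsity=0.5):
        # Zero out the smallest-magnitude weights so that a `sparsity` fraction becomes zero.
        threshold = np.quantile(np.abs(weights), sparsity)
        mask = np.abs(weights) > threshold
        return weights * mask, mask

    def quantize_uint8(weights):
        # Affine (asymmetric) quantization of float weights to 8-bit integers.
        w_min, w_max = float(weights.min()), float(weights.max())
        scale = (w_max - w_min) / 255.0 if w_max > w_min else 1.0
        q = np.round((weights - w_min) / scale).astype(np.uint8)
        return q, scale, w_min  # dequantize with: q * scale + w_min

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        w = rng.normal(size=(64, 64)).astype(np.float32)      # toy layer weights
        w_pruned, mask = prune_by_magnitude(w, sparsity=0.7)  # keep the largest 30% of weights
        q, scale, zero_point = quantize_uint8(w_pruned)
        w_restored = q.astype(np.float32) * scale + zero_point
        print("achieved sparsity:", 1.0 - mask.mean())
        print("max dequantization error:", np.abs(w_pruned - w_restored).max())

In the works the review surveys, such steps are typically applied per layer and followed by retraining or fine-tuning to recover the accuracy lost to compression.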


Metadata

Title: Implementation of DNNs on IoT devices
Authors: Zhichao Zhang, Abbas Z. Kouzani
Publication date: 17-10-2019
Publisher: Springer London
Published in: Neural Computing and Applications / Issue 5/2020
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI: https://doi.org/10.1007/s00521-019-04550-w
