Abstract
Many real-world edge applications, including object detection, robotics, and smart health, are enabled by deploying deep neural networks (DNNs) on energy-constrained mobile platforms. In this article, we propose a novel approach to trade off the energy and accuracy of DNN inference at runtime using a design space called Learning Energy Accuracy Tradeoff Networks (LEANets). The key idea behind LEANets is to construct classifiers of increasing complexity from a pretrained DNN and to perform input-specific adaptive inference: easy inputs exit at an early, low-cost classifier, while harder inputs proceed to more complex ones. The accuracy and energy consumption of this adaptive inference scheme depend on a set of confidence thresholds, one per classifier. To determine a set of threshold vectors that realize different energy-accuracy tradeoffs, we propose a novel multiobjective optimization approach; at runtime, we select the threshold vector that matches the desired tradeoff. We perform experiments on multiple pretrained DNNs, including ConvNet, VGG-16, and MobileNet, using diverse image classification datasets. Our results show energy savings of up to 50% with negligible loss in accuracy, and optimized LEANets achieve a significantly better energy-accuracy tradeoff than Slimmable neural networks, a state-of-the-art method.
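To make the adaptive inference scheme concrete, below is a minimal sketch of threshold-gated early-exit prediction in PyTorch. All names here (leanet_predict, backbone_stages, exit_classifiers, pareto_thresholds) are illustrative assumptions rather than the paper's actual implementation; the exit classifiers are assumed to be attached to intermediate features of a pretrained DNN, and the threshold vector is assumed to come from the Pareto set found offline by the multiobjective optimizer.

```python
# Illustrative sketch only: names and structure are assumptions,
# not the authors' implementation.
import torch
import torch.nn.functional as F

def leanet_predict(backbone_stages, exit_classifiers, thresholds, x):
    """Adaptive inference for a single input x (batch size 1).

    backbone_stages:  consecutive stages of a pretrained DNN (list of nn.Module)
    exit_classifiers: one lightweight classifier per stage, ordered by
                      increasing complexity (assumes a nonempty list)
    thresholds:       one confidence threshold per classifier; a threshold
                      vector corresponds to one energy-accuracy tradeoff point
    """
    features = x
    for stage, clf, tau in zip(backbone_stages, exit_classifiers, thresholds):
        features = stage(features)               # energy is spent stage by stage
        probs = F.softmax(clf(features), dim=-1)
        confidence, label = probs.max(dim=-1)
        if confidence.item() >= tau:             # confident enough: exit early
            return label.item()                  # cheap prediction, energy saved
    return label.item()                          # fell through: full-complexity answer
```

Switching tradeoffs at runtime then amounts to swapping the threshold vector, e.g., thresholds = pareto_thresholds[k] for the k-th Pareto-optimal operating point stored offline.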
References
- Syrine Belakaria, Aryan Deshwal, and Janardhan Rao Doppa. 2019. Max-value entropy search for multi-objective Bayesian optimization. In Advances in Neural Information Processing Systems (NeurIPS’19).
- Caffe-HRT. [n.d.]. Retrieved from https://github.com/OAID/Caffe-HRT.
- Yunji Chen, Tao Luo, Shaoli Liu, Shijin Zhang, Liqiang He, Jia Wang, and Ling Li. 2014. DaDianNao: A machine-learning supercomputer. In Proceedings of MICRO. 609--622.
- Yu-Hsin Chen, Joel Emer, and Vivienne Sze. 2017. Using dataflow to optimize energy efficiency of deep neural network accelerators. IEEE Micro 37, 3 (2017), 12--21. DOI:10.1109/MM.2017.54
- François Chollet. 2017. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR’17). 1800--1807. DOI:10.1109/CVPR.2017.195
- Matthieu Courbariaux and Yoshua Bengio. 2016. BinaryNet: Training deep neural networks with weights and activations constrained to +1 or −1. CoRR abs/1602.02830 (2016). arxiv:1602.02830 http://arxiv.org/abs/1602.02830.
- Kalyanmoy Deb, Amrit Pratap, Sameer Agarwal, and T. Meyarivan. 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6, 2 (2002), 182--197.
- Aryan Deshwal, Nitthilan Kannappan Jayakodi, Biresh Kumar Joardar, Janardhan Rao Doppa, and Partha Pratim Pande. 2019. MOOS: A multi-objective design space exploration and optimization framework for NoC enabled manycore systems. ACM Transactions on Embedded Computing Systems (TECS) 18, 5s (2019), 77:1--77:23. DOI:10.1145/3358206
- Ruizhou Ding, Zeye Liu, R. D. Shawn Blanton, and Diana Marculescu. 2018. Quantized deep neural networks for energy efficient hardware-based inference. In Proceedings of the 23rd Asia and South Pacific Design Automation Conference (ASP-DAC’18). DOI:10.1109/ASPDAC.2018.8297274
- Janardhan Rao Doppa, Alan Fern, and Prasad Tadepalli. 2014. HC-Search: A learning framework for search-based structured prediction. Journal of Artificial Intelligence Research 50 (2014), 369--407.
- Janardhan Rao Doppa, Alan Fern, and Prasad Tadepalli. 2014. Structured prediction via output space search. Journal of Machine Learning Research 15, 1 (2014), 1317--1350.
- Mingyu Gao, Jing Pu, Xuan Yang, Mark Horowitz, and Christos Kozyrakis. 2017. TETRIS: Scalable and efficient neural network acceleration with 3D memory. SIGARCH Computer Architecture News 45, 1 (2017), 751--764. DOI:10.1145/3093337.3037702
- Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A. Horowitz, and William J. Dally. 2016. EIE: Efficient inference engine on compressed deep neural network. In Proceedings of ISCA. 243--254.
- Song Han, Jeff Pool, John Tran, and William J. Dally. 2015. Learning both weights and connections for efficient neural networks. CoRR abs/1506.02626 (2015). arxiv:1506.02626
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2015. Deep residual learning for image recognition. CoRR abs/1512.03385 (2015). arxiv:1512.03385 http://arxiv.org/abs/1512.03385.
- Daniel Hernández-Lobato, José Miguel Hernández-Lobato, Amar Shah, and Ryan P. Adams. 2016. Predictive entropy search for multi-objective Bayesian optimization. In Proceedings of ICML. 1492--1501.
- Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2014. Distilling the knowledge in a neural network. In NIPS Deep Learning Workshop.
- Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. 2017. MobileNets: Efficient convolutional neural networks for mobile vision applications. CoRR abs/1704.04861 (2017). arxiv:1704.04861 http://arxiv.org/abs/1704.04861.
- Gao Huang, Zhuang Liu, and Kilian Q. Weinberger. 2016. Densely connected convolutional networks. CoRR abs/1608.06993 (2016). arxiv:1608.06993 http://arxiv.org/abs/1608.06993.
- Forrest N. Iandola, Matthew W. Moskewicz, Khalid Ashraf, Song Han, William J. Dally, and Kurt Keutzer. 2016. SqueezeNet: AlexNet-level accuracy with 50× fewer parameters and <1MB model size. CoRR abs/1602.07360 (2016).
- Nitthilan Kannappan Jayakodi, Anwesha Chatterjee, Wonje Choi, Janardhan Rao Doppa, and Partha Pratim Pande. 2018. Trading-off accuracy and energy of deep inference on embedded systems: A co-design approach. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) 37, 11 (2018), 2881--2893.
- Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. 2014. Caffe: Convolutional architecture for fast feature embedding. CoRR abs/1408.5093 (2014). arxiv:1408.5093 http://arxiv.org/abs/1408.5093.
- Biresh Kumar Joardar, Ryan Gary Kim, Janardhan Rao Doppa, Partha Pratim Pande, Diana Marculescu, and Radu Marculescu. 2018. Learning-based application-agnostic 3D NoC design for heterogeneous manycore systems. IEEE Transactions on Computers 68, 6 (2018), 852--866.
- Patrick Judd, Alberto Delmas, Sayeh Sharify, and Andreas Moshovos. 2016. Cnvlutin: Ineffectual-neuron-free deep neural network computing. In Proceedings of ISCA. 1--13.
- Ryan Gary Kim, Wonje Choi, Zhuo Chen, Janardhan Rao Doppa, Partha Pratim Pande, Diana Marculescu, and Radu Marculescu. 2017. Imitation learning for dynamic VFI control in large-scale manycore systems. IEEE Transactions on VLSI Systems (TVLSI) 25, 9 (2017), 2458--2471.
- Michael Lam, Janardhan Rao Doppa, Sinisa Todorovic, and Thomas G. Dietterich. 2015. HC-search for structured prediction in computer vision. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR’15). 4923--4932.
- Haoxiang Li, Zhe Lin, Xiaohui Shen, Jonathan Brandt, and Gang Hua. 2015. A convolutional neural network cascade for face detection. In Proceedings of CVPR. 5325--5334.
- Yufei Ma, Naveen Suda, Yu Cao, Jae-sun Seo, and Sarma Vrudhula. 2016. Scalable and modularized RTL compilation of convolutional neural networks onto FPGA. In Proceedings of FPL. 1--8.
- Sumit Mandal, Ganapati Bhat, Chetan Arvind Patil, Janardhan Rao Doppa, Partha Pratim Pande, and Umit Ogras. 2019. Dynamic resource management of heterogeneous mobile platforms via imitation learning. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 27, 12 (2019), 2842--2854.
- Katayoun Neshatpour, Farnaz Behnia, Houman Homayoun, and Avesta Sasan. 2018. ICNN: An iterative implementation of convolutional neural networks to enable energy and computational complexity aware dynamic approximation. In Proceedings of DATE.
- ODROID-XU4. 2017. Retrieved March 29, 2018, from https://wiki.odroid.com/odroid-xu4/hardware/hardware.
- Priyadarshini Panda, Abhronil Sengupta, and Kaushik Roy. 2016. Conditional deep learning for energy-efficient and enhanced pattern recognition. In Proceedings of DATE.
- Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein, Alexander C. Berg, and Fei-Fei Li. 2014. ImageNet large scale visual recognition challenge. CoRR abs/1409.0575 (2014). http://arxiv.org/abs/1409.0575
- SmartPower2. [n.d.]. Retrieved from https://wiki.odroid.com/accessory/power_supply_battery/smartpower2.
- Sourav Das, Janardhan Rao Doppa, Partha Pratim Pande, and Krishnendu Chakrabarty. 2017. Design-space exploration and optimization of an energy-efficient and reliable 3-D small-world network-on-chip. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) 36, 5 (2017), 719--732.
- Niranjan Srinivas, Andreas Krause, Sham M. Kakade, and Matthias W. Seeger. 2010. Gaussian process optimization in the bandit setting: No regret and experimental design. In Proceedings of ICML. 1015--1022.
- Dimitrios Stamoulis, Ting-Wu Chin, Anand Krishnan Prakash, Haocheng Fang, Sribhuvan Sajja, Mitchell Bognar, and Diana Marculescu. 2018. Designing adaptive neural networks for energy-constrained image classification. CoRR abs/1808.01550 (2018). arxiv:1808.01550 http://arxiv.org/abs/1808.01550.
- Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, and Joel S. Emer. 2017. Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105, 12 (2017), 2295--2329.
- Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott E. Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. 2015. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR’15). 1--9. DOI:10.1109/CVPR.2015.7298594
- Keze Wang, Dongyu Zhang, Ya Li, Ruimao Zhang, and Liang Lin. 2017. Cost-effective active learning for deep image classification. CoRR abs/1701.03551 (2017).
- Tien-Ju Yang, Yu-Hsin Chen, and Vivienne Sze. 2017. Designing energy-efficient convolutional neural networks using energy-aware pruning. In Proceedings of CVPR.
- Jiahui Yu, Linjie Yang, Ning Xu, Jianchao Yang, and Thomas S. Huang. 2019. Slimmable neural networks. In Proceedings of ICLR.
- Aojun Zhou, Anbang Yao, Yiwen Guo, Lin Xu, and Yurong Chen. 2017. Incremental network quantization: Towards lossless CNNs with low-precision weights. CoRR abs/1702.03044 (2017). arxiv:1702.03044
- Eckart Zitzler. 1999. Evolutionary Algorithms for Multiobjective Optimization: Methods and Applications. Vol. 63. Ithaca, NY: Shaker.