ABSTRACT
Deep learning with deep neural networks is advancing machine intelligence in computer vision, speech recognition, natural language processing, and other domains. Brain-like hardware platforms for brain-inspired computational models are being studied, but none of these platforms copes with the enormous size of practical deep neural networks. This paper presents two techniques, factorization and pruning, that not only compress the models but also preserve a form suitable for execution on neuromorphic architectures. We also propose a novel method that combines the two techniques. The combined method reduces the number of model parameters significantly more than either technique alone while maintaining accuracy. Our experimental results show that the proposed method achieves a 31× reduction for the largest layer of AlexNet with no loss of accuracy.
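As a concrete illustration of how factorization and pruning might be combined, the sketch below replaces a weight matrix with a truncated-SVD factorization and then magnitude-prunes the resulting factors. This is a minimal sketch under assumptions, not the paper's method: the function name `factorize_then_prune`, the `rank` and `prune_ratio` parameters, and the use of SVD with quantile thresholding are illustrative choices; the actual procedure would also involve rank selection and retraining, which the abstract does not specify.

```python
import numpy as np

def factorize_then_prune(W, rank, prune_ratio):
    """Approximate W (m x n) with two smaller factors, then zero out
    the smallest-magnitude factor entries. Illustrative sketch only:
    the paper's exact rank selection, pruning schedule, and retraining
    steps are not reproduced here."""
    # Low-rank factorization via truncated SVD: W ~= A @ B
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # shape (m, rank)
    B = Vt[:rank, :]             # shape (rank, n)
    # Magnitude pruning of the factors: zero the smallest fraction
    for F in (A, B):
        cutoff = np.quantile(np.abs(F), prune_ratio)
        F[np.abs(F) < cutoff] = 0.0
    # Reduction rate: original parameter count over surviving nonzeros
    kept = np.count_nonzero(A) + np.count_nonzero(B)
    return A, B, W.size / kept

# Toy usage on a random "layer"; real gains come from trained weights,
# where low-rank structure and small weights are far more pronounced.
W = np.random.randn(4096, 4096).astype(np.float32)
A, B, reduction = factorize_then_prune(W, rank=512, prune_ratio=0.75)
print(f"parameter reduction: {reduction:.1f}x")
```

The appeal of combining the two steps is that they compress along different axes: factorization shrinks the matrix dimensions while pruning sparsifies what remains, so the reduction rates multiply rather than overlap.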