Abstract
Real-time application of deep learning algorithms is often hindered by high computational complexity and frequent memory accesses. Network pruning is a promising technique for addressing this problem. However, pruning usually produces irregular network connections that not only demand extra representation effort but also map poorly to parallel computation. We introduce structured sparsity at several scales for convolutional neural networks: feature map-wise, kernel-wise, and intra-kernel strided sparsity. This structured sparsity translates directly into computational resource savings on embedded computers, in parallel computing environments, and in hardware-based systems. To determine the importance of network connections and paths, the proposed method uses a particle filtering approach in which the importance weight of each particle is assigned by evaluating the misclassification rate of its corresponding connectivity pattern. The pruned network is then retrained to compensate for the losses introduced by pruning. When convolutions are implemented as matrix products, we show in particular that intra-kernel strided sparsity with a simple constraint can significantly reduce the size of the kernel and feature map tensors. The proposed work shows that when the pruning granularities are applied in combination, the CIFAR-10 network can be pruned by more than 70% with less than a 1% loss in accuracy.
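To make the intra-kernel strided idea concrete, the following is a minimal NumPy sketch, not the authors' implementation, of how a strided mask shared by all kernels in a layer lets the lowered (im2col-style) matrices shrink before the matrix product. The function names (`strided_mask`, `conv_as_matmul`), the stride of 2, and the toy shapes are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def strided_mask(n_weights, stride=2, offset=0):
    """Keep only every `stride`-th position inside the flattened kernel."""
    return (np.arange(n_weights) % stride) == offset

def conv_as_matmul(W, patches, mask=None):
    """
    W:       (out_ch, k)   kernels flattened into rows, k = in_ch * kh * kw
    patches: (k, n_pos)    im2col patch matrix, one column per output position
    With a mask shared by every kernel, the pruned positions correspond to
    whole columns of W and whole rows of `patches`, so both matrices can be
    shrunk before the product instead of multiplying by zeros.
    """
    if mask is not None:
        W, patches = W[:, mask], patches[mask, :]
    return W @ patches  # (out_ch, n_pos) output feature maps

# Toy usage: 4 output kernels over flattened 3x3x3 patches at 10 positions.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 27))
patches = rng.standard_normal((27, 10))
mask = strided_mask(27, stride=2)                   # keeps 14 of 27 weights
out_masked = conv_as_matmul(W * mask, patches)      # dense compute, zeroed weights
out_shrunk = conv_as_matmul(W, patches, mask=mask)  # smaller matrices, same output
assert np.allclose(out_masked, out_shrunk)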