Abstract
Real-time application of deep learning algorithms is often hindered by high computational complexity and frequent memory accesses. Network pruning is a promising technique for addressing this problem. However, pruning usually produces irregular network connections that not only demand extra representation effort but also map poorly to parallel computation. We introduce structured sparsity at several scales for convolutional neural networks: feature map-wise, kernel-wise, and intra-kernel strided sparsity. This structured sparsity translates directly into computational resource savings on embedded computers, in parallel computing environments, and in hardware-based systems. To determine the importance of network connections and paths, the proposed method uses a particle filtering approach in which the importance weight of each particle is assigned by evaluating the misclassification rate of its corresponding connectivity pattern. The pruned network is then retrained to compensate for the losses introduced by pruning. When convolutions are implemented as matrix products, we show in particular that intra-kernel strided sparsity with a simple constraint can significantly reduce the size of the kernel and feature map tensors. The proposed work shows that when the pruning granularities are applied in combination, the CIFAR-10 network can be pruned by more than 70% with less than a 1% loss in accuracy.
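To make the intra-kernel strided idea concrete, the following is a minimal NumPy sketch, not the authors' implementation, of how a strided mask shared by all kernels in a layer lets the lowered (im2col-style) matrices shrink before the matrix product. The function names (`strided_mask`, `conv_as_matmul`), the stride of 2, and the toy shapes are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def strided_mask(n_weights, stride=2, offset=0):
    """Keep only every `stride`-th position inside the flattened kernel."""
    return (np.arange(n_weights) % stride) == offset

def conv_as_matmul(W, patches, mask=None):
    """
    W:       (out_ch, k)   kernels flattened into rows, k = in_ch * kh * kw
    patches: (k, n_pos)    im2col patch matrix, one column per output position
    With a mask shared by every kernel, the pruned positions correspond to
    whole columns of W and whole rows of `patches`, so both matrices can be
    shrunk before the product instead of multiplying by zeros.
    """
    if mask is not None:
        W, patches = W[:, mask], patches[mask, :]
    return W @ patches  # (out_ch, n_pos) output feature maps

# Toy usage: 4 output kernels over flattened 3x3x3 patches at 10 positions.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 27))
patches = rng.standard_normal((27, 10))
mask = strided_mask(27, stride=2)                   # keeps 14 of 27 weights
out_masked = conv_as_matmul(W * mask, patches)      # dense compute, zeroed weights
out_shrunk = conv_as_matmul(W, patches, mask=mask)  # smaller matrices, same output
assert np.allclose(out_masked, out_shrunk)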