Structured Pruning of Deep Convolutional Neural Networks


Abstract

Real-time application of deep learning algorithms is often hindered by high computational complexity and frequent memory accesses. Network pruning is a promising technique for addressing this problem. However, pruning usually produces irregular network connections that not only require extra representation effort but also map poorly onto parallel computation. We introduce structured sparsity at several scales for convolutional neural networks: feature-map-wise, kernel-wise, and intra-kernel strided sparsity. This structured sparsity yields direct computational resource savings on embedded computers, in parallel computing environments, and in hardware-based systems. To determine the importance of network connections and paths, the proposed method uses a particle filtering approach, in which the importance weight of each particle is assigned by assessing the misclassification rate of the corresponding connectivity pattern. The pruned network is then retrained to compensate for the losses incurred by pruning. For convolutions implemented as matrix products, we show in particular that intra-kernel strided sparsity with a simple constraint can significantly reduce the size of the kernel and feature-map tensors. When the pruning granularities are applied in combination, the network trained on CIFAR-10 can be pruned by more than 70% with less than a 1% loss in accuracy.



Published in

ACM Journal on Emerging Technologies in Computing Systems, Volume 13, Issue 3
Special Issue on Hardware and Algorithms for Learning On-a-chip and Special Issue on Alternative Computing Systems
July 2017, 418 pages
ISSN: 1550-4832
EISSN: 1550-4840
DOI: 10.1145/3051701
Editor: Yuan Xie

      Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 9 February 2017
      • Accepted: 1 October 2016
      • Revised: 1 August 2016
      • Received: 1 March 2016
Published in JETC Volume 13, Issue 3


      Qualifiers

      • research-article
      • Research
      • Refereed
