State-of-the-art deep neural networks (DNNs) have greatly improved the accuracy of facial landmark localization. However, DNN models usually have a very large number of parameters, which leads to high memory cost and computational complexity. To address this issue, a novel method is proposed to compress and accelerate large DNN models while maintaining performance. It consists of three steps. (1) Importance-based pruning: unlike traditional connection pruning, weight correlations are used to identify and prune unimportant neurons or connections. (2) Product quantization: product quantization enforces weight sharing. With a codebook of the same size, it can achieve a higher compression rate than scalar quantization. (3) Network retraining: to reduce compression difficulty and performance degradation, the network is compressed one layer at a time and retrained after each layer. In addition, all pooling layers are removed and the strides of their neighboring convolutional layers are increased, which accelerates the network at the same time. Experimental results on compressing a VGG-like model demonstrate the effectiveness of the proposed method, which achieves 26× compression and 4× acceleration while the root mean squared error (RMSE) increases by just 3.6%.
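The product quantization step and its compression-rate argument can be illustrated with a minimal NumPy sketch. The abstract does not give the authors' exact procedure, so this is an assumption-laden illustration rather than their implementation: it assumes the common formulation in which the columns of a weight matrix are split into sub-spaces of length `sub_len`, each with its own codebook learned by plain Lloyd's k-means, and it assumes 32-bit weights with indices stored in log2(k) bits and no entropy coding. The function names (`kmeans`, `product_quantize`, `reconstruct`, `compression_rate`) are hypothetical.

```python
import math
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct input points.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = points[assign == j]
            if len(members):  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dists.argmin(axis=1)


def product_quantize(weights, sub_len, k):
    """Split the columns into sub-spaces of width sub_len and learn one
    k-entry codebook per sub-space (assumed formulation, see lead-in)."""
    rows, cols = weights.shape
    assert cols % sub_len == 0, "columns must divide evenly into sub-spaces"
    codebooks, codes = [], []
    for start in range(0, cols, sub_len):
        centroids, assign = kmeans(weights[:, start:start + sub_len], k)
        codebooks.append(centroids)
        codes.append(assign)
    # codebooks: (n_sub, k, sub_len); codes: (rows, n_sub)
    return np.stack(codebooks), np.stack(codes, axis=1)


def reconstruct(codebooks, codes):
    """Rebuild the approximate weight matrix from the stored indices."""
    blocks = [codebooks[s][codes[:, s]] for s in range(codes.shape[1])]
    return np.concatenate(blocks, axis=1)


def compression_rate(rows, cols, sub_len, k, n_codebooks, float_bits=32):
    """Original bits divided by (index bits + codebook bits).

    Scalar quantization is the case sub_len=1, n_codebooks=1.
    """
    n_weights = rows * cols
    index_bits = (n_weights // sub_len) * math.log2(k)
    codebook_bits = n_codebooks * k * sub_len * float_bits
    return n_weights * float_bits / (index_bits + codebook_bits)
```

Under this accounting, one index replaces `sub_len` weights instead of one, which is where the higher compression rate comes from. For a 4096 × 4096 layer with k = 256 and `sub_len = 4`, the formula gives roughly 4× for scalar quantization and 8× for product quantization. The advantage depends on the layer being large enough that the per-sub-space codebooks do not dominate the storage cost.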
- Compressing and Accelerating Neural Network for Facial Point Localization
- Springer US