Published in: International Journal of Computer Vision 7/2023

13-04-2023

Learning Accurate Performance Predictors for Ultrafast Automated Model Compression

Authors: Ziwei Wang, Jiwen Lu, Han Xiao, Shengyu Liu, Jie Zhou


Abstract

In this paper, we propose an ultrafast automated model compression framework called SeerNet for flexible network deployment. Conventional non-differentiable methods discretely search for the desirable compression policy based on the accuracy of exhaustively trained lightweight models, while existing differentiable methods optimize an extremely large supernet to obtain the required compressed model for deployment. Both incur heavy computational cost due to the complex compression policy search and evaluation process. In contrast, we obtain optimal efficient networks by directly optimizing the compression policy with an accurate performance predictor, achieving ultrafast automated model compression under various computational cost constraints without complex compression policy search and evaluation. Specifically, we first train the performance predictor on the accuracy of uncertain compression policies actively selected by efficient evolutionary search, so that informative supervision is provided to learn an accurate performance predictor at acceptable cost. We then follow the gradient that maximizes the predicted performance under the barrier complexity constraint for ultrafast acquisition of the desirable compression policy, where adaptive update stepsizes with momentum are employed to enhance the optimality of the acquired pruning and quantization strategy. Compared with state-of-the-art automated model compression methods, experimental results on image classification and object detection show that our method achieves competitive accuracy-complexity trade-offs with a significant reduction in search cost. Code is available at https://github.com/ZiweiWangTHU/SeerNet.
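The second stage described above — ascending the gradient of a learned performance predictor under a barrier-style complexity constraint, with momentum on the update — can be illustrated with a minimal toy sketch. Everything here is an assumption for illustration: `predicted_accuracy` and `complexity` are hypothetical stand-ins for the paper's learned predictor and cost model, not the SeerNet implementation, and the log-barrier and feasibility safeguard are one simple way to enforce the budget.

```python
import math

# Hypothetical stand-ins for the two learned ingredients: an accuracy
# predictor and a complexity model, both over a continuous per-layer
# compression policy p (e.g. relaxed keep-ratios or bit-widths).
def predicted_accuracy(p):
    # Toy predictor: accuracy peaks at a moderate keep-ratio of 0.6.
    return -sum((x - 0.6) ** 2 for x in p)

def complexity(p):
    return sum(p)  # toy cost model: larger keep-ratios cost more

def objective(p, c_max, mu=0.05):
    # Predicted accuracy plus a log-barrier keeping complexity(p) < c_max.
    slack = c_max - complexity(p)
    return -math.inf if slack <= 0 else predicted_accuracy(p) + mu * math.log(slack)

def grad(p, c_max, mu=0.05):
    # Analytic gradient of the barrier objective w.r.t. each policy entry.
    slack = c_max - complexity(p)
    return [-2.0 * (x - 0.6) - mu / slack for x in p]

def search_policy(n_layers=4, c_max=2.0, lr=0.02, beta=0.9, steps=500):
    p = [0.3] * n_layers          # feasible start: complexity 1.2 < c_max
    v = [0.0] * n_layers          # momentum buffer
    for _ in range(steps):
        g = grad(p, c_max)
        v = [beta * vi + (1 - beta) * gi for vi, gi in zip(v, g)]
        # Gradient *ascent* on predicted accuracy, clipped to valid ratios.
        p = [min(1.0, max(0.01, x + lr * vi)) for x, vi in zip(p, v)]
        if complexity(p) >= c_max:  # safeguard: step back inside the barrier
            s = (c_max - 1e-3) / complexity(p)
            p = [x * s for x in p]
            v = [0.0] * n_layers
    return p

p_star = search_policy()
```

In this toy, the unconstrained optimum (keep-ratio 0.6 per layer, total complexity 2.4) exceeds the budget of 2.0, so the barrier settles the policy just inside the constraint while the momentum term smooths the updates — mirroring, in spirit, how a differentiable policy search can avoid exhaustive discrete evaluation.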

Metadata
Title
Learning Accurate Performance Predictors for Ultrafast Automated Model Compression
Authors
Ziwei Wang
Jiwen Lu
Han Xiao
Shengyu Liu
Jie Zhou
Publication date
13-04-2023
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 7/2023
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-023-01783-0
