2024 | OriginalPaper | Chapter

AdaPQ: Adaptive Exploration Product Quantization with Adversary-Aware Block Size Selection Toward Compression Efficiency

Authors: Yan-Ting Ye, Ting-An Chen, Ming-Syan Chen

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore


Abstract

Product Quantization (PQ) has received increasing research attention due to its effectiveness in bit-width compression for memory efficiency. PQ divides weight values into blocks and applies clustering algorithms to group the blocks, assigning quantized values accordingly. Existing research has mainly focused on designing clustering strategies that minimize the error between the original weights and the quantized values so as to maintain performance. However, the block division, i.e., the selection of block size, determines the number of clusters and the compression rate, and has not been fully studied. Therefore, this paper proposes a novel scheme, AdaPQ, whose Adaptive Exploration Product Quantization process first flexibly constructs varying block sizes by padding the filter weights, which enlarges the search space of PQ quantization results and avoids settling on a sub-optimal solution. We further design a strategy, Adversary-aware Block Size Selection, which selects an appropriate block size for each layer by evaluating its performance sensitivity under perturbation, thereby obtaining a minor performance loss at a high compression rate. Experimental results show that AdaPQ achieves higher accuracy under a similar compression rate compared with the state-of-the-art.
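To make the pipeline concrete, the sketch below illustrates the two steps the abstract describes: blocking-and-clustering PQ, and a perturbation-based choice among candidate block sizes. This is a minimal NumPy illustration under our own assumptions (plain k-means, zero-padding, and a hypothetical `eval_loss` callback and `noise_scale`), not the authors' implementation.

```python
# A minimal sketch of the PQ pipeline described in the abstract, assuming
# NumPy only; block_size, n_clusters, eval_loss, and noise_scale are
# illustrative stand-ins, not the paper's exact procedure or settings.
import numpy as np

def product_quantize(weights, block_size, n_clusters, n_iters=20, seed=0):
    """Split a weight tensor into blocks, cluster the blocks with k-means,
    and replace each block by its codeword (the basic PQ step)."""
    w = weights.ravel()
    # Zero-pad so the length is a multiple of block_size; AdaPQ pads the
    # filter weights to make varying block sizes feasible per layer.
    pad = (-len(w)) % block_size
    blocks = np.concatenate([w, np.zeros(pad)]).reshape(-1, block_size)

    rng = np.random.default_rng(seed)
    codebook = blocks[rng.choice(len(blocks), n_clusters, replace=False)].copy()
    for _ in range(n_iters):
        # Assignment step: nearest codeword for every block.
        dists = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        # Update step: each codeword becomes the mean of its blocks.
        for k in range(n_clusters):
            members = blocks[assign == k]
            if len(members):
                codebook[k] = members.mean(axis=0)

    # Reconstruct the quantized weights and drop the padding again.
    return codebook[assign].ravel()[:len(w)].reshape(weights.shape)

def select_block_size(weights, candidate_sizes, n_clusters, eval_loss,
                      noise_scale=1e-2, seed=0):
    """Pick the block size whose quantized weights degrade least under a
    small random perturbation (a stand-in for the adversary-aware test)."""
    rng = np.random.default_rng(seed)
    best_size, best_drop = None, np.inf
    for bs in candidate_sizes:
        q = product_quantize(weights, bs, n_clusters)
        noise = noise_scale * rng.standard_normal(q.shape)
        drop = eval_loss(q + noise) - eval_loss(q)  # sensitivity proxy
        if drop < best_drop:
            best_size, best_drop = bs, drop
    return best_size
```

Here, a smaller loss increase under noise is read as lower sensitivity for that block size; the paper's actual adversary-aware criterion and training details are given in the chapter itself.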


Metadata
Title: AdaPQ: Adaptive Exploration Product Quantization with Adversary-Aware Block Size Selection Toward Compression Efficiency
Authors: Yan-Ting Ye, Ting-An Chen, Ming-Syan Chen
Copyright Year: 2024
Publisher: Springer Nature Singapore
DOI: https://doi.org/10.1007/978-981-97-2253-2_1
