2024 | OriginalPaper | Chapter

AdaPQ: Adaptive Exploration Product Quantization with Adversary-Aware Block Size Selection Toward Compression Efficiency

Authors: Yan-Ting Ye, Ting-An Chen, Ming-Syan Chen

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore


Abstract

Product Quantization (PQ) has received increasing research attention due to its effectiveness in bit-width compression for memory efficiency. PQ divides weight values into blocks and applies clustering algorithms to group the blocks, assigning quantized values accordingly. Existing research has mainly focused on designing clustering strategies that minimize the error between the original weights and the quantized values so as to maintain performance. However, the block division, i.e., the selection of block size, determines the number of clusters and the compression rate, and has not been fully studied. Therefore, this paper proposes a novel scheme, AdaPQ, whose Adaptive Exploration Product Quantization process first flexibly constructs varying block sizes by padding the filter weights, which enlarges the search space of PQ quantization results and avoids settling on a sub-optimal solution. We further design a strategy, Adversary-aware Block Size Selection, which selects an appropriate block size for each layer by evaluating its performance sensitivity under perturbation, thereby obtaining a minor performance loss at a high compression rate. Experimental results show that AdaPQ achieves higher accuracy under a similar compression rate compared with the state-of-the-art.
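To make the pipeline concrete, the sketch below illustrates the two steps the abstract describes: blocking-and-clustering PQ, and a perturbation-based choice among candidate block sizes. This is a minimal NumPy illustration under our own assumptions (plain k-means, zero-padding, and a hypothetical `eval_loss` callback and `noise_scale`), not the authors' implementation.

```python
# A minimal sketch of the PQ pipeline described in the abstract, assuming
# NumPy only; block_size, n_clusters, eval_loss, and noise_scale are
# illustrative stand-ins, not the paper's exact procedure or settings.
import numpy as np

def product_quantize(weights, block_size, n_clusters, n_iters=20, seed=0):
    """Split a weight tensor into blocks, cluster the blocks with k-means,
    and replace each block by its codeword (the basic PQ step)."""
    w = weights.ravel()
    # Zero-pad so the length is a multiple of block_size; AdaPQ pads the
    # filter weights to make varying block sizes feasible per layer.
    pad = (-len(w)) % block_size
    blocks = np.concatenate([w, np.zeros(pad)]).reshape(-1, block_size)

    rng = np.random.default_rng(seed)
    codebook = blocks[rng.choice(len(blocks), n_clusters, replace=False)].copy()
    for _ in range(n_iters):
        # Assignment step: nearest codeword for every block.
        dists = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        # Update step: each codeword becomes the mean of its blocks.
        for k in range(n_clusters):
            members = blocks[assign == k]
            if len(members):
                codebook[k] = members.mean(axis=0)

    # Reconstruct the quantized weights and drop the padding again.
    return codebook[assign].ravel()[:len(w)].reshape(weights.shape)

def select_block_size(weights, candidate_sizes, n_clusters, eval_loss,
                      noise_scale=1e-2, seed=0):
    """Pick the block size whose quantized weights degrade least under a
    small random perturbation (a stand-in for the adversary-aware test)."""
    rng = np.random.default_rng(seed)
    best_size, best_drop = None, np.inf
    for bs in candidate_sizes:
        q = product_quantize(weights, bs, n_clusters)
        noise = noise_scale * rng.standard_normal(q.shape)
        drop = eval_loss(q + noise) - eval_loss(q)  # sensitivity proxy
        if drop < best_drop:
            best_size, best_drop = bs, drop
    return best_size
```

Here, a smaller loss increase under noise is read as lower sensitivity for that block size; the paper's actual adversary-aware criterion and training details are given in the chapter itself.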


Metadata
Title: AdaPQ: Adaptive Exploration Product Quantization with Adversary-Aware Block Size Selection Toward Compression Efficiency
Authors: Yan-Ting Ye, Ting-An Chen, Ming-Syan Chen
Copyright Year: 2024
Publisher: Springer Nature Singapore
DOI: https://doi.org/10.1007/978-981-97-2253-2_1
