
13.08.2023 | Original Article

AFMPM: adaptive feature map pruning method based on feature distillation

Authors: Yufeng Guo, Weiwei Zhang, Junhuang Wang, Ming Ji, Chenghui Zhen, Zhengzheng Guo

Published in: International Journal of Machine Learning and Cybernetics | Issue 2/2024


Abstract

Feature distillation is a technique that transfers the intermediate-layer feature maps of a teacher network to a student network as knowledge. This feature information not only reflects the image content but also captures the feature-extraction ability of the teacher network. However, existing feature distillation methods lack theoretical guidance for evaluating feature maps and suffer from size mismatches between high-dimensional and low-dimensional feature maps as well as poor information utilization. In this paper, we propose an Adaptive Feature Map Pruning Method (AFMPM) for feature distillation, which recasts feature map pruning as an optimization problem so that the valid information in the feature maps is retained to the maximum extent. AFMPM achieves significant improvements in feature distillation, and its advanced performance and generality are verified through experiments on both the teacher-student distillation framework and the self-distillation framework.
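
To make the setting concrete, the following is a minimal, illustrative sketch (in PyTorch) of a feature-distillation loss in which the teacher's feature map is reduced to the student's channel count before being matched. The top-k channel selection by mean activation used here, together with all function names and tensor shapes, is an assumption made for illustration only; it stands in for the optimization-based adaptive pruning that AFMPM proposes and does not reproduce the authors' actual method.

```python
# Illustrative sketch only: a generic feature-distillation loss in which the
# teacher's feature map is pruned to the student's channel count by keeping
# the channels with the largest mean activation. This naive selection rule is
# a placeholder for the adaptive, optimization-based pruning of AFMPM.
import torch
import torch.nn.functional as F


def prune_teacher_channels(t_feat: torch.Tensor, keep: int) -> torch.Tensor:
    """Keep the `keep` teacher channels with the highest mean absolute activation.

    t_feat: teacher feature map of shape (N, C_t, H, W), with C_t >= keep.
    """
    # Per-channel importance score, averaged over batch and spatial dimensions.
    scores = t_feat.abs().mean(dim=(0, 2, 3))      # shape (C_t,)
    idx = torch.topk(scores, keep).indices         # indices of kept channels
    return t_feat[:, idx, :, :]                    # shape (N, keep, H, W)


def feature_distillation_loss(s_feat: torch.Tensor, t_feat: torch.Tensor) -> torch.Tensor:
    """MSE between the student feature map and a pruned teacher feature map.

    s_feat: (N, C_s, H_s, W_s); t_feat: (N, C_t, H_t, W_t) with C_t >= C_s.
    """
    t_pruned = prune_teacher_channels(t_feat, keep=s_feat.shape[1])
    # Resolve any spatial mismatch by interpolating the teacher map.
    if t_pruned.shape[2:] != s_feat.shape[2:]:
        t_pruned = F.interpolate(t_pruned, size=s_feat.shape[2:],
                                 mode="bilinear", align_corners=False)
    # The teacher is detached so only the student receives gradients.
    return F.mse_loss(s_feat, t_pruned.detach())


if __name__ == "__main__":
    student_map = torch.randn(8, 64, 16, 16)    # hypothetical student feature map
    teacher_map = torch.randn(8, 256, 32, 32)   # hypothetical teacher feature map
    print(feature_distillation_loss(student_map, teacher_map))
```

In this sketch the channel and spatial mismatches are resolved by channel selection and bilinear interpolation; the paper's own dimension-matching and pruning criteria may differ.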

Metadata
Title
AFMPM: adaptive feature map pruning method based on feature distillation
Authors
Yufeng Guo
Weiwei Zhang
Junhuang Wang
Ming Ji
Chenghui Zhen
Zhengzheng Guo
Publication date
13.08.2023
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 2/2024
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-023-01926-2
