31.08.2024 | Research

PDD: Pruning Neural Networks During Knowledge Distillation

Authors: Xi Dan, Wenjie Yang, Fuyan Zhang, Yihang Zhou, Zhuojun Yu, Zhen Qiu, Boyuan Zhao, Zeyu Dong, Libo Huang, Chuanguang Yang

Published in: Cognitive Computation


Abstract

Although deep neural networks have reached a high level of capability, their large computational requirements limit deployment on end devices. To address this, a variety of model compression and acceleration techniques have been developed. Among these, knowledge distillation has emerged as a popular approach in which a small student model is trained to mimic the performance of a larger teacher model. Existing knowledge distillation implicitly assumes that the hand-designed student architecture is already compact, yet such architectures are often suboptimal and contain redundancy, which raises questions about the validity of this assumption in practice. This study investigates this assumption and empirically demonstrates that student models can contain redundancy that pruning can remove without significant performance degradation. We therefore propose a novel pruning method to eliminate redundancy in student models. Instead of applying traditional post-training pruning, we perform pruning during knowledge distillation (PDD) so that important information transferred from the teacher model to the student model is not lost. This is achieved by attaching a differentiable mask to each convolutional layer, which dynamically adjusts the channels to be pruned based on the loss. Experimental results show that with ResNet20 as the student model and ResNet56 as the teacher model, PDD reduces FLOPs by 39.53% and parameters by 32.77% while increasing top-1 accuracy on CIFAR-10 by 0.17%. With VGG11 as the student model and VGG16 as the teacher model, it reduces FLOPs by 74.96% and parameters by 76.43% with only a 1.34% drop in top-1 accuracy on CIFAR-10. Our code is available at https://github.com/YihangZhou0424/PDD-Pruning-during-distillation.
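To make the channel-masking idea concrete, the following is a minimal PyTorch-style sketch of a differentiable per-channel mask attached to a convolutional layer, combined with a standard distillation objective plus a sparsity penalty. It is an illustration under assumed design choices: the sigmoid gating, the L1-style penalty, and the names MaskedConv2d and distillation_loss are hypothetical, not the authors' released implementation (see the linked repository for the actual code).

# Illustrative sketch only: a differentiable per-channel mask on a conv layer,
# in the spirit of "pruning during distillation". Names and weightings are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedConv2d(nn.Module):
    """Convolution whose output channels are gated by a learnable, differentiable mask."""
    def __init__(self, in_ch, out_ch, kernel_size, **kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, **kwargs)
        # One learnable logit per output channel; the sigmoid keeps the mask in (0, 1),
        # so it stays differentiable and can be driven toward 0 (prune) or 1 (keep).
        self.mask_logits = nn.Parameter(torch.zeros(out_ch))

    def forward(self, x):
        mask = torch.sigmoid(self.mask_logits)           # soft channel mask
        return self.conv(x) * mask.view(1, -1, 1, 1)     # gate each output channel

def distillation_loss(student_logits, teacher_logits, labels, mask_params,
                      T=4.0, alpha=0.9, lam=1e-4):
    """Cross-entropy + KD term + sparsity penalty on the masks (hypothetical weighting)."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    sparsity = sum(torch.sigmoid(m).sum() for m in mask_params)  # push channels toward zero
    return (1 - alpha) * ce + alpha * kd + lam * sparsity

After distillation, channels whose learned mask values fall below a chosen threshold would be removed and the slimmed student optionally fine-tuned; PDD's exact masking function, loss weighting, and pruning criterion may differ from this sketch.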

Metadata
Title
PDD: Pruning Neural Networks During Knowledge Distillation
Authors
Xi Dan
Wenjie Yang
Fuyan Zhang
Yihang Zhou
Zhuojun Yu
Zhen Qiu
Boyuan Zhao
Zeyu Dong
Libo Huang
Chuanguang Yang
Publication date
31.08.2024
Publisher
Springer US
Published in
Cognitive Computation
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI
https://doi.org/10.1007/s12559-024-10350-9