Published in: Optical Memory and Neural Networks 4/2023

01.12.2023

Enhancement of Knowledge Distillation via Non-Linear Feature Alignment

Authors: Jiangxiao Zhang, Feng Gao, Lina Huo, Hongliang Wang, Ying Dang

Abstract

Deploying AI models on resource-constrained devices is a challenging task: models must have a small parameter count while maintaining high performance, and striking this balance between model size and accuracy is essential for efficient and effective deployment in such environments. Knowledge distillation (KD) is an important model compression technique in which a small student model learns from a larger teacher model, leveraging the teacher's high-performance features to improve the student and ultimately match or even surpass the teacher's performance. This paper presents a pipeline-based knowledge distillation method that improves model performance through non-linear feature alignment (FA) applied after the feature extraction stage. We conducted experiments on both single-teacher and multi-teacher distillation, and extensive experimentation demonstrates that our method improves the accuracy of knowledge distillation on top of existing KD loss functions and further boosts the performance of small models.
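
To make the idea concrete, the following is a minimal PyTorch sketch of how a non-linear feature-alignment term could be combined with a standard KD loss after the feature extraction stage. The module name NonLinearFeatureAlign, the two-layer MLP projector, the MSE matching term, and the weights T, alpha and beta are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLinearFeatureAlign(nn.Module):
    # Hypothetical FA module: a small non-linear MLP that projects pooled
    # student features into the teacher's feature space before comparison.
    def __init__(self, student_dim: int, teacher_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(student_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, teacher_dim),
        )

    def forward(self, student_feat: torch.Tensor) -> torch.Tensor:
        return self.proj(student_feat)

def kd_with_feature_alignment(student_logits, teacher_logits,
                              student_feat, teacher_feat,
                              fa_module, labels,
                              T=4.0, alpha=0.5, beta=1.0):
    # Standard cross-entropy on ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    # Hinton-style KD term on temperature-softened logits.
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    # Feature-alignment term: non-linearly projected student features are
    # matched to the teacher's features (MSE used here as a placeholder).
    fa = F.mse_loss(fa_module(student_feat), teacher_feat)
    return ce + alpha * kd + beta * fa

In a typical single-teacher setup, student_feat and teacher_feat would be the pooled backbone features of the student and teacher after the feature extraction stage, and fa_module would be trained jointly with the student.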


Metadata
Title
Enhancement of Knowledge Distillation via Non-Linear Feature Alignment
Authors
Jiangxiao Zhang
Feng Gao
Lina Huo
Hongliang Wang
Ying Dang
Publication date
01.12.2023
Publisher
Pleiades Publishing
Published in
Optical Memory and Neural Networks / Issue 4/2023
Print ISSN: 1060-992X
Electronic ISSN: 1934-7898
DOI
https://doi.org/10.3103/S1060992X23040136
