
03-05-2022 | Original Article

RD-NMSVM: neural mapping support vector machine based on parameter regularization and knowledge distillation

Authors: Jidong Han, Ting Zhang, Yujian Li, Zhaoying Liu

Published in: International Journal of Machine Learning and Cybernetics | Issue 9/2022


Abstract

Artificial neural network (ANN) models have achieved remarkable results in many fields, raising expectations that they can attain human-like intelligence. At present, however, ANNs still cannot perform continual learning the way humans do; this serious defect is known as the catastrophic forgetting problem. To address it, we propose a novel method called the neural mapping support vector machine based on parameter regularization and knowledge distillation, or RD-NMSVM for short. Our model consists of three parts: first, a shared neural network module that extracts features common to different tasks; second, a task-specific module that uses a multi-class support vector machine as the classifier, which is equivalent to using the neural network as the neural kernel mapping of the support vector machine; and third, a parameter regularization and knowledge distillation module that keeps the parameters of the shared module from changing too much and preserves previously learned knowledge. Note that RD-NMSVM does not use any samples from previous tasks. Our experiments show that RD-NMSVM has clear advantages in eliminating catastrophic forgetting in ANN models.
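The abstract does not give the exact training objective, so the following PyTorch-style sketch is only an illustration of the three-part structure it describes, under assumptions of our own: a multi-class hinge loss for the SVM heads, an L2 penalty pulling the shared module toward its previous parameters, and a soft-target distillation term computed with a frozen copy of the old model. It is not the authors' implementation, and all names and weightings here are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RDNMSVMSketch(nn.Module):
    """Hypothetical sketch of the three-module structure described in the abstract."""

    def __init__(self, input_dim=28 * 28, feature_dim=256, classes_per_task=(10,)):
        super().__init__()
        # 1) Shared neural network module: extracts features common to all tasks
        #    and acts as the neural kernel mapping of the SVM.
        self.shared = nn.Sequential(
            nn.Flatten(),
            nn.Linear(input_dim, feature_dim), nn.ReLU(),
            nn.Linear(feature_dim, feature_dim), nn.ReLU(),
        )
        # 2) Task-specific modules: one linear multi-class SVM head per task.
        self.heads = nn.ModuleList(
            [nn.Linear(feature_dim, c) for c in classes_per_task]
        )

    def forward(self, x, task_id=0):
        return self.heads[task_id](self.shared(x))


def rd_nmsvm_loss(model, old_model, old_shared_params, x, y, task_id,
                  lam_reg=1.0, lam_kd=1.0, temperature=2.0):
    """Assumed combined loss: multi-class hinge (SVM) + parameter regularization
    + knowledge distillation on previous-task heads, using only current-task samples."""
    scores = model(x, task_id)
    # Multi-class hinge loss for the SVM head of the current task.
    svm_loss = F.multi_margin_loss(scores, y)

    # 3a) Parameter regularization: discourage large updates of the shared module
    #     relative to its parameters after the previous task (assumed L2 form).
    reg_loss = sum(
        ((p - p_old) ** 2).sum()
        for p, p_old in zip(model.shared.parameters(), old_shared_params)
    )

    # 3b) Knowledge distillation: match the frozen old model's outputs on the
    #     heads of previous tasks, without storing any samples from those tasks.
    kd_loss = x.new_zeros(())
    for t in range(task_id):
        with torch.no_grad():
            old_scores = old_model(x, t)
        kd_loss = kd_loss + F.kl_div(
            F.log_softmax(model(x, t) / temperature, dim=1),
            F.softmax(old_scores / temperature, dim=1),
            reduction="batchmean",
        )

    return svm_loss + lam_reg * reg_loss + lam_kd * kd_loss

In this reading, the shared module plays the role of a learned kernel mapping, the per-task linear heads are the SVM classifiers, and the two penalty terms are what suppress catastrophic forgetting; the specific loss forms and coefficients above are assumptions, not taken from the paper.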

Metadata
Title
RD-NMSVM: neural mapping support vector machine based on parameter regularization and knowledge distillation
Authors
Jidong Han
Ting Zhang
Yujian Li
Zhaoying Liu
Publication date
03-05-2022
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 9/2022
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-022-01563-1
