Published in: Neural Processing Letters 1/2018

20.09.2017

Knowledge Fusion in Feedforward Artificial Neural Networks

Authors: Milad I. Akhlaghi, Sergey V. Sukhov

Abstract

Artificial neural networks are well-known computational models that have successfully demonstrated various human cognitive capabilities. Nevertheless, unlike the human brain, neural networks usually have to learn each new task from scratch. Furthermore, in contrast to human abilities, re-training a network on a new task does not necessarily preserve previously learned information and may lead to catastrophic forgetting. A well-established method for transferring knowledge between neural networks could alleviate these issues. In this paper, we propose a method to fuse the knowledge contained in separately trained networks. The method is non-iterative and requires neither the original training data nor additional training sessions. We present the theoretical basis of the model, which is built on a probabilistic approach, and test its performance for feedforward neural networks on classification tasks using several publicly available data sets.
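To make the problem setting concrete, the sketch below shows the conventional output-level baseline that fusion methods like the one proposed here aim to improve on: two separately trained classifiers are combined by averaging their predicted class probabilities. All weights, shapes, and function names are hypothetical illustrations; this is a standard ensemble baseline, not the paper's probabilistic weight-fusion method, which requires access to the full text.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(weights, x):
    # A single linear layer followed by softmax, standing in for a
    # trained feedforward classifier (hypothetical weights).
    logits = [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in weights]
    return softmax(logits)

def fuse_outputs(nets, x):
    # Output-level fusion: average the class-probability vectors of the
    # individual networks. Unlike the paper's method, this keeps every
    # network around at inference time instead of producing one fused net.
    probs = [forward(w, x) for w in nets]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]

# Two hypothetical trained 2-class networks over 3 input features.
net_a = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]
net_b = [[0.8, 0.2, 0.3], [0.1, 0.9, -0.2]]
x = [1.0, 2.0, 0.5]

fused = fuse_outputs([net_a, net_b], x)
print(fused)  # a valid probability distribution over the two classes
```

The practical drawback this baseline illustrates is that every member network must be stored and evaluated at inference time, which is exactly the cost that fusing knowledge into a single network avoids.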


Metadata
Title
Knowledge Fusion in Feedforward Artificial Neural Networks
Authors
Milad I. Akhlaghi
Sergey V. Sukhov
Publication date
20.09.2017
Publisher
Springer US
Published in
Neural Processing Letters / Issue 1/2018
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-017-9712-5
