Published in: Neural Computing and Applications 11/2019

24.07.2018 | Original Article

Adaptive structure learning method of deep belief network using neuron generation–annihilation and layer generation

Authors: Shin Kamada, Takumi Ichimura, Akira Hara, Kenneth J. Mackin



Abstract

Deep learning has recently received renewed attention in the field of artificial intelligence. A deep belief network (DBN) has a deep network architecture that can represent multiple features of input patterns hierarchically, using pre-trained restricted Boltzmann machines (RBMs). Such deep architectures achieve substantially higher classification accuracy on many tasks than previous methods. However, determining the various parameters needed to design an effective deep network architecture is difficult even for experienced designers, since traditional RBMs and DBNs cannot change their network structure during training. An adaptive structure learning method has previously been proposed for finding the optimal number of hidden neurons in multilayered neural networks; it employs a neuron generation–annihilation algorithm that observes the variance of the weight decays. We develop an adaptive structure learning method for RBMs and DBNs using neuron generation–annihilation and a layer generation algorithm, driven by the observed variance of selected parameters. The effectiveness of the proposed model was verified by tenfold cross-validation on the benchmark data sets CIFAR-10 and CIFAR-100. The adaptive DBN achieved the highest classification accuracy (97.4% on CIFAR-10 and 81.2% on CIFAR-100) among several recent DBN- and CNN-based methods.
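The core idea of the abstract, growing and pruning an RBM's hidden layer by watching parameter variance during training, can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's actual criterion: the threshold names (`gen_threshold`, `ann_threshold`), their values, and the choice of per-neuron weight variance as the monitored quantity are all hypothetical stand-ins for the variance-based conditions the authors define.

```python
import numpy as np

def adapt_hidden_layer(W, gen_threshold=0.5, ann_threshold=1e-3):
    """One hypothetical generation/annihilation pass over an RBM weight matrix.

    W: (n_visible, n_hidden) weight matrix.
    Returns a new weight matrix with hidden neurons added and/or removed.
    """
    # Variance of each hidden neuron's incoming weight vector.
    variances = W.var(axis=0)

    # Generation: a large variance suggests the neuron is "fluctuating",
    # i.e. it lacks capacity, so split it into parent + slightly perturbed copy.
    new_cols = [W]
    for j in np.where(variances > gen_threshold)[0]:
        child = W[:, [j]] + 0.01 * np.random.randn(W.shape[0], 1)
        new_cols.append(child)
    W = np.hstack(new_cols)

    # Annihilation: a near-zero variance suggests the neuron contributes
    # almost nothing to the learned representation, so remove it.
    keep = W.var(axis=0) > ann_threshold
    return W[:, keep]

# Deterministic toy example: column 0 has high variance (gets duplicated),
# columns 1 and 2 are nearly constant (get annihilated).
W = np.array([[1.0, 0.0, 0.001],
              [-1.0, 0.0, 0.001],
              [1.0, 0.0, 0.001],
              [-1.0, 0.0, 0.001]])
print(adapt_hidden_layer(W).shape)  # → (4, 2)
```

In the paper's method this check would run periodically during contrastive-divergence training, with layer generation applied analogously at the DBN level once the per-layer structure stabilizes.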


Metadata
Title
Adaptive structure learning method of deep belief network using neuron generation–annihilation and layer generation
Authors
Shin Kamada
Takumi Ichimura
Akira Hara
Kenneth J. Mackin
Publication date
24.07.2018
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 11/2019
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-018-3622-y
