Published in: Neural Computing and Applications 11/2019

24.07.2018 | Original Article

Adaptive structure learning method of deep belief network using neuron generation–annihilation and layer generation

Authors: Shin Kamada, Takumi Ichimura, Akira Hara, Kenneth J. Mackin



Abstract

Deep learning has recently received renewed attention in the field of artificial intelligence. A deep belief network (DBN) has a deep network architecture that can represent multiple features of input patterns hierarchically, using pre-trained restricted Boltzmann machines (RBMs). Such deep architectures achieve substantially higher classification accuracy on many tasks than previous methods. However, determining the various parameters needed to design an effective deep network architecture is difficult even for experienced designers, since traditional RBMs and DBNs cannot change their network structure during training. An adaptive structure learning method has previously been proposed for finding the optimal number of hidden neurons in multilayered neural networks; it employs a neuron generation–annihilation algorithm that observes the variance of the weight decays. We develop an adaptive structure learning method for RBMs and DBNs using neuron generation–annihilation and a layer generation algorithm, driven by the observed variance of selected parameters. The effectiveness of the proposed model was verified by tenfold cross-validation on the benchmark data sets CIFAR-10 and CIFAR-100. The adaptive DBN achieved the highest classification accuracy (97.4% on CIFAR-10 and 81.2% on CIFAR-100) among several recent DBN- and CNN-based methods.
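The core idea of the abstract, growing and pruning an RBM's hidden layer by watching parameter variance during training, can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's actual criterion: the threshold names (`gen_threshold`, `ann_threshold`), their values, and the choice of per-neuron weight variance as the monitored quantity are all hypothetical stand-ins for the variance-based conditions the authors define.

```python
import numpy as np

def adapt_hidden_layer(W, gen_threshold=0.5, ann_threshold=1e-3):
    """One hypothetical generation/annihilation pass over an RBM weight matrix.

    W: (n_visible, n_hidden) weight matrix.
    Returns a new weight matrix with hidden neurons added and/or removed.
    """
    # Variance of each hidden neuron's incoming weight vector.
    variances = W.var(axis=0)

    # Generation: a large variance suggests the neuron is "fluctuating",
    # i.e. it lacks capacity, so split it into parent + slightly perturbed copy.
    new_cols = [W]
    for j in np.where(variances > gen_threshold)[0]:
        child = W[:, [j]] + 0.01 * np.random.randn(W.shape[0], 1)
        new_cols.append(child)
    W = np.hstack(new_cols)

    # Annihilation: a near-zero variance suggests the neuron contributes
    # almost nothing to the learned representation, so remove it.
    keep = W.var(axis=0) > ann_threshold
    return W[:, keep]

# Deterministic toy example: column 0 has high variance (gets duplicated),
# columns 1 and 2 are nearly constant (get annihilated).
W = np.array([[1.0, 0.0, 0.001],
              [-1.0, 0.0, 0.001],
              [1.0, 0.0, 0.001],
              [-1.0, 0.0, 0.001]])
print(adapt_hidden_layer(W).shape)  # → (4, 2)
```

In the paper's method this check would run periodically during contrastive-divergence training, with layer generation applied analogously at the DBN level once the per-layer structure stabilizes.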


Metadata
Title
Adaptive structure learning method of deep belief network using neuron generation–annihilation and layer generation
Authors
Shin Kamada
Takumi Ichimura
Akira Hara
Kenneth J. Mackin
Publication date
24.07.2018
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 11/2019
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-018-3622-y
