2017 | OriginalPaper | Chapter

The Study of Architecture MLP with Linear Neurons in Order to Eliminate the “Vanishing Gradient” Problem

Authors: Janusz Kolbusz, Pawel Rozycki, Bogdan M. Wilamowski

Published in: Artificial Intelligence and Soft Computing

Publisher: Springer International Publishing

Abstract

Research in deep neural networks is becoming popular in artificial intelligence. While such networks are very powerful, they are difficult to train; the main cause of this difficulty is the vanishing gradient problem, which worsens as the number of layers increases. The paper discusses the capabilities of different neural network architectures and proposes a new multilayer architecture with additional linear neurons that is much easier to train than a traditional MLP network and reduces the effect of vanishing gradients. The efficiency of the suggested approach has been confirmed by several experiments.
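The page does not reproduce the architecture itself, but the core idea in the abstract, hidden layers augmented with linear neurons whose outputs pass through unsquashed, can be sketched. Below is a minimal, hypothetical PyTorch illustration, assuming the linear neurons are concatenated alongside the tanh neurons in each layer; the class names, parameters, and concatenation scheme are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: an MLP whose hidden layers contain extra linear
# neurons. The linear units provide a path whose derivative is constant,
# so gradients reaching deep layers are not repeatedly multiplied by
# tanh derivatives smaller than one. All names here are illustrative.
import torch
import torch.nn as nn

class LinearAugmentedLayer(nn.Module):
    def __init__(self, in_features: int, hidden: int, linear_units: int):
        super().__init__()
        self.nonlinear = nn.Linear(in_features, hidden)       # tanh neurons
        self.linear = nn.Linear(in_features, linear_units)    # linear neurons

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Concatenate saturating and linear outputs: error signals can
        # flow back through the linear part without being attenuated.
        return torch.cat([torch.tanh(self.nonlinear(x)),
                          self.linear(x)], dim=-1)

class LinearAugmentedMLP(nn.Module):
    def __init__(self, in_features: int, hidden: int, linear_units: int,
                 depth: int, out_features: int):
        super().__init__()
        layers, width = [], in_features
        for _ in range(depth):
            layers.append(LinearAugmentedLayer(width, hidden, linear_units))
            width = hidden + linear_units   # layer output width
        self.body = nn.Sequential(*layers)
        self.head = nn.Linear(width, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.body(x))

# Usage example: a 10-hidden-layer network on 8 inputs.
model = LinearAugmentedMLP(in_features=8, hidden=16, linear_units=4,
                           depth=10, out_features=1)
y = model(torch.randn(32, 8))
```

Because the linear path's derivative is constant, the backpropagated error is not squashed at every layer, which is the intuition behind how extra linear neurons can mitigate vanishing gradients in deep MLPs.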


Metadata
Title
The Study of Architecture MLP with Linear Neurons in Order to Eliminate the “Vanishing Gradient” Problem
Authors
Janusz Kolbusz
Pawel Rozycki
Bogdan M. Wilamowski
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-59063-9_9