2017 | OriginalPaper | Chapter

The Study of Architecture MLP with Linear Neurons in Order to Eliminate the “Vanishing Gradient” Problem

Authors: Janusz Kolbusz, Pawel Rozycki, Bogdan M. Wilamowski

Published in: Artificial Intelligence and Soft Computing

Publisher: Springer International Publishing

Abstract

Research in deep neural networks is becoming popular in artificial intelligence. While such networks are very powerful, they are difficult to train; the main cause of this difficulty is the vanishing gradient problem, which worsens as the number of layers increases. The paper discusses the capabilities of different neural network architectures and proposes a new multilayer architecture with additional linear neurons that is much easier to train than a traditional MLP network and reduces the effect of vanishing gradients. The efficiency of the suggested approach has been confirmed by several experiments.
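The page does not reproduce the architecture itself, but the core idea in the abstract, hidden layers augmented with linear neurons whose outputs pass through unsquashed, can be sketched. Below is a minimal, hypothetical PyTorch illustration, assuming the linear neurons are concatenated alongside the tanh neurons in each layer; the class names, parameters, and concatenation scheme are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: an MLP whose hidden layers contain extra linear
# neurons. The linear units provide a path whose derivative is constant,
# so gradients reaching deep layers are not repeatedly multiplied by
# tanh derivatives smaller than one. All names here are illustrative.
import torch
import torch.nn as nn

class LinearAugmentedLayer(nn.Module):
    def __init__(self, in_features: int, hidden: int, linear_units: int):
        super().__init__()
        self.nonlinear = nn.Linear(in_features, hidden)       # tanh neurons
        self.linear = nn.Linear(in_features, linear_units)    # linear neurons

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Concatenate saturating and linear outputs: error signals can
        # flow back through the linear part without being attenuated.
        return torch.cat([torch.tanh(self.nonlinear(x)),
                          self.linear(x)], dim=-1)

class LinearAugmentedMLP(nn.Module):
    def __init__(self, in_features: int, hidden: int, linear_units: int,
                 depth: int, out_features: int):
        super().__init__()
        layers, width = [], in_features
        for _ in range(depth):
            layers.append(LinearAugmentedLayer(width, hidden, linear_units))
            width = hidden + linear_units   # layer output width
        self.body = nn.Sequential(*layers)
        self.head = nn.Linear(width, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.body(x))

# Usage example: a 10-hidden-layer network on 8 inputs.
model = LinearAugmentedMLP(in_features=8, hidden=16, linear_units=4,
                           depth=10, out_features=1)
y = model(torch.randn(32, 8))
```

Because the linear path's derivative is constant, the backpropagated error is not squashed at every layer, which is the intuition behind how extra linear neurons can mitigate vanishing gradients in deep MLPs.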


Metadata
Title
The Study of Architecture MLP with Linear Neurons in Order to Eliminate the “Vanishing Gradient” Problem
Authors
Janusz Kolbusz
Pawel Rozycki
Bogdan M. Wilamowski
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-59063-9_9