
02-11-2016 | Focus

Closed determination of the number of neurons in the hidden layer of a multi-layered perceptron network

Author: Angel Kuri-Morales

Published in: Soft Computing | Issue 3/2017


Abstract

Multi-layered perceptron networks (MLPs) have been proven to be universal approximators. However, to take advantage of this theoretical result, we must determine H, the smallest number of units in the hidden layer. Two basic, theoretically established requirements are that an adequate activation function be selected and a proper training algorithm be applied. We must also guarantee that (a) the training data comply with the demands of the universal approximation theorem (UAT) and (b) the amount of information present in the training data be determined. We discuss how to preprocess the data in order to meet such demands. Once this is done, a closed formula for H may be applied. Knowing H implies that, in practice, any unknown function associated with the training data may be arbitrarily well approximated by an MLP. We take advantage of previous work in which a complexity-regularization approach tried to minimize the RMS training error. In that work, an algebraic expression for H was sought by sequential trial and error. In contrast, here we find a closed formula \(H=f(m_{O}, N)\), where \(m_{O}\) is the number of units in the input layer and N is the effective size of the training data. The algebraic expression we derive stems from statistically determined lower bounds of H over a range of interest of the \((m_{O}, N)\) pairs. The resulting sequence of 4250 triples \((H, m_{O}, N)\) is replaced by a single 12-term bivariate polynomial. To determine its 12 coefficients and the degrees of the 12 associated terms, a genetic algorithm was applied. The validity of the resulting formula is tested by determining the architectures of twelve MLPs for as many problems and verifying that the RMS error is minimized when using it to determine H.
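To make the shape of such a closed formula concrete, the following Python sketch evaluates a 12-term bivariate polynomial of the kind described above. The coefficient/exponent triples in TERMS and the helper hidden_units are hypothetical placeholders for illustration only; the paper's actual fitted values are not reproduced here.

```python
import math

# A minimal sketch under stated assumptions: each of the 12 terms has the
# form c_k * m_O**p_k * N**q_k, with the triples (c_k, p_k, q_k) found by a
# genetic algorithm in the paper. The values below are placeholders, NOT the
# coefficients and degrees actually derived in the paper.
TERMS = [
    (0.50, 1, 0), (0.02, 0, 1), (0.10, 1, 1), (-0.001, 2, 0),
    (0.0005, 0, 2), (0.30, 0, 0), (0.01, 2, 1), (-0.0001, 1, 2),
    (0.002, 3, 0), (0.00001, 0, 3), (-0.0005, 2, 2), (0.05, 3, 1),
]

def hidden_units(m_o: int, n_effective: int) -> int:
    """Evaluate the bivariate polynomial at (m_O, N) and round up to an
    integer number of hidden neurons."""
    h = sum(c * (m_o ** p) * (n_effective ** q) for c, p, q in TERMS)
    return max(1, math.ceil(h))

# Example: a hypothetical problem with 8 input units and an effective
# training-set size of 1000 samples.
print(hidden_units(8, 1000))
```

Rounding up is a natural choice here because the polynomial interpolates statistically determined lower bounds of H, so the integer architecture should not fall below the fitted surface.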


Metadata

Title: Closed determination of the number of neurons in the hidden layer of a multi-layered perceptron network
Author: Angel Kuri-Morales
Publication date: 02-11-2016
Publisher: Springer Berlin Heidelberg
Published in: Soft Computing / Issue 3/2017
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI: https://doi.org/10.1007/s00500-016-2416-3
