Published in: Neural Processing Letters 2/2020

25.07.2020

An Analysis of Activation Function Saturation in Particle Swarm Optimization Trained Neural Networks

Authors: Cody Dennis, Andries P. Engelbrecht, Beatrice M. Ombuki-Berman


Abstract

The activation functions used in an artificial neural network define how its nodes respond to input, directly influence the shape of the error surface, and play a role in the difficulty of the neural network training problem. The choice of activation function is therefore a significant question that must be addressed when applying a neural network to a problem. One issue that must be considered when selecting an activation function is activation function saturation. Saturation occurs when a bounded activation function primarily outputs values close to its bounds. Excessive saturation damages the network's ability to encode information and may prevent successful training. Common functions such as the logistic and hyperbolic tangent functions have been shown to exhibit saturation when the neural network is trained using particle swarm optimization. This study proposes a new measure of activation function saturation, evaluates the saturation behavior of eight common activation functions, and evaluates six mechanisms for controlling activation function saturation in particle swarm optimization based neural network training. Activation functions that result in low levels of saturation are identified. For each activation function, recommendations are made regarding which saturation control mechanism is most effective at reducing saturation.
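
For intuition, consider the hyperbolic tangent, which is bounded to (-1, 1): a hidden layer is saturated when most of its outputs cluster near those bounds, which typically happens when weight magnitudes grow large, as they can under unconstrained particle swarm optimization. The sketch below computes a simple frequency-based proxy, the fraction of activations lying within a small margin of the bounds. This is only an illustration of the concept, not the saturation measure proposed in the paper; the margin eps and the randomly weighted layer are hypothetical choices.

```python
import numpy as np

def saturation_fraction(activations, lower=-1.0, upper=1.0, eps=0.05):
    """Fraction of activations within eps (relative to the range) of either bound.

    A simple frequency-based proxy for saturation; NOT the measure
    proposed in the paper, just an illustration of the concept.
    """
    a = np.asarray(activations)
    margin = eps * (upper - lower)
    return float(np.mean((a <= lower + margin) | (a >= upper - margin)))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))           # 100 input patterns, 10 features
W = rng.normal(scale=5.0, size=(10, 8))  # large weights drive tanh toward +-1
H = np.tanh(X @ W)                       # hidden-layer activations in (-1, 1)
print(f"fraction of saturated activations: {saturation_fraction(H):.2f}")
```

With the large weight scale above, most tanh outputs land near the bounds; shrinking the weight scale sharply lowers the reported fraction, which is the kind of effect saturation control mechanisms aim to achieve during training.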


Metadata
Title
An Analysis of Activation Function Saturation in Particle Swarm Optimization Trained Neural Networks
Authors
Cody Dennis
Andries P. Engelbrecht
Beatrice M. Ombuki-Berman
Publication date
25.07.2020
Publisher
Springer US
Published in
Neural Processing Letters / Issue 2/2020
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-020-10290-z
