Published in: Neural Processing Letters 2/2020

25.07.2020

An Analysis of Activation Function Saturation in Particle Swarm Optimization Trained Neural Networks

Authors: Cody Dennis, Andries P. Engelbrecht, Beatrice M. Ombuki-Berman


Abstract

The activation functions used in an artificial neural network define how its nodes respond to input, directly influence the shape of the error surface, and play a role in the difficulty of the neural network training problem. The choice of activation function is therefore a significant question that must be addressed when applying a neural network to a problem. One issue that must be considered when selecting an activation function is activation function saturation. Saturation occurs when a bounded activation function primarily outputs values close to its bounds. Excessive saturation damages the network's ability to encode information and may prevent successful training. Common functions such as the logistic and hyperbolic tangent functions have been shown to exhibit saturation when the neural network is trained using particle swarm optimization. This study proposes a new measure of activation function saturation, evaluates the saturation behavior of eight common activation functions, and evaluates six mechanisms for controlling activation function saturation in particle swarm optimization based neural network training. Activation functions that result in low levels of saturation are identified. For each activation function, recommendations are made regarding which saturation control mechanism is most effective at reducing saturation.
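
For intuition, consider the hyperbolic tangent, which is bounded to (-1, 1): a hidden layer is saturated when most of its outputs cluster near those bounds, which typically happens when weight magnitudes grow large, as they can under unconstrained particle swarm optimization. The sketch below computes a simple frequency-based proxy, the fraction of activations lying within a small margin of the bounds. This is only an illustration of the concept, not the saturation measure proposed in the paper; the margin eps and the randomly weighted layer are hypothetical choices.

```python
import numpy as np

def saturation_fraction(activations, lower=-1.0, upper=1.0, eps=0.05):
    """Fraction of activations within eps (relative to the range) of either bound.

    A simple frequency-based proxy for saturation; NOT the measure
    proposed in the paper, just an illustration of the concept.
    """
    a = np.asarray(activations)
    margin = eps * (upper - lower)
    return float(np.mean((a <= lower + margin) | (a >= upper - margin)))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))           # 100 input patterns, 10 features
W = rng.normal(scale=5.0, size=(10, 8))  # large weights drive tanh toward +-1
H = np.tanh(X @ W)                       # hidden-layer activations in (-1, 1)
print(f"fraction of saturated activations: {saturation_fraction(H):.2f}")
```

With the large weight scale above, most tanh outputs land near the bounds; shrinking the weight scale sharply lowers the reported fraction, which is the kind of effect saturation control mechanisms aim to achieve during training.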


Metadata
Title
An Analysis of Activation Function Saturation in Particle Swarm Optimization Trained Neural Networks
Authors
Cody Dennis
Andries P. Engelbrecht
Beatrice M. Ombuki-Berman
Publication date
25.07.2020
Publisher
Springer US
Published in
Neural Processing Letters / Issue 2/2020
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-020-10290-z
