Published in: Arabian Journal for Science and Engineering 4/2020

26.10.2019 | Research Article - Computer Engineering and Computer Science

Architectural Parameter-Independent Network Initialization Scheme for Sigmoidal Feedforward ANNs

Written by: Sarfaraz Masood, M. N. Doja, Pravin Chandra

Abstract

The choice of initial network weights is known to be a key factor affecting the convergence of artificial neural networks that use sigmoidal activation functions. This paper proposes a new network initialization scheme that sets the weights so that the activation functions in the network are not saturated at the start of training. The method ensures that the initial outputs of the hidden neurons lie in the active region, which improves the network's rate of convergence. Unlike most earlier initialization schemes, it does not depend on architectural parameters such as the size of the input layer or the hidden layer. The proposed scheme is compared against eight well-known weight initialization routines on six real-world benchmark problems. The results show that it enables the network to achieve better performance within the same number of training epochs. A right-tailed t-test further shows that the proposed scheme is significantly better than the other techniques in most cases and statistically similar in a few, but never worse. It may therefore be considered a strong alternative to conventional neural network initialization techniques.
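
The abstract does not specify the proposed scheme, so the following is only a minimal sketch of what "saturated at initialization" means, not the authors' method. It assumes a single hidden layer of tanh units, inputs drawn uniformly from [-1, 1], and a hypothetical uniform initialization in [-weight_bound, weight_bound]; the function name, the derivative threshold of 0.1, and all parameter values are illustrative assumptions.

```python
import math
import random


def saturated_fraction(weight_bound, n_inputs, n_hidden=20, n_samples=100,
                       threshold=0.1, seed=0):
    """Estimate how often a hidden tanh unit is saturated at initialization.

    A unit is counted as saturated for a given input when the derivative of
    tanh at its net input falls below `threshold` (an assumed cut-off).
    Weights and biases are drawn from U(-weight_bound, weight_bound);
    inputs are drawn from U(-1, 1).
    """
    rng = random.Random(seed)
    saturated = 0
    total = 0
    for _ in range(n_hidden):
        # One hidden unit: a weight per input plus a bias.
        weights = [rng.uniform(-weight_bound, weight_bound)
                   for _ in range(n_inputs)]
        bias = rng.uniform(-weight_bound, weight_bound)
        for _ in range(n_samples):
            x = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
            net = bias + sum(w * xi for w, xi in zip(weights, x))
            derivative = 1.0 - math.tanh(net) ** 2
            if derivative < threshold:
                saturated += 1
            total += 1
    return saturated / total


if __name__ == "__main__":
    for n_inputs in (2, 60):
        for bound in (0.01, 1.0):
            frac = saturated_fraction(bound, n_inputs)
            print(f"inputs={n_inputs:3d} bound={bound:5.2f} "
                  f"saturated fraction={frac:.3f}")
```

With a fixed bound of 1.0, the saturated fraction in this sketch grows as the number of inputs grows, which is one reason many conventional schemes scale the weight range with the layer size. The abstract states that the proposed scheme avoids that dependence; how it does so is not described in this excerpt.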

Metadata
Title
Architectural Parameter-Independent Network Initialization Scheme for Sigmoidal Feedforward ANNs
Authors
Sarfaraz Masood
M. N. Doja
Pravin Chandra
Publication date
26.10.2019
Publisher
Springer Berlin Heidelberg
Published in
Arabian Journal for Science and Engineering / Issue 4/2020
Print ISSN: 2193-567X
Electronic ISSN: 2191-4281
DOI
https://doi.org/10.1007/s13369-019-04200-2
