Published in: Neural Processing Letters 3/2021

5 March 2021

Balanced Gradient Training of Feed Forward Networks

Authors: Son Nguyen, Michael T. Manry


Abstract

We show that there are infinitely many valid scaled gradients that can be used to train a neural network. We propose a novel training method that finds the best scaled gradients in each training iteration. The implementation uses first-order derivatives, which makes it scalable and suitable for deep learning and big data. In simulations, the proposed method achieves similar or lower testing error than conjugate gradient and Levenberg-Marquardt, and it reaches the final network using fewer multiplies than either of those algorithms. It also outperforms conjugate gradient when training convolutional neural networks.
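The idea of a "valid scaled gradient" can be illustrated with a small sketch. The code below is not the authors' algorithm; it is a toy example under stated assumptions: a hand-written quadratic loss, two hypothetical parameter groups standing in for layers (`GROUPS`), and a brute-force grid search standing in for whatever procedure the paper uses to pick the scales. It shows two things: any positive per-group scaling of the negative gradient is still a descent direction, and choosing the scales per group can reduce the loss more in one step than a single shared step size.

```python
import random

# Toy quadratic loss whose two parameter groups have very different curvature.
# (Hypothetical example, not taken from the paper.)
def loss(w):
    return 0.5 * (100.0 * w[0] ** 2 + 100.0 * w[1] ** 2 + w[2] ** 2 + w[3] ** 2)

def grad(w):
    return [100.0 * w[0], 100.0 * w[1], w[2], w[3]]

# Assumed grouping: indices 0-1 form one "layer", indices 2-3 another.
GROUPS = [(0, 2), (2, 4)]

def scaled_direction(g, scales):
    """Negative gradient with one positive scale factor per group."""
    d = [0.0] * len(g)
    for (lo, hi), s in zip(GROUPS, scales):
        for i in range(lo, hi):
            d[i] = -s * g[i]
    return d

def directional_derivative(g, d):
    """Inner product g . d; a negative value means d is a descent direction."""
    return sum(gi * di for gi, di in zip(g, d))

def step(w, d):
    return [wi + di for wi, di in zip(w, d)]

w0 = [1.0, -1.0, 1.0, -1.0]
g0 = grad(w0)

# 1) Every positive choice of scales gives a descent direction.
rng = random.Random(0)
all_descent = all(
    directional_derivative(
        g0, scaled_direction(g0, [rng.uniform(0.01, 10.0), rng.uniform(0.01, 10.0)])
    ) < 0
    for _ in range(1000)
)

# 2) Compare the best per-group scales with the best single shared scale,
#    both found by a coarse grid search (an assumption made for illustration).
candidates = [10 ** (k / 4) for k in range(-16, 5)]
best_scales = min(
    ((s1, s2) for s1 in candidates for s2 in candidates),
    key=lambda s: loss(step(w0, scaled_direction(g0, s))),
)
best_single = min(
    candidates,
    key=lambda s: loss(step(w0, scaled_direction(g0, (s, s)))),
)

loss_balanced = loss(step(w0, scaled_direction(g0, best_scales)))
loss_plain = loss(step(w0, scaled_direction(g0, (best_single, best_single))))

print("all scaled directions descend:", all_descent)
print("loss with per-group scales:", loss_balanced)
print("loss with one shared scale:", loss_plain)
```

On this toy problem the per-group search can match each group's curvature separately, which a single step size cannot do. A grid search is used here only for clarity; it would not scale to real networks.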


Metadata
Title
Balanced Gradient Training of Feed Forward Networks
Authors
Son Nguyen
Michael T. Manry
Publication date
5 March 2021
Publisher
Springer US
Published in
Neural Processing Letters / Issue 3/2021
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-021-10474-1
