Neural Processing Letters

Published: 5 March 2021

Balanced Gradient Training of Feed Forward Networks

Authors: Son Nguyen, Michael T. Manry

Published in: Neural Processing Letters | Issue 3/2021

Abstract

We show that there are infinitely many valid scaled gradients that can be used to train a neural network. A novel training method is proposed that finds the best scaled gradients in each training iteration. The method’s implementation uses first-order derivatives, which makes it scalable and suitable for deep learning and big data. In simulations, the proposed method has similar or lower testing error than conjugate gradient and Levenberg-Marquardt. The method reaches the final network using fewer multiplies than the other two algorithms. It also works better than conjugate gradient in convolutional neural networks.
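The abstract's first claim can be illustrated on a toy problem. The sketch below is not the paper's algorithm: the quadratic error, the candidate scale pairs, the step size, and all function names are assumptions chosen purely for illustration. It shows that multiplying gradient components by any positive factors still yields a descent direction, and that different scalings give different error reductions for the same step size, which is why choosing among them in each iteration can pay off.

```python
# Illustrative toy example only; not the method proposed in the paper.

def error(w):
    # E(w) = 0.5 * (10 * w0^2 + w1^2), deliberately ill-conditioned
    return 0.5 * (10.0 * w[0] ** 2 + w[1] ** 2)

def gradient(w):
    # Analytic first-order derivatives of E
    return [10.0 * w[0], w[1]]

def directional_derivative(w, scales):
    # Slope of E along -D*g, where D = diag(scales).
    # Equals -sum(s_i * g_i^2), which is negative whenever every s_i > 0.
    g = gradient(w)
    return -sum(s * gi * gi for s, gi in zip(scales, g))

def scaled_step(w, scales, lr):
    # Scale each gradient component by its own positive factor, then step downhill
    g = gradient(w)
    return [wi - lr * s * gi for wi, s, gi in zip(w, scales, g)]

w = [1.0, 1.0]
lr = 0.1  # hypothetical step size
# Hypothetical candidate scalings; (1.0, 1.0) is plain gradient descent
candidates = [(1.0, 1.0), (0.5, 2.0), (1.0, 5.0), (0.5, 8.0)]

slopes = [directional_derivative(w, s) for s in candidates]
errors_after = [error(scaled_step(w, s, lr)) for s in candidates]

# Pick the candidate scaling that gives the lowest error after one step
best = candidates[errors_after.index(min(errors_after))]

for s, slope, e in zip(candidates, slopes, errors_after):
    print(f"scales={s}  slope={slope:.2f}  error after step={e:.4f}")
print("best scaling among candidates:", best)
```

A negative slope guarantees that the error decreases only for a sufficiently small step; with the values assumed here, every candidate happens to reduce the error, and a non-uniform scaling outperforms the unscaled gradient.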

Title: Balanced Gradient Training of Feed Forward Networks
Authors: Son Nguyen, Michael T. Manry
Publication date: 5 March 2021
Publisher: Springer US
Published in: Neural Processing Letters / Issue 3/2021
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI: https://doi.org/10.1007/s11063-021-10474-1
