2018 | Original Paper | Book Chapter

1. Introduction and Motivation

Authors: Anthony L. Caterini, Dong Eui Chang

Published in: Deep Neural Networks in a Mathematical Framework

Publisher: Springer International Publishing


Abstract

This chapter serves as a basic introduction to neural networks, including their history and some applications in which they have achieved state-of-the-art results.


Footnotes
1
E.g., self-driving cars, finance, and other important systems.
 
2
The perceptron is, however, just a special case of logistic regression, which has roots dating back to 1944 and earlier; see [7], for example.
 
3
There are generally two main classes of deep networks: supervised networks, which require a specific target for each input, and unsupervised networks, which have no targets and instead look for structure within the input data. Semi-supervised learning, in which only some proportion of the training examples have targets, also exists but is less common. Finally, there is reinforcement learning, in which an autonomous agent attempts to learn a task; the neural networks used within it are still often supervised, in that they attempt to predict the value of an action given the current state.
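As a minimal illustration of the distinction (a sketch of ours, not taken from the chapter; the values are made up), supervised data arrives as input–target pairs, while unsupervised data consists of inputs alone:

```python
# Hypothetical toy data illustrating the two main settings.

# Supervised: each input vector is paired with a target the
# network should learn to predict.
supervised_data = [
    ([0.2, 0.7], 1),  # (input features, target label)
    ([0.9, 0.1], 0),
]

# Unsupervised: inputs only; the network must discover
# structure in the data without any targets.
unsupervised_data = [
    [0.2, 0.7],
    [0.9, 0.1],
]
```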
 
4
MNIST is from [34].
 
5
E.g. Wikipedia articles, LaTeX documents.
 
6
There were, however, other major factors contributing to the first so-called A.I. winter, including over-promising to grant agencies when the technology of the time could not deliver; see [30] for more.
 
7
Perceptrons have no hidden layers.
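To make this concrete, here is a minimal sketch (ours, not the book's formulation) of a perceptron: a single weighted sum followed by a hard threshold, mapping input directly to output with no intermediate layer.

```python
import numpy as np

def perceptron(x, w, b):
    # One weighted sum and a hard threshold: the input connects
    # directly to the output, with no hidden layer in between.
    return 1 if np.dot(w, x) + b > 0 else 0

# Hypothetical weights implementing logical AND on two inputs.
w, b = np.array([1.0, 1.0]), -1.5
print([perceptron(np.array(x), w, b)
       for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 0, 0, 1]
```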
 
8
This was also inspired by biological function, as the ReLU activation function is a realistic description of neuron firing [20].
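For reference, the ReLU activation is simply max(0, x); a one-line sketch of ours, for illustration:

```python
import numpy as np

def relu(x):
    # Identity for positive inputs, zero otherwise: the unit
    # "fires" only above the threshold at zero.
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [0.  0.  0.  1.5]
```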
 
References
1. D. Ackley, G. Hinton, T. Sejnowski, A learning algorithm for Boltzmann machines. Cogn. Sci. 9(1), 147–169 (1985)
2. M. Arjovsky, L. Bottou, Towards principled methods for training generative adversarial networks. arXiv:1701.04862 (2017, preprint)
3. M. Arjovsky, S. Chintala, L. Bottou, Wasserstein GAN. arXiv:1701.07875 (2017, preprint)
4. D. Ballard, Modular learning in neural networks, in AAAI (1987), pp. 279–284
5. A. Baydin, B. Pearlmutter, A. Radul, J. Siskind, Automatic differentiation in machine learning: a survey. arXiv:1502.05767 (2015, preprint)
6. Y. Bengio, P. Lamblin, D. Popovici, H. Larochelle, Greedy layer-wise training of deep networks, in Advances in Neural Information Processing Systems (2007), pp. 153–160
7. J. Berkson, Application of the logistic function to bio-assay. J. Am. Stat. Assoc. 39(227), 357–365 (1944)
8. A.L. Caterini, D.E. Chang, A geometric framework for convolutional neural networks. arXiv:1608.04374 (2016, preprint)
9. A.L. Caterini, D.E. Chang, A novel representation of neural networks. arXiv:1610.01549 (2016, preprint)
10. D. Cireşan, U. Meier, L. Gambardella, J. Schmidhuber, Deep, big, simple neural nets for handwritten digit recognition. Neural Comput. 22(12), 3207–3220 (2010)
11. D. Clevert, T. Unterthiner, S. Hochreiter, Fast and accurate deep network learning by exponential linear units (ELUs). arXiv:1511.07289 (2015, preprint)
12. G. Cybenko, Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2(4), 303–314 (1989)
13. Y. Dauphin, R. Pascanu, C. Gulcehre, K. Cho, S. Ganguli, Y. Bengio, Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, in Advances in Neural Information Processing Systems (2014), pp. 2933–2941
14. R. Eldan, O. Shamir, The power of depth for feedforward neural networks, in Conference on Learning Theory (2016), pp. 907–940
15. K. Fukushima, S. Miyake, Neocognitron: a self-organizing neural network model for a mechanism of visual pattern recognition, in Competition and Cooperation in Neural Nets (Springer, Berlin, 1982), pp. 267–285
16. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial nets, in Advances in Neural Information Processing Systems (2014), pp. 2672–2680
18. B. Graham, Fractional max-pooling. arXiv:1412.6071 (2014, preprint)
19. A. Graves, Generating sequences with recurrent neural networks. arXiv:1308.0850 (2013, preprint)
20. R. Hahnloser, R. Sarpeshkar, M.A. Mahowald, R. Douglas, H. Seung, Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit. Nature 405(6789), 947–951 (2000)
21. K. He, X. Zhang, S. Ren, J. Sun, Delving deep into rectifiers: surpassing human-level performance on ImageNet classification, in Proceedings of the IEEE International Conference on Computer Vision (2015), pp. 1026–1034
22. K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 770–778
23. K. He, G. Gkioxari, P. Dollár, R. Girshick, Mask R-CNN. arXiv:1703.06870 (2017, preprint)
24. G. Hinton, S. Osindero, Y. Teh, A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
25. S. Hochreiter, Untersuchungen zu dynamischen neuronalen Netzen, Diploma thesis, Technische Universität München, 91, 1991
26. S. Hochreiter, J. Schmidhuber, Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
27. S. Hochreiter, Y. Bengio, P. Frasconi, J. Schmidhuber, Gradient flow in recurrent nets: the difficulty of learning long-term dependencies, in A Field Guide to Dynamical Recurrent Neural Networks (IEEE Press, 2001)
28. K. Hornik, Approximation capabilities of multilayer feedforward networks. Neural Netw. 4(2), 251–257 (1991)
29. A. Ivakhnenko, V. Lapa, Cybernetic predicting devices, Technical report, DTIC Document, 1966
30. L. Kanal, Perceptron, in Encyclopedia of Computer Science (Wiley, Chichester, 2003)
31. Y. LeCun, D. Touresky, G. Hinton, T. Sejnowski, A theoretical framework for back-propagation, in The Connectionist Models Summer School, vol. 1 (1988), pp. 21–28
32. Y. LeCun, B. Boser, J. Denker, D. Henderson, R. Howard, W. Hubbard, L. Jackel, Handwritten digit recognition with a back-propagation network, in Advances in Neural Information Processing Systems (1990), pp. 396–404
33. Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
35. Y. LeCun, Y. Bengio, G. Hinton, Deep learning. Nature 521(7553), 436–444 (2015)
36. H. Lin, M. Tegmark, Why does deep and cheap learning work so well? arXiv:1608.08225 (2016, preprint)
37. H. Lutkepohl, Handbook of Matrices (Wiley, Hoboken, 1997)
38. A. Maas, A. Hannun, A. Ng, Rectifier nonlinearities improve neural network acoustic models, in Proceedings of ICML, vol. 30 (2013)
39. W. McCulloch, W. Pitts, A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biol. 5(4), 115–133 (1943)
40. M. Minsky, S. Papert, Perceptrons (MIT Press, Cambridge, 1969)
41. V. Mnih, K. Kavukcuoglu, D. Silver, A. Rusu, J. Veness, M. Bellemare, A. Graves, M. Riedmiller et al., Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
42. G. Montufar, R. Pascanu, K. Cho, Y. Bengio, On the number of linear regions of deep neural networks, in Advances in Neural Information Processing Systems (2014), pp. 2924–2932
43. V. Nair, G. Hinton, Rectified linear units improve restricted Boltzmann machines, in Proceedings of the 27th International Conference on Machine Learning (ICML-10) (2010), pp. 807–814
44. R. Pascanu, G. Montufar, Y. Bengio, On the number of response regions of deep feed forward networks with piece-wise linear activations. arXiv:1312.6098 (2013, preprint)
45. A. Radford, L. Metz, S. Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv:1511.06434 (2015, preprint)
46. F. Rosenblatt, The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65(6), 386 (1958)
47. D. Rumelhart, G. Hinton, R. Williams, Learning internal representations by error propagation, Technical report, California Univ San Diego La Jolla Inst for Cognitive Science, 1985
48. T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, X. Chen, Improved techniques for training GANs, in Advances in Neural Information Processing Systems (2016), pp. 2226–2234
49. G. Saon, T. Sercu, S.J. Rennie, H. Jeff Kuo, The IBM 2016 English conversational telephone speech recognition system. arXiv:1604.08242 (2016, preprint)
50. J. Schmidhuber, Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015)
51. U. Shaham, A. Cloninger, R. Coifman, Provable approximation properties for deep neural networks. Appl. Comput. Harmon. Anal. 44(3), 537–557 (2018)
52. D. Silver, A. Huang, C. Maddison, A. Guez, L. Sifre, G. Van Den Driessche et al., Mastering the game of Go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
53. K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556 (2014, preprint)
54. S. Vallender, Calculation of the Wasserstein distance between probability distributions on the line. Theory Prob. Appl. 18(4), 784–786 (1974)
55. O. Vinyals, A. Toshev, S. Bengio, D. Erhan, Show and tell: a neural image caption generator, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015), pp. 3156–3164
56. P. Werbos, Applications of advances in nonlinear sensitivity analysis, in System Modeling and Optimization (Springer, Berlin, 1982), pp. 762–770
57. B. Widrow, M. Hoff, Associative storage and retrieval of digital information in networks of adaptive “neurons”, in Biological Prototypes and Synthetic Systems (Springer, Berlin, 1962), p. 160
58. R. Williams, D. Zipser, A learning algorithm for continually running fully recurrent neural networks. Neural Comput. 1(2), 270–280 (1989)
59. Z. Xie, A. Avati, N. Arivazhagan, D. Jurafsky, A. Ng, Neural language correction with character-based attention. arXiv:1603.09727 (2016, preprint)
Metadata
Title
Introduction and Motivation
Authors
Anthony L. Caterini
Dong Eui Chang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-75304-1_1
