2018 | Original Paper | Book Chapter

1. Introduction and Motivation

Authors: Anthony L. Caterini, Dong Eui Chang

Published in: Deep Neural Networks in a Mathematical Framework

Publisher: Springer International Publishing


Abstract

This chapter serves as a basic introduction to neural networks, including their history and some applications in which they have achieved state-of-the-art results.


Footnotes
1
E.g., self-driving cars, finance, and other important systems.
 
2
The perceptron is, however, just a special case of logistic regression, which has roots dating back to 1944 and earlier; see [7], for example.
 
3
There are generally two main classes of deep networks: supervised networks, which require a specific target for each input, and unsupervised networks, which have no targets and instead look for structure within the input data. Semi-supervised learning, in which only some proportion of the training examples have targets, also exists but is less common. Finally, there is reinforcement learning, in which an autonomous agent attempts to learn a task; the neural networks used within it are still often supervised, in that they attempt to predict the value of an action given the current state.
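As a minimal illustration of the distinction (a sketch of ours, not taken from the chapter; the values are made up), supervised data arrives as input–target pairs, while unsupervised data consists of inputs alone:

```python
# Hypothetical toy data illustrating the two main settings.

# Supervised: each input vector is paired with a target the
# network should learn to predict.
supervised_data = [
    ([0.2, 0.7], 1),  # (input features, target label)
    ([0.9, 0.1], 0),
]

# Unsupervised: inputs only; the network must discover
# structure in the data without any targets.
unsupervised_data = [
    [0.2, 0.7],
    [0.9, 0.1],
]
```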
 
4
MNIST is from [34].
 
5
E.g. Wikipedia articles, LaTeX documents.
 
6
There were, however, other major factors contributing to the first so-called A.I. winter, including over-promising to grant agencies when the technology of the time could not deliver; see [30] for more.
 
7
Perceptrons have no hidden layers.
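To make this concrete, here is a minimal sketch (ours, not the book's formulation) of a perceptron: a single weighted sum followed by a hard threshold, mapping input directly to output with no intermediate layer.

```python
import numpy as np

def perceptron(x, w, b):
    # One weighted sum and a hard threshold: the input connects
    # directly to the output, with no hidden layer in between.
    return 1 if np.dot(w, x) + b > 0 else 0

# Hypothetical weights implementing logical AND on two inputs.
w, b = np.array([1.0, 1.0]), -1.5
print([perceptron(np.array(x), w, b)
       for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 0, 0, 1]
```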
 
8
This was also inspired by biological function, as the ReLU activation function is a realistic description of neuron firing [20].
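For reference, the ReLU activation is simply max(0, x); a one-line sketch of ours, for illustration:

```python
import numpy as np

def relu(x):
    # Identity for positive inputs, zero otherwise: the unit
    # "fires" only above the threshold at zero.
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [0.  0.  0.  1.5]
```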
 
References
1. D. Ackley, G. Hinton, T. Sejnowski, A learning algorithm for Boltzmann machines. Cogn. Sci. 9(1), 147–169 (1985)
2. M. Arjovsky, L. Bottou, Towards principled methods for training generative adversarial networks. arXiv:1701.04862 (2017, preprint)
3. M. Arjovsky, S. Chintala, L. Bottou, Wasserstein GAN. arXiv:1701.07875 (2017, preprint)
4. D. Ballard, Modular learning in neural networks, in AAAI (1987), pp. 279–284
5. A. Baydin, B. Pearlmutter, A. Radul, J. Siskind, Automatic differentiation in machine learning: a survey. arXiv:1502.05767 (2015, preprint)
6. Y. Bengio, P. Lamblin, D. Popovici, H. Larochelle, Greedy layer-wise training of deep networks, in Advances in Neural Information Processing Systems (2007), pp. 153–160
7. J. Berkson, Application of the logistic function to bio-assay. J. Am. Stat. Assoc. 39(227), 357–365 (1944)
8. A.L. Caterini, D.E. Chang, A geometric framework for convolutional neural networks. arXiv:1608.04374 (2016, preprint)
9. A.L. Caterini, D.E. Chang, A novel representation of neural networks. arXiv:1610.01549 (2016, preprint)
10. D. Cireşan, U. Meier, L. Gambardella, J. Schmidhuber, Deep, big, simple neural nets for handwritten digit recognition. Neural Comput. 22(12), 3207–3220 (2010)
11. D. Clevert, T. Unterthiner, S. Hochreiter, Fast and accurate deep network learning by exponential linear units (ELUs). arXiv:1511.07289 (2015, preprint)
12. G. Cybenko, Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2(4), 303–314 (1989)
13. Y. Dauphin, R. Pascanu, C. Gulcehre, K. Cho, S. Ganguli, Y. Bengio, Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, in Advances in Neural Information Processing Systems (2014), pp. 2933–2941
14. R. Eldan, O. Shamir, The power of depth for feedforward neural networks, in Conference on Learning Theory (2016), pp. 907–940
15. K. Fukushima, S. Miyake, Neocognitron: a self-organizing neural network model for a mechanism of visual pattern recognition, in Competition and Cooperation in Neural Nets (Springer, Berlin, 1982), pp. 267–285
16. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial nets, in Advances in Neural Information Processing Systems (2014), pp. 2672–2680
18. B. Graham, Fractional max-pooling. arXiv:1412.6071 (2014, preprint)
19. A. Graves, Generating sequences with recurrent neural networks. arXiv:1308.0850 (2013, preprint)
20. R. Hahnloser, R. Sarpeshkar, M.A. Mahowald, R. Douglas, H. Seung, Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit. Nature 405(6789), 947–951 (2000)
21. K. He, X. Zhang, S. Ren, J. Sun, Delving deep into rectifiers: surpassing human-level performance on ImageNet classification, in Proceedings of the IEEE International Conference on Computer Vision (2015), pp. 1026–1034
22. K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 770–778
23. K. He, G. Gkioxari, P. Dollár, R. Girshick, Mask R-CNN. arXiv:1703.06870 (2017, preprint)
24. G. Hinton, S. Osindero, Y. Teh, A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
25. S. Hochreiter, Untersuchungen zu dynamischen neuronalen Netzen, Diploma thesis, Technische Universität München, 91, 1991
26. S. Hochreiter, J. Schmidhuber, Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
27. S. Hochreiter, Y. Bengio, P. Frasconi, J. Schmidhuber, Gradient flow in recurrent nets: the difficulty of learning long-term dependencies, in A Field Guide to Dynamical Recurrent Neural Networks (IEEE Press, 2001)
28. K. Hornik, Approximation capabilities of multilayer feedforward networks. Neural Netw. 4(2), 251–257 (1991)
29. A. Ivakhnenko, V. Lapa, Cybernetic predicting devices, Technical report, DTIC Document, 1966
30. L. Kanal, Perceptron, in Encyclopedia of Computer Science (Wiley, Chichester, 2003)
31. Y. LeCun, D. Touresky, G. Hinton, T. Sejnowski, A theoretical framework for back-propagation, in The Connectionist Models Summer School, vol. 1 (1988), pp. 21–28
32. Y. LeCun, B. Boser, J. Denker, D. Henderson, R. Howard, W. Hubbard, L. Jackel, Handwritten digit recognition with a back-propagation network, in Advances in Neural Information Processing Systems (1990), pp. 396–404
33. Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
35. Y. LeCun, Y. Bengio, G. Hinton, Deep learning. Nature 521(7553), 436–444 (2015)
36. H. Lin, M. Tegmark, Why does deep and cheap learning work so well? arXiv:1608.08225 (2016, preprint)
37. H. Lutkepohl, Handbook of Matrices (Wiley, Hoboken, 1997)
38. A. Maas, A. Hannun, A. Ng, Rectifier nonlinearities improve neural network acoustic models, in Proceedings of ICML, vol. 30 (2013)
39. W. McCulloch, W. Pitts, A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biol. 5(4), 115–133 (1943)
40. M. Minsky, S. Papert, Perceptrons (MIT Press, Cambridge, 1969)
41. V. Mnih, K. Kavukcuoglu, D. Silver, A. Rusu, J. Veness, M. Bellemare, A. Graves, M. Riedmiller et al., Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
42. G. Montufar, R. Pascanu, K. Cho, Y. Bengio, On the number of linear regions of deep neural networks, in Advances in Neural Information Processing Systems (2014), pp. 2924–2932
43. V. Nair, G. Hinton, Rectified linear units improve restricted Boltzmann machines, in Proceedings of the 27th International Conference on Machine Learning (ICML-10) (2010), pp. 807–814
44. R. Pascanu, G. Montufar, Y. Bengio, On the number of response regions of deep feed forward networks with piece-wise linear activations. arXiv:1312.6098 (2013, preprint)
45. A. Radford, L. Metz, S. Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv:1511.06434 (2015, preprint)
46. F. Rosenblatt, The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65(6), 386 (1958)
47. D. Rumelhart, G. Hinton, R. Williams, Learning internal representations by error propagation, Technical report, California Univ San Diego La Jolla Inst for Cognitive Science, 1985
48. T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, X. Chen, Improved techniques for training GANs, in Advances in Neural Information Processing Systems (2016), pp. 2226–2234
49. G. Saon, T. Sercu, S.J. Rennie, H. Jeff Kuo, The IBM 2016 English conversational telephone speech recognition system. arXiv:1604.08242 (2016, preprint)
50. J. Schmidhuber, Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015)
51. U. Shaham, A. Cloninger, R. Coifman, Provable approximation properties for deep neural networks. Appl. Comput. Harmon. Anal. 44(3), 537–557 (2018)
52. D. Silver, A. Huang, C. Maddison, A. Guez, L. Sifre, G. Van Den Driessche et al., Mastering the game of Go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
53. K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556 (2014, preprint)
54. S. Vallender, Calculation of the Wasserstein distance between probability distributions on the line. Theory Prob. Appl. 18(4), 784–786 (1974)
55. O. Vinyals, A. Toshev, S. Bengio, D. Erhan, Show and tell: a neural image caption generator, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015), pp. 3156–3164
56. P. Werbos, Applications of advances in nonlinear sensitivity analysis, in System Modeling and Optimization (Springer, Berlin, 1982), pp. 762–770
57. B. Widrow, M. Hoff, Associative storage and retrieval of digital information in networks of adaptive “neurons”, in Biological Prototypes and Synthetic Systems (Springer, Berlin, 1962), p. 160
58. R. Williams, D. Zipser, A learning algorithm for continually running fully recurrent neural networks. Neural Comput. 1(2), 270–280 (1989)
59. Z. Xie, A. Avati, N. Arivazhagan, D. Jurafsky, A. Ng, Neural language correction with character-based attention. arXiv:1603.09727 (2016, preprint)
Metadata
Title
Introduction and Motivation
Authors
Anthony L. Caterini
Dong Eui Chang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-75304-1_1
