2018 | Original Paper | Book Chapter

4. Specific Network Descriptions

Authors: Anthony L. Caterini, Dong Eui Chang

Published in: Deep Neural Networks in a Mathematical Framework

Publisher: Springer International Publishing

Abstract

We developed an algebraic framework for a generic layered network in the preceding chapter, including a method to express error backpropagation and loss function derivatives directly over the inner product space in which the network parameters are defined. We dedicate this chapter to expressing three common neural network structures within this generic framework: the Multilayer Perceptron (MLP), the Convolutional Neural Network (CNN), and the Deep Auto-Encoder (DAE). The exact layout of this chapter is as follows. We first explore the simple case of the MLP, deriving the canonical vector-valued form of backpropagation along the way. Then, we shift our attention to the CNN. Here, our layerwise function is far more complicated, as our inputs and parameters are in tensor product spaces, and thus we require more complex operations to combine the inputs and the parameters. CNNs still fit squarely in the framework developed in the previous chapter. The final network that we consider in this chapter, the DAE, does not fit as easily into that framework, as the parameters at any given layer have a deterministic relationship with the parameters at exactly one other layer, violating the assumption of parametric independence across layers. We overcome this issue, however, with a small adjustment to the framework. An algorithm for one step of gradient descent is derived for each type of network.
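
As a point of reference, the following is a minimal sketch of that canonical vector-valued form for an MLP with activation σ, weight matrices \(W_i\), input \(x_0\), and loss \(J\); the notation here is generic rather than the chapter's own, and the bias term is omitted as in footnote 2 below.

\[
x_i = \sigma(W_i x_{i-1}), \qquad i = 1, \dots, L,
\]
\[
\delta_L = \sigma'(W_L x_{L-1}) \odot \nabla_{x_L} J, \qquad
\delta_i = \sigma'(W_i x_{i-1}) \odot \left( W_{i+1}^{\top} \delta_{i+1} \right), \quad i = L-1, \dots, 1,
\]
\[
\nabla_{W_i} J = \delta_i \, x_{i-1}^{\top}, \qquad
W_i \leftarrow W_i - \eta \, \nabla_{W_i} J .
\]

Here \(\odot\) is the elementwise product, \(\sigma'\) acts componentwise, and \(\eta\) is the step size; the chapter develops the analogous update directly over the inner product spaces in which the parameters live.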

Footnotes
1
Also known as a neuron, in keeping with the brain analogy.
 
2
We omit the bias vector b from this formulation: including it is a simple extension of what we develop here, and leaving it out lightens the notation. Refer to [1] to see how the bias vector can be handled.
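
For illustration only (this is a common device, not necessarily the treatment in [1]): with a bias the layer map becomes \(x \mapsto \sigma(Wx + b)\), and the bias can be absorbed into the weights by augmenting the input,

\[
\sigma(Wx + b) = \sigma\!\left( \begin{bmatrix} W & b \end{bmatrix} \begin{bmatrix} x \\ 1 \end{bmatrix} \right),
\]

so the bias-free development extends directly.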
 
3
Actually, in the neural network community, we use cross-correlation instead of convolution, although the difference is minor and we almost never mention cross-correlation; refer to [5] for more on the difference between the two.
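
For concreteness, and in generic index notation rather than the chapter's, the two operations on a kernel K and image X are

\[
(K \star X)[i, j] = \sum_{m,n} K[m, n]\, X[i+m,\, j+n] \qquad \text{(cross-correlation)},
\]
\[
(K * X)[i, j] = \sum_{m,n} K[m, n]\, X[i-m,\, j-n] \qquad \text{(convolution)},
\]

so convolution is cross-correlation with the kernel flipped in both spatial directions; since the kernel is learned, the distinction has no practical effect.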
 
4
Here, we have assumed that both \(n_1\) and \(\ell_1\) are divisible by Δ; in particular, \(n_1 = \varDelta \widehat n_1\) and \(\ell_1 = \varDelta \widehat \ell_1\). If this is not the case, however, we can increase \(n_1\) or \(\ell_1\) to be divisible by Δ via boundary conditions on the input matrices; refer to [5] for more on image padding or boundary conditions.
 
5
Again, we have established a relationship between \((\widehat n_1, \widehat \ell_1)\) and \((n_2, \ell_2)\); in particular, \(\widehat n_1 = r n_2\) and \(\widehat \ell_1 = r \ell_2\). If \(\widehat n_1\) or \(\widehat \ell_1\) is not divisible by r, we can add padding or boundary conditions as in the convolution.
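
As a purely illustrative numerical example of these two relations (the values are not taken from the text): with \(n_1 = 32\), Δ = 2, and \(r = 4\),

\[
\widehat n_1 = \frac{n_1}{\varDelta} = \frac{32}{2} = 16,
\qquad
n_2 = \frac{\widehat n_1}{r} = \frac{16}{4} = 4,
\]

and the widths \(\ell_1\), \(\widehat \ell_1\), \(\ell_2\) factor in the same way.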
 
6
Also C and \(c_j\), and Ψ and ψ.
 
References
1.
A.L. Caterini, D.E. Chang, A geometric framework for convolutional neural networks. arXiv:1608.04374 (2016, preprint)
2.
A.L. Caterini, D.E. Chang, A novel representation of neural networks. arXiv:1610.01549 (2016, preprint)
4.
5.
A. Jain, Fundamentals of Digital Image Processing (Prentice-Hall, Englewood, 1989)
6.
F. Rosenblatt, The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65(6), 386 (1958)
Metadata
Title
Specific Network Descriptions
Authors
Anthony L. Caterini
Dong Eui Chang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-75304-1_4