
2018 | Original Paper | Book Chapter

6. Convolutional Neural Networks

Author: Sandro Skansi

Published in: Introduction to Deep Learning

Publisher: Springer International Publishing


Abstract

This chapter introduces the first deep learning architecture of the book, convolutional neural networks. It starts by redefining the way logistic regression accepts data, and defines 1D and 2D convolutional layers as a natural extension of logistic regression. The chapter also details how to connect the layers and how to handle the resulting dimensionality problems. The local receptive field is introduced as a core concept of any convolutional architecture, and its connection with the vanishing gradient problem is explored. The idea of padding is introduced in the visual setting, as is the stride of the local receptive field. Pooling is explored both in the general setting and as max-pooling. A complete convolutional neural network for classifying MNIST is then presented in Keras code, with all the details of the code explained in comments and illustrations. The final section of the chapter presents the modifications needed to adapt convolutional networks, which are primarily visual classifiers, to work with text and language.
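As a rough illustration of the operation the chapter builds on, here is a minimal pure-Python sketch of a 2D convolution: a small weight matrix (the local receptive field) slides over the image and computes a weighted sum at each position. All names and values below are illustrative and are not taken from the chapter's Keras code.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D convolution with stride 1.

    image:  2D list of numbers (e.g. pixel intensities 0-255)
    kernel: 2D list of weights (the local receptive field)
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum over the receptive field anchored at (i, j)
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(s)
        output.append(row)
    return output

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]  # sums the top-left and bottom-right of each 2x2 patch

feature_map = conv2d(image, kernel)
# A 3x3 image convolved with a 2x2 kernel yields a 2x2 feature map
```

Sliding the same kernel (i.e. the same weights) over every position is exactly what makes a convolutional layer so much cheaper in parameters than a fully connected one.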


Footnotes
1
Yann LeCun once said in an interview that he prefers the name ‘convolutional network’ over ‘convolutional neural network’.
 
2
An image in this sense is any 2D array with values between 0 and 255. In Fig. 6.1 we have numbered the positions, and you may think of them as ‘cell numbers’, in the sense that they will contain some value, but the number on the image denotes only their order. In addition, note that if we have e.g. 100 by 100 RGB images, each image would be a 3D array (tensor) with dimensions (100, 100, 3). The last dimension of the array would hold the three channels, red, green and blue.
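A small sketch of this footnote's point (the variable names are illustrative): a grayscale image is a 2D array, while an RGB image adds a third dimension holding the three channels.

```python
# A 100-by-100 grayscale image: a 2D array of values in 0..255
gray = [[0 for _ in range(100)] for _ in range(100)]

# A 100-by-100 RGB image: a 3D array (tensor) of shape (100, 100, 3);
# the last dimension holds the red, green, and blue channel values
rgb = [[[0, 0, 0] for _ in range(100)] for _ in range(100)]

shape = (len(rgb), len(rgb[0]), len(rgb[0][0]))  # (100, 100, 3)
```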
 
3
Here you might notice how important weight initialization is. We do have some techniques that are better than random initialization, but finding a good weight initialization strategy remains an important open research problem.
 
4
If we use padding, the spatial size stays the same, but the depth still expands. Padding is useful when there is possibly important information on the edges of the image.
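The effect of padding on the output size can be checked with the standard output-size formula (this formula is general background, not taken from the chapter):

```python
def conv_output_size(n, k, padding=0, stride=1):
    """Spatial output size of a convolution over an n-by-n input
    with a k-by-k receptive field (standard formula)."""
    return (n + 2 * padding - k) // stride + 1

# Without padding, a 28x28 input shrinks under a 5x5 receptive field...
no_pad = conv_output_size(28, 5)               # 24
# ...while a padding of 2 keeps the 28x28 size ('same' padding for k=5)
same_pad = conv_output_size(28, 5, padding=2)  # 28
```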
 
5
You have everything you need in this book to get the array (tensor) with the feature maps, and even to squash it to 2D, but you might have to search the Internet to find out how to visualize the tensor as an image. Consider it a good (but advanced) Python exercise.
 
6
If it has 100 neurons per layer, with only one output neuron, that makes a total of \(784\cdot 100 + 100\cdot 100+ 100\cdot 100 + 100\cdot 1 = 98500\) parameters, and that is without the biases!
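The footnote's count can be checked directly (weights only, no biases):

```python
# Fully connected net: 784 inputs, three hidden layers of 100, one output
layer_sizes = [784, 100, 100, 100, 1]

# Each consecutive pair of layers contributes in_size * out_size weights
weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
# 784*100 + 100*100 + 100*100 + 100*1 = 98500 (biases would add 301 more)
```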
 
7
Which is, mathematically speaking, a tensor.
 
8
Remember how we can convert a 28 by 28 matrix into a 784-dimensional vector.
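The conversion the footnote refers to is a simple row-by-row flattening; a minimal sketch (values are illustrative):

```python
# A 28-by-28 matrix of pixel values...
matrix = [[row * 28 + col for col in range(28)] for row in range(28)]

# ...flattened row by row into a 784-dimensional vector
vector = [pixel for row in matrix for pixel in row]
```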
 
9
Keras calls them ‘Dense’.
 
10
Trivially, every paper will have a ‘trickiest part’, and it is your job to learn how to decode this part, since it is often the most important part of the paper.
 
11
The whole alphabet will not fit on the page, but you can easily imagine how it would expand to the full English alphabet.
 
12
A couple of hours each day—not a literal week.
 
References
1.
Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
2.
D.H. Hubel, T.N. Wiesel, Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195(1), 215–243 (1968)
3.
X. Zhang, J. Zhao, Y. LeCun, Character-level convolutional networks for text classification, in Advances in Neural Information Processing Systems 28, NIPS (2015)
Metadata
Title
Convolutional Neural Networks
Author
Sandro Skansi
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-73004-2_6
