
2018 | Original Paper | Book Chapter

6. Convolutional Neural Networks

Author: Sandro Skansi

Published in: Introduction to Deep Learning

Publisher: Springer International Publishing


Abstract

This chapter introduces the first deep learning architecture of the book, convolutional neural networks. It starts by redefining the way logistic regression accepts data, and defines 1D and 2D convolutional layers as a natural extension of logistic regression. The chapter also details how to connect the layers and how to handle the resulting dimensionality problems. The local receptive field is introduced as a core concept of any convolutional architecture, and its connection with the vanishing gradient problem is explored. The idea of padding is introduced in the visual setting, as is the stride of the local receptive field. Pooling is explored both in the general setting and as max-pooling. A complete convolutional neural network for classifying MNIST is then presented in Keras code, with all the details of the code explained in comments and illustrations. The final section of the chapter presents the modifications needed to adapt convolutional networks, which are primarily visual classifiers, to work with text and language.
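As a rough illustration of the operation the chapter builds on, here is a minimal pure-Python sketch of a 2D convolution: a small weight matrix (the local receptive field) slides over the image and computes a weighted sum at each position. All names and values below are illustrative and are not taken from the chapter's Keras code.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D convolution with stride 1.

    image:  2D list of numbers (e.g. pixel intensities 0-255)
    kernel: 2D list of weights (the local receptive field)
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum over the receptive field anchored at (i, j)
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(s)
        output.append(row)
    return output

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]  # sums the top-left and bottom-right of each 2x2 patch

feature_map = conv2d(image, kernel)
# A 3x3 image convolved with a 2x2 kernel yields a 2x2 feature map
```

Sliding the same kernel (i.e. the same weights) over every position is exactly what makes a convolutional layer so much cheaper in parameters than a fully connected one.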


Footnotes
1
Yann LeCun once said in an interview that he prefers the name ‘convolutional network’ over ‘convolutional neural network’.
 
2
An image in this sense is any 2D array with values between 0 and 255. In Fig. 6.1 we have numbered the positions, and you may think of them as ‘cell numbers’, in the sense that they will contain some value, but the number on the image denotes only their order. In addition, note that if we have e.g. 100 by 100 RGB images, each image would be a 3D array (tensor) with dimensions (100, 100, 3). The last dimension of the array would hold the three channels, red, green and blue.
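A small sketch of this footnote's point (the variable names are illustrative): a grayscale image is a 2D array, while an RGB image adds a third dimension holding the three channels.

```python
# A 100-by-100 grayscale image: a 2D array of values in 0..255
gray = [[0 for _ in range(100)] for _ in range(100)]

# A 100-by-100 RGB image: a 3D array (tensor) of shape (100, 100, 3);
# the last dimension holds the red, green, and blue channel values
rgb = [[[0, 0, 0] for _ in range(100)] for _ in range(100)]

shape = (len(rgb), len(rgb[0]), len(rgb[0][0]))  # (100, 100, 3)
```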
 
3
Here you might notice how important weight initialization is. We do have some techniques that are better than random initialization, but finding a good weight initialization strategy remains an important open research problem.
 
4
If we use padding, the spatial size stays the same, but the depth still expands. Padding is useful when there is possibly important information on the edges of the image.
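The effect of padding on the output size can be checked with the standard output-size formula (this formula is general background, not taken from the chapter):

```python
def conv_output_size(n, k, padding=0, stride=1):
    """Spatial output size of a convolution over an n-by-n input
    with a k-by-k receptive field (standard formula)."""
    return (n + 2 * padding - k) // stride + 1

# Without padding, a 28x28 input shrinks under a 5x5 receptive field...
no_pad = conv_output_size(28, 5)               # 24
# ...while a padding of 2 keeps the 28x28 size ('same' padding for k=5)
same_pad = conv_output_size(28, 5, padding=2)  # 28
```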
 
5
You have everything you need in this book to get the array (tensor) with the feature maps, and even to squash it to 2D, but you might have to search the Internet to find out how to visualize the tensor as an image. Consider it a good (but advanced) Python exercise.
 
6
If it has 100 neurons per layer, with only one output neuron, that makes a total of \(784\cdot 100 + 100\cdot 100+ 100\cdot 100 + 100\cdot 1 = 98500\) parameters, and that is without the biases!
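The footnote's count can be checked directly (weights only, no biases):

```python
# Fully connected net: 784 inputs, three hidden layers of 100, one output
layer_sizes = [784, 100, 100, 100, 1]

# Each consecutive pair of layers contributes in_size * out_size weights
weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
# 784*100 + 100*100 + 100*100 + 100*1 = 98500 (biases would add 301 more)
```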
 
7
Which is, mathematically speaking, a tensor.
 
8
Remember how we can convert a 28 by 28 matrix into a 784-dimensional vector.
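The conversion the footnote refers to is a simple row-by-row flattening; a minimal sketch (values are illustrative):

```python
# A 28-by-28 matrix of pixel values...
matrix = [[row * 28 + col for col in range(28)] for row in range(28)]

# ...flattened row by row into a 784-dimensional vector
vector = [pixel for row in matrix for pixel in row]
```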
 
9
Keras calls them ‘Dense’.
 
10
Trivially, every paper will have a ‘trickiest part’, and it is your job to learn how to decode this part, since it is often the most important part of the paper.
 
11
The whole alphabet will not fit on the page, but you can easily imagine how it would expand to the full English alphabet.
 
12
A couple of hours each day—not a literal week.
 
References
1.
Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
2.
D.H. Hubel, T.N. Wiesel, Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195(1), 215–243 (1968)
3.
X. Zhang, J. Zhao, Y. LeCun, Character-level convolutional networks for text classification, in Advances in Neural Information Processing Systems 28, NIPS (2015)
Metadata
Title
Convolutional Neural Networks
Author
Sandro Skansi
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-73004-2_6
