
2018 | Original Paper | Book Chapter

5. Recurrent Neural Networks

Authored by: Anthony L. Caterini, Dong Eui Chang

Published in: Deep Neural Networks in a Mathematical Framework

Publisher: Springer International Publishing


Abstract

We applied the generic neural network framework from Chap. 3 to specific network structures in the previous chapter. Multilayer Perceptrons and Convolutional Neural Networks fit squarely into that framework, and we were also able to modify it to capture Deep Auto-Encoders. We now extend the generic framework even further to handle Recurrent Neural Networks (RNNs), the sequence-parsing network structure containing a recurring latent, or hidden, state that evolves at each layer of the network. This involves the development of new notation, but we remain as consistent as possible with previous chapters. The specific layout of this chapter is as follows. We first formulate a generic, feed-forward recurrent neural network. We calculate gradients of loss functions for these networks in two ways: Real-Time Recurrent Learning (RTRL) and Backpropagation Through Time (BPTT). Using our notation for vector-valued maps, we derive these algorithms directly over the inner product space in which the parameters reside. We then proceed to formally represent a vanilla RNN, which is the simplest form of RNN, and we formulate RTRL and BPTT for that as well. At the end of the chapter, we briefly mention modern RNN variants in the context of our generic framework.
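
For concreteness, the objects discussed above can be sketched in the conventional textbook notation for a vanilla RNN. This is only an illustrative formulation, not the notation developed in the chapter; the weight matrices \(W\), \(U\), \(V\), the biases \(b\), \(c\), and the elementwise nonlinearity \(\sigma\) are assumptions made here for the example. The forward recurrence and total loss over a length-\(T\) sequence are

\[
h_i = \sigma\bigl(W h_{i-1} + U x_i + b\bigr), \qquad \hat y_i = g\bigl(V h_i + c\bigr), \qquad L = \sum_{i=1}^{T} L_i(\hat y_i, y_i).
\]

BPTT unrolls the chain rule backwards through the hidden states; for the recurrent weights,

\[
\frac{\partial L}{\partial W} \;=\; \sum_{i=1}^{T} \sum_{k=1}^{i} \frac{\partial L_i}{\partial h_i} \left( \prod_{j=k+1}^{i} \frac{\partial h_j}{\partial h_{j-1}} \right) \frac{\partial^{+} h_k}{\partial W},
\]

where \(\partial^{+} h_k / \partial W\) denotes the immediate partial derivative that treats \(h_{k-1}\) as constant. RTRL instead propagates the total sensitivity forward alongside the state,

\[
\frac{\partial h_i}{\partial W} \;=\; \frac{\partial h_i}{\partial h_{i-1}} \, \frac{\partial h_{i-1}}{\partial W} \;+\; \frac{\partial^{+} h_i}{\partial W},
\]

so the gradient of each per-step loss is available online, without a backward pass, at the cost of carrying this larger derivative object through time.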


Footnotes
1
We have adopted a slightly different indexing convention in this chapter: notice that \(f_i\) takes in \(h_{i-1}\) and outputs \(h_i\), as opposed to the previous chapters, where we evolved the state variable according to \(x_{i+1} = f_i(x_i)\). This indexing convention is more natural for RNNs, as we will see that the ith prediction depends on \(h_i\) with this adjustment, instead of on \(h_{i+1}\).
 
2
We use \(\overline e_j\) here instead of simply \(e_j\) since we already have \(e_i\) defined in (5.8) and will continue to use it throughout this section.
 
Metadata
Title
Recurrent Neural Networks
Authored by
Anthony L. Caterini
Dong Eui Chang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-75304-1_5