
2018 | Original Paper | Book Chapter

5. Recurrent Neural Networks

Authored by: Anthony L. Caterini, Dong Eui Chang

Published in: Deep Neural Networks in a Mathematical Framework

Publisher: Springer International Publishing


Abstract

We applied the generic neural network framework from Chap. 3 to specific network structures in the previous chapter. Multilayer Perceptrons and Convolutional Neural Networks fit squarely into that framework, and we were also able to modify it to capture Deep Auto-Encoders. We now extend the generic framework even further to handle Recurrent Neural Networks (RNNs), the sequence-parsing network structure containing a recurring latent, or hidden, state that evolves at each layer of the network. This involves the development of new notation, but we remain as consistent as possible with previous chapters. The specific layout of this chapter is as follows. We first formulate a generic, feed-forward recurrent neural network. We calculate gradients of loss functions for these networks in two ways: Real-Time Recurrent Learning (RTRL) and Backpropagation Through Time (BPTT). Using our notation for vector-valued maps, we derive these algorithms directly over the inner product space in which the parameters reside. We then proceed to formally represent a vanilla RNN, which is the simplest form of RNN, and we formulate RTRL and BPTT for that as well. At the end of the chapter, we briefly mention modern RNN variants in the context of our generic framework.
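
For concreteness, the objects discussed above can be sketched in the conventional textbook notation for a vanilla RNN. This is only an illustrative formulation, not the notation developed in the chapter; the weight matrices \(W\), \(U\), \(V\), the biases \(b\), \(c\), and the elementwise nonlinearity \(\sigma\) are assumptions made here for the example. The forward recurrence and total loss over a length-\(T\) sequence are

\[
h_i = \sigma\bigl(W h_{i-1} + U x_i + b\bigr), \qquad \hat y_i = g\bigl(V h_i + c\bigr), \qquad L = \sum_{i=1}^{T} L_i(\hat y_i, y_i).
\]

BPTT unrolls the chain rule backwards through the hidden states; for the recurrent weights,

\[
\frac{\partial L}{\partial W} \;=\; \sum_{i=1}^{T} \sum_{k=1}^{i} \frac{\partial L_i}{\partial h_i} \left( \prod_{j=k+1}^{i} \frac{\partial h_j}{\partial h_{j-1}} \right) \frac{\partial^{+} h_k}{\partial W},
\]

where \(\partial^{+} h_k / \partial W\) denotes the immediate partial derivative that treats \(h_{k-1}\) as constant. RTRL instead propagates the total sensitivity forward alongside the state,

\[
\frac{\partial h_i}{\partial W} \;=\; \frac{\partial h_i}{\partial h_{i-1}} \, \frac{\partial h_{i-1}}{\partial W} \;+\; \frac{\partial^{+} h_i}{\partial W},
\]

so the gradient of each per-step loss is available online, without a backward pass, at the cost of carrying this larger derivative object through time.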


Footnotes
1
We have adopted a slightly different indexing convention in this chapter: notice that \(f_i\) takes in \(h_{i-1}\) and outputs \(h_i\), as opposed to the previous chapters, where we evolved the state variable according to \(x_{i+1} = f_i(x_i)\). This indexing convention is more natural for RNNs, as we will see that the ith prediction depends on \(h_i\) with this adjustment, instead of on \(h_{i+1}\).
 
2
We use \(\overline e_j\) here instead of simply \(e_j\) since we already have \(e_i\) defined in (5.8) and will continue to use it throughout this section.
 
Metadata
Title
Recurrent Neural Networks
Authored by
Anthony L. Caterini
Dong Eui Chang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-75304-1_5