
29.01.2020 | Original Article

Learning deep hierarchical and temporal recurrent neural networks with residual learning

Authors: Tehseen Zia, Assad Abbas, Usman Habib, Muhammad Sajid Khan

Published in: International Journal of Machine Learning and Cybernetics | Issue 4/2020


Abstract

Learning both hierarchical and temporal dependencies can be crucial for recurrent neural networks (RNNs) to deeply understand sequences. To this end, a unified RNN framework is required that eases the learning of both deep hierarchical and temporal structures by allowing gradients to propagate backward from both ends without vanishing. Residual learning (RL) has emerged as an effective and inexpensive method to facilitate the backward propagation of gradients. The benefit of RL has so far been shown separately for learning deep hierarchical representations and for learning temporal dependencies; however, little effort has been made to unify these findings into a single framework for learning deep RNNs. In this study, we argue that approximating identity mappings is crucial for optimizing both hierarchical and temporal structures. We propose a framework, hierarchical and temporal residual RNNs, that learns RNNs by approximating identity mappings across both hierarchical and temporal structures. To validate the proposed method, we examine the efficacy of shortcut connections for training deep RNN structures on sequence learning problems. Experiments on the Penn Treebank, Hutter Prize, and IAM-OnDB datasets demonstrate the utility of the framework in terms of accuracy and computational complexity. We show that, even for large datasets, investing parameters in network depth while reducing the size of the RNN "state" can yield computational benefits.
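The core mechanism the abstract describes — letting each update approximate an identity mapping so that gradients can flow back through both the layer stack (hierarchical axis) and the time steps (temporal axis) — can be illustrated with a minimal PyTorch sketch. This is not the authors' exact formulation; the class name ResidualHTRNN is hypothetical, and it assumes equal input and hidden sizes so the identity shortcuts need no projection. Each cell's output is added to its previous hidden state (temporal shortcut), and each layer's input skips around the cell (hierarchical shortcut), so every transformation only learns a residual around an identity mapping.

```python
# Minimal sketch (assumed formulation, not the paper's exact equations) of a
# stacked RNN with identity shortcuts along both depth and time.
import torch
import torch.nn as nn


class ResidualHTRNN(nn.Module):
    """Stacked RNN with residual connections in depth and in time.

    Hypothetical illustration: assumes input size == hidden size so the
    identity shortcuts require no learned projection.
    """

    def __init__(self, size: int, num_layers: int):
        super().__init__()
        self.cells = nn.ModuleList(
            [nn.RNNCell(size, size) for _ in range(num_layers)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (seq_len, batch, size)
        seq_len, batch, size = x.shape
        h = [torch.zeros(batch, size, device=x.device) for _ in self.cells]
        outputs = []
        for t in range(seq_len):
            inp = x[t]
            for l, cell in enumerate(self.cells):
                # Temporal shortcut: the previous state is carried forward
                # unchanged, and the cell only contributes a residual.
                h_new = h[l] + cell(inp, h[l])
                # Hierarchical shortcut: the layer input skips around the
                # cell on its way up the stack.
                inp = inp + h_new
                h[l] = h_new
            outputs.append(inp)
        return torch.stack(outputs)  # (seq_len, batch, size)


# Usage: a 3-layer model over a toy sequence.
model = ResidualHTRNN(size=32, num_layers=3)
y = model(torch.randn(10, 4, 32))  # seq_len=10, batch=4, feature=32
print(y.shape)  # torch.Size([10, 4, 32])
```

Because both shortcuts are plain additions, the backward pass always has an identity path to every earlier layer and time step, which is the property the framework relies on to train deeper and longer-unrolled RNNs.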

Metadata
Title
Learning deep hierarchical and temporal recurrent neural networks with residual learning
Authors
Tehseen Zia
Assad Abbas
Usman Habib
Muhammad Sajid Khan
Publication date
29.01.2020
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 4/2020
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-020-01063-0
