
2019 | OriginalPaper | Chapter

9. Attention and Memory Augmented Networks

Authors: Uday Kamath, John Liu, James Whitaker

Published in: Deep Learning for NLP and Speech Recognition

Publisher: Springer International Publishing


Abstract

In deep learning, as we have seen in the previous chapters, there are well-established architectures for handling spatial and temporal data: various forms of convolutional and recurrent networks, respectively. However, when the data exhibits dependencies such as out-of-order access or long-term dependencies, most of the standard architectures discussed so far are not suitable. Consider a specific example from the bAbI dataset, in which a set of stories/facts is presented, a question is asked, and the answer must be inferred from the stories. As shown in Fig. 9.1, finding the right answer requires out-of-order access to the facts as well as long-term dependencies.
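Since Fig. 9.1 is not reproduced here, the following minimal, self-contained Python sketch stands in for it. The toy facts, the question, and the bag-of-words embed helper are illustrative assumptions rather than the chapter's own code; the addressing step, a softmax over fact-question inner products, is a simplified instance of the content-based attention used by memory-augmented networks.

    import numpy as np

    # Toy bAbI-style task: facts are presented, a question is asked, and the
    # answer must be inferred by attending to the relevant facts, which may
    # appear out of order and far apart in the story.
    facts = [
        "John moved to the bedroom",
        "Mary grabbed the football",
        "Sandra went to the kitchen",
        "Mary travelled to the garden",
    ]
    question = "Where is the football"

    # Hypothetical helper: bag-of-words embedding over a shared vocabulary.
    vocab = sorted({w for s in facts + [question] for w in s.lower().split()})

    def embed(sentence):
        v = np.zeros(len(vocab))
        for w in sentence.lower().split():
            v[vocab.index(w)] += 1.0
        return v

    M = np.stack([embed(f) for f in facts])  # memory: one row per fact
    q = embed(question)                      # query vector

    # Content-based addressing: a softmax over inner products scores each
    # stored fact against the question.
    scores = M @ q
    attn = np.exp(scores - scores.max())
    attn /= attn.sum()

    for f, a in zip(facts, attn):
        print(f"{a:.2f}  {f}")
    # The largest weight falls on "Mary grabbed the football". In a full
    # memory network, a second hop would re-query with the attended fact to
    # surface "Mary travelled to the garden" and yield the answer "garden".

Even in this toy form, the sketch illustrates why the task defeats plain recurrent models: the two facts that jointly determine the answer are non-adjacent, so the model must address memory by content rather than by position.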


Metadata
Title
Attention and Memory Augmented Networks
Authors
Uday Kamath
John Liu
James Whitaker
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-14596-5_9
