
21.06.2019 | Hybrid Artificial Intelligence and Machine Learning Technologies

Recurrent neural network with attention mechanism for language model

Authors: Mu-Yen Chen, Hsiu-Sen Chiang, Arun Kumar Sangaiah, Tsung-Che Hsieh

Published in: Neural Computing and Applications | Issue 12/2020


Abstract

The rapid growth of the Internet has produced an equally rapid growth in textual data, from which people extract the information they need to solve problems. Such data may contain latent information, such as crowd opinions, product reviews, or market-relevant signals. To exploit it, however, the question of how to extract features from text must first be answered. A model that extracts text features using neural network methods is called a neural network language model. Its features build on the n-gram concept, that is, the co-occurrence relationships between words. Word vectors are central because sentence and document vectors must still capture the relationships between individual words; this study therefore focuses on word vectors. It assumes that each word carries both "its meaning within the sentence" and "its grammatical position," and on this basis it builds a language model using a recurrent neural network with an attention mechanism. The model is evaluated on the Penn Treebank, WikiText-2, and NLPCC2017 text datasets, on which the proposed models achieve better performance as measured by perplexity.
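To make the approach in the abstract concrete, the following is a minimal sketch, in PyTorch, of a recurrent language model that attends over its own past hidden states and is evaluated by perplexity (the exponential of the mean per-token negative log-likelihood). This is an illustrative assumption, not the authors' exact architecture; the class name, attention form, and hyperparameters are all hypothetical.

import math
import torch
import torch.nn as nn

class AttentiveRNNLM(nn.Module):
    """Sketch: LSTM language model with attention over past hidden states."""

    def __init__(self, vocab_size, embed_dim=256, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, 1)              # scores each past state
        self.proj = nn.Linear(2 * hidden_dim, vocab_size)  # [state; context] -> logits

    def forward(self, tokens):                             # tokens: (batch, time)
        states, _ = self.rnn(self.embed(tokens))           # (batch, time, hidden)
        logits = []
        for t in range(states.size(1)):
            past = states[:, : t + 1]                      # hidden states up to step t
            weights = torch.softmax(self.score(past).squeeze(-1), dim=1)
            context = (weights.unsqueeze(-1) * past).sum(dim=1)
            logits.append(self.proj(torch.cat([states[:, t], context], dim=-1)))
        return torch.stack(logits, dim=1)                  # (batch, time, vocab)

def perplexity(model, tokens):
    # Perplexity = exp(mean negative log-likelihood of the next token).
    with torch.no_grad():
        logits = model(tokens[:, :-1])
        nll = nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))
    return math.exp(nll.item())

# Illustrative usage with a 10k vocabulary (as in the Penn Treebank) and a
# random token batch; an untrained model scores near-chance perplexity.
model = AttentiveRNNLM(vocab_size=10000)
batch = torch.randint(0, 10000, (4, 35))
print(perplexity(model, batch))

Lower perplexity means the model assigns higher probability to the held-out text, which is the sense in which the abstract's "better performance by the perplexity" is to be read.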

Metadata
Title
Recurrent neural network with attention mechanism for language model
Authors
Mu-Yen Chen
Hsiu-Sen Chiang
Arun Kumar Sangaiah
Tsung-Che Hsieh
Publication date
21.06.2019
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 12/2020
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-019-04301-x
