
21.06.2019 | Hybrid Artificial Intelligence and Machine Learning Technologies

Recurrent neural network with attention mechanism for language model

Authors: Mu-Yen Chen, Hsiu-Sen Chiang, Arun Kumar Sangaiah, Tsung-Che Hsieh

Published in: Neural Computing and Applications | Issue 12/2020


Abstract

The rapid growth of the Internet has produced an equally rapid growth in textual data, from which people extract the information they need to solve problems. Such data may contain latent information, such as crowd opinions, product reviews, or market-relevant signals. To exploit it, however, the question of how to extract features from text must first be answered. A model that extracts text features using neural network methods is called a neural network language model. Its features build on the n-gram concept, that is, the co-occurrence relationships between words. Word vectors are central because sentence and document vectors must still capture the relationships between individual words; this study therefore focuses on word vectors. It assumes that each word carries both "its meaning within the sentence" and "its grammatical position," and on this basis it builds a language model using a recurrent neural network with an attention mechanism. The model is evaluated on the Penn Treebank, WikiText-2, and NLPCC2017 text datasets, on which the proposed models achieve better performance as measured by perplexity.
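To make the approach in the abstract concrete, the following is a minimal sketch, in PyTorch, of a recurrent language model that attends over its own past hidden states and is evaluated by perplexity (the exponential of the mean per-token negative log-likelihood). This is an illustrative assumption, not the authors' exact architecture; the class name, attention form, and hyperparameters are all hypothetical.

import math
import torch
import torch.nn as nn

class AttentiveRNNLM(nn.Module):
    """Sketch: LSTM language model with attention over past hidden states."""

    def __init__(self, vocab_size, embed_dim=256, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, 1)              # scores each past state
        self.proj = nn.Linear(2 * hidden_dim, vocab_size)  # [state; context] -> logits

    def forward(self, tokens):                             # tokens: (batch, time)
        states, _ = self.rnn(self.embed(tokens))           # (batch, time, hidden)
        logits = []
        for t in range(states.size(1)):
            past = states[:, : t + 1]                      # hidden states up to step t
            weights = torch.softmax(self.score(past).squeeze(-1), dim=1)
            context = (weights.unsqueeze(-1) * past).sum(dim=1)
            logits.append(self.proj(torch.cat([states[:, t], context], dim=-1)))
        return torch.stack(logits, dim=1)                  # (batch, time, vocab)

def perplexity(model, tokens):
    # Perplexity = exp(mean negative log-likelihood of the next token).
    with torch.no_grad():
        logits = model(tokens[:, :-1])
        nll = nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))
    return math.exp(nll.item())

# Illustrative usage with a 10k vocabulary (as in the Penn Treebank) and a
# random token batch; an untrained model scores near-chance perplexity.
model = AttentiveRNNLM(vocab_size=10000)
batch = torch.randint(0, 10000, (4, 35))
print(perplexity(model, batch))

Lower perplexity means the model assigns higher probability to the held-out text, which is the sense in which the abstract's "better performance by the perplexity" is to be read.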

Metadata
Title
Recurrent neural network with attention mechanism for language model
Authors
Mu-Yen Chen
Hsiu-Sen Chiang
Arun Kumar Sangaiah
Tsung-Che Hsieh
Publication date
21.06.2019
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 12/2020
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-019-04301-x
