Published in: Cognitive Computation 4/2018

02.03.2018

Generating Word Embeddings from an Extreme Learning Machine for Sentiment Analysis and Sequence Labeling Tasks

Authors: Paula Lauren, Guangzhi Qu, Jucheng Yang, Paul Watta, Guang-Bin Huang, Amaury Lendasse


Abstract

Word embeddings are low-dimensional distributed representations produced by a family of language modeling and feature learning techniques from Natural Language Processing (NLP). Words or phrases from the vocabulary are mapped to vectors of real numbers in a low-dimensional space. In previous work, we proposed using an Extreme Learning Machine (ELM) for generating word embeddings. In this research, we apply the ELM-based word embeddings to the NLP task of text categorization, specifically Sentiment Analysis and Sequence Labeling. The ELM-based word embeddings use a count-based approach similar to the Global Vectors (GloVe) model: the word-context matrix is computed and then matrix factorization is applied. A comparative study is conducted with Word2Vec and GloVe, the two popular state-of-the-art models. The results show that ELM-based word embeddings slightly outperform these two methods on the Sentiment Analysis and Sequence Labeling tasks. In addition, only one hyperparameter is needed for ELM, whereas several are required by the other methods. ELM-based word embeddings are thus comparable to the state-of-the-art Word2Vec and GloVe models. Moreover, the count-based ELM model produces word similarities resembling those of both the count-based GloVe and the prediction-based Word2Vec models, with subtle differences.
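The count-based pipeline described above can be illustrated in a minimal sketch: build a word-context co-occurrence matrix from a tokenized corpus, then factorize it with an ELM-autoencoder-style step (random hidden layer, least-squares output weights via the pseudoinverse). This is an illustrative assumption of the approach, not the paper's exact implementation; window size, weighting, and the activation choice here are placeholders.

```python
import numpy as np

def cooccurrence_matrix(corpus, window=2):
    """Build a symmetric word-context count matrix from tokenized sentences."""
    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    M = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if j != i:
                    M[idx[w], idx[sent[j]]] += 1.0
    return M, vocab

def elm_autoencoder_embeddings(M, dim=2, seed=0):
    """ELM-autoencoder-style factorization (sketch): a random hidden layer
    followed by a pseudoinverse solve for the output weights beta, so that
    H @ beta reconstructs M; projecting M through beta.T gives embeddings."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((M.shape[1], dim))  # random, untrained input weights
    b = rng.standard_normal(dim)                # random biases
    H = np.tanh(M @ W + b)                      # hidden-layer feature mapping
    beta = np.linalg.pinv(H) @ M                # least-squares output weights
    return M @ beta.T                           # low-dimensional word vectors

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
M, vocab = cooccurrence_matrix(corpus, window=2)
E = elm_autoencoder_embeddings(M, dim=2)
print(E.shape)  # one dim-dimensional vector per vocabulary word
```

Because the input weights are random and only the output weights are solved in closed form, the single tunable quantity here is the embedding dimension, which mirrors the abstract's point that the ELM approach needs only one hyperparameter.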


Metadata
Title
Generating Word Embeddings from an Extreme Learning Machine for Sentiment Analysis and Sequence Labeling Tasks
Authors
Paula Lauren
Guangzhi Qu
Jucheng Yang
Paul Watta
Guang-Bin Huang
Amaury Lendasse
Publication date
02.03.2018
Publisher
Springer US
Published in
Cognitive Computation / Issue 4/2018
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI
https://doi.org/10.1007/s12559-018-9548-y
