
09.11.2017 | Neural Information Retrieval

Using word embeddings in Twitter election classification

Authors: Xiao Yang, Craig Macdonald, Iadh Ounis

Published in: Discover Computing | Issue 2-3/2018


Abstract

Word embeddings and convolutional neural networks (CNN) have attracted extensive attention in various Twitter classification tasks, e.g. sentiment classification. However, the effect of the configuration used to generate the word embeddings on the classification performance has not been studied in the existing literature. In this paper, using a Twitter election classification task that aims to detect election-related tweets, we investigate the impact of the background dataset used to train the embedding models, as well as the parameters of the word embedding training process, namely the context window size, the dimensionality and the number of negative samples, on the attained classification performance. By comparing the classification results of word embedding models trained on different background corpora (e.g. Wikipedia articles and Twitter microposts), we show that the background data should align with the Twitter classification dataset in both data type and time period to achieve significantly better performance than baselines such as SVM with TF-IDF. Moreover, by evaluating the results of word embedding models trained using various context window sizes and dimensionalities, we find that large context window and dimension sizes are preferable for improving performance. In contrast, the number of negative samples does not significantly affect the performance of the CNN classifiers. Our experimental results also show that choosing the correct word embedding model for use with a CNN leads to statistically significant improvements over various baselines, such as random, SVM with TF-IDF and SVM with word embeddings. Finally, for out-of-vocabulary (OOV) words that are not covered by the learned word embedding models, we show that a simple strategy that randomly initialises OOV words without any prior knowledge is sufficient to attain good classification performance, compared to more elaborate OOV strategies (e.g. random initialisation using the statistics of the pre-trained word embedding models).
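The three training parameters studied in the abstract are exposed directly by common word2vec implementations. As a minimal sketch (not the authors' exact setup), the following shows how they map onto gensim 4.x's Word2Vec; the toy corpus and the parameter values are illustrative only:

```python
from gensim.models import Word2Vec

# Illustrative background corpus: an iterable of tokenised tweets.
tweets = [
    ["vote", "early", "today"],
    ["watching", "the", "election", "results", "tonight"],
]

model = Word2Vec(
    sentences=tweets,
    vector_size=500,  # dimensionality of the word embeddings
    window=10,        # context window size
    negative=5,       # number of negative samples
    sg=1,             # skip-gram variant of word2vec
    min_count=1,      # illustrative; real corpora use a higher threshold
    workers=4,
)
```

Similarly, the simple OOV strategy can be sketched as follows: when building the CNN's embedding matrix, words covered by the pre-trained model keep their learned vectors, while OOV words are drawn uniformly at random without any prior knowledge. The embedding_matrix helper and the initialisation bounds below are assumptions for illustration, not the paper's exact values:

```python
import numpy as np

def embedding_matrix(vocab, model, dim, seed=0):
    """Map each vocabulary word to its pre-trained vector,
    randomly initialising OOV words (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    mat = np.zeros((len(vocab), dim), dtype=np.float32)
    for i, word in enumerate(vocab):
        if word in model.wv:
            mat[i] = model.wv[word]  # pre-trained embedding
        else:
            # Simple OOV strategy: uniform random vector, no prior
            # knowledge; the (-0.25, 0.25) bounds are illustrative.
            mat[i] = rng.uniform(-0.25, 0.25, size=dim)
    return mat
```

The alternative strategy mentioned above would instead sample OOV vectors from a distribution fitted to the pre-trained embedding matrix (e.g. its per-dimension mean and standard deviation); the paper's finding is that this extra step is not needed to reach good classification performance.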


Metadata
Title
Using word embeddings in Twitter election classification
Authors
Xiao Yang
Craig Macdonald
Iadh Ounis
Publication date
09.11.2017
Publisher
Springer Netherlands
Published in
Discover Computing / Issue 2-3/2018
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-017-9319-5
