
09.11.2017 | Neural Information Retrieval

Using word embeddings in Twitter election classification

Authors: Xiao Yang, Craig Macdonald, Iadh Ounis

Published in: Discover Computing | Issue 2-3/2018


Abstract

Word embeddings and convolutional neural networks (CNN) have attracted extensive attention in various Twitter classification tasks, e.g. sentiment classification. However, the effect of the configuration used to generate the word embeddings on the classification performance has not been studied in the existing literature. In this paper, using a Twitter election classification task that aims to detect election-related tweets, we investigate the impact of the background dataset used to train the embedding models, as well as the parameters of the word embedding training process, namely the context window size, the dimensionality and the number of negative samples, on the attained classification performance. By comparing the classification results of word embedding models trained on different background corpora (e.g. Wikipedia articles and Twitter microposts), we show that the background data should align with the Twitter classification dataset in both data type and time period to achieve significantly better performance than baselines such as SVM with TF-IDF. Moreover, by evaluating the results of word embedding models trained using various context window sizes and dimensionalities, we find that large context window and dimension sizes are preferable for improving performance. In contrast, the number of negative samples does not significantly affect the performance of the CNN classifiers. Our experimental results also show that choosing the correct word embedding model for use with a CNN leads to statistically significant improvements over various baselines, such as random, SVM with TF-IDF and SVM with word embeddings. Finally, for out-of-vocabulary (OOV) words that are not covered by the learned word embedding models, we show that a simple strategy that randomly initialises OOV words without any prior knowledge is sufficient to attain good classification performance, compared to more elaborate OOV strategies (e.g. random initialisation using the statistics of the pre-trained word embedding models).
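The three training parameters studied in the abstract are exposed directly by common word2vec implementations. As a minimal sketch (not the authors' exact setup), the following shows how they map onto gensim 4.x's Word2Vec; the toy corpus and the parameter values are illustrative only:

```python
from gensim.models import Word2Vec

# Illustrative background corpus: an iterable of tokenised tweets.
tweets = [
    ["vote", "early", "today"],
    ["watching", "the", "election", "results", "tonight"],
]

model = Word2Vec(
    sentences=tweets,
    vector_size=500,  # dimensionality of the word embeddings
    window=10,        # context window size
    negative=5,       # number of negative samples
    sg=1,             # skip-gram variant of word2vec
    min_count=1,      # illustrative; real corpora use a higher threshold
    workers=4,
)
```

Similarly, the simple OOV strategy can be sketched as follows: when building the CNN's embedding matrix, words covered by the pre-trained model keep their learned vectors, while OOV words are drawn uniformly at random without any prior knowledge. The embedding_matrix helper and the initialisation bounds below are assumptions for illustration, not the paper's exact values:

```python
import numpy as np

def embedding_matrix(vocab, model, dim, seed=0):
    """Map each vocabulary word to its pre-trained vector,
    randomly initialising OOV words (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    mat = np.zeros((len(vocab), dim), dtype=np.float32)
    for i, word in enumerate(vocab):
        if word in model.wv:
            mat[i] = model.wv[word]  # pre-trained embedding
        else:
            # Simple OOV strategy: uniform random vector, no prior
            # knowledge; the (-0.25, 0.25) bounds are illustrative.
            mat[i] = rng.uniform(-0.25, 0.25, size=dim)
    return mat
```

The alternative strategy mentioned above would instead sample OOV vectors from a distribution fitted to the pre-trained embedding matrix (e.g. its per-dimension mean and standard deviation); the paper's finding is that this extra step is not needed to reach good classification performance.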


Metadata
Title
Using word embeddings in Twitter election classification
Authors
Xiao Yang
Craig Macdonald
Iadh Ounis
Publication date
09.11.2017
Publisher
Springer Netherlands
Published in
Discover Computing / Issue 2-3/2018
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-017-9319-5
