Gaussian Processes for Rumour Stance Classification in Social Media

Authors:
Michal Lukasik, University of Sheffield, Sheffield, United Kingdom
Kalina Bontcheva, University of Sheffield, Sheffield, United Kingdom
Trevor Cohn, University of Melbourne, Melbourne, Australia
Arkaitz Zubiaga, University of Warwick, Coventry, United Kingdom
Maria Liakata, University of Warwick and Alan Turing Institute, Coventry, United Kingdom
Rob Procter, University of Warwick and Alan Turing Institute, Coventry, United Kingdom

ACM Transactions on Information Systems, Volume 37, Issue 2, Article No. 20, pp. 1–24. https://doi.org/10.1145/3295823

Published: 13 February 2019

Abstract

Social media tend to be rife with rumours while new reports are released piecemeal during breaking news. Interestingly, one can mine multiple reactions expressed by social media users in those situations, exploring their stance towards rumours, ultimately enabling the flagging of highly disputed rumours as being potentially false. In this work, we set out to develop an automated, supervised classifier that uses multi-task learning to classify the stance expressed in each individual tweet in a conversation around a rumour as either supporting, denying or questioning the rumour. Using a Gaussian Process classifier, and exploring its effectiveness on two datasets with very different characteristics and varying distributions of stances, we show that our approach consistently outperforms competitive baseline classifiers. Our classifier is especially effective in estimating the distribution of different types of stance associated with a given rumour, which we set forth as a desired characteristic for a rumour-tracking system that will show both ordinary users of Twitter and professional news practitioners how others orient to the disputed veracity of a rumour, with the final aim of establishing its actual truth value.
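
The abstract emphasises estimating the distribution of stances around a rumour. As a rough illustration only, the sketch below turns per-tweet stance labels into a smoothed distribution and compares a predicted distribution against a gold one using Kullback-Leibler divergence. The choice of KL divergence, the add-one smoothing, the function names, and the toy labels are all assumptions made for this example; none of them is taken from the article, and no Gaussian Process model is fitted here.

```python
import math
from collections import Counter

# The three stance labels named in the abstract.
STANCES = ("supporting", "denying", "questioning")


def stance_distribution(labels, smoothing=1.0):
    """Return the smoothed proportion of each stance among tweet labels.

    Add-one style smoothing (an assumption of this sketch) keeps every
    probability strictly positive so the divergence below is finite.
    """
    counts = Counter(labels)
    total = len(labels) + smoothing * len(STANCES)
    return {s: (counts[s] + smoothing) / total for s in STANCES}


def kl_divergence(p, q):
    """Compute KL(p || q) in nats over the stance labels."""
    return sum(p[s] * math.log(p[s] / q[s]) for s in STANCES if p[s] > 0)


# Hypothetical labels for ten tweets discussing one rumour.
gold = ["supporting"] * 6 + ["denying"] * 3 + ["questioning"] * 1
predicted = ["supporting"] * 7 + ["denying"] * 2 + ["questioning"] * 1

p_gold = stance_distribution(gold)
p_pred = stance_distribution(predicted)
divergence = kl_divergence(p_gold, p_pred)
print(p_gold, p_pred, divergence)
```

A lower divergence means the predicted mix of supporting, denying and questioning tweets is closer to the annotated mix, even when some individual tweets are misclassified.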

Supplemental Material

lukasik.zip (26.2 KB): supplemental movie, appendix, image and software files for Gaussian Processes for Rumour Stance Classification in Social Media

Index Terms

Information systems

Published in
ACM Transactions on Information Systems, Volume 37, Issue 2
April 2019
410 pages
ISSN: 1046-8188
EISSN: 1558-2868
DOI: 10.1145/3306215
Editor: Maarten de Rijke, University of Amsterdam, The Netherlands
Copyright © 2019 Owner/Author
This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.
Publisher: Association for Computing Machinery, New York, NY, United States
Publication History
- Published: 13 February 2019
- Accepted: 1 November 2018
- Revised: 1 October 2018
- Received: 1 August 2016

Author Tags
Social media
breaking news
machine learning
rumours
stance classification
veracity classification
Qualifiers
- research-article
- Research
- Refereed

Article Metrics
- Total Citations: 29
- Total Downloads: 631
- Downloads (Last 12 months): 30
- Downloads (Last 6 weeks): 5