Abstract
Social media tend to be rife with rumours during breaking news, when new reports are released piecemeal. One can mine the reactions that social media users express in those situations to explore their stance towards rumours, which ultimately enables highly disputed rumours to be flagged as potentially false. In this work, we set out to develop an automated, supervised classifier that uses multi-task learning to classify the stance expressed in each individual tweet in a conversation around a rumour as either supporting, denying or questioning the rumour. Using a Gaussian Process classifier, and exploring its effectiveness on two datasets with very different characteristics and varying distributions of stances, we show that our approach consistently outperforms competitive baseline classifiers. Our classifier is especially effective in estimating the distribution of different types of stance associated with a given rumour. We set this forth as a desired characteristic for a rumour-tracking system that will show both ordinary users of Twitter and professional news practitioners how others orient to the disputed veracity of a rumour, with the final aim of establishing its actual truth value.
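As an illustration of what estimating the distribution of stances for a rumour can look like, the following is a minimal sketch and not the authors' implementation. It assumes per-tweet stance labels are already available (for example, from some classifier), turns them into a smoothed distribution over the three stances named in the abstract, and compares a predicted distribution against a reference one with the Kullback-Leibler divergence. The function names, the add-one smoothing, and the 0.4 threshold used for flagging are all hypothetical choices made for this example.

```python
from collections import Counter
from math import log

# The three stance labels named in the abstract.
STANCES = ("supporting", "denying", "questioning")


def stance_distribution(labels, smoothing=1.0):
    """Smoothed relative frequency of each stance among per-tweet labels.

    Labels outside STANCES are ignored. The additive smoothing (an assumption
    of this sketch) keeps every probability strictly positive.
    """
    counts = Counter(label for label in labels if label in STANCES)
    total = sum(counts.values()) + smoothing * len(STANCES)
    return {s: (counts[s] + smoothing) / total for s in STANCES}


def kl_divergence(p, q):
    """KL(p || q) in nats; lower means q is closer to the reference p."""
    return sum(p[s] * log(p[s] / q[s]) for s in STANCES if p[s] > 0)


def flag_disputed(dist, threshold=0.4):
    """Hypothetical rule: flag a rumour when the combined probability of
    denying and questioning tweets exceeds the threshold."""
    return dist["denying"] + dist["questioning"] > threshold


# Toy data: annotated stances versus stances predicted for the same rumour.
gold_labels = ["supporting"] * 6 + ["denying"] * 3 + ["questioning"]
predicted_labels = ["supporting"] * 7 + ["denying"] * 2 + ["questioning"]

gold_dist = stance_distribution(gold_labels)
predicted_dist = stance_distribution(predicted_labels)

print("reference distribution:", gold_dist)
print("predicted distribution:", predicted_dist)
print("KL(reference || predicted):", kl_divergence(gold_dist, predicted_dist))
print("flag as disputed:", flag_disputed(predicted_dist))
```

A divergence of zero means the predicted stance distribution matches the reference exactly; the smoothing constant and threshold would need tuning on real data.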
Supplemental Material
Supplemental movie, appendix, image and software files for Gaussian Processes for Rumour Stance Classification in Social Media