CSI: A Hybrid Deep Model for Fake News Detection

Authors:
Natali Ruchansky, University of Southern California, Los Angeles, CA, USA
Sungyong Seo, University of Southern California, Los Angeles, CA, USA
Yan Liu, University of Southern California, Los Angeles, CA, USA

CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, November 2017, Pages 797–806, https://doi.org/10.1145/3132847.3132877

Published: 06 November 2017

ABSTRACT

The topic of fake news has drawn attention from both the public and the academic community. Such misinformation has the potential to affect public opinion, giving malicious parties an opportunity to manipulate the outcomes of public events such as elections. Because such high stakes are at play, automatically detecting fake news is an important yet challenging problem that is not yet well understood. Nevertheless, there are three generally agreed-upon characteristics of fake news: the text of an article, the user response it receives, and the source users promoting it. Existing work has largely focused on tailoring solutions to one particular characteristic, which has limited its success and generality.

In this work, we propose a model that combines all three characteristics for more accurate and automated prediction. Specifically, we incorporate the behavior of both parties, users and articles, and the group behavior of users who propagate fake news. Motivated by the three characteristics, we propose a model called CSI, which is composed of three modules: Capture, Score, and Integrate. The first module is based on the response and text; it uses a Recurrent Neural Network to capture the temporal pattern of user activity on a given article. The second module learns the source characteristic based on the behavior of users. The third module integrates the outputs of the first two to classify an article as fake or not. Experimental analysis on real-world data demonstrates that CSI achieves higher accuracy than existing models and extracts meaningful latent representations of both users and articles.
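
The three-module structure described above can be illustrated with a small forward-pass sketch. This is not the authors' implementation: it is a minimal reading of the abstract only, using a plain tanh recurrent cell as a stand-in for whatever recurrent unit the paper uses, random untrained weights, and toy inputs. Every function name, dimension, and feature layout here (`capture`, `score`, `integrate`, `FEATURE_DIM`, and so on) is a hypothetical choice made for illustration.

```python
import math
import random

# Illustrative forward pass only: weights are random and untrained.
# All names and dimensions are hypothetical, not taken from the paper.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def rand_matrix(rng, rows, cols):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def capture(engagements, w_in, w_rec):
    """Capture: run a plain tanh RNN over the article's time-ordered
    engagement features and return the final hidden state."""
    hidden = [0.0] * len(w_rec)
    for features in engagements:
        pre = [a + b for a, b in zip(matvec(w_in, features), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
    return hidden

def score(user_features, w_user):
    """Score: map each engaging user's feature vector to a scalar in (0, 1),
    then average the scores into one source summary for the article."""
    scores = [sigmoid(sum(w * x for w, x in zip(w_user, u))) for u in user_features]
    return scores, sum(scores) / len(scores)

def integrate(article_vec, source_score, w_out):
    """Integrate: concatenate both summaries and emit P(article is fake)."""
    combined = article_vec + [source_score]
    return sigmoid(sum(w * x for w, x in zip(w_out, combined)))

rng = random.Random(0)
FEATURE_DIM, HIDDEN_DIM, USER_DIM = 4, 3, 2  # hypothetical sizes

w_in = rand_matrix(rng, HIDDEN_DIM, FEATURE_DIM)
w_rec = rand_matrix(rng, HIDDEN_DIM, HIDDEN_DIM)
w_user = [rng.uniform(-0.5, 0.5) for _ in range(USER_DIM)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN_DIM + 1)]

# Toy inputs: 5 time steps of engagement features and 3 engaging users.
engagements = [[rng.random() for _ in range(FEATURE_DIM)] for _ in range(5)]
users = [[rng.random() for _ in range(USER_DIM)] for _ in range(3)]

article_vec = capture(engagements, w_in, w_rec)
user_scores, source_score = score(users, w_user)
p_fake = integrate(article_vec, source_score, w_out)
```

In this sketch the recurrent state summarizes the response and text over time, the per-user scores summarize the source, and only the final step sees both. In a trained model all three sets of weights would presumably be learned jointly from labeled articles, which this untrained example does not attempt.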

Index Terms

CSI: A Hybrid Deep Model for Fake News Detection
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
    2. Machine learning approaches
      1. Neural networks
2. Information systems
  1. World Wide Web
    1. Web applications
      1. Social networks

Published in
CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management
November 2017
2604 pages
ISBN: 9781450349185
DOI: 10.1145/3132847
General Chairs:
Ee-Peng Lim, Singapore Management University, Singapore
Marianne Winslett, University of Illinois at Urbana-Champaign, USA, and Advanced Digital Sciences Center, Singapore
Program Chairs:
Mark Sanderson, RMIT, Australia
Ada Fu, Chinese University of Hong Kong, Hong Kong
Jimeng Sun, Georgia Tech, USA
Shane Culpepper, RMIT, Australia
Eric Lo, Chinese University of Hong Kong, Hong Kong
Joyce Ho, Emory University, USA
Debora Donato, Mix Tech, Inc., USA
Rakesh Agrawal, Data Insights Laboratories, USA
Yu Zheng, Microsoft Research Asia, China
Carlos Castillo, Qatar Computing Research Institute, Qatar
Aixin Sun, Nanyang Technological University, Singapore
Vincent S. Tseng, National Cheng Kung University, Taiwan
Chenliang Li, Wuhan University, China
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 6 November 2017
Author Tags
deep learning
fake news detection
group anomaly detection
neural network
social networks
temporal analysis
Qualifiers
- research-article
Conference

Acceptance Rates
CIKM '17 Paper Acceptance Rate: 171 of 855 submissions, 20%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
Article Metrics
- Total Citations: 477
- Total Downloads: 10,046
- Downloads (Last 12 months): 1,356
- Downloads (Last 6 weeks): 204