Published in: Social Network Analysis and Mining 1/2021

01-12-2021 | Original Article

CHECKED: Chinese COVID-19 fake news dataset

Authors: Chen Yang, Xinyi Zhou, Reza Zafarani

Abstract

COVID-19 has affected lives everywhere. To maintain social distancing and avoid exposure, work and daily life have gradually moved online. Under this trend, the use of social media to obtain COVID-19 news has increased, and misinformation about COVID-19 is frequently spread on these platforms. In this work, we develop CHECKED, the first Chinese dataset on COVID-19 misinformation. CHECKED provides a total of 2,104 verified microblogs related to COVID-19 from December 2019 to August 2020, identified using a specific list of keywords. Correspondingly, CHECKED includes 1,868,175 reposts, 1,185,702 comments, and 56,852,736 likes that reveal how these verified microblogs are spread and reacted to on Weibo. The dataset contains a rich set of multimedia information for each microblog, including the ground-truth label and textual, visual, temporal, and network information. Extensive experiments have been conducted to analyze the CHECKED data and to provide benchmark results for well-established methods when predicting fake news using CHECKED. We hope that CHECKED can facilitate studies that target misinformation on coronavirus. The dataset is available at https://github.com/cyang03/CHECKED.

Metadata
Title: CHECKED: Chinese COVID-19 fake news dataset
Authors: Chen Yang, Xinyi Zhou, Reza Zafarani
Publication date: 01-12-2021
Publisher: Springer Vienna
Published in: Social Network Analysis and Mining, Issue 1/2021
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI: https://doi.org/10.1007/s13278-021-00766-8
