Bots in Social and Interaction Networks: Detection and Impact Estimation

Published: 17 October 2020

Abstract

The rise of bots and their influence on social networks has attracted the interest of many researchers. Despite these efforts, distinguishing bots from legitimate users remains difficult. Here, we propose a simple yet effective semi-supervised method that distinguishes bots from legitimate users with high accuracy. The method learns a joint representation of the social connections and interactions between users by leveraging graph-based representation learning. A sample of known bots is then used as seeds for a label propagation algorithm that runs on the proximity graph derived from the user embeddings. We demonstrate that when label propagation is driven by pairwise account proximity, our method achieves F1 = 0.93, whereas other state-of-the-art techniques achieve F1 ≤ 0.87. By applying our method to a large dataset of retweets, we uncover the presence of distinct clusters of bots in the network of Twitter interactions. Interestingly, these clusters exhibit different degrees of integration with legitimate users. Analyzing the interactions produced by the different clusters of bots, our results suggest that a significant group of users was systematically exposed to content produced by bots and to interactions with bots, indicating the presence of a selective exposure phenomenon.
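To make the pipeline concrete, the following is a minimal sketch of the detection approach described above, run on synthetic data. It is not the authors' implementation: the paper learns joint connection-and-interaction embeddings with a dedicated graph embedding system, whereas this sketch substitutes scikit-learn's spectral embedding, and it uses LabelSpreading over a k-NN proximity graph in place of the paper's label propagation step. All graph parameters, seed-set sizes, and hyperparameters below are illustrative assumptions.

```python
# Sketch of the detection pipeline: embed users, build a k-NN proximity
# graph in embedding space, propagate labels from a small seed set of bots.
# Synthetic toy data; NOT the paper's implementation or datasets.
import numpy as np
import networkx as nx
from sklearn.manifold import SpectralEmbedding
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)

# Step 1: a toy user graph standing in for the combined network of social
# connections and interactions. Two planted groups: nodes 0-49 play the
# role of legitimate users, nodes 50-99 the role of bots.
G = nx.planted_partition_graph(2, 50, p_in=0.15, p_out=0.01, seed=0)
A = nx.to_numpy_array(G)

# Step 2: learn user embeddings; proximity in this space approximates
# account similarity. Spectral embedding is a stand-in for the paper's
# graph-based representation learning.
emb = SpectralEmbedding(n_components=8, affinity="precomputed").fit_transform(A)

# Step 3: semi-supervised label propagation. -1 marks unlabeled accounts;
# only a handful of accounts per class are revealed as seeds.
y = -np.ones(100, dtype=int)
seeds = rng.choice(50, size=3, replace=False)
y[seeds] = 0          # a few known legitimate users
y[50 + seeds] = 1     # a small sample of known bots (the seed set)

# LabelSpreading with a k-NN kernel propagates the seed labels over the
# proximity graph implied by the embeddings.
lp = LabelSpreading(kernel="knn", n_neighbors=10).fit(emb, y)
pred = lp.transduction_

print("fraction of accounts flagged as bots:", (pred == 1).mean())
```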

• Published in

  ACM Transactions on Information Systems, Volume 39, Issue 1 (January 2021), 329 pages
  ISSN: 1046-8188
  EISSN: 1558-2868
  DOI: 10.1145/3423044

        Copyright © 2020 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 17 October 2020
        • Accepted: 1 August 2020
        • Revised: 1 May 2020
        • Received: 1 October 2019

        Qualifiers

        • research-article
        • Research
        • Refereed
