ABSTRACT
Recent studies of social media spam and automation offer anecdotal evidence of the rise of a new generation of spambots, so-called social spambots. Here, for the first time, we extensively study this novel phenomenon on Twitter and provide quantitative evidence of a paradigm shift in spambot design. First, we measure Twitter's current capability to detect the new social spambots. Next, we assess human performance in discriminating between genuine accounts, social spambots, and traditional spambots. Then, we benchmark several state-of-the-art techniques proposed in the academic literature. Results show that neither Twitter, nor humans, nor cutting-edge applications are currently capable of accurately detecting the new social spambots. Our results call for new approaches capable of turning the tide in the fight against this rising phenomenon. We conclude by reviewing the latest literature on spambot detection and highlighting an emerging common research trend based on the analysis of collective behaviors. Insights derived from both our extensive experimental campaign and our survey shed light on the most promising directions of research and lay the foundations for the arms race against the novel social spambots. Finally, to foster research on this novel phenomenon, we make all the datasets used in this study publicly available to the scientific community.
Index Terms
- The Paradigm-Shift of Social Spambots: Evidence, Theories, and Tools for the Arms Race