ABSTRACT
Given a retweeter network in Twitter for any event, how can we detect groups of users who collude to retweet together maliciously? A large number of retweets often indicates the virality of a post. Retweets also help increase the visibility and volume of hashtags, topics, or URLs, and thereby promote the event associated with them. Our primary hypothesis is that such users exhibit synchronized or otherwise indicative patterns in their behavior. In this paper, we propose (i) MalReG, a novel algorithm to detect retweeter groups, and (ii) a set of 23 group-based features (entropy-based and temporal-based) to train a supervised model that identifies malicious retweeter groups (MRG). We present experiments on three real-world datasets with more than 10 million retweets crawled from Twitter. MalReG identifies 1,017 retweeter groups in these datasets. A supervised model trained to detect MRG achieves a ROC AUC of 0.921 using Random Forest, a 7.97% improvement in AUC over the baseline. Additionally, we perform geographical location-based and temporal analysis of these groups. Interestingly, we find the same group retweeting about different political events that took place on different continents at different times. We also discover masquerading techniques used by MRG to evade detection.
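The abstract does not list the 23 features, so the following is only an illustrative sketch of what an entropy-based group feature could look like, not the authors' definition. It computes the Shannon entropy of binned gaps between consecutive retweets in a group; the function names, the 60-second bin width, and the sample timestamps are all assumptions made for this example. The intuition is that tightly synchronized groups may produce low-entropy gap distributions.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1 / p), summed over the distinct observed values
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def gap_entropy(timestamps, bin_seconds=60):
    """Entropy of binned gaps between consecutive retweets.

    `timestamps` are retweet times in seconds for one group (hypothetical input).
    `bin_seconds` is an assumed bin width, not a value taken from the paper.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    bins = [int(gap // bin_seconds) for gap in gaps]
    return shannon_entropy(bins)


# Hypothetical groups: one retweets every 10 seconds, the other irregularly.
synchronized_group = [0, 10, 20, 30, 40, 50]
irregular_group = [0, 30, 400, 2000, 2100, 9000]

print(gap_entropy(synchronized_group))
print(gap_entropy(irregular_group))
```

In this toy data the evenly spaced group falls into a single gap bin, while the irregular group spreads across several bins and scores higher. A real feature set would combine many such signals before training a classifier.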
Index Terms
- MalReG: Detecting and Analyzing Malicious Retweeter Groups