DOI: 10.1145/3297001.3297009
research-article

MalReG: Detecting and Analyzing Malicious Retweeter Groups

Published: 03 January 2019

ABSTRACT

Given a retweeter network on Twitter for any event, how can we detect groups of users that collude to retweet together maliciously? A large number of retweets often indicates that a post is viral. It also increases the visibility and volume of hashtags, topics, or URLs, promoting the associated event. Our primary hunch is that the behavior of such users is synchronized or follows an indicative pattern. In this paper, we propose (i) MalReG, a novel algorithm to detect retweeter groups, and (ii) a set of 23 group-based features (entropy-based and temporal-based) to train a supervised model that identifies malicious retweeter groups (MRG). We present experiments on three real-world datasets with more than 10 million retweets crawled from Twitter. MalReG identifies 1,017 retweeter groups in our dataset. We train a supervised learning model to detect MRG, which achieves 0.921 ROC AUC using Random Forest, outperforming the baseline by 7.97% in AUC. Additionally, we perform geographical location-based and temporal analysis of these groups. Interestingly, we find the same group retweeting different political events that took place on different continents at different times. We also discover masquerading techniques used by MRG to evade detection.
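The abstract does not list the 23 group-based features, so the following is only an illustrative sketch of what an entropy-based temporal feature could look like, not the authors' definition. It computes the Shannon entropy of binned gaps between consecutive retweets in a group: tightly synchronized retweeting concentrates the gaps in few bins and yields low entropy. The function name `interval_entropy`, the `bin_seconds` parameter, and the sample timestamps are all assumptions made for this example.

```python
import math
from collections import Counter


def interval_entropy(timestamps, bin_seconds=60):
    """Shannon entropy (in bits) of binned gaps between consecutive retweets.

    Hypothetical feature for illustration; not taken from the paper.
    `timestamps` are retweet times in seconds for one group.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return 0.0  # fewer than two retweets: no gap distribution to measure
    # Bucket each gap into a fixed-width bin and count bin occupancy.
    counts = Counter(int(gap // bin_seconds) for gap in gaps)
    total = len(gaps)
    # H = sum over bins of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Made-up timestamps (seconds since the original tweet).
synchronized = [0, 5, 10, 15, 20, 25]        # every gap falls in the same bin
organic = [0, 30, 200, 1000, 4000, 9000]     # every gap falls in a different bin

low = interval_entropy(synchronized)
high = interval_entropy(organic)
print(low, high)
```

A feature like this would be one column in the feature matrix given to a classifier such as Random Forest; the bin width is a tuning choice, and a real pipeline would likely combine several such features.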


      • Published in

        ACM Other conferences
        CODS-COMAD '19: Proceedings of the ACM India Joint International Conference on Data Science and Management of Data
        January 2019
        380 pages
        ISBN: 9781450362078
        DOI: 10.1145/3297001

        Copyright © 2019 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        CODS-COMAD '19 Paper Acceptance Rate: 62 of 198 submissions, 31%
        Overall Acceptance Rate: 197 of 680 submissions, 29%
