Social Network Analysis and Mining

1 December 2016 | Original Article

Discover millions of fake followers in Weibo

Authors: Yi Zhang, Jianguo Lu

Published in: Social Network Analysis and Mining | Issue 1/2016

Abstract

Weibo, the Chinese counterpart of Twitter, has attracted hundreds of millions of users. Like other Online Social Networks (hereafter OSNs), Weibo has a large number of fake accounts. They are created to sell following links to customers who want to boost their follower counts. These bogus accounts are difficult to identify individually, especially when they are created by sophisticated programs or controlled directly by human beings. This paper proposes a novel fake account detection method based on the very purpose of these accounts: they are created to follow their targets en masse, which produces heavy overlap among the follower lists of their customers. The paper investigates the top Weibo accounts whose follower lists duplicate or nearly duplicate each other (hereafter called near-duplicates). Discovering near-duplicates is a challenging task: the network is large, the data are not available in their entirety, and pair-wise comparison is very expensive. We developed a sampling-based approach to discover all the near-duplicates among the top accounts, that is, accounts with at least 50,000 followers. In the experiment, we found 395 near-duplicates, which lead us to 11.90 million fake accounts (4.56% of all users) that send 741.10 million links (9.50% of all edges). Furthermore, we characterize four typical structures of the spammers, cluster the spammers into 34 groups, and analyze the properties of each group.
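The idea of comparing follower lists on a sample rather than in full can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes that overlap is measured with Jaccard similarity and that a uniform sample of user IDs is available. The function names, the toy accounts `A`, `B`, `C`, and the threshold of 0.8 are all hypothetical.

```python
import random


def jaccard(a, b):
    """Exact Jaccard similarity of two follower sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def sampled_jaccard(a, b, sampled_ids):
    """Estimate Jaccard similarity using only followers that fall in a sample.

    sampled_ids is assumed to be a uniform sample of user IDs; restricting
    both follower sets to the same sample keeps the ratio roughly unchanged.
    """
    return jaccard(a & sampled_ids, b & sampled_ids)


def near_duplicate_pairs(followers, threshold):
    """Return account pairs whose follower sets have Jaccard >= threshold.

    This is the naive pair-wise comparison; it is only feasible here because
    the toy input is tiny.
    """
    names = sorted(followers)
    pairs = []
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            if jaccard(followers[x], followers[y]) >= threshold:
                pairs.append((x, y))
    return pairs


# Toy data (hypothetical): A and B share most followers, C shares none.
followers = {
    "A": set(range(0, 1000)),
    "B": set(range(50, 1050)),
    "C": set(range(5000, 6000)),
}

exact_ab = jaccard(followers["A"], followers["B"])

# Keep a 20% uniform sample of the 10,000 hypothetical user IDs.
rng = random.Random(42)
sampled_ids = set(rng.sample(range(10000), 2000))
estimate_ab = sampled_jaccard(followers["A"], followers["B"], sampled_ids)

# Restrict every follower list to the sample, then compare pairs.
sampled_followers = {name: ids & sampled_ids for name, ids in followers.items()}
candidates = near_duplicate_pairs(sampled_followers, threshold=0.8)
```

In this toy setting the sampled estimate for `A` and `B` should land close to the exact similarity while touching only about a fifth of each follower list. Avoiding the quadratic pair-wise loop on a real network would need further techniques that this sketch does not cover.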

Title: Discover millions of fake followers in Weibo
Authors: Yi Zhang, Jianguo Lu
Publication date: 1 December 2016
Publisher: Springer Vienna
Published in: Social Network Analysis and Mining / Issue 1/2016
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI: https://doi.org/10.1007/s13278-016-0324-2