Neural Computing and Applications

2 July 2019 | Emerging Trends of Applied Neural Computation - E_TRAINCO

Spam detection on social networks using cost-sensitive feature selection and ensemble-based regularized deep neural networks

Authors: Aliaksandr Barushka, Petr Hajek

Published in: Neural Computing and Applications | Issue 9/2020

Abstract

Spam detection on social networks is increasingly important owing to the rapid growth of the social network user base. Sophisticated spam filters must be developed to deal with this complex problem. Traditional machine learning approaches such as neural networks, support vector machines and Naïve Bayes classifiers are not effective enough to process and utilize the complex features present in high-dimensional social network spam data. Moreover, the traditional objective criteria of social network spam filters cannot cope with different costs assigned to type I and type II errors. To overcome these problems, we propose a novel cost-sensitive approach to social network spam filtering. The proposed approach is composed of two stages. In the first stage, multi-objective evolutionary feature selection is used to minimize both the misclassification cost of the proposed model and the number of attributes necessary for spam filtering. In the second stage, the approach uses cost-sensitive ensemble learning techniques with regularized deep neural networks as base learners. We demonstrate that this approach is effective for social network spam filtering on two benchmark datasets. We also show that the proposed approach outperforms other popular algorithms used in social network spam filtering, such as random forest, Naïve Bayes and support vector machines.
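
The two objectives named in the abstract, misclassification cost and number of attributes, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the label encoding (1 = spam, 0 = legitimate), the cost values, and the function names `misclassification_cost`, `dominates` and `pareto_front` are all assumptions made for illustration, and the evolutionary search itself is omitted.

```python
from typing import List, Sequence, Tuple

def misclassification_cost(y_true: Sequence[int], y_pred: Sequence[int],
                           cost_fp: float, cost_fn: float) -> float:
    """Weighted error count with separate costs for the two error types.

    Assumed encoding: 1 = spam, 0 = legitimate. A false positive (type I)
    is a legitimate message flagged as spam; a false negative (type II)
    is spam that passes the filter. The cost values are hypothetical inputs.
    """
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return cost_fp * fp + cost_fn * fn

# A candidate feature subset is summarised as (cost, number_of_features);
# both objectives are minimised.
Candidate = Tuple[float, int]

def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o != c)]

if __name__ == "__main__":
    cost = misclassification_cost([1, 0, 1, 0], [1, 1, 0, 0],
                                  cost_fp=5.0, cost_fn=1.0)
    print(cost)
    print(pareto_front([(10.0, 5), (8.0, 7), (12.0, 5), (8.0, 9)]))
```

A multi-objective search would return the non-dominated subsets rather than a single winner, leaving the trade-off between cost and feature count to the practitioner.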

Title: Spam detection on social networks using cost-sensitive feature selection and ensemble-based regularized deep neural networks
Authors: Aliaksandr Barushka, Petr Hajek
Publication date: 2 July 2019
Publisher: Springer London
Published in: Neural Computing and Applications / Issue 9/2020
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI: https://doi.org/10.1007/s00521-019-04331-5
