Skip to main content
Erschienen in: Social Network Analysis and Mining 1/2023

01.12.2023 | Original Article

Machine learning algorithm-based spam detection in social networks

verfasst von: M. Sumathi, S. P. Raja

Erschienen in: Social Network Analysis and Mining | Ausgabe 1/2023

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Many social media (SM) platforms have emerged as a result of the online social network’s (OSN) rapid expansion. SM has become important in day-to-day life, and spammers have turned their attention to SM. Spam detection (SD) is done in two different ways, such as machine learning (ML) and expert-based detection. The expert-based detection technique’s accuracy depends on expert knowledge, and it takes huge time to detect the spams. Thus, ML-based spam detection is preferred in OSN. Spam identification on social networks is a difficult operation involving a variety of factors, and spam and ham have resulted in an imbalanced data distribution, which gives flexibility to spammers for corrupting our devices. SD based on ML algorithms like logistic regression (LR), K-nearest neighbor (KNN), decision trees (DT), random forest (RF), support vector machine (SVM) and eXtreme gradient boosting (XGB), voting classifier (VC) and extra tree classifier (ETC) are used to design the address balance and to attain high assessment accuracy in an imbalanced datasets. ETC method minimizes the bias through the original sampling process. For reducing processing complexity, the ETC method uses a smaller size constant factor instead of a larger one. Thus, the ETC technique produces better data splitting than DT and RF techniques. Text is vectorized by vectorizers, and all the relative results are stored in it. The VC is an ensemble method that integrates predictions form several methods to forecast an output class depending on which predictions have the highest probability. The multi-class results are aggregated and forecast for the majority voted class. The experimental result shows that, as compared to KN, NB, ETC, RF, SVC, LR, XGB and DT, the proposed VC provides a higher classification accuracy rate of 97.96%, 97.56% of precision, 89.95% of recall and 91.96% of F1-measures. Similarly, ETC provides 97.77% accuracy, 98.31% of precision, 84.78% of recall and 91.05% of F1-measures. Compared to conventional ML algorithms, VC and ETC provide higher accuracy, precision, recall and F1-measures. Thus, ETC and VC are preferable for spam detection. The website has been designed to detect messages as spam or not.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat Abkenar SB, Kashani MH, Mahdipour E, Jameii SM (2021) Big data analytics meets social media: a systematic review of techniques, open issues, and future directions. Telematics Inf 57:101517CrossRef Abkenar SB, Kashani MH, Mahdipour E, Jameii SM (2021) Big data analytics meets social media: a systematic review of techniques, open issues, and future directions. Telematics Inf 57:101517CrossRef
Zurück zum Zitat Ahmed F, Abulaish M (2013) A generic statistical approach for spam detection in online social networks. Comput Commun 36(10–11):1120–1129CrossRef Ahmed F, Abulaish M (2013) A generic statistical approach for spam detection in online social networks. Comput Commun 36(10–11):1120–1129CrossRef
Zurück zum Zitat Ahmed N, Amin R, Aldabbas H, Koundal D (2022) Machine learning techniques for spam detection in Email and IoT platforms: analysis and research challenges. Secur Commun Netw 8:1–19CrossRef Ahmed N, Amin R, Aldabbas H, Koundal D (2022) Machine learning techniques for spam detection in Email and IoT platforms: analysis and research challenges. Secur Commun Netw 8:1–19CrossRef
Zurück zum Zitat Alom Z, Carminati B, Ferrari E (2020) A deep learning model for Twitter spam detection. Online Soc Netw Media 18:1–12 Alom Z, Carminati B, Ferrari E (2020) A deep learning model for Twitter spam detection. Online Soc Netw Media 18:1–12
Zurück zum Zitat Barushka A, Hajek P (2020) Spam detection on social networks using cost-sensitive feature selection and ensemble-based regularized deep neural networks. Neural Comput Appl 32:1–19CrossRef Barushka A, Hajek P (2020) Spam detection on social networks using cost-sensitive feature selection and ensemble-based regularized deep neural networks. Neural Comput Appl 32:1–19CrossRef
Zurück zum Zitat Chakraborty M, Pal S, Pramanik R, Ravindranath Chowdary C (2016) Recent developments in social spam detection and combating techniques: a survey. Inf Process Manag 52(6):1053–1073CrossRef Chakraborty M, Pal S, Pramanik R, Ravindranath Chowdary C (2016) Recent developments in social spam detection and combating techniques: a survey. Inf Process Manag 52(6):1053–1073CrossRef
Zurück zum Zitat Choi J, Jeon C (2021) Cost-based heterogeneous learning framework for real-time spam detection in social networks with expert decisions. IEEE Access 9:103573–103587CrossRef Choi J, Jeon C (2021) Cost-based heterogeneous learning framework for real-time spam detection in social networks with expert decisions. IEEE Access 9:103573–103587CrossRef
Zurück zum Zitat Choudhury D, Acharjee T (2022) A novel approach to fake news detection in social networks using genetic algorithm applying machine learning classifiers. Multimed Tools Appl 82:1–17 Choudhury D, Acharjee T (2022) A novel approach to fake news detection in social networks using genetic algorithm applying machine learning classifiers. Multimed Tools Appl 82:1–17
Zurück zum Zitat Elakkiya E, Selvakumar S (2022) Stratified hyperparameters optimization of feed-forward neural network for social network spam detection (SON2S). Soft Comput 8:1–20 Elakkiya E, Selvakumar S (2022) Stratified hyperparameters optimization of feed-forward neural network for social network spam detection (SON2S). Soft Comput 8:1–20
Zurück zum Zitat Govil N, Agarwal K, Bansal A, Varshney A (2020a) A machine learning based spam detection mechanism. In: Fourth international conference on computing methodologies and communication (ICCMC 2020a), pp 954–957 Govil N, Agarwal K, Bansal A, Varshney A (2020a) A machine learning based spam detection mechanism. In: Fourth international conference on computing methodologies and communication (ICCMC 2020a), pp 954–957
Zurück zum Zitat Govil N, Agarwal K, Bansal A, Varshney A (2020b) A machine learning based spam detection mechanism. In: 2020b Fourth international conference on computing methodologies and communication (ICCMC), Erode, India Govil N, Agarwal K, Bansal A, Varshney A (2020b) A machine learning based spam detection mechanism. In: 2020b Fourth international conference on computing methodologies and communication (ICCMC), Erode, India
Zurück zum Zitat Gupta M, Bakliwal A, Agarwal S, Mehndiratta P (2018) A comparative study of spam SMS detection using machine learning classifiers. In: 2018 Eleventh international conference on contemporary computing (IC3), pp 1–7 Gupta M, Bakliwal A, Agarwal S, Mehndiratta P (2018) A comparative study of spam SMS detection using machine learning classifiers. In: 2018 Eleventh international conference on contemporary computing (IC3), pp 1–7
Zurück zum Zitat Heidemann J, Klier M, Probst F (2012) Online social networks: a survey of a global phenomenon. Comput Netw 56:3866–3878CrossRef Heidemann J, Klier M, Probst F (2012) Online social networks: a survey of a global phenomenon. Comput Netw 56:3866–3878CrossRef
Zurück zum Zitat Hu X, Tang J, Liu H (2014) Online social spammer detection. In: Proceeding 28th AAAI conference on artificial intelligence (AAAI), pp 59–65 Hu X, Tang J, Liu H (2014) Online social spammer detection. In: Proceeding 28th AAAI conference on artificial intelligence (AAAI), pp 59–65
Zurück zum Zitat Huang G-B, Zhou H, Ding X, Zhang R (2012) Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern 42(2):513–529CrossRef Huang G-B, Zhou H, Ding X, Zhang R (2012) Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern 42(2):513–529CrossRef
Zurück zum Zitat Jain G, Sharma M, Agarwal B (2019) Spam detection in social media using convolutional and long short term memory neural network. Ann Math Artif Intell 85:21–44CrossRef Jain G, Sharma M, Agarwal B (2019) Spam detection in social media using convolutional and long short term memory neural network. Ann Math Artif Intell 85:21–44CrossRef
Zurück zum Zitat Janez-Martino F, Alaiz-Rodriguez R, Gonzalez-Castro V, Fidalgo E, Alegre E (2023) A review of spam email detection: analysis of spammer strategies and the dataset shift problem. Artif Intell Rev 56:1145–1173CrossRef Janez-Martino F, Alaiz-Rodriguez R, Gonzalez-Castro V, Fidalgo E, Alegre E (2023) A review of spam email detection: analysis of spammer strategies and the dataset shift problem. Artif Intell Rev 56:1145–1173CrossRef
Zurück zum Zitat Jbara YHF, Mohamed HAS (2020) Twitter spammer identification using URL based detection. IOP Conf Ser Mater Sci Eng 925:1–7 Jbara YHF, Mohamed HAS (2020) Twitter spammer identification using URL based detection. IOP Conf Ser Mater Sci Eng 925:1–7
Zurück zum Zitat Jenifer Darling Rosita P, Jacob WS (2022) Multi-objective genetic algorithm and CNN-based deep learning architectural scheme for effective spam detection. Int J Intell Netw 3:9–15 Jenifer Darling Rosita P, Jacob WS (2022) Multi-objective genetic algorithm and CNN-based deep learning architectural scheme for effective spam detection. Int J Intell Netw 3:9–15
Zurück zum Zitat Karim A, Azam S, Shanmugam B, Kannoorpatti K, Alazab M (2019) A comprehensive survey for intelligent spam email detection. IEEE Access 7:168261–168295CrossRef Karim A, Azam S, Shanmugam B, Kannoorpatti K, Alazab M (2019) A comprehensive survey for intelligent spam email detection. IEEE Access 7:168261–168295CrossRef
Zurück zum Zitat Kumar C, Bharti TS, Prakash S (2023) A hybrid data-driven framework for spam detection in online social network. In: International conference of machine learning and data engineering Kumar C, Bharti TS, Prakash S (2023) A hybrid data-driven framework for spam detection in online social network. In: International conference of machine learning and data engineering
Zurück zum Zitat Madisetty S, Desarkar MS (2018) A neural network-based ensemble approach for spam detection in twitter. IEEE Trans Comput Soc Syst 5(4):973–984CrossRef Madisetty S, Desarkar MS (2018) A neural network-based ensemble approach for spam detection in twitter. IEEE Trans Comput Soc Syst 5(4):973–984CrossRef
Zurück zum Zitat Masood F, Ammad G, Almogren A, Abbas A (2019) Spammer detection and fake user identification on social networks. IEEE Access 7:68140–68152CrossRef Masood F, Ammad G, Almogren A, Abbas A (2019) Spammer detection and fake user identification on social networks. IEEE Access 7:68140–68152CrossRef
Zurück zum Zitat Mateen M, Iqbal MA, Aleem M, Islam MA (2017) A hybrid approach for spam detection for Twitter. In: 2017 14th International Bhurban conference on applied sciences and technology (IBCAST), Islamabad, Pakistan, pp 466–471 Mateen M, Iqbal MA, Aleem M, Islam MA (2017) A hybrid approach for spam detection for Twitter. In: 2017 14th International Bhurban conference on applied sciences and technology (IBCAST), Islamabad, Pakistan, pp 466–471
Zurück zum Zitat Niranjani V, Agalya Y, Charunandhini K, Gayathri K, Gayathri R (2022) Spam detection for social media networks using machine learning. In:2022 8th International conference on advanced computing and communication systems (ICACCS), pp 2082–2088 Niranjani V, Agalya Y, Charunandhini K, Gayathri K, Gayathri R (2022) Spam detection for social media networks using machine learning. In:2022 8th International conference on advanced computing and communication systems (ICACCS), pp 2082–2088
Zurück zum Zitat Pirozmand P, Sadeghilalimi M, Rahmani AA (2021) A feature selection approach for spam detection in social networks using gravitational force-based heuristic algorithm. J Amb Intell Human Comput 8:1–14 Pirozmand P, Sadeghilalimi M, Rahmani AA (2021) A feature selection approach for spam detection in social networks using gravitational force-based heuristic algorithm. J Amb Intell Human Comput 8:1–14
Zurück zum Zitat Rodrigues AP, Fernandes R, Aakash A, Abhishek B, Shetty A (2022) Real-time twitter spam detection and sentiment analysis using machine learning and deep learning techniques. Comput Intell Neurosci 2022:1–14CrossRef Rodrigues AP, Fernandes R, Aakash A, Abhishek B, Shetty A (2022) Real-time twitter spam detection and sentiment analysis using machine learning and deep learning techniques. Comput Intell Neurosci 2022:1–14CrossRef
Zurück zum Zitat Sharma R, Kaur G (2016) E-mail spam detection using SVM and RBF. Int J Mod Educ Comput Sci 8:57–63CrossRef Sharma R, Kaur G (2016) E-mail spam detection using SVM and RBF. Int J Mod Educ Comput Sci 8:57–63CrossRef
Zurück zum Zitat Stringhini G, Kruegel C, Vigna G (2010) Detecting spammers on social networks. In Proceeding 26th annual computer security application conference (ACSAC), pp 1–9 Stringhini G, Kruegel C, Vigna G (2010) Detecting spammers on social networks. In Proceeding 26th annual computer security application conference (ACSAC), pp 1–9
Zurück zum Zitat Sun N, Lin G, Qiu J, Rimba P (2022) Near real-time twitter spam detection with machine learning techniques. Int J Comput Appl 44:1–12 Sun N, Lin G, Qiu J, Rimba P (2022) Near real-time twitter spam detection with machine learning techniques. Int J Comput Appl 44:1–12
Zurück zum Zitat Svadasu G, Adimoolam M (2022) Spam detection in social media using artificial neural network algorithm and comparing accuracy with support vector machine algorithm. In: 2022 International conference on business analytics for technology and security (ICBATS), pp 1–5 Svadasu G, Adimoolam M (2022) Spam detection in social media using artificial neural network algorithm and comparing accuracy with support vector machine algorithm. In: 2022 International conference on business analytics for technology and security (ICBATS), pp 1–5
Zurück zum Zitat Swathi P (2018) Analysis on solutions for over-fitting and under-fitting in machine learning algorithms. Int J Innov Res Sci Eng Technol 7:10–15680 Swathi P (2018) Analysis on solutions for over-fitting and under-fitting in machine learning algorithms. Int J Innov Res Sci Eng Technol 7:10–15680
Zurück zum Zitat Thomas M, Meshram BB (2023) Chso-DNFNet: spam detection in Twitter using feature fusion and optimized deep neuro fuzzy network. Adv Eng Softw 175:1–12CrossRef Thomas M, Meshram BB (2023) Chso-DNFNet: spam detection in Twitter using feature fusion and optimized deep neuro fuzzy network. Adv Eng Softw 175:1–12CrossRef
Zurück zum Zitat Venkatewarlu B, Viswanath Shenoi V (2021) Optimized generative adversarial network with fractional calculus based feature fusion using twitter stream for spam detection. Inf Secur J Glob Perspect 8:1–20 Venkatewarlu B, Viswanath Shenoi V (2021) Optimized generative adversarial network with fractional calculus based feature fusion using twitter stream for spam detection. Inf Secur J Glob Perspect 8:1–20
Zurück zum Zitat Vijayaraj N, Sumathi M, Rajkamal MU (2022) Decision trees to detect malware in a cloud computing environment. In: 2022 International conference on electronic systems and intelligent computing (ICESIC), pp 299–303 Vijayaraj N, Sumathi M, Rajkamal MU (2022) Decision trees to detect malware in a cloud computing environment. In: 2022 International conference on electronic systems and intelligent computing (ICESIC), pp 299–303
Zurück zum Zitat Zhang Z, Hou R, Yang J (2020) Detection of social network spam based on improved extreme learning machine. IEEE Access 8:112003–112014CrossRef Zhang Z, Hou R, Yang J (2020) Detection of social network spam based on improved extreme learning machine. IEEE Access 8:112003–112014CrossRef
Zurück zum Zitat Zhao C, Xin Y, Li X, Yang Y, Chen Y (2020) A heterogeneous ensemble learning framework for spam detection in social networks with imbalanced data. Appl Sci 10:1–18 Zhao C, Xin Y, Li X, Yang Y, Chen Y (2020) A heterogeneous ensemble learning framework for spam detection in social networks with imbalanced data. Appl Sci 10:1–18
Zurück zum Zitat Zheng X, Zeng Z, Chen Z, Yuanlong Yu, Rong C (2015) Detecting spammers on social networks. Neurocomputing 159:27–34CrossRef Zheng X, Zeng Z, Chen Z, Yuanlong Yu, Rong C (2015) Detecting spammers on social networks. Neurocomputing 159:27–34CrossRef
Zurück zum Zitat Zheng X, Zhang X, Yu Y, Kechadi T, Rong C (2016) ELM-based spammer detection in social networks. J Supercomput 72(8):2991–3005CrossRef Zheng X, Zhang X, Yu Y, Kechadi T, Rong C (2016) ELM-based spammer detection in social networks. J Supercomput 72(8):2991–3005CrossRef
Metadaten
Titel
Machine learning algorithm-based spam detection in social networks
verfasst von
M. Sumathi
S. P. Raja
Publikationsdatum
01.12.2023
Verlag
Springer Vienna
Erschienen in
Social Network Analysis and Mining / Ausgabe 1/2023
Print ISSN: 1869-5450
Elektronische ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-023-01108-6

Weitere Artikel der Ausgabe 1/2023

Social Network Analysis and Mining 1/2023 Zur Ausgabe

Premium Partner