5 June 2019

Spam e-mail classification for the Internet of Things environment using semantic similarity approach

Authors: S. Venkatraman, B. Surendiran, P. Arun Raj Kumar

Published in: The Journal of Supercomputing


Abstract

Unauthorized service or product advertising messages sent via electronic mail are called spam e-mails. Detecting spam e-mail remains a challenging task. Existing countermeasures based on statistical keywords, concepts, and IP address-based blacklists are inefficient because they struggle to identify new attack patterns generated by Internet of Things botnet devices. Other spam detection approaches rely on a hybrid of conceptual knowledge engineering and machine learning techniques. However, modern spammers evade these hybrid techniques by exploiting word polysemy and ambiguity, which arise from the context-sensitive nature of words. In this paper, the integration of Naïve Bayesian classification with conceptual and semantic similarity techniques is proposed to combat the ambiguity raised by polysemy in spam detection. To analyse the effectiveness of our approach, experiments were conducted on benchmark data sets such as Spambase, PU1, the Enron corpus, and Ling-spam. The experimental results show that the proposed system achieves an accuracy of 98.89%, higher than that of the existing approaches.
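To make the general idea concrete, the following is a minimal sketch of a Naïve Bayes spam classifier that normalizes words to shared concepts before counting them. It is not the authors' implementation, which the abstract does not detail. The `CONCEPTS` map, the `tokenize` helper, the `NaiveBayes` class, and the toy training messages are all hypothetical, and the plain dictionary lookup is only a simple stand-in for a real conceptual and semantic similarity step.

```python
import math
from collections import Counter

# Hypothetical concept map: surface words -> shared concept label.
# Assumption: a stand-in for the conceptual/semantic similarity step.
CONCEPTS = {
    "cheap": "low_price", "discount": "low_price", "bargain": "low_price",
    "winner": "prize", "prize": "prize", "reward": "prize",
}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, map words to concepts."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(".,!?;:\"'()")
        if word:
            tokens.append(CONCEPTS.get(word, word))
    return tokens


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        tokens = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (invented for illustration only).
train_texts = [
    "cheap pills, winner!",
    "claim your prize now",
    "meeting agenda attached",
    "lunch tomorrow with the team",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("discount reward inside"))       # spam
print(model.predict("agenda for the team meeting"))  # ham
```

In this sketch, "discount" and "reward" never appear in the training messages, yet they still contribute evidence for the spam class because they map to concepts seen during training. A dictionary lookup cannot resolve polysemy, so a context-aware similarity measure would be needed to handle the ambiguity the abstract describes.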


Metadata
Title
Spam e-mail classification for the Internet of Things environment using semantic similarity approach
Authors
S. Venkatraman
B. Surendiran
P. Arun Raj Kumar
Publication date
5 June 2019
Publisher
Springer US
Published in
The Journal of Supercomputing
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-019-02913-7