2017 | OriginalPaper | Book chapter

Detecting Targeted Malicious E-Mail Using Linear Regression Algorithm with Data Mining Techniques

Written by: A. Sesha Rao, P. S. Avadhani, Nandita Bhanja Chaudhuri

Published in: Computational Intelligence in Data Mining

Publisher: Springer Singapore

Abstract

E-mail is the most fundamental means of communication, which makes it a focus of attack by terrorists, e-mail spammers, impostors, business fraudsters, and hackers. To combat this, different data mining classifiers are used to identify spam e-mails. This paper introduces a system that imports data from e-mail accounts and applies three preprocessing steps: file conversions suited to the experiments, word-frequency counting with the Knuth–Morris–Pratt (KMP) string-searching algorithm, and feature selection using principal component analysis (PCA). Next, linear regression classification is used to predict spam e-mails, and association rule mining is then performed. The mean absolute error and root mean squared error are computed for the training data and the test data. The errors on both data sets are negligible, which indicates that the classifier is well trained. Finally, the results are displayed using visualization techniques.
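
As an illustration only, and not the authors' implementation, the following minimal Python sketch shows two of the steps named in the abstract: counting how often a word occurs in a message with KMP string searching, and computing the mean absolute error and root mean squared error of a set of predictions. The function names, the sample message, and the sample labels and scores are assumptions made up for this example; the chapter's actual data and code are not shown on this page.

```python
import math


def build_failure_table(pattern):
    """For each prefix of pattern, length of its longest proper prefix that is also a suffix."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_count(text, pattern):
    """Count occurrences of pattern in text (overlapping matches included)."""
    if not pattern:
        return 0
    table = build_failure_table(pattern)
    count = 0
    k = 0  # number of pattern characters currently matched
    for ch in text:
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            count += 1
            k = table[k - 1]  # fall back so overlapping matches are found
    return count


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all samples."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference over all samples."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical message and hypothetical labels/scores (1 = spam, 0 = not spam).
message = "free offer, free prize, freedom"
labels = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.1]

word_frequency = kmp_count(message, "free")
mae = mean_absolute_error(labels, scores)
rmse = root_mean_squared_error(labels, scores)
```

Note that this sketch counts substring matches and is case-sensitive, so "freedom" also counts as a hit for "free"; a word-level count would need tokenization or boundary checks on top of the search.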

Metadata
Title
Detecting Targeted Malicious E-Mail Using Linear Regression Algorithm with Data Mining Techniques
Written by
A. Sesha Rao
P. S. Avadhani
Nandita Bhanja Chaudhuri
Copyright year
2017
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-3874-7_3