2020 | OriginalPaper | Chapter

MapReduce mRMR: Random Forests-Based Email Spam Classification in Distributed Environment

Authors: V. Sri Vinitha, D. Karthika Renuka

Published in: Data Management, Analytics and Innovation

Publisher: Springer Singapore

Abstract

Email is the most widely used message transfer system on the internet. Spam is now a serious problem: spam emails are unsolicited messages sent indiscriminately to large numbers of recipients. As email has grown in popularity, the volume of unsolicited mail has risen rapidly and raised many security concerns. Although many spam filtering techniques exist, spammers keep devising new ways to evade the filters. Spammers also use email to spread malware in the form of executable files. Spam wastes users' system memory, computing power, and network bandwidth, and it progressively damages the integrity of email and degrades the online experience. Prior research indicates that classification algorithms combined with feature selection give more accurate results than classification alone. In this paper, feature selection is performed with minimum redundancy maximum relevance (mRMR), and classification is performed with Random Forests in a MapReduce environment. Performance is compared with Random Forests in the distributed environment on the Spambase dataset, using sensitivity, correctness, and accuracy as measures.
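
The mRMR step named in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' MapReduce implementation, which the abstract does not detail. The sketch assumes the common "difference" form of mRMR (relevance to the class label minus mean redundancy with the features already chosen) and features that are already discrete; continuous features such as those in Spambase would need to be discretized first. The function names `mutual_information` and `mrmr_select` and the toy data are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def mrmr_select(columns, labels, k):
    """Greedy mRMR (difference form, an assumption): return k feature indices.

    columns: list of feature columns, each a list of discrete values.
    labels:  list of class labels (here 1 = spam, 0 = ham).
    """
    relevance = [mutual_information(col, labels) for col in columns]
    selected = []
    remaining = list(range(len(columns)))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]  # first pick: relevance only
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data, invented for illustration (not Spambase).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = [
    [1, 1, 1, 1, 0, 0, 0, 1],  # feature 0: strongly related to the label
    [1, 1, 1, 1, 0, 0, 0, 1],  # feature 1: exact copy of feature 0 (redundant)
    [1, 1, 0, 1, 0, 0, 1, 0],  # feature 2: weaker, but adds new information
    [1, 0, 1, 0, 1, 0, 1, 0],  # feature 3: unrelated to the label
]
selected = mrmr_select(columns, labels, k=2)
print(selected)  # -> [0, 2]
```

On this toy input the duplicate feature 1 is skipped in favor of feature 2, even though feature 1 is more relevant on its own, which is the redundancy penalty at work. The selected columns would then be passed to the classifier, Random Forests in the paper's setting.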


Metadata

Title: MapReduce mRMR: Random Forests-Based Email Spam Classification in Distributed Environment
Authors: V. Sri Vinitha, D. Karthika Renuka
Copyright Year: 2020
Publisher: Springer Singapore
DOI: https://doi.org/10.1007/978-981-32-9949-8_18