2021 | OriginalPaper | Chapter

Bio-inspired Machine Learning Mechanism for Detecting Malicious URL Through Passive DNS in Big Data Platform

Abstract

Malicious links are used by distribution channels to spread malware across the Web. These links are instrumental in giving attackers partial or full control of a system. To counter this, researchers have applied machine learning techniques to malicious URL detection. However, these techniques fail to identify distinguishable generic features that can define the maliciousness of a given domain. In general, well-crafted URL features contribute considerably to the success of machine learning approaches; conversely, poor features may ruin even good detection algorithms. In addition, the complex relationships between features are not easy to spot. The work presented in this paper explores how to detect malicious Web sites from passive DNS-based features. This problem lends itself naturally to modern algorithms for selecting discriminative features in the continuously evolving distribution of malicious URLs. The suggested model therefore adapts a bio-inspired feature selection technique to choose an optimal feature set, reducing the cost and running time of the system while achieving an acceptably high recognition rate. Moreover, a two-step artificial bee colony (ABC) algorithm is used for efficient data clustering. The two approaches are incorporated within a unified framework that operates on top of the Hadoop infrastructure to deal with large samples of URLs. Both the experimental and statistical analyses show that the hybrid model has an advantage over some conventional algorithms for detecting malicious URL attacks. The results demonstrate that the suggested model is capable of scaling to 10 million query-answer pairs with more than 96.6% accuracy.
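
To make the idea of bio-inspired feature selection concrete, the following is a minimal sketch and not the chapter's implementation. The abstract does not name the selection technique, so a simple genetic algorithm is assumed here. The synthetic data, the nearest-centroid scorer, the size penalty, and all function names (`make_data`, `centroid_accuracy`, `fitness`, `select_features`) are hypothetical stand-ins for real passive DNS features and the authors' classifier.

```python
import random

def make_data(n_samples, n_features, informative, rng):
    """Synthetic two-class data (assumption): only `informative` columns differ by class."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X, y, mask):
    """Training accuracy of a nearest-centroid classifier on the features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X, y):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        correct += min(dist, key=dist.get) == t
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    """Reward accuracy, penalize subset size (a stand-in for cost and running time)."""
    return centroid_accuracy(X, y, mask) - penalty * sum(mask)

def select_features(X, y, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Genetic search over binary feature masks with elitism, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    n = len(X[0])
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: fitness(X, y, m), reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0][:]]  # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
    return max(population, key=lambda m: fitness(X, y, m))

# Usage on toy data: 8 features, of which columns 1 and 5 carry the class signal.
X, y = make_data(200, 8, informative=(1, 5), rng=random.Random(42))
best = select_features(X, y)
print("selected feature mask:", best)
```

The size penalty in `fitness` reflects the trade-off the abstract describes between a small feature set and a high recognition rate; the weight of 0.01 is arbitrary and would need tuning on real data.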

Metadata
Title
Bio-inspired Machine Learning Mechanism for Detecting Malicious URL Through Passive DNS in Big Data Platform
Authors
Saad M. Darwish
Ali E. Anber
Saleh Mesbah
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-59338-4_9
