Published in: Journal of Network and Systems Management 3/2021

1 July 2021

URLdeepDetect: A Deep Learning Approach for Detecting Malicious URLs Using Semantic Vector Models

Authors: Sara Afzal, Muhammad Asim, Abdul Rehman Javed, Mirza Omer Beg, Thar Baker


Abstract

Malicious Uniform Resource Locators (URLs) embedded in emails or Twitter posts have been used as weapons to lure susceptible Internet users into executing malicious content, leading to compromised systems, scams, and a multitude of cyber-attacks. These attacks can cause damage ranging from fraud to massive data breaches that result in huge financial losses. This paper proposes a hybrid deep-learning approach named URLdeepDetect for time-of-click URL analysis and classification to detect malicious URLs. URLdeepDetect analyzes semantic and lexical features of a URL by applying various techniques, including semantic vector models and URL encryption, to classify a given URL as either malicious or benign. URLdeepDetect uses supervised and unsupervised mechanisms in the form of LSTM (Long Short-Term Memory) and k-means clustering for URL classification. URLdeepDetect achieves an accuracy of 98.3% with LSTM and 99.7% with k-means clustering.
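
To make the unsupervised half of the abstract more concrete, the following is a minimal sketch of k-means clustering over lexical URL features. It is not the authors' implementation: the four features in `lexical_features`, the deterministic seeding in `kmeans`, and the sample URLs are all assumptions chosen for illustration, and the semantic vector models and the LSTM classifier described in the paper are not reproduced here.

```python
import math
from urllib.parse import urlparse


def lexical_features(url):
    """Map a URL to a small numeric vector (hypothetical feature set)."""
    host = urlparse(url).netloc
    return [
        float(len(url)),                              # total URL length
        float(sum(ch.isdigit() for ch in url)),       # number of digits
        float(host.count(".")),                       # dots in the host part
        float(sum(not ch.isalnum() for ch in url)),   # non-alphanumeric characters
    ]


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Illustrative URLs only; they use reserved documentation domains and addresses.
urls = [
    "https://example.com/about",
    "http://198.51.100.7/login.php?session=8842&redirect=http%3A%2F%2Fexample.net%2Fverify",
    "https://example.org/docs/intro",
    "http://secure-update.example.net.account-verify.example/confirm?id=00219934&token=77aa91",
]
points = [lexical_features(u) for u in urls]
labels, centroids = kmeans(points, k=2)

for url, label in zip(urls, labels):
    print(label, url)
```

In this sketch the two short URLs fall into one cluster and the two long, digit-heavy URLs into the other. The clusters carry no inherent meaning, so a real system would still need labeled examples to decide which cluster corresponds to malicious URLs.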


Metadata
Title
URLdeepDetect: A Deep Learning Approach for Detecting Malicious URLs Using Semantic Vector Models
Authors
Sara Afzal
Muhammad Asim
Abdul Rehman Javed
Mirza Omer Beg
Thar Baker
Publication date
1 July 2021
Publisher
Springer US
Published in
Journal of Network and Systems Management / Issue 3/2021
Print ISSN: 1064-7570
Electronic ISSN: 1573-7705
DOI
https://doi.org/10.1007/s10922-021-09587-8
