Skip to main content
Erschienen in: Neural Computing and Applications 11/2021

24.09.2020 | Original Article

A heuristic technique to detect phishing websites using TWSVM classifier

verfasst von: Routhu Srinivasa Rao, Alwyn Roshan Pais, Pritam Anand

Erschienen in: Neural Computing and Applications | Ausgabe 11/2021

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Phishing websites are on the rise and are hosted on compromised domains such that legitimate behavior is embedded into the designed phishing site to overcome the detection. The traditional heuristic techniques using HTTPS, search engine, Page Ranking and WHOIS information may fail in detecting phishing sites hosted on the compromised domain. Moreover, list-based techniques fail to detect phishing sites when the target website is not in the whitelisted data. In this paper, we propose a novel heuristic technique using TWSVM to detect malicious registered phishing sites and also sites which are hosted on compromised servers, to overcome the aforementioned limitations. Our technique detects the phishing websites hosted on compromised domains by comparing the log-in page and home page of the visiting website. The hyperlink and URL-based features are used to detect phishing sites which are maliciously registered. We have used different versions of support vector machines (SVMs) for the classification of phishing websites. We found that twin support vector machine classifier (TWSVM) outperformed the other versions with a significant accuracy of 98.05% and recall of 98.33%.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literatur
2.
Zurück zum Zitat Afroz S, Greenstadt R (2011) Phishzoo: Detecting phishing websites by looking at them. In: Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on, IEEE, pp 368–375 Afroz S, Greenstadt R (2011) Phishzoo: Detecting phishing websites by looking at them. In: Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on, IEEE, pp 368–375
5.
Zurück zum Zitat Ardi C, Heidemann J (2016) Auntietuna: Personalized content-based phishing detection. In: NDSS Usable Security Workshop (USEC) Ardi C, Heidemann J (2016) Auntietuna: Personalized content-based phishing detection. In: NDSS Usable Security Workshop (USEC)
6.
Zurück zum Zitat Britt J, Wardman B, Sprague A, Warner G (2012) Clustering potential phishing websites using deepmd5. In: LEET Britt J, Wardman B, Sprague A, Warner G (2012) Clustering potential phishing websites using deepmd5. In: LEET
7.
Zurück zum Zitat Burges CJ (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2(2):121–167CrossRef Burges CJ (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2(2):121–167CrossRef
8.
Zurück zum Zitat Chen KT, Chen JY, Huang CR, Chen CS (2009) Fighting phishing with discriminative keypoint features. IEEE Internet Comput 13(3) Chen KT, Chen JY, Huang CR, Chen CS (2009) Fighting phishing with discriminative keypoint features. IEEE Internet Comput 13(3)
10.
Zurück zum Zitat Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297MATH Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297MATH
11.
Zurück zum Zitat Drew J, Moore T (2014) Automatic identification of replicated criminal websites using combined clustering. Security and privacy workshops (SPW). IEEE, IEEE, pp 116–123 Drew J, Moore T (2014) Automatic identification of replicated criminal websites using combined clustering. Security and privacy workshops (SPW). IEEE, IEEE, pp 116–123
12.
Zurück zum Zitat Dunlop M, Groat S, Shelly D (2010) Goldphish: Using images for content-based phishing analysis. In: Internet Monitoring and Protection (ICIMP), 2010 Fifth International Conference on, IEEE, pp 123–128 Dunlop M, Groat S, Shelly D (2010) Goldphish: Using images for content-based phishing analysis. In: Internet Monitoring and Protection (ICIMP), 2010 Fifth International Conference on, IEEE, pp 123–128
13.
Zurück zum Zitat Finkel JR, Grenager T, Manning C (2005) Incorporating non-local information into information extraction systems by gibbs sampling. In: Proceedings of the 43rd annual meeting on association for computational linguistics, Association for Computational Linguistics, pp 363–370 Finkel JR, Grenager T, Manning C (2005) Incorporating non-local information into information extraction systems by gibbs sampling. In: Proceedings of the 43rd annual meeting on association for computational linguistics, Association for Computational Linguistics, pp 363–370
14.
Zurück zum Zitat Fung GM, Mangasarian OL (2005) Multicategory proximal support vector machine classifiers. Mach Learn 59(1–2):77–97CrossRef Fung GM, Mangasarian OL (2005) Multicategory proximal support vector machine classifiers. Mach Learn 59(1–2):77–97CrossRef
20.
Zurück zum Zitat Jang-Jaccard J, Nepal S (2014) A survey of emerging threats in cybersecurity. J Comput Syst Sci 80(5):973–993MathSciNetCrossRef Jang-Jaccard J, Nepal S (2014) A survey of emerging threats in cybersecurity. J Comput Syst Sci 80(5):973–993MathSciNetCrossRef
22.
Zurück zum Zitat Jayadeva KR, Chandra S (2017) Twin support vector machines. Springer, BerlinCrossRef Jayadeva KR, Chandra S (2017) Twin support vector machines. Springer, BerlinCrossRef
25.
Zurück zum Zitat Mao J, Tian W, Li P, Wei T, Liang Z (2017) Phishing-alarm: robust and efficient phishing detection via page component similarity. IEEE Access 5:17020–17030CrossRef Mao J, Tian W, Li P, Wei T, Liang Z (2017) Phishing-alarm: robust and efficient phishing detection via page component similarity. IEEE Access 5:17020–17030CrossRef
26.
Zurück zum Zitat Marchal S, Saari K, Singh N, Asokan N (2016) Know your phish: novel techniques for detecting phishing sites and their targets. In: Distributed Computing Systems (ICDCS), 2016 IEEE 36th International Conference on, IEEE, pp 323–333 Marchal S, Saari K, Singh N, Asokan N (2016) Know your phish: novel techniques for detecting phishing sites and their targets. In: Distributed Computing Systems (ICDCS), 2016 IEEE 36th International Conference on, IEEE, pp 323–333
27.
Zurück zum Zitat Medvet E, Kirda E, Kruegel C (2008) Visual-similarity-based phishing detection. In: Proceedings of the 4th international conference on Security and privacy in communication netowrks, ACM, p 22 Medvet E, Kirda E, Kruegel C (2008) Visual-similarity-based phishing detection. In: Proceedings of the 4th international conference on Security and privacy in communication netowrks, ACM, p 22
28.
Zurück zum Zitat Mercer J (1909) Functions of positive and negative type, and their connection with the theory of integral equations. Philos Trans R Soc Lond Ser A Contain Pap Math Phys Char 209:415–446MATH Mercer J (1909) Functions of positive and negative type, and their connection with the theory of integral equations. Philos Trans R Soc Lond Ser A Contain Pap Math Phys Char 209:415–446MATH
30.
Zurück zum Zitat Mohammad RM, Thabtah F, McCluskey L (2012) An assessment of features related to phishing websites using an automated technique. In: Internet Technology And Secured Transactions, 2012 International Conference for, IEEE, pp 492–497 Mohammad RM, Thabtah F, McCluskey L (2012) An assessment of features related to phishing websites using an automated technique. In: Internet Technology And Secured Transactions, 2012 International Conference for, IEEE, pp 492–497
31.
Zurück zum Zitat Mohammad RM, Thabtah F, McCluskey L (2015) Tutorial and critical analysis of phishing websites methods. Comput Sci Rev 17:1–24MathSciNetCrossRef Mohammad RM, Thabtah F, McCluskey L (2015) Tutorial and critical analysis of phishing websites methods. Comput Sci Rev 17:1–24MathSciNetCrossRef
32.
Zurück zum Zitat Moore T, Clayton R (2007) Examining the impact of website take-down on phishing. In: Proceedings of the anti-phishing working groups 2nd annual eCrime researchers summit, ACM, pp 1–13 Moore T, Clayton R (2007) Examining the impact of website take-down on phishing. In: Proceedings of the anti-phishing working groups 2nd annual eCrime researchers summit, ACM, pp 1–13
36.
Zurück zum Zitat Rao CR, Mitra SK (1971) Generalized inverse of matrices and its applications Rao CR, Mitra SK (1971) Generalized inverse of matrices and its applications
39.
Zurück zum Zitat Rao RS, Pais AR (2017) An enhanced blacklist method to detect phishing websites. In: International Conference on Information Systems Security, Springer, pp 323–333 Rao RS, Pais AR (2017) An enhanced blacklist method to detect phishing websites. In: International Conference on Information Systems Security, Springer, pp 323–333
41.
Zurück zum Zitat Rosiello AP, Kirda E, Ferrandi F, et al (2007) A layout-similarity-based approach for detecting phishing pages. In: Security and Privacy in Communications Networks and the Workshops, 2007. SecureComm 2007. Third International Conference on, IEEE, pp 454–463 Rosiello AP, Kirda E, Ferrandi F, et al (2007) A layout-similarity-based approach for detecting phishing pages. In: Security and Privacy in Communications Networks and the Workshops, 2007. SecureComm 2007. Third International Conference on, IEEE, pp 454–463
42.
Zurück zum Zitat Rosiello AP, Kirda E, Ferrandi F, et al (2007) A layout-similarity-based approach for detecting phishing pages. In: Security and Privacy in Communications Networks and the Workshops, 2007. SecureComm 2007. Third International Conference on, IEEE, pp 454–463 Rosiello AP, Kirda E, Ferrandi F, et al (2007) A layout-similarity-based approach for detecting phishing pages. In: Security and Privacy in Communications Networks and the Workshops, 2007. SecureComm 2007. Third International Conference on, IEEE, pp 454–463
44.
Zurück zum Zitat Shao YH, Zhang CH, Wang XB, Deng NY (2011) Improvements on twin support vector machines. IEEE Trans Neural Netw 22(6):962–968CrossRef Shao YH, Zhang CH, Wang XB, Deng NY (2011) Improvements on twin support vector machines. IEEE Trans Neural Netw 22(6):962–968CrossRef
45.
Zurück zum Zitat Shirazi H, Bezawada B, Ray I (2018) “kn0w thy doma1n name”: Unbiased phishing detection using domain name based features. In: Proceedings of the 23Nd ACM on Symposium on Access Control Models and Technologies, ACM, SACMAT ’18, pp 69–75, https://doi.org/10.1145/3205977.3205992 Shirazi H, Bezawada B, Ray I (2018) “kn0w thy doma1n name”: Unbiased phishing detection using domain name based features. In: Proceedings of the 23Nd ACM on Symposium on Access Control Models and Technologies, ACM, SACMAT ’18, pp 69–75, https://​doi.​org/​10.​1145/​3205977.​3205992
46.
47.
Zurück zum Zitat Vapnik VN, Vapnik V (1998) Statistical learning theory, vol 1. Wiley, New YorkMATH Vapnik VN, Vapnik V (1998) Statistical learning theory, vol 1. Wiley, New YorkMATH
49.
Zurück zum Zitat Wenyin L, Huang G, Xiaoyue L, Min Z, Deng X (2005) Detection of phishing webpages based on visual similarity. In: Special interest tracks and posters of the 14th international conference on World Wide Web, ACM, pp 1060–1061 Wenyin L, Huang G, Xiaoyue L, Min Z, Deng X (2005) Detection of phishing webpages based on visual similarity. In: Special interest tracks and posters of the 14th international conference on World Wide Web, ACM, pp 1060–1061
50.
Zurück zum Zitat Xiang G, Hong JI (2009) A hybrid phish detection approach by identity discovery and keywords retrieval. In: Proceedings of the 18th international conference on World wide web, ACM, pp 571–580 Xiang G, Hong JI (2009) A hybrid phish detection approach by identity discovery and keywords retrieval. In: Proceedings of the 18th international conference on World wide web, ACM, pp 571–580
53.
Zurück zum Zitat Zhang H, Liu G, Chow TW, Liu W (2011) Textual and visual content-based anti-phishing: a bayesian approach. IEEE Trans Neural Netw 22(10):1532–1546CrossRef Zhang H, Liu G, Chow TW, Liu W (2011) Textual and visual content-based anti-phishing: a bayesian approach. IEEE Trans Neural Netw 22(10):1532–1546CrossRef
Metadaten
Titel
A heuristic technique to detect phishing websites using TWSVM classifier
verfasst von
Routhu Srinivasa Rao
Alwyn Roshan Pais
Pritam Anand
Publikationsdatum
24.09.2020
Verlag
Springer London
Erschienen in
Neural Computing and Applications / Ausgabe 11/2021
Print ISSN: 0941-0643
Elektronische ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-020-05354-z

Weitere Artikel der Ausgabe 11/2021

Neural Computing and Applications 11/2021 Zur Ausgabe