Published in: Telecommunication Systems 1/2021

19.05.2021

Automatic detection of phishing pages with event-based request processing, deep-hybrid feature extraction and light gradient boosted machine model

Author: Ömer Kasim

Abstract

Cyber attackers who target unsuspecting users with phishing methods pose a serious threat to cyber security. It is important to quickly distinguish phishing web pages from legitimate ones. Although the studies proposed in the literature detect phishing successfully, the problem of a high false-positive rate after the web page request has been processed remains to be resolved. The novelty of this study is that the classification of deep-hybrid features with the Light Gradient Boosted Machine (LightGBM) model is evaluated as an event, triggered when a web address is entered in the browser's address bar. Thus, phishing can be detected at every request entry, before the request is completed. In the proposed approach, normalized features extracted from web page requests are fed to Sparse Autoencoder (SAE) and Principal Component Analysis (PCA) methods, which together encode the deep-hybrid features. Using these features, a Light Gradient Boosted Machine classifier can effectively distinguish legitimate pages from phishing attacks. The ISCX-URL phishing dataset is used to measure and validate the performance of the proposed approach. The proposed method classifies the SAE-PCA-encoded features with the Light Gradient Boosted Machine model with an accuracy of 99.6% within the event. The results show that the proposed approach achieves better classification performance metrics than most other approaches. Compared with other models, this accuracy contributes to solving the false-positive problem before requests are processed.
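The feature-encoding stage described in the abstract can be illustrated with a minimal NumPy sketch. It is not the author's implementation: the min-max normalization, the L1 sparsity penalty, the layer sizes, the training settings, and the choice to concatenate the SAE and PCA codes are all assumptions made for illustration, since the abstract does not specify them. The function names are hypothetical. The classifier is omitted. In the proposed approach, a Light Gradient Boosted Machine model would be trained on the resulting features.

```python
import numpy as np


def min_max_normalize(X):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span


def pca_encode(X, n_components):
    """Project centered data onto its top principal directions (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_sparse_autoencoder(X, n_hidden, l1=1e-3, lr=0.5, epochs=200, seed=0):
    """Train a one-hidden-layer autoencoder by full-batch gradient descent.

    An L1 penalty on the hidden activations encourages sparse codes
    (an assumed variant; other sparsity penalties exist).
    Returns a function that maps inputs to hidden codes.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)              # hidden code
        Y = sigmoid(H @ W2 + b2)              # reconstruction of X
        # Gradient of the squared reconstruction error at the output layer.
        dZ2 = (2.0 / n) * (Y - X) * Y * (1.0 - Y)
        # Sigmoid codes are positive, so the L1 gradient is the constant l1 / n.
        dH = dZ2 @ W2.T + l1 / n
        dZ1 = dH * H * (1.0 - H)
        W2 -= lr * (H.T @ dZ2)
        b2 -= lr * dZ2.sum(axis=0)
        W1 -= lr * (X.T @ dZ1)
        b1 -= lr * dZ1.sum(axis=0)
    return lambda X_new: sigmoid(X_new @ W1 + b1)


def deep_hybrid_features(X_raw, n_hidden=8, n_components=4):
    """Normalize, encode with both methods, and concatenate the two codes."""
    X = min_max_normalize(np.asarray(X_raw, dtype=float))
    encode = fit_sparse_autoencoder(X, n_hidden)
    return np.hstack([encode(X), pca_encode(X, n_components)])
```

Calling `deep_hybrid_features` on a matrix with one row per request and one column per raw feature returns a matrix with `n_hidden + n_components` columns, which could then be passed to a gradient boosting classifier.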

Metadata
Title
Automatic detection of phishing pages with event-based request processing, deep-hybrid feature extraction and light gradient boosted machine model
Author
Ömer Kasim
Publication date
19.05.2021
Publisher
Springer US
Published in
Telecommunication Systems / Issue 1/2021
Print ISSN: 1018-4864
Electronic ISSN: 1572-9451
DOI
https://doi.org/10.1007/s11235-021-00799-6
