Published in: World Wide Web 4/2017

29.09.2016

Two-stage ELM for phishing Web pages detection using hybrid features

Authors: Wei Zhang, Qingshan Jiang, Lifei Chen, Chengming Li

Abstract

A growing volume of phishing attacks is encountered every day, driven by the high financial returns available to attackers. Recently, there has been significant interest in applying machine learning to phishing Web page detection. Unlike prior work, this paper uses the predicted labels of textual content as part of the feature set and proposes a novel framework for phishing Web page detection based on hybrid features: URL-based, Web-based, rule-based and textual content-based features. We realize this framework with an efficient two-stage extreme learning machine (ELM). The first stage constructs classification models on the textual content of Web pages using ELM; in this stage, Optical Character Recognition (OCR) serves as an auxiliary tool for extracting textual content from Web pages rendered as images. In the second stage, a classification model on the hybrid features is built using an ensemble of ELMs combined through a linear combination model (LC-ELMs), with the combination weights calculated by the generalized inverse. Experimental results indicate that the proposed framework is promising for detecting phishing Web pages.
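The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no architectural details, so the sigmoid activation, the hidden-layer size, the ensemble size, the synthetic data, and all names (`ELM`, `X_text`, `X_other`, `X_hybrid`) are assumptions made purely for illustration. The sketch assumes NumPy is available and only shows the general pattern: a basic ELM whose output weights come from the generalized inverse, a stage-one predicted label appended to the remaining features, and a linear combination of several ELMs whose weights are again solved with the generalized inverse.

```python
import numpy as np


class ELM:
    """Basic single-hidden-layer ELM (illustrative; hyperparameters are assumptions)."""

    def __init__(self, n_hidden, rng):
        self.n_hidden = n_hidden
        self.rng = rng

    def _hidden(self, X):
        # Sigmoid activations of the fixed random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        # Hidden-layer parameters are drawn once at random and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: least-squares solution via the Moore-Penrose generalized inverse.
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


rng = np.random.default_rng(0)
n = 200
# Synthetic stand-ins (assumptions): textual features and the other
# (URL-based, Web-based, rule-based) features; labels are +1 / -1.
X_text = rng.normal(size=(n, 8))
X_other = rng.normal(size=(n, 5))
y = np.where(X_text[:, 0] + X_other[:, 0] > 0, 1.0, -1.0)

# Stage 1: an ELM on textual features; its predicted label becomes one more feature.
stage1 = ELM(20, rng).fit(X_text, y)
text_label = np.sign(stage1.predict(X_text))
X_hybrid = np.column_stack([X_other, text_label])

# Stage 2: several ELMs on the hybrid features, combined linearly.
models = [ELM(20, rng).fit(X_hybrid, y) for _ in range(5)]
P = np.column_stack([m.predict(X_hybrid) for m in models])
# Combination weights solved with the generalized inverse of the prediction matrix.
weights = np.linalg.pinv(P) @ y
scores = P @ weights
predicted = np.sign(scores)
```

Because the combination weights are a least-squares solution, the combined training error cannot exceed that of any single ensemble member. For brevity the sketch reuses training-set predictions as the stage-one feature; a real pipeline would more likely produce them on held-out data.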

Metadata
Title: Two-stage ELM for phishing Web pages detection using hybrid features
Authors: Wei Zhang, Qingshan Jiang, Lifei Chen, Chengming Li
Publication date: 29.09.2016
Publisher: Springer US
Published in: World Wide Web, Issue 4/2017
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI: https://doi.org/10.1007/s11280-016-0418-9
