Published in: Soft Computing 22/2019

18-01-2019 | Methodologies and Application

WebHound: a data-driven intrusion detection from real-world web access logs

Authors: Te-En Wei, Hahn-Ming Lee, Albert B. Jeng, Hemank Lamba, Christos Faloutsos

Abstract

Before invading a corporate environment, hackers usually discover and exploit vulnerabilities at its entry points. Web server exploration and spam are two popular means of gaining access to enterprise computer systems. In this paper, we focus on protecting a web server against such intrusion threats. In the discovery stage, hackers use web vulnerability scanning tools (e.g., SQLMap, NMap, and Kali) to learn the web server version and related vulnerabilities. In the exploitation stage, they develop a customized intrusion method that exploits the previously learned vulnerabilities to launch a subsequent attack. Currently, the most popular defense approaches (e.g., IDS, WAF) detect web server intrusion events through domain expert rules and anomaly pattern matching. For example, ModSecurity is an open-source WAF that detects only known malware signatures using domain expert rules. Such approaches defend well against intrusions in the first, discovery stage. However, they are not effective against customized intrusions in the second, exploitation stage, since no rules or signatures are available for them. To resolve this problem, we propose WebHound, an unsupervised, data-driven anomaly detection approach. By analyzing large-scale web access logs, it not only identifies hackers' reconnaissance but also detects the customized intrusion methods they deploy. Moreover, WebHound provides intrusion evidence as a storyline for reconstructing the intrusion procedure. Among numerous experiments and case studies, we applied WebHound to a special government case for intrusion evidence investigation and compared our results with the work of computer forensic experts. The results showed that WebHound could discover more intrusion evidence than the human experts. We also compared WebHound with ModSecurity, updated with the newest domain expert rules, running in a virtualized corporate environment. The experimental results show that WebHound has a better accuracy rate than ModSecurity. In summary, WebHound alleviates the heavy demand on expert knowledge and human effort to detect cyber-attacks on a web server, and it also enhances detection accuracy and recall. Moreover, WebHound can provide more evidence for forensic experts to trace the original entry points.
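
The abstract does not specify WebHound's algorithm, so the following is only a minimal sketch of the general idea of unsupervised anomaly detection on web access logs, not the authors' method. It assumes logs in Common Log Format, and the feature choice (distinct paths per client), the modified z-score rule, the 3.5 threshold, and the function names `client_features` and `robust_outliers` are all illustrative assumptions.

```python
import re
from collections import defaultdict
from statistics import median

# Assumed input: Common Log Format, e.g.
# 10.0.0.1 - - [18/Jan/2019:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def client_features(lines):
    """Return {host: (request_count, distinct_paths, error_ratio)}."""
    total = defaultdict(int)
    errors = defaultdict(int)
    paths = defaultdict(set)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        host = match.group("host")
        total[host] += 1
        paths[host].add(match.group("path"))
        if match.group("status").startswith(("4", "5")):
            errors[host] += 1
    return {h: (total[h], len(paths[h]), errors[h] / total[h]) for h in total}


def robust_outliers(features, threshold=3.5):
    """Flag hosts requesting unusually many distinct paths.

    Uses a one-sided modified z-score based on the median absolute
    deviation (MAD); no labels or signatures are needed.
    """
    counts = {host: f[1] for host, f in features.items()}
    med = median(counts.values())
    mad = median(abs(v - med) for v in counts.values())
    if mad == 0:
        return []  # no spread to score against in this simple sketch
    return [
        host
        for host, v in counts.items()
        if 0.6745 * (v - med) / mad > threshold
    ]
```

Calling `robust_outliers(client_features(open_log_lines))` on an iterable of log lines returns the hosts whose distinct-path count stands far above the typical client, which is the kind of scanner-like behaviour a rule-free detector can surface. A real system would combine many more features and handle the zero-spread case more carefully.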

Metadata
Title
WebHound: a data-driven intrusion detection from real-world web access logs
Authors
Te-En Wei
Hahn-Ming Lee
Albert B. Jeng
Hemank Lamba
Christos Faloutsos
Publication date
18-01-2019
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 22/2019
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-018-03750-1
