DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning

Authors:
Min Du, Feifei Li, Guineng Zheng, Vivek Srikumar
University of Utah, Salt Lake City, UT, USA

CCS '17: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, October 2017, Pages 1285–1298. https://doi.org/10.1145/3133956.3134015

Published: 30 October 2017

ABSTRACT

Anomaly detection is a critical step towards building a secure and trustworthy system. The primary purpose of a system log is to record system states and significant events at various critical points to help debug system failures and perform root cause analysis. Such log data is available in nearly all computer systems. Log data is an important and valuable resource for understanding system status and performance issues; therefore, system logs are naturally an excellent source of information for online monitoring and anomaly detection. We propose DeepLog, a deep neural network model utilizing Long Short-Term Memory (LSTM), to model a system log as a natural language sequence. This allows DeepLog to automatically learn log patterns from normal execution, and to detect anomalies when log patterns deviate from the model trained on log data from normal execution. In addition, we demonstrate how to incrementally update the DeepLog model in an online fashion so that it can adapt to new log patterns over time. Furthermore, DeepLog constructs workflows from the underlying system log so that once an anomaly is detected, users can diagnose it and perform root cause analysis effectively. Extensive experimental evaluations over large log data show that DeepLog outperforms existing log-based anomaly detection methods based on traditional data mining methodologies.
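To make the idea of learning log patterns from normal execution and flagging deviations more concrete, the following is a minimal sketch and not the authors' implementation. It swaps the LSTM for a plain successor-frequency table so that it runs with the Python standard library only. The window size, the `top_g` rule (a key counts as normal if it is among the `top_g` most frequent successors of the preceding window), the function names, and the integer log keys are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train(sequences, window=2, model=None):
    """Count which log key follows each window of preceding keys.

    Passing an existing model adds the new counts to it, which stands in
    for an incremental (online) update in this sketch.
    """
    if model is None:
        model = defaultdict(Counter)
    for seq in sequences:
        for i in range(window, len(seq)):
            history = tuple(seq[i - window:i])
            model[history][seq[i]] += 1
    return model


def detect(model, seq, window=2, top_g=2):
    """Return positions in seq whose key is not among the top_g most
    frequent successors of the preceding window in the trained model."""
    anomalies = []
    for i in range(window, len(seq)):
        history = tuple(seq[i - window:i])
        successors = model.get(history, Counter())
        candidates = {key for key, _ in successors.most_common(top_g)}
        if seq[i] not in candidates:
            anomalies.append(i)
    return anomalies


# Hypothetical log keys: each integer stands for one parsed log message type.
normal_runs = [[1, 2, 3, 4, 5]] * 3 + [[1, 2, 3, 5, 4]]
model = train(normal_runs)

print(detect(model, [1, 2, 3, 4, 5]))  # sequence seen during training
print(detect(model, [1, 2, 3, 9, 5]))  # key 9 never followed (2, 3) in training
```

Calling `train` again with `model=model` and a sequence that an operator has judged to be normal adds its counts to the table, which loosely mirrors the online update described in the abstract. An LSTM would replace the frequency table with a learned conditional distribution over the next log key.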

Index Terms

1. Information systems
  1. Information systems applications
    1. Decision support systems
      1. Online analytical processing
2. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation

Published in
CCS '17: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security
October 2017
2682 pages
ISBN: 9781450349468
DOI: 10.1145/3133956
General Chair: Bhavani Thuraisingham (The University of Texas at Dallas, USA)
Program Chairs: David Evans (University of Virginia), Tal Malkin (Columbia University), Dongyan Xu (Purdue University)
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
anomaly detection
deep learning
log data analysis
Qualifiers
- research-article
Acceptance Rates
CCS '17 Paper Acceptance Rate: 151 of 836 submissions, 18%
Overall Acceptance Rate: 1,261 of 6,999 submissions, 18%
Article Metrics
- Total Citations: 822
- Total Downloads: 18,296
- Downloads (last 12 months): 6,075
- Downloads (last 6 weeks): 1,018