Lifelong Anomaly Detection Through Unlearning

Authors:
Min Du (University of California, Berkeley, Berkeley, CA, USA)
Zhi Chen (University of California, Berkeley, Berkeley, CA, USA)
Chang Liu (Citadel Securities, Chicago, IL, USA)
Rajvardhan Oak (University of California, Berkeley, Berkeley, CA, USA)
Dawn Song (University of California, Berkeley, Berkeley, CA, USA)

CCS '19: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, November 2019, Pages 1283–1297, https://doi.org/10.1145/3319535.3363226

Published: 06 November 2019

ABSTRACT

Anomaly detection is essential to ensuring system security and reliability. Powered by constantly generated system data, deep learning has proven both effective and flexible for this task, since it can extract patterns without much domain knowledge. Existing anomaly detection research focuses on a scenario referred to as zero-positive, in which the detection model is trained only on normal (i.e., negative) data. In a real application, additional manually inspected positive data may become available after the system is deployed. We refer to this scenario as lifelong anomaly detection. However, we find that existing approaches cannot easily incorporate such new knowledge to improve system performance. In this work, we are the first to explore the lifelong anomaly detection problem, and we propose novel approaches to handle the corresponding challenges. In particular, we propose a framework called unlearning, which can effectively correct the model when a false negative (or a false positive) is labeled. To this end, we develop several novel techniques to tackle two challenges referred to as exploding loss and catastrophic forgetting. In addition, we abstract a theoretical framework based on generative models. Under this framework, our unlearning approach can be presented in a generic way and applied to most zero-positive deep learning-based anomaly detection algorithms, turning them into corresponding lifelong anomaly detection solutions. We evaluate our approach using two state-of-the-art zero-positive deep learning anomaly detection architectures and three real-world tasks. The results show that the proposed approach significantly reduces the number of false positives and false negatives through unlearning.
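
The abstract gives no algorithmic detail, so the following is only a rough illustration of the general idea, not the paper's method. It assumes a toy 1-D Gaussian density model standing in for a deep generative model, scores samples by negative log-likelihood, and treats "unlearning" a reported false negative as gradient ascent on that sample's loss. Two further assumptions are made to mirror the challenges the abstract names: the ascent stops once the loss reaches a fixed `bound` (one way to keep the loss from growing without limit), and each step also descends on the normal data (one way to limit forgetting of what was already learned). All names (`fit`, `unlearn`, `bound`, `threshold`) and the data are hypothetical.

```python
import math

def nll(x, mu, log_sigma):
    """Negative log-likelihood of x under a 1-D Gaussian N(mu, sigma^2)."""
    z = (x - mu) / math.exp(log_sigma)
    return 0.5 * math.log(2 * math.pi) + log_sigma + 0.5 * z * z

def nll_grad(x, mu, log_sigma):
    """Gradient of nll with respect to (mu, log_sigma)."""
    sigma = math.exp(log_sigma)
    z = (x - mu) / sigma
    return -z / sigma, 1.0 - z * z

def fit(normal):
    """Zero-positive training: maximum-likelihood fit on normal data only."""
    mu = sum(normal) / len(normal)
    var = sum((x - mu) ** 2 for x in normal) / len(normal)
    return mu, 0.5 * math.log(var)

def unlearn(mu, log_sigma, false_negative, normal, bound, lr=0.01, max_steps=200):
    """Raise the loss of a reported false negative (assumed scheme).

    Each step ascends on the false negative's loss and descends on the
    mean loss of the normal data; the loop stops once the false
    negative's loss reaches `bound`, so it cannot grow indefinitely.
    """
    for _ in range(max_steps):
        if nll(false_negative, mu, log_sigma) >= bound:
            break
        g_mu_fn, g_ls_fn = nll_grad(false_negative, mu, log_sigma)
        grads = [nll_grad(x, mu, log_sigma) for x in normal]
        g_mu_n = sum(g[0] for g in grads) / len(grads)
        g_ls_n = sum(g[1] for g in grads) / len(grads)
        # Descent on normal data minus ascent on the false negative.
        mu -= lr * (g_mu_n - g_mu_fn)
        log_sigma -= lr * (g_ls_n - g_ls_fn)
    return mu, log_sigma

# Hypothetical data: normal readings near 10, and an anomaly the model missed.
normal = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
threshold = 2.0          # scores at or above this are flagged as anomalous
x_fn = 10.4              # labeled anomaly that initially scores below threshold

mu, log_sigma = fit(normal)
score_before = nll(x_fn, mu, log_sigma)
mu2, log_sigma2 = unlearn(mu, log_sigma, x_fn, normal, bound=threshold + 0.5)
score_after = nll(x_fn, mu2, log_sigma2)
print(f"score before unlearning: {score_before:.3f}")
print(f"score after unlearning:  {score_after:.3f}")
```

In this toy setting the reported sample moves from below the threshold to above it, while the normal samples keep scoring below the threshold. A real detector would replace the Gaussian with the deployed model and its own loss.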

Published in
CCS '19: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security
November 2019
2755 pages
ISBN: 9781450367479
DOI: 10.1145/3319535
General Chairs: Lorenzo Cavallaro (King's College London, UK), Johannes Kinder (Bundeswehr University Munich, Germany)
Program Chairs: XiaoFeng Wang (Indiana University, USA), Jonathan Katz (George Mason University, USA)
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
anomaly detection
online learning
unlearning
Qualifiers
- research-article
Acceptance Rates
CCS '19 Paper Acceptance Rate: 149 of 934 submissions, 16%
Overall Acceptance Rate: 1,261 of 6,999 submissions, 18%