DOI: 10.1145/3319535.3363226
research-article
Open Access

Lifelong Anomaly Detection Through Unlearning

Published: 06 November 2019

ABSTRACT

Anomaly detection is essential to ensuring system security and reliability. Powered by constantly generated system data, deep learning has been found both effective and flexible to use, with its ability to extract patterns without much domain knowledge. Existing anomaly detection research focuses on a scenario referred to as zero-positive, which means that the detection model is trained only on normal (i.e., negative) data. In a real application scenario, there may be additional manually inspected positive data provided after the system is deployed. We refer to this scenario as lifelong anomaly detection. However, we find that existing approaches cannot easily incorporate such new knowledge to improve system performance. In this work, we are the first to explore the lifelong anomaly detection problem, and we propose novel approaches to handle the corresponding challenges. In particular, we propose a framework called unlearning, which can effectively correct the model when a false negative (or a false positive) is labeled. To this end, we develop several novel techniques to tackle two challenges referred to as exploding loss and catastrophic forgetting. In addition, we abstract a theoretical framework based on generative models. Under this framework, our unlearning approach can be presented in a generic way and applied to most zero-positive deep learning-based anomaly detection algorithms, turning them into corresponding lifelong anomaly detection solutions. We evaluate our approach using two state-of-the-art zero-positive deep learning anomaly detection architectures and three real-world tasks. The results show that the proposed approach significantly reduces the number of false positives and false negatives through unlearning.
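The abstract gives no implementation details, so the following is only a rough illustration of the idea, not the authors' method. It assumes a toy next-event model (a table of softmax logits over three hypothetical log keys), trains it on data that contains one unlabeled anomalous transition, and then "unlearns" that transition by gradient ascent on its loss once it has been labeled. Two simplifications are assumptions of this sketch: the ascent stops once the loss reaches a fixed bound (one simple way to keep the loss from growing without limit), and each ascent step is followed by ordinary training steps on known-normal transitions (a plain rehearsal stand-in for whatever the paper uses against catastrophic forgetting). All names, data, and constants are hypothetical.

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def prob(W, a, b):
    """Model probability that key b follows key a."""
    return softmax(W[a])[b]

def nll(W, a, b):
    """Loss of the transition a -> b: negative log-probability."""
    return -math.log(prob(W, a, b))

def step(W, a, b, lr):
    """One gradient step on nll(a -> b). lr > 0 learns, lr < 0 unlearns."""
    p = softmax(W[a])
    for k in range(len(p)):
        grad = p[k] - (1.0 if k == b else 0.0)
        W[a][k] -= lr * grad

def train(W, pairs, lr=0.5, epochs=200):
    """Zero-positive training: every pair is treated as normal."""
    for _ in range(epochs):
        for a, b in pairs:
            step(W, a, b, lr)

def unlearn(W, bad, normal, bound, lr=0.5, max_steps=200):
    """Raise the loss of a labeled anomaly up to `bound` (assumed cap),
    rehearsing known-normal pairs after each ascent step."""
    a, b = bad
    for _ in range(max_steps):
        if nll(W, a, b) >= bound:      # bounded loss: stop here
            break
        step(W, a, b, -lr)             # gradient ascent on the anomaly
        for na, nb in normal:          # rehearsal on normal data
            step(W, na, nb, lr)

# Hypothetical data: the transition 0 -> 2 is anomalous but unlabeled.
TRAIN = [(0, 1), (0, 1), (0, 1), (0, 2), (1, 2), (2, 0)]
CLEAN = [(0, 1), (1, 2), (2, 0)]
THRESHOLD = 0.1                        # flag a transition if prob < THRESHOLD

W = [[0.0] * 3 for _ in range(3)]
train(W, TRAIN)
p_before = prob(W, 0, 2)               # above THRESHOLD: a false negative

unlearn(W, bad=(0, 2), normal=CLEAN, bound=-math.log(0.05))
p_after = prob(W, 0, 2)                # below THRESHOLD: now flagged

print(f"p(2|0) before unlearning: {p_before:.3f}")
print(f"p(2|0) after unlearning:  {p_after:.3f}")
print(f"p(1|0) after unlearning:  {prob(W, 0, 1):.3f}")
```

In this toy setting the labeled transition drops below the detection threshold while the normal transitions keep high probability. A real detector would use a deep model in place of the logit table.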

Supplemental Material

p1283-shen.webm (webm, 75 MB)

          Published in

          CCS '19: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security
          November 2019
          2755 pages
          ISBN: 9781450367479
          DOI: 10.1145/3319535

          Copyright © 2019 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          CCS '19 paper acceptance rate: 149 of 934 submissions, 16%
          Overall acceptance rate: 1,261 of 6,999 submissions, 18%
