ABSTRACT
Although deep learning has been applied to successfully address many data mining problems, relatively limited work has been done on deep learning for anomaly detection. Existing deep anomaly detection methods, which focus on learning new feature representations to enable downstream anomaly detection methods, perform indirect optimization of anomaly scores, leading to data-inefficient learning and suboptimal anomaly scoring. Also, they are typically designed as unsupervised learning due to the lack of large-scale labeled anomaly data. As a result, they are difficult to leverage prior knowledge (e.g., a few labeled anomalies) when such information is available as in many real-world anomaly detection applications. This paper introduces a novel anomaly detection framework and its instantiation to address these problems. Instead of representation learning, our method fulfills an end-to-end learning of anomaly scores by a neural deviation learning, in which we leverage a few (e.g., multiple to dozens) labeled anomalies and a prior probability to enforce statistically significant deviations of the anomaly scores of anomalies from that of normal data objects in the upper tail. Extensive results show that our method can be trained substantially more data-efficiently and achieves significantly better anomaly scoring than state-of-the-art competing methods.
- Charu C Aggarwal. 2017. Supervised outlier detection. In Outlier Analysis. Springer, 219--248.Google Scholar
- Jinghui Chen, Saket Sathe, Charu Aggarwal, and Deepak Turaga. 2017. Outlier detection with autoencoder ensembles. In SDM. SIAM, 90--98.Google Scholar
- Franccois Chollet et al. 2015. Keras. https://keras.io.Google Scholar
- Charles Elkan and Keith Noto. 2008. Learning classifiers from only positive and unlabeled data. In KDD. ACM, 213--220. Google ScholarDigital Library
- Li Fei-Fei, Rob Fergus, and Pietro Perona. 2006. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 28, 4 (2006), 594--611. Google ScholarDigital Library
- R. Hadsell, S. Chopra, and Y. LeCun. 2006. Dimensionality Reduction by Learning an Invariant Mapping. In CVPR, Vol. 2. 1735--1742. Google ScholarDigital Library
- Simon Hawkins, Hongxing He, Graham Williams, and Rohan Baxter. 2002. Outlier detection using replicator neural networks. In DaWaK. Springer, 170--180. Google ScholarDigital Library
- Geoffrey Hinton. 2012. Overview of mini-batch gradient descent. (2012). https://www.cs.toronto.edu/ tijmen/csc321/slides/lecture_slides_lec6.pdfGoogle Scholar
- Fabian Keller, Emmanuel Muller, and Klemens Bohm. 2012. HiCS: High contrast subspaces for density-based outlier ranking. In ICDE. IEEE, 1037--1048. Google ScholarDigital Library
- Hans-Peter Kriegel, Peer Kroger, Erich Schubert, and Arthur Zimek. 2011. Interpreting and unifying outlier scores. In SDM. SIAM, 13--24.Google Scholar
- Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. 2015. Deep learning. Nature, Vol. 521, 7553 (2015), 436.Google Scholar
- Ping Li, Trevor J Hastie, and Kenneth W Church. 2006. Very sparse random projections. In KDD. ACM, 287--296. Google ScholarDigital Library
- Xiaoli Li and Bing Liu. 2003. Learning to classify texts using positive and unlabeled data. In IJCAI, Vol. 3. 587--592. Google ScholarDigital Library
- Fei Tony Liu, Kai Ming Ting, and Zhi-Hua Zhou. 2012. Isolation-based anomaly detection. ACM Transactions on Knowledge Discovery from Data, Vol. 6, 1 (2012), 3. Google ScholarDigital Library
- Justin Ma, Lawrence K Saul, Stefan Savage, and Geoffrey M Voelker. 2009. Identifying suspicious URLs: An application of large-scale online learning. In ICML. ACM, 681--688. Google ScholarDigital Library
- Mary McGlohon, Stephen Bay, Markus G Anderle, David M Steier, and Christos Faloutsos. 2009. SNARE: A link analytic system for graph labeling and risk detection. In KDD. ACM, 1265--1274. Google ScholarDigital Library
- Nour Moustafa and Jill Slay. 2015. UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Military Communications and Information Systems Conference, 2015. 1--6.Google ScholarCross Ref
- Guansong Pang, Longbing Cao, Ling Chen, Defu Lian, and Huan Liu. 2018. Sparse modeling-based sequential ensemble learning for effective outlier detection in high-dimensional numeric data. In AAAI. AAAI press, 3892--3899.Google Scholar
- Guansong Pang, Longbing Cao, Ling Chen, and Huan Liu. 2018. Learning Representations of Ultrahigh-dimensional Data for Random Distance-based Outlier Detection. In KDD. 2041--2050. Google ScholarDigital Library
- Lukas Ruff, Nico Görnitz, Lucas Deecke, Shoaib Ahmed Siddiqui, Robert Vandermeulen, Alexander Binder, Emmanuel Müller, and Marius Kloft. 2018. Deep one-class classification. In ICML. 4390--4399.Google Scholar
- Emanuele Sansone, Francesco GB De Natale, and Zhi-Hua Zhou. 2018. Efficient training for positive unlabeled learning. IEEE Transactions on Pattern Analysis and Machine Intelligence (2018).Google Scholar
- Thomas Schlegl, Philipp Seeböck, Sebastian M Waldstein, Ursula Schmidt-Erfurth, and Georg Langs. 2017. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In IPMI. Springer, Cham, 146--157.Google Scholar
- Hinrich Schütze, Christopher D Manning, and Prabhakar Raghavan. 2008. Introduction to Information Retrieval .Cambridge University Press.Google Scholar
- Md Amran Siddiqui, Alan Fern, Thomas G. Dietterich, Ryan Wright, Alec Theriault, and David W. Archer. 2018. Feedback-Guided Anomaly Discovery via Online Optimization. In KDD. ACM, 2200--2209. Google ScholarDigital Library
- Jake Snell, Kevin Swersky, and Richard Zemel. 2017. Prototypical networks for few-shot learning. In NeurIPS. 4077--4087. Google ScholarDigital Library
- Acar Tamersoy, Kevin Roundy, and Duen Horng Chau. 2014. Guilt by association: Large scale malware detection by mining file-relation graphs. In KDD. 1524--1533. Google ScholarDigital Library
- David MJ Tax and Robert PW Duin. 2004. Support vector data description. Machine Learning, Vol. 54, 1 (2004), 45--66. Google ScholarDigital Library
- RF Woolson. 2007. Wilcoxon signed-rank test. Wiley Encyclopedia of Clinical Trials (2007), 1--3.Google Scholar
- Houssam Zenati, Manon Romain, Chuan-Sheng Foo, Bruno Lecouat, and Vijay Chandrasekhar. 2018. Adversarially Learned Anomaly Detection. In ICDM. IEEE, 727--736.Google Scholar
- Chong Zhou and Randy C Paffenroth. 2017. Anomaly detection with robust deep autoencoders. In KDD. ACM, 665--674. Google ScholarDigital Library
Index Terms
- Deep Anomaly Detection with Deviation Networks
Recommendations
Toward Deep Supervised Anomaly Detection: Reinforcement Learning from Partially Labeled Anomaly Data
KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data MiningWe consider the problem of anomaly detection with a small set of partially labeled anomaly examples and a large-scale unlabeled dataset. This is a common scenario in many important applications. Existing related methods either exclusively fit the ...
Toward Explainable Deep Anomaly Detection
KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data MiningAnomaly explanation, also known as anomaly localization, is as important as, if not more than, anomaly detection in many real-world applications. However, it is challenging to build explainable detection models due to the lack of anomaly-supervisory ...
Deep Learning for Anomaly Detection: Challenges, Methods, and Opportunities
WSDM '21: Proceedings of the 14th ACM International Conference on Web Search and Data MiningIn this tutorial we aim to present a comprehensive survey of the advances in deep learning techniques specifically designed for anomaly detection (deep anomaly detection for short). Deep learning has gained tremendous success in transforming many data ...
Comments