Abstract
Unsupervised anomaly detection algorithms search for outliers and then predict that these outliers are the anomalies. When deployed, however, these algorithms are often criticized for high false-positive and high false-negative rates. One main cause of this poor performance is that not all outliers are anomalies and not all anomalies are outliers. In this article, we describe the Active Anomaly Discovery (AAD) algorithm, which incorporates feedback from an expert user who labels each queried data instance as an anomaly or a nominal point. This feedback is intended to adjust the anomaly detector so that the outliers it discovers are more in tune with the expert user’s semantic understanding of the anomalies.
The AAD algorithm is based on a weighted ensemble of anomaly detectors. When it receives a label from the user, it adjusts the weight on each ensemble member so that the anomalies rank higher, in terms of their anomaly score, than the outliers. The AAD approach is designed to operate in an interactive data exploration loop. In each iteration of this loop, the algorithm first selects a data instance to present to the expert as a potential anomaly, and the expert then labels the instance as an anomaly or as a nominal data point. On receiving the label, the algorithm updates its internal model, and the loop continues until a budget of B queries is spent. The goal of our approach is to maximize the total number of true anomalies among the B instances presented to the expert. We show that the AAD method performs well and in some cases doubles the number of true anomalies found compared to previous methods. In addition, we present approximations that make the AAD algorithm much more computationally efficient while maintaining a desirable level of performance.
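The interactive loop described above can be illustrated with a minimal Python sketch. It is not the AAD algorithm itself: the abstract does not specify the weight update, so the additive update below is a hypothetical placeholder, and the names `feedback_loop`, `ensemble_score`, `oracle`, and `lr`, as well as the toy scores, are assumptions made for illustration only.

```python
def ensemble_score(weights, detector_scores):
    """Weighted sum of the per-detector scores for one instance."""
    return sum(w * s for w, s in zip(weights, detector_scores))


def feedback_loop(scores, oracle, budget, lr=0.5):
    """Greedy query loop that runs until `budget` labels have been spent.

    scores: one list per instance, holding one score per ensemble member
            (higher means more anomalous).
    oracle: callable mapping an instance index to True if the expert
            labels that instance as an anomaly (hypothetical stand-in).
    Returns (queried indices, number of true anomalies found, final weights).
    """
    n_detectors = len(scores[0])
    weights = [1.0 / n_detectors] * n_detectors  # start from a uniform ensemble
    labeled = {}
    queried = []
    for _ in range(min(budget, len(scores))):
        # Present the unlabeled instance with the highest ensemble score.
        candidates = [i for i in range(len(scores)) if i not in labeled]
        top = max(candidates, key=lambda i: ensemble_score(weights, scores[i]))
        is_anomaly = oracle(top)
        labeled[top] = is_anomaly
        queried.append(top)
        # Placeholder update (an assumption, not the paper's method): shift
        # weight toward detectors that scored a confirmed anomaly highly and
        # away from detectors that scored a confirmed nominal point highly.
        sign = 1.0 if is_anomaly else -1.0
        weights = [max(w + sign * lr * s, 0.0)
                   for w, s in zip(weights, scores[top])]
        total = sum(weights)
        if total == 0.0:
            weights = [1.0 / n_detectors] * n_detectors
        else:
            weights = [w / total for w in weights]
    found = sum(labeled.values())
    return queried, found, weights


# Toy data (made up): detector 0 tracks the true anomalies, while
# detector 1 mostly flags outliers that the expert considers nominal.
toy_scores = [
    [0.9, 0.1],   # 0: anomaly
    [0.8, 0.2],   # 1: anomaly
    [0.1, 0.95],  # 2: nominal outlier
    [0.2, 0.9],   # 3: nominal outlier
    [0.1, 0.1],   # 4: nominal
    [0.7, 0.3],   # 5: anomaly
]
true_anomalies = {0, 1, 5}

queried, found, weights = feedback_loop(
    toy_scores, oracle=lambda i: i in true_anomalies, budget=3
)
print(queried, found, weights)
```

In this toy run the first query lands on a nominal outlier, after which the weight on the misleading detector shrinks and the remaining queries go to true anomalies. This mirrors the stated goal of maximizing the number of true anomalies among the B presented instances, under the simplified update assumed here.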