Discovering Anomalies by Incorporating Feedback from an Expert

Authors:
Shubhomoy Das, Oregon State University, SW Park Terrace, Corvallis, Oregon
Weng-Keen Wong, Oregon State University, SW Park Terrace, Corvallis, Oregon
Thomas Dietterich, Oregon State University, SW Park Terrace, Corvallis, Oregon
Alan Fern, Oregon State University, SW Park Terrace, Corvallis, Oregon
Andrew Emmott, Oregon State University, SW Park Terrace, Corvallis, Oregon

ACM Transactions on Knowledge Discovery from Data, Volume 14, Issue 4, Article No. 49, pp. 1–32
https://doi.org/10.1145/3396608

Published: 22 June 2020

Abstract

Unsupervised anomaly detection algorithms search for outliers and then predict that these outliers are the anomalies. When deployed, however, these algorithms are often criticized for high false-positive and high false-negative rates. One main cause of poor performance is that not all outliers are anomalies and not all anomalies are outliers. In this article, we describe the Active Anomaly Discovery (AAD) algorithm, which incorporates feedback from an expert user who labels a queried data instance as an anomaly or a nominal point. This feedback is intended to adjust the anomaly detector so that the outliers it discovers are more in tune with the expert user’s semantic understanding of the anomalies.

The AAD algorithm is based on a weighted ensemble of anomaly detectors. When it receives a label from the user, it adjusts the weight of each ensemble member so that the anomalies rank higher, in terms of their anomaly score, than the outliers. The AAD approach is designed to operate in an interactive data exploration loop. In each iteration of this loop, the algorithm first selects a data instance to present to the expert as a potential anomaly, and the expert then labels the instance as an anomaly or as a nominal data point. When it receives the instance label, the algorithm updates its internal model, and the loop continues until a budget of B queries is spent. The goal of our approach is to maximize the total number of true anomalies among the B instances presented to the expert. We show that the AAD method performs well and in some cases doubles the number of true anomalies found compared to previous methods. In addition, we present approximations that make the AAD algorithm much more computationally efficient while maintaining a desirable level of performance.
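
The interactive loop described in the abstract can be illustrated with a minimal sketch. It only mirrors the loop structure (query the top-ranked unlabeled instance, receive a label, update the ensemble weights, stop after B queries). The weight update shown here is a simple additive stand-in chosen for brevity and is not the optimization used by AAD; the names `active_anomaly_loop`, `oracle`, and `lr` are hypothetical.

```python
import math


def active_anomaly_loop(scores, oracle, budget, lr=0.5):
    """Run a feedback loop over a weighted ensemble of anomaly detectors.

    scores: list of rows, one per instance; each row holds one score per
            ensemble member (higher means more anomalous).
    oracle: callable taking an instance index and returning True if the
            expert labels it an anomaly, False if nominal.
    budget: maximum number of queries B.
    lr:     step size of the stand-in weight update (an assumption).
    Returns (queried indices, labels, final weights).
    """
    n, m = len(scores), len(scores[0])
    # Start from uniform weights with unit norm.
    w = [1.0 / math.sqrt(m)] * m
    labeled = set()
    queried, labels = [], []

    for _ in range(min(budget, n)):
        # Present the unlabeled instance with the highest weighted score.
        best = max(
            (i for i in range(n) if i not in labeled),
            key=lambda i: sum(wj * sj for wj, sj in zip(w, scores[i])),
        )
        is_anomaly = bool(oracle(best))
        labeled.add(best)
        queried.append(best)
        labels.append(is_anomaly)

        # Stand-in update (NOT the AAD objective): move the weights toward
        # the score vector of a confirmed anomaly, away from a nominal one.
        sign = 1.0 if is_anomaly else -1.0
        w = [wj + sign * lr * sj for wj, sj in zip(w, scores[best])]
        norm = math.sqrt(sum(wj * wj for wj in w))
        if norm > 0:
            w = [wj / norm for wj in w]

    return queried, labels, w
```

Under these assumptions, the quantity the abstract aims to maximize is `sum(labels)`, the number of true anomalies among the B instances shown to the expert.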

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Learning settings
      1. Active learning settings
    2. Machine learning algorithms
      1. Ensemble methods

Published in
ACM Transactions on Knowledge Discovery from Data, Volume 14, Issue 4
August 2020
316 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/3403605
Editors:
Charu Aggarwal, IBM T. J. Watson Research, USA
Xindong Wu, Mininglamp Academy of Sciences, China
Copyright © 2020 ACM
Publisher
Association for Computing Machinery, New York, NY, United States
Publication History
- Published: 22 June 2020
- Online AM: 7 May 2020
- Accepted: 1 April 2020
- Revised: 1 November 2019
- Received: 1 June 2017

Author Tags
Active learning
anomaly detection
optimization
user feedback
Qualifiers
- research-article
- Research
- Refereed