Feedback-Guided Anomaly Discovery via Online Optimization

Authors:
Md Amran Siddiqui, Oregon State University, Corvallis, OR, USA
Alan Fern, Oregon State University, Corvallis, OR, USA
Thomas G. Dietterich, Oregon State University, Corvallis, OR, USA
Ryan Wright, Galois, Inc., Portland, OR, USA
Alec Theriault, Galois, Inc., Portland, OR, USA
David W. Archer, Galois, Inc., Portland, OR, USA

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, July 2018, Pages 2200–2209. https://doi.org/10.1145/3219819.3220083

Published: 19 July 2018

ABSTRACT

Anomaly detectors are often used to produce a ranked list of statistical anomalies, which are examined by human analysts in order to extract the actual anomalies of interest. This can be exceedingly difficult and time-consuming when most high-ranking anomalies are false positives and not interesting from an application perspective. In this paper, we study how to reduce the analyst's effort by incorporating their feedback about whether the anomalies they investigate are of interest or not. In particular, the feedback will be used to adjust the anomaly ranking after every analyst interaction, ideally moving anomalies of interest closer to the top. Our main contribution is to formulate this problem within the framework of online convex optimization, which yields an efficient and extremely simple approach to incorporating feedback compared to the prior state of the art. We instantiate this approach for the powerful class of tree-based anomaly detectors and conduct experiments on a range of benchmark datasets. The results demonstrate the utility of incorporating feedback and the advantages of our approach over the state of the art. In addition, we present results on a significant cybersecurity application where the goal is to detect red-team attacks in real system audit data. We show that our approach for incorporating feedback is able to significantly reduce the time required to identify malicious system entities across multiple attacks on multiple operating systems.
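
The feedback loop described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it assumes a linear scoring function over non-negative per-instance features, a linear loss, and a plain projected online gradient step, none of which the abstract specifies. The names `score`, `feedback_loop`, `features`, and `is_interesting` are hypothetical and chosen only for illustration.

```python
def score(w, x):
    """Anomaly score of feature vector x under weights w (higher = more anomalous)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def feedback_loop(features, oracle, budget, lr=1.0):
    """Repeatedly show the analyst the top-ranked unlabeled instance and
    update the weights with one online gradient step per response.

    features: list of non-negative feature vectors (assumed detector output)
    oracle:   callable, index -> True if the analyst marks it as of interest
    Returns the indices in the order they were shown to the analyst.
    """
    w = [1.0] * len(features[0])          # assumed start: all features weighted equally
    unlabeled = set(range(len(features)))
    queried = []
    for _ in range(min(budget, len(features))):
        # Highest score wins; ties go to the smallest index.
        top = max(unlabeled, key=lambda i: (score(w, features[i]), -i))
        unlabeled.remove(top)
        queried.append(top)
        # Assumed linear loss: -y * score, with y = +1 (of interest) or -1 (not).
        y = 1.0 if oracle(top) else -1.0
        # The loss gradient w.r.t. w is -y * x; step against it, then clip
        # at zero so the weights stay non-negative.
        w = [max(0.0, wi + lr * y * xi) for wi, xi in zip(w, features[top])]
    return queried


# Toy data: feature 0 fires on uninteresting outliers, feature 1 on real ones.
features = [[5, 0], [4, 0], [3, 0], [0, 2], [0, 1.5], [1, 0]]
is_interesting = lambda i: i in {3, 4}

print(feedback_loop(features, is_interesting, budget=3))          # [0, 3, 4]
print(feedback_loop(features, is_interesting, budget=3, lr=0.0))  # [0, 1, 2]
```

In this toy run, one negative response on instance 0 drives the weight of the misleading feature to zero, so the two instances of interest are shown next. With the update disabled (`lr=0.0`) the analyst would see three false positives first.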

Supplemental Material

siddiqui_feedback_discovery.mp4 (mp4, 380.3 MB)

Index Terms

Feedback-Guided Anomaly Discovery via Online Optimization
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
        Anomaly detection
    2. Learning settings
      1. Online learning settings

Published in
KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018
2925 pages
ISBN:9781450355520
DOI:10.1145/3219819
General Chairs: Yike Guo (Imperial College London), Faisal Farooq (IBM)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
anomaly detection
anomaly detection feedback
anomaly detection on security
feedback in linear model
online convex optimization
Qualifiers
- research-article
Acceptance Rates
KDD '18 Paper Acceptance Rate: 107 of 983 submissions, 11%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%