
Discovering Anomalies by Incorporating Feedback from an Expert


Abstract

Unsupervised anomaly detection algorithms search for outliers and then predict that these outliers are the anomalies. When deployed, however, these algorithms are often criticized for high false-positive and high false-negative rates. One main cause of poor performance is that not all outliers are anomalies and not all anomalies are outliers. In this article, we describe the Active Anomaly Discovery (AAD) algorithm, which incorporates feedback from an expert user who labels a queried data instance as an anomaly or a nominal point. This feedback is intended to adjust the anomaly detector so that the outliers it discovers are more in tune with the expert user's semantic understanding of the anomalies.

The AAD algorithm is based on a weighted ensemble of anomaly detectors. When it receives a label from the user, it adjusts the weights on each individual ensemble member so that the anomalies rank higher in anomaly score than the outliers. The AAD approach is designed to operate in an interactive data exploration loop. In each iteration of this loop, our algorithm first selects a data instance to present to the expert as a potential anomaly, and then the expert labels the instance as an anomaly or as a nominal data point. When it receives the instance label, the algorithm updates its internal model, and the loop continues until a budget of B queries is spent. The goal of our approach is to maximize the total number of true anomalies in the B instances presented to the expert. We show that the AAD method performs well and, in some cases, doubles the number of true anomalies found compared to previous methods. In addition, we present approximations that make the AAD algorithm much more computationally efficient while maintaining a desirable level of performance.
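The interactive loop described above can be sketched in a few lines of code. The following is a minimal illustration only, assuming a precomputed score matrix from some off-the-shelf ensemble (e.g., random projections or isolation trees); the function name aad_query_loop, the callable expert_label, and the parameters tau and step are hypothetical, and the weight update shown is a simplified stand-in for the constrained optimization used in the paper, not the authors' exact procedure.

```python
import numpy as np

def aad_query_loop(scores, expert_label, budget, tau=0.97, step=0.1):
    # scores: (n_instances, n_detectors) anomaly scores from the ensemble members.
    # expert_label: callable returning +1 (anomaly) or -1 (nominal) for an instance index.
    # budget: number of expert queries B.
    n, m = scores.shape
    w = np.ones(m) / np.sqrt(m)        # uniform weights = plain ensemble average
    labeled = {}                       # instance index -> +1 / -1
    num_anomalies_found = 0

    for _ in range(budget):
        combined = scores @ w          # weighted anomaly score for every instance
        ranking = np.argsort(-combined)
        q = next(i for i in ranking if i not in labeled)   # top-ranked unlabeled instance
        y = expert_label(q)            # ask the expert for a label
        labeled[q] = y
        num_anomalies_found += int(y == 1)

        # Simplified weight update (a stand-in for the paper's optimization):
        # push labeled anomalies above the tau-quantile of the combined scores
        # and labeled nominals below it.
        q_tau = np.quantile(combined, tau)
        for i, yi in labeled.items():
            if yi * (scores[i] @ w - q_tau) < 0:   # instance on the wrong side
                w = w + step * yi * scores[i]
        w = w / (np.linalg.norm(w) + 1e-12)        # renormalize the weight vector

    return num_anomalies_found, w
```

With the weights held at their uniform initialization, this sketch reduces to an unweighted ensemble average; it is the feedback-driven reweighting that moves the ranking toward the expert's notion of an anomaly.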



• Published in

ACM Transactions on Knowledge Discovery from Data, Volume 14, Issue 4 (August 2020), 316 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/3403605

Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 22 June 2020
• Online AM: 7 May 2020
• Accepted: 1 April 2020
• Revised: 1 November 2019
• Received: 1 June 2017


        Qualifiers

        • research-article
        • Research
        • Refereed
