
26-11-2021

A Comprehensive Survey of Anomaly Detection Algorithms

Authors: Durgesh Samariya, Amit Thakkar

Published in: Annals of Data Science | Issue 3/2023


Abstract

Anomaly or outlier detection is considered one of the vital applications of data mining, which deals with anomalies or outliers. Anomalies are data points that are dramatically different from the rest of the data points. In this survey, we comprehensively present anomaly detection algorithms in an organized manner. We begin with the definition of an anomaly, then provide the essential elements of anomaly detection, such as the different types of anomaly, the application domains, and the evaluation measures. The anomaly detection algorithms are grouped into seven categories based on their working mechanisms, covering a total of 52 algorithms. The categories are anomaly detection algorithms based on statistics, density, distance, clustering, isolation, ensemble, and subspace. For each category, we provide the time complexity of each algorithm and the category's general advantages and disadvantages. Finally, we compare all discussed anomaly detection algorithms in detail.


Footnotes
1
Anomaly and outlier are widely used terms. In this work, we use both terms interchangeably.
 
2
Anomaly detection and outlier detection are widely used terms. In this paper, we use both terms interchangeably.
 
3
The time complexity of this kind of algorithm can be reduced to \(O(n\log (n))\) by using a good indexing structure, but such structures are not feasible in high-dimensional space. We therefore report time complexities without such an index throughout the paper.
 
4
Clustered anomalies are anomalies that form a small cluster of a few points outside the normal cluster.
 
5
Some algorithms choose the subspace based on a statistical test (e.g. HiCS, CMI), and some choose it randomly (e.g. Zero++).
 
6
Subspace-based anomaly detection algorithms must first search for the subspace, which takes additional time that depends on the search method. We provide only the scoring time within a subspace.
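
Footnote 3 contrasts scoring with and without an index. The sketch below is an illustration only and is not taken from the survey: it uses the distance to the nearest neighbour as a stand-in anomaly score on one-dimensional data, where a sorted array can play the role of the index. The function names `nn_scores_bruteforce` and `nn_scores_indexed` are hypothetical, and both assume at least two input values.

```python
def nn_scores_bruteforce(values):
    """Nearest-neighbour distance per point, comparing all pairs: O(n^2)."""
    scores = []
    for i, x in enumerate(values):
        # Distance from point i to its closest other point.
        scores.append(min(abs(x - y) for j, y in enumerate(values) if j != i))
    return scores


def nn_scores_indexed(values):
    """Same scores using a sorted order as a 1-D index: O(n log n)."""
    # Sorting is the O(n log n) step; afterwards each point's nearest
    # neighbour must be one of its (at most two) neighbours in sorted order.
    order = sorted(range(len(values)), key=values.__getitem__)
    scores = [0] * len(values)
    for pos, i in enumerate(order):
        gaps = []
        if pos > 0:
            gaps.append(values[i] - values[order[pos - 1]])
        if pos + 1 < len(order):
            gaps.append(values[order[pos + 1]] - values[i])
        scores[i] = min(gaps)
    return scores


data = [10, 12, 9, 11, 50]
scores = nn_scores_indexed(data)
# The point with the largest score is the most isolated one.
most_anomalous = max(range(len(data)), key=scores.__getitem__)
print(scores, data[most_anomalous])
```

Both functions return the same scores; only the cost differs. In more than one dimension, tree-based indexes take the place of the sorted array, and, as the footnote notes, they stop being practical as dimensionality grows.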
 
88.
go back to reference Ting KM, Zhou G-T, Liu FT, Tan JSC (2010) Mass estimation and its applications. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’10, Association for Computing Machinery, New York, NY, USA, pp 989–998 Ting KM, Zhou G-T, Liu FT, Tan JSC (2010) Mass estimation and its applications. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’10, Association for Computing Machinery, New York, NY, USA, pp 989–998
89.
go back to reference Fernando TL, Webb GI (2017) Simusf: an efficient and effective similarity measure that is invariant to violations of the interval scale assumption. Data Min Knowl Disc 31(1):264–286CrossRef Fernando TL, Webb GI (2017) Simusf: an efficient and effective similarity measure that is invariant to violations of the interval scale assumption. Data Min Knowl Disc 31(1):264–286CrossRef
90.
go back to reference Ting KM, Washio T, Wells JR, Aryal S (2017) Defying the gravity of learning curve: a characteristic of nearest neighbour anomaly detectors. Mach Learn 106(1):55–91CrossRef Ting KM, Washio T, Wells JR, Aryal S (2017) Defying the gravity of learning curve: a characteristic of nearest neighbour anomaly detectors. Mach Learn 106(1):55–91CrossRef
91.
go back to reference Bandaragoda TR (2015) Isolation based anomaly detection: a re-examination. PhD thesis, Monash University Bandaragoda TR (2015) Isolation based anomaly detection: a re-examination. PhD thesis, Monash University
92.
go back to reference Pevnỳ T (2016) Loda: lightweight on-line detector of anomalies. Mach Learn 102(2):275–304CrossRef Pevnỳ T (2016) Loda: lightweight on-line detector of anomalies. Mach Learn 102(2):275–304CrossRef
93.
go back to reference Zhao Y, Hryniewicki MK (2018) DCSO: dynamic combination of detector scores for outlier ensembles. In: ACM SIGKDD ODD workshop, London, UK Zhao Y, Hryniewicki MK (2018) DCSO: dynamic combination of detector scores for outlier ensembles. In: ACM SIGKDD ODD workshop, London, UK
94.
go back to reference Zhao Y, Nasrullah Z, Hryniewicki MK, Li Z (2019) LSCP: locally selective combination in parallel outlier ensembles. In: Proceedings of the 2019 SIAM international conference on data mining, SDM 2019, Calgary, Canada, pp 585–593 Zhao Y, Nasrullah Z, Hryniewicki MK, Li Z (2019) LSCP: locally selective combination in parallel outlier ensembles. In: Proceedings of the 2019 SIAM international conference on data mining, SDM 2019, Calgary, Canada, pp 585–593
95.
go back to reference Aggarwal CC (2013) Outlier ensembles: position paper. SIGKDD Explor Newsl 14(2):49–58CrossRef Aggarwal CC (2013) Outlier ensembles: position paper. SIGKDD Explor Newsl 14(2):49–58CrossRef
96.
97.
go back to reference Zimek A, Campello RJGB, Sander J (2014) Ensembles for unsupervised outlier detection: challenges and research questions a position paper. SIGKDD Explor Newsl 15(2):11–22CrossRef Zimek A, Campello RJGB, Sander J (2014) Ensembles for unsupervised outlier detection: challenges and research questions a position paper. SIGKDD Explor Newsl 15(2):11–22CrossRef
98.
go back to reference Kriegel H-P, Kröger P, Schubert E, Zimek A (2009) Outlier detection in axis-parallel subspaces of high dimensional data. In: Theeramunkong T, Kijsirikul B, Cercone N, Ho TB (eds) Advances in knowledge discovery and data mining. Springer, Berlin, pp 831–838CrossRef Kriegel H-P, Kröger P, Schubert E, Zimek A (2009) Outlier detection in axis-parallel subspaces of high dimensional data. In: Theeramunkong T, Kijsirikul B, Cercone N, Ho TB (eds) Advances in knowledge discovery and data mining. Springer, Berlin, pp 831–838CrossRef
99.
go back to reference Agrawal A (2009) Local subspace based outlier detection. In: Ranka S, Aluru S, Buyya R, Chung Y-C, Dua S, Grama A, Gupta SKS, Kumar R, Phoha VV (eds) Contemporary computing. Springer, Heidelberg, pp 149–157CrossRef Agrawal A (2009) Local subspace based outlier detection. In: Ranka S, Aluru S, Buyya R, Chung Y-C, Dua S, Grama A, Gupta SKS, Kumar R, Phoha VV (eds) Contemporary computing. Springer, Heidelberg, pp 149–157CrossRef
100.
go back to reference Nguyen HV, Gopalkrishnan V, Assent I (2011) An unbiased distance-based outlier detection approach for high-dimensional data. In: Jeffrey XY, Myoung HK, Rainer U (eds) Database systems for advanced applications. Springer, Berlin, pp 138–152CrossRef Nguyen HV, Gopalkrishnan V, Assent I (2011) An unbiased distance-based outlier detection approach for high-dimensional data. In: Jeffrey XY, Myoung HK, Rainer U (eds) Database systems for advanced applications. Springer, Berlin, pp 138–152CrossRef
101.
go back to reference Kriegel H, Kröger P, Schubert E, Zimek A (2012) Outlier detection in arbitrarily oriented subspaces. In: 2012 IEEE 12th international conference on data mining, pp 379–388 Kriegel H, Kröger P, Schubert E, Zimek A (2012) Outlier detection in arbitrarily oriented subspaces. In: 2012 IEEE 12th international conference on data mining, pp 379–388
102.
go back to reference Keller F, Muller E, Bohm K (2012) Hics: high contrast subspaces for density-based outlier ranking. In: 2012 IEEE 28th international conference on data engineering, pp 1037–1048 Keller F, Muller E, Bohm K (2012) Hics: high contrast subspaces for density-based outlier ranking. In: 2012 IEEE 28th international conference on data engineering, pp 1037–1048
103.
go back to reference Nguyen HV, Müller E, Vreeken J, Keller F, Böhm, K (2013) Cmi: an information-theoretic contrast measure for enhancing subspace cluster and outlier detection. In: Proceedings of the 2013 SIAM international conference on data mining, SIAM, pp 198–206 Nguyen HV, Müller E, Vreeken J, Keller F, Böhm, K (2013) Cmi: an information-theoretic contrast measure for enhancing subspace cluster and outlier detection. In: Proceedings of the 2013 SIAM international conference on data mining, SIAM, pp 198–206
104.
go back to reference Pang G, Ting KM, Albrecht D, Jin H (2016) Zero++: harnessing the power of zero appearances to detect anomalies in large-scale data sets. J Artif Intell Res 57:593–620CrossRef Pang G, Ting KM, Albrecht D, Jin H (2016) Zero++: harnessing the power of zero appearances to detect anomalies in large-scale data sets. J Artif Intell Res 57:593–620CrossRef
105.
go back to reference Aggarwal CC (2017) High-dimensional outlier detection: the subspace method, Springer International Publishing, Cham, pp 149–184 Aggarwal CC (2017) High-dimensional outlier detection: the subspace method, Springer International Publishing, Cham, pp 149–184
106.
go back to reference Zimek A, Schubert E, Kriegel H-P (2012) A survey on unsupervised outlier detection in high-dimensional numerical data. Stat Anal Data Min ASA Data Sci J 5(5):363–387CrossRef Zimek A, Schubert E, Kriegel H-P (2012) A survey on unsupervised outlier detection in high-dimensional numerical data. Stat Anal Data Min ASA Data Sci J 5(5):363–387CrossRef
Metadata
Title: A Comprehensive Survey of Anomaly Detection Algorithms
Authors: Durgesh Samariya, Amit Thakkar
Publication date: 26-11-2021
Publisher: Springer Berlin Heidelberg
Published in: Annals of Data Science | Issue 3/2023
Print ISSN: 2198-5804
Electronic ISSN: 2198-5812
DOI: https://doi.org/10.1007/s40745-021-00362-9