Published in: Annals of Data Science 3/2023

26.11.2021

A Comprehensive Survey of Anomaly Detection Algorithms

Authors: Durgesh Samariya, Amit Thakkar

Abstract

Anomaly or outlier detection is considered one of the vital applications of data mining. Anomalies are data points that differ dramatically from the rest of the data. In this survey, we comprehensively present anomaly detection algorithms in an organized manner. We begin with the definition of an anomaly, then provide the essential elements of anomaly detection, such as the different types of anomaly, the application domains, and the evaluation measures. The anomaly detection algorithms are grouped into seven categories based on their working mechanisms, covering a total of 52 algorithms. The categories are anomaly detection algorithms based on statistics, density, distance, clustering, isolation, ensemble, and subspace. For each category, we provide the time complexity of each algorithm and the category's general advantages and disadvantages. Finally, we compare all of the discussed anomaly detection algorithms in detail.

Footnotes
1
Anomaly and outlier are widely used terms. In this work, we use both terms interchangeably.

2
Anomaly detection and outlier detection are widely used terms. In this paper, we use both terms interchangeably.

3
The time complexity of this kind of algorithm can be reduced to \(O(n\log (n))\) by using a good indexing structure, but such indexes are not feasible in high-dimensional space. Thus, we state time complexities without such an index throughout the paper.

4
Clustered anomalies are anomalies that form a cluster of a few points outside the normal cluster.

5
Some algorithms choose the subspace based on a statistical test (e.g. HiCS, CMI), and some choose it randomly (e.g. Zero++).

6
Subspace-based anomaly detection algorithms must search for the subspace, which requires additional time that depends on the search method. We provide only the scoring time within a subspace.
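
As a minimal sketch of what footnote 3 means by a time complexity stated without an index, the Python below scores each point by the distance to its k-th nearest neighbour using a plain pairwise scan. The function name `knn_distance_scores`, the toy data, and the value of `k` are illustrative assumptions and are not taken from the survey. The scan performs on the order of \(n^2\) distance computations, which is the cost an index structure would aim to reduce.

```python
import heapq
import math


def knn_distance_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbour.

    Brute force, no index: every point is compared with every other
    point, so the scan needs O(n^2) distance computations.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to all other points (the quadratic step).
        dists = (math.dist(p, q) for j, q in enumerate(points) if j != i)
        # Keep only the k smallest distances; the largest of them is
        # the distance to the k-th nearest neighbour.
        scores.append(heapq.nsmallest(k, dists)[-1])
    return scores


# Hypothetical toy data: a tight group plus one distant point.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = knn_distance_scores(data, k=2)

# A larger score means the point is farther from its neighbours.
most_anomalous = max(range(len(data)), key=scores.__getitem__)
```

A higher score marks a point as more isolated from its neighbours; in this toy data the distant point receives the largest score. This is only one simple distance-based score, chosen here because its cost is easy to read off the loop structure.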
 
Literatur
1.
Zurück zum Zitat Tien JM (2017) Internet of things, real-time decision making, and artificial intelligence. Ann Data Sci 4(2):149–178CrossRef Tien JM (2017) Internet of things, real-time decision making, and artificial intelligence. Ann Data Sci 4(2):149–178CrossRef
2.
Zurück zum Zitat Ahmed M, Najmul Islam AKM (2020) Deep learning: hope or hype. Ann Data Sci 7(3):427–432CrossRef Ahmed M, Najmul Islam AKM (2020) Deep learning: hope or hype. Ann Data Sci 7(3):427–432CrossRef
3.
Zurück zum Zitat Chandola V, Banerjee A, Kumar V (2007) Outlier detection: a survey. ACM Comput Surv 14:15 Chandola V, Banerjee A, Kumar V (2007) Outlier detection: a survey. ACM Comput Surv 14:15
4.
Zurück zum Zitat Grubbs FE (1969) Procedures for detecting outlying observations in samples. Technometrics 11(1):1–21CrossRef Grubbs FE (1969) Procedures for detecting outlying observations in samples. Technometrics 11(1):1–21CrossRef
5.
Zurück zum Zitat Hawkins DM (1980) Identification of outliers, vol 11. Springer, BerlinCrossRef Hawkins DM (1980) Identification of outliers, vol 11. Springer, BerlinCrossRef
6.
Zurück zum Zitat Barnett V, Lewis T (1984) Outliers in statistical data, 3rd edn. Wiley, New York Barnett V, Lewis T (1984) Outliers in statistical data, 3rd edn. Wiley, New York
7.
Zurück zum Zitat Breunig MM, Kriegel HP, Ng RT, Sander J (2000) Lof: identifying density-based local outliers. In: Proceedings of the 2000 ACM SIGMOD international conference on management of data, SIGMOD ’00, Association for Computing Machinery, New York, NY, USA, pp 93–104 Breunig MM, Kriegel HP, Ng RT, Sander J (2000) Lof: identifying density-based local outliers. In: Proceedings of the 2000 ACM SIGMOD international conference on management of data, SIGMOD ’00, Association for Computing Machinery, New York, NY, USA, pp 93–104
8.
Zurück zum Zitat Jiang MF, Tseng SS, Su CM (2001) Two-phase clustering process for outliers detection. Pattern Recogn Lett 22(6):691–700CrossRef Jiang MF, Tseng SS, Su CM (2001) Two-phase clustering process for outliers detection. Pattern Recogn Lett 22(6):691–700CrossRef
9.
Zurück zum Zitat Hu T, Sung SY (2003) Detecting pattern-based outliers. Pattern Recogn Lett 24(16):3059–3068CrossRef Hu T, Sung SY (2003) Detecting pattern-based outliers. Pattern Recogn Lett 24(16):3059–3068CrossRef
10.
Zurück zum Zitat Aryal S, Baniya AA, Santosh KC (2019) Improved histogram-based anomaly detector with the extended principal component features. arXiv preprint arXiv: 1909.12702 Aryal S, Baniya AA, Santosh KC (2019) Improved histogram-based anomaly detector with the extended principal component features. arXiv preprint arXiv:​ 1909.​12702
11.
Zurück zum Zitat Ahmed M (2018) Collective anomaly detection techniques for network traffic analysis. Ann Data Sci 5(4):497–512CrossRef Ahmed M (2018) Collective anomaly detection techniques for network traffic analysis. Ann Data Sci 5(4):497–512CrossRef
12.
Zurück zum Zitat Hodge V, Austin J (2004) A survey of outlier detection methodologies. Artif Intell Rev 22(2):85–126CrossRef Hodge V, Austin J (2004) A survey of outlier detection methodologies. Artif Intell Rev 22(2):85–126CrossRef
13.
Zurück zum Zitat Aggarwal CC (2017) An introduction to outlier analysis. Springer, Cham, pp 1–34 Aggarwal CC (2017) An introduction to outlier analysis. Springer, Cham, pp 1–34
14.
Zurück zum Zitat Olson DL, Shi Y, Shi Y (2007) Introduction to business data mining, vol 10. McGraw-Hill/Irwin, New York Olson DL, Shi Y, Shi Y (2007) Introduction to business data mining, vol 10. McGraw-Hill/Irwin, New York
15.
Zurück zum Zitat Nick C (2009) Precision at n. Springer, Boston, pp 2127–2128 Nick C (2009) Precision at n. Springer, Boston, pp 2127–2128
16.
Zurück zum Zitat Zhang E, Zhang Y (2009) Average precision. Springer, Boston, pp 192–193 Zhang E, Zhang Y (2009) Average precision. Springer, Boston, pp 192–193
17.
Zurück zum Zitat Hand DJ, Till RJ (2001) A simple generalisation of the area under the roc curve for multiple class classification problems. Mach Learn 45(2):171–186CrossRef Hand DJ, Till RJ (2001) A simple generalisation of the area under the roc curve for multiple class classification problems. Mach Learn 45(2):171–186CrossRef
18.
Zurück zum Zitat Goldstein M, Uchida S (2016) A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data. PLoS ONE 11(4):1–31, 04CrossRef Goldstein M, Uchida S (2016) A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data. PLoS ONE 11(4):1–31, 04CrossRef
19.
Zurück zum Zitat Campos GO, Zimek A, Sander J, Campello RJGB, Micenková B, Schubert E, Assent I, Houle ME (2016) On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study. Data Min Knowl Disc 30(4):891–927CrossRef Campos GO, Zimek A, Sander J, Campello RJGB, Micenková B, Schubert E, Assent I, Houle ME (2016) On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study. Data Min Knowl Disc 30(4):891–927CrossRef
20.
Zurück zum Zitat Shewhart WA (1930) Economic quality control of manufactured product1. Bell Syst Tech J 9(2):364–389CrossRef Shewhart WA (1930) Economic quality control of manufactured product1. Bell Syst Tech J 9(2):364–389CrossRef
21.
Zurück zum Zitat Rosner B (1983) Percentage points for a generalized ESD many-outlier procedure. Technometrics 25(2):165–172CrossRef Rosner B (1983) Percentage points for a generalized ESD many-outlier procedure. Technometrics 25(2):165–172CrossRef
22.
Zurück zum Zitat Liu J-P, Weng C-S (1991) Detection of outlying data in bioavailability/bioequivalence studies. Stat Med 10(9):1375–1389CrossRef Liu J-P, Weng C-S (1991) Detection of outlying data in bioavailability/bioequivalence studies. Stat Med 10(9):1375–1389CrossRef
23.
Zurück zum Zitat Surace C, Worden K, Tomlinson G (1997) A novelty detection approach to diagnose damage in a cracked beam. In: Proceedings-SPIE the international society for optical engineering, Citeseer, pp 947–953 Surace C, Worden K, Tomlinson G (1997) A novelty detection approach to diagnose damage in a cracked beam. In: Proceedings-SPIE the international society for optical engineering, Citeseer, pp 947–953
24.
Zurück zum Zitat Surace C, Orden K et al (1998) A novelty detection method to diagnose damage in structures: an application to an offshore platform. In: The eighth international offshore and polar engineering conference, International Society of Offshore and Polar Engineers Surace C, Orden K et al (1998) A novelty detection method to diagnose damage in structures: an application to an offshore platform. In: The eighth international offshore and polar engineering conference, International Society of Offshore and Polar Engineers
25.
Zurück zum Zitat Laurikkala J, Juhola M, Kentala E (2000) Informal identification of outliers in medical data. In: Fifth international workshop on intelligent data analysis in medicine and pharmacology, vol 1, pp 20–24 Laurikkala J, Juhola M, Kentala E (2000) Informal identification of outliers in medical data. In: Fifth international workshop on intelligent data analysis in medicine and pharmacology, vol 1, pp 20–24
26.
Zurück zum Zitat Ye N, Chen Q (2001) An anomaly detection technique based on a chi-square statistic for detecting intrusions into information systems. Qual Reliab Eng Int 17(2):105–112CrossRef Ye N, Chen Q (2001) An anomaly detection technique based on a chi-square statistic for detecting intrusions into information systems. Qual Reliab Eng Int 17(2):105–112CrossRef
27.
Zurück zum Zitat Rousseeuw PJ, Leroy AM (2005) Robust regression and outlier detection, vol 589. Wiley, New York Rousseeuw PJ, Leroy AM (2005) Robust regression and outlier detection, vol 589. Wiley, New York
28.
Zurück zum Zitat Horn PS, Feng L, Li Y, Pesce AJ (2001) Effect of outliers and nonhealthy individuals on reference interval estimation. Clin Chem 47(12):2137–2142CrossRef Horn PS, Feng L, Li Y, Pesce AJ (2001) Effect of outliers and nonhealthy individuals on reference interval estimation. Clin Chem 47(12):2137–2142CrossRef
29.
Zurück zum Zitat Solberg HE, Lahti A (2005) Detection of outliers in reference distributions: performance of Horn’s algorithm. Clin Chem 51(2):2326–2332, 12CrossRef Solberg HE, Lahti A (2005) Detection of outliers in reference distributions: performance of Horn’s algorithm. Clin Chem 51(2):2326–2332, 12CrossRef
30.
Zurück zum Zitat Dovoedo YH, Chakraborti S (2015) Boxplot-based outlier detection for the location-scale family. Commun Stat Simul Comput 44(6):1492–1513CrossRef Dovoedo YH, Chakraborti S (2015) Boxplot-based outlier detection for the location-scale family. Commun Stat Simul Comput 44(6):1492–1513CrossRef
31.
Zurück zum Zitat Gibbons RD (1994) Statistical methods for groundwater monitoring. Wiley, New YorkCrossRef Gibbons RD (1994) Statistical methods for groundwater monitoring. Wiley, New YorkCrossRef
32.
Zurück zum Zitat Javitz HS, Valdes A (1991) The SRI ides statistical anomaly detector. In: Proceedings of 1991 IEEE computer society symposium on research in security and privacy, pp 316–326 Javitz HS, Valdes A (1991) The SRI ides statistical anomaly detector. In: Proceedings of 1991 IEEE computer society symposium on research in security and privacy, pp 316–326
33.
Zurück zum Zitat Gebski M, Wong RK (2007) An efficient histogram method for outlier detection. In: Ramamohanarao KP, Krishna R, Mohania M, Nantajeewarawat E (eds) Advances in databases: concepts, systems and applications. Springer, Berlin, pp 176–187CrossRef Gebski M, Wong RK (2007) An efficient histogram method for outlier detection. In: Ramamohanarao KP, Krishna R, Mohania M, Nantajeewarawat E (eds) Advances in databases: concepts, systems and applications. Springer, Berlin, pp 176–187CrossRef
34.
Zurück zum Zitat Jiang X-B, Li G-Y, Lian S (2011) Outlier detection algorithm based on variable-width histogram for wireless sensor network. J Comput Appl 31(3):694–697 Jiang X-B, Li G-Y, Lian S (2011) Outlier detection algorithm based on variable-width histogram for wireless sensor network. J Comput Appl 31(3):694–697
35.
Zurück zum Zitat Goldstein M, Dengel A (2012) Histogram-based outlier score (hbos): a fast unsupervised anomaly detection algorithm. In: KI-2012: poster and demo track, pp 59–63 Goldstein M, Dengel A (2012) Histogram-based outlier score (hbos): a fast unsupervised anomaly detection algorithm. In: KI-2012: poster and demo track, pp 59–63
36.
Zurück zum Zitat Xie M, Hu J, Tian B (2012) Histogram-based online anomaly detection in hierarchical wireless sensor networks. In: 2012 IEEE 11th international conference on trust, security and privacy in computing and communications, pp 751–759 Xie M, Hu J, Tian B (2012) Histogram-based online anomaly detection in hierarchical wireless sensor networks. In: 2012 IEEE 11th international conference on trust, security and privacy in computing and communications, pp 751–759
37.
Zurück zum Zitat Latecki LJ, Lazarevic A, Pokrajac D (2007) Outlier detection with kernel density functions. In: Perner P (ed) Machine learning and data mining in pattern recognition. Springer, Berlin, pp 61–75CrossRef Latecki LJ, Lazarevic A, Pokrajac D (2007) Outlier detection with kernel density functions. In: Perner P (ed) Machine learning and data mining in pattern recognition. Springer, Berlin, pp 61–75CrossRef
38.
Zurück zum Zitat Oh JH, Gao J (2009) A kernel-based approach for detecting outliers of high-dimensional biological data. In: BMC bioinformatics, vol 10, Springer, p S7 Oh JH, Gao J (2009) A kernel-based approach for detecting outliers of high-dimensional biological data. In: BMC bioinformatics, vol 10, Springer, p S7
39.
Zurück zum Zitat Gao J, Hu W, Zhang Z, Zhang X, Wu O (2011) Rkof: robust kernel-based local outlier detection. In: Huang JZ, Cao L, Srivastava J (eds) Advances in knowledge discovery and data mining. Springer, Berlin, pp 270–283CrossRef Gao J, Hu W, Zhang Z, Zhang X, Wu O (2011) Rkof: robust kernel-based local outlier detection. In: Huang JZ, Cao L, Srivastava J (eds) Advances in knowledge discovery and data mining. Springer, Berlin, pp 270–283CrossRef
40.
Zurück zum Zitat Askari A, Yang F, Ghaoui LE (2018) Kernel-based outlier detection using the inverse christoffel function Askari A, Yang F, Ghaoui LE (2018) Kernel-based outlier detection using the inverse christoffel function
41.
Zurück zum Zitat Liu F, Yanwei Yu, Song P, Fan Y, Tong X (2020) Scalable KDE-based top-n local outlier detection over large-scale data streams. Knowl Based Syst 204:106186CrossRef Liu F, Yanwei Yu, Song P, Fan Y, Tong X (2020) Scalable KDE-based top-n local outlier detection over large-scale data streams. Knowl Based Syst 204:106186CrossRef
42.
Zurück zum Zitat Siegel AF, Morgan CJ (1988) Statistics and data analysis: an introduction, 2nd edn. Wiley, New York Siegel AF, Morgan CJ (1988) Statistics and data analysis: an introduction, 2nd edn. Wiley, New York
43.
Zurück zum Zitat Zhang Y, Hamm NAS, Meratnia N, Stein A, van de Voort M, Havinga PJM (2012) Statistics-based outlier detection for wireless sensor networks. Int J Geogr Inf Sci 26(8):1373–1392CrossRef Zhang Y, Hamm NAS, Meratnia N, Stein A, van de Voort M, Havinga PJM (2012) Statistics-based outlier detection for wireless sensor networks. Int J Geogr Inf Sci 26(8):1373–1392CrossRef
44.
Zurück zum Zitat Zimek A, Filzmoser P (2018) There and back again: outlier detection between statistical reasoning and data mining algorithms. WIREs Data Min Knowl Discov 8(6):e1280 Zimek A, Filzmoser P (2018) There and back again: outlier detection between statistical reasoning and data mining algorithms. WIREs Data Min Knowl Discov 8(6):e1280
45.
Zurück zum Zitat Tang J, Chen Z, Fu AWC, Cheung DW (2002) Enhancing effectiveness of outlier detections for low density patterns. In: Chen MS, Yu PS, Liu B (eds) Advances in knowledge discovery and data mining. Springer, Berlin, pp 535–548CrossRef Tang J, Chen Z, Fu AWC, Cheung DW (2002) Enhancing effectiveness of outlier detections for low density patterns. In: Chen MS, Yu PS, Liu B (eds) Advances in knowledge discovery and data mining. Springer, Berlin, pp 535–548CrossRef
46.
Zurück zum Zitat Kriegel H-P, Kröger P, Schubert E, Zimek A (2009) Loop: local outlier probabilities. In: Proceedings of the 18th ACM conference on information and knowledge management, CIKM ’09, Association for Computing Machinery, New York, NY, USA, pp 1649–1652 Kriegel H-P, Kröger P, Schubert E, Zimek A (2009) Loop: local outlier probabilities. In: Proceedings of the 18th ACM conference on information and knowledge management, CIKM ’09, Association for Computing Machinery, New York, NY, USA, pp 1649–1652
47.
Zurück zum Zitat Papadimitriou S, Kitagawa H, Gibbons PB, Faloutsos C (2003) Loci: fast outlier detection using the local correlation integral. In: Proceedings 19th international conference on data engineering (Cat. No. 03CH37405), pp 315–326 Papadimitriou S, Kitagawa H, Gibbons PB, Faloutsos C (2003) Loci: fast outlier detection using the local correlation integral. In: Proceedings 19th international conference on data engineering (Cat. No. 03CH37405), pp 315–326
48.
Zurück zum Zitat Ren D, Wang B, Perrizo W (2004) Rdf: a density-based outlier detection method using vertical data representation. In: extitFourth IEEE international conference on data mining (ICDM’04), pp 503–506 Ren D, Wang B, Perrizo W (2004) Rdf: a density-based outlier detection method using vertical data representation. In: extitFourth IEEE international conference on data mining (ICDM’04), pp 503–506
49.
Zurück zum Zitat Jin W, Tung Anthony KH, Han J, Wang W (2006) Ranking outliers using symmetric neighborhood relationship. In: Proceedings of the 10th Pacific-Asia conference on advances in knowledge discovery and data mining, PAKDD’06, Springer, Berlin, pp 577–593 Jin W, Tung Anthony KH, Han J, Wang W (2006) Ranking outliers using symmetric neighborhood relationship. In: Proceedings of the 10th Pacific-Asia conference on advances in knowledge discovery and data mining, PAKDD’06, Springer, Berlin, pp 577–593
50.
Zurück zum Zitat Fan H, Zaïane OR, Foss A, Wu J (2009) Resolution-based outlier factor: detecting the top-n most outlying data points in engineering data. Knowl Inf Syst 19(1):31–51CrossRef Fan H, Zaïane OR, Foss A, Wu J (2009) Resolution-based outlier factor: detecting the top-n most outlying data points in engineering data. Knowl Inf Syst 19(1):31–51CrossRef
51.
Zurück zum Zitat Goldstein M (2012) Fastlof: an expectation-maximization based local outlier detection algorithm. In: Proceedings of the 21st international conference on pattern recognition (ICPR2012), pp 2282–2285 Goldstein M (2012) Fastlof: an expectation-maximization based local outlier detection algorithm. In: Proceedings of the 21st international conference on pattern recognition (ICPR2012), pp 2282–2285
52.
Zurück zum Zitat Momtaz R, Nesma M, Gowayyed MA (2013) Dwof: a robust density-based outlier detection approach. In: Sanches JM, Micó L, Cardoso JS (eds) Pattern recognition and image analysis. Springer, Berlin, pp 517–525CrossRef Momtaz R, Nesma M, Gowayyed MA (2013) Dwof: a robust density-based outlier detection approach. In: Sanches JM, Micó L, Cardoso JS (eds) Pattern recognition and image analysis. Springer, Berlin, pp 517–525CrossRef
53.
Zurück zum Zitat Schubert E, Zimek A, Kriegel H-P (2014) Local outlier detection reconsidered: a generalized view on locality with applications to spatial, video, and network outlier detection. Data Min Knowl Disc 28(1):190–237CrossRef Schubert E, Zimek A, Kriegel H-P (2014) Local outlier detection reconsidered: a generalized view on locality with applications to spatial, video, and network outlier detection. Data Min Knowl Disc 28(1):190–237CrossRef
54.
Zurück zum Zitat Wells JR, Ting KM, Washio T (2014) Linearn: a new approach to nearest neighbour density estimator. Pattern Recogn 47(8):2702–2720CrossRef Wells JR, Ting KM, Washio T (2014) Linearn: a new approach to nearest neighbour density estimator. Pattern Recogn 47(8):2702–2720CrossRef
55.
Zurück zum Zitat Campello Ricardo JGB, Moulavi D, Zimek A, Sander J (2015) Hierarchical density estimates for data clustering, visualization, and outlier detection. ACM Trans Knowl Discov Data 10(1):1–51CrossRef Campello Ricardo JGB, Moulavi D, Zimek A, Sander J (2015) Hierarchical density estimates for data clustering, visualization, and outlier detection. ACM Trans Knowl Discov Data 10(1):1–51CrossRef
56.
Zurück zum Zitat Aryal S, Ting KM, Haffari G (2016) Revisiting attribute independence assumption in probabilistic unsupervised anomaly detection. In: Michael C, Alan Wang G, Hsinchun C (eds) Intelligence and security informatics. Springer, Cham, pp 73–86CrossRef Aryal S, Ting KM, Haffari G (2016) Revisiting attribute independence assumption in probabilistic unsupervised anomaly detection. In: Michael C, Alan Wang G, Hsinchun C (eds) Intelligence and security informatics. Springer, Cham, pp 73–86CrossRef
57.
Zurück zum Zitat Abdi H, Williams LJ (2010) Principal component analysis. WIREs Comput Stat 2(4):433–459CrossRef Abdi H, Williams LJ (2010) Principal component analysis. WIREs Comput Stat 2(4):433–459CrossRef
58.
Zurück zum Zitat Aggarwal CC (2017) Proximity-based outlier detection. Springer, Cham, pp 111–147 Aggarwal CC (2017) Proximity-based outlier detection. Springer, Cham, pp 111–147
59.
Zurück zum Zitat Domingues R, Filippone M, Michiardi P, Zouaoui J (2018) A comparative evaluation of outlier detection algorithms: experiments and analyses. Pattern Recogn 74:406–421CrossRef Domingues R, Filippone M, Michiardi P, Zouaoui J (2018) A comparative evaluation of outlier detection algorithms: experiments and analyses. Pattern Recogn 74:406–421CrossRef
60.
Zurück zum Zitat Knorr EM, Ng RT (1998) Algorithms for mining distance-based outliers in large datasets. In: Proceedings of the 24rd international conference on very large data bases, VLDB ’98, Kaufmann Publishers Inc, San Francisco, CA, USA, Morgan, pp 392–403 Knorr EM, Ng RT (1998) Algorithms for mining distance-based outliers in large datasets. In: Proceedings of the 24rd international conference on very large data bases, VLDB ’98, Kaufmann Publishers Inc, San Francisco, CA, USA, Morgan, pp 392–403
61.
Zurück zum Zitat Knorr EM, Ng RT, Tucakov V (2000) Distance-based outliers: algorithms and applications. VLDB J 8(3):237–253CrossRef Knorr EM, Ng RT, Tucakov V (2000) Distance-based outliers: algorithms and applications. VLDB J 8(3):237–253CrossRef
62.
Zurück zum Zitat Ramaswamy S, Rastogi R, Shim K (2000) Efficient algorithms for mining outliers from large data sets. SIGMOD Rec 29(2):427–438CrossRef Ramaswamy S, Rastogi R, Shim K (2000) Efficient algorithms for mining outliers from large data sets. SIGMOD Rec 29(2):427–438CrossRef
63.
Zurück zum Zitat Ghoting A, Parthasarathy S, Otey ME (2008) Fast mining of distance-based outliers in high-dimensional datasets. Data Min Knowl Disc 16(3):349–364CrossRef Ghoting A, Parthasarathy S, Otey ME (2008) Fast mining of distance-based outliers in high-dimensional datasets. Data Min Knowl Disc 16(3):349–364CrossRef
64.
Zurück zum Zitat Kriegel HP, Schubert M, Zimek A (2008) Angle-based outlier detection in high-dimensional data. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’08, Association for Computing Machinery, New York, pp 444–452 Kriegel HP, Schubert M, Zimek A (2008) Angle-based outlier detection in high-dimensional data. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’08, Association for Computing Machinery, New York, pp 444–452
65.
Zurück zum Zitat Wang B, Xiao G, Yu H, Yang X (2009) Distance-based outlier detection on uncertain data. In: 2009 Ninth IEEE international conference on computer and information technology, vol 1, pp 293–298 Wang B, Xiao G, Yu H, Yang X (2009) Distance-based outlier detection on uncertain data. In: 2009 Ninth IEEE international conference on computer and information technology, vol 1, pp 293–298
66.
Zurück zum Zitat Zhang K, Hutter M, Jin H (2009) A new local distance-based outlier detection approach for scattered real-world data. In: Theeramunkong T, Kijsirikul B, Cercone N, Ho T-B (eds) Advances in knowledge discovery and data mining. Springer, Berlin, pp 813–822CrossRef Zhang K, Hutter M, Jin H (2009) A new local distance-based outlier detection approach for scattered real-world data. In: Theeramunkong T, Kijsirikul B, Cercone N, Ho T-B (eds) Advances in knowledge discovery and data mining. Springer, Berlin, pp 813–822CrossRef
67.
Zurück zum Zitat Sugiyama M, Borgwardt K (2013) Rapid distance-based outlier detection via sampling. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds) Advances in neural information processing systems, vol 26, Curran Associates Inc, pp 467–475 Sugiyama M, Borgwardt K (2013) Rapid distance-based outlier detection via sampling. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds) Advances in neural information processing systems, vol 26, Curran Associates Inc, pp 467–475
68.
Zurück zum Zitat Radovanović M, Nanopoulos A, Ivanović M (2015) Reverse nearest neighbors in unsupervised distance-based outlier detection. IEEE Trans Knowl Data Eng 27(5):1369–1382CrossRef Radovanović M, Nanopoulos A, Ivanović M (2015) Reverse nearest neighbors in unsupervised distance-based outlier detection. IEEE Trans Knowl Data Eng 27(5):1369–1382CrossRef
69.
Zurück zum Zitat Wang H, Bah MJ, Hammad M (2019) Progress in outlier detection techniques: a survey. IEEE Access 7:107964–108000CrossRef Wang H, Bah MJ, Hammad M (2019) Progress in outlier detection techniques: a survey. IEEE Access 7:107964–108000CrossRef
70.
Zurück zum Zitat Berchtold S, Keim DA, Kriegel H-P (1996) The x-tree: an index structure for high-dimensional data. In: Proceedings of the 22th international conference on very large data bases, VLDB ’96, Morgan Kaufmann Publishers Inc, San Francisco, CA, USA, pp 28–39 Berchtold S, Keim DA, Kriegel H-P (1996) The x-tree: an index structure for high-dimensional data. In: Proceedings of the 22th international conference on very large data bases, VLDB ’96, Morgan Kaufmann Publishers Inc, San Francisco, CA, USA, pp 28–39
71.
Zurück zum Zitat Guttman A (1984) R-trees: a dynamic index structure for spatial searching. SIGMOD Rec 14(2):47–57CrossRef Guttman A (1984) R-trees: a dynamic index structure for spatial searching. SIGMOD Rec 14(2):47–57CrossRef
72.
Zurück zum Zitat Sellis TK, Roussopoulos N, Faloutsos C (1987) The r+-tree: a dynamic index for multi-dimensional objects. In: Proceedings of the 13th international conference on very large data bases, VLDB ’87, Morgan Kaufmann Publishers Inc, San Francisco, CA, USA, pp 507–518 Sellis TK, Roussopoulos N, Faloutsos C (1987) The r+-tree: a dynamic index for multi-dimensional objects. In: Proceedings of the 13th international conference on very large data bases, VLDB ’87, Morgan Kaufmann Publishers Inc, San Francisco, CA, USA, pp 507–518
73.
Zurück zum Zitat Bentley JL (1975) Multidimensional binary search trees used for associative searching. Commun ACM 18(9):509–517CrossRef Bentley JL (1975) Multidimensional binary search trees used for associative searching. Commun ACM 18(9):509–517CrossRef
74.
Zurück zum Zitat Dantong Yu, Sheikholeslami G, Zhang A (2002) Findout: finding outliers in very large datasets. Knowl Inf Syst 4(4):387–412CrossRef Dantong Yu, Sheikholeslami G, Zhang A (2002) Findout: finding outliers in very large datasets. Knowl Inf Syst 4(4):387–412CrossRef
75.
Zurück zum Zitat He Z, Xiaofei X, Deng S (2003) Discovering cluster-based local outliers. Pattern Recogn Lett 24(9–10):1641–1650CrossRef He Z, Xiaofei X, Deng S (2003) Discovering cluster-based local outliers. Pattern Recogn Lett 24(9–10):1641–1650CrossRef
76.
Zurück zum Zitat Jiang S, An Q (2008) Clustering-based outlier detection method. In: 2008 Fifth international conference on fuzzy systems and knowledge discovery, vol 2, pp 429–433 Jiang S, An Q (2008) Clustering-based outlier detection method. In: 2008 Fifth international conference on fuzzy systems and knowledge discovery, vol 2, pp 429–433
77.
Zurück zum Zitat Liu FT, Ting KM, Zhou Z (2008) Isolation forest. In: 2008 Eighth IEEE international conference on data mining, pp 413–422 Liu FT, Ting KM, Zhou Z (2008) Isolation forest. In: 2008 Eighth IEEE international conference on data mining, pp 413–422
78.
Liu FT, Ting KM, Zhou Z-H (2012) Isolation-based anomaly detection. ACM Trans Knowl Discov Data 6(1):1–39
79.
Liu FT, Ting KM, Zhou ZH (2010) On detecting clustered anomalies using sciforest. In: Balcázar JL, Bonchi F, Gionis A, Sebag M (eds) Machine learning and knowledge discovery in databases. Springer, Berlin, pp 274–290
80.
Tan SC, Ting KM, Liu TF (2011) Fast anomaly detection for streaming data. In: Proceedings of the twenty-second international joint conference on artificial intelligence, vol 2, IJCAI’11, AAAI Press, pp 1511–1516
81.
Aryal S, Ting KM, Wells JR, Washio T (2014) Improving iforest with relative mass. In: Tseng VS, Ho TB, Zhou ZH, Chen ALP, Kao HY (eds) Advances in knowledge discovery and data mining. Springer, Cham, pp 510–521
82.
Bandaragoda TR, Ting KM, Albrecht D, Liu FT, Wells JR (2014) Efficient anomaly detection by isolation using nearest neighbour ensemble. In: 2014 IEEE International conference on data mining workshop, pp 698–705
83.
Bandaragoda TR, Ting KM, Albrecht D, Liu FT, Zhu Y, Wells JR (2018) Isolation-based anomaly detection using nearest-neighbor ensembles. Comput Intell 34(4):968–998
84.
Pang G, Ting KM, Albrecht D (2015) Lesinn: detecting anomalies by identifying least similar nearest neighbours. In: 2015 IEEE international conference on data mining workshop (ICDMW), pp 623–630
85.
Zhang X, Dou W, He Q, Zhou R, Leckie C, Kotagiri R, Salcic Z (2017) Lshiforest: a generic framework for fast tree isolation based ensemble anomaly analysis. In: 2017 IEEE 33rd international conference on data engineering (ICDE), pp 983–994
86.
Aryal S (2018) Anomaly detection technique robust to units and scales of measurement. In: Phung D, Tseng VS, Webb GI, Ho B, Ganji M, Rashidi L (eds) Advances in knowledge discovery and data mining. Springer, Cham, pp 589–601
87.
Aryal S, Santosh KC, Dazeley R (2020) usfad: a robust anomaly detector based on unsupervised stochastic forest. Int J Mach Learn Cybern 12:1–14
88.
Ting KM, Zhou G-T, Liu FT, Tan JSC (2010) Mass estimation and its applications. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’10, Association for Computing Machinery, New York, NY, USA, pp 989–998
89.
Fernando TL, Webb GI (2017) Simusf: an efficient and effective similarity measure that is invariant to violations of the interval scale assumption. Data Min Knowl Disc 31(1):264–286
90.
Ting KM, Washio T, Wells JR, Aryal S (2017) Defying the gravity of learning curve: a characteristic of nearest neighbour anomaly detectors. Mach Learn 106(1):55–91
91.
Bandaragoda TR (2015) Isolation based anomaly detection: a re-examination. PhD thesis, Monash University
92.
Pevný T (2016) Loda: lightweight on-line detector of anomalies. Mach Learn 102(2):275–304
93.
Zhao Y, Hryniewicki MK (2018) DCSO: dynamic combination of detector scores for outlier ensembles. In: ACM SIGKDD ODD workshop, London, UK
94.
Zhao Y, Nasrullah Z, Hryniewicki MK, Li Z (2019) LSCP: locally selective combination in parallel outlier ensembles. In: Proceedings of the 2019 SIAM international conference on data mining, SDM 2019, Calgary, Canada, pp 585–593
95.
Aggarwal CC (2013) Outlier ensembles: position paper. SIGKDD Explor Newsl 14(2):49–58
96.
97.
Zimek A, Campello RJGB, Sander J (2014) Ensembles for unsupervised outlier detection: challenges and research questions a position paper. SIGKDD Explor Newsl 15(2):11–22
98.
Kriegel H-P, Kröger P, Schubert E, Zimek A (2009) Outlier detection in axis-parallel subspaces of high dimensional data. In: Theeramunkong T, Kijsirikul B, Cercone N, Ho TB (eds) Advances in knowledge discovery and data mining. Springer, Berlin, pp 831–838
99.
Agrawal A (2009) Local subspace based outlier detection. In: Ranka S, Aluru S, Buyya R, Chung Y-C, Dua S, Grama A, Gupta SKS, Kumar R, Phoha VV (eds) Contemporary computing. Springer, Heidelberg, pp 149–157
100.
Nguyen HV, Gopalkrishnan V, Assent I (2011) An unbiased distance-based outlier detection approach for high-dimensional data. In: Jeffrey XY, Myoung HK, Rainer U (eds) Database systems for advanced applications. Springer, Berlin, pp 138–152
101.
Kriegel H, Kröger P, Schubert E, Zimek A (2012) Outlier detection in arbitrarily oriented subspaces. In: 2012 IEEE 12th international conference on data mining, pp 379–388
102.
Keller F, Müller E, Böhm K (2012) Hics: high contrast subspaces for density-based outlier ranking. In: 2012 IEEE 28th international conference on data engineering, pp 1037–1048
103.
Nguyen HV, Müller E, Vreeken J, Keller F, Böhm K (2013) Cmi: an information-theoretic contrast measure for enhancing subspace cluster and outlier detection. In: Proceedings of the 2013 SIAM international conference on data mining, SIAM, pp 198–206
104.
Pang G, Ting KM, Albrecht D, Jin H (2016) Zero++: harnessing the power of zero appearances to detect anomalies in large-scale data sets. J Artif Intell Res 57:593–620
105.
Aggarwal CC (2017) High-dimensional outlier detection: the subspace method, Springer International Publishing, Cham, pp 149–184
106.
Zimek A, Schubert E, Kriegel H-P (2012) A survey on unsupervised outlier detection in high-dimensional numerical data. Stat Anal Data Min ASA Data Sci J 5(5):363–387
Metadata
Title
A Comprehensive Survey of Anomaly Detection Algorithms
Authors
Durgesh Samariya
Amit Thakkar
Publication date
26.11.2021
Publisher
Springer Berlin Heidelberg
Published in
Annals of Data Science / Issue 3/2023
Print ISSN: 2198-5804
Electronic ISSN: 2198-5812
DOI
https://doi.org/10.1007/s40745-021-00362-9
