Published in: The VLDB Journal 4/2023

14-01-2023 | Regular Paper

A meta-level analysis of online anomaly detectors

Authors: Antonios Ntroumpogiannis, Michail Giannoulis, Nikolaos Myrtakis, Vassilis Christophides, Eric Simon, Ioannis Tsamardinos

Abstract

Real-time detection of anomalies in streaming data is receiving increasing attention, as it allows us to raise alerts, predict faults, and detect intrusions or threats across industries. Yet little attention has been given to comparing the effectiveness and efficiency of anomaly detectors for streaming data (i.e., of online algorithms). In this paper, we present a qualitative, synthetic overview of major online detectors from different algorithmic families (i.e., distance-, density-, tree- or projection-based) and highlight their main ideas for constructing, updating and testing detection models. Then, we provide a thorough analysis of the results of a quantitative experimental evaluation of online detection algorithms along with their offline counterparts. The behavior of the detectors is correlated with the characteristics of different datasets (i.e., meta-features), thereby providing a meta-level analysis of their performance. Our study addresses several insights missing from the literature, such as (a) how reliable detectors are against a random classifier and which dataset characteristics make them perform randomly; (b) to what extent online detectors approximate the performance of their offline counterparts; (c) which sketch strategy and update primitives of detectors are best for detecting anomalies visible only within a feature subspace of a dataset; (d) what the trade-offs are between the effectiveness and the efficiency of detectors belonging to different algorithmic families; and (e) which specific characteristics of datasets lead an online algorithm to outperform all others.

Footnotes
1
In this paper, we use the terms outlier, novelty and anomaly detection interchangeably.
 
2
We call “sample” an element (i.e., observation, measurement) of a data stream.
 
3
A hyper-parameter cannot be estimated from the data.
 
4
The recently proposed distance-based detector NETS [73] consumes fewer resources than MCOD but does not report any improvement in terms of effectiveness.
 
5
Nearest neighbors are distinguished according to maximum and average distance.
 
6
\(2^{h+1} - 1\) is the number of nodes in a perfect binary tree.
 
8
\(2 n - 1\) is the number of nodes of a full binary tree with n leaves.
 
9
We consider the coordinates of each point as (F2, F1).
 
10
In our implementation, we consider the simple forgetting mechanism rather than the time-decaying mechanism.
 
11
The authors report that the value \(\gamma = 0.01\) is optimal for most datasets.
 
21
After removing null values and categorical features not treated by our anomaly detectors.
 
23
There are also some cases of unknown anomalies that may appear in any of those categories.
 
24
The value of k is computed automatically during training.
 
25
When p is stored in a leaf of size larger than one, the value of !t(p) is adjusted to c(size).
 
26
In most experimental studies [15, 24, 25], detectors were executed using the default values of their hyper-parameters as “recommended by their authors.”
 
27
RS does not require computing the gradient of the problem to be optimized and hence can be used on functions that are not continuous or differentiable [31].
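
The remark in footnote 27 can be illustrated with a minimal sketch, assuming RS refers to random search. The function names, the search space, and the step-shaped objective below are hypothetical and chosen only for illustration; they are not taken from the paper. The sketch shows that the procedure only evaluates the objective at sampled points, so no gradient is ever needed.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Gradient-free random search over a box-shaped search space.

    `space` maps a (hypothetical) hyper-parameter name to a (low, high)
    range. Each trial samples every parameter uniformly at random and
    keeps the configuration with the highest objective value.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)  # only function evaluations are required
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def step_objective(params):
    """A discontinuous, non-differentiable toy objective (illustrative only)."""
    return 1.0 if 0.2 <= params["x"] <= 0.4 else 0.0


if __name__ == "__main__":
    params, score = random_search(step_objective, {"x": (0.0, 1.0)}, n_trials=200)
    print(params, score)
```

A gradient-based optimizer would receive no useful signal from `step_objective`, whose derivative is zero wherever it is defined, whereas random sampling can still locate the high-scoring interval.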
 
Literature
1.
Aggarwal, C.: An Introduction to Outlier Analysis, pp. 1–40 (2013)
2.
Aggarwal, C., Hinneburg, A., Keim, A.: On the surprising behavior of distance metrics in high dimensional spaces. In: ICDT (2001)
3.
Aggarwal, C., Sathe, S.: Theoretical foundations and algorithms for outlier ensembles. SIGKDD Explor. 17(1), 74 (2015)
4.
Aggarwal, C., Sathe, S.: Outlier Ensembles-An Introduction. Springer, Berlin (2017)
5.
Akidau, T., Bradshaw, R., Chambers, C., Chernyak, S., Fernández-Moctezuma, R., Lax, R., McVeety, S., Mills, D., Perry, F., Schmidt, E., Whittle, S.: The dataflow model: a practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing. PVLDB 8(12), 68 (2015)
6.
Alcobaça, E., Siqueira, F., Rivolli, A., Garcia, L., Oliva, J., de Carvalho, A.: Mfe: towards reproducible meta-feature extraction. JMLR 21(111), 1–5 (2020)
7.
Bailis, P., Gan, E., Madden, S., Narayanan, D., Rong, K., Suri, S.: Macrobase: prioritizing attention in fast data. In: SIGMOD (2017)
8.
9.
Bergmeir, C., Benítez, M.: On the use of cross-validation for time series predictor evaluation. Inf. Sci. 7, 191 (2012)
10.
Birge, L., Rozenholc, Y.: How many bins should be put in a regular histogram. In: ESAIM: Probability and Statistics, pp. 24–45 (2006)
11.
Blázquez-García, A., Conde, A., Mori, U., Lozano, J.: A review on outlier/anomaly detection in time series data. ACM Comput. Surv. 54(3), 98 (2021)
12.
Braei, M., Wagner, S.: Anomaly detection in univariate time-series: a survey on the state-of-the-art. CoRR, arxiv:2004.00433 (2020)
13.
Branco, P., Torgo, L., Ribeiro, R.: A survey of predictive modelling under imbalanced distributions. CoRR, 1505.01658 (2015)
14.
Breunig, M., Kriegel, H., Ng, R., Sander, J.: Lof: identifying density-based local outliers. SIGMOD Rec. 29(2), 799 (2000)
15.
Campos, G., Zimek, A., Sander, J., Campello, R., Micenková, B., Schubert, E., Assent, I., Houle, M.: On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study. Data Min. Knowl. Discov. 30(4), 891–927 (2016)
16.
Cao, L., Yang, D., Wang, Q., Yu, Y., Wang, J., Rundensteiner, E.: Scalable distance-based outlier detection over high-volume data streams (2014)
17.
Carbone, P., Fragkoulis, M., Kalavri, V., Katsifodimos, A.: Beyond analytics: the evolution of stream processing systems. In: SIGMOD (2020)
18.
Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: a survey. ACM Comput. Surv. 41(3), 96 (2009)
19.
Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection for discrete sequences: a survey. IEEE TKDE 24(5), 119 (2012)
20.
Choudhary, D., Arun Kejariwal, A., Orsini, F.: On the runtime-efficacy trade-off of anomaly detection techniques for real-time streaming data. CoRR, arxiv:1710.04735 (2017)
21.
Cook, A., Mısırlı, G., Fan, Z.: Anomaly detection for IoT time-series data: a survey. IEEE IoT J. 7(7), 88 (2020)
22.
Davis, J., Goadrich, M.: The relationship between precision-recall and ROC curves. In: (ICML’06), pp. 233–240 (2006)
23.
Demšar, J.: Statistical comparisons of classifiers over multiple data sets. In: JMLR, 7, December (2006)
24.
Domingues, R., Filippone, M., Michiardi, P., Zouaoui, J.: A comparative evaluation of outlier detection algorithms. Pattern Recogn. 74(C), 406–421 (2018)
25.
Domingues, R., Filippone, M., Michiardi, P., Zouaoui, J.: A comparative evaluation of outlier detection algorithms: experiments and analyses. Pattern Recognit. 74, 478 (2018)
26.
Dua, D., Graff, C.: Uci Machine Learning Repository. University of California, School of Information and Computer Sciences, Irvine (2017)
27.
Dudani, S.: The distance-weighted k-nearest-neighbor rule. IEEE Trans. SMC 6(4), 325–327 (1976)
28.
Emmott, A., Das, S., Dietterich, T., Fern, A., Wong, W.: A meta-analysis of the anomaly detection problem. CoRR, arxiv:1503.01158 (2015)
29.
Goix, N., Drougard, N., Brault, R., Chiapino, M.: One class splitting criteria for random forests. In: ACML (2017)
30.
Goldstein, M., Uchida, S.: A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data. PLoS One 11(4), 887 (2016)
31.
Granichin, O.N., Volkovich, Z., Toledano-Kitai, D.: Randomized Algorithms in Automatic Control and Data Mining, vol. 67. Springer, Berlin (2015)
32.
Guha, S., Mishra, N., Roy, G., Schrijvers, O.: Robust random cut forest based anomaly detection on streams. In: ICML’16, pp. 2712–2721 (2016)
33.
Gupta, M., Gao, J., Aggarwal, C., Han, J.: Outlier detection for temporal data: a survey. IEEE TKDE 26(9), 83 (2014)
34.
Hastie, T., Tibshirani, R., Friedman, J.H.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edn. Springer, Berlin (2009)
35.
Herbold, S.: Autorank: a python package for automated ranking of classifiers. J. Open Source Softw. 3, 2173 (2020)
36.
Hodge, V., Austin, J.: A survey of outlier detection methodologies. Artif. Intell. Rev. 22(2), 69 (2004)
37.
Jacob, V., Song, F., Stiegler, A., Rad, B., Diao, Y., Tatbul, N.: Exathlon: a benchmark for explainable anomaly detection over time series. PVLDB 14(11), 58 (2021)
38.
Keller, F., Müller, E., Böhm, K.: Hics: high contrast subspaces for density-based outlier ranking. In: ICDE, pp. 1037–1048 (2012)
39.
Kohavi, R.: A study of cross-validation and bootstrap for accuracy estimation and model selection. In: IJCAI (1995)
40.
Kontaki, M., Gounaris, A., Papadopoulos, A., Tsichlas, K., Manolopoulos, Y.: Continuous monitoring of distance-based outliers over data streams. In: ICDE (2011)
41.
Li, J., Maier, D., Tufte, K., Papadimos, V., Tucker, P.: Semantics and evaluation techniques for window aggregates in data streams. In: SIGMOD (2005)
42.
Lindner, G., Studer, R.: Ast: Support for algorithm selection with a CBR approach. In: Principles of Data Mining and Knowledge Discovery, pp. 418–423 (1999)
43.
Liu, T., Ting, K. Ming, Zhou, Z.: Isolation forest. In: ICDM, pp. 413–422 (2008)
44.
Lobo, J., Jiménez-Valverde, A., Real, R.: AUC: a misleading measure of the performance of predictive distribution models. Global Ecol. Biogeogr. 17(2), 9008 (2008)
45.
Manzoor, E., Lamba, H., Akoglu, L.: Xstream: Outlier detection in feature-evolving data streams. In: KDD (2018)
46.
Na, Gyoung S., Kim, Donghyun, Yu., Hwanjo: Dilof: Effective and memory efficient local outlier detection in data streams. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1993–2002 (2018)
47.
Orair, G., Teixeira, C., Meira, W., Wang, Y., Parthasarathy, S.: Distance-based outlier detection: consolidation and renewed bearing. PVLDB 3(2), 788 (2010)
48.
Pang, G., Shen, C., Cao, L., Hengel, A.: Deep learning for anomaly detection: a review. ACM Comput. Surv. 54(2), 89 (2021)
50.
Qin, X., Cao, L., Rundensteiner, E.A., Madden, S.: Scalable kernel density estimation-based local outlier detection over large data streams. In: EDBT (2019)
51.
Ramaswamy, S., Rastogi, R., Shim, K.: Efficient algorithms for mining outliers from large data sets, pp. 427–438 (2000)
52.
Rastrigin, L.A.: The convergence of the random search method in the extremal control of a many parameter system. Autom. Remote Control 4, 1337–1342 (1963)
53.
Rogers, J., Gunn, S.: Identifying feature relevance using a random forest. In: SLSFS, pp. 173–184, Bohinj, Slovenia (2005)
54.
55.
Sadik, S., Gruenwald, L.: Research issues in outlier detection for data streams. SIGKDD Explor. Newsl. 15(1), 78 (2014)
56.
Saito, T., Rehmsmeier, M.: The precision-recall plot is more informative than ROC when evaluating binary classifiers on imbalanced datasets. PLoS One 10(3), 708 (2015)
57.
Sathe, S., Aggarwal, C.: Subspace histograms for outlier detection in linear time. KAIS 56(3), 68 (2018)
58.
Silva, J., Faria, E., Barros, R., Hruschka, E., de Carvalho, A., Gama, J.: Data stream clustering: a survey. ACM Comput. Surv. 46(1), 9114 (2013)
59.
Somol, P., Grim, J., Filip, J., Pudil, P.: On stopping rules in dependency-aware feature ranking. In: CIARP (2013)
60.
Tan, C., Ting, M., Liu, T.: Fast anomaly detection for streaming data. In: IJCAI (2011)
61.
Tatbul, N., Lee, T.J., Zdonik, S., Alam, M., Gottschlich, J.: Precision and recall for time series. In: NIPS (2018)
62.
Ting, K.M., Washio, T., Wells, J.R., Aryal, S.: Defying the gravity of learning curve: a characteristic of nearest neighbour anomaly detectors. Mach. Learn. 5, 55–91 (2017)
63.
Tran, L., Fan, L., Shahabi, C.: Distance-based outlier detection in data streams. PVLDB 9(12), 96 (2016)
64.
Tran, L., Mun, M., Shahabi, C.: Real-time distance-based outlier detection in data streams. PVLDB 14(2), 7006 (2020)
65.
van Stein, B., van Leeuwen, M., Bäck, T.: Local subspace-based outlier detection using global neighbourhoods. CoRR, arxiv:1611.00183 (2016)
66.
Vanschoren, J.: Meta-Learning, pp. 35–61 (2019)
67.
Vanschoren, J., van Rijn, J., Bischl, B., Torgo, L.: OpenML: networked science in machine learning. SIGKDD Explor. 15(2), 96 (2013)
68.
Wang, H., Bah, J., Hammad, M.: Progress in outlier detection techniques: a survey. IEEE Access 7, 998 (2019)
69.
Wu, R., Keogh, E.: Current time series anomaly detection benchmarks are flawed and are creating the illusion of progress. In: IEEE TKDE (2021)
70.
Xia, S., Xiong, Z., Luo, Y., WeiXu, Z.G.: Effectiveness of the Euclidean distance in high dimensional spaces. Optik 4, 5614–5619 (2015)
71.
Yang, J., Rahardja, S., Fränti, P.: Outlier detection: how to threshold outlier scores? In: AIIPCC (2019)
72.
Yoon, S., Lee, J., Lee, B.: Ultrafast local outlier detection from a data stream with stationary region skipping. In: KDD (2020)
73.
Yoon, S., Lee, J., Lee, B.S.: Nets: extremely fast outlier detection from a data stream via set-based processing. PVLDB 12(11), 998 (2019)
74.
Zhang, E., Zhang, Y.I.: Average precision. In: Encyclopedia of Database Systems (2009)
75.
Zhao, Y., Rossi, A., Akoglu, L.: Automating outlier detection via meta-learning. CoRR, arxiv:2009.10606 (2020)
76.
Zimek, A., Filzmoser, P.: There and back again: outlier detection between statistical reasoning and data mining algorithms. Int. Rev. Data Min. Knowl. Discov. 8(6), 66 (2018)
77.
Zimek, A., Gaudet, M., Campello, R., Sander, J.: Subsampling for efficient and effective unsupervised outlier detection ensembles. In: KDD (2013)
78.
Zimek, A., Schubert, E., Kriegel, H.: A survey on unsupervised outlier detection in high-dimensional numerical data. Stat. Anal. Data Mini. 5(5), 997 (2012)
Metadata
Title
A meta-level analysis of online anomaly detectors
Authors
Antonios Ntroumpogiannis
Michail Giannoulis
Nikolaos Myrtakis
Vassilis Christophides
Eric Simon
Ioannis Tsamardinos
Publication date
14-01-2023
Publisher
Springer Berlin Heidelberg
Published in
The VLDB Journal / Issue 4/2023
Print ISSN: 1066-8888
Electronic ISSN: 0949-877X
DOI
https://doi.org/10.1007/s00778-022-00773-x
