The VLDB Journal

15-02-2024 | Regular Paper

A new distributional treatment for time series anomaly detection

Authors: Kai Ming Ting, Zongyou Liu, Lei Gong, Hang Zhang, Ye Zhu

Published in: The VLDB Journal | Issue 3/2024

Abstract

Time series are traditionally treated with two main approaches: the time domain approach and the frequency domain approach. Both must rely on a sliding window so that time-shifted versions of a sequence can be measured as similar. Coupled with the use of a root point-to-point measure, existing methods often have quadratic time complexity. We offer a third approach, the \(\mathbb{R}\) domain approach. It begins with the insight that sequences in a stationary time series can be treated as sets of independent and identically distributed (iid) points generated from an unknown distribution in \(\mathbb{R}\). This \(\mathbb{R}\) domain treatment enables two new possibilities: (a) the similarity between two sequences can be computed using a distributional measure such as Wasserstein distance (WD), kernel mean embedding or isolation distributional kernel (\(\mathcal{K}_I\)), and (b) these distributional measures do not require a sliding window. Together, they offer an alternative that measures similarity more effectively and runs significantly faster than point-to-point and sliding-window-based measures. Our empirical evaluation shows that \(\mathcal{K}_I\) is an effective and efficient distributional measure for time series, and that \(\mathcal{K}_I\)-based detectors have better detection accuracy than existing detectors in two tasks: (i) anomalous sequence detection in a stationary time series and (ii) anomalous time series detection in a dataset of non-stationary time series. This insight makes underutilized “old things new again”: it gives existing distributional measures and anomaly detectors a new life in time series anomaly detection that would otherwise be impossible.
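The contrast between a point-to-point measure and a distributional measure can be illustrated with a minimal sketch. This is not the authors' implementation: the toy sine sequences, the helper names (`sine_sequence`, `euclidean`, `wasserstein_1d`) and all parameter values are assumptions made for illustration only. The sketch treats each sequence as an unordered set of points in \(\mathbb{R}\) and computes the empirical 1-Wasserstein distance for equal-size samples, which reduces to the mean absolute difference of the sorted values.

```python
import math


def sine_sequence(shift, length=200, period=50, scale=1.0):
    """Toy periodic sequence (an assumption for illustration), shifted by `shift` samples."""
    return [scale * math.sin(2.0 * math.pi * (t + shift) / period)
            for t in range(length)]


def euclidean(x, y):
    """Point-to-point measure: compares x[t] with y[t] at every time step t."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def wasserstein_1d(x, y):
    """Empirical 1-Wasserstein distance between two equal-size samples in R.

    The temporal order is discarded: each sequence is treated as a set of
    points, and the distance is the mean absolute difference of sorted values.
    """
    if len(x) != len(y):
        raise ValueError("this sketch only handles equal-size samples")
    xs, ys = sorted(x), sorted(y)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)


base = sine_sequence(shift=0)
shifted = sine_sequence(shift=10)            # same shape, shifted in time
scaled = sine_sequence(shift=0, scale=3.0)   # different value distribution

d_point = euclidean(base, shifted)             # large, although only the phase differs
d_dist_shifted = wasserstein_1d(base, shifted) # close to zero: same set of values
d_dist_scaled = wasserstein_1d(base, scaled)   # clearly non-zero: values differ

print(d_point, d_dist_shifted, d_dist_scaled)
```

In this toy setting the time-shifted copy looks dissimilar under the point-to-point measure without any alignment or sliding window, whereas the distributional measure regards it as nearly identical and still separates the sequence whose values follow a different distribution.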

We use this term to denote either a part of one stationary time series or one time series in a dataset of non-stationary time series, depending on which of the two anomaly detection tasks is under investigation.

Detecting anomalous sequences in a non-stationary time series is outside the scope of this paper because the notion of normal sequences could be defined in various ways, depending on the kind of non-stationarity, which is often ill-defined. Yet, we show in Sect. 7 that the proposed treatment works for the second anomaly detection task in a dataset of time series, where individual time series can be non-stationary.

We have attempted more complicated measures such as MSM [52] and TWED [31]. They are very time-consuming because they have at least quadratic time complexity, and neither of them (using the Python implementations from sktime [30]) could complete the run within the 2-day time frame for any dataset we have used.

These methods are evaluated for time series classification in their papers, but their representation steps do not need label information and are independent of the downstream task.

The feature map of the Gaussian kernel is approximated using the Nyström method [63] in order to accelerate the computation. The sample size of the Nyström method is set to \(\sqrt{nl}\), which is also equal to the number of features. The bandwidth of \(\mathcal{K}_G\) is searched over \(\{10^m\ |\ m=-4,-3,\ldots ,0,1\}\).
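As a rough illustration of a Gaussian-kernel distributional measure and of the bandwidth grid above, the following sketch computes an exact kernel mean embedding similarity between two small samples. It is a simplified stand-in under stated assumptions: it does not use the Nyström approximation, the helper names (`gaussian_kernel`, `kme_similarity`) and the sample values are hypothetical, and the bandwidth is assumed to enter the kernel as \(\exp(-(x-y)^2/(2\sigma^2))\), a parameterization the text does not specify.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel on R; the exp(-(x-y)^2 / (2*sigma^2)) form is an assumption."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def kme_similarity(P, Q, bandwidth):
    """Kernel mean embedding similarity: average kernel value over all pairs.

    Exact computation, quadratic in the sample sizes; no Nystrom approximation.
    """
    total = sum(gaussian_kernel(x, y, bandwidth) for x in P for y in Q)
    return total / (len(P) * len(Q))


# Bandwidth grid {10^m | m = -4, -3, ..., 0, 1} as stated in the text.
bandwidth_grid = [10.0 ** m for m in range(-4, 2)]

# Hypothetical samples: P and Q come from clearly different value ranges.
P = [0.0, 0.1, 0.2]
Q = [5.0, 5.1, 5.2]

for bw in bandwidth_grid:
    print(bw, kme_similarity(P, P, bw), kme_similarity(P, Q, bw))
```

Very small bandwidths make almost every pair look dissimilar and very large ones make every pair look similar, which is one reason to search the bandwidth over a grid of this kind.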

The largest dataset in the UCR archive is not used because of insufficient memory.

1.

Bandaragoda, T.R., Ting, K.M., Albrecht, D., Liu, F.T., Zhu, Y., Wells, J.R.: Isolation-based anomaly detection using nearest-neighbor ensembles. Comput. Intell. 34(4), 968–998 (2018)

2.

Beggel, L., Kausler, B.X., Schiegg, M., Pfeiffer, M., Bischl, B.: Time series anomaly detection based on shapelet learning. Comput. Stat. 34(3), 945–976 (2019)

3.

Benkabou, S.E., Benabdeslem, K., Canitia, B.: Unsupervised outlier detection for time series by entropy and dynamic time warping. Knowl. Inf. Syst. 54(2), 463–486 (2018)

4.

Bock, C., Togninalli, M., Ghisu, E., Gumbsch, T., Rieck, B., Borgwardt, K.: A Wasserstein subsequence kernel for time series. In: Proceedings of the International Conference on Data Mining, pp. 964–969 (2019)

5.

Boniol, P., Linardi, M., Roncallo, F., Palpanas, T., Meftah, M., Remy, E.: Unsupervised and scalable subsequence anomaly detection in large data series. VLDB J. 30(6), 909–931 (2021)

6.

Boniol, P., Paparrizos, J., Palpanas, T., Franklin, M.J.: SAND: streaming subsequence anomaly detection. In: Proceedings of the VLDB Endowment, pp. 1717–1729 (2021)

7.

Breunig, M.M., Kriegel, H.P., Ng, R.T., Sander, J.: LOF: Identifying density-based local outliers. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 93–104 (2000)

8.

Cazelles, E., Robert, A., Tobar, F.: The Wasserstein–Fourier distance for stationary time series. IEEE Trans. Signal Process. 69, 709–721 (2020)

9.

Chan, F.P., Fu, A.C.: Haar wavelets for efficient similarity search of time-series: with and without time warping. IEEE Trans. Knowl. Data Eng. 15(3), 686–705 (2003)

10.

Dau, H.A., Bagnall, A., Kamgar, K., Yeh, C.C.M., Zhu, Y., Gharghabi, S., Ratanamahatana, C.A., Keogh, E.: The UCR time series archive. IEEE/CAA J. Autom. Sinica 6(6), 1293–1305 (2019)

11.

Dempster, A., Schmidt, D.F., Webb, G.I.: Minirocket: A very fast (almost) deterministic transform for time series classification. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 248–257 (2021)

12.

Demšar, J.: Statistical comparisons of classifiers over multiple datasets. J. Mach. Learn. Res. 7, 1–30 (2006)

13.

Dickey, D.A., Fuller, W.A.: Distribution of the estimators for autoregressive time series with a unit root. J. Am. Stat. Assoc. 74(366), 427–431 (1979)

14.

Elliott, G., Rothenberg, T.J., Stock, J.H.: Efficient tests for an autoregressive unit root. Econometrica 64(4), 813–836 (1996)

15.

Faloutsos, C., Ranganathan, M., Manolopoulos, Y.: Fast subsequence matching in time-series databases. ACM SIGMOD Rec. 23(2), 419–429 (1994)

16.

Gharghabi, S., Imani, S., Bagnall, A., Darvishzadeh, A., Keogh, E.: An ultra-fast time series distance measure to allow data mining in more complex real-world deployments. Data Min. Knowl. Disc. 34(4), 1104–1135 (2020)

17.

Gold, O., Sharir, M.: Dynamic time warping and geometric edit distance: breaking the quadratic barrier. ACM Trans. Algorithms 14(4), 1–17 (2018)

18.

Goldberger, A.L., Amaral, L.A., Glass, L., Hausdorff, J.M., Ivanov, P.C., Mark, R.G., Mietus, J.E., Moody, G.B., Peng, C.K., Stanley, H.E.: Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals. Circulation 101(23), 215–220 (2000)

19.

Hobijn, B., Franses, P.H., Ooms, M.: Generalizations of the KPSS-test for stationarity. Stat. Neerl. 58(4), 483–502 (2004)

20.

Hyndman, R.J.: Computing and graphing highest density regions. Am. Stat. 50(2), 120–126 (1996)

21.

Hyndman, R.J., Wang, E., Laptev, N.: Large-scale unusual time series detection. In: Proceedings of the International Conference on Data Mining Workshop, pp. 1616–1619 (2015)

22.

Itakura, F.: Analysis synthesis telephony based on the maximum likelihood method. In: Proceedings of the 6th International Congress on Acoustics, pp. 280–292 (1968)

23.

Jones, M., Nikovski, D., Imamura, M., Hirata, T.: Exemplar learning for extremely efficient anomaly detection in real-valued time series. Data Min. Knowl. Disc. 30(6), 1427–1454 (2016)

24.

Kalpakis, K., Gada, D., Puttagunta, V.: Distance measures for effective clustering of arima time-series. In: Proceedings of the IEEE International Conference on Data Mining, pp. 273–280 (2001)

25.

Keogh, E., Lin, J., Fu, A.: Hot sax: efficiently finding the most unusual time series subsequence. In: Proceedings of the IEEE International Conference on Data Mining, pp. 226–233 (2005)

26.

Keogh, E., Ratanamahatana, C.A.: Exact indexing of dynamic time warping. Knowl. Inf. Syst. 7(3), 358–386 (2005)

27.

Klein, J.L.: Statistical Visions in Time: A History of Time Series Analysis, 1662–1938. Cambridge University Press, Cambridge (1997)

28.

Knox, E.M., Ng, R.T.: Algorithms for mining distance-based outliers in large datasets. In: Proceedings of the 24th International Conference on Very Large Data Bases, pp. 392–403 (1998)

29.

Liu, F.T., Ting, K.M., Zhou, Z.H.: Isolation forest. In: Proceedings of the International Conference on Data Mining, pp. 413–422 (2008)

30.

Löning, M., Bagnall, A., Ganesh, S., Kazakov, V., Lines, J., Király, F.J.: sktime: A unified interface for machine learning with time series. arXiv:1909.07872 (2019)

31.

Marteau, P.F.: Time warp edit distance with stiffness adjustment for time series matching. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 306–318 (2008)

32.

Moody, G.B., Mark, R.G.: The impact of the MIT-BIH arrhythmia database. IEEE Eng. Med. Biol. Mag. 20(3), 45–50 (2001)

33.

Muandet, K., Fukumizu, K., Sriperumbudur, B., Schölkopf, B., et al.: Kernel mean embedding of distributions: a review and beyond. Found. Trends® Mach. Learn. 10(1–2), 1–141 (2017)

34.

Muandet, K., Schölkopf, B.: One-class support measure machines for group anomaly detection. In: Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence, pp. 449–458 (2013)

35.

Paparrizos, J., Boniol, P., Palpanas, T., Tsay, R.S., Elmore, A., Franklin, M.J.: Volume under the surface: a new accuracy evaluation measure for time-series anomaly detection. In: Proceedings of the VLDB Endowment, pp. 2774–2787 (2022)

36.

Paparrizos, J., Franklin, M.J.: Grail: efficient time-series representation learning. In: Proceedings of the VLDB Endowment, pp. 1762–1777 (2019)

37.

Paparrizos, J., Gravano, L.: k-Shape: Efficient and accurate clustering of time series. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 1855–1870 (2015)

38.

Paparrizos, J., Kang, Y., Boniol, P., Tsay, R.S., Palpanas, T., Franklin, M.J.: TSB-UAD: an end-to-end benchmark suite for univariate time-series anomaly detection. In: Proceedings of the VLDB Endowment, pp. 1697–1711 (2022)

39.

Paparrizos, J., Liu, C., Elmore, A.J., Franklin, M.J.: Debunking four long-standing misconceptions of time-series distance measures. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 1887–1905 (2020)

40.

Popivanov, I., Miller, R.J.: Similarity search over time-series data using wavelets. In: Proceedings of the International Conference on Data Engineering, pp. 212–221 (2002)

41.

Qin, X., Ting, K.M., Zhu, Y., Lee, V.C.: Nearest-neighbour-induced isolation similarity and its impact on density-based clustering. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 4755–4762 (2019)

42.

Qiu, C., Pfrommer, T., Kloft, M., Mandt, S., Rudolph, M.: Neural transformation learning for deep anomaly detection beyond images. In: Proceedings of the International Conference on Machine Learning, pp. 8703–8714 (2021)

43.

Rüschendorf, L.: Wasserstein metric. In: Encyclopedia of Mathematics (2002)

44.

Sakoe, H.: Dynamic-programming approach to continuous speech recognition. In: Proceedings of the International Congress of Acoustics (1971)

45.

Sakoe, H., Chiba, S.: Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoust. Speech Signal Process. 26(1), 43–49 (1978)

46.

Schmidl, S., Wenig, P., Papenbrock, T.: Anomaly detection in time series: a comprehensive evaluation. In: Proceedings of the VLDB Endowment, pp. 1779–1797 (2022)

47.

Schölkopf, B., Platt, J.C., Shawe-Taylor, J., Smola, A.J., Williamson, R.C.: Estimating the support of a high-dimensional distribution. Neural Comput. 13(7), 1443–1471 (2001)

48.

Senin, P., Lin, J., Wang, X., Oates, T., Gandhi, S., Boedihardjo, A.P., Chen, C., Frankenstein, S.: Time series anomaly discovery with grammar-based compression. In: Proceedings of the 18th International Conference on Extending Database Technology, pp. 481–492 (2015)

49.

Shen, Y., Chen, Y., Keogh, E., Jin, H.: Accelerating time series searching with large uniform scaling. In: Proceedings of the SIAM International Conference on Data Mining, pp. 234–242 (2018)

50.

Shumway, R.H., Stoffer, D.S.: Time Series Analysis and Its Applications: With R Examples. Springer, Berlin (2017)

51.

Smola, A., Gretton, A., Song, L., Schölkopf, B.: A Hilbert space embedding for distributions. In: Proceedings of the International Conference on Algorithmic Learning Theory, pp. 13–31 (2007)

52.

Stefan, A., Athitsos, V., Das, G.: The move-split-merge metric for time series. IEEE Trans. Knowl. Data Eng. 25(6), 1425–1438 (2012)

53.

Tan, C.W., Petitjean, F., Webb, G.I.: Elastic bands across the path: A new framework and method to lower bound DTW. In: Proceedings of the SIAM International Conference on Data Mining, pp. 522–530 (2019)

54.

Tavenard, R., Faouzi, J., Vandewiele, G., Divo, F., Androz, G., Holtz, C., Payne, M., Yurchak, R., Rußwurm, M., Kolar, K., et al.: Tslearn, a machine learning toolkit for time series data. J. Mach. Learn. Res. 21(1), 4686–4691 (2020)

55.

Tax, D.M., Duin, R.P.: Support vector data description. Mach. Learn. 54(1), 45–66 (2004)

56.

Ting, K.M., Liu, Z., Zhang, H., Zhu, Y.: A new distributional treatment for time series and an anomaly detection investigation. In: Proceedings of the VLDB Endowment, pp. 2321–2333 (2022)

57.

Ting, K.M., Wells, J.R., Washio, T.: Isolation kernel: the X factor in efficient and effective large scale online kernel learning. Data Min. Knowl. Disc. 35(6), 2282–2312 (2021)

58.

Ting, K.M., Xu, B.C., Washio, T., Zhou, Z.H.: Isolation distributional kernel: a new tool for kernel based anomaly detection. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 198–206 (2020)

59.

Ting, K.M., Xu, B.C., Washio, T., Zhou, Z.H.: Isolation distributional kernel: a new tool for point and group anomaly detections. IEEE Trans. Knowl. Data Eng. 35(03), 2697–2710 (2023)

60.

Ting, K.M., Zhu, Y., Zhou, Z.H.: Isolation kernel and its effect on SVM. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 2329–2337 (2018)

61.

Togninalli, M., Ghisu, E., Llinares-López, F., Rieck, B., Borgwardt, K.: Wasserstein Weisfeiler-Lehman graph kernels. In: Proceedings of the Conference on Neural Information Processing Systems, pp. 6436–6446 (2019)

62.

Wu, R., Keogh, E.J.: Current time series anomaly detection benchmarks are flawed and are creating the illusion of progress. IEEE Trans. Knowl. Data Eng. 35(03), 2421–2429 (2023)

63.

Yang, T., Li, Y.f., Mahdavi, M., Jin, R., Zhou, Z.H.: Nyström method vs random Fourier features: a theoretical and empirical comparison. In: Proceedings of Conference on Neural Information Processing Systems, pp. 476–484 (2012)

64.

Yeh, C.C.M., Zhu, Y., Ulanova, L., Begum, N., Ding, Y., Dau, H.A., Silva, D.F., Mueen, A., Keogh, E.: Matrix profile I: all pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets. In: Proceedings of the International Conference on Data Mining, pp. 1317–1322 (2016)

65.

Yue, Z., Wang, Y., Duan, J., Yang, T., Huang, C., Tong, Y., Xu, B.: Ts2vec: Towards universal representation of time series. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 8980–8987 (2022)

66.

Zhu, Y., Zimmerman, Z., Senobari, N.S., Yeh, C.C.M., Funning, G., Mueen, A., Brisk, P., Keogh, E.: Matrix profile II: Exploiting a novel algorithm and gpus to break the one hundred million barrier for time series motifs and joins. In: Proceedings of the International Conference on Data Mining, pp. 739–748 (2016)

Title: A new distributional treatment for time series anomaly detection
Authors: Kai Ming Ting, Zongyou Liu, Lei Gong, Hang Zhang, Ye Zhu
Publication date: 15-02-2024
Publisher: Springer Berlin Heidelberg
Published in: The VLDB Journal / Issue 3/2024
Print ISSN: 1066-8888
Electronic ISSN: 0949-877X
DOI: https://doi.org/10.1007/s00778-023-00832-x
