Skip to main content
Erschienen in: The VLDB Journal 6/2020

28.05.2020 | Regular Paper

Time series indexing by dynamic covering with cross-range constraints

verfasst von: Tao Sun, Hongbo Liu, Seán McLoone, Shaoxiong Ji, Xindong Wu

Erschienen in: The VLDB Journal | Ausgabe 6/2020

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Time series indexing plays an important role in querying and pattern mining of big data. This paper proposes a novel structure for tightly covering a given set of time series under the dynamic time warping similarity measurement. The structure, referred to as dynamic covering with cross-range constraints (DCRC), enables more efficient and scalable indexing to be developed than current hypercube-based partitioning approaches. In particular, a lower bound of the DTW distance from a given query time series to a DCRC-based cover set is introduced. By virtue of its tightness, which is proven theoretically, the lower bound can be used for pruning when querying on an indexing tree. If the DCRC-based lower bound (LB_DCRC) of an upper node in an index tree is larger than a given threshold, all child nodes can be pruned yielding a significant reduction in computational time. A hierarchical DCRC (HDCRC) structure is proposed to generate the DCRC-tree-based indexing and used to develop time series indexing and insertion algorithms. Experimental results for a selection of benchmark time series datasets are presented to illustrate the tightness of LB_DCRC, as well as the pruning efficiency on the DCRC-tree, especially when the time series have large deformations.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Agrawal, R., Faloutsos, C., Swami, A.: Efficient similarity search in sequence databases. In: Proceedings of International Conference on Foundations of Data Organization and Algorithms, pp. 69–84. Springer, Boston, MA (1993) Agrawal, R., Faloutsos, C., Swami, A.: Efficient similarity search in sequence databases. In: Proceedings of International Conference on Foundations of Data Organization and Algorithms, pp. 69–84. Springer, Boston, MA (1993)
2.
Zurück zum Zitat Chen, C.L.P., Zhang, C.Y.: Data-intensive applications, challenges, techniques and technologies: a survey on big data. Inf. Sci. 275, 314–347 (2014)CrossRef Chen, C.L.P., Zhang, C.Y.: Data-intensive applications, challenges, techniques and technologies: a survey on big data. Inf. Sci. 275, 314–347 (2014)CrossRef
4.
Zurück zum Zitat Edstrom, J., Chen, D., Gong, Y., Wang, J., Gong, N.: Data-pattern enabled self-recovery low-power storage system for big video data. IEEE Trans. Big Data 5(1), 95–105 (2019)CrossRef Edstrom, J., Chen, D., Gong, Y., Wang, J., Gong, N.: Data-pattern enabled self-recovery low-power storage system for big video data. IEEE Trans. Big Data 5(1), 95–105 (2019)CrossRef
5.
Zurück zum Zitat Esling, P., Agon, C.: Time-series data mining. ACM Comput. Surv. 45(1), 12:1–34 (2012)CrossRef Esling, P., Agon, C.: Time-series data mining. ACM Comput. Surv. 45(1), 12:1–34 (2012)CrossRef
6.
Zurück zum Zitat Fu, T.C.: A review on time series data mining. Eng. Appl. Artif. Intell. 24(1), 164–181 (2011)CrossRef Fu, T.C.: A review on time series data mining. Eng. Appl. Artif. Intell. 24(1), 164–181 (2011)CrossRef
7.
Zurück zum Zitat Grabocka, J., Wistuba, M., Schmidt-Thieme, L.: Fast classification of univariate and multivariate time series through shapelet discovery. Knowl. Inf. Syst. 49(2), 429–454 (2016)CrossRef Grabocka, J., Wistuba, M., Schmidt-Thieme, L.: Fast classification of univariate and multivariate time series through shapelet discovery. Knowl. Inf. Syst. 49(2), 429–454 (2016)CrossRef
8.
Zurück zum Zitat Guttman, A.: (1984) R-trees: A dynamic index structure for spatial searching. In: ACM Sigmod International Conference on Management of Data, pp. 47–57. ACM, New York, NY (2018) Guttman, A.: (1984) R-trees: A dynamic index structure for spatial searching. In: ACM Sigmod International Conference on Management of Data, pp. 47–57. ACM, New York, NY (2018)
9.
Zurück zum Zitat He, H., Tan, Y.: Unsupervised classification of multivariate time series using VPCA and fuzzy clustering with spatial weighted matrix distance. IEEE Trans. Cybern. 50(3), 1096–1105 (2020)CrossRef He, H., Tan, Y.: Unsupervised classification of multivariate time series using VPCA and fuzzy clustering with spatial weighted matrix distance. IEEE Trans. Cybern. 50(3), 1096–1105 (2020)CrossRef
10.
Zurück zum Zitat Hu, J., Yang, B., Guo, C., Jensen, C.S.: Risk-aware path selection with time-varying, uncertain travel costs: A time series approach. VLDB J. 27(2), 179–200 (2018)CrossRef Hu, J., Yang, B., Guo, C., Jensen, C.S.: Risk-aware path selection with time-varying, uncertain travel costs: A time series approach. VLDB J. 27(2), 179–200 (2018)CrossRef
11.
Zurück zum Zitat Ignatov, A.: Real-time human activity recognition from accelerometer data using convolutional neural networks. Appl. Soft Comput. 62, 915–922 (2018)CrossRef Ignatov, A.: Real-time human activity recognition from accelerometer data using convolutional neural networks. Appl. Soft Comput. 62, 915–922 (2018)CrossRef
12.
Zurück zum Zitat Itakura, F.: Minimum prediction residual principle applied to speech recognition. IEEE Trans. Acoust. Speech Signal Process. 23(1), 67–72 (1975)CrossRef Itakura, F.: Minimum prediction residual principle applied to speech recognition. IEEE Trans. Acoust. Speech Signal Process. 23(1), 67–72 (1975)CrossRef
13.
Zurück zum Zitat Kacprzyk, J., Wilbik, A., Zadrożny, S.: Linguistic summarization of time series using a fuzzy quantifier driven aggregation. Fuzzy Sets Syst. 159(12), 1485–1499 (2008)MathSciNetCrossRef Kacprzyk, J., Wilbik, A., Zadrożny, S.: Linguistic summarization of time series using a fuzzy quantifier driven aggregation. Fuzzy Sets Syst. 159(12), 1485–1499 (2008)MathSciNetCrossRef
14.
Zurück zum Zitat Keogh, E., Ratanamahatana, C.A.: Exact indexing of dynamic time warping. Knowl. Inf. Syst. 7(3), 358–386 (2005)CrossRef Keogh, E., Ratanamahatana, C.A.: Exact indexing of dynamic time warping. Knowl. Inf. Syst. 7(3), 358–386 (2005)CrossRef
15.
Zurück zum Zitat Keogh, E., Wei, L., Xi, X., Vlachos, M., Lee, S.H., Protopapas, P.: Supporting exact indexing of arbitrarily rotated shapes and periodic time series under Euclidean and warping distance measures. VLDB J. 18(3), 611–630 (2009)CrossRef Keogh, E., Wei, L., Xi, X., Vlachos, M., Lee, S.H., Protopapas, P.: Supporting exact indexing of arbitrarily rotated shapes and periodic time series under Euclidean and warping distance measures. VLDB J. 18(3), 611–630 (2009)CrossRef
16.
Zurück zum Zitat Lemire, D.: Faster retrieval with a two-pass dynamic-time-warping lower bound. Pattern Recogn. 42, 2169–2180 (2009)CrossRef Lemire, D.: Faster retrieval with a two-pass dynamic-time-warping lower bound. Pattern Recogn. 42, 2169–2180 (2009)CrossRef
17.
Zurück zum Zitat Li, H., Yang, L.: Extensions and relationships of some existing lower-bound functions for dynamic time warping. J., Intell. Inf. Syst. 43(1), 59–79 (2014)CrossRef Li, H., Yang, L.: Extensions and relationships of some existing lower-bound functions for dynamic time warping. J., Intell. Inf. Syst. 43(1), 59–79 (2014)CrossRef
18.
Zurück zum Zitat Li, Q., Chen, Y., Wang, J., Chen, Y., Chen, H.C.: Web media and stock markets: A survey and future directions from a big data perspective. IEEE Trans. Knowl. Data Eng. 30(2), 381–399 (2018)CrossRef Li, Q., Chen, Y., Wang, J., Chen, Y., Chen, H.C.: Web media and stock markets: A survey and future directions from a big data perspective. IEEE Trans. Knowl. Data Eng. 30(2), 381–399 (2018)CrossRef
19.
Zurück zum Zitat Lin, S.C., Yeh, M.Y., Chen, M.S.: Non-overlapping subsequence matching of stream synopses. IEEE Trans. Knowl. Data Eng. 30(1), 101–114 (2018)CrossRef Lin, S.C., Yeh, M.Y., Chen, M.S.: Non-overlapping subsequence matching of stream synopses. IEEE Trans. Knowl. Data Eng. 30(1), 101–114 (2018)CrossRef
20.
Zurück zum Zitat Liu, M., Zhang, X., Xu, G.: Continuous motion classification and segmentation based on improved dynamic time warping algorithm. Int. J. Pattern Recognit Artif Intell. 32(2), 1850,002 (2018)MathSciNetCrossRef Liu, M., Zhang, X., Xu, G.: Continuous motion classification and segmentation based on improved dynamic time warping algorithm. Int. J. Pattern Recognit Artif Intell. 32(2), 1850,002 (2018)MathSciNetCrossRef
21.
Zurück zum Zitat Mikalsen, K.Ø., Bianchi, F.M., Soguero-Ruiz, C., Jenssen, R.: Time series cluster kernel for learning similarities between multivariate time series with missing data. Pattern Recogn. 76, 569–581 (2018)CrossRef Mikalsen, K.Ø., Bianchi, F.M., Soguero-Ruiz, C., Jenssen, R.: Time series cluster kernel for learning similarities between multivariate time series with missing data. Pattern Recogn. 76, 569–581 (2018)CrossRef
22.
Zurück zum Zitat Mondal, T., Ragot, N., Ramel, J.Y., Pal, U.: Comparative study of conventional time series matching techniques for word spotting. Pattern Recogn. 73, 47–64 (2018)CrossRef Mondal, T., Ragot, N., Ramel, J.Y., Pal, U.: Comparative study of conventional time series matching techniques for word spotting. Pattern Recogn. 73, 47–64 (2018)CrossRef
23.
Zurück zum Zitat Mori, U., Mendiburu, A., Lozano, J.A.: Similarity measure selection for clustering time series databases. IEEE Trans. Knowl. Data Eng. 28(1), 181–195 (2016)CrossRef Mori, U., Mendiburu, A., Lozano, J.A.: Similarity measure selection for clustering time series databases. IEEE Trans. Knowl. Data Eng. 28(1), 181–195 (2016)CrossRef
24.
Zurück zum Zitat Mueen, A., Chavoshi, N., Abu-El-Rub, N., Hamooni, H., Minnich, A., MacCarthy, J.: Speeding up dynamic time warping distance for sparse time series data. Knowl. Inf. Syst. 54(1), 237–263 (2018)CrossRef Mueen, A., Chavoshi, N., Abu-El-Rub, N., Hamooni, H., Minnich, A., MacCarthy, J.: Speeding up dynamic time warping distance for sparse time series data. Knowl. Inf. Syst. 54(1), 237–263 (2018)CrossRef
25.
Zurück zum Zitat Mueen, A., Keogh, E.: Extracting optimal performance from dynamic time warping. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 2129–2130. ACM, New York, NY (2016) Mueen, A., Keogh, E.: Extracting optimal performance from dynamic time warping. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 2129–2130. ACM, New York, NY (2016)
26.
Zurück zum Zitat Park, S., Lee, D., Chu, W.W.: Fast retrieval of similar subsequences in long sequence databases. In: Proceedings of 1999 Workshop on Knowledge and Data Engineering Exchange, pp. 60–67. IEEE, Chicago, IL (1999) Park, S., Lee, D., Chu, W.W.: Fast retrieval of similar subsequences in long sequence databases. In: Proceedings of 1999 Workshop on Knowledge and Data Engineering Exchange, pp. 60–67. IEEE, Chicago, IL (1999)
27.
Zurück zum Zitat Rakthanmanon, T., Campana, B., Mueen, A., Batista, G., Westover, B., Zhu, Q., Zakaria, J., Keogh, E.: Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 262–270. ACM, New York, NY (2012) Rakthanmanon, T., Campana, B., Mueen, A., Batista, G., Westover, B., Zhu, Q., Zakaria, J., Keogh, E.: Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 262–270. ACM, New York, NY (2012)
28.
Zurück zum Zitat Sakoe, H., Chiba, S.: Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoust. Speech Signal Process. 26(1), 43–49 (1978)CrossRef Sakoe, H., Chiba, S.: Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoust. Speech Signal Process. 26(1), 43–49 (1978)CrossRef
29.
Zurück zum Zitat Shen, Y., Chen, Y., Keogh, E., Jin, H.: Accelerating time series searching with large uniform scaling. In: Proceedings of the 2018 SIAM International Conference on Data Mining, pp. 234–242. SIAM, Bologna, Italy (2018) Shen, Y., Chen, Y., Keogh, E., Jin, H.: Accelerating time series searching with large uniform scaling. In: Proceedings of the 2018 SIAM International Conference on Data Mining, pp. 234–242. SIAM, Bologna, Italy (2018)
30.
Zurück zum Zitat Son, N.T., Anh, D.T.: Discovery of time series \(k\)-motifs based on multidimensional index. Knowl. Inf. Syst. 46(1), 59–86 (2016)CrossRef Son, N.T., Anh, D.T.: Discovery of time series \(k\)-motifs based on multidimensional index. Knowl. Inf. Syst. 46(1), 59–86 (2016)CrossRef
31.
Zurück zum Zitat Sun, T., Liu, H., Yu, H., Chen, C.L.P.: Degree-pruning dynamic planning approaches to central time series through minimizing dynamic time warping distance. IEEE Trans. Cybern. 47(7), 1719–1729 (2017)CrossRef Sun, T., Liu, H., Yu, H., Chen, C.L.P.: Degree-pruning dynamic planning approaches to central time series through minimizing dynamic time warping distance. IEEE Trans. Cybern. 47(7), 1719–1729 (2017)CrossRef
32.
Zurück zum Zitat Tan, C.W., Petitjean, F., Webb, G.: Elastic bands across the path: a new framework and method to lower bound DTW. In: Proceedings of the 2019 SIAM International Conference on Data Mining, pp. 522–530. SIAM, Alberta, Canada (2019) Tan, C.W., Petitjean, F., Webb, G.: Elastic bands across the path: a new framework and method to lower bound DTW. In: Proceedings of the 2019 SIAM International Conference on Data Mining, pp. 522–530. SIAM, Alberta, Canada (2019)
33.
Zurück zum Zitat Tan, C.W., Webb, G.I., Petitjean, F.: Indexing and classifying gigabytes of time series under time warping. In: Proceedings of the 2017 SIAM International Conference on Data Mining, pp. 282–290. SIAM, Houston, TX (2017) Tan, C.W., Webb, G.I., Petitjean, F.: Indexing and classifying gigabytes of time series under time warping. In: Proceedings of the 2017 SIAM International Conference on Data Mining, pp. 282–290. SIAM, Houston, TX (2017)
34.
Zurück zum Zitat Tan, Z., Wang, Y., Zhang, Y., Zhou, J.: A novel time series approach for predicting the long-term popularity of online videos. IEEE Trans. Broadcast. 62(2), 436–445 (2016)CrossRef Tan, Z., Wang, Y., Zhang, Y., Zhou, J.: A novel time series approach for predicting the long-term popularity of online videos. IEEE Trans. Broadcast. 62(2), 436–445 (2016)CrossRef
35.
Zurück zum Zitat Tang, J., Cheng, H., Zhao, Y., Guo, H.: Structured dynamic time warping for continuous hand trajectory gesture recognition. Pattern Recogn. 80, 21–31 (2018)CrossRef Tang, J., Cheng, H., Zhao, Y., Guo, H.: Structured dynamic time warping for continuous hand trajectory gesture recognition. Pattern Recogn. 80, 21–31 (2018)CrossRef
36.
Zurück zum Zitat Wu, X., Zhu, X., Wu, G., Ding, W.: Data mining with big data. IEEE Trans. Knowl. Data Eng. 26(1), 97–107 (2014)CrossRef Wu, X., Zhu, X., Wu, G., Ding, W.: Data mining with big data. IEEE Trans. Knowl. Data Eng. 26(1), 97–107 (2014)CrossRef
37.
Zurück zum Zitat Wu, Y., Tong, Y., Zhu, X., Wu, X.: NOSEP: Nonoverlapping sequence pattern mining with gap constraints. IEEE Trans. Cybern. 48(10), 2809–2822 (2018)CrossRef Wu, Y., Tong, Y., Zhu, X., Wu, X.: NOSEP: Nonoverlapping sequence pattern mining with gap constraints. IEEE Trans. Cybern. 48(10), 2809–2822 (2018)CrossRef
38.
Zurück zum Zitat Yi, B.K., Jagadish, H.V., Faloutsos, C.: Efficient retrieval of similar time sequences under time warping. In: Proceedings of the 14th International Conference on Data Engineering, pp. 201–208. IEEE, Orlando, FL (1998) Yi, B.K., Jagadish, H.V., Faloutsos, C.: Efficient retrieval of similar time sequences under time warping. In: Proceedings of the 14th International Conference on Data Engineering, pp. 201–208. IEEE, Orlando, FL (1998)
39.
Zurück zum Zitat Zhou, M., Wong, M.H.: Boundary-based lower-bound functions for dynamic time warping and their indexing. Inf. Sci. 181(19), 4175–4196 (2011)CrossRef Zhou, M., Wong, M.H.: Boundary-based lower-bound functions for dynamic time warping and their indexing. Inf. Sci. 181(19), 4175–4196 (2011)CrossRef
40.
Zurück zum Zitat Zoumpatianos, K., Lou, Y., Ileana, I., Palpanas, T., Gehrke, J.: Generating data series query workloads. VLDB J. 27(6), 823–846 (2018)CrossRef Zoumpatianos, K., Lou, Y., Ileana, I., Palpanas, T., Gehrke, J.: Generating data series query workloads. VLDB J. 27(6), 823–846 (2018)CrossRef
Metadaten
Titel
Time series indexing by dynamic covering with cross-range constraints
verfasst von
Tao Sun
Hongbo Liu
Seán McLoone
Shaoxiong Ji
Xindong Wu
Publikationsdatum
28.05.2020
Verlag
Springer Berlin Heidelberg
Erschienen in
The VLDB Journal / Ausgabe 6/2020
Print ISSN: 1066-8888
Elektronische ISSN: 0949-877X
DOI
https://doi.org/10.1007/s00778-020-00614-9

Weitere Artikel der Ausgabe 6/2020

The VLDB Journal 6/2020 Zur Ausgabe