Skip to main content
Erschienen in: Knowledge and Information Systems 1/2018

28.10.2017 | Regular Paper

Speeding up dynamic time warping distance for sparse time series data

verfasst von: Abdullah Mueen, Nikan Chavoshi, Noor Abu-El-Rub, Hossein Hamooni, Amanda Minnich, Jonathan MacCarthy

Erschienen in: Knowledge and Information Systems | Ausgabe 1/2018

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Dynamic time warping (DTW) distance has been effectively used in mining time series data in a multitude of domains. However, in its original formulation DTW is extremely inefficient in comparing long sparse time series, containing mostly zeros and some unevenly spaced nonzero observations. Original DTW distance does not take advantage of this sparsity, leading to redundant calculations and a prohibitively large computational cost for long time series. We derive a new time warping similarity measure (AWarp) for sparse time series that works on the run-length encoded representation of sparse time series. The complexity of AWarp is quadratic on the number of observations as opposed to the range of time of the time series. Therefore, AWarp can be several orders of magnitude faster than DTW on sparse time series. AWarp is exact for binary-valued time series and a close approximation of the original DTW distance for any-valued series. We discuss useful variants of AWarp: bounded (both upper and lower), constrained, and multidimensional. We show applications of AWarp to three data mining tasks including clustering, classification, and outlier detection, which are otherwise not feasible using classic DTW, while producing equivalent results. Potential areas of application include bot detection, human activity classification, search trend analysis, seismic analysis, and unusual review pattern mining.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Mueen A, Keogh E (2010) Online discovery and maintenance of time series motifs. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining—KDD’10, number C in KDD’10. ACM Press, p 1089 Mueen A, Keogh E (2010) Online discovery and maintenance of time series motifs. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining—KDD’10, number C in KDD’10. ACM Press, p 1089
2.
Zurück zum Zitat Ye L, Keogh E (2009) Time series shapelets: a new primitive for data mining. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, KDD, pp 947–956 Ye L, Keogh E (2009) Time series shapelets: a new primitive for data mining. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, KDD, pp 947–956
3.
Zurück zum Zitat Shokoohi-Yekta M, Chen Y, Campana B, Hu B, Zakaria J, Keogh E (2015) Discovery of meaningful rules in time series. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining—KDD’15. ACM Press, New York, pp 1085–1094 Shokoohi-Yekta M, Chen Y, Campana B, Hu B, Zakaria J, Keogh E (2015) Discovery of meaningful rules in time series. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining—KDD’15. ACM Press, New York, pp 1085–1094
4.
Zurück zum Zitat Hamooni H, Mueen A (2014) Dual-domain hierarchical classification of phonetic time series. In: ICDM 2014. ICDM Hamooni H, Mueen A (2014) Dual-domain hierarchical classification of phonetic time series. In: ICDM 2014. ICDM
5.
Zurück zum Zitat Keogh E (2002) Exact indexing of dynamic time warping. In: Proceedings of the 28th international conference on very large data bases, VLDB’02, pp 406–417 Keogh E (2002) Exact indexing of dynamic time warping. In: Proceedings of the 28th international conference on very large data bases, VLDB’02, pp 406–417
6.
Zurück zum Zitat Faloutsos C, Ranganathan M, Manolopoulos Y (1994) Fast subsequence matching in time-series databases. ACM SIGMOD Rec 23(2):419–429CrossRef Faloutsos C, Ranganathan M, Manolopoulos Y (1994) Fast subsequence matching in time-series databases. ACM SIGMOD Rec 23(2):419–429CrossRef
7.
Zurück zum Zitat Wang X, Mueen A, Ding H, Trajcevski G, Scheuermann P, Keogh E (2013) Experimental comparison of representation methods and distance measures for time series data. Data Min Knowl Disc 26(2):275–309MathSciNetCrossRef Wang X, Mueen A, Ding H, Trajcevski G, Scheuermann P, Keogh E (2013) Experimental comparison of representation methods and distance measures for time series data. Data Min Knowl Disc 26(2):275–309MathSciNetCrossRef
9.
Zurück zum Zitat Cook DJ, Crandall AS, Thomas BL, Krishnan NC (2013) CASAS: a smart home in a box. Computer 46(7):62–69CrossRef Cook DJ, Crandall AS, Thomas BL, Krishnan NC (2013) CASAS: a smart home in a box. Computer 46(7):62–69CrossRef
11.
Zurück zum Zitat Boulgouris N, Plataniotis K, Hatzinakos D (2004) Gait recognition using dynamic time warping. In: IEEE 6th workshop on multimedia signal processing. IEEE, pp 263–266 Boulgouris N, Plataniotis K, Hatzinakos D (2004) Gait recognition using dynamic time warping. In: IEEE 6th workshop on multimedia signal processing. IEEE, pp 263–266
12.
Zurück zum Zitat Sakoe H, Chiba S (1978) Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans Acoust Speech Signal Process 26(1):43–49CrossRefMATH Sakoe H, Chiba S (1978) Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans Acoust Speech Signal Process 26(1):43–49CrossRefMATH
13.
Zurück zum Zitat Keogh EJ, Pazzani MJ (2000) Scaling up dynamic time warping for datamining applications. In: Proceedings of the 6th ACM SIGKDD international conference on knowledge discovery and data mining—KDD’00. ACM Press, New York, pp 285–289 Keogh EJ, Pazzani MJ (2000) Scaling up dynamic time warping for datamining applications. In: Proceedings of the 6th ACM SIGKDD international conference on knowledge discovery and data mining—KDD’00. ACM Press, New York, pp 285–289
14.
Zurück zum Zitat Rath TM, Manmatha R (2003) Word image matching using dynamic time warping. In: 2003. Proceedings. 2003 IEEE computer society conference on computer vision and pattern recognition, vol 2. IEEE, p II—521 Rath TM, Manmatha R (2003) Word image matching using dynamic time warping. In: 2003. Proceedings. 2003 IEEE computer society conference on computer vision and pattern recognition, vol 2. IEEE, p II—521
15.
Zurück zum Zitat Berndt DJ, Clifford J (1994) Using dynamic time warping to find patterns in time series. In: KDD Workshop, pp 359–370 Berndt DJ, Clifford J (1994) Using dynamic time warping to find patterns in time series. In: KDD Workshop, pp 359–370
16.
Zurück zum Zitat Al-Naymat G, Chawla S, Taheri J (2009) SparseDTW: a novel approach to speed up dynamic time warping. In: Proceedings of the Eighth Australasian data mining conference, vol 101. Australian computer society, Inc., Darlinghurst, Australia, pp 117–127 Al-Naymat G, Chawla S, Taheri J (2009) SparseDTW: a novel approach to speed up dynamic time warping. In: Proceedings of the Eighth Australasian data mining conference, vol 101. Australian computer society, Inc., Darlinghurst, Australia, pp 117–127
17.
Zurück zum Zitat Tan LN, Alwan A, Kossan G, Cody ML, Taylor CE (2015) Dynamic time warping and sparse representation classification for birdsong phrase classification using limited training data. J Acoust Soc Am 137(3):1069–80CrossRef Tan LN, Alwan A, Kossan G, Cody ML, Taylor CE (2015) Dynamic time warping and sparse representation classification for birdsong phrase classification using limited training data. J Acoust Soc Am 137(3):1069–80CrossRef
18.
Zurück zum Zitat Chu S, Keogh E, Hart D, Pazzani M (2002) Iterative deepening dynamic time warping for time series, Chapter 12, pp 195–212 Chu S, Keogh E, Hart D, Pazzani M (2002) Iterative deepening dynamic time warping for time series, Chapter 12, pp 195–212
19.
Zurück zum Zitat Salvador S, Chan P (2007) Toward accurate dynamic time warping in linear time and space. Intell Data Anal 11(5):561–580 Salvador S, Chan P (2007) Toward accurate dynamic time warping in linear time and space. Intell Data Anal 11(5):561–580
20.
Zurück zum Zitat Sart D, Mueen A, Najjar W, Niennattrakul V, Keogh E (2010) Accelerating dynamic time warping subsequnce search with GPUs and FPGAs. ICDM 2010. In: Proceedings—IEEE international conference on data mining, ICDM, pp 1001–1006 Sart D, Mueen A, Najjar W, Niennattrakul V, Keogh E (2010) Accelerating dynamic time warping subsequnce search with GPUs and FPGAs. ICDM 2010. In: Proceedings—IEEE international conference on data mining, ICDM, pp 1001–1006
21.
Zurück zum Zitat Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2012) Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining—KDD ’12. ACM Press, New York, p 262 Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2012) Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining—KDD ’12. ACM Press, New York, p 262
22.
Zurück zum Zitat Begum N, Ulanova L, Wang J, Keogh E (2015) Accelerating dynamic time warping clustering with a novel admissible pruning strategy. In: Proceedings of the 21st ACM SIGKDD international conference on knowledge discovery and data mining- KDD’15. ACM Press, New York, pp 49–58 Begum N, Ulanova L, Wang J, Keogh E (2015) Accelerating dynamic time warping clustering with a novel admissible pruning strategy. In: Proceedings of the 21st ACM SIGKDD international conference on knowledge discovery and data mining- KDD’15. ACM Press, New York, pp 49–58
23.
Zurück zum Zitat Assent I, Wichterich M, Krieger R, Kremer H, Seidl T (2009) Anticipatory DTW for efficient similarity search in time series databases. J Proc VLDB Endow 2(1):826–837CrossRef Assent I, Wichterich M, Krieger R, Kremer H, Seidl T (2009) Anticipatory DTW for efficient similarity search in time series databases. J Proc VLDB Endow 2(1):826–837CrossRef
24.
Zurück zum Zitat Candan KS, Rossini R, Sapino ML, Wang X (2012) sDTW: computing DTW distances using locally relevant constraints based on salient feature alignments. PVLDB 5(11):1519–1530 Candan KS, Rossini R, Sapino ML, Wang X (2012) sDTW: computing DTW distances using locally relevant constraints based on salient feature alignments. PVLDB 5(11):1519–1530
25.
Zurück zum Zitat Shokoohi-Yekta M, Wang J, Keogh E, On the non-trivial generalization of dynamic time warping to the multi-dimensional case, Chapter 33, pp 289–297 Shokoohi-Yekta M, Wang J, Keogh E, On the non-trivial generalization of dynamic time warping to the multi-dimensional case, Chapter 33, pp 289–297
26.
Zurück zum Zitat Lines J, Davis L, Hills J, Bagnall A (2012) A shapelet transform for time series classification. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, KDD, pp 289–297 Lines J, Davis L, Hills J, Bagnall A (2012) A shapelet transform for time series classification. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, KDD, pp 289–297
27.
Zurück zum Zitat Mueen A (2013) Enumeration of time series motifs of all lengths. In: Proceedings—IEEE international conference on data mining, ICDM. ICDM, pp 547–556 Mueen A (2013) Enumeration of time series motifs of all lengths. In: Proceedings—IEEE international conference on data mining, ICDM. ICDM, pp 547–556
28.
Zurück zum Zitat Zhu Y, Zimmerman Z, Senobari NS, Yeh CCM, Funning G, Mueen A, Brisk P, Keogh E (2016) Matrix profile II: exploiting a novel algorithm and GPUs to break the one hundred million Barrier for time series motifs and joins. In: 2016 IEEE 16th international conference on data mining (ICDM). IEEE, pp 739–748 Zhu Y, Zimmerman Z, Senobari NS, Yeh CCM, Funning G, Mueen A, Brisk P, Keogh E (2016) Matrix profile II: exploiting a novel algorithm and GPUs to break the one hundred million Barrier for time series motifs and joins. In: 2016 IEEE 16th international conference on data mining (ICDM). IEEE, pp 739–748
30.
Zurück zum Zitat Zhu Q, Batista G, Rakthanmanon T, Keogh E (2012) A novel approximation to dynamic time warping allows anytime clustering of massive time series datasets. In: Proceedings of the 2012 SIAM international conference on data mining, pp 999–1010 Zhu Q, Batista G, Rakthanmanon T, Keogh E (2012) A novel approximation to dynamic time warping allows anytime clustering of massive time series datasets. In: Proceedings of the 2012 SIAM international conference on data mining, pp 999–1010
31.
Zurück zum Zitat Yeh CCM, Zhu Y, Ulanova L, Begum N, Ding Y, Dau HA, Silva DF, Mueen A, Keogh E (2016) Matrix profile I: all pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets. In: 2016 IEEE 16th international conference on data mining (ICDM). IEEE, pp 1317–1322 Yeh CCM, Zhu Y, Ulanova L, Begum N, Ding Y, Dau HA, Silva DF, Mueen A, Keogh E (2016) Matrix profile I: all pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets. In: 2016 IEEE 16th international conference on data mining (ICDM). IEEE, pp 1317–1322
32.
Zurück zum Zitat Silva DF, Batista GEAPA (2016) Speeding up all-pairwise dynamic time warping matrix calculation. In: Proceedings of the 2016 SIAM international conference on data mining. Society for Industrial and Applied Mathematics, Philadelphia, pp 837–845 Silva DF, Batista GEAPA (2016) Speeding up all-pairwise dynamic time warping matrix calculation. In: Proceedings of the 2016 SIAM international conference on data mining. Society for Industrial and Applied Mathematics, Philadelphia, pp 837–845
33.
Zurück zum Zitat Shieh J, Keogh E (2009) ISAX: disk-aware mining and indexing of massive time series datasets. Data Min Knowl Disc 19(1):24–57CrossRef Shieh J, Keogh E (2009) ISAX: disk-aware mining and indexing of massive time series datasets. Data Min Knowl Disc 19(1):24–57CrossRef
34.
Zurück zum Zitat Chavoshi N, Hamooni H, Mueen A (2016) DeBot: Twitter Bot detection via warped correlation. In: 2016 IEEE 16th international conference on data mining (ICDM). IEEE, 12, pp 817–822 Chavoshi N, Hamooni H, Mueen A (2016) DeBot: Twitter Bot detection via warped correlation. In: 2016 IEEE 16th international conference on data mining (ICDM). IEEE, 12, pp 817–822
35.
Zurück zum Zitat Mueen A, Keogh E, Zhu Q, Cash S, Westover B (2009) Exact discovery of time series motifs. In: Proceedings of the 2009 SIAM international conference on data mining, pp 473–484 Mueen A, Keogh E, Zhu Q, Cash S, Westover B (2009) Exact discovery of time series motifs. In: Proceedings of the 2009 SIAM international conference on data mining, pp 473–484
36.
Zurück zum Zitat Yankov D, Keogh E, Medina J, Chiu B, Zordan V (2007) Detecting time series motifs under uniform scaling. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining KDD 07, KDD’07, p 844 Yankov D, Keogh E, Medina J, Chiu B, Zordan V (2007) Detecting time series motifs under uniform scaling. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining KDD 07, KDD’07, p 844
38.
Zurück zum Zitat Herrera RH, Fomel S, van der Baan M (2014) Automatic approaches for seismic to well tying. Interpretation 2(2):SD9–SD17CrossRef Herrera RH, Fomel S, van der Baan M (2014) Automatic approaches for seismic to well tying. Interpretation 2(2):SD9–SD17CrossRef
41.
Zurück zum Zitat Yankov D, Keogh EJ, Rebbapragada U (2007) Disk aware discord discovery: finding unusual time series in terabyte sized datasets. In: ICDM, pp 381–390 Yankov D, Keogh EJ, Rebbapragada U (2007) Disk aware discord discovery: finding unusual time series in terabyte sized datasets. In: ICDM, pp 381–390
42.
Zurück zum Zitat Mueen A, Keogh E, Young N (2011) Logical-shapelets: an expressive primitive for time series classification. In: The 17th ACM SIGKDD international conference, pp 1154–1162 Mueen A, Keogh E, Young N (2011) Logical-shapelets: an expressive primitive for time series classification. In: The 17th ACM SIGKDD international conference, pp 1154–1162
Metadaten
Titel
Speeding up dynamic time warping distance for sparse time series data
verfasst von
Abdullah Mueen
Nikan Chavoshi
Noor Abu-El-Rub
Hossein Hamooni
Amanda Minnich
Jonathan MacCarthy
Publikationsdatum
28.10.2017
Verlag
Springer London
Erschienen in
Knowledge and Information Systems / Ausgabe 1/2018
Print ISSN: 0219-1377
Elektronische ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-017-1119-0

Weitere Artikel der Ausgabe 1/2018

Knowledge and Information Systems 1/2018 Zur Ausgabe