Skip to main content
Erschienen in: Knowledge and Information Systems 1/2015

01.10.2015 | Regular Paper

Enumeration of time series motifs of all lengths

verfasst von: Abdullah Mueen, Nikan Chavoshi

Erschienen in: Knowledge and Information Systems | Ausgabe 1/2015

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Time series motifs are repeated patterns in long and noisy time series. Motifs are typically used to understand the dynamics of the source because repeated patterns with high similarity evidentially rule out the presence of noise. Recently, time series motifs have also been used for clustering, summarization, rule discovery and compression as features. For all such purposes, many high-quality motifs of various lengths are desirable and thus originate the problem of enumerating motifs for a wide range of lengths. Existing algorithms find motifs for a given length. A trivial way to enumerate motifs is to run one of the algorithms for the whole range of lengths. However, such parameter sweep is computationally infeasible for large real datasets. In this paper, we describe an exact algorithm, called \({\textit{MOEN}}\), to enumerate motifs. The algorithm is an order of magnitude faster than the naive algorithm. The algorithm frees us from re-discovering the same motif at different lengths and tuning multiple data-dependent parameters. The speedup comes from using a novel bound on the similarity function across lengths and the algorithm uses only linear space unlike other motif discovery algorithms. We also describe an approximate extension of MOEN algorithm that is faster and suitable for larger datasets. We describe five case studies in entomology, sensor fusion, power consumption monitoring and activity recognition where \({\textit{MOEN}}\) enumerates several high-quality motifs.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
4.
Zurück zum Zitat Brown AEX, Yemini EI, Grundy LJ, Jucikas T, Schafer WR (2013) A dictionary of behavioral motifs reveals clusters of genes affecting caenorhabditis elegans locomotion. Proc Natl Acad Sci 110(2):791–796CrossRef Brown AEX, Yemini EI, Grundy LJ, Jucikas T, Schafer WR (2013) A dictionary of behavioral motifs reveals clusters of genes affecting caenorhabditis elegans locomotion. Proc Natl Acad Sci 110(2):791–796CrossRef
5.
Zurück zum Zitat Cassisi C, Aliotta M, Cannata A, Montalto P, Patanè D, Pulvirenti A, Spampinato L (2013) Motif discovery on seismic amplitude time series: the case study of Mt. Etna 2011 eruptive activity. Pure Appl Geophys 170(4):529–545 Cassisi C, Aliotta M, Cannata A, Montalto P, Patanè D, Pulvirenti A, Spampinato L (2013) Motif discovery on seismic amplitude time series: the case study of Mt. Etna 2011 eruptive activity. Pure Appl Geophys 170(4):529–545
7.
Zurück zum Zitat Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE (2000) Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals. Circulation 101(23):e215–e220CrossRef Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE (2000) Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals. Circulation 101(23):e215–e220CrossRef
8.
Zurück zum Zitat Keogh E, Kasetty S (2003) On the need for time series data mining benchmarks: a survey and empirical demonstration. Data Min Knowl Discov 7(4):349–371MathSciNetCrossRef Keogh E, Kasetty S (2003) On the need for time series data mining benchmarks: a survey and empirical demonstration. Data Min Knowl Discov 7(4):349–371MathSciNetCrossRef
9.
Zurück zum Zitat Lam HT, Pham ND, Calders T (2011) Online discovery of top-k similar motifs in time series data. In: SIAM Conference on Data Mining, SDM ’11 Lam HT, Pham ND, Calders T (2011) Online discovery of top-k similar motifs in time series data. In: SIAM Conference on Data Mining, SDM ’11
11.
Zurück zum Zitat Lin J, Keogh E, Lonardi S, Patel P (2002) Finding motifs in time series. In: Proceedings of 2nd workshop on temporal data mining at KDD, pp 53–68 Lin J, Keogh E, Lonardi S, Patel P (2002) Finding motifs in time series. In: Proceedings of 2nd workshop on temporal data mining at KDD, pp 53–68
12.
Zurück zum Zitat Makonin S, Popowich F, Bartram L, Gill B, Bajic IV (2013) AMPds: a public dataset for load disaggregation and eco-feedback research. In: Electrical power and energy conference (EPEC), 2013 IEEE, pp 1–6 Makonin S, Popowich F, Bartram L, Gill B, Bajic IV (2013) AMPds: a public dataset for load disaggregation and eco-feedback research. In: Electrical power and energy conference (EPEC), 2013 IEEE, pp 1–6
13.
Zurück zum Zitat Mäntyjärvi J, Himberg J, Kangas P, Tuomela U, Huuskonen P (2004) Sensor signal data set for exploring context recognition of mobile devices. In: Workshop “Benchmarks and a database for context recognition” in conjunction with the 2nd international conference on pervasive computing (PERVASIVE 2004) Mäntyjärvi J, Himberg J, Kangas P, Tuomela U, Huuskonen P (2004) Sensor signal data set for exploring context recognition of mobile devices. In: Workshop “Benchmarks and a database for context recognition” in conjunction with the 2nd international conference on pervasive computing (PERVASIVE 2004)
14.
Zurück zum Zitat Mueen A (2013) Enumeration of time series motifs of all lengths. ICDM Mueen A (2013) Enumeration of time series motifs of all lengths. ICDM
15.
Zurück zum Zitat Mueen A, Keogh E (2010) Online discovery and maintenance of time series motifs. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’10. ACM, Washington, DC, pp 1089–1098. ISBN: 978-1-4503-0055-1 Mueen A, Keogh E (2010) Online discovery and maintenance of time series motifs. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’10. ACM, Washington, DC, pp 1089–1098. ISBN: 978-1-4503-0055-1
16.
Zurück zum Zitat Mueen A, Keogh E (2011) Logical-shapelets: an expressive primitive for time series classification. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’11. ACM, New York, pp 1154–1162. ISBN: 978-1-4503-0813-7 Mueen A, Keogh E (2011) Logical-shapelets: an expressive primitive for time series classification. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’11. ACM, New York, pp 1154–1162. ISBN: 978-1-4503-0813-7
18.
Zurück zum Zitat Mueen A, Nath S, Liu J (2010) Fast approximate correlation for massive time-series data. In: SIGMOD conference, pp. 171–182 Mueen A, Nath S, Liu J (2010) Fast approximate correlation for massive time-series data. In: SIGMOD conference, pp. 171–182
19.
Zurück zum Zitat Narang A, Bhattcherjee S (2011) Real-time approximate range motif discovery & data redundancy removal algorithm. In: Proceedings of the 14th international conference on extending database technology, EDBT/ICDT ’11. ACM, New York, pp 485–496. ISBN: 978-1-4503-0528-0 Narang A, Bhattcherjee S (2011) Real-time approximate range motif discovery & data redundancy removal algorithm. In: Proceedings of the 14th international conference on extending database technology, EDBT/ICDT ’11. ACM, New York, pp 485–496. ISBN: 978-1-4503-0528-0
20.
Zurück zum Zitat Nunthanid P, Niennattrakul V, Ratanamahatana C (2011) Discovery of variable length time series motif. In: 2011 8th international conference on electrical engineering/electronics, computer, telecommunications and information technology (ECTI-CON), pp 472–475 Nunthanid P, Niennattrakul V, Ratanamahatana C (2011) Discovery of variable length time series motif. In: 2011 8th international conference on electrical engineering/electronics, computer, telecommunications and information technology (ECTI-CON), pp 472–475
21.
Zurück zum Zitat Patel P, Keogh E, Lin J, Lonardi S (2002) Mining motifs in massive time series databases. In: Proceedings of the IEEE international conference on data mining, ICDM Patel P, Keogh E, Lin J, Lonardi S (2002) Mining motifs in massive time series databases. In: Proceedings of the IEEE international conference on data mining, ICDM
22.
Zurück zum Zitat Pohl H, Hadjakos A (2010) Dance pattern recognition using dynamic time warping. In: SMC 2010 proceedings, pp 183–190 Pohl H, Hadjakos A (2010) Dance pattern recognition using dynamic time warping. In: SMC 2010 proceedings, pp 183–190
23.
Zurück zum Zitat Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2012) Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’12. ACM, New York, pp 262–270. ISBN: 978-1-4503-1462-6 Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2012) Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’12. ACM, New York, pp 262–270. ISBN: 978-1-4503-1462-6
24.
Zurück zum Zitat Rakthanmanon T, Keogh EJ, Lonardi S, Evans S (2011) Time series epenthesis: clustering time series streams requires ignoring some data. In: Proceedings of the 2011 IEEE 11th international conference on data mining, ICDM ’11, pp 547–556 Rakthanmanon T, Keogh EJ, Lonardi S, Evans S (2011) Time series epenthesis: clustering time series streams requires ignoring some data. In: Proceedings of the 2011 IEEE 11th international conference on data mining, ICDM ’11, pp 547–556
25.
Zurück zum Zitat Sakurai Y, Papadimitriou S, Faloutsos C (2005) Braid: stream mining through group lag correlations. In: SIGMOD conference, pp 599–610 Sakurai Y, Papadimitriou S, Faloutsos C (2005) Braid: stream mining through group lag correlations. In: SIGMOD conference, pp 599–610
26.
Zurück zum Zitat Tanaka Y, Iwamoto K, Uehara K (2005) Discovery of time-series motif from multi-dimensional data based on MDL principle. Mach Learn 58:269–300MATHCrossRef Tanaka Y, Iwamoto K, Uehara K (2005) Discovery of time-series motif from multi-dimensional data based on MDL principle. Mach Learn 58:269–300MATHCrossRef
27.
Zurück zum Zitat Tang H, Liao SS (2008) Discovering original motifs with different lengths from time series. Knowl Based Syst 21:666–671CrossRef Tang H, Liao SS (2008) Discovering original motifs with different lengths from time series. Knowl Based Syst 21:666–671CrossRef
28.
Zurück zum Zitat Yankov D, Keogh E, Medina J, Chiu B, Zordan V (2007) Detecting time series motifs under uniform scaling. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’07, pp 844–853 Yankov D, Keogh E, Medina J, Chiu B, Zordan V (2007) Detecting time series motifs under uniform scaling. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’07, pp 844–853
29.
Zurück zum Zitat Ye L, Keogh E (2009) Time series shapelets: a new primitive for data mining. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD, pp 947–956 Ye L, Keogh E (2009) Time series shapelets: a new primitive for data mining. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD, pp 947–956
30.
Zurück zum Zitat Yingchareonthawornchai S, Sivaraks H, Rakthanmanon T, Ratanamahatana C (2013) Efficient proper length time series motif discovery. In: 2013 IEEE 13th international conference on data mining (ICDM), pp 1265–1270 Yingchareonthawornchai S, Sivaraks H, Rakthanmanon T, Ratanamahatana C (2013) Efficient proper length time series motif discovery. In: 2013 IEEE 13th international conference on data mining (ICDM), pp 1265–1270
Metadaten
Titel
Enumeration of time series motifs of all lengths
verfasst von
Abdullah Mueen
Nikan Chavoshi
Publikationsdatum
01.10.2015
Verlag
Springer London
Erschienen in
Knowledge and Information Systems / Ausgabe 1/2015
Print ISSN: 0219-1377
Elektronische ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-014-0793-4

Weitere Artikel der Ausgabe 1/2015

Knowledge and Information Systems 1/2015 Zur Ausgabe