Published in: Data Mining and Knowledge Discovery 1/2020

18.11.2019

FastEE: Fast Ensembles of Elastic Distances for time series classification

Authors: Chang Wei Tan, François Petitjean, Geoffrey I. Webb



Abstract

In recent years, many new ensemble-based time series classification (TSC) algorithms have been proposed, each significantly more accurate than its predecessors. The Hierarchical Vote Collective of Transformation-based Ensembles (HIVE-COTE) is currently the most accurate TSC algorithm when assessed on the UCR repository. It is a meta-ensemble of 5 state-of-the-art ensemble-based classifiers. The time complexity of HIVE-COTE, particularly for training, is prohibitive for most datasets. There is thus a critical need to speed up the classifiers that compose HIVE-COTE. This paper focuses on speeding up one of its components: Ensembles of Elastic Distances (EE), the classifier that leverages decades of research into the development of time-dedicated measures. Training EE can be prohibitive for many datasets. For example, it takes a month on the ElectricDevices dataset with 9000 instances, because EE needs to cross-validate the hyper-parameters used for the 11 similarity measures it encompasses. In this work, Fast Ensembles of Elastic Distances (FastEE) is proposed to train EE faster. It comes in two versions. The exact version makes it possible to train EE 10 times faster. The approximate version is 40 times faster than EE without significantly impacting classification accuracy. This translates to being able to train EE on ElectricDevices in 13 hours.
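
To make the cost of this cross-validation concrete, the following is a minimal sketch of a naive hyper-parameter search for a single elastic measure. It is not the authors' implementation and not the FastEE speed-up. It assumes dynamic time warping (DTW) with a warping-window parameter as the example measure, a 1-nearest-neighbour classifier, and leave-one-out cross-validation. The function names `dtw`, `loocv_accuracy` and `select_window` are hypothetical. With n training series and k candidate values, this naive search computes n × (n − 1) × k distances, each quadratic in the series length when the window is unconstrained.

```python
import math


def dtw(a, b, window):
    """DTW distance (squared point cost) restricted to a band of half-width `window`."""
    n, m = len(a), len(b)
    w = max(window, abs(n - m))  # the band must be wide enough to reach the last cell
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def loocv_accuracy(series, labels, window):
    """Leave-one-out accuracy of a 1-NN classifier for one candidate window."""
    correct = 0
    for i, query in enumerate(series):
        best_dist, best_label = math.inf, None
        for j, candidate in enumerate(series):
            if i == j:
                continue  # hold out the query itself
            d = dtw(query, candidate, window)
            if d < best_dist:
                best_dist, best_label = d, labels[j]
        if best_label == labels[i]:
            correct += 1
    return correct / len(series)


def select_window(series, labels, windows):
    """Score every candidate window; ties go to the smallest window."""
    scores = {w: loocv_accuracy(series, labels, w) for w in windows}
    best = max(windows, key=lambda w: (scores[w], -w))
    return best, scores


# Toy data (illustrative only): class "a" has a peak at shifting positions.
series = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
labels = ["a", "a", "b", "b"]
best, scores = select_window(series, labels, [0, 1])
print(best, scores)  # 1 {0: 0.5, 1: 1.0}
```

On the toy data, a window of 0 (no warping) misclassifies both shifted peaks, while a window of 1 aligns them. Repeating such a search for every measure in an ensemble, over many candidate values and thousands of instances, is what makes training expensive.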

Metadata
Title: FastEE: Fast Ensembles of Elastic Distances for time series classification
Authors: Chang Wei Tan, François Petitjean, Geoffrey I. Webb
Publication date: 18.11.2019
Publisher: Springer US
Published in: Data Mining and Knowledge Discovery, Issue 1/2020
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI: https://doi.org/10.1007/s10618-019-00663-x
