Skip to main content
Erschienen in: Knowledge and Information Systems 1/2016

01.04.2016 | Regular Paper

Faster and more accurate classification of time series by exploiting a novel dynamic time warping averaging algorithm

verfasst von: François Petitjean, Germain Forestier, Geoffrey I. Webb, Ann E. Nicholson, Yanping Chen, Eamonn Keogh

Erschienen in: Knowledge and Information Systems | Ausgabe 1/2016

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

A concerted research effort over the past two decades has heralded significant improvements in both the efficiency and effectiveness of time series classification. The consensus that has emerged in the community is that the best solution is a surprisingly simple one. In virtually all domains, the most accurate classifier is the nearest neighbor algorithm with dynamic time warping as the distance measure. The time complexity of dynamic time warping means that successful deployments on resource-constrained devices remain elusive. Moreover, the recent explosion of interest in wearable computing devices, which typically have limited computational resources, has greatly increased the need for very efficient classification algorithms. A classic technique to obtain the benefits of the nearest neighbor algorithm, without inheriting its undesirable time and space complexity, is to use the nearest centroid algorithm. Unfortunately, the unique properties of (most) time series data mean that the centroid typically does not resemble any of the instances, an unintuitive and underappreciated fact. In this paper we demonstrate that we can exploit a recent result by Petitjean et al. to allow meaningful averaging of “warped” time series, which then allows us to create super-efficient nearest “centroid” classifiers that are at least as accurate as their more computationally challenged nearest neighbor relatives. We demonstrate empirically the utility of our approach by comparing it to all the appropriate strawmen algorithms on the ubiquitous UCR Benchmarks and with a case study in supporting insect classification on resource-constrained sensors.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Fußnoten
1
The source code proving the statistical significance is available at [15]; it performs two-tailed Bonferronni–Dunn test to compare pairs of methods NCC to NN [16].
 
2
Note that the cognitive science use of “ensemble” is unrelated to the more familiar machine learning meaning.
 
3
It actually finds the compact multiple alignment [28].
 
4
We use 42 datasets, i.e., all but two of the datasets of the archive; we have excluded the StarLightCurve and FetalECG for computational reasons.
 
5
In case of ties, we assign the average (or fractional) ranking. For example, if there is one winner, two seconds and a loser [1, 2 ,2, 4], then the fractional ranking will be [1, 2.5, 2.5, 4].
 
Literatur
1.
Zurück zum Zitat Wang X, Mueen A, Ding H, Trajcevski G, Scheuermann P, Keogh E (2013) Experimental comparison of representation methods and distance measures for time series data. Data Min Knowl Discov 26(2):275–309MathSciNetCrossRef Wang X, Mueen A, Ding H, Trajcevski G, Scheuermann P, Keogh E (2013) Experimental comparison of representation methods and distance measures for time series data. Data Min Knowl Discov 26(2):275–309MathSciNetCrossRef
2.
Zurück zum Zitat Bagnall A, Lines J (2014) An experimental evaluation of nearest neighbour time series classification. Technical report #CMP-C14-01, Department of Computing Sciences, University of East Anglia, Tech. Rep Bagnall A, Lines J (2014) An experimental evaluation of nearest neighbour time series classification. Technical report #CMP-C14-01, Department of Computing Sciences, University of East Anglia, Tech. Rep
3.
Zurück zum Zitat Xi X, Keogh E, Shelton C, Wei L, Ratanamahatana CA (2006) Fast time series classification using numerosity reduction, in international conference on machine learning, pp 1033–1040 Xi X, Keogh E, Shelton C, Wei L, Ratanamahatana CA (2006) Fast time series classification using numerosity reduction, in international conference on machine learning, pp 1033–1040
4.
Zurück zum Zitat Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2012) Searching and mining trillions of time series subsequences under dynamic time warping, in international conference on knowledge discovery and data mining, pp 262–270 Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2012) Searching and mining trillions of time series subsequences under dynamic time warping, in international conference on knowledge discovery and data mining, pp 262–270
5.
Zurück zum Zitat Assent I, Wichterich M, Krieger R, Kremer H, Seidl T (2009) Anticipatory DTW for efficient similarity search in time series databases. Proc VLDB Endow 2(1):826–837CrossRef Assent I, Wichterich M, Krieger R, Kremer H, Seidl T (2009) Anticipatory DTW for efficient similarity search in time series databases. Proc VLDB Endow 2(1):826–837CrossRef
6.
Zurück zum Zitat Kremer H, Günnemann S, Ivanescu A-M, Assent I, Seidl T (2011) Efficient processing of multiple DTW queries in time series databases. Scientific and statistical database management. Springer, Berlin, pp 150–167CrossRef Kremer H, Günnemann S, Ivanescu A-M, Assent I, Seidl T (2011) Efficient processing of multiple DTW queries in time series databases. Scientific and statistical database management. Springer, Berlin, pp 150–167CrossRef
7.
Zurück zum Zitat Zhuang DE, Li GC, Wong AK (2014) Discovery of temporal associations in multivariate time series. IEEE Trans Knowl Data Eng 26(12):2969–2982 Zhuang DE, Li GC, Wong AK (2014) Discovery of temporal associations in multivariate time series. IEEE Trans Knowl Data Eng 26(12):2969–2982
8.
Zurück zum Zitat Petitjean F, Ketterlin A, Gançarski P (2011) A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognit 44(3):678–693CrossRefMATH Petitjean F, Ketterlin A, Gançarski P (2011) A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognit 44(3):678–693CrossRefMATH
9.
Zurück zum Zitat Petitjean F, Forestier G, Webb GI, Nicholson AE, Chen Y, Keogh E (2014) Dynamic time warping averaging of time series allows faster and more accurate classification, in IEEE international conference on data mining, pp 470–479 Petitjean F, Forestier G, Webb GI, Nicholson AE, Chen Y, Keogh E (2014) Dynamic time warping averaging of time series allows faster and more accurate classification, in IEEE international conference on data mining, pp 470–479
11.
Zurück zum Zitat Tibshirani R, Hastie T, Narasimhan B, Chu G (2002) Diagnosis of multiple cancer types by shrunken centroids of gene expression. Natl Acad Sci 99(10):6567–6572CrossRef Tibshirani R, Hastie T, Narasimhan B, Chu G (2002) Diagnosis of multiple cancer types by shrunken centroids of gene expression. Natl Acad Sci 99(10):6567–6572CrossRef
12.
Zurück zum Zitat Gou J, Yi Z, Du L, Xiong T (2012) A local mean-based k-nearest centroid neighbor classifier. Comput J 55(9):1058–1071CrossRef Gou J, Yi Z, Du L, Xiong T (2012) A local mean-based k-nearest centroid neighbor classifier. Comput J 55(9):1058–1071CrossRef
13.
Zurück zum Zitat Hart PE (1968) The condensed nearest neighbor rule. IEEE Trans Inf Theory 14(03):515–516CrossRef Hart PE (1968) The condensed nearest neighbor rule. IEEE Trans Inf Theory 14(03):515–516CrossRef
14.
Zurück zum Zitat Xi X, Ueno K, Keogh E, Lee D-J (2008) Converting non-parametric distance-based classification to anytime algorithms. Pattern Anal Appl 11(3–4):321–336MathSciNetCrossRef Xi X, Ueno K, Keogh E, Lee D-J (2008) Converting non-parametric distance-based classification to anytime algorithms. Pattern Anal Appl 11(3–4):321–336MathSciNetCrossRef
16.
Zurück zum Zitat Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Lear Res 7:1–30MathSciNetMATH Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Lear Res 7:1–30MathSciNetMATH
17.
Zurück zum Zitat Ariely D (2001) Seeing sets: representation by statistical properties. Psychol Sci 12(2):157–162CrossRef Ariely D (2001) Seeing sets: representation by statistical properties. Psychol Sci 12(2):157–162CrossRef
18.
Zurück zum Zitat Alvarez GA (2011) Representing multiple objects as an ensemble enhances visual cognition. Trends Cognit Sci 15(3):122–131CrossRef Alvarez GA (2011) Representing multiple objects as an ensemble enhances visual cognition. Trends Cognit Sci 15(3):122–131CrossRef
19.
Zurück zum Zitat Jenkins R, Burton A (2008) 100 % accuracy in automatic face recognition. Science 319(5862):435–435CrossRef Jenkins R, Burton A (2008) 100 % accuracy in automatic face recognition. Science 319(5862):435–435CrossRef
20.
Zurück zum Zitat Gusfield D (1997) Algorithms on strings, trees, and sequences: computer science and computational biology, Cambridge University Press, ch. 14 multiple string comparison—The Holy Grail, pp 332–367 Gusfield D (1997) Algorithms on strings, trees, and sequences: computer science and computational biology, Cambridge University Press, ch. 14 multiple string comparison—The Holy Grail, pp 332–367
21.
Zurück zum Zitat Wang L, Jiang T (1994) On the complexity of multiple sequence alignment. J Comput Biol 1(4):337–348CrossRef Wang L, Jiang T (1994) On the complexity of multiple sequence alignment. J Comput Biol 1(4):337–348CrossRef
22.
Zurück zum Zitat Gupta L, Molfese DL, Tammana R, Simos PG (1996) Nonlinear alignment and averaging for estimating the evoked potential. IEEE Trans Biomed Eng 43(4):348–356CrossRef Gupta L, Molfese DL, Tammana R, Simos PG (1996) Nonlinear alignment and averaging for estimating the evoked potential. IEEE Trans Biomed Eng 43(4):348–356CrossRef
24.
Zurück zum Zitat Niennattrakul V, Ratanamahatana CA (2009) Shape averaging under time warping, in IEEE international conference on electrical engineering/electronics, computer, telecommunications and information technology, vol. 2, pp 626–629 Niennattrakul V, Ratanamahatana CA (2009) Shape averaging under time warping, in IEEE international conference on electrical engineering/electronics, computer, telecommunications and information technology, vol. 2, pp 626–629
25.
Zurück zum Zitat Ongwattanakul S, Srisai D (2009) Contrast enhanced dynamic time warping distance for time series shape averaging classification, in International conference on interaction sciences: information technology, culture and human, ACM, pp 976–981 Ongwattanakul S, Srisai D (2009) Contrast enhanced dynamic time warping distance for time series shape averaging classification, in International conference on interaction sciences: information technology, culture and human, ACM, pp 976–981
26.
Zurück zum Zitat Feng D-F, Doolittle RF (1987) Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J Mole Evol 25(4):351–360CrossRef Feng D-F, Doolittle RF (1987) Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J Mole Evol 25(4):351–360CrossRef
28.
Zurück zum Zitat Petitjean F, Gançarski P (2012) Summarizing a set of time series by averaging: from Steiner sequence to compact multiple alignment. Theor Comput Sci 414(1):76–91MathSciNetCrossRefMATH Petitjean F, Gançarski P (2012) Summarizing a set of time series by averaging: from Steiner sequence to compact multiple alignment. Theor Comput Sci 414(1):76–91MathSciNetCrossRefMATH
29.
Zurück zum Zitat Petitjean F, Inglada J, Gançarski P (2012) Satellite image time series analysis under time warping. IEEE Trans Geosci Remote Sens 50(8):3081–3095CrossRef Petitjean F, Inglada J, Gançarski P (2012) Satellite image time series analysis under time warping. IEEE Trans Geosci Remote Sens 50(8):3081–3095CrossRef
31.
Zurück zum Zitat Kranen P, Seidl T (2009) Harnessing the strengths of anytime algorithms for constant data streams. Data Min Knowl Discov 19(2):245–260MathSciNetCrossRef Kranen P, Seidl T (2009) Harnessing the strengths of anytime algorithms for constant data streams. Data Min Knowl Discov 19(2):245–260MathSciNetCrossRef
32.
Zurück zum Zitat Hu B, Rakthanmanon T, Hao Y, Evans S, Lonardi S, Keogh E (2011) Discovering the intrinsic cardinality and dimensionality of time series using MDL, in IEEE international conference on data mining, pp 1086–1091 Hu B, Rakthanmanon T, Hao Y, Evans S, Lonardi S, Keogh E (2011) Discovering the intrinsic cardinality and dimensionality of time series using MDL, in IEEE international conference on data mining, pp 1086–1091
33.
Zurück zum Zitat Ratanamahatana CA, Keogh E (2005) Three myths about dynamic time warping data mining, in SIAM international conference on data mining, pp 506–510 Ratanamahatana CA, Keogh E (2005) Three myths about dynamic time warping data mining, in SIAM international conference on data mining, pp 506–510
34.
Zurück zum Zitat Niennattrakul V, Ratanamahatana CA (2007) Inaccuracies of shape averaging method using dynamic time warping for time series data, in international conference on computational science. Springer, pp 513–520 Niennattrakul V, Ratanamahatana CA (2007) Inaccuracies of shape averaging method using dynamic time warping for time series data, in international conference on computational science. Springer, pp 513–520
35.
Zurück zum Zitat Pekalska E, Duin RP, Paclìk P (2006) Prototype selection for dissimilarity-based classifiers. Pattern Recogni 39(2):189–208CrossRefMATH Pekalska E, Duin RP, Paclìk P (2006) Prototype selection for dissimilarity-based classifiers. Pattern Recogni 39(2):189–208CrossRefMATH
36.
Zurück zum Zitat Ueno K, Xi X, Keogh E, Lee D-J (2006) Anytime classification using the nearest neighbor algorithm with applications to stream mining, in IEEE international conference on data mining, pp 623–632 Ueno K, Xi X, Keogh E, Lee D-J (2006) Anytime classification using the nearest neighbor algorithm with applications to stream mining, in IEEE international conference on data mining, pp 623–632
37.
Zurück zum Zitat Chen Y, Why A, Batista G, Mafra-Neto A, Keogh E (2014) Flying insect classification with inexpensive sensors. J Insect Behav 27(5):657–677CrossRef Chen Y, Why A, Batista G, Mafra-Neto A, Keogh E (2014) Flying insect classification with inexpensive sensors. J Insect Behav 27(5):657–677CrossRef
38.
Zurück zum Zitat Goddard LB, Roth AE, Reisen WK, Scott TW et al (2002) Vector competence of California mosquitoes for west Nile virus. Emerging Infect Dis 8(12):1385–1391CrossRef Goddard LB, Roth AE, Reisen WK, Scott TW et al (2002) Vector competence of California mosquitoes for west Nile virus. Emerging Infect Dis 8(12):1385–1391CrossRef
39.
Zurück zum Zitat Yang Y, Webb GI, Korb K, Ting K-M (2007) Classifying under computational resource constraints: anytime classification using probabilistic estimators. Mach Learn, 69(1):35–53 Yang Y, Webb GI, Korb K, Ting K-M (2007) Classifying under computational resource constraints: anytime classification using probabilistic estimators. Mach Learn, 69(1):35–53
Metadaten
Titel
Faster and more accurate classification of time series by exploiting a novel dynamic time warping averaging algorithm
verfasst von
François Petitjean
Germain Forestier
Geoffrey I. Webb
Ann E. Nicholson
Yanping Chen
Eamonn Keogh
Publikationsdatum
01.04.2016
Verlag
Springer London
Erschienen in
Knowledge and Information Systems / Ausgabe 1/2016
Print ISSN: 0219-1377
Elektronische ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-015-0878-8

Weitere Artikel der Ausgabe 1/2016

Knowledge and Information Systems 1/2016 Zur Ausgabe

Premium Partner