Published in: Knowledge and Information Systems 2/2019

5 September 2018 | Regular Paper

Similarity measures for time series data classification using grid representation and matrix distance

Authors: Yanqing Ye, Jiang Jiang, Bingfeng Ge, Yajie Dou, Kewei Yang


Abstract

Two similarity measures are proposed that capture both the numerical and the point-distribution characteristics of time series. First, a novel grid representation is presented, with which a time series is segmented and compiled into a matrix. Based on this grid representation, two matrix matching algorithms, matrix-based Euclidean distance (GMED) and matrix-based dynamic time warping (GMDTW), are adapted to measure the similarity of the resulting matrix-form time series. Finally, to assess the effectiveness of the proposed measures, 1NN classification and K-means clustering experiments are conducted on 22 datasets from the UCR time series datasets website. Overall, the results indicate that GMDTW is more accurate than most current measures, while GMED is much more efficient than the dynamic time warping algorithm at equivalent performance. Furthermore, the effects of the parameters of the proposed measures are analyzed, and a way to determine their values is given.
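The abstract does not spell out how the grid is built or how GMED and GMDTW are defined, so the Python sketch below is an illustration under stated assumptions, not the authors' method. It assumes the grid splits the time axis into equal-width columns and the value range into equal-height rows, and that each matrix entry counts the points falling into that cell. The function names `grid_matrix`, `matrix_euclidean` and `matrix_dtw` are hypothetical, and the two distances are a plain entrywise Euclidean distance and a column-wise dynamic time warping standing in for the matrix-based measures named above.

```python
import math


def grid_matrix(series, n_rows, n_cols, lo=None, hi=None):
    """Count the points of `series` in each cell of an n_rows x n_cols grid.

    Assumption: equal-width time columns and equal-height value rows.
    """
    lo = min(series) if lo is None else lo
    hi = max(series) if hi is None else hi
    span = (hi - lo) or 1.0  # guard against a constant series
    length = len(series)
    grid = [[0] * n_cols for _ in range(n_rows)]
    for t, value in enumerate(series):
        col = min(t * n_cols // length, n_cols - 1)
        row = int((value - lo) / span * n_rows)
        row = max(0, min(row, n_rows - 1))  # clamp values on or outside the range
        grid[row][col] += 1
    return grid


def matrix_euclidean(a, b):
    """Entrywise Euclidean distance between two equally sized matrices."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


def matrix_dtw(a, b):
    """Dynamic time warping over matrix columns.

    Assumption: the local cost is the squared Euclidean distance
    between two column vectors.
    """
    cols_a = [list(col) for col in zip(*a)]
    cols_b = [list(col) for col in zip(*b)]
    n, m = len(cols_a), len(cols_b)
    inf = float("inf")
    dp = [[inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = sum((x - y) ** 2 for x, y in zip(cols_a[i - 1], cols_b[j - 1]))
            dp[i][j] = cost + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
    return math.sqrt(dp[n][m])


# Two series with the same bump, shifted in time.
x = [0, 0, 1, 1, 0, 0, 0, 0]
y = [0, 0, 0, 0, 1, 1, 0, 0]
gx = grid_matrix(x, n_rows=2, n_cols=4, lo=0, hi=1)
gy = grid_matrix(y, n_rows=2, n_cols=4, lo=0, hi=1)
print(matrix_euclidean(gx, gy))  # 4.0: the rigid comparison penalizes the shift
print(matrix_dtw(gx, gy))        # 0.0: warping aligns the shifted bump
```

In this toy case the rigid matrix distance treats the shifted bump as a mismatch, while the warped variant absorbs the shift. Its dynamic program is quadratic in the number of grid columns, whereas the rigid distance is a single pass over the cells.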


Metadata
Title
Similarity measures for time series data classification using grid representation and matrix distance
Authors
Yanqing Ye
Jiang Jiang
Bingfeng Ge
Yajie Dou
Kewei Yang
Publication date
5 September 2018
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2019
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-018-1264-0
