Published in: Soft Computing 15/2019

9 June 2018 | Methodologies and Application

Hierarchical clustering of unequal-length time series with area-based shape distance

Authors: Xiao Wang, Fusheng Yu, Witold Pedrycz, Jiayin Wang

Abstract

Time-series clustering algorithms are used in many areas to extract valuable information from large, complex data sets. However, these algorithms suffer from two shortcomings. First, most of them are designed for equal-length time series, whereas unequal-length time series are often encountered in real-world problems. Second, commonly used time-series distance measures cannot fully reveal differences in trend. To overcome these two shortcomings, this paper focuses on the trend of time series and employs the area-based shape distance to measure their similarity. In addition, we present a new hierarchical clustering algorithm for unequal-length time series based on the area-based shape distance measure. A series of experiments illustrates the performance of the proposed clustering algorithm.
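
The general idea of clustering unequal-length series hierarchically with an area-style distance can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define the area-based shape distance, so `area_distance` below is a hypothetical stand-in (the area between two curves after both are linearly resampled onto a common time axis), and `agglomerative` is plain average-linkage clustering, not necessarily the procedure proposed in the paper. All function names are invented for this sketch.

```python
from itertools import combinations


def resample(series, n):
    """Linearly interpolate `series` onto n evenly spaced points of [0, 1]."""
    if len(series) == 1:
        return [float(series[0])] * n
    last = len(series) - 1
    out = []
    for i in range(n):
        pos = i * last / (n - 1)       # fractional index into the series
        lo = min(int(pos), last - 1)   # left neighbour, clamped at the end
        frac = pos - lo
        out.append(series[lo] * (1 - frac) + series[lo + 1] * frac)
    return out


def area_distance(a, b, n=50):
    """ASSUMED stand-in, not the paper's measure: area between two curves
    after rescaling both to a common time axis [0, 1]."""
    ra, rb = resample(a, n), resample(b, n)
    gaps = [abs(x - y) for x, y in zip(ra, rb)]
    step = 1.0 / (n - 1)
    # Trapezoidal rule applied to the absolute gap between the curves
    return sum((gaps[i] + gaps[i + 1]) / 2 * step for i in range(n - 1))


def agglomerative(series_list, k, dist=area_distance):
    """Average-linkage agglomerative clustering, merging until k clusters remain."""
    m = len(series_list)
    # Pairwise distances are computed once and looked up during merging
    d = {(i, j): dist(series_list[i], series_list[j])
         for i, j in combinations(range(m), 2)}
    clusters = [[i] for i in range(m)]

    def linkage(c1, c2):
        total = sum(d[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        # Find the closest pair of clusters (p < q) and merge q into p
        p, q = min(combinations(range(len(clusters)), 2),
                   key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Toy data: two rising and two falling series of different lengths
series = [
    [0, 1, 2, 3, 4],
    [0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4],
    [4, 3, 2, 1, 0],
    [4, 3.4, 2.8, 2.1, 1.4, 0.7, 0],
]
clusters = agglomerative(series, k=2)
print(clusters)
```

On this toy data the rising series end up in one cluster and the falling series in the other, even though no two series in a cluster have the same length. Resampling onto a shared axis is one simple way to compare unequal-length series; the paper's own measure may handle length differences differently.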

Metadata
Title
Hierarchical clustering of unequal-length time series with area-based shape distance
Authors
Xiao Wang
Fusheng Yu
Witold Pedrycz
Jiayin Wang
Publication date
9 June 2018
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 15/2019
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-018-3287-6
