Published in: Advances in Data Analysis and Classification 1/2021

20 May 2020 | Regular Article

Clustering discrete-valued time series

Authors: Tyler Roick, Dimitris Karlis, Paul D. McNicholas


Abstract

Models are needed that can account for the discreteness of data along with its time series properties and correlation. We focus on INteger-valued AutoRegressive (INAR) type models, which can be used with existing model-based clustering techniques to cluster discrete-valued time series data. Because the approach is built on a finite mixture model, several existing techniques apply directly, including selection of the number of clusters, estimation via expectation-maximization, and model selection. The proposed model is demonstrated on real data to illustrate its clustering applications.
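
To make the ingredients named in the abstract concrete, here is a minimal Python sketch of a Poisson INAR(1) process and of the expectation step of a finite mixture built on it. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which INAR variant, innovation distribution, or likelihood is used, so the Poisson innovations, the likelihood conditional on the first observation, and all function and parameter names (`simulate_inar1`, `responsibilities`, `alpha`, `lam`) are assumptions made for this example.

```python
import math
import random


def binomial_thinning(alpha, count, rng):
    """Return alpha ∘ count: how many of `count` units survive, each with probability alpha."""
    return sum(1 for _ in range(count) if rng.random() < alpha)


def poisson_draw(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_inar1(alpha, lam, length, rng):
    """Simulate X_t = alpha ∘ X_{t-1} + e_t with e_t ~ Poisson(lam) (assumed innovations)."""
    # The stationary marginal of a Poisson INAR(1) is Poisson(lam / (1 - alpha)).
    x = poisson_draw(lam / (1 - alpha), rng)
    series = [x]
    for _ in range(length - 1):
        x = binomial_thinning(alpha, x, rng) + poisson_draw(lam, rng)
        series.append(x)
    return series


def transition_prob(k, j, alpha, lam):
    """P(X_t = k | X_{t-1} = j): convolution of Binomial(j, alpha) survivors and Poisson(lam) arrivals."""
    total = 0.0
    for m in range(min(j, k) + 1):
        survive = math.comb(j, m) * alpha**m * (1 - alpha) ** (j - m)
        arrive = math.exp(-lam) * lam ** (k - m) / math.factorial(k - m)
        total += survive * arrive
    return total


def conditional_loglik(series, alpha, lam):
    """Log-likelihood of a series conditional on its first observation (an assumption here)."""
    return sum(
        math.log(transition_prob(k, j, alpha, lam))
        for j, k in zip(series, series[1:])
    )


def responsibilities(series, components):
    """E-step for one series: posterior probability of each (weight, alpha, lam) component."""
    log_terms = [
        math.log(weight) + conditional_loglik(series, alpha, lam)
        for weight, alpha, lam in components
    ]
    peak = max(log_terms)  # subtract the maximum before exponentiating, for stability
    scaled = [math.exp(term - peak) for term in log_terms]
    total = sum(scaled)
    return [value / total for value in scaled]


if __name__ == "__main__":
    rng = random.Random(1)
    series = simulate_inar1(alpha=0.5, lam=1.0, length=100, rng=rng)
    # Two hypothetical clusters: (mixing weight, alpha, lam)
    print(responsibilities(series, [(0.5, 0.5, 1.0), (0.5, 0.1, 4.0)]))
```

In a full expectation-maximization fit, these responsibilities would weight each series' contribution when the mixing weights and the per-cluster `alpha` and `lam` are updated in the maximization step; that step is omitted here because the abstract gives no details about it.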


Metadata
Title
Clustering discrete-valued time series
Authors
Tyler Roick
Dimitris Karlis
Paul D. McNicholas
Publication date
20 May 2020
Publisher
Springer Berlin Heidelberg
Published in
Advances in Data Analysis and Classification / Issue 1/2021
Print ISSN: 1862-5347
Electronic ISSN: 1862-5355
DOI
https://doi.org/10.1007/s11634-020-00395-7
