Published in: Advances in Data Analysis and Classification, Issue 2/2023

25.06.2022 | Regular Article

Classification based on multivariate mixed type longitudinal data with an application to the EU-SILC database

Written by: Jan Vávra, Arnošt Komárek

Abstract

Although many present-day studies gather data of a diverse nature (numeric quantities, binary indicators or ordered categories) on the same units repeatedly over time, only a limited number of approaches exist in the literature for analysing such so-called mixed-type longitudinal data. We present a statistical model capable of jointly modelling several mixed-type outcomes, which also accounts for possible dependencies among the investigated outcomes. A thresholding approach that links binary or ordinal variables to their latent numeric counterparts allows us to jointly model all numeric outcomes, including the latent ones, using a multivariate version of the linear mixed-effects model. We avoid the assumption of independence across outcomes by relaxing the variance matrix of the random effects to a completely general positive definite matrix. Moreover, we follow model-based clustering methodology to create a mixture of such models that captures heterogeneity in the temporal evolution of the considered outcomes. Estimation of this hierarchical model follows Bayesian principles and uses Markov chain Monte Carlo methods. After a successful simulation study examining the ability to consistently estimate the true parameter values and thus discover the different patterns, we analysed the EU-SILC dataset consisting of Czech households that were followed for 4 years in a time span from 2005 to 2016. Based on estimated classification probabilities, the households were classified into groups with a similar evolution of several closely related indicators of monetary poverty.
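The thresholding idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the cutpoints, the fixed slope of 0.5 and the unit-variance random intercept are all hypothetical choices made only for illustration. The sketch simulates a latent numeric trajectory for one unit and maps it to an ordered category by comparing it with a set of thresholds.

```python
import bisect
import random


def threshold(latent, cutpoints):
    """Map a latent numeric value to an ordinal category 0..len(cutpoints).

    cutpoints must be sorted in increasing order (hypothetical values here).
    """
    # bisect_right returns how many cutpoints are <= latent,
    # which is exactly the index of the ordered category.
    return bisect.bisect_right(cutpoints, latent)


def simulate_unit(n_visits, cutpoints, rng):
    """Simulate one unit observed n_visits times (illustrative assumptions only)."""
    random_intercept = rng.gauss(0.0, 1.0)  # unit-specific random effect
    rows = []
    for t in range(n_visits):
        # Latent numeric outcome: assumed fixed slope 0.5 plus random intercept plus noise.
        latent = 0.5 * t + random_intercept + rng.gauss(0.0, 1.0)
        rows.append((t, latent, threshold(latent, cutpoints)))
    return rows


if __name__ == "__main__":
    rng = random.Random(0)
    cutpoints = [0.0, 1.5]  # two hypothetical thresholds give three ordered categories
    for t, latent, category in simulate_unit(4, cutpoints, rng):
        print(t, round(latent, 2), category)
```

A binary indicator is the special case with a single cutpoint. In a model of the kind the abstract describes, only the category would be observed, and the latent value would be treated as unknown during estimation.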

Metadata
Title
Classification based on multivariate mixed type longitudinal data with an application to the EU-SILC database
Authors
Jan Vávra
Arnošt Komárek
Publication date
25.06.2022
Publisher
Springer Berlin Heidelberg
Published in
Advances in Data Analysis and Classification / Issue 2/2023
Print ISSN: 1862-5347
Electronic ISSN: 1862-5355
DOI
https://doi.org/10.1007/s11634-022-00504-8