Published in: Advances in Data Analysis and Classification 2/2023

25-06-2022 | Regular Article

Classification based on multivariate mixed type longitudinal data with an application to the EU-SILC database

Authors: Jan Vávra, Arnošt Komárek

Abstract

Although many present-day studies gather data of a diverse nature (numeric quantities, binary indicators or ordered categories) on the same units repeatedly over time, only a limited number of approaches exist in the literature for analysing such so-called mixed-type longitudinal data. We present a statistical model capable of jointly modelling several mixed-type outcomes, which also accounts for possible dependencies among the investigated outcomes. A thresholding approach that links binary or ordinal variables to their latent numeric counterparts allows us to jointly model all numeric outcomes, including the latent ones, using a multivariate version of the linear mixed-effects model. We avoid the assumption of independence between outcomes by relaxing the variance matrix of the random effects to a completely general positive definite matrix. Moreover, we follow model-based clustering methodology and create a mixture of such models to capture heterogeneity in the temporal evolution of the considered outcomes. Such a hierarchical model is estimated using Bayesian principles and Markov chain Monte Carlo methods. After a successful simulation study examining the ability to consistently estimate the true parameter values and thus discover the different patterns, we analysed the EU-SILC dataset of Czech households, each followed for 4 years within the period from 2005 to 2016. Based on estimated classification probabilities, the households were classified into groups with a similar evolution of several closely related indicators of monetary poverty.
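The thresholding idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single ordinal outcome, a random intercept only, and made-up parameter values, whereas the paper's model is multivariate with a general random-effects variance matrix and is estimated by MCMC. The names `threshold`, `simulate_unit` and `cutpoints`, and all numeric values, are hypothetical.

```python
import bisect
import random


def threshold(latent, cutpoints):
    """Map a latent numeric value to an ordinal category 0..len(cutpoints).

    The category is the number of (sorted) cutpoints that are <= latent.
    A binary outcome is the special case of a single cutpoint.
    """
    return bisect.bisect_right(cutpoints, latent)


def simulate_unit(times, intercept, slope, sd_random, sd_noise, cutpoints, rng):
    """Simulate one unit's latent trajectory and the ordinal values observed from it."""
    # Unit-specific random intercept shared by all visits of this unit
    # (assumption: the simplest possible random-effects structure).
    b = rng.gauss(0.0, sd_random)
    latent = [intercept + b + slope * t + rng.gauss(0.0, sd_noise) for t in times]
    observed = [threshold(v, cutpoints) for v in latent]
    return latent, observed


rng = random.Random(1)
cutpoints = [-1.0, 0.0, 1.0]  # hypothetical thresholds giving 4 ordered categories
latent, observed = simulate_unit(
    times=[0, 1, 2, 3],  # four yearly visits, mirroring the 4-year follow-up
    intercept=-0.5,
    slope=0.4,
    sd_random=1.0,
    sd_noise=0.5,
    cutpoints=cutpoints,
    rng=rng,
)
```

In this sketch only `observed` would be available to the analyst. The latent values are treated as unknown and, in a Bayesian fit, would be sampled along with the other parameters.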

Metadata
Title: Classification based on multivariate mixed type longitudinal data with an application to the EU-SILC database
Authors: Jan Vávra, Arnošt Komárek
Publication date: 25-06-2022
Publisher: Springer Berlin Heidelberg
Published in: Advances in Data Analysis and Classification | Issue 2/2023
Print ISSN: 1862-5347
Electronic ISSN: 1862-5355
DOI: https://doi.org/10.1007/s11634-022-00504-8
