Published in: Journal of Classification, Issue 1/2022

22 September 2021

MatTransMix: an R Package for Matrix Model-Based Clustering and Parsimonious Mixture Modeling

Authors: Xuwen Zhu, Shuchismita Sarkar, Volodymyr Melnykov

Abstract

Extending finite mixture modeling to matrix-valued data raises several challenges. A major concern is overparameterization, because a matrix mixture involves a large number of parameters. In addition, an appropriate power transformation is useful when the data are skewed. The R package MatTransMix is new software for fitting parsimonious models, based on the spectral decomposition of covariance matrices, to heterogeneous matrix-valued data and for obtaining model-based clustering results. The package implements a variety of parsimonious models obtained from different combinations of spectral decomposition and skewness parameters. The paper discusses some methodological foundations of the proposed models and illustrates the functions available in the package with carefully chosen examples.
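
To make the overparameterization concern and the role of spectral decomposition concrete, the following is a sketch in notation assumed here for illustration; it is not taken from the paper. It considers a K-component mixture of matrix normal distributions for p × T observations, with a p × p row covariance Σ_k and a T × T column covariance Ψ_k, and shows the standard eigenvalue decomposition on which parsimonious Gaussian mixture families are usually built. The symbols λ_k, Γ_k, Δ_k and the count M are hypothetical labels, and the exact parameterization used by MatTransMix may differ.

```latex
% Spectral decomposition of a component covariance matrix (assumed notation):
% \lambda_k > 0 is the volume, \Gamma_k an orthogonal matrix of eigenvectors
% (orientation), \Delta_k a diagonal matrix with |\Delta_k| = 1 (shape).
\[
  \Sigma_k = \lambda_k \, \Gamma_k \, \Delta_k \, \Gamma_k^{\top},
  \qquad |\Delta_k| = 1 .
\]

% Free parameters of an unconstrained K-component matrix normal mixture for
% p x T observations: mixing proportions, means, and the two covariance
% matrices, minus one scale per component because only the Kronecker
% product \Psi_k \otimes \Sigma_k is identified.
\[
  M = (K - 1) + K\,pT
      + K\left[ \frac{p(p+1)}{2} + \frac{T(T+1)}{2} - 1 \right].
\]
```

Under these assumptions, K = 3 and p = T = 10 give M = 2 + 300 + 327 = 629 parameters, which suggests why constraining the volume, orientation, or shape terms (for example, holding them equal across components) is attractive. Skewness parameters, where included, would add to this count.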

Metadata
Title
MatTransMix: an R Package for Matrix Model-Based Clustering and Parsimonious Mixture Modeling
Authors
Xuwen Zhu
Shuchismita Sarkar
Volodymyr Melnykov
Publication date
22 September 2021
Publisher
Springer US
Published in
Journal of Classification / Issue 1/2022
Print ISSN: 0176-4268
Electronic ISSN: 1432-1343
DOI
https://doi.org/10.1007/s00357-021-09401-9
