Finite mixtures of canonical fundamental skew \(t\)-distributions

The unification of the restricted and unrestricted skew \(t\)-mixture models

Statistics and Computing

Abstract

This paper introduces a finite mixture of canonical fundamental skew \(t\) (CFUST) distributions for a model-based approach to clustering where the clusters are asymmetric and possibly long-tailed (Lee and McLachlan, arXiv:1401.8182 [statME], 2014b). The family of CFUST distributions includes the restricted and unrestricted multivariate skew \(t\) distributions as special cases. In recent years, several versions of the multivariate skew \(t\) (MST) mixture model have been put forward, together with various EM-type algorithms for parameter estimation. These formulations adopted either a restricted or an unrestricted characterization for their MST densities. In this paper, we examine a natural generalization of these developments, employing the CFUST distribution as the parametric family for the component distributions, and point out that the restricted and unrestricted characterizations can be unified under this general formulation. We show that an exact implementation of the EM algorithm can be achieved for the CFUST distribution and mixtures of this distribution, and present some new analytical results for a conditional expectation involved in the E-step.
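The abstract does not state how a CFUST variate is constructed, so the following is only an illustrative sketch. It assumes the convolution-type representation often used for fundamental skew \(t\) distributions: \(Y = \mu + \Delta |U_0| + U_1\), where, given a gamma-distributed scale \(w\), \(U_0\) is a \(q\)-dimensional standard normal scaled by \(1/\sqrt{w}\) and \(U_1\) is a \(p\)-dimensional normal with covariance \(\Sigma / w\). The function name `sample_cfust` and all parameter names are hypothetical and are not taken from the paper. Under this assumption, a \(p \times 1\) skewness matrix \(\Delta\) corresponds to the restricted characterization and a diagonal \(p \times p\) matrix to the unrestricted one.

```python
import numpy as np


def sample_cfust(n, mu, sigma, delta, nu, rng=None):
    """Draw n samples under an assumed CFUST stochastic representation.

    mu:    location vector, length p
    sigma: p x p scale matrix
    delta: p x q skewness matrix
    nu:    degrees of freedom (> 0)
    """
    rng = np.random.default_rng(rng)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    delta = np.asarray(delta, dtype=float)
    p, q = delta.shape

    # Shared scale variable w ~ Gamma(shape = nu/2, rate = nu/2).
    w = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=n)
    inv_sqrt_w = 1.0 / np.sqrt(w)[:, None]

    # Half-normal latent skewing variables |U0| (q-dimensional).
    u0 = np.abs(rng.standard_normal((n, q))) * inv_sqrt_w
    # Symmetric error term U1 (p-dimensional).
    u1 = rng.multivariate_normal(np.zeros(p), sigma, size=n) * inv_sqrt_w

    return mu + u0 @ delta.T + u1


# Restricted form: a single latent skewing variable (delta is p x 1).
y_restricted = sample_cfust(1000, [0.0, 0.0], np.eye(2), [[2.0], [0.0]], nu=5, rng=1)

# Unrestricted form: one skewing variable per dimension (delta is diagonal).
y_unrestricted = sample_cfust(1000, [0.0, 0.0], np.eye(2), np.diag([2.0, -1.0]), nu=5, rng=1)
```

Sharing a single \(w\) between the skewing and error terms is what gives the draws heavy tails jointly rather than coordinate by coordinate; a mixture sampler would first pick a component label and then call such a routine with that component's parameters.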

References

  • Aas, K., Haff, I.H.: The generalized hyperbolic skew Student’s \(t\)-distribution. J. Financ. Econom. 4, 275–309 (2005)

  • Aghaeepour, N., Finak, G., The FLOWCAP Consortium, The DREAM Consortium, Hoos, H., Mosmann, T., Gottardo, R., Brinkman, R.R., Scheuermann, R.H.: Critical assessment of automated flow cytometry analysis techniques. Nat. Methods 10, 228–238 (2013)

  • Anderson, E.: The irises of the Gaspé Peninsula. Bull. Am. Iris Soc. 59, 2–5 (1935)

  • Arellano-Valle, R.B., Azzalini, A.: On the unification of families of skew-normal distributions. Scand. J. Stat. 33, 561–574 (2006)

  • Arellano-Valle, R.B., Genton, M.G.: On fundamental skew distributions. J. Multivar. Anal. 96, 93–116 (2005)

  • Arellano-Valle, R.B., Branco, M.D., Genton, M.G.: A unified view on skewed distributions arising from selections. Can. J. Stat. 34, 581–601 (2006)

  • Asparouhov, T., Muthén, B.: Structural equation models and mixture models with continuous non-normal skewed distributions. Mplus Web Notes 19, 1–49 (2014)

  • Azzalini, A.: The skew-normal distribution and related multivariate families. Scand. J. Stat. 32, 159–188 (2005)

  • Azzalini, A.: The Skew-Normal and Related Families. Institute of Mathematical Statistics Monographs, Cambridge University Press, Cambridge (2014)

  • Banfield, J.D., Raftery, A.E.: Model-based Gaussian and non-Gaussian clustering. Biometrics 49, 803–821 (1993)

  • Bernardi, M.: Risk measures for skew normal mixtures. Stat. Probab. Lett. 83, 1819–1824 (2013)

  • Böhning, D.: Computer-Assisted Analysis of Mixtures and Applications: Meta-Analysis, Disease Mapping and Others. Chapman and Hall, London (1999)

  • Böhning, D., Dietz, E., Schaub, R., Schlattmann, P., Lindsay, B.: The distribution of the likelihood ratio for mixtures of densities from the one-parameter exponential family. Ann. Inst. Stat. Math. 46, 373–388 (1994)

  • Browne, R.P., McNicholas, P.D.: A mixture of generalized hyperbolic distributions. arXiv:1305.1036 [statME] (2013)

  • Cabral, C.S., Lachos, V.H., Prates, M.O.: Multivariate mixture modeling using skew-normal independent distributions. Comput. Stat. Data Anal. 56, 126–142 (2012)

  • Contreras-Reyes, J.E., Arellano-Valle, R.B.: Growth estimates of cardinalfish (Epigonus crassicaudus) based on scale mixtures of skew-normal distributions. Fish. Res. 147, 137–144 (2013)

  • Cook, R.D., Weisberg, S.: An Introduction to Regression Graphics. Wiley, New York (1994)

  • Everitt, B.S., Hand, D.J.: Finite Mixture Distributions. Chapman and Hall, London (1981)

  • Fisher, R.A.: The use of multiple measurements in taxonomic problems. Ann. Eugen. 7, 179–188 (1936)

  • Forbes, F., Wraith, D.: A new family of multivariate heavy-tailed distributions with variable marginal amounts of tailweight: application to robust clustering. Stat. Comput. (2013). doi:10.1007/s11222-013-9414-4

  • Fraley, C., Raftery, A.E.: How many clusters? Which clustering methods? Answers via model-based cluster analysis. Comput. J. 41, 578–588 (1999)

  • Franczak, B.C., Browne, R.P., McNicholas, P.D.: Mixtures of shifted asymmetric Laplace distributions. IEEE Trans. Pattern Anal. Mach. Intell. (2013). doi:10.1109/TPAMI.2013.216

  • Frühwirth-Schnatter, S.: Finite Mixture and Markov Switching Models. Springer, New York (2006)

  • Frühwirth-Schnatter, S., Pyne, S.: Bayesian inference for finite mixtures of univariate and multivariate skew-normal and skew-\(t\) distributions. Biostatistics 11, 317–336 (2010)

  • Genton, M.G. (ed.): Skew-Elliptical Distributions and Their Applications: A Journey Beyond Normality. Chapman and Hall, London (2004)

  • Ho, H.J., Lin, T.I., Chang, H.H., Haase, H.B., Huang, S., Pyne, S.: Parametric modeling of cellular state transitions as measured with flow cytometry. BMC Bioinform. 13(Suppl 5), S5 (2012a)

  • Ho, H.J., Lin, T.I., Chen, H.Y., Wang, W.L.: Some results on the truncated multivariate \(t\) distribution. J. Stat. Plan. Inference 142, 25–40 (2012b)

  • Hu, X., Kim, H., Brennan, P.J., Han, B., Baecher-Allan, C.M., De Jager, P.L., Brenner, M.B., Raychaudhuri, S.: Application of user-guided automated cytometric data analysis to large-scale immunoprofiling of invariant natural killer T cells. Proc. Natl. Acad. Sci. USA 110, 19030–19035 (2013). doi:10.1073/pnas.1318322110

  • Hubert, L., Arabie, P.: Comparing partitions. J. Classif. 2, 193–218 (1985)

  • Karlis, D., Santourian, A.: Model-based clustering with non-elliptically contoured distributions. Stat. Comput. 19, 73–83 (2009)

  • Lee, S., McLachlan, G.J.: On the fitting of mixtures of multivariate skew \(t\)-distributions via the EM algorithm. arXiv:1109.4706 [statME] (2011)

  • Lee, S., McLachlan, G.J.: Finite mixtures of multivariate skew \(t\)-distributions: some recent and new results. Stat. Comput. 24, 181–202 (2014a)

  • Lee, S.X., McLachlan, G.J.: Model-based clustering and classification with non-normal mixture distributions. Stat. Methods Appl. 22, 427–454 (2013a)

  • Lee, S.X., McLachlan, G.J.: Modelling asset return using multivariate asymmetric mixture models with applications to estimation of value-at-risk. In: Piantadosi, J., Anderssen, R.S., Boland, J. (eds.) MODSIM 2013 (20th International Congress on Modelling and Simulation), pp. 1228–1234. Adelaide (2013b)

  • Lee, S.X., McLachlan, G.J.: On mixtures of skew-normal and skew \(t\)-distributions. Adv. Data Anal. Classif. 7, 241–266 (2013c)

  • Lee, S.X., McLachlan, G.J.: Maximum likelihood estimation for finite mixtures of canonical fundamental skew \(t\)-distributions: the unification of the unrestricted and restricted skew \(t\)-mixture models. arXiv:1401.8182 [statME] (2014b)

  • Lee, Y.W., Poon, S.H.: Systemic and systematic factors for loan portfolio loss distribution. Econometrics and applied economics workshops pp. 1–61. School of Social Science, University of Manchester (2011)

  • Lin, T.I.: Robust mixture modeling using multivariate skew \(t\) distribution. Stat. Comput. 20, 343–356 (2010)

  • Lindsay, B.G.: Mixture Models: Theory, Geometry, and Applications. NSF-CBMS Regional Conference Series in Probability and Statistics, vol. 5. Institute of Mathematical Statistics and the American Statistical Association, Alexandria (1995)

  • McLachlan, G.J., Basford, K.E.: Mixture Models: Inference and Applications. Marcel Dekker, New York (1988)

  • McLachlan, G.J., Krishnan, T.: The EM Algorithm and Extensions. Wiley, New York (1997)

  • McLachlan, G.J., Peel, D.: Finite Mixture Models. Wiley Series in Probability and Statistics, New York (2000)

  • McNicholas, P.D., Murphy, T.B., McDaid, A.F., Frost, D.: Serial and parallel implementations of model-based clustering via parsimonious Gaussian mixture models. Comput. Stat. Data Anal. 54, 711–723 (2010)

  • Mengersen, K.L., Robert, C.P., Titterington, D.M.: Mixtures: Estimation and Applications. Wiley, New York (2011)

  • Murray, P.M., Browne, R.P., McNicholas, P.D.: Mixtures of skew-\(t\) factor analyzers. Comput. Stat. Data Anal. 77, 326–335 (2014)

  • Pyne, S., Hu, X., Wang, K., Rossin, E., Lin, T.I., Maier, L.M., Baecher-Allan, C., McLachlan, G.J., Tamayo, P., Hafler, D.A., De Jager, P.L., Mesirov, J.P.: Automated high-dimensional flow cytometric data analysis. Proc. Natl. Acad. Sci. USA 106, 8519–8524 (2009)

  • Pyne, S., Lee, S.X., Wang, K., Irish, J., Tamayo, P., Nazaire, M.D., Duong, T., Ng, S.K., Hafler, D., Levy, R., Nolan, G.P., Mesirov, J., McLachlan, G.: Joint modeling and registration of cell populations in cohorts of high-dimensional flow cytometric data. PLoS One 9, e100334 (2014). doi:10.1371/journal.pone.0100334

  • Riggi, S., Ingrassia, S.: A model-based clustering approach for mass composition analysis of high energy cosmic rays. Astropart. Phys. 48, 86–96 (2013)

  • Rossin, E., Lin, T.I., Ho, H.J., Mentzer, S.J., Pyne, S.: A framework for analytical characterization of monoclonal antibodies based on reactivity profiles in different tissues. Bioinformatics 27, 2746–2753 (2011)

  • Sahu, S.K., Dey, D.K., Branco, M.D.: A new class of multivariate skew distributions with applications to Bayesian regression models. Can. J. Stat. 31, 129–150 (2003)

  • Sahu, S.K., Dey, D.K., Branco, M.D.: Erratum: a new class of multivariate skew distributions with applications to Bayesian regression models. Can. J. Stat. 37, 301–302 (2009)

  • Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978)

  • Soltyk, S., Gupta, R.: Application of the multivariate skew normal mixture model with the EM algorithm to value-at-risk. In: Chan, F., Marinova, D., Anderssen, R.S. (eds.) MODSIM 2011 (19th International Congress on Modelling and Simulation), pp. 1638–1644. Perth (2011)

  • Titterington, D.M., Smith, A.F.M., Makov, U.E.: Statistical Analysis of Finite Mixture Distributions. Wiley, New York (1985)

  • Tortora, C., Franczak, B.C., Browne, R.P., McNicholas, P.D.: Model-based clustering using mixtures of coalesced generalized hyperbolic distributions. Preprint arXiv:1403.2332 [statME] (2014)

  • Vrbik, I., McNicholas, P.D.: Analytic calculations for the EM algorithm for multivariate skew \(t\)-mixture models. Stat. Probab. Lett. 82, 1169–1174 (2012)

  • Wang, K., Ng, S.K., McLachlan, G.J.: Multivariate skew \(t\) mixture models: applications to fluorescence-activated cell sorting data. In: Shi, H., Zhang, Y., Bottema, M.J., Lovell, B.C., Maeder, A.J. (eds.) DICTA 2009 (Conference of Digital Image Computing: Techniques and Applications, Melbourne), pp. 526–531. IEEE Computer Society, Los Alamitos (2009)

  • Wendel, J.G.: Note on the gamma function. Am. Math. Mon. 55, 563–564 (1948)

  • Wraith, D., Forbes, F.: Clustering using skewed multivariate heavy tailed distributions with flexible tail behaviour. Preprint. arXiv:1408.0711 [statME] (2014)

Acknowledgments

We would like to thank Professor Seung-Gu Kim for helpful comments on this topic. The work of the authors was supported by an Australian Research Council Discovery Grant.

Cite this article

Lee, S.X., McLachlan, G.J. Finite mixtures of canonical fundamental skew \(t\)-distributions. Stat Comput 26, 573–589 (2016). https://doi.org/10.1007/s11222-015-9545-x

  • DOI: https://doi.org/10.1007/s11222-015-9545-x
