Model-based clustering with non-elliptically contoured distributions

Statistics and Computing

Abstract

The majority of the existing literature on model-based clustering deals with symmetric components. When the subpopulations are skewed, the estimated number of groups can be misleading: if symmetric components are assumed, more than one component is needed to describe a single asymmetric group. Existing mixture models, based on multivariate normal and multivariate t distributions, fit symmetric distributions and hence symmetric clusters. In the present paper, we propose the use of finite mixtures of the normal inverse Gaussian distribution (and its multivariate extensions). Such finite mixture models start from a density that allows for skewness and fat tails, generalize the existing models, are tractable and have desirable properties. We examine both the univariate case, to gain insight, and the multivariate case, which is more useful in real applications. EM-type algorithms are described for fitting the models. Real data examples are used to demonstrate the potential of the new model in comparison with existing ones.
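To make the idea of a skewed mixture component concrete, the following is a minimal simulation sketch, not the fitting procedure of the paper. It assumes the common (mu, alpha, beta, delta) parameterization of the univariate normal inverse Gaussian (NIG) distribution and its representation as a normal variance-mean mixture: X given Z = z is N(mu + beta z, z), with Z inverse Gaussian with mean delta / gamma and shape delta^2, where gamma = sqrt(alpha^2 - beta^2). The abstract does not state a parameterization, so this choice, the function names, and the parameter values in the usage note are illustrative assumptions.

```python
import math
import random


def sample_inverse_gaussian(rng, mean, shape):
    """Draw one inverse Gaussian variate (Michael-Schucany-Haas transformation)."""
    nu = rng.gauss(0.0, 1.0)
    y = nu * nu
    root = math.sqrt(4.0 * mean * shape * y + mean * mean * y * y)
    x = mean + (mean * mean * y) / (2.0 * shape) - (mean / (2.0 * shape)) * root
    # Accept the smaller root with probability mean / (mean + x), else its mirror.
    if rng.random() <= mean / (mean + x):
        return x
    return mean * mean / x


def sample_nig(rng, mu, alpha, beta, delta):
    """Draw one NIG variate via its normal variance-mean mixture form (assumed)."""
    gamma = math.sqrt(alpha * alpha - beta * beta)  # requires alpha > |beta|
    z = sample_inverse_gaussian(rng, delta / gamma, delta * delta)
    return mu + beta * z + math.sqrt(z) * rng.gauss(0.0, 1.0)


def sample_nig_mixture(rng, weights, components, n):
    """Draw n values from a finite mixture of NIG components.

    components is a list of (mu, alpha, beta, delta) tuples (illustrative layout).
    """
    draws = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        draws.append(sample_nig(rng, *components[k]))
    return draws


def nig_mixture_mean(weights, components):
    """Theoretical mixture mean: sum of w_k * (mu_k + beta_k * delta_k / gamma_k)."""
    total = 0.0
    for w, (mu, alpha, beta, delta) in zip(weights, components):
        gamma = math.sqrt(alpha * alpha - beta * beta)
        total += w * (mu + beta * delta / gamma)
    return total
```

For example, calling `sample_nig_mixture(random.Random(0), [0.6, 0.4], [(0.0, 2.0, 1.0, 1.0), (5.0, 3.0, -1.0, 2.0)], 20000)` produces two groups, one right-skewed (beta > 0) and one left-skewed (beta < 0). Setting beta = 0 in a component gives a symmetric but still heavy-tailed group.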

Author information

Corresponding author

Correspondence to Dimitris Karlis.

About this article

Cite this article

Karlis, D., Santourian, A. Model-based clustering with non-elliptically contoured distributions. Stat Comput 19, 73–83 (2009). https://doi.org/10.1007/s11222-008-9072-0

  • DOI: https://doi.org/10.1007/s11222-008-9072-0
