Bayesian analysis of mixture modelling using the multivariate t distribution

Published in Statistics and Computing

Abstract

A finite mixture model using the multivariate t distribution has been shown to be a robust extension of normal mixtures. In this paper, we present a Bayesian approach for inference about the parameters of t-mixture models. The prior distributions are specified to be weakly informative so that the resulting posterior distributions remain integrable. We present two efficient EM-type algorithms for computing the joint posterior mode with the observed data and an incomplete future vector as the sample. Markov chain Monte Carlo sampling schemes are also developed to obtain the target posterior distribution of the parameters. The advantages of the Bayesian approach over the maximum likelihood method are demonstrated via a set of real data.
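
For orientation, the density of a t-mixture model in its standard textbook form can be written as below. The notation is an assumption made here for illustration and is not taken from the paper: g is the number of components, w_i are the mixing proportions, p is the dimension of the observation y, and mu_i, Sigma_i and nu_i are the location vector, scale matrix and degrees of freedom of component i.

```latex
% Mixture of g multivariate t components (notation assumed, not the paper's)
f(y \mid \Theta) = \sum_{i=1}^{g} w_i \, t_p(y ; \mu_i, \Sigma_i, \nu_i),
\qquad w_i \ge 0, \quad \sum_{i=1}^{g} w_i = 1,

% Density of a single p-variate t component
t_p(y ; \mu, \Sigma, \nu) =
  \frac{\Gamma\!\left(\frac{\nu + p}{2}\right)}
       {\Gamma\!\left(\frac{\nu}{2}\right) (\pi \nu)^{p/2} \, |\Sigma|^{1/2}}
  \left( 1 + \frac{\delta(y ; \mu, \Sigma)}{\nu} \right)^{-\frac{\nu + p}{2}},

% Squared Mahalanobis distance between y and mu
\delta(y ; \mu, \Sigma) = (y - \mu)^{\top} \Sigma^{-1} (y - \mu).
```

As each nu_i grows large, the corresponding component approaches a multivariate normal density, which is the sense in which the t mixture extends the normal mixture; small nu_i gives heavier tails and hence less sensitivity to outlying observations.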


Cite this article

Lin, T.I., Lee, J.C. & Ni, H.F. Bayesian analysis of mixture modelling using the multivariate t distribution. Statistics and Computing 14, 119–130 (2004). https://doi.org/10.1023/B:STCO.0000021410.33077.10
