Abstract
A finite mixture model based on the multivariate t distribution has been shown to be a robust extension of normal mixtures. In this paper, we present a Bayesian approach to inference about the parameters of t-mixture models. The prior distributions are specified to be weakly informative so as to avoid nonintegrable posterior distributions. We present two efficient EM-type algorithms for computing the joint posterior mode, with the observed data and an incomplete future vector as the sample. Markov chain Monte Carlo sampling schemes are also developed to obtain the target posterior distribution of the parameters. The advantages of the Bayesian approach over the maximum likelihood method are demonstrated on a real data set.
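For orientation, the model class named in the abstract can be written out in its standard textbook form. The abstract does not fix any notation, so every symbol here is an assumption made for illustration: g components, mixing weights w_i, locations \mu_i, scale matrices \Sigma_i, degrees of freedom \nu_i, latent labels z_{ij}, and latent scales u_j. This is a sketch of the usual formulation, not necessarily the exact parameterization used in the paper.

```latex
% Density of a g-component mixture of p-variate t distributions
\begin{align}
  f(y_j \mid \Theta) &= \sum_{i=1}^{g} w_i \, t_p(y_j \mid \mu_i, \Sigma_i, \nu_i),
    \qquad w_i \ge 0, \quad \sum_{i=1}^{g} w_i = 1, \\
  t_p(y \mid \mu, \Sigma, \nu) &=
    \frac{\Gamma\!\left(\frac{\nu + p}{2}\right)}
         {\Gamma\!\left(\frac{\nu}{2}\right) (\pi \nu)^{p/2} \, |\Sigma|^{1/2}}
    \left(1 + \frac{\delta(y; \mu, \Sigma)}{\nu}\right)^{-(\nu + p)/2}, \\
  \delta(y; \mu, \Sigma) &= (y - \mu)^{\top} \Sigma^{-1} (y - \mu).
\end{align}

% Normal/gamma scale-mixture representation (gamma in shape-rate form)
\begin{align}
  y_j \mid u_j, z_{ij} = 1 &\sim N_p\!\left(\mu_i, \; \Sigma_i / u_j\right), \\
  u_j \mid z_{ij} = 1 &\sim \mathrm{Gamma}\!\left(\tfrac{\nu_i}{2}, \; \tfrac{\nu_i}{2}\right), \\
  u_j \mid y_j, z_{ij} = 1 &\sim
    \mathrm{Gamma}\!\left(\tfrac{\nu_i + p}{2}, \;
                          \tfrac{\nu_i + \delta(y_j; \mu_i, \Sigma_i)}{2}\right).
\end{align}
```

Under this representation the conditional mean of u_j is (\nu_i + p) / (\nu_i + \delta(y_j; \mu_i, \Sigma_i)), so observations far from a component's location receive small weight. This is the usual explanation for the robustness of t mixtures, and the same latent-variable structure is what makes EM-type and Gibbs-type schemes tractable for this model class.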
Cite this article
Lin, T.I., Lee, J.C. & Ni, H.F. Bayesian analysis of mixture modelling using the multivariate t distribution. Statistics and Computing 14, 119–130 (2004). https://doi.org/10.1023/B:STCO.0000021410.33077.10
DOI: https://doi.org/10.1023/B:STCO.0000021410.33077.10