Bayesian finite mixtures with an unknown number of components: The allocation sampler

Statistics and Computing

Abstract

A new Markov chain Monte Carlo method for the Bayesian analysis of finite mixture distributions with an unknown number of components is presented. The sampler is characterized by a state space consisting only of the number of components and the latent allocation variables. Its main advantage is that it can be used, with minimal changes, for mixtures of components from any parametric family, provided that the component parameters can be integrated out of the model analytically. Artificial and real data sets are used to illustrate the method, and mixtures of univariate and multivariate normals are explicitly considered. The problem of label switching, which arises when parameter inference is of interest, is addressed in a post-processing stage.
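To make the idea of a state space containing only allocations more concrete, the following is a minimal sketch of a single collapsed Gibbs sweep over the allocation variables with the number of components k held fixed. It is not the authors' allocation sampler, which also updates k and is described in the full article. Everything specific here is an assumption made for illustration: normal components with known variance `sigma2`, a conjugate N(`m0`, `s02`) prior on each component mean, a symmetric Dirichlet(`alpha`) prior on the weights, and the function names `log_predictive` and `gibbs_sweep`. Because the means and weights are integrated out analytically, only the allocations `z` are stored.

```python
import math
import random

def log_predictive(x, n, s, sigma2=1.0, m0=0.0, s02=100.0):
    """Log posterior-predictive density of x for a component that
    currently holds n points with sum s (component mean integrated out).
    Assumed model: x | mu ~ N(mu, sigma2), mu ~ N(m0, s02)."""
    prec = 1.0 / s02 + n / sigma2            # posterior precision of the mean
    mean = (m0 / s02 + s / sigma2) / prec    # posterior mean of the mean
    var = sigma2 + 1.0 / prec                # predictive variance
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def gibbs_sweep(x, z, k, alpha=1.0, rng=random):
    """One sweep of collapsed Gibbs updates of the allocations z (in place),
    for a fixed number of components k. Weights ~ Dirichlet(alpha, ..., alpha)
    are integrated out, so the prior term is proportional to count + alpha."""
    counts = [0] * k
    sums = [0.0] * k
    for xi, zi in zip(x, z):
        counts[zi] += 1
        sums[zi] += xi
    for i, xi in enumerate(x):
        # Remove observation i from its current component.
        counts[z[i]] -= 1
        sums[z[i]] -= xi
        # Unnormalised log probability of each component for observation i.
        logw = [math.log(counts[j] + alpha)
                + log_predictive(xi, counts[j], sums[j])
                for j in range(k)]
        top = max(logw)
        weights = [math.exp(v - top) for v in logw]  # stabilised exponentials
        z[i] = rng.choices(range(k), weights=weights)[0]
        # Add observation i to its newly sampled component.
        counts[z[i]] += 1
        sums[z[i]] += xi
    return z

if __name__ == "__main__":
    random.seed(1)
    data = [-5.2, -4.9, -5.1, 5.0, 4.8, 5.3]
    alloc = [random.randrange(2) for _ in data]
    for _ in range(20):
        gibbs_sweep(data, alloc, k=2)
    print(alloc)
```

Swapping in a different parametric family would, under these assumptions, only require replacing `log_predictive` with the corresponding closed-form predictive density, which is the sense in which such a sampler needs minimal changes across families.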

Author information

Corresponding author

Correspondence to Agostino Nobile.

About this article

Cite this article

Nobile, A., Fearnside, A.T. Bayesian finite mixtures with an unknown number of components: The allocation sampler. Stat Comput 17, 147–162 (2007). https://doi.org/10.1007/s11222-006-9014-7

  • DOI: https://doi.org/10.1007/s11222-006-9014-7
