main-content

## Über dieses Buch

Monte Carlo statistical methods, particularly those based on Markov chains, have now matured to be part of the standard set of techniques used by statisticians. This book is intended to bring these techniques into the class­ room, being (we hope) a self-contained logical development of the subject, with all concepts being explained in detail, and all theorems, etc. having detailed proofs. There is also an abundance of examples and problems, re­ lating the concepts with statistical practice and enhancing primarily the application of simulation techniques to statistical problems of various dif­ ficulties. This is a textbook intended for a second-year graduate course. We do not assume that the reader has any familiarity with Monte Carlo techniques (such as random variable generation) or with any Markov chain theory. We do assume that the reader has had a first course in statistical theory at the level of Statistical Inference by Casella and Berger (1990). Unfortu­ nately, a few times throughout the book a somewhat more advanced no­ tion is needed. We have kept these incidents to a minimum and have posted warnings when they occur. While this is a book on simulation, whose actual implementation must be processed through a computer, no requirement is made on programming skills or computing abilities: algorithms are pre­ sented in a program-like format but in plain text rather than in a specific programming language. (Most of the examples in the book were actually implemented in C, with the S-Plus graphical interface.

## Inhaltsverzeichnis

### Chapter 1. Introduction

Abstract
Until the advent of powerful and accessible computing methods, the experimenter was often confronted with a difficult choice. Either describe an accurate model of a phenomenon, which would usually preclude the computation of explicit answers, or choose a standard model which would allow this computation, but may not be a close representation of a realistic model. This dilemma is present in many branches of statistical applications, for example, in electrical engineering, aeronautics, biology, networks, and astronomy. To use realistic models, the researchers in these disciplines have often developed original approaches for model fitting that are customized for their own problems. (This is particularly true of physicists, the originators of Markov chain Monte Carlo methods.) Traditional methods of analysis, such as the usual numerical analysis techniques, are not well adapted for such settings.
Christian P. Robert, George Casella

### Chapter 2. Random Variable Generation

Abstract
The methods developed in this book mostly rely on the possibility of producing (with a computer) a supposedly endless flow of random variables (usually iid) for well-known distributions. Such a simulation is, in turn, based on the production of uniform random variables. We thus provide in this chapter some basic methodology for doing this. In particular, we present a uniform random number generator and illustrate methods for using these uniform random variables to produce random variables from both standard and nonstandard distributions.
Christian P. Robert, George Casella

### Chapter 3. Monte Carlo Integration

Abstract
Two major classes of numerical problems that arise in statistical inference are optimization problems and integration problems. (An associated problem, that of implicit equations, can often be reformulated as an optimization problem.) Although optimization is generally associated with the likelihood approach, and integration with the Bayesian approach, these are not strict classifications, as shown by Examples 1.2.2 and 1.3.5, and Examples 3.1.1, 3.1.2 and 3.1.3, respectively.
Christian P. Robert, George Casella

### Chapter 4. Markov Chains

Abstract
In this chapter we introduce fundamental notions of Markov chains and state the results that are needed to establish the convergence of various MCMC algorithms and, more generally, to understand the literature on this topic. Thus, this chapter, along with basic notions of probability theory, will provide enough foundation for the understanding of the following chapters. It is, unfortunately, a necessarily brief and, therefore, incomplete introduction to Markov chains, and we refer the reader to Meyn and Tweedie (1993), on which this chapter is based, for a thorough introduction to Markov chains. Other perspectives can be found in Doob (1953), Chung (1960), Feller (1970, 1971), and Billingsley (1995) for general treatments, and Norris (1997) Nummelin (1984), Revuz (1984), and Resnick (1994) for books entirely dedicated to Markov chains. Given the purely utilitarian goal of this chapter, its style and presentation differ from those of other chapters, especially with regard to the plethora of definitions and theorems and to the rarity of examples and proofs. In order to make the book accessible to those who are more interested in the implementation aspects of MCMC algorithms than in their theoretical foundations, we include a preliminary section that contains the essential facts about Markov chains.
Christian P. Robert, George Casella

### Chapter 5. Monte Carlo Optimization

Abstract
Similar to the problem of integration, differences between the numerical approach and the simulation approach to the problem
$$\mathop {\max {\kern 1pt} \,h(\theta )}\limits_{\theta \in \Theta }$$
(5.1.1)
lie in the treatment of the function1 h.
Christian P. Robert, George Casella

### Chapter 6. The Metropolis—Hastings Algorithm

Abstract
It was shown in Chapter 3 that it is not necessary to use a sample from the distribution f to approximate the integral
$$\int {h(x)f(x)dx,}$$
since importance sampling techniques can be used.
Christian P. Robert, George Casella

### Chapter 7. The Gibbs Sampler

Abstract
The previous chapter developed simulation techniques that could be called “generic,” since they require only a limited amount of information about the distribution to be simulated. For example, the generic algorithm ARMS (§6.3.3) aims at reproducing the density f of this distribution in an automatic manner. However, Metropolis-Hastings algorithms can achieve higher levels of efficiency when they take into account the specifics of the target distribution f, in particular through the calibration of the acceptance rate (see §6.4.1). Moving even further in this direction, the properties and performance of the Gibbs sampling method presented in this chapter are very closely tied to the distribution f. This is because the choice of instrumental distribution is essentially reduced to a choice between a finite number of possibilities.
Christian P. Robert, George Casella

### Chapter 8. Diagnosing Convergence

Abstract
The two previous chapters have presented the theoretical foundations of MCMC algorithms and showed that under fairly general conditions, the chains produced by these algorithms are ergodic, or even geometrically ergodic. While such developments are obviously necessary, they are nonetheless insufficient from the point of view of the implementation of MCMC methods. They do not directly result in methods of controlling the chain produced by an algorithm (in the sense of a stopping rule to guarantee that the number of iterations is sufficient). In other words, general convergence results do not tell us when to stop the MCMC algorithm and produce our estimates.
Christian P. Robert, George Casella

### Chapter 9. Implementation in Missing Data Models

Abstract
Missing data models (introduced in §5.3.1) are a natural application for simulation, since they use it to replace the missing data part so that one can proceed with a “classical” inference on the complete model. However, this idea was slow in being formalized; that is, in going beyond ad hoc solutions with no theoretical justification. It is only with the EM algorithm that Dempster et al. (1977) (see §5.3.3) described a rigorous and general formulation of statistical inference through completion of missing data (by expectation rather than simulation, though). The original algorithm could require a difficult analytic computation for the expectation (E) step and therefore cannot be used in all settings. As mentioned in §5.3.4 and §5.5.1, stochastic versions of EM (Broniatowski et al. 1983, Celeux and Diebolt 1985, 1993, Wei and Tanner 1990b, Qian and Titterington 1991, Lavielle and Moulines 1997) have come closer to simulation goals by replacing the E-step with a simulated completion of missing data (but without preserving the entire range of EM convergence properties).
Christian P. Robert, George Casella

### Backmatter

Weitere Informationen