
About this Book

Monte Carlo methods are revolutionising the on-line analysis of data in fields as diverse as financial modelling, target tracking and computer vision. These methods, appearing under the names of bootstrap filters, condensation, optimal Monte Carlo filters, particle filters and survival of the fittest, have made it possible to solve numerically many complex, non-standard problems that were previously intractable. This book presents the first comprehensive treatment of these techniques, including convergence results and applications to tracking, guidance, automated target recognition, aircraft navigation, robot navigation, econometrics, financial modelling, neural networks, optimal control, optimal filtering, communications, reinforcement learning, signal enhancement, model averaging and selection, computer vision, semiconductor design, population biology, dynamic Bayesian networks, and time series analysis. It will be of great value to students, researchers and practitioners who have some basic knowledge of probability.

Arnaud Doucet received the Ph.D. degree from the University of Paris-XI Orsay in 1997. From 1998 to 2000, he conducted research at the Signal Processing Group of Cambridge University, UK. He is currently an assistant professor in the Department of Electrical Engineering of Melbourne University, Australia. His research interests include Bayesian statistics, dynamic models and Monte Carlo methods. Nando de Freitas obtained a Ph.D. degree in information engineering from Cambridge University in 1999. He is presently a research associate with the artificial intelligence group of the University of California at Berkeley. His main research interests are in Bayesian statistics and the application of on-line and batch Monte Carlo methods to machine learning.





1. An Introduction to Sequential Monte Carlo Methods

Many real-world data analysis tasks involve estimating unknown quantities from some given observations. In most of these applications, prior knowledge about the phenomenon being modelled is available. This knowledge allows us to formulate Bayesian models, that is, prior distributions for the unknown quantities and likelihood functions relating these quantities to the observations. Within this setting, all inference on the unknown quantities is based on the posterior distribution obtained from Bayes’ theorem. Often, the observations arrive sequentially in time and one is interested in performing inference on-line. It is therefore necessary to update the posterior distribution as data become available. Examples include tracking an aircraft using radar measurements, estimating a digital communications signal using noisy measurements, or estimating the volatility of financial instruments using stock market data. Computational simplicity, in the form of not having to store all the data, might also be an additional motivating factor for sequential methods.
Arnaud Doucet, Nando de Freitas, Neil Gordon
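The sequential updating described above can be illustrated with a minimal bootstrap particle filter. The toy model below (a Gaussian random walk observed in Gaussian noise), the function name and all parameter values are illustrative assumptions, not taken from the book:

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_filter(ys, n_particles=500, sigma_v=1.0, sigma_w=1.0):
    """Bootstrap particle filter for the toy model
    x_t = x_{t-1} + v_t,  y_t = x_t + w_t,  v_t, w_t Gaussian."""
    x = rng.normal(0.0, 1.0, n_particles)               # particles from the prior
    means = []
    for y in ys:
        x = x + rng.normal(0.0, sigma_v, n_particles)   # predict: sample transitions
        logw = -0.5 * ((y - x) / sigma_w) ** 2          # weight by the likelihood
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means.append(float(np.sum(w * x)))              # posterior-mean estimate
        x = x[rng.choice(n_particles, n_particles, p=w)]  # multinomial resampling
    return means
```

The resampling step at the end of each iteration is what makes the recursion sequential: only the current particle set, not the full data history, is carried forward.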

Theoretical Issues


2. Particle Filters — A Theoretical Perspective

The purpose of this chapter is to present a rigorous mathematical treatment of the convergence of particle filters. In general, we follow the notation and settings suggested by the editors, any extra notation being defined in the next section. Section 2.3.1 contains the main results of the paper: Theorems 2.3.1 and 2.3.2 provide necessary and sufficient conditions for the convergence of the particle filter to the posterior distribution of the signal. As an application of these results, we prove the convergence of a certain class of particle filters. This class includes several known filters (such as those presented in (Carpenter, Clifford and Fearnhead 1999b, Crisan, Del Moral and Lyons 1999, Gordon et al. 1993)), but is by no means the most general one. Finally, we discuss some of the issues that are relevant in applications and which arise from the theoretical analysis of these methods.
Dan Crisan

3. Interacting Particle Filtering With Discrete Observations

We consider a pair of processes (X,Y), where X represents the state of a system (or signal) and Y represents the observation: X may take its values in an arbitrary measurable space (E,ε), but it is important for what follows that Y take its values in ℝ^q for some q ≥ 1.
Pierre Del Moral, Jean Jacod

Strategies for Improving Sequential Monte Carlo Methods


4. Sequential Monte Carlo Methods for Optimal Filtering

Estimating the state of a nonlinear dynamic model sequentially in time is of paramount importance in applied science. Except in a few simple cases, there is no closed-form solution to this problem. It is therefore necessary to adopt numerical techniques in order to compute reasonable approximations. Sequential Monte Carlo (SMC) methods are powerful tools that allow us to accomplish this goal.
Christophe Andrieu, Arnaud Doucet, Elena Punskaya

5. Deterministic and Stochastic Particle Filters in State-Space Models

Optimal or Bayesian filtering in state-space models is a question of computing series of linked numerical integrals where output from one is input to the other (Bucy and Senne 1971). Particle filtering can be regarded as comprising techniques for solving these integrals by replacing the complicated posterior densities involved by discrete approximations, based on particles (Kitagawa 1996). There is evidence that the numerical errors as the process is iterated often stabilise, or at least do not accumulate sharply (see section 5.2.5). Filters of this type can be constructed in many ways. Most of the contributions to this volume employ Monte Carlo designs (see also (Doucet 1998) and the references therein). Particles are then random drawings of state vectors under the current posterior. This amounts to Monte Carlo evaluations of integrals. Numerically inaccurate, but often practical and easy to implement, general methods to run the sampling have been developed. Alternatively, particles can be laid out through a deterministic plan, using more sophisticated and more accurate numerical integration techniques. Such an approach has been discussed in (Kitagawa 1987), (Pole and West 1988) and (Pole and West 1990), but recently most work has been based on Monte Carlo methods. To some extent, Monte Carlo and deterministic particle filters are complementary approaches, and one may also wonder whether they may be usefully combined (see (Monahan and Genz 1997) for such a combination in a non-dynamic setting). Emphasis in this paper is on deterministic filtering. A general framework can be found in the above-mentioned references and in (West and Harrison 1997)[Section 13.5]. We shall present a common perspective in the next section, where our contribution will be on design issues.
Erik Bølviken, Geir Storvik

6. RESAMPLE-MOVE Filtering with Cross-Model Jumps

In standard sequential imputation, repeated resampling stages progressively impoverish the set of particles, by decreasing the number of distinct values represented in that set. A possible remedy is Rao-Blackwellisation (Liu and Chen 1998). Another remedy, which we discuss in this chapter, is to adopt a hybrid particle filter, which combines importance sampling/resampling (Rubin 1988, Smith and Gelfand 1992) and Markov chain iterations. An example of this class of particle filters is the RESAMPLE-MOVE algorithm described in (Gilks and Berzuini 1999), in which the swarm of particles is adapted to an evolving target distribution by periodical resampling steps and through occasional Markov chain moves that lead each individual particle from its current position to a new point of the parameter space. These moves increase particle diversity. Markov chain moves had previously been introduced in particle filters (for example, (Berzuini, Best, Gilks and Larizza 1997, Liu and Chen 1998)), but rarely with the possibility of moving particles at any stage of the evolution process along any direction of the parameter space; this is, indeed, an important and innovative feature of RESAMPLE-MOVE. This makes it possible, in particular, to prevent particle depletion along directions of the parameter space corresponding to static parameters, for example when the model contains unknown hyper-parameters, a situation which is not addressed by the usual state filtering algorithms.
Carlo Berzuini, Walter Gilks
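The resample-then-move idea can be sketched as follows: after multinomial resampling, each particle is offered a Metropolis-Hastings move whose invariant distribution is the current target, which restores the diversity that resampling destroys. The random-walk proposal, its scale and the function name are illustrative assumptions, not the chapter's algorithm verbatim:

```python
import numpy as np

rng = np.random.default_rng(1)

def resample_move_step(x, logw, log_target, move_scale=0.5):
    """One resample-move step: multinomial resampling followed by a
    Metropolis-Hastings random-walk move that targets exp(log_target)."""
    n = len(x)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    x = x[rng.choice(n, n, p=w)]                  # resampling duplicates particles
    prop = x + rng.normal(0.0, move_scale, n)     # random-walk proposal
    accept = np.log(rng.random(n)) < log_target(prop) - log_target(x)
    x[accept] = prop[accept]                      # accepted moves restore diversity
    return x
```

Because the move leaves the target invariant, it can be applied at any stage and along any direction of the parameter space, including directions corresponding to static parameters.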

7. Improvement Strategies for Monte Carlo Particle Filters

The particle filtering field has seen an upsurge in interest over recent years, and accompanying this upsurge several enhancements to the basic techniques have been suggested in the literature. In this paper we collect a group of these developments that seem to be particularly important for time series applications and give a broad discussion of the methods, showing the relationships between them. We firstly present a general importance sampling framework for the filtering/smoothing problem and show how the standard techniques can be obtained from this general approach. In particular, we show that the auxiliary particle filtering methods of (Pitt and Shephard: this volume) fall into the same general class of algorithms as the standard bootstrap filter of (Gordon et al. 1993). We then develop the ideas further and describe the role of MCMC resampling as proposed by (Gilks and Berzuini: this volume) and (MacEachern, Clyde and Liu 1999). Finally, we present a generalisation of our own in which MCMC resampling ideas are used to traverse a sequence of ‘bridging’ densities which lie between the prediction density and the filtering density. In this way it is hoped to reduce the variability of the importance weights by attempting a series of smaller, more manageable moves at each time step.
Simon Godsill, Tim Clapp

8. Approximating and Maximising the Likelihood for a General State-Space Model

Assume that in a general state-space model the state transitions p(x_t|x_{t−1}) depend on a parameter τ and the observation densities p(y_t|x_t) on a parameter η. These can be, for instance, the noise variances in the state and the observation density, or some parameters related to the mean value of these densities. Both τ and η can be multi-dimensional. The combined parameter (τ′,η′)′ will be denoted by θ. We discuss here the estimation of θ. For state-space models, the Bayesian viewpoint is often adopted and prior distributions for the unknown parameters are introduced. We discuss this approach briefly, but concentrate mainly on the frequentist approach, where one has to compute and maximise the likelihood. Exact methods are usually not feasible, but Monte Carlo methods allow us to approximate the likelihood function.
Markus Hürzeler, Hans R. Künsch
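One common particle approximation of the likelihood accumulates, over time, the log of the averaged unnormalised importance weights. The sketch below applies this to an assumed toy random-walk model; the function name and parameters are illustrative, not the chapter's method:

```python
import numpy as np

rng = np.random.default_rng(2)

def pf_loglik(ys, sigma_v=1.0, sigma_w=1.0, n=1000):
    """Particle estimate of log p(y_{1:T}) for the toy model
    x_t = x_{t-1} + v_t, y_t = x_t + w_t: sum over time of the log of
    the averaged unnormalised importance weights."""
    x = rng.normal(0.0, 1.0, n)
    ll = 0.0
    for y in ys:
        x = x + rng.normal(0.0, sigma_v, n)
        logw = (-0.5 * ((y - x) / sigma_w) ** 2
                - 0.5 * np.log(2.0 * np.pi * sigma_w ** 2))
        m = logw.max()
        ll += m + np.log(np.mean(np.exp(logw - m)))   # log of the mean weight
        w = np.exp(logw - m)
        w /= w.sum()
        x = x[rng.choice(n, n, p=w)]                  # resample before the next step
    return ll
```

Maximising θ ↦ pf_loglik(ys; θ) over the model parameters then gives an approximate maximum likelihood estimator, though the Monte Carlo noise in the surface must be handled with care.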

9. Monte Carlo Smoothing and Self-Organising State-Space Model

This paper presents a Monte Carlo method for nonlinear non-Gaussian filtering and smoothing, together with its application to self-organising state-space models.
Genshiro Kitagawa, Seisho Sato

10. Combined Parameter and State Estimation in Simulation-Based Filtering

Much of the recent and current interest in simulation-based methods of sequential Bayesian analysis of dynamic models has been focused on improved methods of filtering for time-varying state vectors. We now have quite effective algorithms for time-varying states, as represented throughout this volume. Variants of the auxiliary particle filtering algorithm (Pitt and Shephard 1999b), in particular, are of proven applied efficacy in quite elaborate models. However, the need for more general algorithms that deal simultaneously with both fixed model parameters and state variables is especially pressing. We simply do not have access to efficient and effective methods of treating this problem, especially in models with realistically large numbers of fixed model parameters. It is a very challenging problem.
Jane Liu, Mike West

11. A Theoretical Framework for Sequential Importance Sampling with Resampling

Monte Carlo filters (MCF) can be loosely defined as a set of methods that use Monte Carlo simulation to solve on-line estimation and prediction problems in a dynamic system. Compared with traditional filtering methods, simple, flexible — yet powerful — MCF techniques provide effective means to overcome computational difficulties in dealing with nonlinear dynamic models. One key element of MCF techniques is the recursive use of the importance sampling principle, which leads to the more precise name sequential importance sampling (SIS) for the techniques that are to be the focus of this article.
Jun S. Liu, Rong Chen, Tanya Logvinenko
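In commonly used notation (which need not match this chapter's exactly), the recursive use of the importance sampling principle gives the SIS weight update, for particles \(x_t^{(i)}\) drawn from a proposal \(q\):

$$w_t^{(i)} \propto w_{t-1}^{(i)}\,\frac{p(y_t \mid x_t^{(i)})\, p(x_t^{(i)} \mid x_{t-1}^{(i)})}{q(x_t^{(i)} \mid x_{t-1}^{(i)}, y_t)}$$

with a resampling step applied when the weights degenerate.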

12. Improving Regularised Particle Filters

The optimal filter computes the posterior probability distribution of the state in a dynamical system, given noisy measurements, by iterative application of prediction steps according to the dynamics of the state, and correction steps taking the measurements into account. A new class of approximate nonlinear filter has been recently proposed, the idea being to produce a sample of independent random variables, called a particle system, (approximately) distributed according to this posterior probability distribution. The method is very easy to implement, even in high-dimensional problems, since it is sufficient in principle to simulate independent sample paths of the hidden dynamical system.
Christian Musso, Nadia Oudjane, Francois Le Gland
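Regularisation replaces the discrete resampled measure by a kernel-smoothed one, so repeated resampling no longer collapses the particle set onto a few atoms. A one-dimensional sketch using a Gaussian kernel with a Silverman-style rule-of-thumb bandwidth (an assumption; the chapter's kernels and bandwidths may differ):

```python
import numpy as np

rng = np.random.default_rng(3)

def regularised_resample(x, w):
    """Draw from a kernel-smoothed version of the weighted particle set
    (1-D Gaussian kernel, Silverman-style bandwidth) instead of its atoms."""
    n = len(x)
    x = x[rng.choice(n, n, p=w)]             # ordinary multinomial resampling
    h = 1.06 * x.std() * n ** (-1.0 / 5.0)   # rule-of-thumb bandwidth
    return x + h * rng.normal(0.0, 1.0, n)   # jitter: every particle becomes distinct
```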

13. Auxiliary Variable Based Particle Filters

We model a time series {y_t, t = 1, ..., n} using a state-space framework, with the {y_t|α_t} being independent and with the state {α_t} assumed to be Markovian. The task will be to use simulation to estimate f(α_t|F_t), t = 1, ..., n, where F_t is contemporaneously available information. We assume a known measurement density f(y_t|α_t) and the ability to simulate from the transition density f(α_{t+1}|α_t). Sometimes we will also assume that we can evaluate f(α_{t+1}|α_t).
Michael K. Pitt, Neil Shephard
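One step of an auxiliary particle filter in this setting can be sketched as follows, for an assumed scalar random-walk model: the first-stage weights favour ancestors whose predicted state explains the incoming observation, and a second-stage weight corrects for that bias. The function name, the choice of the transition mean as the auxiliary point estimate, and all parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

def apf_step(x, y, sigma_v=1.0, sigma_w=1.0):
    """One auxiliary particle filter step for x_t = x_{t-1} + v_t,
    y_t = x_t + w_t, using the transition mean mu = x as the auxiliary
    point estimate of the next state."""
    n = len(x)
    loglam = -0.5 * ((y - x) / sigma_w) ** 2        # first-stage weights at mu
    lam = np.exp(loglam - loglam.max())
    lam /= lam.sum()
    k = rng.choice(n, n, p=lam)                     # resample promising ancestors
    xn = x[k] + rng.normal(0.0, sigma_v, n)         # propagate chosen ancestors
    logw = -0.5 * ((y - xn) / sigma_w) ** 2 - loglam[k]  # second-stage correction
    w = np.exp(logw - logw.max())
    w /= w.sum()
    return xn, w
```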

14. Improved Particle Filters and Smoothing

Exact recursive Bayesian inference is essentially impossible except in very special scenarios, such as the linear-Gaussian dynamic systems that are amenable to the Kalman filter and associated methods. Otherwise, some form of approximation is necessary. In some contexts, a parametric approximation might still be workable, as in (Titterington 1973)’s use of two-component Normal mixtures in a simple extremum-tracking problem (which we revisit later in this chapter), but nowadays it is more common to carry forward, as an estimate of the current distribution of the items of interest, what is claimed to be a simulated sample from that distribution, in other words, a particle filter.
Photis Stavropoulos, D. M. Titterington



15. Posterior Cramér-Rao Bounds for Sequential Estimation

The posterior filter density of most nonlinear recursive estimation problems cannot be described analytically by a finite number of parameters. Several examples of sub-optimal algorithms for practical sequential estimation have therefore appeared in the literature. Generally, these procedures approximate either the estimation model or the description of the posterior distribution. These inevitable approximations may seriously degrade the estimation performance when compared with the results that would have been obtained had Bayesian inference, based on the true posterior density, been carried out. It is of great practical interest to quantify this performance degradation, and measure the effect of the introduced approximations. A benchmark simulation evaluation against the optimal solution is not possible because it would require infinite computing power and unlimited memory to run the optimal algorithm. However, even if it is intractable to implement and run the optimal solution, a lower bound on the performance of this solution can be obtained. The estimation error obtained with an optimal algorithm depends only on the fundamental properties of the model, e.g. signal-to-noise levels and prior assumptions on the sought parameters. Characteristics of the estimation error from the optimal solution define lower limits on the performance that can be achieved using any practical implementation. The characteristics of the sub-optimal estimation error achieved by an approximate implementation are revealed in simulations using the implemented procedure. The discrepancy from the lower bound gives an indication of the effect of the approximations introduced in the implemented algorithm. A relative comparison between the sub-optimal and the optimal algorithms is therefore possible even if it is intractable to implement the optimal solution.
A lower bound on some property of the estimation error is convenient for evaluation purposes, but can also be used to shed light on the fundamental performance level that can be reached for the estimation problem.
Niclas Bergman
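One frequently quoted form of the posterior Cramér-Rao bound recursion, due to Tichavský, Muravchik and Nehorai (the chapter's notation may differ), propagates the filtering information matrix \(J_t\) via

$$J_{t+1} = D_t^{22} - D_t^{21}\left(J_t + D_t^{11}\right)^{-1} D_t^{12}$$

where \(D_t^{11} = \mathbb{E}\!\left[-\Delta_{x_t}^{x_t}\log p(x_{t+1}|x_t)\right]\), \(D_t^{12} = \mathbb{E}\!\left[-\Delta_{x_t}^{x_{t+1}}\log p(x_{t+1}|x_t)\right] = (D_t^{21})'\), and \(D_t^{22} = \mathbb{E}\!\left[-\Delta_{x_{t+1}}^{x_{t+1}}\log p(x_{t+1}|x_t)\right] + \mathbb{E}\!\left[-\Delta_{x_{t+1}}^{x_{t+1}}\log p(y_{t+1}|x_{t+1})\right]\). The mean-square error of any estimator \(\hat{x}_t\) then satisfies \(\mathbb{E}\!\left[(\hat{x}_t - x_t)(\hat{x}_t - x_t)'\right] \succeq J_t^{-1}\).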

16. Statistical Models of Visual Shape and Motion

This paper addresses some problems in the interpretation of visually observed shapes, both planar and three-dimensional, in motion. Mumford (1996), interpreting the Pattern Theory developed over a number of years by Grenander (1976), views images as pure patterns that have been distorted by a combination of four kinds of degradations. This view applies naturally to the analysis of static, two-dimensional images. The four degradations are given here, together with comments on how they need to be extended to take account of three-dimensional objects in motion.
Andrew Blake, Michael Isard, John MacCormick

17. Sequential Monte Carlo Methods for Neural Networks

Many problems, arising in science and engineering, require the estimation of nonlinear, time-varying functions that map a set of input signals to a corresponding set of output signals. Some examples include: finding the relation between an input pressure signal and the movement of a pneumatic control valve; using past observations in a time series to predict future events; and using a group of biomedical signals to carry out diagnoses and prognoses. These problems can be reformulated in terms of a generic one of estimating the parameters of a suitable neural network on-line as the input-output data becomes available.
N. de Freitas, C. Andrieu, P. Højen-Sørensen, M. Niranjan, A. Gee

18. Sequential Estimation of Signals under Model Uncertainty

Two important areas of signal processing are parameter estimation of signals and signal detection. In standard textbooks they are usually addressed separately, although in many practical problems they are applied jointly. In estimation theory, it is almost always assumed that the model of the signal is known, and the objective is to estimate its parameters from noisy signal data. For example, if the observed data vector is y, where y ∈ \(\mathbb{R}^{n_y}\), and the model of the signal is represented by a function whose analytical form is known, the goal is to estimate the signal parameters x from y, where x ∈ \(\mathbb{R}^{n_x}\). In cases when it is unclear whether there is a signal in the data, one resorts to estimation of the signal parameters under the assumption that the data contain a signal, thereafter applying a detection scheme to decide if the signal is indeed in the data.
Petar M. Djurić

19. Particle Filters for Mobile Robot Localization

This chapter investigates the utility of particle filters in the context of mobile robotics. In particular, we report results of applying particle filters to the problem of mobile robot localization, which is the problem of estimating a robot’s pose relative to a map of its environment. The localization problem is a key one in mobile robotics, because it plays a fundamental role in various successful mobile robot systems; see e.g., (Cox and Wilfong 1990, Fukuda, Ito, Oota, Arai, Abe, Tanake and Tanaka 1993, Hinkel and Knieriemen 1988, Leonard, Durrant-Whyte and Cox 1992, Rencken 1993, Simmons, Goodwin, Haigh, Koenig and O’Sullivan 1997, Weiß, Wetzler and von Puttkamer 1994) and various chapters in (Borenstein, Everett and Feng 1996) and (Kortenkamp, Bonasso and Murphy 1998). Occasionally, it has been referred to as “the most fundamental problem to providing a mobile robot with autonomous capabilities” (Cox 1991).
Dieter Fox, Sebastian Thrun, Wolfram Burgard, Frank Dellaert

20. Self-Organizing Time Series Model

The generalised state-space model (GSSM) that we deal with in this study is defined by a set of two equations,
$$\text{system model:}\qquad x_t = f(x_{t-1}, v_t)$$
$$\text{observation model:}\qquad y_t \sim r(\,\cdot\,|x_t, \theta_{\mathrm{obs}})$$
where x_t is an n_x × 1 vector of unobserved state variables, and y_t is an n_y-dimensional vector observation. f: \(\mathbb{R}^{n_x} \times \mathbb{R}^{n_v} \to \mathbb{R}^{n_x}\) is a given function. {v_t} is an independent and identically distributed (i.i.d.) random process with v_t ~ q(v|θ_sys). r is the conditional distribution of y_t given x_t. q(·|·) and r(·|·) are, in general, non-Gaussian densities specified by the unknown parameter vectors θ_sys and θ_obs respectively. In this study, we set θ = [θ_sys, θ_obs]′. The initial state x_0 is distributed according to the density p_0(x).
Tomoyuki Higuchi
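The two model equations above can be simulated generically. The helper below is an illustrative sketch (its name and arguments are assumptions) in which the user supplies the transition function f, a sampler for the system noise v_t, and a sampler from the observation distribution r(·|x_t):

```python
import numpy as np

rng = np.random.default_rng(5)

def simulate_gssm(T, f, sample_v, sample_y, x0):
    """Simulate x_t = f(x_{t-1}, v_t), y_t ~ r(.|x_t) with user-supplied
    transition function, system-noise sampler and observation sampler."""
    xs, ys = [], []
    x = x0
    for _ in range(T):
        x = f(x, sample_v())      # system model
        xs.append(x)
        ys.append(sample_y(x))    # observation model
    return np.array(xs), np.array(ys)
```

For example, a Gaussian random walk observed in Gaussian noise is obtained with f(x, v) = x + v and Gaussian samplers for v_t and y_t.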

21. Sampling in Factored Dynamic Systems

In many real-world situations, we are interested in monitoring the evolution of a complex situation over time. For example, we may be monitoring a patient’s vital signs in an intensive care unit (Dagum and Galper 1995), analyzing a complex freeway traffic scene with the goal of controlling a moving vehicle (Huang, Koller, Malik, Ogasawara, Rao, Russell and Weber 1994), localizing a robot in a complex environment (Fox, Burgard and Thrun 1999) (see also Murphy and Russell (2001: this volume)), or tracking motion of non-rigid objects in a cluttered visual scene (Isard and Blake 1998a). We treat such systems as being in one of a possible set of states, where the state changes over time. We model the states as changing at discrete time intervals, so that x t is the state of the system at time t. In most systems, we model the system states as having some internal structure: the system state is typically represented by some vector of variables X = (X 1,...,X n ), where each X i takes on values in some space Dom[X i ]. The possible states x are assignments of values to the variables X. In a traffic surveillance application, the state might contain variables such as the vehicle position, its velocity, the weather, and more.
Daphne Koller, Uri Lerner

22. In-Situ Ellipsometry Solutions Using Sequential Monte Carlo

This chapter presents an example of an actual industrial application in which Sequential Monte Carlo (SMC) methods have led to significant progress toward the goal of on-line control of growing semiconductor composition. This chapter will differ from the majority of contributions to this book in that it presents an example application in which SMC has been used to solve a real problem in industrial process monitoring. By necessity we shall not dwell upon such issues as whether to use SIS, SIR, ASIR, Resample-move, Residual-resample, kernel-smoothing or any of the myriad of mechanisms for “herding” our set of posterior samples from one time step to the next. Similarly, the chapter does not seek to introduce the latest “jolly-wheeze” for implementing a Bayesian recursive filter using Sequential Monte Carlo. The chapter describes how the power of vanilla SMC methods can be used to overcome the analytical problems inherent in highly nonlinear physical measurement models and that, even with a relatively small number of particles, adequate results can be obtained.
Alan D. Marrs

23. Manoeuvring Target Tracking Using a Multiple-Model Bootstrap Filter

The Kalman filter (Sorensen 1966) is the optimal solution to the Bayesian estimation problem for a given linear, stochastic, state-space system with additive Gaussian noise sources of known statistics. The solution is possible since all densities in the system are Gaussian, and remain so under system transformation, and thus closed-form expressions for these are always available in terms of their first and second moments. However, if the model used by the filter does not match the actual system dynamics, the filter will tend to diverge, such that the actual errors fall outside the range predicted by the filter’s estimate of the error covariance. By introducing some degree of uncertainty into the filter, small deviations from the actual system can be tracked. This also ensures that the filter gain does not fall to zero, and thus measurements are always used to update the state estimate. The practicality of this approach for large model mismatch problems is limited however.
Shaun McGinnity, George W. Irwin
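For reference, the predict/correct recursion of the Kalman filter mentioned above can be sketched in the scalar case; the function name and the default noise variances are illustrative:

```python
def kalman_step(m, P, y, a=1.0, q=1.0, c=1.0, r=1.0):
    """One predict/correct step of a scalar Kalman filter for
    x_t = a x_{t-1} + v_t (var q),  y_t = c x_t + w_t (var r)."""
    m_pred = a * m                   # predicted state mean
    P_pred = a * a * P + q           # predicted state variance
    s = c * c * P_pred + r           # innovation variance
    k = c * P_pred / s               # Kalman gain
    m_new = m_pred + k * (y - c * m_pred)
    P_new = (1.0 - k * c) * P_pred
    return m_new, P_new
```

Keeping q > 0 plays the role described in the abstract: the gain k never decays to zero, so measurements always continue to update the state estimate.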

24. Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks

Particle filtering in high dimensional state-spaces can be inefficient because a large number of samples is needed to represent the posterior. A standard technique to increase the efficiency of sampling techniques is to reduce the size of the state space by marginalizing out some of the variables analytically; this is called Rao-Blackwellisation (Casella and Robert 1996). Combining these two techniques results in Rao-Blackwellised particle filtering (RBPF) (Doucet 1998, Doucet, de Freitas, Murphy and Russell 2000). In this chapter, we explain RBPF, discuss when it can be used, and give a detailed example of its application to the problem of map learning for a mobile robot, which has a very large (≈ 2^100) discrete state space.
Kevin Murphy, Stuart Russell
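The Rao-Blackwellisation idea can be sketched on an assumed toy switching model: a discrete noise regime z_t is sampled per particle, while the conditionally linear-Gaussian state is marginalised exactly by a Kalman update, so each particle carries a mean and a variance rather than a state sample. All names and parameters below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)

def rbpf_step(ms, Ps, y, q_regimes=(0.1, 2.0), r=1.0):
    """One Rao-Blackwellised step: sample a discrete noise regime per
    particle, marginalise the linear-Gaussian state with a Kalman update,
    weight by the exact marginal likelihood, then resample."""
    n = len(ms)
    z = rng.integers(0, 2, n)                        # sampled discrete regime
    q = np.where(z == 0, q_regimes[0], q_regimes[1])
    P_pred = Ps + q                                  # Kalman predict (a = c = 1)
    s = P_pred + r                                   # innovation variance
    logw = -0.5 * np.log(2.0 * np.pi * s) - 0.5 * (y - ms) ** 2 / s
    k = P_pred / s
    ms = ms + k * (y - ms)                           # Kalman correct per particle
    Ps = (1.0 - k) * P_pred
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(n, n, p=w)
    return ms[idx], Ps[idx]
```

Because only the discrete variable is sampled, the Monte Carlo variance is confined to a much smaller space than sampling the full joint state.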

25. Particles and Mixtures for Tracking and Guidance

The guidance algorithm is the central decision and control element of a missile system. It is responsible for taking data from all available missile sensors, together with targeting information and generating a guidance demand. This is usually in the form of an acceleration demand that is passed to the autopilot.
David Salmond, Neil Gordon

26. Monte Carlo Techniques for Automated Target Recognition

Motivated by the extraordinary performance of the human visual system, automated recognition of targets from their remotely sensed images has become an active area of research. In particular, Bayesian techniques based on statistical models of system components such as targets, sensors, and clutter have emerged. Since the probability models associated with such physical systems are complicated, general closed-form analytical solutions are ruled out, and computational approaches become vital. This chapter explores a family of computational solutions applied to finding the unknowns associated with Automated Target Recognition (ATR) problems. In general ATR, remote sensors (such as visual or infrared cameras, or microwave or laser radars) observe a scene containing a number of targets, either moving or stationary. These sensors produce measurements that are analyzed by computer algorithms to detect, track and recognise the targets of interest in that scene. (Dudgeon and Lacoss 1993) and (Srivastava, Miller and Grenander 1999) offer more detailed introductions.
Anuj Srivastava, Aaron D. Lanterman, Ulf Grenander, Marc Loizeaux, Michael I. Miller

