1992 | Book

Numerical Methods for Stochastic Control Problems in Continuous Time

Authors: Harold J. Kushner, Paul G. Dupuis

Publisher: Springer US

Book series: Stochastic Modelling and Applied Probability


About this book

This book is concerned with numerical methods for stochastic control and optimal stochastic control problems. The random process models of the controlled or uncontrolled stochastic systems are either diffusions or jump diffusions. Stochastic control is a very active area of research and new problem formulations and sometimes surprising applications appear regularly. We have chosen forms of the models which cover the great bulk of the formulations of the continuous time stochastic control problems which have appeared to date. The standard formats are covered, but much emphasis is given to the newer and less well known formulations. The controlled process might be either stopped or absorbed on leaving a constraint set or upon first hitting a target set, or it might be reflected or "projected" from the boundary of a constraining set. In some of the more recent applications of the reflecting boundary problem, for example the so-called heavy traffic approximation problems, the directions of reflection are actually discontinuous. In general, the control might be representable as a bounded function or it might be of the so-called impulsive or singular control types. Both the "drift" and the "variance" might be controlled. The cost functions might be any of the standard types: Discounted, stopped on first exit from a set, finite time, optimal stopping, average cost per unit time over the infinite time interval, and so forth.

Table of Contents

Frontmatter
Introduction
Abstract
This book is concerned with numerical methods for stochastic control and optimal stochastic control problems. The random process models of the controlled or uncontrolled stochastic systems are either diffusions or jump diffusions. Stochastic control is a very active area of research and new problem formulations and sometimes surprising applications appear regularly. We have chosen forms of the models which cover the great bulk of the formulations of the continuous time stochastic control problems which have appeared to date. The standard formats are covered, but much emphasis is given to the newer and less well known formulations. The controlled process might be either stopped or absorbed on leaving a constraint set or upon first hitting a target set, or it might be reflected or “projected” from the boundary of a constraining set. In some of the more recent applications of the reflecting boundary problem, for example the so-called heavy traffic approximation problems, the directions of reflection are actually discontinuous. In general, the control might be representable as a bounded function or it might be of the so-called impulsive or singular control types. Both the “drift” and the “variance” might be controlled. The cost functions might be any of the standard types: Discounted, stopped on first exit from a set, finite time, optimal stopping, average cost per unit time over the infinite time interval, and so forth. There might be separate costs when the process is on the boundary and when it is in the interior of the set of interest. In fact all of the standard cost functionals can be dealt with by the methods to be presented. There is a close connection between approximation methods for stochastic control and those for optimal nonlinear filtering, and approximation methods for the latter problem are also discussed.
Harold J. Kushner, Paul G. Dupuis
1. Review of Continuous Time Models
Abstract
In this book we will consider methods of numerically computing the value function for certain classes of controlled continuous time stochastic and deterministic processes. The purpose of the present chapter is to provide an introduction to and some of the background material for controlled diffusions and controlled jump diffusions. These types of processes include many of the models that are commonly used. This section is only intended to serve as a review of the main ideas and for purposes of reference. Other models (e.g., singularly controlled diffusions) that are also of interest will be introduced and elaborated on in the appropriate later sections of the book.
Harold J. Kushner, Paul G. Dupuis
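For orientation, the controlled diffusion reviewed in this chapter can be written in the standard Itô form (a representative statement of the model class; the notation is generic rather than tied to the chapter's own equation numbering):
$$dx(t) = b\bigl(x(t), u(t)\bigr)\,dt + \sigma\bigl(x(t)\bigr)\,dw(t),$$
where w(·) is a standard Wiener process and the control u(·) takes values in a compact set U; the controlled jump diffusion adds a jump term driven by a Poisson random measure.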
2. Controlled Markov Chains
Abstract
The main computational techniques in this book require the approximation of the original controlled process in continuous time by appropriately chosen controlled finite state Markov chains. In this chapter, we will define some of the canonical control problems for the Markov chain models which will be used in the sequel as “approximating processes.” The cost functions will be defined. The functional equations which are satisfied by these cost functions for fixed controls, as well as the functional equations satisfied by the optimal cost functions (the dynamic programming or Bellman equation), will be obtained by exploiting the Markov property, and the uniqueness of their solutions will be shown under appropriate conditions. These are the equations which will have to be solved in order to get the required approximate solutions to the original control or optimal control problem. The simplest case, where there is no control or where the control is fixed, is dealt with in Section 2.1, and the recursive equations satisfied by the cost functionals are obtained. A similar method is used to get the recursive equations for the optimal value functions for the controlled problems. The optimal stopping problem is treated in Section 2.2. This is a relatively simple control problem, because the only decision to be made is the choice of the moment at which the process is to be stopped. This problem will illustrate the basic ideas of dynamic programming for Markov chains and introduce the fundamental principle of optimality in a simple way. Section 2.3 concerns the general discounted cost problem. Section 2.4 deals with the optimization problem when the control stops at the first moment of reaching a target or stopping set.
Harold J. Kushner, Paul G. Dupuis
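To make the dynamic programming equation of the discounted problem concrete, here is a minimal value iteration sketch for a controlled finite state chain; the array layout, function name, and tolerances are our own illustrative choices, not the book's.

```python
import numpy as np

def value_iteration(P, c, beta, tol=1e-8, max_iter=10_000):
    """Iterate V <- min_a [ c(x,a) + beta * sum_y P(x,a,y) V(y) ].

    P    : (S, A, S) array, P[x, a, y] = transition probability
    c    : (S, A) array of one-step costs
    beta : discount factor in (0, 1)
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(max_iter):
        # Q[x, a] = one-step cost plus discounted expected continuation value
        Q = c + beta * np.einsum("xay,y->xa", P, V)
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    return V, Q.argmin(axis=1)  # value function and a minimizing feedback control
```

For a discount factor β < 1 the dynamic programming operator is a contraction, so the iterates converge geometrically to the unique solution, in line with the uniqueness results of the chapter.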
3. Dynamic Programming Equations
Abstract
In this chapter we define many of the standard control problems whose numerical solutions will concern us in the subsequent chapters. Other, less familiar control problems will be discussed separately in later chapters. We will first define cost functionals for uncontrolled processes, and then formally discuss the partial differential equations which they satisfy. Then the cost functionals for the controlled problems will be stated and the partial differential equations for the optimal cost formally derived. These partial differential equations are generally known as Bellman equations or dynamic programming equations. The main tool in the derivations is Itô’s formula.
Harold J. Kushner, Paul G. Dupuis
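As one representative instance of the equations derived here: for the discounted cost of a controlled diffusion, a formal application of Itô's formula leads to a Bellman equation of the form (generic notation, intended only as a sketch of where the derivation lands):
$$\inf_{\alpha \in U}\Bigl[\mathcal{L}^{\alpha} W(x) + k(x,\alpha)\Bigr] - \beta W(x) = 0, \qquad \mathcal{L}^{\alpha} f(x) = \sum_i b_i(x,\alpha)\, f_{x_i}(x) + \tfrac{1}{2}\sum_{i,j} a_{ij}(x)\, f_{x_i x_j}(x),$$
where \(a(x) = \sigma(x)\sigma(x)'\), β > 0 is the discount rate, and the equation holds in the region of interest together with the appropriate boundary conditions.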
4. The Markov Chain Approximation Method: Introduction
Abstract
The main purpose of the book is the development of numerical methods for the solution of control or optimal control problems, or for the computation of functionals of the stochastic processes of interest, of the type described in Chapters 3, 7, 8, 9, 12, and 13. It was shown in Chapter 3 that the cost or optimal cost functionals can be the (at least formal) solutions to certain nonlinear partial differential equations. It is tempting to try to solve for or approximate the various cost functions and optimal controls by dealing directly with the appropriate PDE’s, and numerically approximating their solutions. A basic impediment is that the PDE’s often have only a formal meaning, and standard methods of numerical analysis might not be usable to prove convergence of the numerical methods. For many problems of interest, one cannot even write down a partial differential equation. The Bellman equation might be replaced by a system of “variational inequalities,” or the proper form might not be known. Optimal stochastic control problems occur in an enormous variety of forms. As time goes on, we learn more about the analytical methods which can be used to describe and analyze the various optimal cost functions, but even then it seems that many important classes of problems are still not covered and new models appear which need even further analysis. The optimal stochastic control or stochastic modelling problem usually starts with a physical model, which guides the formulation of the precise stochastic process model to be used in the analysis. One would like numerical methods which are able to conveniently exploit the intuition contained in the physical model.
Harold J. Kushner, Paul G. Dupuis
5. Construction of the Approximating Markov Chain
Abstract
In this chapter we develop some canonical methods for obtaining approximating Markov chains which are locally consistent with the controlled diffusion in the sense used in (4.1.3). We also deal with the controlled jump diffusion and reflected processes in Sections 5.6 and 5.7. The chapter outlines two classes of basic methods and a number of variations of each of them. One purpose is to describe methods which are readily programmable. But we also wish to show the versatility and intuitive nature of the general approach. There are many variations of the methods discussed. Once the general procedures are clear, the reader can adapt them to particular problems which might not be directly covered by the discussion. The development does not directly cover processes on manifolds such as the surface of a sphere or torus, but the various possibilities should be apparent.
Harold J. Kushner, Paul G. Dupuis
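To make the construction concrete, the sketch below builds the standard one dimensional upwind approximating chain on a grid of spacing h: the transition probabilities and interpolation interval are chosen so that the chain's conditional mean and variance match b(x, α)Δt and σ²(x)Δt to order o(Δt), which is the local consistency property referred to in (4.1.3). The function names and calling conventions are ours, not the book's.

```python
def upwind_chain_1d(b, sigma2, x, h, alpha):
    """Locally consistent transition probabilities at an interior grid point x
    for dx = b(x, alpha) dt + sigma(x) dw on a grid of spacing h.

    Returns (p_up, p_down, dt): the probabilities of moving to x + h and
    x - h, and the interpolation interval Delta t^h(x, alpha).
    """
    bv = b(x, alpha)
    s2 = sigma2(x)                       # sigma(x) squared
    denom = s2 + h * abs(bv)             # normalizing factor
    p_up = (s2 / 2.0 + h * max(bv, 0.0)) / denom
    p_down = (s2 / 2.0 + h * max(-bv, 0.0)) / denom
    dt = h * h / denom                   # goes to zero as h -> 0
    return p_up, p_down, dt
```

A direct check shows p_up + p_down = 1, that the one-step conditional mean increment is b(x, α)·dt, and that the conditional variance is σ²(x)·dt + o(dt), which is exactly the local consistency requirement; the upwind split of the drift keeps both probabilities nonnegative for every h.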
6. Computational Methods for Controlled Markov Chains
Abstract
This chapter presents many of the basic ideas in current use for the solution of the dynamic programming equations for the optimal control and value function for the approximating Markov chain models. We concentrate on methods for problems which are of interest over a potentially unbounded time interval. Numerical methods for the ergodic problem will be discussed in Chapter 7, and are simple modifications of the ideas of this chapter. Some approaches to the numerical solution of the finite time problem will be discussed in Chapter 12.
Harold J. Kushner, Paul G. Dupuis
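One of the basic ideas referred to here is approximation in policy space; the sketch below alternates policy evaluation (a linear solve for the fixed control) with policy improvement, for the discounted problem and with the same hypothetical array conventions as in the earlier value iteration sketch.

```python
import numpy as np

def policy_iteration(P, c, beta, max_iter=100):
    """Approximation in policy space for the discounted problem."""
    S, A, _ = P.shape
    policy = np.zeros(S, dtype=int)
    V = np.zeros(S)
    for _ in range(max_iter):
        # Policy evaluation: solve (I - beta * P_u) V = c_u for the fixed control u.
        P_u = P[np.arange(S), policy]            # (S, S) transition matrix under u
        c_u = c[np.arange(S), policy]            # (S,) cost vector under u
        V = np.linalg.solve(np.eye(S) - beta * P_u, c_u)
        # Policy improvement: choose the minimizing control against V.
        Q = c + beta * np.einsum("xay,y->xa", P, V)
        new_policy = Q.argmin(axis=1)
        if np.array_equal(new_policy, policy):   # no improvement: policy is optimal
            break
        policy = new_policy
    return V, policy
```

Each evaluation step is exact for the current feedback control, so the method trades more work per iteration for far fewer iterations than value iteration.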
7. The Ergodic Cost Problem: Formulation and Algorithms
Abstract
In this chapter, we reformulate some of the concepts in the last chapter so that they can be used on the ergodic cost problem. Before doing that it is useful to discuss the appropriate dynamic programming equations and some additional background material. The natural state spaces for control problems that are of interest over a long time interval are often unbounded, and they must be truncated for numerical purposes. One standard way of doing this involves a reflecting boundary, and this is the case dealt with in this chapter. Thus, there are no absorbing states. The basic process is the controlled diffusion (5.3.1) or jump diffusion (5.6.1). The approximating Markov chains \(\{\xi_n^h, n < \infty\}\) will be locally consistent with these processes. As in the previous chapters, \(S_h\) denotes the state space, and \(\partial G_h^+ \subset S_h\) the set of reflecting states. Recall that the reflection is instantaneous and that we use \(\Delta t^h(x,a) = 0\) for \(x \in \partial G_h^+\). For a feedback control u(·), the cost function of interest for the original process is
$$\gamma(x,u) = \limsup_{T \to \infty} \frac{1}{T} \int_0^T E_x^u\, k\bigl(x(s), u(x(s))\bigr)\, ds,$$
where the function k(·) is continuous. If the limit does not depend on the initial state x, then we omit it from the notation.
Harold J. Kushner, Paul G. Dupuis
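For the approximating chain, the pair (cost γ, relative value function W) solving the ergodic dynamic programming equation can be computed by, for example, relative value iteration. The sketch below is one standard variant under simplifying assumptions (unichain structure, unit time step; the Δtʰ interpolation weights used for the diffusion approximation are omitted), and is not necessarily the precise algorithm developed in this chapter.

```python
import numpy as np

def relative_value_iteration(P, c, x0=0, tol=1e-8, max_iter=100_000):
    """Find (gamma, W) with W(x) + gamma = min_a [ c(x,a) + sum_y P(x,a,y) W(y) ],
    normalized so that W(x0) = 0."""
    S, A, _ = P.shape
    W = np.zeros(S)
    gamma = 0.0
    for _ in range(max_iter):
        T = (c + np.einsum("xay,y->xa", P, W)).min(axis=1)
        gamma_new, W_new = T[x0], T - T[x0]     # normalize at the reference state
        if np.max(np.abs(W_new - W)) < tol and abs(gamma_new - gamma) < tol:
            gamma, W = gamma_new, W_new
            break
        gamma, W = gamma_new, W_new
    return gamma, W
```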
8. Heavy Traffic and Singular Control Problems: Examples and Markov Chain Approximations
Abstract
Many of the process models which are used for purposes of analysis or control are approximations to the true physical model. Perhaps the dimension of the actual physical model is very high, or it might be difficult to define a manageable controlled dynamical (Markov) system model which describes well the quantities of basic interest. Sometimes the sheer size of the problem and the nature of the interactions of the component effects allow a good approximation to be made, in the sense that some form of the central limit theorem might be used to “summarize” or “aggregate” many of the random influences and provide a good description of the quantities of interest. Because these simpler or aggregate models will be used in an optimization problem, we need to be sure that optimal or nearly optimal controls (and the minimum value function, resp.) for the aggregated problem will also be nearly optimal for the actual physical problem (and a good approximation to the associated minimum value function, resp.).
Harold J. Kushner, Paul G. Dupuis
9. Weak Convergence and the Characterization of Processes
Abstract
This chapter begins the section of the book devoted to the convergence proofs and related matters. The purpose of the chapter is to introduce the mathematical machinery that is needed in the later chapters. Because particular applications are intended, we do not, in general, give the most elaborate versions of the theorems to be presented.
Harold J. Kushner, Paul G. Dupuis
10. Convergence Proofs
Abstract
This chapter is the core of the mathematical part of the book. It deals with the approximation and convergence theorems for the basic problem classes: discounted problems with absorbing boundaries; diffusion and jump diffusion models; optimal stopping problems, and problems where we stop on hitting a target set and where there is no discounting. The convergence results for the case of reflecting boundaries and the singular and ergodic control problems will appear in the next chapter.
Harold J. Kushner, Paul G. Dupuis
11. Convergence for Reflecting Boundaries, Singular Control and Ergodic Cost Problems
Abstract
The development of the convergence proofs of Chapter 10 is continued, but applied to the problem classes of Chapters 7 and 8. The reflecting boundary and discounted cost problem is covered in Section 11.1. The primary mathematical difficulty with which we must contend is the proof of tightness of the “reflecting process.” The problem is avoided by use of a time rescaling method, under which all the processes are tight. After proving the weak convergence of the rescaled processes and characterizing the limits, the rescaling is inverted to obtain the desired results. This “inversion” is possible due to the conditions imposed on the allowable reflection directions. The time rescaling idea appears to be a rather powerful tool.
Harold J. Kushner, Paul G. Dupuis
12. Finite Time Problems and Nonlinear Filtering
Abstract
The problems considered in Chapters 10 and 11 were of interest over an unbounded time interval, and one did not need to keep track of the actual value of the current time. The cost functions were all over an unbounded interval, whether of the discounted form, of the form where control stops when a set is first exited, or of the average cost per unit time type. Time did not enter explicitly into the stopping criterion. All of the results and problem formulations (except for the average cost per unit time formulation) can be readily extended to problems of interest over a given bounded interval only. Owing to the explicit use of a bounded time interval, there are several variations of the locally consistent approximating Markov chains which might be used. They can all be easily derived from those of Chapter 5. The approximating chains are loosely divided into the “explicit” and “implicit” classes, depending on the treatment of time, somewhat analogous to the classification in classical numerical analysis. Section 12.1 contains an example to motivate the general form of the “explicit” method, and the general case is dealt with in Section 12.2. A motivating example for the “implicit” method appears in Section 12.3, and the general case is given in Section 12.4. Various combinations of these methods can be used. An optimal control problem is formulated in Section 12.5, and the numerical questions as well as the convergence of the algorithms are dealt with in Section 12.6. It turns out that the natural analogues of all of the models of control problems of the previous chapters (with the above-cited “ergodic” exception) can be dealt with by the previous proofs with little change.
Harold J. Kushner, Paul G. Dupuis
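The “explicit” class of methods steps backward on a fixed time grid. The following sketch shows the generic backward recursion for a finite horizon problem with running cost k, terminal cost g, and time step dt; the names and array conventions are ours, and the coupling of dt to the chain's interpolation interval is suppressed for simplicity.

```python
import numpy as np

def explicit_finite_horizon(P, c, g, dt, N):
    """Backward recursion V_n(x) = min_a [ c(x,a)*dt + sum_y P(x,a,y) V_{n+1}(y) ],
    with terminal condition V_N = g."""
    S, A, _ = P.shape
    V = g.copy()                         # terminal cost at time N * dt
    policies = np.empty((N, S), dtype=int)
    for n in reversed(range(N)):
        Q = c * dt + np.einsum("xay,y->xa", P, V)
        policies[n] = Q.argmin(axis=1)   # minimizing control at time step n
        V = Q.min(axis=1)
    return V, policies                   # approximate value at time 0 and controls
```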
13. Problems from the Calculus of Variations
Abstract
A large class of deterministic optimal control problems consists of special cases of the stochastic optimal control problems considered previously. This is true with respect both to the construction of schemes and to the proofs of convergence. In fact, the convergence proofs become much simpler in the deterministic setting.
Harold J. Kushner, Paul G. Dupuis
14. The Viscosity Solution Approach to Proving Convergence of Numerical Schemes
Abstract
In Chapters 10 to 13, we have shown the convergence of properly designed numerical approximations for a wide range of stochastic and deterministic optimal control problems. The approach to proving the convergence has been based on demonstrating the convergence of a sequence of controlled Markov chains to a controlled process (diffusion, jump diffusion, etc.) appropriate to the given stochastic or deterministic optimal control problem.
Harold J. Kushner, Paul G. Dupuis
Backmatter
Metadata
Title
Numerical Methods for Stochastic Control Problems in Continuous Time
Authors
Harold J. Kushner
Paul G. Dupuis
Copyright year
1992
Publisher
Springer US
Electronic ISBN
978-1-4684-0441-8
Print ISBN
978-1-4684-0443-2
DOI
https://doi.org/10.1007/978-1-4684-0441-8