
About this Book

As is well known, Pontryagin's maximum principle and Bellman's dynamic programming are the two principal and most commonly used approaches in solving stochastic optimal control problems. An interesting phenomenon one can observe from the literature is that these two approaches have been developed separately and independently. Since both methods are used to investigate the same problems, a natural question one will ask is the following:

(Q) What is the relationship between the maximum principle and dynamic programming in stochastic optimal controls?

There was some research (prior to the 1980s) on the relationship between these two. Nevertheless, the results were usually stated in heuristic terms and proved under rather restrictive assumptions, which were not satisfied in most cases.

In the statement of a Pontryagin-type maximum principle there is an adjoint equation, which is an ordinary differential equation (ODE) in the (finite-dimensional) deterministic case and a stochastic differential equation (SDE) in the stochastic case. The system consisting of the adjoint equation, the original state equation, and the maximum condition is referred to as an (extended) Hamiltonian system. On the other hand, in Bellman's dynamic programming, there is a partial differential equation (PDE), of first order in the (finite-dimensional) deterministic case and of second order in the stochastic case. This is known as a Hamilton-Jacobi-Bellman (HJB) equation.

Table of Contents

Frontmatter

Chapter 1. Basic Stochastic Calculus

Abstract
Stochastic calculus serves as a fundamental tool throughout this book. This chapter is meant to be a convenient “User’s Guide” on stochastic calculus for use in the subsequent chapters. Specifically, it collects the definitions and results in stochastic calculus scattered around in the literature that are related to stochastic controls. It also unifies terminology and notation (which may differ in different papers/books) that are to be used in later chapters. Proofs of the results presented in this chapter are either given (which is the case when we think that the proof is important in understanding the subsequent material and/or when there is no immediate reference available) or else referred to standard and easily accessible books. Knowledgeable readers may skip this chapter or regard it as a quick reference.
Jiongmin Yong, Xun Yu Zhou
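As a small illustration of the kind of computation stochastic calculus makes precise, the following sketch (not from the book; the step count and seed are arbitrary choices) simulates a Brownian path on [0, 1] and checks numerically that its quadratic variation is close to 1, the fact that underlies Itô's formula.

```python
import random

def quadratic_variation(T=1.0, n=100_000, seed=42):
    """Simulate a Brownian path on [0, T] with n equal steps and return
    the sum of squared increments (the sampled quadratic variation)."""
    rng = random.Random(seed)
    dt = T / n
    # Brownian increments over disjoint intervals are independent N(0, dt).
    return sum(rng.gauss(0.0, dt ** 0.5) ** 2 for _ in range(n))

qv = quadratic_variation()
# For Brownian motion the quadratic variation on [0, T] equals T (in
# probability), so the Monte Carlo estimate should be close to 1.
```

This nonzero quadratic variation is precisely what produces the extra second-order term in Itô's formula, distinguishing stochastic calculus from the classical chain rule.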

Chapter 2. Stochastic Optimal Control Problems

Abstract
Uncertainty is inherent in most real-world systems. It places many disadvantages (and sometimes, surprisingly, advantages) on humankind’s efforts, which are usually associated with the quest for optimal results. The systems mainly studied in this book are dynamic, namely, they evolve over time. Moreover, they are described by Itô’s stochastic differential equations and are sometimes called diffusion models. The basic source of uncertainty in diffusion models is white noise, which represents the joint effects of a large number of independent random forces acting on the systems. Since the systems are dynamic, the relevant decisions (controls), which are made based on the most updated information available to the decision makers (controllers), must also change over time. The decision makers must select an optimal decision among all possible ones to achieve the best expected result related to their goals. Such optimization problems are called stochastic optimal control problems. The range of stochastic optimal control problems covers a variety of physical, biological, economic, and management systems, just to mention a few. In this chapter we shall set up a rigorous mathematical framework for stochastic optimal control problems.
Jiongmin Yong, Xun Yu Zhou
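In symbols, the standard formulation sketched above reads as follows (a generic textbook form; the book's exact setup and notation may differ in detail). One minimizes a cost functional over admissible controls u(·) subject to an Itô stochastic differential equation:

```latex
\begin{aligned}
  dx(t) &= b\bigl(t, x(t), u(t)\bigr)\,dt
          + \sigma\bigl(t, x(t), u(t)\bigr)\,dW(t), \qquad t \in [0, T],\\
  x(0)  &= x_0,\\
  J\bigl(u(\cdot)\bigr) &= \mathbb{E}\!\left[\int_0^T f\bigl(t, x(t), u(t)\bigr)\,dt
          + h\bigl(x(T)\bigr)\right],
\end{aligned}
```

where W is a standard Brownian motion (the "white noise" integrated), b and σ are the drift and diffusion coefficients, and the infimum of J over all admissible controls defines the value of the problem.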

Chapter 3. Maximum Principle and Stochastic Hamiltonian Systems

Abstract
One of the principal approaches in solving optimization problems is to derive a set of necessary conditions that must be satisfied by any optimal solution. For example, in obtaining an optimum of a finite-dimensional function, one relies on the zero-derivative condition (for the unconstrained case) or the Kuhn-Tucker conditions (for the constrained case), which are necessary conditions for optimality. These necessary conditions become sufficient under certain convexity assumptions on the objective/constraint functions. Optimal control problems may be regarded as optimization problems in infinite-dimensional spaces; thus they are substantially more difficult to solve. The maximum principle, formulated and derived by Pontryagin and his group in the 1950s, is truly a milestone of optimal control theory. It states that any optimal control, along with the optimal state trajectory, must solve the so-called (extended) Hamiltonian system, which is a two-point boundary value problem (also called a forward-backward differential equation, to facilitate comparison with the stochastic case), plus a maximum condition on a function called the Hamiltonian. The mathematical significance of the maximum principle lies in the fact that maximizing the Hamiltonian is much easier than solving the original, infinite-dimensional control problem. This leads to closed-form solutions for certain classes of optimal control problems, including the linear quadratic case.
Jiongmin Yong, Xun Yu Zhou
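For orientation, here is the deterministic finite-dimensional version of the Hamiltonian system just described (a standard textbook form under one common sign convention; the book's statement may differ). With Hamiltonian H(t, x, u, p) = ⟨p, b(t, x, u)⟩ − f(t, x, u), the optimal pair (x̄, ū) and adjoint p satisfy:

```latex
\begin{cases}
  \dot{\bar{x}}(t) = b\bigl(t, \bar{x}(t), \bar{u}(t)\bigr),
    & \bar{x}(0) = x_0,\\[2pt]
  \dot{p}(t) = -H_x\bigl(t, \bar{x}(t), \bar{u}(t), p(t)\bigr),
    & p(T) = -h_x\bigl(\bar{x}(T)\bigr),\\[2pt]
  H\bigl(t, \bar{x}(t), \bar{u}(t), p(t)\bigr)
    = \displaystyle\max_{u \in U} H\bigl(t, \bar{x}(t), u, p(t)\bigr),
    & \text{a.e. } t \in [0, T].
\end{cases}
```

The forward equation carries an initial condition and the backward (adjoint) equation a terminal one, which is exactly why the system is a two-point boundary value problem.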

Chapter 4. Dynamic Programming and HJB Equations

Abstract
In this chapter we turn to study another powerful approach to solving optimal control problems, namely, the method of dynamic programming. Dynamic programming, originated by R. Bellman in the early 1950s, is a mathematical technique for making a sequence of interrelated decisions, which can be applied to many optimization problems (including optimal control problems). The basic idea of this method applied to optimal controls is to consider a family of optimal control problems with different initial times and states, to establish relationships among these problems via the so-called Hamilton-Jacobi-Bellman equation (HJB, for short), which is a nonlinear first-order (in the deterministic case) or second-order (in the stochastic case) partial differential equation. If the HJB equation is solvable (either analytically or numerically), then one can obtain an optimal feedback control by taking the maximizer/minimizer of the Hamiltonian or generalized Hamiltonian involved in the HJB equation. This is the so-called verification technique. Note that this approach actually gives solutions to the whole family of problems (with different initial times and states), and in particular, the original problem.
Jiongmin Yong, Xun Yu Zhou
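Written out (in one common sign convention, for a minimization problem; the book's convention may differ), the stochastic HJB equation for the value function V is:

```latex
\begin{cases}
  V_t(t, x) + \displaystyle\inf_{u \in U}\Bigl\{
      \tfrac{1}{2}\,\mathrm{tr}\bigl(\sigma\sigma^{\top}(t, x, u)\,V_{xx}(t, x)\bigr)
      + \bigl\langle b(t, x, u),\, V_x(t, x)\bigr\rangle
      + f(t, x, u) \Bigr\} = 0,\\[4pt]
  V(T, x) = h(x),
\end{cases}
```

with the deterministic (first-order) case recovered by dropping the trace term. A minimizer u*(t, x) of the expression in braces, when sufficiently regular, yields an optimal feedback control; this is the verification technique mentioned above.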

Chapter 5. The Relationship Between the Maximum Principle and Dynamic Programming

Abstract
In Chapters 3 and 4 we studied Pontryagin’s maximum principle (MP, for short) and Bellman’s dynamic programming (DP, for short). These two approaches serve as two of the most important tools in solving optimal control problems. Both MP and DP can be regarded as some necessary conditions of optimal controls (under certain conditions, they become sufficient ones). An interesting phenomenon one can observe from the literature is that to a great extent these two approaches have been developed separately and independently. Hence, a natural question arises: Are there any relations between these two? In this chapter we are going to address this question.
Jiongmin Yong, Xun Yu Zhou
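The best-known instance of this relationship, stated here informally for the smooth deterministic case under one common sign convention (precise conditions are the subject of the chapter), identifies the adjoint variable of the maximum principle with the gradient of the value function along the optimal trajectory:

```latex
p(t) = -V_x\bigl(t, \bar{x}(t)\bigr), \qquad t \in [0, T],
```

where p is the adjoint process and V the value function of dynamic programming. In the stochastic case a similar identification involves second-order adjoint processes, and when V fails to be smooth the statement is made rigorous via nonsmooth analysis and viscosity solutions.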

Chapter 6. Linear Quadratic Optimal Control Problems

Abstract
We have studied general nonlinear optimal control problems for both the deterministic and stochastic cases in previous chapters. In this chapter we are going to investigate a special case of optimal control problems where the state equations are linear in both the state and control with nonhomogeneous terms, and the cost functionals are quadratic. Such a control problem is called a linear quadratic optimal control problem (LQ problem, for short). The LQ problems constitute an extremely important class of optimal control problems, since they can model many problems in applications, and more importantly, many nonlinear control problems can be reasonably approximated by the LQ problems. On the other hand, solutions of LQ problems exhibit elegant properties due to their simple and nice structures.
Jiongmin Yong, Xun Yu Zhou
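As a tiny illustration of the LQ structure, the sketch below solves a discrete-time, scalar, deterministic LQ problem by the backward Riccati recursion (an elementary analogue, not the book's continuous-time stochastic setting; all symbols and values here are illustrative). The recursion yields linear feedback gains, and the optimal cost from x0 is P[0] * x0**2.

```python
def lq_riccati(a, b, q, r, qT, N):
    """Backward Riccati recursion for minimizing
    sum_{k<N} (q x_k^2 + r u_k^2) + qT x_N^2
    subject to x_{k+1} = a x_k + b u_k.
    Returns cost-to-go coefficients P[0..N] and gains K[0..N-1]."""
    P = [0.0] * (N + 1)
    K = [0.0] * N
    P[N] = qT
    for k in range(N - 1, -1, -1):
        # Optimal control at stage k is the linear feedback u_k = -K[k] * x_k.
        K[k] = a * b * P[k + 1] / (r + b * b * P[k + 1])
        P[k] = q + a * a * P[k + 1] - a * b * P[k + 1] * K[k]
    return P, K

def closed_loop_cost(a, b, q, r, qT, K, x0):
    """Simulate u_k = -K[k] x_k and accumulate the quadratic cost."""
    x, J = x0, 0.0
    for Kk in K:
        u = -Kk * x
        J += q * x * x + r * u * u
        x = a * x + b * u
    return J + qT * x * x
```

With, say, a = 1.1, b = 0.5, q = r = qT = 1, and N = 20, the simulated closed-loop cost matches P[0] * x0**2 to machine precision, which is exactly the dynamic-programming characterization of the LQ value function as a quadratic in the state.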

Chapter 7. Backward Stochastic Differential Equations

Abstract
In Chapter 3, in order to derive the stochastic maximum principle as a set of necessary conditions for optimal controls, we encountered the problem of finding adapted solutions to the adjoint equations. Those are terminal value problems of (linear) stochastic differential equations involving the Itô stochastic integral. We call them backward stochastic differential equations (BSDEs, for short). For an ordinary differential equation (ODE, for short), under the usual Lipschitz condition, both the initial value and the terminal value problems are well-posed. As a matter of fact, for an ODE, the terminal value problem on [0, T] is equivalent to an initial value problem on [0, T] under the time-reversing transformation t ↦ T − t. However, things are fundamentally different (and difficult) for BSDEs when we are looking for a solution that is adapted to the given filtration. Practically, one knows only what has happened in the past, but cannot foretell what is going to happen in the future. Mathematically, this means that we would like to stay within the framework of Itô-type stochastic calculus (and do not want to involve the so-called anticipative integral). As a result, one cannot simply reverse time to get a solution to a terminal value problem of an SDE, as doing so would destroy adaptedness. Therefore, the first issue one should address in the stochastic case is how to correctly formulate a terminal value problem for stochastic differential equations (SDEs, for short).
Jiongmin Yong, Xun Yu Zhou
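In integral form, a BSDE with terminal condition ξ and generator g asks for a pair of adapted processes (Y, Z) satisfying (a standard formulation; notation hedged):

```latex
Y(t) = \xi + \int_t^T g\bigl(s, Y(s), Z(s)\bigr)\,ds
       - \int_t^T Z(s)\,dW(s), \qquad t \in [0, T],
```

where adaptedness of both Y and Z to the filtration generated by W is part of the definition. The second unknown Z is precisely what makes an adapted solution possible: it absorbs the randomness of the terminal condition so that Y need not anticipate the future.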

Backmatter
