main-content

## Über dieses Buch

This IMA Volume in Mathematics and its Applications STOCHASTIC DIFFERENTIAL SYSTEMS, STOCHASTIC CONTROL THEORY AND APPLICATIONS is the proceedings of a workshop which was an integral part of the 1986-87 IMA program on STOCHASTIC DIFFERENTIAL EQUATIONS AND THEIR APPLICATIONS. We are grateful to the Scientific Committee: Daniel Stroock (Chairman) WendeIl Flerning Theodore Harris Pierre-Louis Lions Steven Orey George Papanicolaou for planning and implementing an exciting and stimulating year-long program. We es­ pecially thank WendeIl Fleming and Pierre-Louis Lions for organizing an interesting and productive workshop in an area in which mathematics is beginning to make significant contributions to real-world problems. George R. Seil Hans Weinberger PREFACE This volume is the Proceedings of a Workshop on Stochastic Differential Systems, Stochastic Control Theory, and Applications held at IMA June 9-19,1986. The Workshop Program Commit­ tee consisted of W.H. Fleming and P.-L. Lions (co-chairmen), J. Baras, B. Hajek, J.M. Harrison, and H. Sussmann. The Workshop emphasized topics in the following four areas. (1) Mathematical theory of stochastic differential systems, stochastic control and nonlinear filtering for Markov diffusion processes. Connections with partial differential equations. (2) Applications of stochastic differential system theory, in engineering and management sci­ ence. Adaptive control of Markov processes. Advanced computational methods in stochas­ tic control and nonlinear filtering. (3) Stochastic scheduling, queueing networks, and related topics. Flow control, multiarm bandit problems, applications to problems of computer networks and scheduling of complex manufacturing operations.

## Inhaltsverzeichnis

### Optimality of “full bang to reduce predicted miss” for some partially observed stochastic control problems

For final value stochastic control problems, the “predicted miss” of the title is the expected final position, conditional on the cumulative information to date, if no further control is exerted. For partially observed problems with bounded control, similar to some proposed by Åström, we use PDE methods to show that predicted miss defines the switching surfaces for an optimal “bang-bang” control law whose gist, simply stated, is to reduce predicted miss. The surfaces in question are explicitly calculated.

V. E. Beneš, R. W. Rishel

### On Some Approximation Techniques in Non Linear Filtering

The following problem has been considered by J. Picard [3]. Let a signal be governed by the equation 1.1 $$^{\begin{array}{*{20}{c}} {dx = b\left( x \right)dt + {\varepsilon ^r}\sigma \left( x \right)d{w^1}} \\ {x\left( 0 \right) = \xi } \end{array}}$$ This signal is “observed” by the process 1.2 $$\begin{array}{*{20}{c}} {dy = h\left( x \right)dt + {\varepsilon ^{1 - }}{}^rd{w^2}} \\ {y\left( 0 \right) = 0} \end{array}$$ where wl, w2 are two independent Wiener processes, and ξ is a random variable independent of wl, w2. The parameter ε is small and 0 < γ < ½. This assumption means that there is a high signal to noise ratio. Let 1.3 $$\phi \left( t \right) = E\left[ {\phi \left( {{x_t}} \right)\left| { y\left( s \right), 0 \leqslant s \leqslant t} \right.} \right]$$

A. Bensoussan

### Applications of Homogenization Theory to the Control of Flexible Structures

Under certain natural conditions the dynamics of large, low mass lattice structures with a regular infrastructure are well approximated by the dynamics of continua, e.g., trusses may be modeled by beam equations. Using a technique from the mathematics of asymptotic analysis called homogenization, we show how such approximations may be derived in a systematic way which avoids errors made using “direct” averaging methods. We also develop a model for the combined problem of homogenization and control of vibrations in lattice structures and present some preliminary analysis of this problem.

Gilmer L. Blankenship

### Control of Markov Chains with Long-Run Average Cost Criterion

The long-run average cost control problem for discrete time Markov chains is studied in an extremely general framework. Existence of stable stationary strategies which are optimal in the appropriate sense is established and these are characterized via the dynamic programming equations. The approach here differs from the conventional approach via the discounted cost problem and covers situations not covered by the latter.

Vivek S. Borkar

### Automatic Study in Stochastic Control

The purpose of this paper is to give an example of automatic generation of a complete study in stochastic control done by an expert system designed at INRIA by the authors. This study includes the: automatic choice of an algorithmautomatic checking of the mathematical well posedness of the problemautomatic generation of a numerical routineautomatic test of this routine on a numerical exampleautomatic generation of graphicsautomatic writing of a report describing all the obtained results.

J. P. Chancelier, C. Gomez, J. P. Quadrat, A. Sulem

### Some Results on Kolmogoroff Equations for Infinite Dimensional Stochastic Systems

Consider the following differential stochastic equation 1.1 $$du = (Bu + f(u))dt + G(u)dWt;u(0) = x$$ where B: D(B) ⊂ H → H is a linear operator (generally unbounded) in a real separable Hilbert space H, f is a smooth function from H to H, W t is a K-valued Brownian motion in a probability space (Ω, ε, P) and K is another Hilbert space. Finally G: D(G) ⊂ H → L(K; H) is a linear operator (generally unbounded) from H into L(K; H). S is a positive nuclear operator in K such that cov(W t ) = tS.

G. Da Prato

### Hamilton-Jacobi Equations with Constraints

Let us consider Hamilton-Jacobi equations of the form 1 $$u(x) + H(x,Du(x)) = 0,x\in \Omega,$$ where Ω is a smooth bounded open subset of Rn, H is a given continuous real valued function of $$(x,p) \in \overline\Omega\times {R^n}$$ and $$Du = ({u_{{x_1}}}, \ldots ,{u_{{x_N}}})$$ is the gradient of the unknown function u.

I. Capuzzo-Dolcetta

### An Approximate Minimum Principle for a Partially Observed Markov Chain

The optimal control of a partially observed Markov chain is considered in separated form by discussing the related Zakai equation satisfied by the unnormalized densities. This approach was introduced by Davis [1], [2]. An optimal principle for a partially observed Markov process was also obtained by Rishel in [9]. The observation process is a point process whose intensity is a function of the Markov chain. The set of admissible controls is shown to be a complete, metric space on which the cost function is continuous. Following techniques used in [7] a minimum principle for an almost optimal control is obtained using a variational inequality of Ekeland [3], and the co-state variables are identified. The results of this paper are an outcome of the author’s visit to the University of Maryland in the spring of 1986 and he wishes to express his gratitude to Professor Baras and the Systems Research Center.

Robert J. Elliott

### Generalized Solutions in the Optimal Control of Diffusions

We consider optimal control of Markov diffusion procesess n-dimensional Rn, on a finite time interval t ≤ s ≤ T. The dynamics of the process x s being controlled are governed by the stochastic differential equation 1$$d{x_s} = b(s,x,{u_s})ds + \sigma (s,{x_s},{u_s})d{\omega _s}$$ with initial data x s = x. The control process us takes values in a control space Y, and is progressively measurable with respect to the filtration of the brownian motion process w s . The objective is to minimize an expected cost 1.2$${J^U}(t,x) = {E_{tx}}\smallint _t^t{\ell _o}(s,{x_s}{u_s})ds$$ .

Wendell H. Fleming, Domokos Vermes

### Consistency of Maximum Likelihood and Pseudo-Likelihood Estimators for Gibbs Distributions

We prove that the Maximum Likelihood and Pseudo-likelihood estimators for the parameters of Gibbs distributions (equivalently Markov Random Fields) over ℤd, d≥l, are consistent even at points of “first” or “higher-order” phase transitions. The distributions are parametrized by points in a finite-dimensional Euclidean space ℝm, m≥l, and the single spin state space is either a finite set or a compact metric space. Also, the underlying interactions need not be of finite range.

B. Gidas

### Brownian Models of Queueing Networks with Heterogeneous Customer Populations

Consider an open queueing network with I single-server stations and K customer classes. Each customer class requires service at a specified station, and customers change class after service in a Markovian fashion. (With K allowed to be arbitrary, this routing structure is almost perfectly general.) There is a renewal input process and general service time distribution for each class. The correspondence between customer classes and service stations is in general many to one, and the service discipline (or scheduling protocol) at each station is left as a matter for dynamic decision making.

J. Michael Harrison

### Non-Linear Filtering — The Degenerate Case

We consider the non-linear filtering problem where the state (or signal) satisfies 1.1$$d{X_t} = b(t,{X_t},Y)dt + \sigma (t,{X_t},Y)d{w_t}$$ and the observation satisfies 1.2$$d{Y_t} = h(t,{X_t})dt + d{\tilde w_t}$$

U. G. Haussmann

### The Asymptotic Behaviour of the Maximum Likelihood Estimates for a Class of Diffusion Processes

Let the 2-dimensional process (X t ) satisfy the stochastic differential equation $$d{{X}_{t}} = \theta J{{X}_{t}} + d{{W}_{t}},X\left( 0 \right) = \xi$$, where (W t ) is a 2-dimensional Wiener process starting at zero, $$J = ({}_{1{\text{ 0}}}^{0{\text{-1}}})$$ and θ is an unknown real-valued parameter. Let $$\hat \theta {}_T$$ denote the maximum likelihood estimate of θ given the observations (X s ) up to time T. We shall show that $$_{T \to \infty }^{\lim {\text{ sup}}}\left[ {\frac{T}{{\lg \lg T}}} \right.\left. {\left| {{{\hat \theta }_T} - \theta } \right|} \right] \leqslant 1{\text{ a}}{\text{.e}}{\text{.}}$$

Kurt Helmes

### The Filtering Problem for Infinite Dimensional Stochastic Processes

The paper presents some recently obtained results on the nonlinear filtering problem for infinite dimensional processes. The optimal filter is obtained as the unique solution of certain measure valued equations. Robustness properties — both pathwise and statistical — are given and a preliminary result shows consistency with the stochastic calculus theory. Applications to random fields and models of voltage potential in neurophysiology are briefly discussed.

G. Kallianpur, R. L. Karandika

### Stochastic Control Under Finite-Fuel Constraints

The aim of this lecture is to present an overview of some recent results on control problems of the finite fuel type for Brownian motion, in which there is a constraint on the total amount of controlling effort that can be expended.

Ioannis Karatzas

### Recent Advances in the Theory of Stochastic Adaptive Control

In recent years much progress has been made on the theory of adaptive control for linear stochastic systems. The stability of some simple adaptive control laws, their optimality with respect to the minimum output variance criterion, and their self-tuning property for both the regulation and tracking problems, have all been proved. In this paper we outline these recent advances in the theory of stochastic adaptive control and also indicate some of the open problems.

P. R. Kumar

### Almost Optimal Controls for Wideband Noise Driven Systems

The work reported here is based on the joint paper [1] of the author and W. Runggaldier.

Harold J. Kushner

### Asymptotic Solutions of Bandit Problems

Some recent results on asymptotically optimal solutions of bandit problems are reviewed and discussed herein. The problems considered include (a) the classical “closed bandit problem” of adaptive allocation involving k statistical populations, and (b) the “open bandit problem” of priority scheduling in a queueing network. Making use of the interconnections between the discounted and finite-horizon formulations of these problems, we also suggest certain heuristic arguments that lead to simple asymptotic solutions.

T. L. Lai

### Viscosity Solutions of Second-Order Equations, Stochastic Control and Stochastic Differential Games

In this note we review, explain and detail some recent results concerning the possible relations between various value functions of general optimal stochastic control and stochastic differential games and the viscosity solutions of the associated Hamilton-Jacobi-Bellman HJB and Bellman-Isaacs BI equations. It is well-known that the derivation of these equations is heuristic and it is justified only when the value functions are smooth enough (W.H. Fleming and R. Richel [15]). On the other hand, the equations are fully nonlinear, second-order, elliptic but possibly degenerate. Smooth solutions do not exist in general and nonsmooth solutions (like Lipschitz continuous solutions in the deterministic case) are highly nonunique. (For some simple examples we refer to P.-L. Lions [24]). As far as the first-order Hamilton-Jacobi equations are concerned, to overcome these typical difficulties and related ones like numerical approximations, asymptotic problems etc. M.G. Crandall and P.-L. Lions [8] introduced the notion of viscosity solutions and proved general uniqueness results. A systematic exploration of several equivalent formulations of this notion and an easy and readable account of the typical uniqueness results may be found in M.G. Crandall, L.C. Evans and P.-L. Lions [6]. It was also observed in P.-L. Lions [24] that the classical derivation of the Bellman equation for deterministic control problems can be easily adapted to yield the following general fact: Value functions of deterministic control problems are always viscosity solutions of the associated Hamilton-Jacobi-Bellman equations. The uniqueness of viscosity solutions and the above fact imply then a complete characterization of the value functions. This observation was then extended to differential games by E.N. Barron, L.C. Evans and R. Jensen [3], P.E. Souganidis [36] and L.C. Evans and P.E. Souganidis [14].

P.-L. Lions, P. E. Souganidis

### On the Memory Length of the Optimal Nonlinear Filter

Consider the standard one-dimensional nonlinear filtering problem with plant diffusion coefficient εγ and observation noise coefficient ε1-γ, stationarity of the plant is assumed. Definitions for the memory length of the optimal non-linear filter are suggested and it is shown that the memory length tends to zero as ε → 0 for γ < 1/2 and is bounded away from 0 (uniformly in ε) for γ > 1/2. It is pointed out that while for ε → 0, γ < 1/2, linear filtering is nearly optimal, this is not the case for γ > 1/2 and an approximate asymptotically optimal non-linear filter is suggested for a particular generic example.

E. Mayer-Wolf, M. Zakai, O. Zeitouni

### Implementation Issues for Markov Decision Processes

In this paper, the problem of steering a lon-run average cost functional to a prespecified value is discussed in the context of Markov decision processes with countable state-space; this problem naturally arises in the study of constrained Markov decision processes by Lagrangian arguments. Under reasonable assumptions, a Markov stationary steering control is shown to exist and to be obtained by fixed memoryless randomization between two Markov stationary policies. The implementability of this randomized policy is investigated in view of the fact that the randomization bias is solution to a (highly) nonlinear equation, which may not even be available in the absence of full knowledge of the model parameter values. Several proposals for implementation are made and their relative properties discussed. The paper closes with an outline of a methodology that was found useful in investigating properties of Certainty Equivalence implementations.

Armand M. Makowski, Adam Shwartz

### Navigating and Stopping Multi-Parameter Bandit Processes

In recent years, considerable effort has been devoted to the development of a theory for multi-parameter processes. These are stochastic processes that evolve in “time” which is only partially ordered. The multi-parameter theory provides a natural way to formulate problems in dynamic allocation of resources, including discrete and continuous time multi-armed bandits as special cases. Multi-parameter processes that describe a game played by a gambler against a multi-armed bandit are called bandit processes. My talk will focus on two control problems for bandit processes. The first problem, the optimal stopping problem, is that of a gambler who can stop playing at any time. The reward from the game depends only on the state of affairs at the time of stopping, and the gambler’s problem is to choose an optimal stopping time. In the second problem, the optimal navigation problem, the gambler plays forever and seeks to maximize total discounted reward over an infinite horizon.

Avi Mandelbaum

### Bounded Variation Control of a Damped Linear Oscillator Under Random Disturbances

In this paper we give an overview of some recent works on the optimal control of a damped linear oscillator by means of monotone or bounded variation policies in the presence of random disturbances. We refer to Gorbunov [8] Bancora-Imbert et al. [1]. Chow and Menaldi [3.4], Sun and Menaldi [24].

Jose-Luis Menaldi

### The Support of the Law of a Filter in C ∞ Topology

In their well known paper, [18], D.W. Stroock and S. Varadhan proved that the support of the law of a diffusion process solution of a stochastic differential equation: $$d{x_t} = {X_0}({x_t})dt + {X_i}({x_t})^\circ dw_t^i$$ can be described as the closure (for the natural Banach topology on C([0,1], ℝn) of the set of solutions of the following controlled systems: $$\begin{array}{*{20}{c}} {\dot{x}_{t}^{u} = {{X}_{0}}\left( {x_{t}^{u}} \right) + {{X}_{i}}\left( {x_{t}^{u}} \right)\dot{u}_{t}^{i}, u \in {{H}^{1}}\left( {\left[ {0,1} \right],{{\mathbb{R}}^{n}}} \right)} \hfill \\ {x_{o}^{u} = {{x}_{0}}} \hfill \\ \end{array}$$

M. Chaleyat-Maurel, D. Michel

### Existence of Densities for Statistics in the Cubic Sensor Problem

Let $${\mathop \phi \limits^ \wedge _1}, \ldots ,{\mathop \phi \limits^ \wedge _n}$$ be the unnormalized conditional estimates of Φ1(X(t)),…,Φn(X(t)) in the cubic sensor problem. For arbitrary n and linearly independent Φ1,…, Φn we show that the random variable $$\Phi = \left( {{{\mathop \phi \limits^ \wedge }_1}, \ldots ,{{\mathop \phi \limits^ \wedge }_n}} \right)$$ admits a density. This strongly precludes the existence of a finite dimensionally computable realization of the unnormalized conditional density, and it provides an example for a general theorem given in [6].

Daniel Ocone

### Piecewise Linear Filtering

We consider a nonlinear filtering problem, whose dynamics is “piecewise linear”. We construct a suboptimal filter consisting of a finite number of Kalman filters working in parallel, and exchanging their information at times δ, 2δ, 3δ …. We establish the convergence of that filter towards the optimal filter, as δ → 0.

E. Pardoux, C. Savona

### Quick Simulation of Excessive Backlogs in Networks of Queues

We consider stable open Jackson networks and study the rare events of excessive backlogs. Although, these events occur rarely they can be critical, since they can impair the functioning of the network. We attempt to estimate the probability of these events by simulations. Since, the direct simulation of rare events takes a very long time, this procedure is very costly. Instead, we devise a method for changing the network to speed up the simulation of rare events. We try to pursue this idea with the help of Large Deviation theory. This approach, under certain assumptions, results in a system of differential equations which may be difficult to solve. To circumvent this, we develop a heuristic method which gives the rule for changing the network for the purpose of simulations. We illustrate, by examples, that our method of simulations can be several orders of magnitude faster than direct simulations.

Shyam Parekh, Jean Walrand

### On Some Perturbation Problems in Optimal Stopping and Impulse Control

In this paper, we study the behaviour of dynamic programming inequalities for optimal stopping and impulsive control of Markov processes when the generator of the uncontrolled process is - Aε such that $${A_\varepsilon } = A + {\varepsilon ^{ - 1}}B{\kern 1pt} and{\kern 1pt} \varepsilon \to o.$$

Maurice Robin

### Optimal Control of Jump-Markov Processes and Viscosity Solutions

We investigate the Bellman equation that arises in the optimal control of Markov processes. This is a fully nonlinear integro-differential equation. The notion of viscosity solutions is introduced and then existence and uniqueness results are obtained. Also, the connection between the optimal control problem and the Bellman equation is developed.

Halil Mete Soner

### An Introduction to Singular Stochastic Control

Bounded variation stochastic control may be defined to include any stochastic control problem in which one restricts the cumulative displacement of the state caused by control to be of bounded variation on finite time intervals. In classical control problems, this cumulative displacement is the integral of the control process (or some function of it), and so is absolutely continuous. In impulse control (see Bensoussan & Lions (1978)), this cumulative displacement has jumps, between which it is either constant or absolutely continuous. Bounded variation control admits both these possibilities and also the possibility that the displacement of the state caused by the optimal control is singularly continuous, at least with positive probability over some interval of time. Problems which exhibit this feature will be called singular, and these are the objects of interest of the present paper.

Steven E. Shreve

### Scheduling, Routing, and Flow Control in Stochastic Networks

Queueing models are frequently helpful in the analysis and control of communication, manufacturing, and transportation systems. The theory of Markov decision processes and the inductive techniques of dynamic programming have been used to develop normative models for optimal control of admission, servicing, routing, and scheduling of jobs in queues and networks of queues. We review some of these models, beginning with single-facility models and then progressing to models for networks of queues. We emphasize the use of induction on a sequence of successive approximations of the optimal value function (value iteration) to establish the form of optimal control policies.

Shaler Stidham

### Product Expansions of Exponential Lie Series and the Discretization of Stochastic Differential Equations

We show how to rewrite the usual discretization schemes for stochastic differential equations in a way that uses the minimum necessary number of nonredundant iterated integrals. Also, we obtain schemes with the property that, if the true solution evolves in a submanifold, then the same is true of the approximation given by the scheme.

H. J. Sussmann

### A Survey of Large Time Asymptotics of Simulated Annealing Algorithms

Simulated annealing is a probabilistic algorithm for minimizing a general cost function which may have multiple local minima The amount of randomness in this algorithm is controlled by the “temperature”, a scalar parameter which is decreased to zero as the algorithm progresses. We consider the case where the minimization is carried out over a finite domain and we present a survey of several results and analytical tools for studying the asymptotic behavior of the simulated annealing algorithm, as time goes to infinity and temperature approaches zero.

John N. Tsitsiklis

### Stochastic Scheduling on Parallel Processors and Minimization of Concave Functions of Completion Times

We consider a stochastic scheduling problem in which n jobs are to be scheduled on m identical processors which operate in parallel. The processing times of the jobs are not known in advance but they have known distributions with hazard rates ρ1, (t), …, ρ n (t). It is desired to minimize the expected value of к(C), where C i is the time at which job i is completed C = (C1, …, C n ), and к(C) is increasing and concave in C. Suppose processor i first becomes available at time τ i . We prove that if there is a single static list priority policy which is optimal for every τ = (τ1, …, τ m ), then the minimal expected cost must be increasing and concave in τ. Moreover, if к(C) is supermodular in C then this cost is also supermodular in τ. This result is used to prove that processing jobs according to the static list priority order (1,2,…,n) minimizes the expected value of ∑w i h(C i ), when h(·) is a nondecreasing, concave function, w 1 ≥ … ≥ w n , and ρ1 (t1)w1 ≥ … ≥ ρ n (t n )w n for all t1, …, t n .

Richard R. Weber
Weitere Informationen