On sets of occupational measures generated by a deterministic control system on an infinite time horizon

doi:10.1016/j.na.2013.03.015

Nonlinear Analysis: Theory, Methods & Applications

Volume 88, September 2013, Pages 27-41

https://doi.org/10.1016/j.na.2013.03.015

Abstract

We give a representation for the closed convex hull of the set of discounted occupational measures generated by control-state trajectories of a deterministic control system. We also investigate the limit behavior of this set when the discount factor tends to zero and compare it with the limit behavior of the set of long run time average occupational measures. The novelty of our results is that we allow the control set to depend on the state variables, which makes the results applicable to differential inclusions.

Section snippets

Introduction and preliminaries

It is well known that nonlinear optimal control problems can be equivalently reformulated as infinite dimensional linear programming problems considered on spaces of occupational measures generated by control-state trajectories. Having many attractive features and being applicable in both stochastic and deterministic settings, the linear programming (LP) based approaches to optimal control problems have been intensively studied in the literature. Important results justifying the use of LP

A representation of the set of discounted occupational measures

Lemma 2.1

The following inclusion is valid: $\overline{\mathrm{co}}\, \Gamma_{dis}^{r}(C, y_{0}) \subset W(C, y_{0}),$ where $\overline{\mathrm{co}}$ stands for the closed convex hull, and $W(C, y_{0}) \overset{\mathrm{def}}{=} \left\{ \gamma \in P(K) : \int_{K} \left( \nabla \phi(y)^{T} f(u, y) + C (\phi(y_{0}) - \phi(y)) \right) \gamma(du, dy) = 0 \ \ \forall \phi \in C^{1} \right\}.$
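The identity that defines $W(C, y_{0})$ can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the scalar dynamics $f(u, y) = u - y$, the control $u(t) = \sin t$, the test function $\phi(y) = y^{3}$, and all function names are hypothetical choices that do not come from the paper. It also assumes that the discounted occupational measure $\gamma$ of a pair $(u(\cdot), y(\cdot))$ acts on continuous $h$ as $\int_{K} h\, d\gamma = C \int_{0}^{\infty} e^{-Ct} h(u(t), y(t))\, dt$.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b] with n subintervals (n must be even)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def lemma_residual(C, y0, phi, dphi, T=80.0, n=80000):
    """Approximate C * int_0^T e^{-Ct} [phi'(y) f(u, y) + C (phi(y0) - phi(y))] dt.

    Hypothetical toy system: y' = u - y with u(t) = sin(t), whose exact
    solution is y(t) = (sin t - cos t) / 2 + (y0 + 1/2) e^{-t}.
    """
    def y(t):
        return 0.5 * (math.sin(t) - math.cos(t)) + (y0 + 0.5) * math.exp(-t)

    def integrand(t):
        yt = y(t)
        f = math.sin(t) - yt  # f(u(t), y(t)) for the toy dynamics
        return C * math.exp(-C * t) * (dphi(yt) * f + C * (phi(y0) - phi(yt)))

    return simpson(integrand, 0.0, T, n)

residual = lemma_residual(
    C=0.5, y0=1.0, phi=lambda y: y ** 3, dphi=lambda y: 3 * y ** 2
)
print(f"residual of the W(C, y0) identity: {residual:.2e}")
```

The residual should vanish up to quadrature and truncation error, since the horizon $T$ is chosen so that $e^{-CT}$ is negligible.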

Proof

Take an arbitrary $\gamma \in \Gamma_{dis}^{r}(C, y_{0})$. By definition, there exists a relaxed $y_{0}$-admissible pair $(\mu(\cdot), y(\cdot))$ such that $\gamma$ is the discounted occupational measure generated by this pair. Using the fact that (11) is valid for any continuous function $h(u, y)$, one can obtain $\int_{K} \nabla \phi(y)^{T} f(u, y)\, \gamma(du, dy) = C \int_{0}^{\infty} e^{-Ct} (\nabla \phi(y(t)))^{T} \left( \int_{U(y(t))} f(u, y(t))\, \mu(t, du) \right) dt = C$
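The snippet above breaks off mid-computation. As a hedged sketch of how an argument of this kind usually concludes (not a reproduction of the authors' text), assume that (11) has the form $\int_{K} h\, d\gamma = C \int_{0}^{\infty} e^{-Ct} \int_{U(y(t))} h(u, y(t))\, \mu(t, du)\, dt$, that $y'(t) = \int_{U(y(t))} f(u, y(t))\, \mu(t, du)$ for almost every $t$, and that $\phi(y(t))$ stays bounded because $y(t)$ remains in the compact set $Y$. Integration by parts then gives:

```latex
\begin{align*}
\int_{K} \nabla\phi(y)^{T} f(u,y)\,\gamma(du,dy)
  &= C\int_{0}^{\infty} e^{-Ct}\,\frac{d}{dt}\,\phi(y(t))\,dt \\
  &= C\Big[e^{-Ct}\phi(y(t))\Big]_{0}^{\infty}
     + C^{2}\int_{0}^{\infty} e^{-Ct}\,\phi(y(t))\,dt \\
  &= -C\,\phi(y_{0}) + C\int_{K}\phi(y)\,\gamma(du,dy) \\
  &= -C\int_{K}\big(\phi(y_{0})-\phi(y)\big)\,\gamma(du,dy).
\end{align*}
```

The last step uses that $\gamma$ is a probability measure, so $\phi(y_{0}) = \int_{K} \phi(y_{0})\, \gamma(du, dy)$. Moving the right-hand side to the left yields the equality that defines $W(C, y_{0})$.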

Proof of Theorem 2.2

We will divide the proof into four steps.

(i) Auxiliary relaxed admissible pairs. Let the multivalued function $F(\cdot)$ be defined by the equation $F(y) \overset{\mathrm{def}}{=} \left\{ v : v = \int_{U(y)} f(u, y)\, \mu(du),\ \mu \in P(M),\ \operatorname{supp}(\mu) \subset U(y) \right\} \ \ \forall y \in Y.$ It is easy to verify that $F(\cdot)$ is upper semicontinuous. Hence, its graph $\operatorname{Graph}(F) \overset{\mathrm{def}}{=} \{ (v, y) : v \in F(y),\ y \in Y \}$ is compact.

Let $D$ and $Q$ be closed balls in $\mathbb{R}^{m}$ such that $Y \subset D$ and $F(y) \subset Q \ \ \forall y \in Y \ \Rightarrow \ \operatorname{Graph}(F) \subset Q \times D.$

Let $\nu(t, dv) : [0, \infty) \to P(Q)$ be measurable and let $y(t)$ satisfy the equation $y'(t) = \int_{Q} v\, \nu(t, dv)$ for a.e. $t > 0$; $y(0) = y_{0}.$
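To make the equation $y'(t) = \int_{Q} v\, \nu(t, dv)$ concrete, the following Python sketch integrates it with an explicit Euler scheme when $\nu(t, \cdot)$ has finite support. Everything in it is an illustrative assumption rather than part of the proof: the representation of a measure as a list of `(weight, vector)` pairs, the function names, and the particular two-point measure used in the example.

```python
def barycenter(nu):
    """Mean vector of a finitely supported probability measure.

    nu is a list of (weight, v) pairs with weights summing to 1.
    """
    dim = len(nu[0][1])
    return [sum(w * v[k] for w, v in nu) for k in range(dim)]

def relaxed_trajectory(nu_of_t, y0, T, steps):
    """Explicit Euler scheme for y'(t) = barycenter(nu(t)), y(0) = y0."""
    dt = T / steps
    y = list(y0)
    for i in range(steps):
        v = barycenter(nu_of_t(i * dt))
        y = [yk + dt * vk for yk, vk in zip(y, v)]
    return y

def nu(t):
    """Hypothetical relaxed velocity: equal mass on (1, t) and (-1, t)."""
    return [(0.5, (1.0, t)), (0.5, (-1.0, t))]

y_end = relaxed_trajectory(nu, (0.0, 0.0), T=1.0, steps=1000)
print(y_end)
```

In this example the first coordinate never moves, because the two velocities $\pm 1$ average to zero. No single velocity in the support of $\nu(t, \cdot)$ achieves that, which is the effect relaxed velocities are meant to capture.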

Proof of Lemma 2.4

To prove (22), it is enough to show that (28) implies (32). Note that from (28) (and from (25)) it follows that, for any sequence $C_{i} \to 0$, there exist a sequence of $y_{0}^{i} \in Y$ and a sequence of $y_{0}^{i}$-admissible pairs $(u^{i}(\cdot), y^{i}(\cdot))$ such that $\lim_{i \to \infty} \zeta_{i} = 0, \quad \zeta_{i} \overset{\mathrm{def}}{=} C_{i} \int_{0}^{+\infty} e^{-C_{i} t} g(u^{i}(t), y^{i}(t))\, dt - G^{*}.$ From Lemma 3.5(ii) in [28] it follows that there exists a sequence $S_{i}$, $i = 1, 2, \dots$, such that $S_{i} \geq \frac{K}{\sqrt{C_{i}}}$ ($K > 0$ being a constant) and such that $\frac{1}{S_{i}} \int_{0}^{S_{i}} g(u^{i}(t), y^{i}(t))\, dt \leq G^{*} + \zeta_{i} + \sqrt{C_{i}}$ $\Rightarrow \inf_{y_{0} \in Y} \Theta_{S_{i}}(y_{0}) \leq \Theta_{S_{i}}(y_{0}^{i}) \leq G^{*} + \zeta_{i} + \sqrt{C_{i}}$ $\Rightarrow \liminf_{S \to \infty} \inf_{y_{0} \in Y} \Theta_{S}(y_{0})$
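The two quantities compared in this argument, the discounted average $C \int_{0}^{\infty} e^{-Ct} g\, dt$ and the time average $\frac{1}{S} \int_{0}^{S} g\, dt$, can be evaluated side by side for a single fixed trajectory. The Python sketch below is a toy illustration only: the running cost $g(t) = 1 + \cos t$ stands in for $g(u(t), y(t))$ along a hypothetical periodic pair, the horizon $S = 1/C$ is an arbitrary choice and not the sequence $S_{i}$ of the lemma, and no optimization over admissible pairs is performed.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b] with n subintervals (n must be even)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def discounted_average(g, C, tail=40.0, h=0.01):
    """C * int_0^T e^{-Ct} g(t) dt with T = tail / C, so e^{-CT} is negligible."""
    T = tail / C
    n = 2 * math.ceil(T / (2 * h))  # even number of subintervals
    return simpson(lambda t: C * math.exp(-C * t) * g(t), 0.0, T, n)

def time_average(g, S, h=0.01):
    """(1 / S) * int_0^S g(t) dt."""
    n = 2 * math.ceil(S / (2 * h))  # even number of subintervals
    return simpson(g, 0.0, S, n) / S

def g(t):
    """Hypothetical running cost along a periodic trajectory."""
    return 1.0 + math.cos(t)

rows = []
for C in (1.0, 0.1, 0.01):
    S = 1.0 / C  # illustrative horizon only
    rows.append((C, discounted_average(g, C), time_average(g, S)))

for C, d_avg, t_avg in rows:
    print(f"C = {C:<5} discounted = {d_avg:.6f}  time average = {t_avg:.6f}")
```

For this particular $g$ the closed forms are $1 + C^{2}/(1 + C^{2})$ and $1 + \sin(S)/S$, so both averages approach the long-run mean $1$ as $C \to 0$ and $S \to \infty$.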

Acknowledgments

The work of the first author was partially funded by the Australian Research Council Discovery-Project Grants DP0664330, DP120100532, and DP130104432 and by the Linkage International Grant LX0560049. The second author was partially supported by project SADCO, FP7-PEOPLE-2010-ITN, No. 264735, and also by the French National Research Agency, ANR-10-BLAN 0112.

References (29)

M. Quincampoix et al., The problem of optimal control with reflection studied through a linear optimization problem stated on occupational measures, Nonlinear Anal. (2010)

H. Frankowska et al., Filippov’s and Filippov–Wazewski’s theorems on closed domains, Journal of Differential Equations (2000)

L. Grüne, On the relation between discounted and average optimal value functions, Journal of Differential Equations (1998)

A.G. Bhatt et al., Occupation measures for controlled Markov processes: characterization and optimality, Annals of Probability (1996)

W.H. Fleming et al., Convex duality approach to the optimal control of diffusions, SIAM Journal on Control and Optimization (1989)

T.G. Kurtz et al., Existence of Markov controls and characterization of optimal Markov controls, SIAM Journal on Control and Optimization (1998)

R.H. Stockbridge, Time-Average Control of a Martingale Problem. Existence of a Stationary Solution, Annals of Probability (1990)

R.H. Stockbridge, Time-average control of a martingale problem: a linear programming formulation, Annals of...

V. Borkar et al., Ergodic control for constrained diffusions: characterization using HJB equations, SIAM Journal on Control and Optimization (2004/05)

R. Buckdahn et al., Stochastic optimal control and linear programming approach, Applied Mathematics and Optimization (2011)

F. Dufour et al., On the existence of strict optimal controls for constrained, controlled Markov processes in continuous-time, Stochastics (2012)

D. Goreac et al., A note on linearization methods and dynamic programming principles for stochastic discontinuous control problems, Electronic Communications in Probability (2012)

D. Hernandez-Hernandez et al., The linear programming approach to deterministic optimal control problems, Applicationes Mathematicae (1996)

O. Hernandez-Lerma et al., Markov Chains and Invariant Probabilities (2003)
