Skip to main content
Top

2020 | Book

Continuous-Time Markov Decision Processes

Borel Space Models and General Control Strategies

insite
SEARCH

About this book

This book offers a systematic and rigorous treatment of continuous-time Markov decision processes, covering both theory and possible applications to queueing systems, epidemiology, finance, and other fields. Unlike most books on the subject, much attention is paid to problems with functional constraints and the realizability of strategies.

Three major methods of investigations are presented, based on dynamic programming, linear programming, and reduction to discrete-time problems. Although the main focus is on models with total (discounted or undiscounted) cost criteria, models with average cost criteria and with impulsive controls are also discussed in depth.

The book is self-contained. A separate chapter is devoted to Markov pure jump processes and the appendices collect the requisite background on real analysis and applied probability. All the statements in the main text are proved in detail.

Researchers and graduate students in applied probability, operational research, statistics and engineering will find this monograph interesting, useful and valuable.

Table of Contents

Frontmatter
Chapter 1. Description of CTMDPs and Preliminaries
Abstract
In this chapter, we provide a rigorous mathematical description of continuous-time Markov decision processes (CTMDPs), the total cost model, and present several practical examples of CTMDPs. We also establish several preliminary properties of the total cost model to serve our future investigations. In comparison with the previous literature, we introduce and consider wider classes of control strategies and put a special emphasis on those that are realizable, see more discussions on this in Sect. 1.4. A further discussion of possible control strategies is in Chap. 6, Sect. 6.​1.​1.
Alexey Piunovskiy, Yi Zhang
Chapter 2. Selected Properties of Controlled Processes
Abstract
The main purpose of this chapter is to present some properties of the controlled processes that are closely related to the optimal control problems considered in this book. In Sect. 2.1, we rigorously show that under a natural Markov strategy \(\breve{\pi }\), the controlled process \(X(\cdot )\) is a Markov pure jump process, with the explicit construction of its transition function being presented.
Alexey Piunovskiy, Yi Zhang
Chapter 3. The Discounted Cost Model
Abstract
In this chapter, let \(\alpha >0\) be a fixed discount factor. We shall consider the \(\alpha \)-discounted CTMDP problems (1.​15) and (1.​16). Without loss of generality we can restrict to (relaxed) \(\pi \)-strategies because (Markov) \(\pi \)-strategies are sufficient for \(\alpha \)-discounted problems; see Remark 1.​3.​1. (In fact, in this chapter, the class of natural Markov strategies is also sufficient, according to (2.​92). However, we choose to stay with the more general class of \(\pi \)-strategies. The same is true for some of the subsequent chapters.) Although most \(\pi \)-strategies are not realizable, under appropriate conditions, the solution to the unconstrained problem is given by a realizable deterministic stationary strategy (see Theorem 3.1.2). Moreover, as was explained in Sect. 1.​3.​5, for every \(\pi \)-strategy there exists an equivalent standard \(\xi \)-strategy which is realizable.
Alexey Piunovskiy, Yi Zhang
Chapter 4. Reduction to DTMDP: The Total Cost Model
Abstract
In this chapter, we show that solving an (unconstrained or constrained) optimal control problem for the model with total expected cost is equivalent to solving the control problem for the corresponding Discrete-Time Markov Decision Process (DTMDP). After that, we use the known facts for DTMDP models to obtain the corresponding results for CTMDP models. The necessary information about DTMDPs is in Appendix C. We do not assume that the controlled process \(X(\cdot )\) is non-explosive.
Alexey Piunovskiy, Yi Zhang
Chapter 5. The Average Cost Model
Abstract
In this chapter, restricted to the class of \(\pi \)-strategies, we consider the following long-run average cost as the performance measure of the system:
$$\begin{aligned} W_j(S,\gamma ):= & {} \mathop {\overline{\lim }}_{T\rightarrow \infty }\frac{1}{T}\mathrm{E}_\gamma ^S\left[ \int _{(0,T]} \int _\mathbf{A } c_j(X(t),a)\Pi (da|t)dt \right] \nonumber \\= & {} \mathop {\overline{\lim }}_{T\rightarrow \infty }\frac{1}{T}\mathrm{E}_\gamma ^S\left[ \int _{(0,\min \{T,~T_\infty \}]} \int _\mathbf{A } c_j(X(t),a)\Pi (da|t)dt \right] \end{aligned}$$
for each initial distribution \(\gamma \), \(j=0,1,\dots ,J\) and \(\pi \)-strategy \(S\in \mathbf{S} _\pi \). Recall that the convention of \(c_j(x_\infty ,a):=0\) for each \(a\in \mathbf{A} \) is in use.
Alexey Piunovskiy, Yi Zhang
Chapter 6. The Total Cost Model: General Case
Abstract
In this chapter, we continue to study the CTMDP model introduced in Sect. 1.​1. One of the aims is to show that, after the essential extension of the notion of a strategy, the classes of Markov \(\pi \)-strategies and Markov Poisson-related strategies are still sufficient for solving constrained and unconstrained optimal control problems with total expected costs. In Sect. 6.4, we investigate the important strategies called mixtures. The realizability of different classes of strategies is discussed in Sect. 6.6.
Alexey Piunovskiy, Yi Zhang
Chapter 7. Gradual-Impulsive Control Models
Abstract
The CTMDPs discussed in the previous chapters only allow the decision-maker to control the local characteristics of the process. This chapter considers a more general situation, where the decision-maker can control both the local characteristics and the trajectories of the process.
Alexey Piunovskiy, Yi Zhang
Backmatter
Metadata
Title
Continuous-Time Markov Decision Processes
Authors
Dr. Alexey Piunovskiy
Prof. Yi Zhang
Copyright Year
2020
Electronic ISBN
978-3-030-54987-9
Print ISBN
978-3-030-54986-2
DOI
https://doi.org/10.1007/978-3-030-54987-9