
Markov decision processes with a stopping time constraint

Abstract.

In this paper, the optimization problem for a stopped Markov decision process with finite states and actions is considered over stopping times τ constrained so that Eτ ≦ α for some fixed α > 0. The problem is solved through randomization of stopping times and a mathematical programming formulation in terms of occupation measures. Another representation of randomized stopping times, called the F-representation, is given, through which the concept of Markov or stationary randomized stopping times is introduced. We treat two types of occupation measures, running and stopped, and show that the stopped occupation measure can be expressed in terms of the running one. We study the properties of the sets of running occupation measures achieved by different classes of pairs of policies and randomized stopping times. By analyzing the equivalent mathematical programming problem formulated via the running occupation measures corresponding to stationary policies and stationary randomized stopping times, we prove the existence of an optimal constrained pair of stationary policy and stopping time that requires randomization in at most one state.
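
To make the occupation-measure formulation concrete, the following is a minimal sketch (in Python, using numpy and scipy's linprog) of a linear program of the kind the abstract alludes to: variables x(s, a) for the running occupation measure and nu(s) for the stopped one, balance equations linking them to the transition law, the constraint E[tau] ≦ alpha expressed as a bound on the total running mass, and running plus terminal rewards as the objective. The two-state model, the reward and transition numbers, and the exact constraint layout are illustrative assumptions, not taken from the paper.

# Minimal occupation-measure LP sketch for a stopped MDP with E[tau] <= alpha.
# All model data below are hypothetical; the paper's exact formulation may differ.
import numpy as np
from scipy.optimize import linprog

S, A = 2, 2                              # toy model: 2 states, 2 actions
P = np.zeros((S, A, S))                  # P[s, a, s'] = transition probability
P[0, 0] = [0.9, 0.1]
P[0, 1] = [0.2, 0.8]
P[1, 0] = [0.5, 0.5]
P[1, 1] = [0.1, 0.9]
r = np.array([[1.0, 0.2],                # running reward r(s, a), earned before stopping
              [0.0, 0.5]])
g = np.array([0.0, 3.0])                 # terminal reward g(s), earned at the stopping time
mu0 = np.array([1.0, 0.0])               # initial state distribution
alpha = 4.0                              # bound on the expected stopping time E[tau]

# Variables: x(s, a) = running occupation measure (S*A entries),
# followed by nu(s) = stopped occupation measure (S entries).
n = S * A + S
c = np.concatenate([-r.flatten(), -g])   # linprog minimizes, so negate the rewards

# Balance equations, one per state s:
#   sum_a x(s, a) + nu(s) = mu0(s) + sum_{s', a'} P(s', a', s) * x(s', a')
# In particular nu is determined by x, matching the remark in the abstract
# that the stopped occupation measure is expressible through the running one.
A_eq = np.zeros((S, n))
for s in range(S):
    for a in range(A):
        col = s * A + a
        A_eq[s, col] += 1.0                       # mass leaving state s under action a
        A_eq[:, col] -= P[s, a, :]                # mass arriving in each successor state
    A_eq[s, S * A + s] += 1.0                     # mass absorbed by stopping in state s
b_eq = mu0

# Stopping-time constraint: E[tau] = sum_{s, a} x(s, a) <= alpha.
A_ub = np.zeros((1, n))
A_ub[0, :S * A] = 1.0
b_ub = [alpha]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * n, method="highs")
x = res.x[:S * A].reshape(S, A)
nu = res.x[S * A:]
print("running occupation measure x(s, a):\n", x)
print("stopped occupation measure nu(s):", nu)
print("optimal constrained value:", -res.fun)

Under these assumptions, a basic optimal solution of such an LP can be decomposed into a stationary policy together with a stationary randomized stopping time (for instance, stop in state s with probability nu(s) / (nu(s) + sum_a x(s, a)) whenever that denominator is positive), which is in the spirit of the abstract's result that randomization is needed in at most one state.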

Additional information

Manuscript received: September 2000/Final version received: November 2000

Cite this article

Horiguchi, M. Markov decision processes with a stopping time constraint. Mathematical Methods of OR 53, 279–295 (2001). https://doi.org/10.1007/PL00003996
