Abstract
This paper is the first attempt to investigate the risk probability criterion in semi-Markov decision processes with loss rates. The goal is to find an optimal policy that minimizes the risk probability that the total loss incurred during the first passage time to a target set exceeds a given loss level. First, we establish the optimality equation via a successive approximation technique and show that the value function is the unique solution to this equation. Second, we give suitable conditions under which we prove the existence of optimal policies, and we develop an algorithm for computing ε-optimal policies. Finally, we apply our main results to a business system.
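For orientation, the criterion described in the abstract can be written out as a short set of formulas. This is only a sketch: the symbols below ($x_t$, $a_t$, $c$, $B$, $\tau_B$, $\lambda$, $F^{\pi}$, $F^{*}$) are assumed for illustration and are not taken from the paper, whose precise definitions and conditions may differ.

```latex
% Illustrative notation only (assumed, not the paper's own symbols):
%   x_t, a_t : state and action at time t
%   c(x, a)  : loss rate,   B : target set,   \lambda : loss level

% First passage time to the target set B
\tau_B := \inf\{\, t \ge 0 : x_t \in B \,\}

% Risk probability under policy \pi, starting from state i with loss level \lambda
F^{\pi}(i,\lambda) := \mathbb{P}^{\pi}_{i}\!\left( \int_{0}^{\tau_B} c(x_t, a_t)\,dt > \lambda \right)

% Value function: the minimum risk probability over all policies
F^{*}(i,\lambda) := \inf_{\pi} F^{\pi}(i,\lambda)

% A policy \pi^{\varepsilon} is \varepsilon-optimal if, for all (i,\lambda),
F^{\pi^{\varepsilon}}(i,\lambda) \le F^{*}(i,\lambda) + \varepsilon
```

In this notation, an optimal policy is one that attains $F^{*}$ for every initial state and loss level, and an ε-optimal policy comes within ε of it.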
About this article
Cite this article
Huang, X., Zou, X. & Guo, X. A minimization problem of the risk probability in first passage semi-Markov decision processes with loss rates. Sci. China Math. 58, 1923–1938 (2015). https://doi.org/10.1007/s11425-015-5029-x
DOI: https://doi.org/10.1007/s11425-015-5029-x
Keywords
- semi-Markov decision processes
- loss rate
- risk probability
- first passage time
- optimal policy
- iteration algorithm