2021 | OriginalPaper | Chapter

First Passage Exponential Optimality Problem for Semi-Markov Decision Processes

Authors: Haifeng Huo, Xian Wen

Published in: Modern Trends in Controlled Stochastic Processes:

Publisher: Springer International Publishing

Abstract

This paper deals with the exponential utility maximization problem for semi-Markov decision processes with Borel state and action spaces and nonnegative reward rates. The criterion to be optimized is the expected exponential utility of the total reward accumulated before the system state enters the target set. Under the regularity and compactness-continuity conditions, we establish the corresponding optimality equation and prove the existence of an exponential utility optimal stationary policy by an invariant embedding technique. Moreover, we provide an iterative algorithm for calculating the value function as well as the optimal policies. Finally, we illustrate the computational aspects of an optimal policy with an example.
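
As a rough illustration of what an iterative scheme for a first passage exponential utility criterion can look like, the sketch below runs value iteration on a small finite model. It is not the authors' algorithm, which is not given in the abstract. It assumes finite state and action sets, exponentially distributed holding times, the utility U(x) = -exp(-gamma * x) with gamma > 0, and a hypothetical two-state example; the function name and all numbers are illustrative.

```python
import numpy as np

def first_passage_value_iteration(P, rate, reward, target, gamma=1.0,
                                  tol=1e-10, max_iter=10_000):
    """Minimize E[exp(-gamma * total reward before hitting the target set)].

    Assumed toy setting (not from the chapter):
      P[a, i, j]   transition probability from state i to j under action a
      rate[i, a]   rate of the exponential holding time in state i under a
      reward[i, a] nonnegative reward rate earned while holding in state i
      target       iterable of target states (value fixed to 1 there)
    Maximizing E[-exp(-gamma * R)] is equivalent to this minimization.
    """
    n_states = P.shape[1]
    # For T ~ Exp(rate): E[exp(-gamma * r * T)] = rate / (rate + gamma * r)
    discount = rate / (rate + gamma * reward)
    is_target = np.zeros(n_states, dtype=bool)
    is_target[list(target)] = True

    V = np.ones(n_states)  # start from 1 (no reward collected yet)
    for _ in range(max_iter):
        # Q[i, a] = discount[i, a] * sum_j P[a, i, j] * V[j]
        Q = discount * np.einsum("aij,j->ia", P, V)
        V_new = np.where(is_target, 1.0, Q.min(axis=1))
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    # Greedy stationary policy with respect to the final iterate
    Q = discount * np.einsum("aij,j->ia", P, V)
    return V, Q.argmin(axis=1)

# Hypothetical example: state 0 is controlled, state 1 is the target.
P = np.array([
    [[0.5, 0.5], [0.0, 1.0]],  # action 0: stay in 0 with probability 0.5
    [[0.0, 1.0], [0.0, 1.0]],  # action 1: jump straight to the target
])
rate = np.array([[2.0, 1.0], [1.0, 1.0]])
reward = np.array([[1.0, 0.5], [0.0, 0.0]])

V, policy = first_passage_value_iteration(P, rate, reward, target=[1])
```

In this toy model the expected exponential utility from state i is -V[i], so a smaller V[i] is better; actions returned for target states carry no meaning because the process has already stopped there.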

Metadata
Title
First Passage Exponential Optimality Problem for Semi-Markov Decision Processes
Authors
Haifeng Huo
Xian Wen
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-76928-4_2