2021 | OriginalPaper | Book chapter

First Passage Exponential Optimality Problem for Semi-Markov Decision Processes

Authors: Haifeng Huo, Xian Wen

Published in: Modern Trends in Controlled Stochastic Processes

Publisher: Springer International Publishing

Abstract

This paper deals with the exponential utility maximization problem for semi-Markov decision processes with Borel state and action spaces and nonnegative reward rates. The criterion to be optimized is the expected exponential utility of the total reward accumulated before the system state first enters the target set. Under regularity and compactness-continuity conditions, we establish the corresponding optimality equation and prove the existence of an exponential utility optimal stationary policy by an invariant embedding technique. Moreover, we provide an iterative algorithm for computing the value function and the optimal policies. Finally, we illustrate the computation of an optimal policy with an example.
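To make the iterative scheme mentioned in the abstract more concrete, here is a minimal sketch in Python. It is not the authors' algorithm, which the abstract does not spell out. It rests on assumptions that go beyond the abstract: finite state and action sets instead of Borel spaces, exponentially distributed sojourn times that are independent of the next state, and the utility U(w) = -exp(-gamma * w) with gamma > 0, so that maximizing expected utility amounts to minimizing h(i) = E[exp(-gamma * W)], where W is the total reward earned before the target set is reached. Under these assumptions h equals 1 on the target set and otherwise satisfies h(i) = min over a of lam(i,a) / (lam(i,a) + gamma * r(i,a)) * sum_j p(j | i, a) h(j). The function name and all parameter names are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def first_passage_value_iteration(
    rates: Sequence[Sequence[float]],            # rates[i][a]: exponential sojourn rate (assumption)
    rewards: Sequence[Sequence[float]],          # rewards[i][a]: nonnegative reward rate
    trans: Sequence[Sequence[Sequence[float]]],  # trans[i][a][j]: probability of jumping to j
    target: Set[int],                            # states where accumulation stops
    gamma: float = 1.0,                          # risk-sensitivity parameter (assumption)
    tol: float = 1e-10,
    max_iter: int = 10_000,
) -> Tuple[List[float], List[int]]:
    """Iterate h <- T h, starting from h = 1, for the assumed finite model."""
    n = len(rates)
    h = [1.0] * n          # exp(-gamma * 0) = 1: no reward has been earned yet
    policy = [0] * n
    for _ in range(max_iter):
        new_h = h[:]
        for i in range(n):
            if i in target:
                continue   # h stays 1 on the target set
            best_val, best_a = None, 0
            for a in range(len(rates[i])):
                lam, r = rates[i][a], rewards[i][a]
                # E[exp(-gamma * r * theta)] for theta ~ Exp(lam)
                factor = lam / (lam + gamma * r)
                val = factor * sum(p * h[j] for j, p in enumerate(trans[i][a]))
                if best_val is None or val < best_val:
                    best_val, best_a = val, a
            new_h[i], policy[i] = best_val, best_a
        gap = max(abs(x - y) for x, y in zip(new_h, h))
        h = new_h
        if gap < tol:
            break
    return h, policy


# Toy model: states 0 and 1 are transient, state 2 is the target.
rates = [[1.0, 1.0], [2.0, 1.0], [1.0, 1.0]]
rewards = [[1.0, 1.0], [2.0, 0.0], [0.0, 0.0]]
trans = [
    [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]],  # from 0: go straight to target, or via state 1
    [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],  # from 1: both actions lead to the target
    [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],  # target rows are never used
]
h, policy = first_passage_value_iteration(rates, rewards, trans, target={2})
print(h, policy)
```

In the toy model, routing state 0 through state 1 collects reward over two sojourns instead of one, so the iteration settles on action 1 in state 0. The entry of policy for a target state is a placeholder and carries no meaning.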

Metadata
Title
First Passage Exponential Optimality Problem for Semi-Markov Decision Processes
Authors
Haifeng Huo
Xian Wen
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-76928-4_2