2017 | OriginalPaper | Book chapter

Analysis of Markov Decision Processes Under Parameter Uncertainty

Authors: Peter Buchholz, Iryna Dohndorf, Dimitri Scheftelowitsch

Published in: Computer Performance Engineering

Publisher: Springer International Publishing

Abstract

Markov Decision Processes (MDPs) are a popular decision model for stochastic systems. Introducing uncertainty in the transition probability distribution by giving upper and lower bounds for the transition probabilities yields the model of Bounded Parameter MDPs (BMDPs) which captures many practical situations with limited knowledge about a system or its environment. In this paper the class of BMDPs is extended to Bounded Parameter Semi Markov Decision Processes (BSMDPs). The main focus of the paper is on the introduction and numerical comparison of different algorithms to compute optimal policies for BMDPs and BSMDPs; specifically, we introduce and compare variants of value and policy iteration.
The paper delivers an empirical comparison between different numerical algorithms for BMDPs and BSMDPs, with an emphasis on the required solution time.
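The abstract does not spell out the algorithms, so the following is only a minimal sketch of one textbook variant of value iteration for a BMDP: the pessimistic (worst-case) version for discounted rewards. All names (`worst_case_distribution`, `robust_value_iteration`, `bounds`, `rewards`, `gamma`, `eps`) and the discounted criterion itself are assumptions made for illustration, not the authors' implementation. For each state-action pair the inner step chooses, within the given lower and upper bounds, the transition distribution that minimises the expected value of the successor state.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower, upper) bound on one transition probability


def worst_case_distribution(bounds: List[Interval], values: List[float]) -> List[float]:
    """Return the distribution within the bounds that minimises the expected value."""
    probs = [lo for lo, _ in bounds]   # every successor gets at least its lower bound
    slack = 1.0 - sum(probs)           # probability mass that is still unassigned
    # Give the remaining mass to the lowest-valued successors first.
    for j in sorted(range(len(values)), key=lambda k: values[k]):
        extra = min(bounds[j][1] - probs[j], slack)
        probs[j] += extra
        slack -= extra
    return probs


def robust_value_iteration(bounds, rewards, gamma=0.9, eps=1e-8, max_iter=10_000):
    """Pessimistic value iteration (hypothetical helper).

    bounds[s][a] is a list of intervals over successor states,
    rewards[s][a] is the immediate reward of action a in state s.
    """
    n = len(bounds)
    values = [0.0] * n
    for _ in range(max_iter):
        new_values = []
        for s in range(n):
            best = float("-inf")
            for a in range(len(bounds[s])):
                p = worst_case_distribution(bounds[s][a], values)
                q = rewards[s][a] + gamma * sum(pj * vj for pj, vj in zip(p, values))
                best = max(best, q)
            new_values.append(best)
        diff = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if diff < eps:  # stop once successive iterates are close
            break
    return values
```

Sorting the successors by their current value and filling the slack greedily solves the inner minimisation exactly for interval bounds, which is why no linear program is needed in this sketch. An optimistic variant would sort in descending order instead.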

Footnotes
1
In this and the subsequent equations, we consider continuous random variables for the sojourn times in the states, for which the integrals are well-defined. For discrete random variables, the integrals have to be replaced by sums and the densities by probabilities.
 
2
\(\epsilon \)-optimality means that the optimal value is reached up to \(\epsilon \).
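One common way to write this down, assuming the maximum norm over states and using \(v^{*}\) for the optimal value vector and \(v^{\pi}\) for the value vector of the computed policy \(\pi\) (these symbols are assumptions and do not appear in the footnote):

\[ \Vert v^{*} - v^{\pi} \Vert_{\infty} \le \epsilon \]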
 
Metadata
Title
Analysis of Markov Decision Processes Under Parameter Uncertainty
Authors
Peter Buchholz
Iryna Dohndorf
Dimitri Scheftelowitsch
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-66583-2_1
