nach oben

Erschienen in:

2017 | OriginalPaper | Buchkapitel

Maximizing the Conditional Expected Reward for Reaching the Goal

verfasst von : Christel Baier, Joachim Klein, Sascha Klüppelholz, Sascha Wunderlich

Erschienen in: Tools and Algorithms for the Construction and Analysis of Systems

Verlag: Springer Berlin Heidelberg

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

The paper addresses the problem of computing maximal conditional expected accumulated rewards until reaching a target state (briefly called maximal conditional expectations) in finite-state Markov decision processes where the condition is given as a reachability constraint. Conditional expectations of this type can, e.g., stand for the maximal expected termination time of probabilistic programs with non-determinism, under the condition that the program eventually terminates, or for the worst-case expected penalty to be paid, assuming that at least three deadlines are missed. The main results of the paper are (i) a polynomial-time algorithm to check the finiteness of maximal conditional expectations, (ii) PSPACE-completeness for the threshold problem in acyclic Markov decision processes where the task is to check whether the maximal conditional expectation exceeds a given threshold, (iii) a pseudo-polynomial-time algorithm for the threshold problem in the general (cyclic) case, and (iv) an exponential-time algorithm for computing the maximal conditional expectation and an optimal scheduler.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Vorheriges Kapitel The Automatic Detection of Token Structures and Invariants Using SAT Checking

Nächstes Kapitel ARES: Adaptive Receding-Horizon Synthesis of Optimal Plans

The logarithmic length of an integer n is the number of bits required for a representation of n as a binary number. The logarithmic length of a rational number a / b is defined as the sum of the logarithmic lengths of its numerator a and its denominator b, assuming that a and b are coprime integers and b is positive.

The latter means a \(\mathfrak {U}\)-path \(\pi = s_0 \, \alpha _0 \, s_1 \, \alpha _1 \ldots \alpha _{k-1} \, s_k\) where \(s_0=\hat{s}_{ \scriptscriptstyle init }\) and \(s_i=s_k\) for some \(i\in \{0,1,\ldots ,k{-}1\}\) such that \(\hat{ rew }(s_j,\alpha _j)>0\) for some \(j\in \{i,\ldots ,k{-}1\}\).

The threshold algorithm solves all four variants of the threshold problem. E.g., \(\mathbb {CE}^{\max } \leqslant \vartheta \) iff \(\mathbb {CE}^{\mathfrak {S}}=\vartheta \), while \(\mathbb {CE}^{\max }<\vartheta \) iff the threshold algorithm returns “no”.

As the decisions of the already treated levels are optimal, the values \(y_{s,R}\) and \(\theta _{s,R}\) for \(R \in \{r{+}1,\ldots ,\wp \}\) can be reused in the calls of the threshold algorithms. That is, the calls of the threshold algorithm that are invoked in the scheduler-improvement steps at level r can skip levels \(\wp ,\wp {-}1,\ldots ,r{+}1\) and only need to process levels \(r,r{-}1,\ldots ,1,0\).

Abdulla, P.A., Henda, N.B., Mayr, R.: Decisive Markov chains. Logical Methods Comput. Sci. 3(4) (2007)

Acerbi, C., Tasche, D.: Expected shortfall: a natural coherent alternative to value at risk. Econ. notes 31(2), 379–388 (2002)CrossRef

Alvim, M.S., Andrés, M.E., Chatzikokolakis, K., Degano, P., Palamidessi, C.: On the information leakage of differentially-private mechanisms. J. Comput. Secur. 23(4), 427–469 (2015)CrossRef

Alvim, M.S., Chatzikokolakis, K., McIver, A., Morgan, C., Palamidessi, C., Smith, G.: Axioms for information leakage. In: Proceedings of Computer Security Foundations Symposium (CSF), pp. 77–92. IEEE Computer Society (2016)

Alvim, M.S., Chatzikokolakis, K., Palamidessi, C., Smith, G.: Measuring information leakage using generalized gain functions. In: Proceedings of Computer Security Foundations Symposium (CSF), pp. 265–279. IEEE Computer Society (2012)

Andrés, M.E.: Quantitative Analysis of Information Leakage in Probabilistic and Nondeterministic Systems. Ph.D. thesis, UB Nijmegen (2011)

Andrés, M.E., Palamidessi, C., van Rossum, P., Sokolova, A.: Information hiding in probabilistic concurrent systems. Theoret. Comput. Sci. 412(28), 3072–3089 (2011)MathSciNetCrossRefMATH

Andrés, M.E., van Rossum, P.: Conditional probabilities over probabilistic and nondeterministic systems. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 157–172. Springer, Heidelberg (2008). doi:10.1007/978-3-540-78800-3_12 CrossRef

Baier, C., Dubslaff, C., Klein, J., Klüppelholz, S., Wunderlich, S.: Probabilistic model checking for energy-utility analysis. In: Breugel, F., Kashefi, E., Palamidessi, C., Rutten, J. (eds.) Horizons of the Mind. A Tribute to Prakash Panangaden. LNCS, vol. 8464, pp. 96–123. Springer, Heidelberg (2014). doi:10.1007/978-3-319-06880-0_5 CrossRef

10.

Baier, C., Katoen, J.-P.: Principles of Model Checking. MIT Press, Cambridge (2008)MATH

11.

Baier, C., Klein, J., Klüppelholz, S., Märcker, S.: Computing conditional probabilities in Markovian models efficiently. In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 515–530. Springer, Heidelberg (2014). doi:10.1007/978-3-642-54862-8_43 CrossRef

12.

Baier, C., Klein, J., Klüppelholz, S., Wunderlich, S.: Weight monitoring with linear temporal logic: complexity and decidability. In: Proceedings of Computer Science Logic/Logic in Computer Science (CSL-LICS), pp. 11:1–11:10. ACM (2014)

13.

Baier, C., Klein, J., Klüppelholz, S. Wunderlich, S.: Maximizing the conditional expected reward for reaching the goal (extended version). arXiv:1701.05389 (2017)

14.

Barthe, G., Espitau, T., Ferrer Fioriti, L.M., Hsu, J.: Synthesizing probabilistic invariants via Doob’s decomposition. In: Chaudhuri, S., Farzan, A. (eds.) CAV 2016. LNCS, vol. 9779, pp. 43–61. Springer, Heidelberg (2016). doi:10.1007/978-3-319-41528-4_3

15.

Bertsekas, D.P., Tsitsiklis, J.N.: An analysis of stochastic shortest path problems. Math. Oper. Res. 16(3), 580–595 (1991)MathSciNetCrossRefMATH

16.

Bertsekas, D.P., Yu, H.: Stochastic path problems under weak conditions. Technical report, M.I.T. Cambridge, Report LIDS 2909 (2016)

17.

Boker, U., Chatterjee, K., Henzinger, T.A., Kupferman, O.: Temporal specifications with accumulative values. In: Proceedings of Logic in Computer Science (LICS), pp. 43–52. IEEE Computer Society (2011)

18.

Brázdil, T., Brozek, V., Chatterjee, K., Forejt, V., Kucera, A.: Two views on multiple mean-payoff objectives in Markov decision processes. Logical Methods Comput. Sci. 10(1) (2014)

19.

Brázdil, T., Kučera, A.: Computing the expected accumulated reward and gain for a subclass of infinite Markov Chains. In: Sarukkai, S., Sen, S. (eds.) FSTTCS 2005. LNCS, vol. 3821, pp. 372–383. Springer, Heidelberg (2005). doi:10.1007/11590156_30 CrossRef

20.

Chatterjee, K., Fu, H., Goharshady, A.K.: Termination analysis of probabilistic programs through Positivstellensatz’s. In: Chaudhuri, S., Farzan, A. (eds.) CAV 2016. LNCS, vol. 9779, pp. 3–22. Springer, Heidelberg (2016). doi:10.1007/978-3-319-41528-4_1

21.

Chatzikokolakis, K., Palamidessi, C., Braun, C.: Compositional methods for information-hiding. Math. Struct. Comput. Sci. 26(6), 908–932 (2016)MathSciNetCrossRefMATH

22.

Alfaro, L.: Computing minimum and maximum reachability times in probabilistic systems. In: Baeten, J.C.M., Mauw, S. (eds.) CONCUR 1999. LNCS, vol. 1664, pp. 66–81. Springer, Heidelberg (1999). doi:10.1007/3-540-48320-9_7 CrossRef

23.

Gretz, F., Katoen, J., McIver, A.: Operational versus weakest pre-expectation semantics for the probabilistic guarded command language. Perform. Eval. 73, 110–132 (2014)CrossRef

24.

Jansen, N., Kaminski, B.L., Katoen, J., Olmedo, F., Gretz, F., McIver, A.: Conditioning in probabilistic programming. In: Proceedings of Mathematical Foundations of Programming Semantics (MFPS), Electronic Notes Theoretical Computer Science, vol. 319, pp. 199–216 (2015)

25.

Kallenberg, L.: Markov Decision Processes. Lecture Notes. University of Leiden, Leiden (2011)

26.

Katoen, J.-P., Gretz, F., Jansen, N., Kaminski, B.L., Olmedo, F.: Understanding probabilistic programs. In: Meyer, R., Platzer, A., Wehrheim, H. (eds.) Correct System Design. LNCS, vol. 9360, pp. 15–32. Springer, Heidelberg (2015). doi:10.1007/978-3-319-23506-6_4 CrossRef

27.

Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011). doi:10.1007/978-3-642-22110-1_47 CrossRef

28.

PRISM model checker. http://www.prismmodelchecker.org/

29.

Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York (1994)CrossRefMATH

30.

Randour, M., Raskin, J.-F., Sankur, O.: Variations on the stochastic shortest path problem. In: D’Souza, D., Lal, A., Larsen, K.G. (eds.) VMCAI 2015. LNCS, vol. 8931, pp. 1–18. Springer, Heidelberg (2015). doi:10.1007/978-3-662-46081-8_1

31.

Seber, G., Lee, A.: Linear Regression Analysis. Wiley Series in Probability and Statistics. Wiley, New York (2003)CrossRefMATH

32.

Uryasev, S.: Conditional value-at-risk: optimization algorithms and applications. In Proceedings of Computational Intelligence and Financial Engineering (CIFEr), pp. 49–57. IEEE (2000)

Titel: Maximizing the Conditional Expected Reward for Reaching the Goal
verfasst von: Christel Baier
Joachim Klein
Sascha Klüppelholz
Sascha Wunderlich
Verlag: Springer Berlin Heidelberg
Buch: Tools and Algorithms for the Construction and Analysis of Systems
Print ISBN: 978-3-662-54579-9

Electronic ISBN: 978-3-662-54580-5

Copyright-Jahr: 2017
DOI: https://doi.org/10.1007/978-3-662-54580-5_16

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"