Skip to main content

2017 | OriginalPaper | Buchkapitel

Maximizing the Conditional Expected Reward for Reaching the Goal

verfasst von : Christel Baier, Joachim Klein, Sascha Klüppelholz, Sascha Wunderlich

Erschienen in: Tools and Algorithms for the Construction and Analysis of Systems

Verlag: Springer Berlin Heidelberg

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The paper addresses the problem of computing maximal conditional expected accumulated rewards until reaching a target state (briefly called maximal conditional expectations) in finite-state Markov decision processes where the condition is given as a reachability constraint. Conditional expectations of this type can, e.g., stand for the maximal expected termination time of probabilistic programs with non-determinism, under the condition that the program eventually terminates, or for the worst-case expected penalty to be paid, assuming that at least three deadlines are missed. The main results of the paper are (i) a polynomial-time algorithm to check the finiteness of maximal conditional expectations, (ii) PSPACE-completeness for the threshold problem in acyclic Markov decision processes where the task is to check whether the maximal conditional expectation exceeds a given threshold, (iii) a pseudo-polynomial-time algorithm for the threshold problem in the general (cyclic) case, and (iv) an exponential-time algorithm for computing the maximal conditional expectation and an optimal scheduler.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
The logarithmic length of an integer n is the number of bits required for a representation of n as a binary number. The logarithmic length of a rational number a / b is defined as the sum of the logarithmic lengths of its numerator a and its denominator b, assuming that a and b are coprime integers and b is positive.
 
2
The latter means a \(\mathfrak {U}\)-path \(\pi = s_0 \, \alpha _0 \, s_1 \, \alpha _1 \ldots \alpha _{k-1} \, s_k\) where \(s_0=\hat{s}_{ \scriptscriptstyle init }\) and \(s_i=s_k\) for some \(i\in \{0,1,\ldots ,k{-}1\}\) such that \(\hat{ rew }(s_j,\alpha _j)>0\) for some \(j\in \{i,\ldots ,k{-}1\}\).
 
3
The threshold algorithm solves all four variants of the threshold problem. E.g., \(\mathbb {CE}^{\max } \leqslant \vartheta \) iff \(\mathbb {CE}^{\mathfrak {S}}=\vartheta \), while \(\mathbb {CE}^{\max }<\vartheta \) iff the threshold algorithm returns “no”.
 
4
As the decisions of the already treated levels are optimal, the values \(y_{s,R}\) and \(\theta _{s,R}\) for \(R \in \{r{+}1,\ldots ,\wp \}\) can be reused in the calls of the threshold algorithms. That is, the calls of the threshold algorithm that are invoked in the scheduler-improvement steps at level r can skip levels \(\wp ,\wp {-}1,\ldots ,r{+}1\) and only need to process levels \(r,r{-}1,\ldots ,1,0\).
 
Literatur
1.
Zurück zum Zitat Abdulla, P.A., Henda, N.B., Mayr, R.: Decisive Markov chains. Logical Methods Comput. Sci. 3(4) (2007) Abdulla, P.A., Henda, N.B., Mayr, R.: Decisive Markov chains. Logical Methods Comput. Sci. 3(4) (2007)
2.
Zurück zum Zitat Acerbi, C., Tasche, D.: Expected shortfall: a natural coherent alternative to value at risk. Econ. notes 31(2), 379–388 (2002)CrossRef Acerbi, C., Tasche, D.: Expected shortfall: a natural coherent alternative to value at risk. Econ. notes 31(2), 379–388 (2002)CrossRef
3.
Zurück zum Zitat Alvim, M.S., Andrés, M.E., Chatzikokolakis, K., Degano, P., Palamidessi, C.: On the information leakage of differentially-private mechanisms. J. Comput. Secur. 23(4), 427–469 (2015)CrossRef Alvim, M.S., Andrés, M.E., Chatzikokolakis, K., Degano, P., Palamidessi, C.: On the information leakage of differentially-private mechanisms. J. Comput. Secur. 23(4), 427–469 (2015)CrossRef
4.
Zurück zum Zitat Alvim, M.S., Chatzikokolakis, K., McIver, A., Morgan, C., Palamidessi, C., Smith, G.: Axioms for information leakage. In: Proceedings of Computer Security Foundations Symposium (CSF), pp. 77–92. IEEE Computer Society (2016) Alvim, M.S., Chatzikokolakis, K., McIver, A., Morgan, C., Palamidessi, C., Smith, G.: Axioms for information leakage. In: Proceedings of Computer Security Foundations Symposium (CSF), pp. 77–92. IEEE Computer Society (2016)
5.
Zurück zum Zitat Alvim, M.S., Chatzikokolakis, K., Palamidessi, C., Smith, G.: Measuring information leakage using generalized gain functions. In: Proceedings of Computer Security Foundations Symposium (CSF), pp. 265–279. IEEE Computer Society (2012) Alvim, M.S., Chatzikokolakis, K., Palamidessi, C., Smith, G.: Measuring information leakage using generalized gain functions. In: Proceedings of Computer Security Foundations Symposium (CSF), pp. 265–279. IEEE Computer Society (2012)
6.
Zurück zum Zitat Andrés, M.E.: Quantitative Analysis of Information Leakage in Probabilistic and Nondeterministic Systems. Ph.D. thesis, UB Nijmegen (2011) Andrés, M.E.: Quantitative Analysis of Information Leakage in Probabilistic and Nondeterministic Systems. Ph.D. thesis, UB Nijmegen (2011)
7.
Zurück zum Zitat Andrés, M.E., Palamidessi, C., van Rossum, P., Sokolova, A.: Information hiding in probabilistic concurrent systems. Theoret. Comput. Sci. 412(28), 3072–3089 (2011)MathSciNetCrossRefMATH Andrés, M.E., Palamidessi, C., van Rossum, P., Sokolova, A.: Information hiding in probabilistic concurrent systems. Theoret. Comput. Sci. 412(28), 3072–3089 (2011)MathSciNetCrossRefMATH
8.
Zurück zum Zitat Andrés, M.E., van Rossum, P.: Conditional probabilities over probabilistic and nondeterministic systems. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 157–172. Springer, Heidelberg (2008). doi:10.1007/978-3-540-78800-3_12 CrossRef Andrés, M.E., van Rossum, P.: Conditional probabilities over probabilistic and nondeterministic systems. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 157–172. Springer, Heidelberg (2008). doi:10.​1007/​978-3-540-78800-3_​12 CrossRef
9.
Zurück zum Zitat Baier, C., Dubslaff, C., Klein, J., Klüppelholz, S., Wunderlich, S.: Probabilistic model checking for energy-utility analysis. In: Breugel, F., Kashefi, E., Palamidessi, C., Rutten, J. (eds.) Horizons of the Mind. A Tribute to Prakash Panangaden. LNCS, vol. 8464, pp. 96–123. Springer, Heidelberg (2014). doi:10.1007/978-3-319-06880-0_5 CrossRef Baier, C., Dubslaff, C., Klein, J., Klüppelholz, S., Wunderlich, S.: Probabilistic model checking for energy-utility analysis. In: Breugel, F., Kashefi, E., Palamidessi, C., Rutten, J. (eds.) Horizons of the Mind. A Tribute to Prakash Panangaden. LNCS, vol. 8464, pp. 96–123. Springer, Heidelberg (2014). doi:10.​1007/​978-3-319-06880-0_​5 CrossRef
10.
Zurück zum Zitat Baier, C., Katoen, J.-P.: Principles of Model Checking. MIT Press, Cambridge (2008)MATH Baier, C., Katoen, J.-P.: Principles of Model Checking. MIT Press, Cambridge (2008)MATH
11.
Zurück zum Zitat Baier, C., Klein, J., Klüppelholz, S., Märcker, S.: Computing conditional probabilities in Markovian models efficiently. In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 515–530. Springer, Heidelberg (2014). doi:10.1007/978-3-642-54862-8_43 CrossRef Baier, C., Klein, J., Klüppelholz, S., Märcker, S.: Computing conditional probabilities in Markovian models efficiently. In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 515–530. Springer, Heidelberg (2014). doi:10.​1007/​978-3-642-54862-8_​43 CrossRef
12.
Zurück zum Zitat Baier, C., Klein, J., Klüppelholz, S., Wunderlich, S.: Weight monitoring with linear temporal logic: complexity and decidability. In: Proceedings of Computer Science Logic/Logic in Computer Science (CSL-LICS), pp. 11:1–11:10. ACM (2014) Baier, C., Klein, J., Klüppelholz, S., Wunderlich, S.: Weight monitoring with linear temporal logic: complexity and decidability. In: Proceedings of Computer Science Logic/Logic in Computer Science (CSL-LICS), pp. 11:1–11:10. ACM (2014)
13.
Zurück zum Zitat Baier, C., Klein, J., Klüppelholz, S. Wunderlich, S.: Maximizing the conditional expected reward for reaching the goal (extended version). arXiv:1701.05389 (2017) Baier, C., Klein, J., Klüppelholz, S. Wunderlich, S.: Maximizing the conditional expected reward for reaching the goal (extended version). arXiv:​1701.​05389 (2017)
14.
Zurück zum Zitat Barthe, G., Espitau, T., Ferrer Fioriti, L.M., Hsu, J.: Synthesizing probabilistic invariants via Doob’s decomposition. In: Chaudhuri, S., Farzan, A. (eds.) CAV 2016. LNCS, vol. 9779, pp. 43–61. Springer, Heidelberg (2016). doi:10.1007/978-3-319-41528-4_3 Barthe, G., Espitau, T., Ferrer Fioriti, L.M., Hsu, J.: Synthesizing probabilistic invariants via Doob’s decomposition. In: Chaudhuri, S., Farzan, A. (eds.) CAV 2016. LNCS, vol. 9779, pp. 43–61. Springer, Heidelberg (2016). doi:10.​1007/​978-3-319-41528-4_​3
15.
Zurück zum Zitat Bertsekas, D.P., Tsitsiklis, J.N.: An analysis of stochastic shortest path problems. Math. Oper. Res. 16(3), 580–595 (1991)MathSciNetCrossRefMATH Bertsekas, D.P., Tsitsiklis, J.N.: An analysis of stochastic shortest path problems. Math. Oper. Res. 16(3), 580–595 (1991)MathSciNetCrossRefMATH
16.
Zurück zum Zitat Bertsekas, D.P., Yu, H.: Stochastic path problems under weak conditions. Technical report, M.I.T. Cambridge, Report LIDS 2909 (2016) Bertsekas, D.P., Yu, H.: Stochastic path problems under weak conditions. Technical report, M.I.T. Cambridge, Report LIDS 2909 (2016)
17.
Zurück zum Zitat Boker, U., Chatterjee, K., Henzinger, T.A., Kupferman, O.: Temporal specifications with accumulative values. In: Proceedings of Logic in Computer Science (LICS), pp. 43–52. IEEE Computer Society (2011) Boker, U., Chatterjee, K., Henzinger, T.A., Kupferman, O.: Temporal specifications with accumulative values. In: Proceedings of Logic in Computer Science (LICS), pp. 43–52. IEEE Computer Society (2011)
18.
Zurück zum Zitat Brázdil, T., Brozek, V., Chatterjee, K., Forejt, V., Kucera, A.: Two views on multiple mean-payoff objectives in Markov decision processes. Logical Methods Comput. Sci. 10(1) (2014) Brázdil, T., Brozek, V., Chatterjee, K., Forejt, V., Kucera, A.: Two views on multiple mean-payoff objectives in Markov decision processes. Logical Methods Comput. Sci. 10(1) (2014)
19.
Zurück zum Zitat Brázdil, T., Kučera, A.: Computing the expected accumulated reward and gain for a subclass of infinite Markov Chains. In: Sarukkai, S., Sen, S. (eds.) FSTTCS 2005. LNCS, vol. 3821, pp. 372–383. Springer, Heidelberg (2005). doi:10.1007/11590156_30 CrossRef Brázdil, T., Kučera, A.: Computing the expected accumulated reward and gain for a subclass of infinite Markov Chains. In: Sarukkai, S., Sen, S. (eds.) FSTTCS 2005. LNCS, vol. 3821, pp. 372–383. Springer, Heidelberg (2005). doi:10.​1007/​11590156_​30 CrossRef
20.
Zurück zum Zitat Chatterjee, K., Fu, H., Goharshady, A.K.: Termination analysis of probabilistic programs through Positivstellensatz’s. In: Chaudhuri, S., Farzan, A. (eds.) CAV 2016. LNCS, vol. 9779, pp. 3–22. Springer, Heidelberg (2016). doi:10.1007/978-3-319-41528-4_1 Chatterjee, K., Fu, H., Goharshady, A.K.: Termination analysis of probabilistic programs through Positivstellensatz’s. In: Chaudhuri, S., Farzan, A. (eds.) CAV 2016. LNCS, vol. 9779, pp. 3–22. Springer, Heidelberg (2016). doi:10.​1007/​978-3-319-41528-4_​1
21.
Zurück zum Zitat Chatzikokolakis, K., Palamidessi, C., Braun, C.: Compositional methods for information-hiding. Math. Struct. Comput. Sci. 26(6), 908–932 (2016)MathSciNetCrossRefMATH Chatzikokolakis, K., Palamidessi, C., Braun, C.: Compositional methods for information-hiding. Math. Struct. Comput. Sci. 26(6), 908–932 (2016)MathSciNetCrossRefMATH
22.
Zurück zum Zitat Alfaro, L.: Computing minimum and maximum reachability times in probabilistic systems. In: Baeten, J.C.M., Mauw, S. (eds.) CONCUR 1999. LNCS, vol. 1664, pp. 66–81. Springer, Heidelberg (1999). doi:10.1007/3-540-48320-9_7 CrossRef Alfaro, L.: Computing minimum and maximum reachability times in probabilistic systems. In: Baeten, J.C.M., Mauw, S. (eds.) CONCUR 1999. LNCS, vol. 1664, pp. 66–81. Springer, Heidelberg (1999). doi:10.​1007/​3-540-48320-9_​7 CrossRef
23.
Zurück zum Zitat Gretz, F., Katoen, J., McIver, A.: Operational versus weakest pre-expectation semantics for the probabilistic guarded command language. Perform. Eval. 73, 110–132 (2014)CrossRef Gretz, F., Katoen, J., McIver, A.: Operational versus weakest pre-expectation semantics for the probabilistic guarded command language. Perform. Eval. 73, 110–132 (2014)CrossRef
24.
Zurück zum Zitat Jansen, N., Kaminski, B.L., Katoen, J., Olmedo, F., Gretz, F., McIver, A.: Conditioning in probabilistic programming. In: Proceedings of Mathematical Foundations of Programming Semantics (MFPS), Electronic Notes Theoretical Computer Science, vol. 319, pp. 199–216 (2015) Jansen, N., Kaminski, B.L., Katoen, J., Olmedo, F., Gretz, F., McIver, A.: Conditioning in probabilistic programming. In: Proceedings of Mathematical Foundations of Programming Semantics (MFPS), Electronic Notes Theoretical Computer Science, vol. 319, pp. 199–216 (2015)
25.
Zurück zum Zitat Kallenberg, L.: Markov Decision Processes. Lecture Notes. University of Leiden, Leiden (2011) Kallenberg, L.: Markov Decision Processes. Lecture Notes. University of Leiden, Leiden (2011)
26.
Zurück zum Zitat Katoen, J.-P., Gretz, F., Jansen, N., Kaminski, B.L., Olmedo, F.: Understanding probabilistic programs. In: Meyer, R., Platzer, A., Wehrheim, H. (eds.) Correct System Design. LNCS, vol. 9360, pp. 15–32. Springer, Heidelberg (2015). doi:10.1007/978-3-319-23506-6_4 CrossRef Katoen, J.-P., Gretz, F., Jansen, N., Kaminski, B.L., Olmedo, F.: Understanding probabilistic programs. In: Meyer, R., Platzer, A., Wehrheim, H. (eds.) Correct System Design. LNCS, vol. 9360, pp. 15–32. Springer, Heidelberg (2015). doi:10.​1007/​978-3-319-23506-6_​4 CrossRef
27.
Zurück zum Zitat Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011). doi:10.1007/978-3-642-22110-1_47 CrossRef Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011). doi:10.​1007/​978-3-642-22110-1_​47 CrossRef
29.
Zurück zum Zitat Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York (1994)CrossRefMATH Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York (1994)CrossRefMATH
30.
Zurück zum Zitat Randour, M., Raskin, J.-F., Sankur, O.: Variations on the stochastic shortest path problem. In: D’Souza, D., Lal, A., Larsen, K.G. (eds.) VMCAI 2015. LNCS, vol. 8931, pp. 1–18. Springer, Heidelberg (2015). doi:10.1007/978-3-662-46081-8_1 Randour, M., Raskin, J.-F., Sankur, O.: Variations on the stochastic shortest path problem. In: D’Souza, D., Lal, A., Larsen, K.G. (eds.) VMCAI 2015. LNCS, vol. 8931, pp. 1–18. Springer, Heidelberg (2015). doi:10.​1007/​978-3-662-46081-8_​1
31.
Zurück zum Zitat Seber, G., Lee, A.: Linear Regression Analysis. Wiley Series in Probability and Statistics. Wiley, New York (2003)CrossRefMATH Seber, G., Lee, A.: Linear Regression Analysis. Wiley Series in Probability and Statistics. Wiley, New York (2003)CrossRefMATH
32.
Zurück zum Zitat Uryasev, S.: Conditional value-at-risk: optimization algorithms and applications. In Proceedings of Computational Intelligence and Financial Engineering (CIFEr), pp. 49–57. IEEE (2000) Uryasev, S.: Conditional value-at-risk: optimization algorithms and applications. In Proceedings of Computational Intelligence and Financial Engineering (CIFEr), pp. 49–57. IEEE (2000)
Metadaten
Titel
Maximizing the Conditional Expected Reward for Reaching the Goal
verfasst von
Christel Baier
Joachim Klein
Sascha Klüppelholz
Sascha Wunderlich
Copyright-Jahr
2017
Verlag
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-54580-5_16