Skip to main content

2016 | OriginalPaper | Buchkapitel

Optimizing the Expected Mean Payoff in Energy Markov Decision Processes

verfasst von : Tomáš Brázdil, Antonín Kučera, Petr Novotný

Erschienen in: Automated Technology for Verification and Analysis

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Energy Markov Decision Processes (EMDPs) are finite-state Markov decision processes where each transition is assigned an integer counter update and a rational payoff. An EMDP configuration is a pair s(n), where s is a control state and n is the current counter value. The configurations are changed by performing transitions in the standard way. We consider the problem of computing a safe strategy (i.e., a strategy that keeps the counter non-negative) which maximizes the expected mean payoff.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
The payoff may correspond to some independent performance measure, or it can reflect the use of the critical resource represented by the counter.
 
2
Formally, the decision algorithm answers “yes” iff, say, first possibility holds.
 
3
Under a finite description we can imagine a program with unbounded integer variables encoding the strategy’s execution.
 
Literatur
1.
Zurück zum Zitat Abdulla, P.A., Mayr, R., Sangnier, A., Sproston, J.: Solving parity games on integer vectors. In: DArgenio, P.R., Melgratti, H. (eds.) CONCUR 2013. LNCS, vol. 8052, pp. 106–120. Springer, Heidelberg (2013). doi:10.1007/978-3-642-40184-8_9 CrossRef Abdulla, P.A., Mayr, R., Sangnier, A., Sproston, J.: Solving parity games on integer vectors. In: DArgenio, P.R., Melgratti, H. (eds.) CONCUR 2013. LNCS, vol. 8052, pp. 106–120. Springer, Heidelberg (2013). doi:10.​1007/​978-3-642-40184-8_​9 CrossRef
2.
Zurück zum Zitat Abdulla, P.A., Ciobanu, R., Mayr, R., Sangnier, A., Sproston, J.: Qualitative analysis of VASS-induced MDPs. In: Jacobs, B., et al. (eds.) FOSSACS 2016. LNCS, vol. 9634, pp. 319–334. Springer, Heidelberg (2016). doi:10.1007/978-3-662-49630-5_19 CrossRef Abdulla, P.A., Ciobanu, R., Mayr, R., Sangnier, A., Sproston, J.: Qualitative analysis of VASS-induced MDPs. In: Jacobs, B., et al. (eds.) FOSSACS 2016. LNCS, vol. 9634, pp. 319–334. Springer, Heidelberg (2016). doi:10.​1007/​978-3-662-49630-5_​19 CrossRef
3.
Zurück zum Zitat de Alfaro, L.: Formal verification of probabilistic systems. Ph.D. thesis, Stanford University, Stanford, CA, USA (1998) de Alfaro, L.: Formal verification of probabilistic systems. Ph.D. thesis, Stanford University, Stanford, CA, USA (1998)
4.
Zurück zum Zitat Bouyer, P., Fahrenberg, U., Larsen, K.G., Markey, N., Srba, J.: Infinite runs in weighted timed automata with energy constraints. In: Cassez, F., Jard, C. (eds.) FORMATS 2008. LNCS, vol. 5215, pp. 33–47. Springer, Heidelberg (2008). doi:10.1007/978-3-540-85778-5_4 CrossRef Bouyer, P., Fahrenberg, U., Larsen, K.G., Markey, N., Srba, J.: Infinite runs in weighted timed automata with energy constraints. In: Cassez, F., Jard, C. (eds.) FORMATS 2008. LNCS, vol. 5215, pp. 33–47. Springer, Heidelberg (2008). doi:10.​1007/​978-3-540-85778-5_​4 CrossRef
5.
Zurück zum Zitat Bouyer, P., Markey, N., Randour, M., Larsen, K.G., Laursen, S.: Average-energy games. In: Proceedings of GandALF 2015, pp. 1–15 (2015) Bouyer, P., Markey, N., Randour, M., Larsen, K.G., Laursen, S.: Average-energy games. In: Proceedings of GandALF 2015, pp. 1–15 (2015)
6.
Zurück zum Zitat Brázdil, T., Brožek, V., Chatterjee, K., Forejt, V., Kučera, A.: Two views on multiple mean-payoff objectives in Markov decision processes. In: Proceedings of LICS 2011, pp. 33–42 (2011) Brázdil, T., Brožek, V., Chatterjee, K., Forejt, V., Kučera, A.: Two views on multiple mean-payoff objectives in Markov decision processes. In: Proceedings of LICS 2011, pp. 33–42 (2011)
7.
Zurück zum Zitat Brázdil, T., Brozek, V., Etessami, K., Kučera, A., Wojtczak, D.: One-counter Markov decision processes. In: Proceedings of SODA 2010, pp. 863–874. SIAM (2010) Brázdil, T., Brozek, V., Etessami, K., Kučera, A., Wojtczak, D.: One-counter Markov decision processes. In: Proceedings of SODA 2010, pp. 863–874. SIAM (2010)
8.
Zurück zum Zitat Brázdil, T., Kiefer, S., Kučera, A.: Efficient analysis of probabilistic programs with an unbounded counter. J. ACM 61(6), 41:1–41:35 (2014)MathSciNetCrossRefMATH Brázdil, T., Kiefer, S., Kučera, A.: Efficient analysis of probabilistic programs with an unbounded counter. J. ACM 61(6), 41:1–41:35 (2014)MathSciNetCrossRefMATH
9.
Zurück zum Zitat Brázdil, T., Kučera, A., Novotný, P.: Optimizing the Expected Mean Payoff in Energy Markov Decision Processes. CoRR abs/1607.00678 (2016) Brázdil, T., Kučera, A., Novotný, P.: Optimizing the Expected Mean Payoff in Energy Markov Decision Processes. CoRR abs/1607.00678 (2016)
10.
Zurück zum Zitat Brenguier, R., Cassez, F., Raskin, J.F.: Energy and mean-payoff timed games. In: Proceedings of the 17th International Conference on Hybrid Systems: Computation and Control, HSCC 2014, pp. 283–292. ACM, New York (2014) Brenguier, R., Cassez, F., Raskin, J.F.: Energy and mean-payoff timed games. In: Proceedings of the 17th International Conference on Hybrid Systems: Computation and Control, HSCC 2014, pp. 283–292. ACM, New York (2014)
11.
Zurück zum Zitat Brim, L., Chaloupka, J., Doyen, L., Gentilini, R., Raskin, J.: Faster algorithms for mean-payoff games. Formal Methods Syst. Des. 38(2), 97–118 (2011)CrossRefMATH Brim, L., Chaloupka, J., Doyen, L., Gentilini, R., Raskin, J.: Faster algorithms for mean-payoff games. Formal Methods Syst. Des. 38(2), 97–118 (2011)CrossRefMATH
12.
Zurück zum Zitat Bruyère, V., Filiot, E., Randour, M., Raskin, J.F.: Meet your expectations with guarantees: beyond worst-case synthesis in quantitative games. In: Mayr, E.W., Portier, N. (eds.) STACS 2014. Leibniz International Proceedings in Informatics (LIPIcs), vol. 25, pp. 199–213. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2014) Bruyère, V., Filiot, E., Randour, M., Raskin, J.F.: Meet your expectations with guarantees: beyond worst-case synthesis in quantitative games. In: Mayr, E.W., Portier, N. (eds.) STACS 2014. Leibniz International Proceedings in Informatics (LIPIcs), vol. 25, pp. 199–213. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2014)
13.
Zurück zum Zitat Cachera, D., Fahrenberg, U., Legay, A.: An omega-algebra for real-time energy problems. In: Proceedings of FSTTCS 2015. LIPIcs, vol. 45, pp. 394–407. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2015) Cachera, D., Fahrenberg, U., Legay, A.: An omega-algebra for real-time energy problems. In: Proceedings of FSTTCS 2015. LIPIcs, vol. 45, pp. 394–407. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2015)
14.
15.
Zurück zum Zitat Chatterjee, K., Doyen, L.: Energy parity games. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf der Heide, F., Spirakis, P.G. (eds.) ICALP 2010. LNCS, vol. 6199, pp. 599–610. Springer, Heidelberg (2010). doi:10.1007/978-3-642-14162-1_50 CrossRef Chatterjee, K., Doyen, L.: Energy parity games. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf der Heide, F., Spirakis, P.G. (eds.) ICALP 2010. LNCS, vol. 6199, pp. 599–610. Springer, Heidelberg (2010). doi:10.​1007/​978-3-642-14162-1_​50 CrossRef
16.
Zurück zum Zitat Chatterjee, K., Doyen, L., Henzinger, T., Raskin, J.: Generalized mean-payoff and energy games. In: Proceedings of FST & TCS 2010. LIPIcs, vol. 8, pp. 505–516. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik (2010) Chatterjee, K., Doyen, L., Henzinger, T., Raskin, J.: Generalized mean-payoff and energy games. In: Proceedings of FST & TCS 2010. LIPIcs, vol. 8, pp. 505–516. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik (2010)
17.
Zurück zum Zitat Chatterjee, K., Komárková, Z., Křetínský, J.: Unifying two views on multiple mean-payoff objectives in Markov decision processes. In: Proceedings of LICS 2015, pp. 244–256 (2015) Chatterjee, K., Komárková, Z., Křetínský, J.: Unifying two views on multiple mean-payoff objectives in Markov decision processes. In: Proceedings of LICS 2015, pp. 244–256 (2015)
18.
Zurück zum Zitat Chatterjee, K., Henzinger, M.: Efficient and dynamic algorithms for alternating Büchi games and maximal end-component decomposition. J. ACM 61(3), 15:1–15:40 (2014)MathSciNetCrossRefMATH Chatterjee, K., Henzinger, M.: Efficient and dynamic algorithms for alternating Büchi games and maximal end-component decomposition. J. ACM 61(3), 15:1–15:40 (2014)MathSciNetCrossRefMATH
19.
Zurück zum Zitat Chatterjee, K., Henzinger, M., Krinninger, S., Nanongkai, D.: Polynomial-time algorithms for energy games with special weight structures. Algorithmica 70(3), 457–492 (2014)MathSciNetCrossRefMATH Chatterjee, K., Henzinger, M., Krinninger, S., Nanongkai, D.: Polynomial-time algorithms for energy games with special weight structures. Algorithmica 70(3), 457–492 (2014)MathSciNetCrossRefMATH
20.
Zurück zum Zitat Chatterjee, K., Randour, M., Raskin, J.F.: Strategy synthesis for multi-dimensional quantitative objectives. Acta Informatica 51(3–4), 129–163 (2014)MathSciNetCrossRefMATH Chatterjee, K., Randour, M., Raskin, J.F.: Strategy synthesis for multi-dimensional quantitative objectives. Acta Informatica 51(3–4), 129–163 (2014)MathSciNetCrossRefMATH
21.
Zurück zum Zitat Clemente, L., Raskin, J.F.: Multidimensional beyond worst-case and almost-sure problems for mean-payoff objectives. In: Proceedings of LICS 2015, pp. 257–268. IEEE Computer Society, Washington (2015) Clemente, L., Raskin, J.F.: Multidimensional beyond worst-case and almost-sure problems for mean-payoff objectives. In: Proceedings of LICS 2015, pp. 257–268. IEEE Computer Society, Washington (2015)
22.
Zurück zum Zitat Filar, J., Vrieze, K.: Competitive Markov Decision Processes. Springer-Verlag New York Inc., New York (1996)CrossRefMATH Filar, J., Vrieze, K.: Competitive Markov Decision Processes. Springer-Verlag New York Inc., New York (1996)CrossRefMATH
23.
Zurück zum Zitat Forejt, V., Kwiatkowska, M., Norman, G., Parker, D.: Automated verification techniques for probabilistic systems. In: Bernardo, M., Issarny, V. (eds.) SFM 2011. LNCS, vol. 6659, pp. 53–113. springer, Heidelberg (2011). doi:10.1007/978-3-642-21455-4_3 CrossRef Forejt, V., Kwiatkowska, M., Norman, G., Parker, D.: Automated verification techniques for probabilistic systems. In: Bernardo, M., Issarny, V. (eds.) SFM 2011. LNCS, vol. 6659, pp. 53–113. springer, Heidelberg (2011). doi:10.​1007/​978-3-642-21455-4_​3 CrossRef
24.
Zurück zum Zitat Gurvich, V., Karzanov, A., Khachiyan, L.: Cyclic games and an algorithm to find minimax cycle means in directed graphs. USSR Comput. Math. Math. Phys. 28(5), 85–91 (1990)MathSciNetCrossRefMATH Gurvich, V., Karzanov, A., Khachiyan, L.: Cyclic games and an algorithm to find minimax cycle means in directed graphs. USSR Comput. Math. Math. Phys. 28(5), 85–91 (1990)MathSciNetCrossRefMATH
25.
Zurück zum Zitat Haase, C., Kiefer, S.: The odds of staying on budget. In: Halldórsson, M.M., Iwama, K., Kobayashi, N., Speckmann, B. (eds.) ICALP 2015. LNCS, vol. 9135, pp. 234–246. Springer, Heidelberg (2015). doi:10.1007/978-3-662-47666-6_19 Haase, C., Kiefer, S.: The odds of staying on budget. In: Halldórsson, M.M., Iwama, K., Kobayashi, N., Speckmann, B. (eds.) ICALP 2015. LNCS, vol. 9135, pp. 234–246. Springer, Heidelberg (2015). doi:10.​1007/​978-3-662-47666-6_​19
26.
Zurück zum Zitat Howard, R.: Dynamic Programming and Markov Processes. MIT Press, New York (1960)MATH Howard, R.: Dynamic Programming and Markov Processes. MIT Press, New York (1960)MATH
27.
Zurück zum Zitat Juhl, L., Guldstrand Larsen, K., Raskin, J.-F.: Optimal bounds for multiweighted and parametrised energy games. In: Liu, Z., Woodcock, J., Zhu, H. (eds.) Theories of Programming and Formal Methods. LNCS, vol. 8051, pp. 244–255. Springer, Heidelberg (2013). doi:10.1007/978-3-642-39698-4_15 CrossRef Juhl, L., Guldstrand Larsen, K., Raskin, J.-F.: Optimal bounds for multiweighted and parametrised energy games. In: Liu, Z., Woodcock, J., Zhu, H. (eds.) Theories of Programming and Formal Methods. LNCS, vol. 8051, pp. 244–255. Springer, Heidelberg (2013). doi:10.​1007/​978-3-642-39698-4_​15 CrossRef
28.
Zurück zum Zitat Kitaev, M., Rykov, V.: Controlled Queueing Systems. CRC Press, Boca Raton (1995)MATH Kitaev, M., Rykov, V.: Controlled Queueing Systems. CRC Press, Boca Raton (1995)MATH
30.
Zurück zum Zitat Puterman, M.L.: Markov Decision Processes. Wiley-Interscience, Hoboken (2005)MATH Puterman, M.L.: Markov Decision Processes. Wiley-Interscience, Hoboken (2005)MATH
31.
Zurück zum Zitat Velner, Y., Chatterjee, K., Doyen, L., Henzinger, T., Rabinovich, A., Raskin, J.: The complexity of multi-mean-payoff and multi-energy games. Inf. Comput. 241, 177–196 (2015)MathSciNetCrossRefMATH Velner, Y., Chatterjee, K., Doyen, L., Henzinger, T., Rabinovich, A., Raskin, J.: The complexity of multi-mean-payoff and multi-energy games. Inf. Comput. 241, 177–196 (2015)MathSciNetCrossRefMATH
32.
Metadaten
Titel
Optimizing the Expected Mean Payoff in Energy Markov Decision Processes
verfasst von
Tomáš Brázdil
Antonín Kučera
Petr Novotný
Copyright-Jahr
2016
DOI
https://doi.org/10.1007/978-3-319-46520-3_3