Abstract
We investigate the problem of minimizing the Average-Value-at-Risk (AVaR_τ) of the discounted cost generated by a Markov Decision Process (MDP) over a finite and an infinite horizon. We show that this problem can be reduced to an ordinary MDP with an extended state space and give conditions under which an optimal policy exists. We also give a time-consistent interpretation of the AVaR_τ criterion. Finally, we consider a numerical example, a simple repeated casino game, and use it to discuss the influence of the risk-aversion parameter τ of the AVaR_τ criterion.
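As background for the criterion used above, the AVaR of a cost variable admits a well-known minimization representation (due to Rockafellar and Uryasev), AVaR_τ(X) = min_s { s + E[(X − s)^+] / (1 − τ) }, whose minimizer is the τ-quantile (Value-at-Risk). A minimal sketch of an empirical estimator under this convention — the function name `avar` and the sample-based setup are illustrative, not taken from the paper:

```python
import numpy as np

def avar(costs, tau):
    """Empirical Average-Value-at-Risk of a cost sample at level tau in (0, 1).

    Uses the Rockafellar-Uryasev representation
        AVaR_tau(X) = min_s { s + E[(X - s)^+] / (1 - tau) },
    whose minimizer s* is the tau-quantile of X (the Value-at-Risk).
    For costs, larger tau means averaging over a thinner upper tail,
    i.e. stronger risk aversion.
    """
    costs = np.asarray(costs, dtype=float)
    s = np.quantile(costs, tau)  # Value-at-Risk at level tau
    # Average shortfall beyond s, rescaled by the tail probability 1 - tau.
    return s + np.mean(np.maximum(costs - s, 0.0)) / (1.0 - tau)

# Illustration: for costs 1..100 at tau = 0.9, AVaR is roughly the
# average of the worst (largest) 10% of costs.
print(avar(np.arange(1, 101), 0.9))
```

For τ → 0 the criterion approaches the ordinary expected cost; for τ → 1 it approaches the worst-case cost, which matches the abstract's reading of τ as a risk-aversion parameter.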
Additional information
The underlying projects have been funded by the Bundesministerium für Bildung und Forschung of Germany under promotional reference 03BAPAC1. The authors are responsible for the content of this article.
Cite this article
Bäuerle, N., Ott, J. Markov Decision Processes with Average-Value-at-Risk criteria. Math Meth Oper Res 74, 361–379 (2011). https://doi.org/10.1007/s00186-011-0367-0