Risk-averse dynamic programming for Markov decision processes

Ruszczyński, Andrzej

doi:10.1007/s10107-010-0393-3

Risk-averse dynamic programming for Markov decision processes

Full Length Paper
Series B
Published: 21 July 2010

Volume 125, pages 235–261, (2010)
Cite this article

Mathematical Programming Submit manuscript

Andrzej Ruszczyński¹

2999 Accesses
238 Citations
3 Altmetric
Explore all metrics

An Erratum to this article was published on 10 May 2014

Abstract

We introduce the concept of a Markov risk measure and we use it to formulate risk-averse control problems for two Markov decision models: a finite horizon model and a discounted infinite horizon model. For both models we derive risk-averse dynamic programming equations and a value iteration method. For the infinite horizon problem we develop a risk-averse policy iteration method and we prove its convergence. We also propose a version of the Newton method to solve a nonsmooth equation arising in the policy iteration method and we prove its global convergence. Finally, we discuss relations to min–max Markov decision models.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

References

Artzner P., Delbaen F., Eber J.M., Heath D.: Coherent measures of risk. Math. Finance 9, 203–228 (1999)
Article MATH MathSciNet Google Scholar
Artzner P., Delbaen F., Eber J.-M., Heath D., Ku H.: Coherent multiperiod risk adjusted values and Bellmans principle. Ann. Oper. Res. 152, 5–22 (2007)
Article MATH MathSciNet Google Scholar
Aubin J.-P., Frankowska H.: Set-Valued Analysis. Birkhäuser, Boston (1990)
MATH Google Scholar
Bellman R.: On the theory of dynamic programming. Proc. Natl. Acad. Sci 38, 716 (1952)
Article MATH Google Scholar
Bellman R.: Applied Dynamic Programming. Princeton University Press, Princeton (1957)
Google Scholar
Bertsekas D., Shreve S.E.: Stochastic Optimal Control. The Discrete Time Case. Academic Press, New York (1978)
MATH Google Scholar
Boda K., Filar J.A.: Time consistent dynamic risk measures. Math. Methods Oper. Res. 63, 169–186 (2006)
Article MATH MathSciNet Google Scholar
Cheridito P., Delbaen F., Kupper M.: Dynamic monetary risk measures for bounded discrete-time processes. Electron. J. Probab. 11, 57–106 (2006)
MathSciNet Google Scholar
Chung K.-J., Sobel M.J.: Discounted MDPs: distribution functions and exponential utility maximization. SIAM J. Control Optim. 25, 49–62 (1987)
Article MATH MathSciNet Google Scholar
Delbaen F.: Coherent risk measures on general probability spaces, In essays in honour of Dieter Sondermann. Springer, Berlin (2002)
Google Scholar
Eichhorn A., Römisch W.: Polyhedral risk measures in stochastic programming. SIAM J. Optim. 16, 69–95 (2005)
Article MATH MathSciNet Google Scholar
Fleming W.H., Sheu S.J.: Optimal long term growth rate of expected utility of wealth. Ann. Appl. Probab. 9, 871–903 (1999)
Article MATH MathSciNet Google Scholar
Fleming W.H., Sheu S.J.: Risk-sensitive control and an optimal investment model. Math. Finance 10, 197–213 (2000)
Article MATH MathSciNet Google Scholar
Föllmer H., Penner I.: Convex risk measures and the dynamics of their penalty functions. Stat. Decis. 24, 61–96 (2006)
Article MATH Google Scholar
Föllmer H., Schied A.: Convex measures of risk and trading constraints. Finance Stoch. 6, 429–447 (2002)
Article MATH MathSciNet Google Scholar
Föllmer H., Schied A.: Stochastic Finance. An Introduction in Discrete Time. de Gruyter, Berlin (2004)
Book MATH Google Scholar
Fritelli M., Rosazza Gianin E.: Putting order in risk measures. J. Bank. Finance 26, 1473–1486 (2002)
Article Google Scholar
Frittelli M., Rosazza Gianin E.: Dynamic convex risk measures. In: Szegö, G. (eds) Risk Measures for the 21st Century, pp. 227–248. Wiley, Chichester (2005)
Google Scholar
Fritelli M., Scandolo G.: Risk measures and capital requirements for processes. Math. Finance 16, 589–612 (2006)
Article MathSciNet Google Scholar
González-Trejo J.I., Hernández-Lerma O., Hoyos-Reyes L.F.: Minimax control of discrete-time stochastic systems. SIAM J. Control Optim. 41, 1626–1659 (2003)
Article MATH Google Scholar
Hernández-Lerma O., Lasserre J.B.: Discrete-time Markov Control Processes. Basic Optimality Criteria. Springer, New York (1996)
Google Scholar
Howard R.A.: Dynamic Programming and Markov Processes. Wiley, New York (1960)
MATH Google Scholar
Jaquette S.C.: Markov decision processes with a new optimality criterion: Discrete time. Ann. Stat. 1, 496–505 (1973)
Article MATH MathSciNet Google Scholar
Jaquette S.C.: A utility criterion for Markov decision processes. Manag. Sci. 23, 43–49 (1976)
Article MATH MathSciNet Google Scholar
Jobert L., Rogers L.C.G.: Valuations and dynamic convex risk measures. Math. Finance 18, 1–22 (2008)
Article MATH MathSciNet Google Scholar
Klatte D., Kummer B.: Nonsmooth Equations in Optimization. Kluwer, Dordrecht (2002)
MATH Google Scholar
Klein Haneveld, W.: Duality in stochastic linear and dynamic programming. Lecture notes economics and mathematical systems 274. Springer, Berlin (1986)
Klöppel S., Schweizer M.: Dynamic indifference valuation via convex risk measures. Math. Finance 17, 599–627 (2007)
Article MATH MathSciNet Google Scholar
Koopmans T.C.: Stationary ordinal utility and impatience. Econometrica 28, 287–309 (1960)
Article MATH MathSciNet Google Scholar
Kreps M.K., Porteus E.L.: Temporal resolution of uncertainty and dynamic choice theory. Econometrica 46, 185–200 (1978)
Article MATH MathSciNet Google Scholar
Kummer B. et al.: Newton’s method for non-differentiable functions. In: Guddat, J. (eds) Advances in Mathematical Optimization, pp. 114–125. Academie Verlag, Berlin (1988)
Google Scholar
Kushner H.J.: Introduction to Stochastic Control. Holt, Rhinehart, and Winston, New York (1971)
MATH Google Scholar
Küenle H.-U.: Stochastiche Spiele und Entscheidungsmodelle. B. G. Teubner, Leipzig (1986)
Google Scholar
Leitner J.: A short note on second-order stochastic dominance preserving coherent risk measures. Math. Finance 15, 649–651 (2005)
Article MATH MathSciNet Google Scholar
Ogryczak W., Ruszczyński A.: From stochastic dominance to mean-risk models: Semideviations as risk measures. Eur. J. Oper. Res. 116, 33–50 (1999)
Article MATH Google Scholar
Ogryczak W., Ruszczyński A.: On consistency of stochastic dominance and mean-semideviation models. Math. Program. 89, 217–232 (2001)
Article MATH MathSciNet Google Scholar
Ogryczak W., Ruszczyński A.: Dual stochastic dominance and related mean-risk models. SIAM J. Optim. 13(1), 60–78 (2002)
Article MATH MathSciNet Google Scholar
Pflug G.Ch., Römisch W.: Modeling, Measuring and Managing Risk. World Scientific, Singapore (2007)
Book MATH Google Scholar
Puterman M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York (1994)
MATH Google Scholar
Riedel F.: Dynamic coherent risk measures. Stoch. Process. Appl. 112, 185–200 (2004)
Article MATH MathSciNet Google Scholar
Rockafellar R.T., Uryasev S.P.: Conditional value-at-risk for general loss distributions. J. Bank. Finance 26, 1443–1471 (2002)
Article Google Scholar
Rockafellar R.T., Wets R.J.-B.: Variational Analysis. Springer, Berlin (1998)
Book MATH Google Scholar
Rockafellar R.T., Uryasev S., Zabarankin M.: Deviation measures in risk analysis and optimization. Finance Stoch. 10, 51–74 (2006)
Article MATH MathSciNet Google Scholar
Robinson S.M.: Newton’s method for a class of nonsmooth functions. Set-Valued Anal. 2, 291–305 (1994)
Article MATH MathSciNet Google Scholar
Ruszczyński A., Shapiro A.: Optimization of risk measures. In: Calafiore, G., Dabbene, F. (eds) Probabilistic and Randomized Methods for Design Under Uncertainty, Springer, London (2005)
Google Scholar
Ruszczyński A., Shapiro A.: Optimization of convex risk functions. Math. Oper. Res. 31, 433–452 (2006)
Article MATH MathSciNet Google Scholar
Ruszczyński A., Shapiro A.: Conditional risk mappings. Math. Oper. Res. 31, 544–561 (2006)
Article MATH MathSciNet Google Scholar
Scandolo, G.: Risk measures in a dynamic setting. PhD Thesis, Università degli Studi di Milano, Milan (2003)
Shapiro A.: On a time consistency concept in risk averse multistage stochastic programming. Oper. Res. Lett. 37, 143–147 (2009)
Article MATH MathSciNet Google Scholar
White D.J.: Markov Decision Processes. Wiley, New York (1993)
MATH Google Scholar

Download references

Author information

Authors and Affiliations

Department of Management Science and Information Systems, Rutgers University, Piscataway, NJ, 08854, USA
Andrzej Ruszczyński

Authors

Andrzej Ruszczyński
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Andrzej Ruszczyński.

Additional information

An erratum to this article is available at http://dx.doi.org/10.1007/s10107-014-0783-z.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Ruszczyński, A. Risk-averse dynamic programming for Markov decision processes. Math. Program. 125, 235–261 (2010). https://doi.org/10.1007/s10107-010-0393-3

Download citation

Received: 29 April 2009
Accepted: 09 November 2009
Published: 21 July 2010
Issue Date: October 2010
DOI: https://doi.org/10.1007/s10107-010-0393-3

Keywords

Mathematics Subject Classification (2000)

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Risk-averse dynamic programming for Markov decision processes

Abstract

Access this article

Similar content being viewed by others

Finite horizon risk-sensitive continuous-time Markov decision processes with unbounded transition and cost rates

Markov decision processes with risk-sensitive criteria: an overview

Risk-sensitive infinite-horizon discounted piecewise deterministic Markov decision processes

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Keywords

Mathematics Subject Classification (2000)

Navigation

Risk-averse dynamic programming for Markov decision processes

Abstract

Access this article

Similar content being viewed by others

Finite horizon risk-sensitive continuous-time Markov decision processes with unbounded transition and cost rates

Markov decision processes with risk-sensitive criteria: an overview

Risk-sensitive infinite-horizon discounted piecewise deterministic Markov decision processes

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification (2000)

Search

Navigation