Skip to main content
main-content

Über dieses Buch

Dynamic programming (DP) has a relevant history as a powerful and flexible optimization principle, but has a bad reputation as a computationally impractical tool. This book fills a gap between the statement of DP principles and their actual software implementation. Using MATLAB throughout, this tutorial gently gets the reader acquainted with DP and its potential applications, offering the possibility of actual experimentation and hands-on experience. The book assumes basic familiarity with probability and optimization, and is suitable to both practitioners and graduate students in engineering, applied mathematics, management, finance and economics.

Inhaltsverzeichnis

Frontmatter

Chapter 1. The Dynamic Programming Principle

Abstract
Dynamic programming (DP) is a powerful principle for solving quite challenging optimization problems. However, it is not a tool like, e.g., linear programming. If we are able to cast a decision problem within the framework of linear programming models, in most cases we are done. We may just use a state-of-the-art commercial solver, since linear programming is a rather robust and mature technology.
Paolo Brandimarte

Chapter 2. Implementing Dynamic Programming

Abstract
We got acquainted with the DP decomposition principle and Bellman’s recursive equation in Chap. 1. Now we turn to practical matters in terms of actual implementation and computational issues. As we have stressed, dynamic programming is a principle, rather than a specific algorithm that can be purchased off-the-shelves. Thus, careful modeling and analysis are needed on the one hand, and customized implementation is needed on the other one. Indeed, one of the main aims of this tutorial booklet is to encourage newcomers to experiment with DP, which is only possible by implementing it using any programming language or computing environment they like. Our choice is MATLAB, which is widely available and quite convenient as it yields short and intuitive code, not obscuring the underlying ideas.
Paolo Brandimarte

Chapter 3. Modeling for Dynamic Programming

Abstract
The DP principle, in its essence, is a rather simple and quite flexible concept for the decomposition of a multistage decision problem into a sequence of single-stage problems. The trick is to introduce a value function to balance the immediate contribution and the expected contributions from future decisions.
Paolo Brandimarte

Chapter 4. Numerical Dynamic Programming for Discrete States

Abstract
In this chapter we consider standard numerical methods for finite Markov decision processes (MDP), i.e., stochastic control problems where both the space state and the set of available actions at each state are finite. There are several examples of systems featuring finite state and action spaces, like certain types of queueing networks. Alternatively, a finite MDP may result from the discretization of a more complicated continuous state/decision model, even though this need not be the best way to tackle those problems.
Paolo Brandimarte

Chapter 5. Approximate Dynamic Programming and Reinforcement Learning for Discrete States

Abstract
Approximate dynamic programming (ADP) is the collective name for a wide array of numerical DP methods that, unlike the standard methods of Chap. 4, do not ensure that an optimal policy will be found. In exchange for their approximate nature, ADP methods aim at easing the overwhelming curses of DP.
Paolo Brandimarte

Chapter 6. Numerical Dynamic Programming for Continuous States

Abstract
In this chapter we consider discrete-time DP models featuring continuous state and action spaces. Since the value functions are infinite-dimensional objects in this setting, we need an array of numerical techniques to apply the DP principle.
Paolo Brandimarte

Chapter 7. Approximate Dynamic Programming and Reinforcement Learning for Continuous States

Abstract
The numerical methods for stochastic dynamic programming that we have discussed in Chap. 6 are certainly useful tools for tackling some dynamic optimization problems under uncertainty. However, they are not a radical antidote against the curses of DP.
Paolo Brandimarte

Backmatter

Weitere Informationen

Premium Partner

    Bildnachweise