2005 | OriginalPaper | Chapter
On Variability of Optimal Policies in Markov Decision Processes
Author: Karl-Heinz Waldmann
Published in: Data Analysis and Decision Support
Publisher: Springer Berlin Heidelberg
Both the total reward criterion and the average reward criterion commonly used in Markov decision processes lead to an optimal policy that maximizes the associated expected value. The paper reviews these standard approaches and studies the distribution functions obtained by applying an optimal policy. In particular, it suggests an efficient extrapolation method that results from the control of Markov decision models with an absorbing set.
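To make the distinction between expected value and distribution concrete, the following is a minimal Python sketch, not the chapter's method. It assumes a hypothetical toy model with one transient state and one absorbing state; the action names, rewards, and probabilities in `MODEL` are invented for illustration. Value iteration finds the action that maximizes the expected total reward, and a simulation then estimates how widely the total reward is spread under that action.

```python
import random

# Hypothetical toy model (an assumption, not taken from the chapter):
# state 0 is transient, state 1 is absorbing and yields no further reward.
# MODEL[action] = (one-step reward, probability of staying in state 0)
MODEL = {"safe": (2.0, 0.0), "risky": (1.0, 0.6)}


def value_iteration(model, tol=1e-12, max_iter=10_000):
    """Return the expected total reward from state 0 and a maximizing action."""
    v = 0.0
    best = None
    for _ in range(max_iter):
        # The absorbing state has value 0, so only the "stay" branch contributes.
        q = {a: r + p_stay * v for a, (r, p_stay) in model.items()}
        best = max(q, key=q.get)
        if abs(q[best] - v) < tol:
            return q[best], best
        v = q[best]
    return v, best


def simulate_total_reward(model, action, rng):
    """Draw one total reward until absorption under a fixed stationary action."""
    r, p_stay = model[action]
    total = r
    while rng.random() < p_stay:
        total += r
    return total


value, policy = value_iteration(MODEL)

rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [simulate_total_reward(MODEL, policy, rng) for _ in range(20_000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)

print(f"optimal action: {policy}, expected total reward: {value:.4f}")
print(f"simulated mean: {mean:.3f}, simulated variance: {var:.3f}")
```

In this toy model the "safe" action pays exactly 2, while the "risky" action has the higher expectation (2.5) but a geometrically distributed total reward. The policy that is optimal in expectation is therefore also the one with the larger spread, which is the kind of effect that looking at the distribution function reveals.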