2010 | OriginalPaper | Book chapter
An Information-Spectrum Approach to Analysis of Return Maximization in Reinforcement Learning
Written by: Kazunori Iwata
Published in: Neural Information Processing. Theory and Algorithms
Publisher: Springer Berlin Heidelberg
In reinforcement learning, Markov decision processes are the most popular stochastic sequential decision processes. For analysis, we frequently assume that the process is stationary, ergodic, or both, but most stochastic sequential decision processes arising in reinforcement learning are, in fact, not necessarily Markovian, stationary, or ergodic. In this paper, we give an information-spectrum analysis of return maximization in processes more general than stationary or ergodic Markov decision processes. We also present a class of stochastic sequential decision processes with the necessary condition for return maximization, and we provide several examples of the best sequences, in terms of return maximization, in this class.