2009 | Original Paper | Book Chapter
Introduction
Author: Matthew E. Taylor
Published in: Transfer in Reinforcement Learning Domains
Publisher: Springer Berlin Heidelberg
In reinforcement learning (RL) problems [Sutton and Barto (1998)], learning agents execute sequential actions with the goal of maximizing a reward signal, which may be time-delayed. For example, an agent could learn to play a game by being told only whether it wins or loses, without ever being told the "correct" action. The RL framework has gained popularity with the development of algorithms capable of mastering increasingly complex problems. However, when RL agents begin learning tabula rasa, mastering difficult tasks is often slow or infeasible, and thus a significant amount of current research in RL focuses on improving the speed of learning by exploiting domain expertise with varying amounts of human-provided knowledge. Common approaches include deconstructing the task into a hierarchy of subtasks (cf. MAXQ [Dietterich (2000)]), learning over temporally abstract actions rather than simple one-step actions (e.g., using the options framework [Sutton et al. (1999)]), and abstracting over the state space (e.g., via function approximation [Sutton and Barto (1998)]) so that agents may generalize experience efficiently.
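To make the framework concrete, the following is a minimal sketch of tabular Q-learning on a hypothetical toy "chain" task invented for illustration (it does not appear in the text): the agent starts at state 0, can move left or right along five states, and receives a reward of 1 only upon reaching the rightmost state, so the reward signal is time-delayed relative to the early actions. All names and parameter values here are illustrative choices, not anything prescribed by the chapter.

```python
import random

# Hypothetical chain MDP: states 0..4; action 1 moves right, action 0 moves left.
# Reward 1 is given only on entering state 4 (which ends the episode);
# every other step gives 0, so early actions are rewarded only much later.
N_STATES = 5
ACTIONS = (0, 1)

def step(state, action):
    """One environment transition: returns (next_state, reward, done)."""
    next_state = max(0, state + (1 if action == 1 else -1))
    if next_state == N_STATES - 1:
        return next_state, 1.0, True  # delayed reward, only at the goal
    return next_state, 0.0, False

def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table from win/loss-style feedback alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy exploration: occasionally try a random action.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Temporal-difference update toward the bootstrapped target.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q = q_learning()
# After learning, "move right" should dominate in every non-terminal state,
# even though the agent was never told which action was correct.
print(all(q[s][1] > q[s][0] for s in range(N_STATES - 1)))
```

Note that the agent receives no supervision about individual actions; the preference for moving right emerges purely from propagating the delayed goal reward backward through the Q-table, which is exactly the credit-assignment burden that makes tabula rasa learning slow on harder tasks.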