Introduction
Related works
Robot model
Controller
Reinforcement learning methods as controllers
Q learning
Algorithm
Result and discussion
Deep Q network (DQN)
- Experience replay
- Derivation of Q values in one forward pass (Additional file 5).
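Obtaining the Q values in one forward pass means the network takes only the state as input and returns one value per action, so the greedy action follows from a single evaluation rather than one evaluation per action. The sketch below illustrates the idea in Python with a single linear layer standing in for the network; the function name `q_values`, the weights, and the state are hypothetical values chosen for illustration, not the model used in this work.

```python
def q_values(state, weights, biases):
    """Return one Q-value estimate per action from a single pass over the state.

    Each row of `weights` (with its bias) plays the role of one output unit,
    so the length of the result equals the number of actions.
    """
    return [
        sum(w * x for w, x in zip(row, state)) + b
        for row, b in zip(weights, biases)
    ]


# Hypothetical 2-dimensional state and 3 discrete actions.
state = [0.5, -1.0]
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
biases = [0.0, 0.0, 0.25]

q = q_values(state, weights, biases)  # all action values at once
greedy_action = max(range(len(q)), key=lambda a: q[a])  # index of the largest Q value
```

An alternative design would feed the state and an action together and return a single value, which would need one pass per action to pick the greedy action.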
Experience replay
- It allows greater data efficiency, as each step of experience can be used in many weight updates.
- Randomizing batches breaks correlations between consecutive samples.
- The behaviour distribution is averaged over many of its previous states.
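The three properties above can be illustrated with a minimal replay buffer sketch in Python. The class name `ReplayBuffer`, the transition layout `(state, action, reward, next_state, done)`, and the capacity are assumptions made for illustration, not the implementation used in this work.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        # A bounded deque discards the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Each stored step can later appear in many sampled batches.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement mixes old and new transitions,
        # so a batch is not a run of consecutive, correlated steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Usage with dummy integer states (hypothetical values).
buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.push(state=t, action=0, reward=1.0, next_state=t + 1, done=False)

batch = buffer.sample(batch_size=2)
```

Because the buffer holds transitions generated under many earlier versions of the policy, a sampled batch reflects an average over past behaviour rather than only the most recent steps.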