Introducing Autoregressive Policies for Temporally Coherent Exploration in Continuous Control Reinforcement Learning

Reinforcement Learning (RL) is a promising approach to solving complex real-world tasks with physical robots, supported by recent successes in, for example, grasping and object manipulation. In RL, a decision-making agent interacting with the world discovers new behaviours by trial and error, sometimes exploring new ways to do things and sometimes exploiting what it has already found to work well. Efficient exploration of alternative behaviours is therefore key to reinforcement learning.