Introduction to Imitation Learning 谷雨 03/12
ICML2018: Imitation Learning
Background
- What to predict in imitation learning?
  - A distribution of actions (or simply an action) given a state
- Relation between imitation learning and RL
  - Methodology (i.e., demonstrations / rewards…)
  - Scenario (different levels of freedom)
- Relation between imitation learning and supervised learning
Imitation Learning in a Nutshell
- Given: demonstrations or a demonstrator
- Goal: train a policy to mimic the demonstrations
Components
Some Applications
Notation
Running Example
The Simplest Setting of Imitation Learning: Behavioral Cloning
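Behavioral cloning treats the demonstrations as a supervised dataset of (state, action) pairs and fits a policy to them. The sketch below is a minimal illustration under stated assumptions, not the method from the slides: the corridor states, the action names, and the function name `behavioral_cloning` are all hypothetical, and the "classifier" is a majority vote per state.

```python
from collections import Counter, defaultdict

def behavioral_cloning(demonstrations):
    """Fit a tabular policy by supervised learning on (state, action) pairs."""
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    # For every state seen in the demonstrations, copy the most frequent action.
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}

# Hypothetical demonstrations on a 1-D corridor (states are cell indices).
demos = [(0, "right"), (1, "right"), (2, "right"),
         (1, "right"), (2, "left"), (2, "right")]
policy = behavioral_cloning(demos)

print(policy)         # learned state -> action table
print(policy.get(3))  # None: state 3 never appears in the demonstrations
```

The learned table has no entry for states the demonstrator never visited, so the policy has nothing to imitate once it drifts away from the demonstrated trajectories.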
General Imitation Learning vs Behavioral Cloning
Limitations of Behavioral Cloning
When to use Behavioral Cloning
Types of Imitation Learning
Comparison
Interactive Direct Policy Learning
Learning Reductions
A Naïve Attempt: not guaranteed to converge!
Sequential Learning Reductions
Data Aggregation (DAgger)
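The DAgger loop can be sketched as: roll out the current learned policy, ask the demonstrator for the correct action in every visited state, add those labeled pairs to one growing dataset, and retrain on all of it. The code below is a simplified sketch under stated assumptions: the corridor environment, the constants, and all function names are hypothetical; the learner is a tabular majority-vote policy; and the mixing of expert and learner actions during rollouts is omitted (the learner always acts alone, starting from an arbitrary initial policy).

```python
from collections import Counter, defaultdict

GOAL, SIZE, HORIZON = 3, 7, 6  # hypothetical corridor with cells 0..6

def expert(state):
    """Demonstrator that can be queried in any state: step toward GOAL."""
    if state < GOAL:
        return +1
    if state > GOAL:
        return -1
    return 0

def step(state, action):
    """Deterministic dynamics, clamped to the corridor."""
    return min(max(state + action, 0), SIZE - 1)

def fit(dataset, default=+1):
    """Supervised step: majority action per state, `default` for unseen states."""
    counts = defaultdict(Counter)
    for s, a in dataset:
        counts[s][a] += 1
    table = {s: c.most_common(1)[0][0] for s, c in counts.items()}
    return lambda s: table.get(s, default)

def rollout(policy, start):
    """Return the states visited when running `policy` for HORIZON steps."""
    states, s = [], start
    for _ in range(HORIZON):
        states.append(s)
        s = step(s, policy(s))
    return states

def dagger(iterations=3, start=0):
    dataset = []
    policy = lambda s: +1  # arbitrary initial policy (assumption)
    for _ in range(iterations):
        visited = rollout(policy, start)                  # learner's own states
        dataset.extend((s, expert(s)) for s in visited)   # expert relabels them
        policy = fit(dataset)                             # retrain on the aggregate
    return policy, dataset

policy, dataset = dagger()
print(rollout(policy, 0))
```

In the first iteration the initial policy overshoots the goal, so the aggregated dataset contains expert corrections for states the expert itself would never have visited from the start cell. That is the difference from the naïve attempt, which would retrain only on the most recent rollout.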
Policy Aggregation
Interactive Direct Policy Learning
Inverse Reinforcement Learning: Background for RL
Inverse Reinforcement Learning
Simplified version
More Complicated Situations…
Example
Recommended Reading
- ICML2018: Imitation Learning Tutorial
- Imitation Learning: A Survey of Learning Methods
- Learning to Search in Branch and Bound Algorithms (NIPS 2014)
- …