Published by Diane Pierce. Modified over 9 years ago.
1
MIT Artificial Intelligence Laboratory: Research Directions
Intelligent Agents that Learn
Leslie Pack Kaelbling
2
Making Reinforcement Learning Really Work
Typical RL methods require far too much data to be practical in an online setting.
Address the problem by
– strong generalization techniques
– using human input to bootstrap
Let humans do what they’re good at
Let learning algorithms do what they’re good at
3
Incorporating Human Input
Humans can help, even if they are bad at the task
– Human provides initial trajectories
– No attempt is made to learn to reproduce the trajectories
– Reinforcement learning takes place in parallel
– Once learned policy is good, use it
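The bullets above can be illustrated with a small two-phase loop. This is only a sketch under stated assumptions: the slides do not name the learning algorithm, so tabular Q-learning is swapped in as a stand-in, and the 1-D corridor, `supplied_policy`, and all constants are hypothetical. In phase one the supplied policy picks every action while the learner updates its value estimates from the same transitions, without trying to imitate the supplied policy. In phase two the learned greedy policy takes over control.

```python
import random

N_CELLS = 11            # hypothetical 1-D corridor; the goal is the right end
GOAL = N_CELLS - 1
ACTIONS = (-1, +1)      # step left / step right
ALPHA, GAMMA = 0.5, 0.9  # assumed learning rate and discount factor


def step(state, action):
    """Toy environment: reward 10 on reaching the goal, 0 everywhere else."""
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (10.0 if nxt == GOAL else 0.0), nxt == GOAL


def supplied_policy(state, rng):
    """Deliberately mediocre stand-in for human input: moves right 70% of the time."""
    return +1 if rng.random() < 0.7 else -1


def greedy(q, state):
    """Action with the highest current value estimate."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, s, a, r, s2, done):
    """One off-policy Q-learning backup; it does not care who chose the action."""
    target = r if done else r + GAMMA * max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += ALPHA * (target - q[(s, a)])


def run_episode(q, choose_action, max_steps=500):
    """Run from the middle of the corridor, learning on every step; return step count."""
    s, steps, done = N_CELLS // 2, 0, False
    while not done and steps < max_steps:
        a = choose_action(s)
        s2, r, done = step(s, a)
        q_update(q, s, a, r, s2, done)
        s, steps = s2, steps + 1
    return steps


def train(phase_one_episodes=200, phase_two_episodes=20, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_CELLS) for a in ACTIONS}
    # Phase one: the supplied policy controls; the learner only watches and updates.
    phase1 = [run_episode(q, lambda s: supplied_policy(s, rng))
              for _ in range(phase_one_episodes)]
    # Phase two: the learned policy controls and keeps learning.
    phase2 = [run_episode(q, lambda s: greedy(q, s))
              for _ in range(phase_two_episodes)]
    return q, phase1, phase2
```

Calling `train()` returns the value table plus the per-episode step counts for each phase, so the two phases can be compared directly. Because the update is off-policy, the learner can end up better than the trajectories it was shown.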
4
Learning Phase One
[Diagram: Learning System, Supplied Control Policy, Environment; signals labeled A, R, O]
5
Learning Phase Two
[Diagram: Learning System, Supplied Control Policy, Environment; signals labeled A, R, O]
6
Early Results: Corridor Following
7
Corridor-Following
3 continuous state dimensions
– corridor angle
– offset from middle
– distance to end of corridor
1 continuous action dimension
– rotation velocity
Supplied example policy
– Average 110 steps to goal
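One way to picture the state and action spaces listed above is as a small record with three floats and a controller that returns a single float. The names, units, gains, and the proportional controller itself are assumptions made for illustration; the slides do not describe how the supplied example policy was built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorridorState:
    """The three continuous state dimensions (units are assumed, not from the slides)."""
    angle: float            # heading relative to the corridor axis, radians
    offset: float           # signed distance from the corridor midline, metres
    distance_to_end: float  # remaining distance to the end of the corridor, metres

    def as_vector(self):
        """Flat tuple, convenient as input to a function approximator."""
        return (self.angle, self.offset, self.distance_to_end)


def example_supplied_policy(state, k_angle=1.0, k_offset=0.5, max_rate=1.0):
    """Hypothetical proportional controller.

    Returns the single continuous action, a rotation velocity, clamped to
    [-max_rate, max_rate]. It steers against both heading error and offset.
    """
    raw = -k_angle * state.angle - k_offset * state.offset
    return max(-max_rate, min(max_rate, raw))
```

For instance, `example_supplied_policy(CorridorState(0.2, 0.4, 3.0))` yields a negative rotation velocity, turning the robot back toward the midline under the sign convention assumed here.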
8
Experimental Set-Up
– Initial training runs start from roughly the middle of the corridor
– Translation speed has a fixed policy
– Evaluation on a number of set starting points
– Reward
» 10 at end of corridor
» 0 everywhere else
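The reward scheme above is sparse, and a short sketch makes one consequence visible: without discounting, every successful run earns exactly 10 no matter how long it takes, so some discount factor is needed before shorter runs score higher. The discount factor and the `goal_tolerance` parameter are assumptions; the slides give only the two reward values.

```python
GOAL_REWARD = 10.0  # from the slides: 10 at end of corridor, 0 everywhere else


def reward(distance_to_end, goal_tolerance=0.0):
    """Sparse reward; goal_tolerance is a hypothetical threshold for 'at the end'."""
    return GOAL_REWARD if distance_to_end <= goal_tolerance else 0.0


def discounted_return(steps_to_goal, gamma=0.99):
    """Return of a run that reaches the goal on its last step.

    All intermediate rewards are 0, so only the final reward contributes,
    discounted by gamma once for each earlier step. gamma is an assumed value.
    """
    return (gamma ** (steps_to_goal - 1)) * GOAL_REWARD


def mean_steps(step_counts):
    """Average steps to goal over a set of evaluation starting points."""
    return sum(step_counts) / len(step_counts)
```

With `gamma=1.0` a 110-step run and a 50-step run both return 10, while with any `gamma` below 1 the 50-step run returns more.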
9
Corridor-Following
[Plot labels: “Best” possible, Average training, Phase 1, Phase 2]
10
Corridor Following: Initial Policy
11
Corridor Following: After Phase 1