MIT Artificial Intelligence Laboratory — Research Directions Intelligent Agents that Learn Leslie Pack Kaelbling.


1 MIT Artificial Intelligence Laboratory — Research Directions Intelligent Agents that Learn Leslie Pack Kaelbling

2 Making Reinforcement Learning Really Work
Typical RL methods require far too much data to be practical in an online setting.
Address the problem by
– strong generalization techniques
– using human input to bootstrap
Let humans do what they’re good at
Let learning algorithms do what they’re good at

3 Incorporating Human Input
Humans can help, even if they are bad at the task
– Human provides initial trajectories
– No attempt is made to learn to reproduce the trajectories
– Reinforcement learning takes place in parallel
– Once the learned policy is good, use it
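
The two-phase idea on this slide can be sketched in a few lines. This is only an illustration under loud assumptions: the slides do not name the learning algorithm, so tabular Q-learning is swapped in as a stand-in off-policy learner, and the 10-cell corridor, the 70% forward bias of the supplied policy, and all names (`supplied_policy`, `run_episode`, and so on) are hypothetical. In phase one the supplied policy picks every action while the learner updates from the resulting transitions; in phase two the learned greedy policy takes over.

```python
import random

N_STATES = 10            # hypothetical discretized corridor, positions 0..9
GOAL = N_STATES - 1      # end of the corridor
START = N_STATES // 2    # roughly the middle of the corridor
ACTIONS = (-1, +1)       # step backward, step forward
GAMMA, ALPHA = 0.9, 0.5  # assumed discount factor and learning rate


def step(state, action_index):
    """Move one cell, clamped to the corridor; reward 10 at the end, 0 elsewhere."""
    next_state = min(max(state + ACTIONS[action_index], 0), GOAL)
    done = next_state == GOAL
    return next_state, (10.0 if done else 0.0), done


def supplied_policy(state, rng):
    """A deliberately mediocre supplied controller: forward only 70% of the time."""
    return 1 if rng.random() < 0.7 else 0


def greedy_policy(q, state):
    """Pick the action with the highest learned value."""
    return max(range(len(ACTIONS)), key=lambda a: q[state][a])


def run_episode(q, choose_action, learn, max_steps=1000):
    """Run one episode from START; return the number of steps taken."""
    state, steps = START, 0
    for steps in range(1, max_steps + 1):
        action = choose_action(state)
        next_state, reward, done = step(state, action)
        if learn:
            # Off-policy update: learn action values, not the supplied behavior.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
        state = next_state
        if done:
            break
    return steps


rng = random.Random(0)
q = [[0.0, 0.0] for _ in range(N_STATES)]

# Phase one: the supplied policy controls; the learner updates in parallel.
phase_one_steps = [
    run_episode(q, lambda s: supplied_policy(s, rng), learn=True)
    for _ in range(200)
]
avg_supplied_steps = sum(phase_one_steps) / len(phase_one_steps)

# Phase two: the learned policy is in control.
greedy_steps = run_episode(q, lambda s: greedy_policy(q, s), learn=False)
```

Because the update is off-policy, the learner never tries to reproduce the supplied trajectories; it only uses them as a source of experience, which is why a poor demonstrator can still be useful.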

4 Learning Phase One
[Diagram: Learning System, Supplied Control Policy, Environment; arrows labeled A, R, O]

5 Learning Phase Two
[Diagram: Learning System, Supplied Control Policy, Environment; arrows labeled A, R, O]

6 Early Results: Corridor Following

7 Corridor-Following
3 continuous state dimensions
– corridor angle
– offset from middle
– distance to end of corridor
1 continuous action dimension
– rotation velocity
Supplied example policy
– Average 110 steps to goal

8 Experimental Set-Up
– Initial training runs start from roughly the middle of the corridor
– Translation speed has a fixed policy
– Evaluation on a number of set starting points
– Reward
» 10 at end of corridor
» 0 everywhere else
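
The sparse reward above, together with the three state dimensions listed for the corridor task, might be written down as follows. This is a sketch, not the code used in the experiments: the slides give no units and no definition of "at end of corridor", so the `END_TOLERANCE` threshold and the names `CorridorState` and `reward` are assumptions made for illustration.

```python
from dataclasses import dataclass

GOAL_REWARD = 10.0
END_TOLERANCE = 0.1  # hypothetical: distance at which the end counts as reached


@dataclass(frozen=True)
class CorridorState:
    """The three continuous state dimensions named on the slides (units assumed)."""
    corridor_angle: float      # heading relative to the corridor axis
    offset_from_middle: float  # lateral offset from the corridor centerline
    distance_to_end: float     # remaining distance to the end of the corridor


def reward(state: CorridorState) -> float:
    """Return 10 at the end of the corridor and 0 everywhere else."""
    return GOAL_REWARD if state.distance_to_end <= END_TOLERANCE else 0.0
```

With a reward this sparse, a learner exploring at random rarely sees any signal, which is one reason the supplied trajectories are valuable early on.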

9 Corridor-Following
[Plot: labels “Best” possible, Average training, Phase 1, Phase 2]

10 Corridor Following: Initial Policy

11 Corridor Following: After Phase 1

