MIT Artificial Intelligence Laboratory — Research Directions
Intelligent Agents that Learn
Leslie Pack Kaelbling

Making Reinforcement Learning Really Work
Typical RL methods require far too much data to be practical in an online setting. Address the problem by:
– strong generalization techniques
– using human input to bootstrap
Let humans do what they’re good at; let learning algorithms do what they’re good at.

Incorporating Human Input
Humans can help, even if they are bad at the task:
– Human provides initial trajectories
– No attempt is made to learn to reproduce the trajectories
– Reinforcement learning takes place in parallel
– Once the learned policy is good, use it
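The technical point behind this slide is that off-policy methods such as Q-learning can improve their value estimates from transitions generated by any controller, so the human's trajectories are useful as raw experience even though they are never imitated. A minimal tabular sketch of that idea (the slides do not specify the update rule, and the corridor task later in the deck is continuous, so this is illustrative only):

```python
from collections import defaultdict

class QLearner:
    """Off-policy Q-learner: value estimates improve from transitions
    produced by *any* behavior policy, including a human-supplied one."""

    def __init__(self, actions, alpha=0.1, gamma=0.99):
        self.q = defaultdict(float)           # (state, action) -> value
        self.actions = list(actions)
        self.alpha, self.gamma = alpha, gamma

    def observe(self, s, a, r, s_next):
        # Standard Q-learning backup; the max over next actions makes the
        # update independent of who actually chose action a.
        best_next = max(self.q[(s_next, a2)] for a2 in self.actions)
        target = r + self.gamma * best_next
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])

    def act(self, s):
        # Greedy policy, used once the learned values are good enough.
        return max(self.actions, key=lambda a: self.q[(s, a)])
```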

Learning Phase One
[Diagram: the supplied control policy selects actions (A); the environment returns reward (R) and observation (O); the learning system passively observes the same A/R/O stream.]

Learning Phase Two
[Diagram: the same A/R/O loop, but the learning system now selects the actions; the supplied control policy is no longer in control.]
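Read together, the two phase diagrams describe a single loop in which control changes hands while learning never stops. A sketch under assumptions: the `env.reset`/`env.step` interface, the `supplied_policy` callable, and the fixed episode budget (standing in for "once the learned policy is good") are all hypothetical details, not given in the slides.

```python
def run_two_phase(env, supplied_policy, learner,
                  phase1_episodes=50, phase2_episodes=50):
    """Phase one: the supplied controller chooses actions while the learner
    only watches. Phase two: the learned policy takes over; learning
    continues throughout."""
    for episode in range(phase1_episodes + phase2_episodes):
        phase_one = episode < phase1_episodes
        s = env.reset()
        done = False
        while not done:
            # The A/R/O loop from the diagrams: action out, reward and
            # observation back to both the controller and the learner.
            a = supplied_policy(s) if phase_one else learner.act(s)
            s_next, r, done = env.step(a)
            learner.observe(s, a, r, s_next)   # learning runs in both phases
            s = s_next
```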

MIT Artificial Intelligence Laboratory — Research Directions Early Results: Corridor Following

Corridor Following
3 continuous state dimensions:
– corridor angle
– offset from middle
– distance to end of corridor
1 continuous action dimension:
– rotation velocity
Supplied example policy: average 110 steps to goal
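With three continuous state dimensions and one continuous action dimension, a table of Q-values is not an option: the "strong generalization techniques" mentioned earlier have to fill in values at points never visited. The slides do not name the approximator used, so the sketch below substitutes a simple distance-weighted nearest-neighbour average purely as a stand-in:

```python
import numpy as np

class NearestNeighborQ:
    """Toy value approximator for a continuous (state, action) space:
    stores samples and predicts with an inverse-distance-weighted average
    over the k nearest stored points."""

    def __init__(self, k=5):
        self.k = k
        self.xs = []   # each x = [angle, offset, distance_to_end, rot_velocity]
        self.qs = []

    def predict(self, state, action):
        if not self.xs:
            return 0.0                        # neutral default before any data
        x = np.append(np.asarray(state, dtype=float), action)
        d = np.linalg.norm(np.asarray(self.xs) - x, axis=1)
        idx = np.argsort(d)[: self.k]
        w = 1.0 / (d[idx] + 1e-6)             # closer samples count more
        return float(np.dot(w, np.asarray(self.qs)[idx]) / w.sum())

    def add(self, state, action, q):
        self.xs.append(np.append(np.asarray(state, dtype=float), action))
        self.qs.append(q)
```

A learner in this setting would call predict() when comparing candidate rotation velocities and add() to store each backed-up target, so early experience from the supplied policy immediately shapes value estimates over nearby states.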

Experimental Set-Up
– Initial training runs start from roughly the middle of the corridor
– Translation speed has a fixed policy
– Evaluation on a number of set starting points
– Reward:
  » 10 at end of corridor
  » 0 everywhere else
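The sparse reward is exactly why the supplied example policy matters: random exploration rarely reaches the one rewarding state, while the human's trajectories get there often enough for learning to start. A direct transcription of the reward (the goal-test threshold is an assumed detail; the slides give only the two values):

```python
def corridor_reward(distance_to_end, goal_threshold=0.1):
    """Sparse reward from the slide: 10 at the end of the corridor,
    0 everywhere else. goal_threshold (assumed, in metres) decides
    when the robot counts as having reached the end."""
    return 10.0 if distance_to_end <= goal_threshold else 0.0
```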

Corridor Following: Results
[Plot: training performance over Phase 1 and Phase 2, with the “best” possible performance and the average over training runs shown for comparison.]

MIT Artificial Intelligence Laboratory — Research Directions Corridor Following: Initial Policy

MIT Artificial Intelligence Laboratory — Research Directions Corridor Following: After Phase 1