Introduction to Imitation Learning

Introduction to Imitation Learning 谷雨 03/12

ICML2018: Imitation Learning

Background
What to predict in imitation learning? A distribution over actions (or simply an action) given a state.
Relation between imitation learning and RL: methodology (i.e., demonstrations / rewards…) and scenario (different levels of freedom).
Relation between imitation learning and supervised learning.

Imitation Learning in a Nutshell
Given: demonstrations or a demonstrator.
Goal: train a policy to mimic the demonstrations.

Components

Some Applications

Notation

Running Example

The Simplest Setting of Imitation Learning: Behavioral Cloning
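The slide names behavioral cloning without showing it, so the following is only a minimal sketch of the idea: treat the demonstrations as labeled (state, action) pairs and fit a policy by ordinary supervised learning. Everything concrete here is an assumption made for illustration, not taken from the slides: the 1-D state, the hypothetical demonstrator `expert_action`, and the logistic policy trained by plain gradient descent.

```python
import math
import random


def expert_action(state: float) -> int:
    """Hypothetical demonstrator: action 1 when the state is negative, else 0."""
    return 1 if state < 0.0 else 0


def collect_demonstrations(n: int, rng: random.Random):
    """Sample states and record the demonstrator's action in each of them."""
    states = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(s, expert_action(s)) for s in states]


def policy_prob(w: float, b: float, state: float) -> float:
    """P(action = 1 | state) under a logistic policy (an assumed policy class)."""
    return 1.0 / (1.0 + math.exp(-(w * state + b)))


def behavioral_cloning(demos, epochs: int = 200, lr: float = 0.5):
    """Fit the policy to the demonstrations by minimizing cross-entropy."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for state, action in demos:
            err = policy_prob(w, b, state) - action  # gradient of the log-loss
            grad_w += err * state
            grad_b += err
        w -= lr * grad_w / len(demos)
        b -= lr * grad_b / len(demos)
    return w, b


rng = random.Random(0)
demos = collect_demonstrations(200, rng)
w, b = behavioral_cloning(demos)

# Fraction of demonstration states where the cloned policy picks the expert's action.
agreement = sum(
    (policy_prob(w, b, s) > 0.5) == bool(a) for s, a in demos
) / len(demos)
print(f"agreement with demonstrator: {agreement:.2f}")
```

Note that the policy is trained and evaluated only on states the demonstrator visited; nothing in this sketch checks how it behaves in states reached by its own mistakes.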

General Imitation Learning vs Behavioral Cloning

Limitations of Behavioral Cloning

When to use Behavioral Cloning

Types of Imitation Learning

Comparison

Interactive Direct Policy Learning

Learning Reductions

A Naïve Attempt: not guaranteed to converge!

Sequential Learning Reductions

Data Aggregation (DAgger)
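The slide gives only the name of the algorithm, so here is a rough sketch of the data-aggregation loop as it is commonly described: roll out the current policy, ask the demonstrator to label the visited states, add those labels to one growing dataset, and retrain on all of it. The details are assumptions chosen to keep the example small: a noisy 1-D chain environment, a hypothetical `expert_action`, a tabular majority-vote learner in place of a real classifier, and a schedule in which only the first iteration rolls out the expert.

```python
import random
from collections import Counter, defaultdict

STATES = list(range(-5, 6))  # hypothetical 1-D chain of 11 states
TARGET = 3                   # hypothetical state the demonstrator steers toward


def expert_action(state: int) -> int:
    """Hypothetical demonstrator: step right below the target, left otherwise."""
    return 1 if state < TARGET else -1


def step(state: int, action: int, rng: random.Random) -> int:
    """Noisy transition: with probability 0.2 the action is replaced at random."""
    if rng.random() < 0.2:
        action = rng.choice([-1, 1])
    return max(-5, min(5, state + action))


def rollout(policy, rng: random.Random, horizon: int = 15):
    """Run the given policy and return the states it visits."""
    state = rng.choice(STATES)
    visited = []
    for _ in range(horizon):
        visited.append(state)
        state = step(state, policy(state), rng)
    return visited


def fit_policy(dataset):
    """Tabular learner: majority expert label per state (stand-in for a classifier)."""
    votes = defaultdict(Counter)
    for state, action in dataset:
        votes[state][action] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    return lambda s: table.get(s, -1)  # arbitrary default for unseen states


def dagger(iterations: int = 5, rollouts_per_iter: int = 10, seed: int = 0):
    rng = random.Random(seed)
    dataset = []            # aggregated over all iterations, never reset
    policy = expert_action  # the first iteration rolls out the demonstrator
    for _ in range(iterations):
        for _ in range(rollouts_per_iter):
            for state in rollout(policy, rng):
                # States come from the current policy; labels from the expert.
                dataset.append((state, expert_action(state)))
        policy = fit_policy(dataset)  # retrain on everything collected so far
    return policy, dataset


policy, dataset = dagger()
print(f"aggregated {len(dataset)} labeled states")
```

The point of the sketch is where the states come from: after the first iteration they are drawn from the learner's own rollouts, so the dataset covers the situations the learner actually ends up in, while the labels always come from the demonstrator.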

Policy Aggregation

Interactive Direct Policy Learning

Inverse Reinforcement Learning: Background for RL

Inverse Reinforcement Learning

Inverse Reinforcement Learning

Simplified version

More Complicated Situations…

Example

Recommended Reading
ICML2018: Imitation Learning Tutorial
Imitation Learning: A Survey of Learning Methods
Learning to Search in Branch and Bound Algorithms (NIPS’2014)
…