Introduction to Imitation Learning

Presentation transcript:

Introduction to Imitation Learning 谷雨 03/12

ICML2018: Imitation Learning

Background
What to predict in imitation learning? A distribution of actions (or simply an action) given a state.
Relation between imitation learning and RL: methodology (i.e., demonstrations / rewards…) and scenario (different levels of freedom).
Relation between imitation learning and supervised learning.

Imitation Learning in a Nutshell
Given: demonstrations or a demonstrator.
Goal: train a policy to mimic the demonstrations.

Components

Some Applications

Notation

Running Example

The Simplest Setting of Imitation Learning: Behavioral Cloning
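As an illustration of the setting named above, behavioral cloning can be sketched as plain supervised learning on demonstrated (state, action) pairs. The example below is a minimal tabular sketch and not the presenter's implementation: the function name `behavioral_cloning`, the corridor states, and the demonstration data are all hypothetical.

```python
from collections import Counter, defaultdict


def behavioral_cloning(demonstrations):
    """Fit a tabular policy by supervised learning on (state, action) pairs.

    For every state that appears in the demonstrations, the learned policy
    returns the action the demonstrator chose most often in that state.
    """
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


# Hypothetical demonstrations on a 1-D corridor: states are positions 0..3
# and the demonstrator mostly moves "right".
demos = [(0, "right"), (1, "right"), (2, "right"), (3, "right"),
         (1, "left"), (1, "right")]
policy = behavioral_cloning(demos)
```

In this sketch the policy has no entry for any state the demonstrator never visited, which is one way to see why a cloned policy can fail once it drifts away from the demonstrated states.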

General Imitation Learning vs Behavioral Cloning

Limitations of Behavioral Cloning

When to use Behavioral Cloning

Types of Imitation Learning

Comparison

Interactive Direct Policy Learning

Learning Reductions

A Naïve Attempt: not guaranteed to converge!

Sequential Learning Reductions

Data Aggregation (DAgger)
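The data-aggregation idea can be sketched as a loop: roll out the current learner, ask the demonstrator to label the states the learner actually visited, add those pairs to one growing dataset, and retrain. The code below is a simplified sketch under stated assumptions, not the presenter's version: the corridor environment, the names `expert`, `rollout`, `fit`, and `dagger`, and the choice to never mix in expert actions during rollouts are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

GOAL, LOW, HIGH, HORIZON = 3, 0, 6, 6  # hypothetical 1-D corridor


def expert(state):
    """Queryable demonstrator: step toward GOAL, stay put once there."""
    if state < GOAL:
        return 1
    if state > GOAL:
        return -1
    return 0


def rollout(policy, start=1):
    """Return the states visited when running `policy` for HORIZON steps."""
    state, visited = start, []
    for _ in range(HORIZON):
        visited.append(state)
        action = policy.get(state, 1)  # states without an entry default to +1
        state = min(HIGH, max(LOW, state + action))
    return visited


def fit(dataset):
    """Supervised step: majority expert label for each state in the dataset."""
    counts = defaultdict(Counter)
    for state, action in dataset:
        counts[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


def dagger(iterations=3):
    dataset, policy = [], {}  # the empty initial policy always plays +1
    for _ in range(iterations):
        states = rollout(policy)                     # states the learner reaches
        dataset += [(s, expert(s)) for s in states]  # demonstrator labels them
        policy = fit(dataset)                        # retrain on the aggregate
    return policy, dataset


policy, dataset = dagger()
```

Because the first rollout overshoots the goal, the demonstrator ends up labeling states past it as well, so the retrained policy learns how to come back instead of only copying an ideal trajectory.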

Policy Aggregation

Interactive Direct Policy Learning

Inverse Reinforcement Learning: Background for RL

Inverse Reinforcement Learning

Inverse Reinforcement Learning

Simplified version

More Complicated Situations…

Example

Recommended Reading
ICML2018: Imitation Learning Tutorial
Imitation Learning: A Survey of Learning Methods
Learning to Search in Branch and Bound Algorithms (NIPS’2014)
…