Learning From Demonstration (Atkeson and Schaal). Dang, RLAB, Feb 28th, 2007.


Slide 1: Learning From Demonstration. Atkeson and Schaal. Dang, RLAB, Feb 28th, 2007

Slide 2: Goal
- Robot learning from demonstration
  - Small number of human demonstrations
  - Task-level learning (learn intent, not just mimicry)
- Explore
  - Parametric vs. nonparametric learning
  - Role of a priori knowledge

Slide 3: Known Task
- Pendulum swing-up task
  - Like pole balancing, but more complex
  - Difficult, but success is easy to evaluate
- Simplified setup
  - Hand motion restricted to the horizontal axis
  - The important state variables are picked out in advance:
    pendulum angle, pendulum angular velocity, hand location,
    hand velocity, hand acceleration

Slide 4: Implementation Details
- SARCOS 7-DOF arm
- Stereo vision with colored-ball indicators
- 0.12 s sensing delay overcome with a Kalman filter
  - Uses idealized pendulum dynamics as the process model
- Redundant inverse kinematics and real-time inverse dynamics for control
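As a concrete illustration of the delay compensation mentioned above, here is a minimal sketch (all constants hypothetical) of filtering delayed angle measurements with a Kalman filter whose process model is an idealized, linearized pendulum, then predicting the state forward through the 0.12 s delay:

```python
import numpy as np

# Illustrative constants, not the paper's: gravity, pendulum length,
# control period, and the vision delay from the slide.
g, l, dt, delay = 9.81, 0.35, 0.01, 0.12

# Linearized pendulum about the hanging position: theta_dd = -(g/l) * theta.
A = np.array([[1.0, dt],
              [-(g / l) * dt, 1.0]])    # discrete-time state transition
H = np.array([[1.0, 0.0]])             # only the angle is measured
Q = np.eye(2) * 1e-4                   # process noise (assumed)
R = np.array([[1e-2]])                 # measurement noise (assumed)

def kalman_step(x, P, z):
    """One predict/update cycle on a delayed measurement z."""
    x = A @ x                          # predict with the pendulum model
    P = A @ P @ A.T + Q
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
    x = x + K @ (z - H @ x)            # correct with the measurement
    P = (np.eye(2) - K @ H) @ P
    return x, P

def predict_ahead(x, steps):
    """Roll the model forward to cancel the sensing delay."""
    for _ in range(steps):
        x = A @ x
    return x

x, P = np.zeros(2), np.eye(2)
x, P = kalman_step(x, P, np.array([0.1]))     # delayed angle measurement
x_now = predict_ahead(x, int(delay / dt))     # estimate of the current state
```

The point is only the structure: filter at the (stale) measurement time, then integrate the model forward by the known delay before using the state for control.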

Slide 5: Learning
- The task is composed of two subtasks:
  1. Pole swing-up (open-loop)
  2. Upright balance (feedback)
- Belief: learning subtasks accelerates learning of new tasks
- The focus here is on swing-up; balancing was already learned

Slide 6: First Approach
- Directly mimic the human hand movement: fails
  - Differences between human and robot capabilities
  - Improper demonstration (not horizontal)
  - Imprecise mimicry

Slide 7: Approach the Second
- Learn a reward function
- Learn a model
- Use the human demonstration as a seed so that a planner can find a good policy

Slide 8: Learn Task Model
- Parametric: assume the model structure; learn its parameters via linear regression
- Nonparametric: use Locally Weighted Learning
  - Given the desired output variable and a set of possibly relevant input variables
  - Cross-validation to tune the meta-parameters
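The nonparametric route can be sketched as locally weighted regression: fit a small linear model around each query point, weighting training data by distance. The data and bandwidth below are illustrative; the slides tune such meta-parameters by cross-validation.

```python
import numpy as np

def lwr_predict(query, X, y, bandwidth):
    """Fit a weighted affine model around `query` and evaluate it there."""
    # Gaussian kernel weights: nearby training points dominate the fit.
    w = np.exp(-np.sum((X - query) ** 2, axis=1) / (2.0 * bandwidth ** 2))
    Xb = np.hstack([X, np.ones((len(X), 1))])        # affine local model
    W = np.diag(w)
    # Weighted least squares: beta = (Xb^T W Xb)^-1 Xb^T W y
    beta = np.linalg.solve(Xb.T @ W @ Xb, Xb.T @ W @ y)
    return np.append(query, 1.0) @ beta

# Toy 1-D "dynamics" data: y = sin(x) sampled with noise (stand-in for
# the pendulum's state-transition data).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(200)

pred = lwr_predict(np.array([1.0]), X, y, bandwidth=0.3)
```

Because the model is refit locally at every query, predictions are only trustworthy where training data is dense, which is exactly the "region of validity" issue that resurfaces on slide 13.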

Slide 9: Swing-Up
- Transition to balance occurs at ±0.5 radians with angular velocity < 3 rad/s
- The reward function is set to make the robot want to be like the demonstrator
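The hand-off test and a demonstration-matching reward might look like the following sketch. The transition thresholds come from the slide; the quadratic form of the reward and its weights are assumptions, not the paper's exact function.

```python
# Transition test from the slide: hand off to the balance controller once
# the pendulum is within 0.5 rad of upright with |angular velocity| < 3 rad/s.
def ready_to_balance(theta, theta_dot):
    return abs(theta) <= 0.5 and abs(theta_dot) < 3.0

# Demonstration-matching reward (assumed quadratic form): penalize deviation
# of the robot's state and action from the demonstrator's at each step.
def reward(state, action, demo_state, demo_action, w_x=1.0, w_u=0.1):
    err_x = sum((s - d) ** 2 for s, d in zip(state, demo_state))
    err_u = sum((a - d) ** 2 for a, d in zip(action, demo_action))
    return -(w_x * err_x + w_u * err_u)
```

A reward of this shape makes "be like the demonstrator" precise enough for a planner to optimize, without requiring the demonstration itself to be dynamically feasible for the robot.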

Slide 10: Parametric
- Parameters learned from failure data
- Trajectory optimized using the human trajectory as the seed
- Result: SUCCESS
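Seeding an optimizer with the demonstrated trajectory can be illustrated on a toy problem. The double-integrator system, cost, and finite-difference gradient descent below are stand-ins, not the paper's model or planner; the idea shown is only that optimization starts from the demonstrated control sequence rather than from scratch.

```python
import numpy as np

dt, T = 0.05, 40   # illustrative step size and horizon

def rollout(u):
    """Simulate a double integrator driven by the control sequence u."""
    x = v = 0.0
    for a in u:
        v += a * dt
        x += v * dt
    return x, v

def cost(u):
    # Reach position 1 with zero velocity, with a small effort penalty.
    x, v = rollout(u)
    return (x - 1.0) ** 2 + v ** 2 + 1e-3 * float(np.sum(u ** 2))

def optimize(u, iters=200, lr=0.5, eps=1e-4):
    """Plain finite-difference gradient descent, seeded at u."""
    u = u.copy()
    for _ in range(iters):
        grad = np.zeros_like(u)
        for i in range(len(u)):
            du = np.zeros_like(u)
            du[i] = eps
            grad[i] = (cost(u + du) - cost(u - du)) / (2 * eps)
        u -= lr * grad
    return u

demo = np.ones(T) * 0.2        # crude "demonstrated" control sequence
u_opt = optimize(demo)
```

A good seed matters because the real swing-up cost landscape is nonconvex; starting near the demonstrated solution keeps a local optimizer in the right basin.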

Slide 11: Nonparametric
- Slower, but still successful

Slide 12: Harder Task
- Double-pump swing-up: the approach fails
  - Believed to be due to improper modeling of the system
- Solved by direct task-level learning (next slide)

Slide 13: Direct Task-Level Learning
- Learn a correction term to add to the target hand-off angle
  - New target: ±(0.5 + Δ) rad
  - Find Δ by binary search
- Worked for the parametric model; did not work for the nonparametric one
  - The correction left the region of validity of the local models
  - Fix: scale the velocity along the whole trajectory by a coefficient,
    also found by binary search
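The task-level search for the correction term can be sketched as ordinary bisection over Δ. The `trial` callback standing in for a robot trial, and its 0.12 rad threshold, are hypothetical; the same loop serves for the velocity coefficient.

```python
# Sketch of the task-level binary search for the target-angle correction.
# `trial(delta)` is a stand-in for running the swing-up on the robot and
# reporting whether the pendulum overshot the upright position.
def tune_delta(trial, lo=-0.5, hi=0.5, iters=10):
    for _ in range(iters):
        delta = 0.5 * (lo + hi)
        if trial(delta):       # overshot -> hand off earlier (smaller delta)
            hi = delta
        else:                  # undershot -> hand off later (larger delta)
            lo = delta
    return 0.5 * (lo + hi)

# Toy stand-in: pretend any delta above 0.12 rad causes an overshoot.
delta = tune_delta(lambda d: d > 0.12)
```

With one trial per iteration, ten robot trials narrow a 1 rad interval to about a milliradian, which is what makes this practical on hardware.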

Slide 14: Results (plots shown in the original slides)

Slide 15: Summary of Technique

  Step                                  | Succeeds for
  --------------------------------------|-----------------------
  Watch demo, mimic hand                | (none)
  Learn model, optimize demo trajectory | Parametric, single
  Tune model, reoptimize                | Nonparametric, single
  Binary search for delta               | Parametric, double
  Binary search for c                   | Nonparametric, double

Slide 16: Discussion Points
- Was the reward function given, or learned?
- Does direct task-level learning make sense?
  - Is it only useful in this task / implementation?
  - Is it playing the role of the I term in PID?
- Nonparametric methods do not avoid all modeling errors
  - A poor planner? Not enough data?
- A priori knowledge: the human selects the inputs, outputs, control system,
  perception, model selection, reward function, task segmentation, and task factors
- It works!