Beyond Geometric Path Planning: When Context Matters Ashesh Jain, Shikhar Sharma Thorsten Joachims and Ashutosh Saxena.

Slides:

Advertisements

Similar presentations

Advertisements

Symantec 2010 Windows 7 Migration Global Results.

University Paderborn 07 January 2009 RG Knowledge Based Systems Prof. Dr. Hans Kleine Büning Reinforcement Learning.

Trend for Precision Soil Testing % Zone or Grid Samples Tested compared to Total Samples.

Trend for Precision Soil Testing % Zone or Grid Samples Tested compared to Total Samples.

AGVISE Laboratories %Zone or Grid Samples – Northwood laboratory

PDAs Accept Context-Free Languages

ALAK ROY. Assistant Professor Dept. of CSE NIT Agartala

1 DATA STRUCTURES USED IN SPATIAL DATA MINING. 2 What is Spatial data ? broadly be defined as data which covers multidimensional points, lines, rectangles,

EuroCondens SGB E.

Reinforcement Learning

& dding ubtracting ractions.

Sequential Logic Design

Multiplication X 1 1 x 1 = 1 2 x 1 = 2 3 x 1 = 3 4 x 1 = 4 5 x 1 = 5 6 x 1 = 6 7 x 1 = 7 8 x 1 = 8 9 x 1 = 9 10 x 1 = x 1 = x 1 = 12 X 2 1.

1 When you see… Find the zeros You think…. 2 To find the zeros...

Multiplication Facts Review. 6 x 4 = 24 5 x 5 = 25.

CHAPTER 18 The Ankle and Lower Leg

1 Outline relationship among topics secrets LP with upper bounds by Simplex method basic feasible solution (BFS) by Simplex method for bounded variables.

The 5S numbers game..

A Fractional Order (Proportional and Derivative) Motion Controller Design for A Class of Second-order Systems Center for Self-Organizing Intelligent.

1 OFDM Synchronization Speaker:. Wireless Access Tech. Lab. CCU Wireless Access Tech. Lab. 2 Outline OFDM System Description Synchronization What is Synchronization?

Numerical Analysis 1 EE, NCKU Tien-Hao Chang (Darby Chang)

Chapter 7: Steady-State Errors 1 ©2000, John Wiley & Sons, Inc. Nise/Control Systems Engineering, 3/e Chapter 7 Steady-State Errors.

A pre-Weekend Talk on Online Learning TGIF Talk Series Purushottam Kar.

Break Time Remaining 10:00.

The basics for simulations

EE, NCKU Tien-Hao Chang (Darby Chang)

Employee & Manager Self Service Overview

MM4A6c: Apply the law of sines and the law of cosines.

Galit Haim, Ya'akov Gal, Sarit Kraus and Michele J. Gelfand A Cultural Sensitive Agent for Human-Computer Negotiation 1.

Outline Minimum Spanning Tree Maximal Flow Algorithm LP formulation 1.

1 Prediction of electrical energy by photovoltaic devices in urban situations By. R.C. Ott July 2011.

Dynamic Access Control the file server, reimagined Presented by Mark on twitter 1 contents copyright 2013 Mark Minasi.

Computer vision: models, learning and inference

Putting the ‘x’ back into goal-free problems: a brainstorming approach Carina Schubert 1, Paul Ayres 2, Katharina Scheiter 1 and John Sweller 2 1 University.

MaK_Full ahead loaded 1 Alarm Page Directory (F11)

Facebook Pages 101: Your Organization’s Foothold on the Social Web A Volunteer Leader Webinar Sponsored by CACO December 1, 2010 Andrew Gossen, Senior.

1. Motion of an object is described by its position,

Artificial Intelligence

When you see… Find the zeros You think….

1 Motion and Manipulation Configuration Space. Outline Motion Planning Configuration Space and Free Space Free Space Structure and Complexity.

LN-251 SimINERTIAL Performance

2011 WINNISQUAM COMMUNITY SURVEY YOUTH RISK BEHAVIOR GRADES 9-12 STUDENTS=1021.

2011 FRANKLIN COMMUNITY SURVEY YOUTH RISK BEHAVIOR GRADES 9-12 STUDENTS=332.

2.10% more children born Die 0.2 years sooner Spend 95.53% less money on health care No class divide 60.84% less electricity 84.40% less oil.

1 Non Deterministic Automata. 2 Alphabet = Nondeterministic Finite Accepter (NFA)

One-Degree Imager (ODI), WIYN Observatory What’s REALLY New in SolidWorks 2010 Richard Doyle, User Community Manager inspiration.

Static Equilibrium; Elasticity and Fracture

Clock will move after 1 minute

& dding ubtracting ractions.

Lial/Hungerford/Holcomb/Mullins: Mathematics with Applications 11e Finite Mathematics with Applications 11e Copyright ©2015 Pearson Education, Inc. All.

Energy Generation in Mitochondria and Chlorplasts

Select a time to count down from the clock above

A Data Warehouse Mining Tool Stephen Turner Chris Frala

1 Dr. Scott Schaefer Least Squares Curves, Rational Representations, Splines and Continuity.

Chart Deception Main Source: How to Lie with Charts, by Gerald E. Jones Dr. Michael R. Hyman, NMSU.

1 Non Deterministic Automata. 2 Alphabet = Nondeterministic Finite Accepter (NFA)

Introduction Embedded Universal Tools and Online Features 2.

úkol = A 77 B 72 C 67 D = A 77 B 72 C 67 D 79.

Schutzvermerk nach DIN 34 beachten 05/04/15 Seite 1 Training EPAM and CANopen Basic Solution: Password * * Level 1 Level 2 * Level 3 Password2 IP-Adr.

A vision-based system for grasping novel objects in cluttered environments Ashutosh Saxena, Lawson Wong, Morgan Quigley, Andrew Y. Ng 2007 Learning to.

4/15/2017 Using Gaussian Process Regression for Efficient Motion Planning in Environments with Deformable Objects Barbara Frank, Cyrill Stachniss, Nichola.

1 Last lecture  Configuration Space Free-Space and C-Space Obstacles Minkowski Sums.

Learning to grasp objects with multiple contact points Quoc V. Le, David Kamm, Arda Kara, Andrew Y. Ng.

Constraints-based Motion Planning for an Automatic, Flexible Laser Scanning Robotized Platform Th. Borangiu, A. Dogar, A. Dumitrache University Politehnica.

Learning Preferences on Trajectories via Iterative Improvement

Presentation transcript:

Beyond Geometric Path Planning: When Context Matters Ashesh Jain, Shikhar Sharma Thorsten Joachims and Ashutosh Saxena

Outline Motivation Approach – Context-based score – Feedback mechanism – Learning algorithm Results Jain, Sharma, Joachims, Saxena

Structured To Unstructured Environments [Images from Google] Kuka arm Kiva Beam Baxter PR2 Robot Nurse Jain, Sharma, Joachims, Saxena

Path Planning High DoF manipulators Continuous high dimensional space Obstacles BASE 7 DoF arm Joint Link End-Effector A B Geometric criteria's Collision free Shortest path Least time Minimum energy Kavraki et. al. PRM LaValle et. al. RRT Ratliff et. al. CHOMP Karaman et. al. RRT* Schulman et. al. TrajOpt Jain, Sharma, Joachims, Saxena

Context Rich Environment Jain, Sharma, Joachims, Saxena

Context Rich Environment Jain, Sharma, Joachims, Saxena Video [14 sec to 18 sec]

What went wrong? Robot not modeling the context Does not understand the preferences Jain, Sharma, Joachims, Saxena

Does Existing Works Address This? Inverse Reinforcement Learning (Kober and Peters 2011, Abbeel et. al. 2010, Ziebrat et. al. 2008, Ratliff et. al. 2006) Context is not important, focuses on specific trajectory – Modeling human navigation patterns Kitani et. al. ECCV 2012 Optimal Demonstrations: Requires an expert Abbeel et. al.Ratliff et. al.Kober et. al. Jain, Sharma, Joachims, Saxena

Our Goal Model Context Generate multiple trajectories for a task User preferences Learn from non-experts Jain, Sharma, Joachims, Saxena

Outline Motivation Approach – Context-based score – Feedback mechanism – Learning algorithm Results Jain, Sharma, Joachims, Saxena

Learning Setting UserRobot 1.Online learning system 2.Learns from user feedback 3.Sub-optimal feedback Goal: Learn user preferences Jain, Sharma, Joachims, Saxena

Outline Motivation Approach – Context-based score – Feedback mechanism – Learning algorithm Results Jain, Sharma, Joachims, Saxena

Example of Preferences Move a glass of water Upright Context Contorted Arm Preferences varies with users, tasks and environments Jain, Sharma, Joachims, Saxena

Score function Robot configuration and Environment Interactions Context Trajectory Jain, Sharma, Joachims, Saxena Object-object Interactions Connecting waypoints to neighboring objects Trajectory graph

Score function Jain, Sharma, Joachims, Saxena Trajectory graph Object attributes: {electronic, fragile, sharp, liquid, hot, …} E.g.Laptop: {electronic, fragile} Knife: {sharp} ….. Hermans et. al. ICRA w/s 2011 Koppula et. al. NIPS 2011 Distance features Object-object Interactions

Score function Object-object Interactions Robot configuration and Environment Interactions Bad Good Jain, Sharma, Joachims, Saxena Features 1.Spectrogram 2.Objects distance from horizontal and vertical surfaces 3.Objects angle with vertical axis 4.Robots wrist and elbow configuration in cylindrical co-ordinate Cakmak et. al. IROS 2011

Outline Motivation Approach – Context-based score – Feedback mechanism – Learning algorithm Results Jain, Sharma, Joachims, Saxena

User Feedback Intuitive feedback mechanisms Re-rank Interactive Zero-G Jain, Sharma, Joachims, Saxena

1. Re-rank Robot ranks trajectories and user selects one Top three trajectories User feedback User observing top three trajectories Jain, Sharma, Joachims, Saxena

2. Zero-G User corrects trajectory waypoints Bad waypoint in redHolding wrist activates zero-G mode Jain, Sharma, Joachims, Saxena

Bad waypoint in redHolding wrist activates zero-G mode 2. Zero-G User corrects trajectory waypoints Jain, Sharma, Joachims, Saxena

3. Interactive Not all robots support zero-G feedback Jain, Sharma, Joachims, Saxena

3. Interactive Not all robots support zero-G feedback Jain, Sharma, Joachims, Saxena

Outline Motivation Approach – Context-based score – Feedback mechanism – Learning algorithm Results Jain, Sharma, Joachims, Saxena

Coactive Learning UserRobot Goal: Learn user preferences Shivaswamy & Joachims, ICML 2012 Learn from sub- optimal feedback Jain, Sharma, Joachims, Saxena

Trajectory Preference Perceptron Regret bound Shivaswamy & Joachims, ICML 2012 Jain, Sharma, Joachims, Saxena

Outline Motivation Approach – Context-based score – Feedback mechanism – Learning algorithm Results Jain, Sharma, Joachims, Saxena

Experimental Setup Two robots: Baxter and PR2 35 tasks in household setting – 2100 expert labeled trajectories 16 tasks in grocery store checkout settings – 1300 expert labeled trajectories 14 objects – Bowl, Knife, Laptop, Metal box, Fruits, Egg cartons etc. 7 users Jain, Sharma, Joachims, Saxena

Experimental Setting 1 Household environment on PR2 Pouring Cleaning the table Setting up table 35 tasks Variation in objects and environment Experts label on 2100 trajectories on a scale of 1 to 5 Jain, Sharma, Joachims, Saxena

Experimental Setting 2 Grocery store checkout on Baxter Cereal box Egg carton Knife in human vicinity 16 tasks Variations in objects and their placement Experts label on 1300 trajectories on a scale of 1 to 5 Jain, Sharma, Joachims, Saxena

Generalization #Feedback Ours w/o pre-training Ours pre-trained SVM-rank MMP-online Household setting Testing on a new environment Higher nDCG w/o feedback SVM-rank trained on experts labels MMP-online is an IRL technique Jain, Sharma, Joachims, Saxena

User Study 10 tasks per user – 7 users – Total 7 hours worth robot interaction – Users interacts until satisfied Jain, Sharma, Joachims, Saxena

User Study Task No. Time (min) #Feedback Increasing difficulty Grocery setting Baxter Re-rank popular for easier tasks Increase in zero- G for hard tasks #Feedback Time Re-rank Zero-G Jain, Sharma, Joachims, Saxena

User# Re-rank# Zero-G Time (min) Self Score Cross Score User Study 3.2 (1.1)2.1 (0.6)5.5 (1.3)3.8 (0.5)3.6 (0.3) Avg. Jain, Sharma, Joachims, Saxena

User# Re-rank# Zero-G Time (min) Self Score Cross Score User Study 3.2 (1.1)2.1 (0.6)5.5 (1.3)3.8 (0.5)3.6 (0.3) Avg. 5 Feedback 3 Re-rank 2 Zero-G Jain, Sharma, Joachims, Saxena

User# Re-rank# Zero-G Time (min) Self Score Cross Score User Study 3.2 (1.1)2.1 (0.6)5.5 (1.3)3.8 (0.5)3.6 (0.3) Avg. 5 to 6 min. per task Jain, Sharma, Joachims, Saxena

User# Re-rank# Zero-G Time (min) Self Score Cross Score User Study 3.2 (1.1)2.1 (0.6)5.5 (1.3)3.8 (0.5)3.6 (0.3) Avg. Similar preferences Jain, Sharma, Joachims, Saxena

Robot Demonstration Jain, Sharma, Joachims, Saxena Video [full video]

Conclusion Challenges of Unstructured Environment Geometric approaches are not enough Modeling context is crucial Learning from users and not experts Jain, Sharma, Joachims, Saxena

Thank You For more details visit