Published by Amanda Hall. Modified over 9 years ago.
1
Laboratory for Perceptual Robotics, University of Massachusetts Amherst, Department of Computer Science
Learning Prospective Robot Behavior
Shichao Ou and Roderic Grupen
Laboratory for Perceptual Robotics
University of Massachusetts Amherst
2
A Developmental Approach
Infant Learning
–In stages
  Maturation processes
–Parents provide constrained learning contexts
  Protect
  Easy → Complex
–Motion mobile for newborns
–Use brightly colored, easy to pick up objects
–Use building blocks
–Association of words and objects
3
Application in Robotics
Framework for Robot Developmental Learning
–Role of teacher: set up learning contexts that make the target concept conspicuous
–Role of robot: acquire concepts, generalize to new contexts by autonomous exploration, provide feedback
Control Basis
–Robot actions are created using combinations of
–Establish stages of learning by time-varying constraints on resources
  Easy → Complex
4
Example
Learning to Reach for Objects
–Stage 1: SearchTrack
  Focus attention using a single brightly colored object (σ)
  Limit DOF (τ) to use head ONLY
–Stage 2: ReachGrab
  Limit DOF (τ) to use one arm ONLY
–Stage 3: Handedness, Scale-Sensitive
Hart et al., 2008
5
Prospective Learning
Infants adapt to new situations by prospectively looking ahead, predicting failure, and then learning a repair strategy
6
Robot Prospective Learning with Human Guidance
[Figure: three copies of a state-action chain with states S0, S1, Si, Sj, Sn and actions a0, a1, a(i-1), ai, a(j-1), aj, a(n-1). The second copy adds a sub-task with states Si1, Sij, Sin; the third carries the labels g(f)=1 and g(f)=0. Label: Challenge]
7
A 2D Navigation Domain Problem
–30x30 map
–6 doors, randomly closed
–6 buttons
–1 start and 1 goal
–3-bit door sensor on robot
8
Flat Learning Results
Flat Q-Learning
–5-component state (x, y, door-bit1, door-bit2, door-bit3)
–4 actions: up, down, left, right
–Reward: 1 for reaching the goal, -0.01 for every step taken
–Learning parameters: α=0.1, γ=1.0, ε=0.1
Learned solutions after 30,000 episodes
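The flat learner described on this slide can be sketched as tabular Q-learning with an ε-greedy policy, using the slide's parameters (α=0.1, γ=1.0, ε=0.1) and rewards. This is a minimal illustration, not the authors' implementation: the small door-free grid, the function names `step`, `q_learning`, and `greedy_path`, the step cap, and the choice to pay exactly 1 on the goal step are all assumptions made to keep the example short.

```python
import random
from collections import defaultdict

# The four actions from the slide, as (dx, dy) offsets.
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def step(state, action, size, goal):
    """Hypothetical door-free grid: move one cell, clamped to the map."""
    x, y = state
    dx, dy = ACTIONS[action]
    nxt = (min(max(x + dx, 0), size - 1), min(max(y + dy, 0), size - 1))
    if nxt == goal:
        return nxt, 1.0, True    # reward 1 for reaching the goal (assumed exact)
    return nxt, -0.01, False     # -0.01 for every other step


def q_learning(size=5, start=(0, 0), goal=(4, 4), episodes=500,
               alpha=0.1, gamma=1.0, epsilon=0.1, max_steps=200, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)       # (state, action) -> estimated return
    names = list(ACTIONS)
    for _ in range(episodes):
        state = start
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.choice(names)            # explore
            else:
                best = max(q[(state, a)] for a in names)
                action = rng.choice(
                    [a for a in names if q[(state, a)] == best])  # exploit
            nxt, reward, done = step(state, action, size, goal)
            future = 0.0 if done else max(q[(nxt, a)] for a in names)
            q[(state, action)] += alpha * (reward + gamma * future
                                           - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, size=5, start=(0, 0), goal=(4, 4), max_steps=50):
    """Follow the highest-valued action from each state, up to max_steps."""
    state, path = start, [start]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, done = step(state, action, size, goal)
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(q_learning())` returns the cells visited by the greedy policy after training. Adding the three door bits to the state key would turn this into the flat formulation on the slide, at the cost of a much larger table.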
9
Prospective Learning
Stage 1
–All doors open
–Constrain resources to use only (x,y) sensors
–Allow the agent to learn a policy from start to goal
[Figure: state chain S0, S1, Si, Sj, Sn with action labels Right, Down, Right, Up, Right]
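A short worked example of what the Stage 1 resource constraint buys, assuming each of the three door-sensor bits is binary: restricting the agent to (x, y) shrinks the state space from the flat learner's (x, y, door-bit1, door-bit2, door-bit3) to position alone. The names `FullState` and `stage1_view` are hypothetical and used only for illustration.

```python
from typing import NamedTuple, Tuple


class FullState(NamedTuple):
    """Everything the robot could sense in the navigation domain."""
    x: int
    y: int
    door_bits: Tuple[int, int, int]


def stage1_view(state: FullState) -> Tuple[int, int]:
    """Stage 1 exposes position only; the door bits are withheld."""
    return (state.x, state.y)


MAP_SIZE = 30
positions = MAP_SIZE * MAP_SIZE      # 30 * 30 = 900 states in Stage 1
flat_states = positions * 2 ** 3     # 900 * 8 = 7200 states for the flat learner
```

Under this assumption the Stage 1 learner works with an eighth of the states the flat learner has to cover.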
10
Prospective Learning
Stage 2
–Close 1 door
–Robot learns the cause of the failure
–Robot backtracks and finds an earlier indicator of this cause
11
Prospective Learning
Stage 2
–Close 1 door
–Robot learns the cause of the failure
–Robot backtracks and finds an earlier indicator of this cause
–Create a sub-task
–Learn a new policy for the sub-task
12
Prospective Learning
Stage 2
–Close 1 door
–Robot learns the cause of the failure
–Robot backtracks and finds an earlier indicator of this cause
–Create a sub-task
–Learn a new policy for the sub-task
–Resume original policy
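The Stage 2 steps above can be illustrated with a toy sketch. It is not the authors' algorithm. It assumes that runs of the Stage 1 policy are recorded as per-step tuples of door-sensor bits, that step indices line up across runs, and that the policy is a fixed action list. The names `find_earliest_indicator` and `run_with_repair` and the `press_button` action are hypothetical.

```python
def find_earliest_indicator(successes, failures):
    """Scan steps from the start and return (step, sensor, bad_values) for
    the first sensor reading that separates all failed runs from all
    successful ones, or None if no reading does."""
    horizon = min(len(run) for run in successes + failures)
    n_sensors = len(failures[0][0])
    for t in range(horizon):
        for k in range(n_sensors):
            bad = {run[t][k] for run in failures}
            good = {run[t][k] for run in successes}
            if bad.isdisjoint(good):
                return t, k, bad
    return None


def run_with_repair(policy, sub_policy, run, indicator):
    """Replay the original policy, splicing in the sub-task when the
    indicator predicts failure, then resuming the original actions."""
    t_ind, sensor, bad = indicator
    plan = []
    for t, action in enumerate(policy):
        if t == t_ind and run[t][sensor] in bad:
            plan.extend(sub_policy)   # repair first, e.g. open the door
        plan.append(action)           # then carry on with the old policy
    return plan


# Two successful and two failed runs; each step is a 3-bit sensor reading.
successes = [[(0, 0, 0), (0, 0, 0), (0, 0, 0)],
             [(0, 1, 0), (0, 0, 0), (0, 0, 1)]]
failures = [[(0, 0, 0), (1, 0, 0), (1, 0, 0)],
            [(0, 1, 0), (1, 0, 0), (1, 0, 1)]]

indicator = find_earliest_indicator(successes, failures)
policy = ["right", "down", "right"]
repaired = run_with_repair(policy, ["press_button"], failures[0], indicator)
unchanged = run_with_repair(policy, ["press_button"], successes[0], indicator)
```

Here sensor 0 at step 1 is the earliest reading that differs between failed and successful runs, so the sub-task is inserted at that step and the rest of the original policy is kept as is.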
13
Prospective Learning Results
Learned solutions in fewer than 2,000 episodes
14
Humanoid Robot Manipulation Domain
Benefits of Prospective Learning
–Adapts to new contexts while retaining most of the existing policy
–Automatically generates sub-goals
–Sub-tasks can be learned in a completely different state space
–Supports interactive learning
15
Conclusion
–A developmental view of robot learning
–A framework that enables interactive, incremental learning in stages
–An extension to the control basis learning framework using the idea of prospective learning