Practical Reinforcement Learning in Continuous Space William D. Smart Brown University Leslie Pack Kaelbling MIT Presented by: David LeRoux.


1 Practical Reinforcement Learning in Continuous Space William D. Smart Brown University Leslie Pack Kaelbling MIT Presented by: David LeRoux

2 Goals of Paper
- Practical RL approach
- Handles continuous state and action spaces
- Safely approximates value function
- On-line learning bootstrapped with human-provided data

3 Approaches to Continuous State or Action Space
- Discretize
  - If too coarse, problem with hidden states
  - If too fine, cannot generalize
  - Curse of dimensionality
- Function Approximators
  - Use to estimate the Value Function
  - Errors tend to propagate
  - Tendency to over-estimate (hidden extrapolation)
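As a rough illustration of the curse of dimensionality under discretization, the short sketch below counts the entries a tabular Q-function would need. The function name `table_size` and the bin counts are hypothetical and chosen only for this example.

```python
def table_size(bins_per_dim, state_dims, action_dims=0):
    """Number of cells in a uniform grid over the joint state-action space.

    Assumes every dimension is split into the same number of bins.
    """
    return bins_per_dim ** (state_dims + action_dims)


# Two state dimensions at 10 bins each are manageable; six state dimensions
# plus two action dimensions at the same resolution are not.
small = table_size(10, 2)
large = table_size(10, 6, 2)
```

The count grows exponentially in the number of dimensions, so a grid fine enough to avoid hidden states quickly becomes too large to fill with experience.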

4 Proposed Approach - Hedger
- Instance-Based Approach
- To predict Q(s,a):
  - Find neighborhood of (s,a) in corpus
  - Calculate kernel weights for neighbors
  - Do locally weighted regression (LWR) to estimate Q(s,a)
  - If there are not enough points in the neighborhood, or (s,a) is not within the Independent Variable Hull, return a conservative default value for Q(s,a)
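The prediction steps on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `predict_q` and `solve_linear`, the Gaussian kernel, the fixed `radius`, and all default parameters are assumptions. In particular, the Independent Variable Hull test is replaced here by a simpler axis-aligned bounding-box check over the neighbors, which is only a stand-in for the hull test named in the slides.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def predict_q(corpus, query, radius=1.5, min_points=4, bandwidth=1.0,
              default_q=0.0, ridge=1e-9):
    """Estimate Q at `query`, a tuple of state-action features.

    `corpus` is a list of (features, q_value) pairs.
    """
    # 1. Neighborhood: stored points within `radius` of the query.
    neighbors = [(x, q) for x, q in corpus if math.dist(x, query) <= radius]
    if len(neighbors) < min_points:
        return default_q  # too little data: answer conservatively

    # 2. Extrapolation guard (bounding box, a stand-in for the hull test).
    for d, value in enumerate(query):
        column = [x[d] for x, _ in neighbors]
        if not min(column) <= value <= max(column):
            return default_q

    # 3. Gaussian kernel weights (assumed kernel shape).
    weights = [math.exp(-(math.dist(x, query) / bandwidth) ** 2)
               for x, _ in neighbors]

    # 4. Locally weighted linear regression via weighted normal equations.
    rows = [(1.0,) + tuple(x) for x, _ in neighbors]
    p = len(rows[0])
    a = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
          + (ridge if i == j else 0.0)  # tiny ridge keeps the solve stable
          for j in range(p)] for i in range(p)]
    b = [sum(w * r[i] * q for w, r, (_, q) in zip(weights, rows, neighbors))
         for i in range(p)]
    beta = solve_linear(a, b)
    return beta[0] + sum(bi * xi for bi, xi in zip(beta[1:], query))
```

Returning `default_q` whenever the query falls outside the data is what keeps the estimate from extrapolating; a low default makes unexplored regions look unattractive rather than falsely promising.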

5 Hedger Training
- Given an observation (s,a,r,s'):
  - $q_{new} \leftarrow q_{old} + \alpha \, (r + \gamma \, q_{next} - q_{old})$, where
    $q_{old} \leftarrow Q_{predict}(s,a)$ and $q_{next} \leftarrow \max_{a'} Q_{predict}(s',a')$
    (α: learning rate, γ: discount factor)
  - Use this to update Q(s,a)
  - Use the updated value of Q(s,a) to update Q(s_i,a_i) in the neighborhood of (s,a)
- May be used in batch or on-line mode
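A minimal sketch of this training step is given below, under several assumptions that are not in the slides: the predictor is a simple kernel-weighted average standing in for the LWR predictor, the maximization over a' runs over a finite list of candidate actions, and each neighbor is moved toward the new value in proportion to its kernel weight. The names `kernel`, `predict`, and `hedger_style_update` are hypothetical.

```python
import math


def kernel(x, y, bandwidth=1.0):
    """Gaussian kernel weight between two feature tuples (assumed shape)."""
    return math.exp(-(math.dist(x, y) / bandwidth) ** 2)


def predict(corpus, x, radius=1.0, default_q=0.0):
    """Stand-in predictor: kernel-weighted average of nearby stored values."""
    near = [(p, q) for p, q in corpus.items() if math.dist(p, x) <= radius]
    if not near:
        return default_q  # no data nearby: conservative default
    weights = [kernel(p, x) for p, _ in near]
    return sum(w * q for w, (_, q) in zip(weights, near)) / sum(weights)


def hedger_style_update(corpus, s, a, r, s_next, actions,
                        alpha=0.5, gamma=0.9, radius=1.0):
    """Apply one update for the observation (s, a, r, s_next).

    `corpus` maps feature tuples (state features followed by the action)
    to stored Q-values and is modified in place.
    """
    x = tuple(s) + (a,)
    q_old = predict(corpus, x, radius)
    # Assumption: the max over a' is taken over a finite candidate list.
    q_next = max(predict(corpus, tuple(s_next) + (a2,), radius)
                 for a2 in actions)
    q_new = q_old + alpha * (r + gamma * q_next - q_old)

    # Pull each stored neighbor toward q_new, scaled by its kernel weight
    # (assumed form of the neighborhood update).
    for p in list(corpus):
        if p != x and math.dist(p, x) <= radius:
            corpus[p] += alpha * kernel(p, x) * (q_new - corpus[p])

    corpus[x] = q_new
    return q_new
```

Because the function only needs a stored corpus and one observation at a time, the same step can be replayed over recorded trajectories (batch) or called after every real transition (on-line).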

6 Potential Problems using Instance-Based Reinforcement Learning
- Determining appropriate metric
- Obtaining training paths achieving rewards
- Keeping the size of the corpus manageable
- Finding neighbors efficiently
- See "Representations for Learning Control Policies" (Forbes & Andre)

