1
Hindsight Experience Replay (NIPS 2017, by OpenAI)
Ze Liu, University of Science and Technology of China
2
Background
Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL), especially for robotics.
Engineering a reward function is a common challenge: it requires both RL expertise and domain-specific knowledge, and it is not applicable in situations where we do not know what admissible behaviour may look like.
One ability humans have is that they can learn almost as much from achieving an undesired outcome as from the desired one.
3
Change the goal
Hindsight: understanding of an event only after it has happened (being wise after the event)
4
Changes: [two formulas shown on the slide; not recoverable from the text]
5
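Assume: Given a state s, we can easily find a goal g which is satisfied in this state. More formally, we assume that we are given a mapping $m : S \to G$ such that $f_{m(s)}(s) = 1$ for all $s \in S$, where $f_g(s) = 1$ means that goal g is satisfied in state s.
HER also needs a strategy for sampling goals for replay.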
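The assumption above can be illustrated with a small sketch. It assumes, purely for illustration, that states and goals are both 2D positions, so the mapping from a state to a goal satisfied in it is the identity. The names `goal_satisfied`, `state_to_goal`, `sparse_reward` and the tolerance value are hypothetical and do not come from the slides.

```python
import math

TOLERANCE = 0.05  # hypothetical success threshold, chosen for illustration


def goal_satisfied(state, goal, tol=TOLERANCE):
    """Goal predicate: True when the state lies within `tol` of the goal."""
    return math.dist(state, goal) <= tol


def state_to_goal(state):
    """Mapping from a state to a goal that is satisfied in that state.

    Assumption: goals are positions, so the state itself is such a goal.
    """
    return tuple(state)


def sparse_reward(next_state, goal):
    """Sparse binary reward: 0 if the goal is reached, -1 otherwise."""
    return 0.0 if goal_satisfied(next_state, goal) else -1.0


s = (0.3, 0.7)
g = state_to_goal(s)
reached_own_goal = goal_satisfied(s, g)           # goal derived from s holds in s
reward_far_goal = sparse_reward(s, (1.0, 1.0))    # far-away goal is not reached
```

With such a mapping, any state the agent actually reaches can be turned into a goal that was, in hindsight, achieved.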
6
Algorithm
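A minimal sketch of the hindsight relabeling step, under stated assumptions: states and goals are 2D positions, the reward is sparse (0 on success, -1 otherwise), and each episode is replayed once more with its final state as the goal. The `Transition` type, `reward_fn`, and `relabel_with_final_state` are hypothetical names for illustration. The off-policy learner that would consume the buffer is omitted.

```python
import math
from typing import List, NamedTuple, Tuple

State = Tuple[float, float]


class Transition(NamedTuple):
    state: State
    action: int
    reward: float
    next_state: State
    goal: State


def reward_fn(next_state: State, goal: State, tol: float = 0.05) -> float:
    """Sparse reward: 0 if next_state is within tol of the goal, else -1."""
    return 0.0 if math.dist(next_state, goal) <= tol else -1.0


def relabel_with_final_state(episode: List[Transition]) -> List[Transition]:
    """Copy the episode, substituting the finally reached state as the goal
    and recomputing every reward for that substituted goal."""
    hindsight_goal = episode[-1].next_state
    return [
        t._replace(goal=hindsight_goal,
                   reward=reward_fn(t.next_state, hindsight_goal))
        for t in episode
    ]


# A failed episode: the agent never gets near the original goal.
original_goal = (1.0, 1.0)
states = [(0.0, 0.0), (0.2, 0.0), (0.4, 0.0)]
episode = [
    Transition(states[i], 0, reward_fn(states[i + 1], original_goal),
               states[i + 1], original_goal)
    for i in range(len(states) - 1)
]

# Store both the original and the relabeled transitions.
replay_buffer = episode + relabel_with_final_state(episode)
```

In the original transitions every reward is -1. In the relabeled copy the last transition receives reward 0, so the failed episode still carries a learning signal.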
7
Experiments
11
Single goal
It is advisable to train on multiple goals even if we care only about one of them.
12
How many goals should we replay each trajectory with, and how should we choose them?
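Two of the goal-selection strategies described in the HER paper are `final` (use the last state of the episode) and `future` (use k states reached later in the same episode). A minimal sketch of both follows, assuming `achieved[i]` holds the state reached after step i. The function names are hypothetical, and sampling with replacement is a simplification made here.

```python
import random
from typing import List, Sequence, Tuple

State = Tuple[float, float]


def sample_final_goal(achieved: Sequence[State]) -> List[State]:
    """'final' strategy: one replay goal, the last state of the episode."""
    return [achieved[-1]]


def sample_future_goals(achieved: Sequence[State], t: int, k: int,
                        rng: random.Random) -> List[State]:
    """'future' strategy: k replay goals for the transition at step t,
    drawn (with replacement) from states reached at step t or later."""
    future = achieved[t:]
    return [rng.choice(future) for _ in range(k)]


achieved = [(0.1, 0.0), (0.2, 0.0), (0.3, 0.1)]
rng = random.Random(0)  # seeded only to make the sketch repeatable
goals_for_step_1 = sample_future_goals(achieved, t=1, k=4, rng=rng)
final_goal = sample_final_goal(achieved)
```

Here k controls the ratio of relabeled to original transitions in the replay buffer: each stored transition is accompanied by k hindsight copies.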
13
Deployment on a physical robot
14
THANK YOU