Petar Kormushev, Sylvain Calinon and Darwin G. Caldwell

1 Reinforcement Learning in Robotics: Applications and Real-World Challenges
Petar Kormushev, Sylvain Calinon and Darwin G. Caldwell (2013)
Presenter: Wei Zhang

2 Outline
What is Reinforcement Learning (RL)?
Applications using RL
RL in robotics: the archery aiming task
Future of RL

3 What is Reinforcement Learning (RL)?
Learning by trial and error, not supervised learning. There is an agent interacting with an environment. Three key components: rewards, a policy function, and a value function.

4 Agent and Environment Figure from the book Reinforcement Learning: An Introduction by Sutton and Barto

5 Rewards A reward function maps a perceived state of the environment to a single number, a reward, indicating the desirability of that state. The rewards can be summed to form the return: G_t = R_{t+1} + R_{t+2} + R_{t+3} + ... + R_T. The return is used to define the value function on a later slide.
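The return formula above can be sketched in a few lines of Python; the reward sequence below is a made-up toy episode, and the optional `gamma` generalizes the slide's undiscounted sum to the discounted case:

```python
# Computing the return G_t = R_{t+1} + R_{t+2} + ... + R_T from a reward
# sequence; gamma=1.0 reproduces the undiscounted sum on the slide.
def compute_return(rewards, t=0, gamma=1.0):
    """Sum the rewards from step t+1 to T, optionally discounted by gamma."""
    g = 0.0
    for k, r in enumerate(rewards[t:]):
        g += (gamma ** k) * r
    return g

rewards = [0.0, 0.0, 1.0, 2.0]     # R_1 .. R_4 of a toy episode
print(compute_return(rewards))      # undiscounted: 3.0
```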

6 Value Function For a Markov Decision Process (MDP), the state-value function is the expected return when starting in state s and following policy π: v_π(s) = E_π[G_t | S_t = s]. The action-value function is defined analogously for taking action a in state s: q_π(s, a) = E_π[G_t | S_t = s, A_t = a].
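A minimal sketch of how v_π can be computed in practice: iterate the Bellman expectation equation over a tiny MDP until the values settle. The two-state MDP below (transitions, rewards, policy) is invented purely for illustration:

```python
# Iterative policy evaluation on a toy 2-state MDP:
# v(s) = sum_a pi(a|s) * sum_{s'} P(s'|s,a) * (r + gamma * v(s'))
# transitions[s][a] = list of (prob, next_state, reward); state 1 absorbs.
transitions = {
    0: {0: [(1.0, 1, 1.0)]},   # state 0, one action -> state 1, reward 1
    1: {0: [(1.0, 1, 0.0)]},   # state 1 loops on itself with reward 0
}
policy = {0: {0: 1.0}, 1: {0: 1.0}}   # deterministic policy
gamma = 0.9

v = {s: 0.0 for s in transitions}
for _ in range(100):                   # sweep until (approximately) converged
    for s in transitions:
        v[s] = sum(policy[s][a] * sum(p * (r + gamma * v[s2])
                                      for p, s2, r in outs)
                   for a, outs in transitions[s].items())

print(round(v[0], 3))  # 1.0: immediate reward of 1, then 0 forever
```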

7 Policy Function A policy maps perceived states of the environment to the actions the agent takes in those states. A greedy policy always takes the action that maximizes the reward in the long run. An exploring policy tries new actions without relying on prior knowledge, which may yield lower rewards in the short term but can discover better actions.
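The greedy-versus-exploring trade-off on this slide is often implemented as an epsilon-greedy policy: with probability epsilon pick a random action, otherwise the best-valued one. A minimal sketch (the action values are made up):

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """q_values: estimated value per action; returns a chosen action index."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                      # explore
    return max(range(len(q_values)), key=q_values.__getitem__)   # exploit

q = [0.1, 0.5, 0.3]
print(epsilon_greedy(q, epsilon=0.0))  # epsilon=0 is purely greedy -> 1
```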

8 Applications of RL
Games: Atari, AlphaGo
Pancake flipping task
Bipedal walking energy minimization task
Archery aiming task

9 Archery Aiming Task First of the two algorithms used:
An Expectation-Maximization (EM) based RL algorithm called Policy learning by Weighting Exploration with the Returns (PoWER). The reward is a scalar function of the distance between the arrow's hit point and the target center. Drawback: the reward is one-dimensional, so learning takes longer.
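The core PoWER idea can be sketched in one dimension: perturb the policy parameters, then take a reward-weighted average of the explored perturbations (an EM-style update). The toy reward below, peaking at a hidden target parameter, is invented for illustration and is not the paper's actual reward:

```python
import math
import random

random.seed(0)
target = 2.0                                  # hypothetical optimal parameter
reward = lambda th: math.exp(-(th - target) ** 2)

theta = 0.0
for _ in range(50):
    # sample exploratory parameters around the current estimate
    samples = [theta + random.gauss(0.0, 0.5) for _ in range(10)]
    returns = [reward(s) for s in samples]
    # PoWER-style update: reward-weighted mean of the exploration offsets
    theta += sum(r * (s - theta) for s, r in zip(samples, returns)) / sum(returns)

print(theta)  # converges near the hidden target 2.0
```

Because the update only sees the scalar return, it must infer the direction of improvement statistically from many rollouts, which is the "single dimension, slower learning" drawback noted above.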

10 Archery Aiming Task Second algorithm used:
Augmented Reward Chained Regression (ARCHER). Instead of a scalar reward alone, ARCHER uses the 2-D position of the arrow relative to the target as an augmented reward r, together with the 3-D relative position of the hands Θ. From the rollouts, the returns r_{1..T} are computed, weights w are learned by regression, and the weights are applied to update the policy parameters.
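The advantage of a vector-valued reward can be sketched as follows: because the 2-D hit position tells the learner which direction to correct, a regression from parameter changes to hit-position changes can solve directly for the correction that lands the arrow on the target center. This is a drastically simplified, finite-difference stand-in for ARCHER's chained regression, and the hidden linear shot model is invented for illustration:

```python
def simulate_hit(theta, true_offset=(1.5, -0.8)):
    """Hypothetical shot model: hit position = theta minus a hidden offset."""
    return (theta[0] - true_offset[0], theta[1] - true_offset[1])

theta = [0.0, 0.0]
for _ in range(5):
    hit = simulate_hit(theta)
    # secant-style regression per dimension (the toy model is diagonal):
    # probe how each parameter moves the hit, then correct so the
    # predicted hit lands on the target center (0, 0)
    for d in range(2):
        probe = list(theta)
        probe[d] += 0.1
        slope = (simulate_hit(probe)[d] - hit[d]) / 0.1
        theta[d] -= hit[d] / slope

print(simulate_hit(theta))  # hit lands (numerically) at the target center
```

Each shot gives directional information, so far fewer rollouts are needed than with PoWER's scalar reward.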

11-13 Archery Aiming Task (figures)

14 Bipedal Energy Minimization

15 Pancake Flipping Task

16 Future of RL
Not only in games, but also in medical fields
Multi-task learning
Robotics

17 Thank you!

