Reinforcement Learning

Overview: Tabular Methods, Approximate Methods, Deep Reinforcement Learning

Tabular Methods

Model: mathematical model of the environment's dynamics and reward
Policy: function mapping the agent's states to actions
Value function: expected future rewards from being in a state and/or taking an action when following a particular policy

Markov Reward Process
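As a minimal sketch of what evaluating a Markov Reward Process involves, the code below iterates the Bellman equation V = R + gamma * P V until it stops changing. The function name `evaluate_mrp`, the two-state chain, and the discount of 0.9 are illustrative assumptions, not taken from the slides.

```python
def evaluate_mrp(P, R, gamma, tol=1e-10, max_iters=100_000):
    """Iteratively solve V = R + gamma * P V for a finite MRP.

    P[s][t] is the probability of moving from state s to state t,
    R[s] is the expected immediate reward in state s.
    """
    n = len(R)
    V = [0.0] * n
    for _ in range(max_iters):
        V_new = [
            R[s] + gamma * sum(P[s][t] * V[t] for t in range(n))
            for s in range(n)
        ]
        if max(abs(a - b) for a, b in zip(V, V_new)) < tol:
            return V_new
        V = V_new
    return V


# Hypothetical two-state chain: state 0 pays reward 1 and moves to the
# absorbing, zero-reward state 1 with probability 0.5.
P = [[0.5, 0.5],
     [0.0, 1.0]]
R = [1.0, 0.0]
V = evaluate_mrp(P, R, gamma=0.9)
```

For this chain the fixed point can be checked by hand: V(1) = 0 and V(0) = 1 + 0.45 V(0), so V(0) = 1 / 0.55.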

MDP = MRP + Action

MDP + Policy
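One standard way to read "MDP + Policy" is that fixing a policy collapses the MDP back into an MRP. A sketch of the induced reward and transition model, assuming a stochastic policy written as \pi(a \mid s) and the usual R(s, a), P(s' \mid s, a) notation (the notation is an assumption, the slides do not show it):

```latex
R^{\pi}(s) = \sum_{a \in A} \pi(a \mid s)\, R(s, a),
\qquad
P^{\pi}(s' \mid s) = \sum_{a \in A} \pi(a \mid s)\, P(s' \mid s, a)
```

The pair (P^{\pi}, R^{\pi}) with the same discount factor is an MRP, so any MRP evaluation method can be used to compute V^{\pi}.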

Compare

How to Control?

Policy Search

State-Action Value Q
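For reference, the usual definition of the state-action value is sketched below: take action a in state s once, then follow the policy \pi afterwards. The symbols \gamma, R(s, a), and P(s' \mid s, a) are assumed standard notation, not copied from the slides.

```latex
Q^{\pi}(s, a) = R(s, a) + \gamma \sum_{s' \in S} P(s' \mid s, a)\, V^{\pi}(s')
```

Policy improvement then picks, in each state, an action that maximizes Q^{\pi}(s, a).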

Policy Iteration

In the worst case, policy iteration takes at most |A|^|S| iterations (the number of deterministic policies)
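A minimal sketch of tabular policy iteration, alternating iterative policy evaluation with greedy improvement. The function name, the nested-list layout `P[s][a][t]` and `R[s][a]`, and the two-state example MDP are assumptions made for illustration only.

```python
def policy_iteration(P, R, gamma, tol=1e-10):
    """Tabular policy iteration for a finite MDP.

    P[s][a][t]: probability of reaching state t after action a in state s.
    R[s][a]:    expected immediate reward for action a in state s.
    Returns a deterministic policy (list of action indices) and its values.
    """
    n_states, n_actions = len(R), len(R[0])

    def q(s, a, V):
        return R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))

    policy = [0] * n_states
    while True:
        # Policy evaluation: iterate the Bellman equation for the fixed policy.
        V = [0.0] * n_states
        while True:
            V_new = [q(s, policy[s], V) for s in range(n_states)]
            delta = max(abs(x - y) for x, y in zip(V, V_new))
            V = V_new
            if delta < tol:
                break
        # Policy improvement: switch only when an action is strictly better,
        # which avoids flip-flopping between tied actions.
        new_policy = list(policy)
        for s in range(n_states):
            for a in range(n_actions):
                if q(s, a, V) > q(s, new_policy[s], V) + 1e-9:
                    new_policy[s] = a
        if new_policy == policy:
            return policy, V
        policy = new_policy


# Hypothetical MDP: staying in state 1 (action 0) pays reward 1;
# action 1 switches state; everything else pays 0.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
R = [[0.0, 0.0],
     [1.0, 0.0]]
pi_policy, pi_values = policy_iteration(P, R, gamma=0.9)
```

On this example the loop settles after two improvement steps, far below the |A|^|S| = 4 worst-case bound.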

Value Iteration
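A minimal sketch of value iteration, which repeatedly applies the Bellman optimality backup V(s) = max_a [R(s, a) + gamma * sum_t P(t | s, a) V(t)] and reads off a greedy policy at the end. The function name, data layout, and example MDP are illustrative assumptions.

```python
def value_iteration(P, R, gamma, tol=1e-10):
    """Tabular value iteration for a finite MDP.

    P[s][a][t]: probability of reaching state t after action a in state s.
    R[s][a]:    expected immediate reward for action a in state s.
    Returns a greedy policy (list of action indices) and the value estimates.
    """
    n_states, n_actions = len(R), len(R[0])

    def q(s, a, V):
        return R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))

    V = [0.0] * n_states
    while True:
        # Bellman optimality backup applied to every state.
        V_new = [max(q(s, a, V) for a in range(n_actions)) for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V, V_new))
        V = V_new
        if delta < tol:
            break
    # Extract a policy that is greedy with respect to the final values.
    policy = [
        max(range(n_actions), key=lambda a: q(s, a, V))
        for s in range(n_states)
    ]
    return policy, V


# Hypothetical MDP: staying in state 1 (action 0) pays reward 1;
# action 1 switches state; everything else pays 0.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
R = [[0.0, 0.0],
     [1.0, 0.0]]
vi_policy, vi_values = value_iteration(P, R, gamma=0.9)
```

Unlike policy iteration, no explicit policy is kept during the loop; the policy is only recovered from the values once they have converged.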
