Reinforcement Learning

Hien Van Nguyen, University of Houston, 2/4/2019. Slides adapted from [1] [2].

Deep Q-learning
You don't know the transitions T(s,a,s').
You don't know the rewards R(s,a,s').
You choose the actions now.
The state space is large.
Goal: learn the optimal policy / values.
Idea: represent the Q-function by a deep network with weights w: $Q(s,a;w) \approx Q^*(s,a)$.
2/4/2019 Machine Learning
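As a rough illustration of representing the Q-function by a network, the sketch below maps a state vector to one Q-value per action with a single hidden layer. It assumes a small discrete action set; the names `init_q_network`, `q_values`, and `greedy_action`, the layer sizes, and the initialization range are hypothetical choices made for this example, not taken from the slides.

```python
import random


def init_q_network(state_dim, hidden_dim, num_actions, seed=0):
    """Create small random weights for a one-hidden-layer Q-network (illustrative sizes)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W1": matrix(hidden_dim, state_dim),
        "b1": [0.0] * hidden_dim,
        "W2": matrix(num_actions, hidden_dim),
        "b2": [0.0] * num_actions,
    }


def q_values(params, state):
    """Forward pass: return a list with one estimated Q(s, a; w) per action."""
    # Hidden layer with ReLU activation.
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, state)) + b)
        for row, b in zip(params["W1"], params["b1"])
    ]
    # Linear output layer, one unit per action.
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(params["W2"], params["b2"])
    ]


def greedy_action(params, state):
    """Pick the action index with the largest estimated Q-value."""
    q = q_values(params, state)
    return max(range(len(q)), key=q.__getitem__)
```

Because the network outputs all action values in one pass, choosing the greedy action only needs a single forward evaluation per state.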

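Deep Q-learning: represent the Q-function by a deep network
Define the objective function as the mean-squared error in Q-values, with network weights w and discount factor $\gamma$:
$L(w) = \mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s',a';w) - Q(s,a;w)\right)^2\right]$
Target: $r + \gamma \max_{a'} Q(s',a';w)$
Take the derivative, treating the target as a constant:
$\frac{\partial L(w)}{\partial w} = -2\,\mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s',a';w) - Q(s,a;w)\right) \frac{\partial Q(s,a;w)}{\partial w}\right]$
Train end-to-end via SGD.
Can use raw data to represent the state.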
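The SGD step on the squared Q-value error can be sketched on a single transition. To stay short and dependency-free, this example swaps the deep network for a linear approximator with one weight vector per action; `q_value`, `td_target`, `sgd_step`, and the hyperparameter values are hypothetical names and settings introduced here.

```python
def q_value(weights, features, action):
    """Linear stand-in for the network: Q(s, a; w) = w[a] . phi(s)."""
    return sum(w * x for w, x in zip(weights[action], features))


def td_target(weights, reward, next_features, done, gamma):
    """Target r + gamma * max_a' Q(s', a'; w); no bootstrap term on terminal steps."""
    if done:
        return reward
    best_next = max(q_value(weights, next_features, a) for a in range(len(weights)))
    return reward + gamma * best_next


def sgd_step(weights, transition, gamma=0.99, lr=0.1):
    """One SGD step on (target - Q(s, a; w))^2, holding the target constant."""
    features, action, reward, next_features, done = transition
    target = td_target(weights, reward, next_features, done, gamma)
    td_error = target - q_value(weights, features, action)
    # dQ/dw[action] = features, so descending the squared error moves w[action]
    # along td_error * features (the constant factor 2 is folded into lr).
    weights[action] = [w + lr * td_error * x for w, x in zip(weights[action], features)]
    return td_error ** 2
```

Calling `sgd_step` repeatedly on the same transition should shrink the returned squared error, which is a quick sanity check that the update has the right sign.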

Policy gradient for continuous actions
Challenge: the action space can be continuous, and maximizing the Q-function over a continuous action space is difficult.

Deterministic policy gradient
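For reference, the deterministic policy gradient is commonly stated as below. The notation is an assumption made for this note rather than taken from the slides: $\mu_\theta$ is a deterministic policy with parameters $\theta$, $\rho^{\mu}$ is the state distribution it induces, and $Q^{\mu}$ is its action-value function.

```latex
\nabla_\theta J(\mu_\theta)
  = \mathbb{E}_{s \sim \rho^{\mu}}
    \left[ \nabla_\theta \mu_\theta(s)\,
           \nabla_a Q^{\mu}(s,a) \big|_{a = \mu_\theta(s)} \right]
```

The policy parameters are moved in the direction that increases Q at the action the policy currently outputs, so no explicit maximization over the continuous action space is needed.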

Deterministic actor-critic

Deterministic actor-critic learning rule
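A minimal sketch of one deterministic actor-critic update, assuming a scalar state and action, a linear actor `theta * s`, and a critic that is linear in the hand-picked features `[1, s, a, s*a, a*a]`. These modelling choices, the function names, and the step sizes are all hypothetical and chosen only to keep the gradients easy to read.

```python
def features(s, a):
    """Hand-picked critic features for scalar state s and scalar action a."""
    return [1.0, s, a, s * a, a * a]


def critic(w, s, a):
    """Critic Q_w(s, a), linear in the features above."""
    return sum(wi * fi for wi, fi in zip(w, features(s, a)))


def critic_grad_action(w, s, a):
    """Derivative of Q_w(s, a) with respect to the action a."""
    return w[2] + w[3] * s + 2.0 * w[4] * a


def actor(theta, s):
    """Deterministic policy mu_theta(s) = theta * s."""
    return theta * s


def actor_critic_step(theta, w, transition, gamma=0.9, lr_actor=0.01, lr_critic=0.05):
    """Apply one critic (TD) update and one actor (policy gradient) update."""
    s, a, r, s_next = transition
    # TD error, bootstrapping with the action the actor would take in s_next.
    td_error = r + gamma * critic(w, s_next, actor(theta, s_next)) - critic(w, s, a)
    # Critic: w <- w + lr * td_error * dQ/dw, where dQ/dw = features(s, a).
    new_w = [wi + lr_critic * td_error * fi for wi, fi in zip(w, features(s, a))]
    # Actor: theta <- theta + lr * dmu/dtheta * dQ/da at a = mu_theta(s); dmu/dtheta = s.
    new_theta = theta + lr_actor * s * critic_grad_action(w, s, actor(theta, s))
    return new_theta, new_w, td_error
```

The actor step here uses the critic weights from before the critic update; using the updated weights instead is an equally reasonable variant.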

Stability issue with Deep RL

Strategies for improving stability

Experience replay
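Experience replay stores past transitions and trains on random minibatches drawn from them, which breaks the correlation between consecutive samples. One possible sketch is below; the class name `ReplayBuffer`, its methods, and the capacity handling are assumptions made for this example.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest transition when full.
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        """Append one transition tuple to the buffer."""
        self._storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw batch_size distinct transitions uniformly at random."""
        return self._rng.sample(list(self._storage), batch_size)

    def __len__(self):
        return len(self._storage)
```

In a training loop, each environment step would call `add`, and each learning step would call `sample` once the buffer holds at least `batch_size` transitions.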

Fixed target Q-network
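With a fixed target Q-network, targets are computed from a frozen copy of the weights that is refreshed only every few steps, so the regression target does not shift after every update. The sketch below again uses a linear approximator in place of a deep network; `train_with_fixed_target`, `sync_every`, and the default hyperparameters are hypothetical.

```python
import copy


def q_value(weights, features, action):
    """Linear stand-in for the network: Q(s, a; w) = w[a] . phi(s)."""
    return sum(w * x for w, x in zip(weights[action], features))


def train_with_fixed_target(weights, transitions, gamma=0.99, lr=0.1, sync_every=2):
    """Run Q-learning updates whose targets come from a periodically synced weight copy."""
    target_weights = copy.deepcopy(weights)  # frozen copy used only for targets
    for step, (features, action, reward, next_features, done) in enumerate(transitions, start=1):
        if done:
            bootstrap = 0.0
        else:
            # The max over next actions uses the frozen weights, not the online ones.
            bootstrap = gamma * max(
                q_value(target_weights, next_features, a)
                for a in range(len(target_weights))
            )
        td_error = reward + bootstrap - q_value(weights, features, action)
        weights[action] = [w + lr * td_error * x for w, x in zip(weights[action], features)]
        if step % sync_every == 0:
            # Refresh the frozen copy from the online weights.
            target_weights = copy.deepcopy(weights)
    return weights, target_weights
```

A larger `sync_every` keeps the targets stable for longer at the cost of bootstrapping from staler value estimates.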

How much does DQN help?

Thank you for taking my class!
