Reinforcement Learning

Presentation transcript:

Reinforcement Learning
Hien Van Nguyen, University of Houston, 2/4/2019
Slides adapted from:
[1] https://edge.edx.org/courses/course-v1:BerkeleyX+CS188x-SP16+SP16/20021a0a32d14a31b087db8d4bb582fd/
[2] http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf

Deep Q-learning
  You don't know the transitions T(s,a,s')
  You don't know the rewards R(s,a,s')
  You choose the actions now
  State space is large
  Goal: learn the optimal policy / values
  Idea: represent the Q-function by a deep network
2/4/2019 Machine Learning

Deep Q-learning
  Represent the Q-function by a deep network
  Define the objective function as the mean-squared error between the predicted Q-value and its target
  Take the derivative of the objective with respect to the network weights
  Train end-to-end via SGD
  Can use raw data to represent the state
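
The update described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: a linear Q-function Q(s, a; w) = w[a] . s stands in for the deep network, the loss is scaled by 1/2 so the gradient has no factor of 2, and the names q_values and dqn_sgd_step are hypothetical, not taken from the slides.

```python
import numpy as np

def q_values(w, s):
    """Return Q(s, a; w) for every action a. w has shape (n_actions, n_features)."""
    return w @ s

def dqn_sgd_step(w, s, a, r, s_next, done, gamma=0.99, lr=0.1):
    """One SGD step on 0.5 * (target - Q(s, a; w))**2 for a linear Q-function."""
    # Target: r + gamma * max_a' Q(s', a'; w), treated as a constant
    # when differentiating (no gradient flows through it).
    target = r if done else r + gamma * np.max(q_values(w, s_next))
    td_error = target - q_values(w, s)[a]
    w_new = w.copy()
    # For the linear model, the gradient of Q(s, a; w) w.r.t. row a of w is s.
    w_new[a] += lr * td_error * s
    return w_new, td_error

# Usage: a toy transition with 2 actions and 3 state features.
w = np.zeros((2, 3))
s = np.array([1.0, 0.0, 0.0])
s_next = np.array([0.0, 1.0, 0.0])
w, td = dqn_sgd_step(w, s, a=1, r=1.0, s_next=s_next, done=False)
```

With a real deep network the gradient of Q with respect to w comes from backpropagation, but the shape of the update is the same: move the weights along the TD error times the gradient of the predicted Q-value.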

Policy gradient for continuous actions
  Challenge: the action space can be continuous, and maximizing the Q-function over a continuous action space is difficult.

Deterministic policy gradient

Deterministic actor-critic

Deterministic actor-critic learning rule
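
One common way to write a deterministic actor-critic learning rule is given below. It is an assumed, textbook-style form rather than a transcription of this slide: the critic Q(s, a; w) with weights w, the deterministic actor a = \pi(s; u) with weights u, and the step sizes \alpha_w and \alpha_u are all notation introduced here for illustration.

```latex
\begin{align*}
\delta &= r + \gamma\, Q\bigl(s', \pi(s'; u); w\bigr) - Q(s, a; w)
  && \text{(TD error of the critic)} \\
w &\leftarrow w + \alpha_w\, \delta\, \nabla_w Q(s, a; w)
  && \text{(critic update)} \\
u &\leftarrow u + \alpha_u\, \nabla_u \pi(s; u)\,
  \nabla_a Q(s, a; w)\big|_{a = \pi(s; u)}
  && \text{(actor update)}
\end{align*}
```

The actor step follows the gradient of the critic with respect to the action, which avoids an explicit maximization of the Q-function over a continuous action space.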

Stability issue with Deep RL

Strategies for improving stability

Experience replay
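
Experience replay is usually implemented as a fixed-size buffer of past transitions from which training minibatches are drawn uniformly at random, which breaks the correlation between consecutive samples. A minimal sketch follows, assuming transitions are stored as (s, a, r, s_next, done) tuples; the class name ReplayBuffer and its methods are hypothetical and not taken from the slides.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of transitions with uniform random sampling."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once it is full.
        self.buffer = deque(maxlen=capacity)

    def add(self, s, a, r, s_next, done):
        self.buffer.append((s, a, r, s_next, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement from the stored transitions.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

# Usage: store five transitions in a buffer that holds three.
buf = ReplayBuffer(capacity=3)
for t in range(5):
    buf.add(s=t, a=0, r=1.0, s_next=t + 1, done=False)
batch = buf.sample(2)
```

Each training step then draws a minibatch from the buffer instead of using only the most recent transition.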

Fixed target Q-network
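
A fixed target Q-network keeps a second copy of the Q-network weights that is used only to compute targets and is refreshed from the online weights at intervals, so the target does not shift after every update. The sketch below illustrates this under assumptions: a linear Q-function stands in for the deep network, and td_target, maybe_sync, and sync_every are hypothetical names chosen for this example.

```python
import numpy as np

def td_target(w_target, r, s_next, done, gamma=0.99):
    """Compute r + gamma * max_a' Q(s', a'; w_target) from the frozen weights."""
    return r if done else r + gamma * np.max(w_target @ s_next)

def maybe_sync(w_online, w_target, step, sync_every=1000):
    """Return a fresh copy of the online weights every sync_every steps."""
    if step % sync_every == 0:
        return w_online.copy()
    # Between syncs the target weights stay frozen.
    return w_target

# Usage: the target weights change only on sync steps.
w_online = np.ones((2, 2))
w_target = np.zeros((2, 2))
w_target = maybe_sync(w_online, w_target, step=5, sync_every=10)   # unchanged
w_target = maybe_sync(w_online, w_target, step=10, sync_every=10)  # refreshed
```

The online weights are still updated by SGD at every step; only the weights used inside the target are held fixed between syncs.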

How much does DQN help?

Thank you for taking my class!