Reinforcement Learning
Hien Van Nguyen, University of Houston
2/4/2019

Slides adapted from:
[1] https://edge.edx.org/courses/course-v1:BerkeleyX+CS188x-SP16+SP16/20021a0a32d14a31b087db8d4bb582fd/
[2] http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf
Deep Q-learning
- You don’t know the transitions T(s,a,s’)
- You don’t know the rewards R(s,a,s’)
- You choose the actions as you go
- The state space is large
- Goal: learn the optimal policy / values
- Idea: represent the Q-function Q(s,a; θ) by a deep network
2/4/2019 Machine Learning
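Before the deep version, the model-free idea itself can be seen in a tiny tabular sketch: learn Q-values purely from sampled transitions, never touching T(s,a,s’) or R(s,a,s’). The chain MDP, ε-greedy schedule, and constants below are illustrative assumptions, not from the slides.

```python
import random

# Toy deterministic chain MDP (an assumption for illustration, not from the
# slides): states 0..3, actions 0 = left / 1 = right, reward 1 on reaching
# state 3. The agent never reads this model; it only samples from it.
def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(3, s + 1)
    return s2, (1.0 if s2 == 3 else 0.0)

Q = {(s, a): 0.0 for s in range(4) for a in range(2)}
alpha, gamma, eps = 0.5, 0.9, 0.1
random.seed(0)

for episode in range(200):
    s = 0
    while s != 3:
        # we choose the actions now: epsilon-greedy over current Q-values
        if random.random() < eps:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda b: Q[(s, b)])
        s2, r = step(s, a)
        # sample-based Q-learning update: no T(s,a,s') or R(s,a,s') needed
        target = r + gamma * max(Q[(s2, b)] for b in range(2))
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2
```

With a large state space this table becomes infeasible, which is exactly why the slides replace it with a deep network Q(s,a; θ).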
Deep Q-learning
- Represent the Q-function by a deep network Q(s,a; θ)
- Define the objective function as the mean-squared error in Q-values:
    L(θ) = E[ ( r + γ max_{a’} Q(s’,a’; θ) − Q(s,a; θ) )² ]
  where the target is r + γ max_{a’} Q(s’,a’; θ)
- Take the derivative (treating the target as a fixed constant):
    ∂L(θ)/∂θ = −2 E[ ( r + γ max_{a’} Q(s’,a’; θ) − Q(s,a; θ) ) ∂Q(s,a; θ)/∂θ ]
- Train end-to-end via SGD
- Can use raw data (e.g., pixels) to represent the state
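One SGD step on this objective can be sketched with a linear function approximator standing in for the deep network (an assumption made so that ∂Q/∂θ is simply the feature vector; the feature encoding and constants are hypothetical):

```python
import numpy as np

# Linear stand-in for the deep Q-network: Q(s, a; theta) = phi(s, a) . theta,
# so grad_theta Q(s, a; theta) = phi(s, a).
n_features, n_actions = 4, 2
theta = np.zeros(n_features)

def features(s, a):
    # hypothetical one-hot encoding, purely for illustration
    phi = np.zeros(n_features)
    phi[(2 * s + a) % n_features] = 1.0
    return phi

def q(s, a, th):
    return features(s, a) @ th

alpha, gamma = 0.1, 0.9
s, a, r, s2 = 0, 1, 1.0, 1   # one sampled transition

# target uses the max over next actions and is held fixed when differentiating
target = r + gamma * max(q(s2, b, theta) for b in range(n_actions))
td_error = target - q(s, a, theta)
# SGD step down the gradient of the MSE objective:
# theta <- theta + alpha * (target - Q(s,a;theta)) * grad_theta Q(s,a;theta)
theta += alpha * td_error * features(s, a)
```

With a real deep network the only change is that ∂Q/∂θ comes from backpropagation instead of being the raw feature vector.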
Policy gradient for continuous actions
- Challenge: the action space can be continuous, and maximizing the Q-function over actions is then difficult.
Deterministic policy gradient
- Use a deterministic policy a = μ(s; θ) instead of maximizing over actions
- Update the policy in the direction that increases Q, via the chain rule:
    ∇_θ J ≈ E[ ∇_θ μ(s; θ) · ∇_a Q(s,a) |_{a = μ(s; θ)} ]
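The chain-rule update can be seen in a minimal scalar sketch, assuming a hypothetical linear policy a = μ(s; θ) = θ·s and a known critic Q(s,a) = −(a − 2s)², whose maximizing action is a = 2s:

```python
# Deterministic policy gradient sketch (all functions and constants are
# illustrative assumptions, not from the slides).
def dq_da(s, a):
    # gradient of the assumed critic Q(s, a) = -(a - 2s)^2 w.r.t. the action
    return -2.0 * (a - 2.0 * s)

def dmu_dtheta(s):
    # gradient of the linear policy mu(s; theta) = theta * s w.r.t. theta
    return s

theta, lr = 0.0, 0.05
states = [0.5, 1.0, 1.5, 2.0]
for _ in range(500):
    for s in states:
        a = theta * s
        # chain rule through the action: grad_theta J = dmu/dtheta * dQ/da
        theta += lr * dq_da(s, a) * dmu_dtheta(s)

print(round(theta, 3))  # approaches 2.0, the action-maximizing slope
```

No max over actions is ever computed; the policy is nudged uphill on Q, which is the whole point for continuous action spaces.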
Deterministic actor-critic
- Critic: estimate the action-value function Q(s,a; w) by Q-learning
- Actor: update the policy parameters θ in the direction of the critic’s action-gradient ∇_a Q(s,a; w)
Deterministic actor-critic learning rule
- TD error: δ = r + γ Q(s’, μ(s’; θ); w) − Q(s,a; w)
- Critic update: w ← w + α_w δ ∇_w Q(s,a; w)
- Actor update: θ ← θ + α_θ ∇_θ μ(s; θ) ∇_a Q(s,a; w) |_{a = μ(s; θ)}
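A single step of this learning rule, with linear stand-ins for both networks (the parameterizations, transition, and step sizes are assumptions for illustration):

```python
import numpy as np

# Linear critic Q(s, a; w) = w0*s + w1*a and linear actor mu(s; theta) = theta*s
# stand in for the deep networks on the earlier slides.
w = np.array([0.2, 0.3])
theta = 0.1
alpha_w, alpha_t, gamma = 0.1, 0.01, 0.9

def q(s, a):
    return w[0] * s + w[1] * a

def mu(s):
    return theta * s

s, a, r, s2 = 1.0, 0.5, 1.0, 2.0   # one sampled transition

# TD error: delta = r + gamma * Q(s', mu(s')) - Q(s, a)
delta = r + gamma * q(s2, mu(s2)) - q(s, a)
# critic update: grad_w Q(s, a; w) = (s, a) for this linear critic
w = w + alpha_w * delta * np.array([s, a])
# actor update (using the freshly updated critic):
# grad_theta mu(s) = s, and grad_a Q(s, a; w) = w1 for this linear critic
theta = theta + alpha_t * s * w[1]
```

Each transition thus drives two coupled updates: the critic chases the Q-learning target, and the actor climbs the critic's action-gradient.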
Stability issues with Deep RL
Naive Q-learning with a neural network can oscillate or diverge:
- Successive states are strongly correlated, violating the i.i.d. assumption behind SGD
- The policy changes rapidly with small changes in Q-values, so both the data distribution and the targets keep moving
- The scale of rewards and Q-values is unknown, so gradients can be large and unstable
Strategies for improving stability
DQN addresses each issue:
- Experience replay: break correlations by sampling from past transitions
- Fixed target Q-network: freeze the target parameters to avoid oscillations
- Clip rewards (or normalize values) to keep gradients well-conditioned
Experience replay
- Store each transition (s, a, r, s’) in a replay memory D
- Sample random mini-batches of transitions from D instead of learning from consecutive steps
- This decorrelates the data and reuses past experience
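A minimal replay memory can be sketched in a few lines (the class name, capacity, and batch size are illustrative choices, not from the slides):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of transitions (s, a, r, s2, done)."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions fall out

    def push(self, s, a, r, s2, done):
        self.buffer.append((s, a, r, s2, done))

    def sample(self, batch_size):
        # uniform random sampling breaks the temporal correlation
        # between consecutive transitions
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=1000)
for t in range(50):                      # pretend 50 environment steps
    buf.push(t, t % 2, 0.0, t + 1, False)
batch = buf.sample(8)                    # decorrelated mini-batch for SGD
```

Each SGD step then trains on `batch` rather than on the latest transition, so one interaction can contribute to many updates.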
Fixed target Q-network
- Compute targets with an older, fixed set of parameters θ⁻: r + γ max_{a’} Q(s’,a’; θ⁻)
- Optimize the online network Q(s,a; θ) toward these fixed targets
- Periodically update θ⁻ ← θ
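The bookkeeping for the fixed target can be sketched as follows, with a dict standing in for the network parameters and hypothetical constants (the sync period and the fake "SGD" increment are assumptions for illustration):

```python
import copy

theta = {"w": [0.0]}                 # stand-in for online network parameters
theta_minus = copy.deepcopy(theta)   # target network parameters, held fixed

SYNC_EVERY = 100                     # steps between hard updates (assumed)
for step in range(1, 301):
    theta["w"][0] += 0.01            # pretend SGD updates the online network
    # targets during these steps would be computed with theta_minus:
    #   r + gamma * max_a' Q(s', a'; theta_minus)
    if step % SYNC_EVERY == 0:
        theta_minus = copy.deepcopy(theta)   # periodic hard update
```

Between syncs the regression target stays put, so the online network chases a stationary objective instead of its own moving output.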
How much does DQN help?
Thank you for taking my class!