Mastering Open-face Chinese Poker by Self-play Reinforcement Learning


Mastering Open-face Chinese Poker by Self-play Reinforcement Learning
Andrew Tan, Andrew Peng, Ajay Shah, Farbod Nowzad

Proposal
Use an RL agent to play a stochastic game successfully with no domain knowledge beyond the rules.
Open-face Chinese Poker (OFCP) is a stochastic, perfect-information, zero-sum game for 2 to 4 players.
Goal: earn more points than your opponent by winning more hands and/or by collecting royalties on premium hands, without fouling.
Information symmetry.
Success in applying the AlphaZero technique to OFCP would be a first step toward applying it to other stochastic games.

Implementation
The problem is modeled as an MDP.
Use deep Q-learning to determine the optimal policy at each state.
Batch normalization, dropout, ReLU activation.
Mean squared error loss, Adam optimizer.
Train the model against a target network, using rollouts between the model and a random agent.
Replace the opponent at each iteration with the newly trained policy.
Tools: Keras, TensorFlow, deuces.
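
The deep Q-learning step listed above can be illustrated with a minimal sketch of how regression targets are usually built from a target network and scored with mean squared error. This is not the authors' code: the names `td_targets`, `mse_loss`, and `toy_target_q`, the transition layout, and the discount `gamma` are all assumptions made for illustration, and the opponent is assumed to be part of the environment, so `next_state` is the position the agent sees after the opponent has acted. Plain Python is used instead of Keras to keep the example dependency-free.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical transition layout: (state, action, reward, next_state, done).
# States are opaque here; a real agent would encode the OFCP board.
Transition = Tuple[object, int, float, object, bool]


def td_targets(
    batch: Sequence[Transition],
    target_q: Callable[[object], List[float]],
    gamma: float = 1.0,  # assumed discount; the slides do not state one
) -> List[float]:
    """Return one regression target per transition.

    y = r                                    if the episode ended
    y = r + gamma * max_a' Q_target(s', a')  otherwise
    """
    targets = []
    for _state, _action, reward, next_state, done in batch:
        if done:
            targets.append(reward)
        else:
            targets.append(reward + gamma * max(target_q(next_state)))
    return targets


def mse_loss(predicted: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error between Q(s, a) predictions and TD targets."""
    return sum((p - t) ** 2 for p, t in zip(predicted, targets)) / len(targets)


# Toy stand-in for the frozen target network: fixed Q-values per state.
toy_target_q = {"s1": [0.5, 2.0], "s2": [1.0, -1.0]}.get

batch = [
    ("s0", 0, 0.0, "s1", False),  # non-terminal: bootstrap from s1
    ("s1", 1, 3.0, "s2", True),   # terminal: target is the reward alone
]
targets = td_targets(batch, toy_target_q, gamma=0.9)
loss = mse_loss([1.0, 2.0], targets)  # [1.0, 2.0] are made-up online-network outputs
```

In a full training loop, the online network would be fitted to these targets (for example with an Adam optimizer), and the target network would be refreshed from the online network at intervals; how often the authors did this is not stated in the slides.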