Mastering Open-face Chinese Poker by Self-play Reinforcement Learning
Andrew Tan, Andrew Peng, Ajay Shah, Farbod Nowzad
Proposal
- Use an RL agent to successfully play a stochastic game with no domain knowledge other than the rules.
- Open-face Chinese Poker (OFCP) is a stochastic, perfect-information, zero-sum game played by 2-4 players.
- Goal: earn more points than your opponents by winning more hands and/or by collecting royalties on premium hands, all without fouling (see the foul-check sketch below).
- Information symmetry: all placed cards are face-up, so every player sees the same board.
- Success in applying the AlphaZero technique to Open-face Chinese Poker is a first step toward applying the technique to other stochastic games.
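To make the fouling constraint concrete, below is a minimal sketch of a foul check for the two 5-card rows using the deuces evaluator (lower score means stronger hand in deuces). The helper names `row_strength` and `fouls` are ours, not from the poster, and the 3-card top row is omitted because deuces only evaluates 5-7 card hands; a real implementation would need a custom 3-card evaluator.

```python
# Minimal OFCP foul check for the middle and bottom rows, using
# deuces (scores run from 1 = royal flush to 7462 = worst hand).
from deuces import Card, Evaluator

evaluator = Evaluator()

def row_strength(row):
    """Score a 5-card row with deuces; lower is stronger."""
    return evaluator.evaluate(row, [])

def fouls(middle, bottom):
    """A hand fouls if the bottom row is weaker than the middle row.
    (The 3-card top row check is omitted; deuces cannot score it.)"""
    return row_strength(bottom) > row_strength(middle)

# Example: a bottom-row flush over a middle-row pair does not foul.
bottom = [Card.new(c) for c in ('Ah', 'Kh', '9h', '5h', '2h')]
middle = [Card.new(c) for c in ('Qs', 'Qd', '8c', '6h', '3d')]
print(fouls(middle, bottom))  # False
```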
Implementation
- The problem is modeled as an MDP.
- Deep Q-learning is used to determine the optimal policy at each state.
- Network: batch normalization, dropout, ReLU activations.
- Training: mean squared error loss, Adam optimizer.
- Train the model against a target network, using rollouts between the model and a random agent (see the sketch below).
- Replace the opponent at each iteration with the newly trained policy.
- Tools: Keras, TensorFlow, deuces.
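Below is a minimal sketch of this setup in Keras. The state size (208, a one-hot card encoding), the 3-action row-placement action space, and the synthetic batch are our assumptions for illustration; the poster does not specify the encoding, network widths, or hyperparameters.

```python
# Sketch of the described DQN with a target network and self-play
# opponent replacement. STATE_DIM, N_ACTIONS, layer sizes, and the
# synthetic batch below are assumptions, not the poster's values.
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

STATE_DIM = 208   # assumed one-hot encoding of card placements
N_ACTIONS = 3     # place the drawn card in top / middle / bottom row
GAMMA = 0.99

def build_q_network():
    """Q-network with the stated layers: BatchNorm, ReLU, Dropout."""
    model = keras.Sequential([
        keras.Input(shape=(STATE_DIM,)),
        layers.Dense(256),
        layers.BatchNormalization(),
        layers.Activation('relu'),
        layers.Dropout(0.3),
        layers.Dense(256),
        layers.BatchNormalization(),
        layers.Activation('relu'),
        layers.Dropout(0.3),
        layers.Dense(N_ACTIONS),          # one Q-value per row placement
    ])
    model.compile(optimizer=keras.optimizers.Adam(1e-4), loss='mse')
    return model

model = build_q_network()
target = build_q_network()
target.set_weights(model.get_weights())   # target network starts as a copy

def train_step(batch):
    """One DQN update: regress Q(s, a) toward r + gamma * max_a' Q_target(s', a')."""
    states, actions, rewards, next_states, dones = batch
    next_q = target.predict(next_states, verbose=0).max(axis=1)
    targets = model.predict(states, verbose=0)
    targets[np.arange(len(actions)), actions] = rewards + GAMMA * next_q * (1 - dones)
    model.fit(states, targets, verbose=0)

# Synthetic batch just to demonstrate the update shapes; real batches
# would come from rollouts between the model and its current opponent.
B = 32
batch = (np.random.rand(B, STATE_DIM).astype('float32'),
         np.random.randint(N_ACTIONS, size=B),
         np.random.randn(B).astype('float32'),
         np.random.rand(B, STATE_DIM).astype('float32'),
         np.zeros(B, dtype='float32'))
train_step(batch)

# Self-play schedule: after each training iteration, the opponent is
# replaced with a frozen copy of the newly trained policy.
opponent = build_q_network()
opponent.set_weights(model.get_weights())
```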