Mastering Open-face Chinese Poker by Self-play Reinforcement Learning

Andrew Tan, Andrew Peng, Ajay Shah, Farbod Nowzad

Proposal
Use an RL agent to play a stochastic game successfully with no domain knowledge beyond the rules
Open-face Chinese Poker (OFCP) is a stochastic, perfect-information, zero-sum game played between 2-4 players
Goal: earn more points than your opponent by winning more hands and/or by collecting royalties on premium hands, without fouling
Information symmetry
Success in applying the AlphaZero technique to Open-face Chinese Poker would be a first step toward applying the technique to other stochastic games

Implementation
Problem is modeled as an MDP
Use deep Q-learning to determine the optimal policy at each state
Batch normalization, dropout, ReLU activation
Mean squared error loss, Adam optimizer
Train the model against a target network using rollouts between the model and a random agent
Replace the opponent at each iteration with the newly trained policy
Tools: Keras, TensorFlow, deuces
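To make the target-network step concrete, here is a minimal sketch of how Q-learning regression targets and the mean squared error loss are commonly computed. It is not the authors' code: the function names, the batch layout, and the discount `gamma` are assumptions for illustration, and plain Python lists stand in for the Keras model outputs.

```python
from typing import List, Sequence


def td_targets(
    rewards: Sequence[float],
    next_q_target: Sequence[Sequence[float]],
    done: Sequence[bool],
    gamma: float = 1.0,  # assumed discount; the slides do not state one
) -> List[float]:
    """Standard Q-learning targets: r + gamma * max_a' Q_target(s', a').

    next_q_target[i] holds the target network's Q-values for every action
    available in the i-th next state. Terminal transitions (done[i] is True)
    use the reward alone, with no bootstrapped term.
    """
    targets = []
    for r, q_next, terminal in zip(rewards, next_q_target, done):
        bootstrap = 0.0 if terminal else gamma * max(q_next)
        targets.append(r + bootstrap)
    return targets


def mse_loss(q_pred: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error between the online network's Q-values and the targets."""
    return sum((q - y) ** 2 for q, y in zip(q_pred, targets)) / len(targets)


# Hypothetical batch of two transitions: one mid-game, one ending the game.
batch_targets = td_targets(
    rewards=[0.0, 6.0],
    next_q_target=[[1.0, 3.0], [5.0, 2.0]],
    done=[False, True],
    gamma=0.9,
)
loss = mse_loss([2.7, 4.0], batch_targets)
```

In a full training loop, the online network would be fit to these targets with the Adam optimizer, and the target network's weights would be refreshed from the online network periodically; the refresh schedule is not given in the slides.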
