Published by Hildegunn Carlsson. Modified over 5 years ago.
1
Mastering Open-face Chinese Poker by Self-play Reinforcement Learning
Andrew Tan, Andrew Peng, Ajay Shah, Farbod Nowzad
2
Proposal
Use an RL agent to play a stochastic game successfully with no domain knowledge beyond the rules
Open-face Chinese Poker (OFCP) is a stochastic, perfect-information, zero-sum game played between 2-4 players
Goal: earn more points than your opponent by winning more hands and/or by collecting royalties on premium hands without fouling
Information symmetry
Success in applying the AlphaZero technique to Open-face Chinese Poker is a first step toward applying the technique to other stochastic games
3
Implementation
Problem is modeled as an MDP
Use deep Q-learning to determine the optimal policy at each state
Batch normalization, dropout, ReLU activation
Mean squared error loss, Adam optimizer
Train model against a target network using rollouts between model and random agent
Replace opponent at each iteration with newly trained policy
Tools: Keras, TensorFlow, deuces
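The target-network step above can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' Keras code: the names `td_targets`, `mse_loss`, `Transition`, and `QFunction` are hypothetical, the Q-functions are stand-ins for the online and target networks, and the discount factor `gamma` is an assumption, since the slides do not state one.

```python
from typing import Callable, List, Sequence, Tuple

# A transition is (state, action, reward, next_state, done).
# States are opaque here; for OFCP they would encode the board and the card to place.
Transition = Tuple[object, int, float, object, bool]
# A Q-function maps a state to one Q-value per action (stand-in for a network).
QFunction = Callable[[object], Sequence[float]]


def td_targets(batch: List[Transition], q_target: QFunction, gamma: float) -> List[float]:
    """Compute Bellman targets from the frozen target network.

    Terminal transitions use the reward alone; otherwise the target is
    reward + gamma * max over next actions of the target network's Q-values.
    """
    targets = []
    for _state, _action, reward, next_state, done in batch:
        if done:
            targets.append(reward)
        else:
            targets.append(reward + gamma * max(q_target(next_state)))
    return targets


def mse_loss(batch: List[Transition], q_online: QFunction, targets: List[float]) -> float:
    """Mean squared error between the online Q-value of the taken action and its target."""
    errors = [
        (q_online(state)[action] - target) ** 2
        for (state, action, _reward, _next_state, _done), target in zip(batch, targets)
    ]
    return sum(errors) / len(errors)
```

In a full training loop, the online network would be updated to reduce this loss (with Adam, per the slide), and the target network's weights would be refreshed from the online network periodically. How often the authors synced the two is not stated.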