Mastering Open-face Chinese Poker by Self-play Reinforcement Learning

1 Mastering Open-face Chinese Poker by Self-play Reinforcement Learning
Andrew Tan, Andrew Peng, Ajay Shah, Farbod Nowzad

2 Proposal
- Use an RL agent to successfully play a stochastic game with no domain knowledge other than the rules
- Open-face Chinese Poker (OFCP) is a stochastic, perfect-information, zero-sum game played between 2-4 players
- Goal: earn more points than your opponent by winning more hands and/or by collecting royalties on premium hands without fouling
- Information symmetry: all placed cards are face-up and visible to every player
- Success in applying the AlphaZero technique to Open-face Chinese Poker would be a first step toward applying it to other stochastic games
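The point structure above (row wins, royalties, fouling) can be sketched as a simplified two-player zero-sum payoff. This is a minimal sketch assuming the common 1-6 scoring convention (1 point per row won, +3 scoop bonus) and the usual foul rule (a fouled player is scooped and forfeits their own royalties); the function name and signature are hypothetical, not from the slides:

```python
def point_swing(rows_won_a, royalties_a, royalties_b,
                fouled_a=False, fouled_b=False):
    """Point swing for one OFCP hand, from player A's perspective.

    rows_won_a: rows A won out of 3 (ignored if either player fouled).
    Assumes 1-6 scoring: +1 per row won, -1 per row lost, +/-3 scoop bonus.
    """
    if fouled_a and fouled_b:
        return 0                        # both foul: hand is a wash
    if fouled_a:
        return -(3 + 3 + royalties_b)   # A is scooped; B still collects royalties
    if fouled_b:
        return 3 + 3 + royalties_a
    row_points = rows_won_a - (3 - rows_won_a)
    if rows_won_a == 3:
        row_points += 3                 # scoop bonus
    elif rows_won_a == 0:
        row_points -= 3                 # scooped
    return row_points + royalties_a - royalties_b
```

Because the game is zero-sum, this swing is exactly what one player gains and the other loses, which is the reward signal the RL agent would maximize.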

3 Implementation
- Problem is modeled as an MDP
- Use deep Q-learning to determine the optimal policy at each state
- Batch normalization, dropout, ReLU activations
- Mean squared error loss, Adam optimizer
- Train the model against a target network, using rollouts between the model and a random agent
- Replace the opponent at each iteration with the newly trained policy
- Tools: Keras, TensorFlow, deuces

