Intelligence for Games and Puzzles. Poker: Opponent Modelling

Presentation transcript:

Slide 1: Poker: Opponent Modelling

Early AI work on poker used simplified variants of poker. More recently attention has focused mainly on “Limit Texas Hold’em”, in both its “heads-up” form (only two players) and its many-player form (often 10).

Texas Hold’em is a popular form of poker in the USA. As in all forms of poker, betting is an essential element. Texas Hold’em offers four opportunities per hand for a round of betting.

In Limit Texas Hold’em there are two sizes of bet increment:
- the small bet, say $2
- the large bet, say $4

In No-Limit Texas Hold’em, players may bet any amount up to all the chips they have at the table. (A cap at the current size of the pot is the rule in Pot-Limit games.)

Slide 2: The structure of a hand of Texas Hold’em

A hand (if played to the bitter end) proceeds through nine stages:
1. Dealer gives each player two cards (“hole cards”) face-down; each player may see only his own cards.
2. “Preflop”: round of (small) betting, started by the “blind”.
3. “Flop”: dealer lays three cards face-up.
4. Round of (small) betting.
5. “Turn”: dealer lays a fourth card face-up.
6. Round of (big) betting.
7. “River”: dealer lays a fifth card face-up.
8. Round of (big) betting.
9. “Showdown”: players still in the game show their cards to determine the winner.

The winner is the player who makes the best 5-card poker hand using a combination of his hole cards and the community cards (the board).

Slide 3: Decisions to be made

In a round of betting, players have to choose one of five actions repeatedly, starting off with the player to dealer’s left and proceeding clockwise:
- Bet: If nobody has yet bet in the current round, a player may add the appropriate-size (small or large) bet to the pot.
- Check: If nobody has yet bet in the current round, a player may do nothing.
- Call: If someone has put more into the pot in the current round, a player may add just enough to make their own contribution equal.
- Raise: If someone has put more into the pot in the current round, a player may add enough to make their own contribution equal and then add the appropriate (small or large) bet on top. In the Limit version, max 3 raises per round.
- Fold: A player may withdraw from the hand, forfeiting any bets and raises already put in the pot, and excluding themselves from further betting.

Limit games require no decision about the amount of bets and raises. No-limit games require more complex reasoning because bets may vary in size.

Slide 4: Betting based on probabilities

One way to play poker is to use probabilities:
- Given your own known hole cards, and the community cards that are on show, for each possible combination of community cards yet to appear, how likely is your hand to be better than any other player’s hand?
- Compare this to the pot odds, the ratio

$$\text{pot odds} = \frac{\text{the cost of making a bet/call}}{\text{the size of the pot}}$$

- If the comparison is very favourable, bet or raise; if merely favourable, check or call; if not, check if possible, otherwise fold.
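The comparison of winning chance against pot odds can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `raise_margin` parameter and its value of 0.25 are hypothetical, since the slide does not say how large a gap counts as "very favourable". The break-even threshold cost / (pot + cost) is also an addition here, derived from requiring that the expected value of a call, p × pot − (1 − p) × cost, be non-negative.

```python
def pot_odds(cost_to_call, pot_size):
    """The slide's ratio: cost of making a bet/call over the size of the pot."""
    return cost_to_call / pot_size


def break_even_win_probability(cost_to_call, pot_size):
    """Win probability at which a call has zero expected value (our derivation)."""
    return cost_to_call / (pot_size + cost_to_call)


def choose_action(p_win, cost_to_call, pot_size, raise_margin=0.25):
    """Pick an action class from the estimated chance of winning.

    raise_margin is a hypothetical tuning knob: how far above break-even
    the winning chance must be before the comparison counts as
    "very favourable".
    """
    threshold = break_even_win_probability(cost_to_call, pot_size)
    if p_win >= threshold + raise_margin:
        return "bet/raise"
    if p_win >= threshold:
        return "check/call"
    return "check/fold"  # check if possible, otherwise fold


# Facing a $2 call into a $10 pot.
for p in (0.6, 0.3, 0.1):
    print(p, choose_action(p, cost_to_call=2, pot_size=10))
```

A player who follows such a rule mechanically is entirely predictable from their betting, which is the weakness the next slide addresses.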

Slide 5: Predictability is bad

Basing your behaviour on the probabilities like this is a poor strategy. Other players will observe the cards you reveal at the showdowns, learn about your conservative style of play, learn about your assessment of winning chances, interpret your betting behaviour as indicative of the strength of your hand, and use this to beat you over the course of many hands.

Good poker players
1. observe the decisions of their opponents and gather what evidence they can
2. base their decisions on models of their opponents, exploiting any weaknesses they detect
3. strive to frustrate the formation of accurate models of themselves, by bluffing, and by consciously, deliberately changing their own style

Slide 6: Poker as AI Testbed domain

Poker, like several other games, features
- Competing agents
- Chance
- Finite set of choices
- Large game tree

In addition, Poker has
- Risk assessment
- Deception

In many other games, there is little to be gained by modelling opponents. Rudimentary models, like “contempt factor”, or no model at all, are common. In poker, modelling opponents, and awareness of their trying to model you, is essential to good play.

Slide 7: Bayesian Network approach to modelling

By training over many self-played hands, a CPT (conditional probability table) can be built up. Then in real play, knowing all influences upon “opponent action” except “opponent current hand”, one can draw conclusions about “opponent current hand”.

But a CPT of ~200k entries cannot reasonably be modified over a game of ~100 hands. (Boulton)
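The inference step, reasoning backwards from an observed action to the hidden hand, is an application of Bayes' rule. The sketch below is a deliberately tiny stand-in and not the network from the slide: it uses three hypothetical hand classes and a made-up conditional probability table with a single influence (hand strength), whereas the table described above has around 200k entries and several influences.

```python
def posterior_over_hands(prior, cpt, action):
    """Bayes' rule: P(hand | action) is proportional to P(action | hand) * P(hand)."""
    unnormalised = {hand: prior[hand] * cpt[hand][action] for hand in prior}
    total = sum(unnormalised.values())
    return {hand: weight / total for hand, weight in unnormalised.items()}


# Hypothetical prior belief about the opponent's hand class.
prior = {"strong": 0.2, "medium": 0.3, "weak": 0.5}

# Hypothetical table P(action | hand class); each row sums to 1.
cpt = {
    "strong": {"raise": 0.6, "call": 0.3, "fold": 0.1},
    "medium": {"raise": 0.2, "call": 0.5, "fold": 0.3},
    "weak":   {"raise": 0.1, "call": 0.3, "fold": 0.6},
}

posterior = posterior_over_hands(prior, cpt, "raise")
for hand, p in posterior.items():
    print(f"{hand}: {p:.3f}")
```

With these invented numbers, observing a raise shifts belief towards the strong class and away from the weak one.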

Slide 8: Classification of hands

At the outset, the two hole cards of an opponent player may be any two of the 50 cards you don’t have: 50x49/2 = 1225 combinations if you enumerate them.

It is sufficient to distinguish 169 qualitatively different hands:
- 13 possible pairs: AA, KK, QQ, …
- 78 pairings of cards of the same suit: AKs, AQs, AJs, A10s, A9s, … 42s, 32s
- 78 pairings of cards of different suits: AK, AQ, AJ, … 32

Collapsing still further, to 25 or so classes, loses some information but facilitates learning of statistics. Classification of boards and of pot sizes can proceed similarly.
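The counts on this slide can be checked by enumeration. The following is a minimal sketch, assuming a card is written as a rank character followed by a suit character and that "T" stands for 10; the function names are illustrative. As on the slide, suited classes carry an "s" suffix and unsuited ones carry none.

```python
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"  # T stands for 10
SUITS = "cdhs"
DECK = [rank + suit for rank in RANKS for suit in SUITS]


def hand_class(card_a, card_b):
    """Map two hole cards to one of the 169 classes, e.g. 'AA', 'AKs', 'AK'."""
    # Put the higher rank first so that A-K and K-A share a class.
    hi, lo = sorted((card_a, card_b), key=lambda c: RANKS.index(c[0]), reverse=True)
    if hi[0] == lo[0]:
        return hi[0] + lo[0]                  # a pair
    suffix = "s" if hi[1] == lo[1] else ""    # suited or not
    return hi[0] + lo[0] + suffix


def opponent_hand_classes(my_cards):
    """Count the opponent's possible two-card holdings, grouped by class."""
    remaining = [card for card in DECK if card not in my_cards]
    return Counter(hand_class(a, b) for a, b in combinations(remaining, 2))


counts = opponent_hand_classes(["As", "Kd"])
print(sum(counts.values()))  # number of two-card combinations from 50 cards
print(len(counts))           # number of distinct classes
```

Note that the classes are not equally likely: holding an ace yourself leaves only three ways for an opponent to hold AA, against six for a pair whose rank you do not hold.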

Slide 9: The Loki program

Loki, from Univ. Alberta, used a probabilistic approach, with one initial model (set of weights) for all players, then updating weights for individual players on the basis of their observed actions.
- Assess the probability of holding each class of hand, given own cards and board.
- Modify probability estimates in light of each action, e.g. for “raise”, increase strong-hand probabilities and decrease weak-hand probabilities.
- Adjust weights from estimated hands to better predict observed action.

This showed improved performance compared to (i) programs with no modelling and (ii) programs with only static modelling.

Slide 10: Bluffing behaviour

Being able to model others is only part of the solution. Good players find it easy to model opponents who never bluff.

Bluffing purely at random (say 5% of hands) has a problem: in some cases opponents can know for certain that you cannot win, so avoid bluffing at such times.

Continuing to raise while bluffing is not typical of how you behave when you truly do have a good hand, and good opponents will detect the difference.

Follow a plan: proceed as if your chance of losing was say 50% of your true estimate of that chance. This will lead to consistent and realistic behaviour that cannot be easily diagnosed as bluffing.
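The "follow a plan" rule amounts to a small adjustment of the winning chance used for betting decisions. A minimal sketch, assuming the adjusted probability is then fed into whatever betting rule the player normally uses; the function name and the `loss_discount` parameter are hypothetical, with 0.5 matching the slide's "say 50%".

```python
def bluffing_win_probability(true_p_win, loss_discount=0.5):
    """Win probability to act on when bluffing according to a plan.

    The chance of losing is scaled down by loss_discount, so the player
    behaves throughout as if holding a consistently better hand.
    """
    p_lose = 1.0 - true_p_win
    return 1.0 - loss_discount * p_lose


for p in (0.3, 0.5, 0.9):
    print(p, bluffing_win_probability(p))
```

Because the adjustment is applied to every decision in the hand, the resulting betting pattern resembles that of a genuinely stronger hand rather than a sudden burst of raises.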

Slide 11: The Poki program

Poki is a rewrite and enhancement of Loki. It features a neural-network opponent modelling mechanism; inputs include
- estimated hand strength
- estimated hand potential
- previous action of opponent
- position of player clockwise from dealer (first, last, neither)
- predictions from “expert predictors”

Opponent modelling is viewed as machine learning: predict the opponent’s action.
- Backpropagation within the neural network
- Plug-in “Expert Predictor(s)” (ensemble) may be machine-learning systems too

Poki also features game-tree search, to 5 ply, using “miximax” to handle the problem of imperfect knowledge.
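As a rough illustration of the miximax idea, the sketch below treats the opponent's decision nodes like chance nodes, weighting each opponent action by a probability that would come from the opponent model, and takes the maximum at the player's own decision nodes. It is a toy under stated assumptions, not Poki's implementation: the `Node` structure, the payoffs and the probabilities are invented, and the 5-ply depth limit is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                                      # "me", "opponent", "chance" or "leaf"
    value: float = 0.0                             # payoff to us; used only at leaves
    children: list = field(default_factory=list)
    probs: list = field(default_factory=list)      # used at opponent and chance nodes


def miximax(node):
    """Expected payoff: max at our nodes, probability-weighted mix elsewhere."""
    if node.kind == "leaf":
        return node.value
    values = [miximax(child) for child in node.children]
    if node.kind == "me":
        return max(values)
    # Opponent nodes use predicted action probabilities; chance nodes use card odds.
    return sum(p * v for p, v in zip(node.probs, values))


def leaf(value):
    return Node("leaf", value=value)


# We either fold (lose 2) or bet. If we bet, the modelled opponent folds
# half the time (we win 4), otherwise calls and we go to a showdown.
showdown = Node("chance", children=[leaf(8.0), leaf(-6.0)], probs=[0.3, 0.7])
opponent = Node("opponent", children=[leaf(4.0), showdown], probs=[0.5, 0.5])
root = Node("me", children=[leaf(-2.0), opponent])

print(miximax(root))  # roughly 1.1, so betting beats folding here
```

The quality of such a search depends directly on the action probabilities at the opponent nodes, which is where the opponent model feeds in.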

Slide 12: References

Slide 13: References

Quoted in Aaron Davidson’s 2002 MSc thesis at the U.Alberta site: