Using Probabilistic Knowledge And Simulation To Play Poker (Darse Billings ++ 1999 …) Presented by Brett Borghetti 7 Jan 2007.

7 Feb 2007 Brett Borghetti 2 Contributions of the work
New betting strategy using probability:
– Propagates a “Probability Triple” knowledge representation
– An atomic unit stating the likelihood of each action occurring in a given situation
Uses real-time simulations to generate a selective sample of the possible outcomes while a hand is in progress.

7 Feb 2007 Brett Borghetti 3 Old Loki (Loki-1)
Carried only the most likely action, or the probability of playing the hand.
Uses “expert knowledge”:
– Initial tables of income rates
– Initial weight probabilities of opponent hands (how likely they are to play with these cards)
– Re-weighting rules for opponent model updates
– Hand evaluator (strength and potential)
– Rule-based betting module

7 Feb 2007 Brett Borghetti 4 New Loki (Loki-2)
All tables store probability triples
– Propagating distributions allows distributed decision-making in components
Simulator calculates the expected value of a selected sample of the ways the hand might play out
– Eliminates some of the required ‘expert knowledge’

7 Feb 2007 Brett Borghetti 5 Probability Triples
Stores the probability of 3 actions
– [f,c,r] (fold, call, raise) such that f+c+r = 1.0
Used in 3 locations in Loki-2
– Triple Generator: evaluates 2-card hands in the current context
– Opponent Modeler: updates the weight tables in the opponent model
– Action Selector: chooses our next action; can adjust the selection based on desired play style
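
To make the triple concrete, here is a minimal Python sketch of a probability triple and a randomized action selector. The names `Triple`, `make_triple` and `select_action` are hypothetical, and drawing the action directly from the triple is an assumption about how the selector might behave, not Loki-2's actual code.

```python
import random
from typing import NamedTuple


class Triple(NamedTuple):
    """Probability triple [f, c, r]; the three entries sum to 1.0."""
    fold: float
    call: float
    raise_: float


def make_triple(f, c, r):
    """Normalize three non-negative scores into a probability triple."""
    total = f + c + r
    if total <= 0:
        raise ValueError("at least one action must have positive weight")
    return Triple(f / total, c / total, r / total)


def select_action(triple, rng=random):
    """Pick fold/call/raise at random, weighted by the triple (assumed selector)."""
    return rng.choices(["fold", "call", "raise"], weights=triple, k=1)[0]


t = make_triple(1, 2, 1)   # fold 25%, call 50%, raise 25%
action = select_action(t)  # one of "fold", "call", "raise"
```

Drawing the action at random, rather than always taking the most likely one, is what keeps the play from being fully predictable.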

7 Feb 2007 Brett Borghetti 6 Simulation-Based Betting Strategy
Calculates the approximate expected value (EV) of each possible betting action.
Since folding has EV = 0, only the call and raise actions are considered from the current position, and the game tree is expanded from there
Since the entire game tree would be intractable to search, uses selective sampling
– Simulated opponent actions are biased by their weight tables, using a random number to select the actual action in that simulated hand
The authors claim this approach should be better than the static approach
– [Brett] That would depend on how accurately the weighting scheme detects the true behavior of the opponent
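
The sampling idea can be sketched in Python as below. This is a toy illustration only: `toy_trial`, its payoff accounting, the pot size, the win probability and the opponent triple are all made-up assumptions standing in for a real play-out of the hand, and none of it is taken from Loki-2. What it does show is the overall shape: fold is fixed at EV = 0, while call and raise are each estimated by averaging many randomized trials.

```python
import random

ACTIONS = ("fold", "call", "raise")


def toy_trial(action, rng, pot=4.0, p_win=0.5, opp_triple=(0.2, 0.6, 0.2)):
    """One simulated play-out; returns our net win in small bets (toy payoffs)."""
    cost = 1.0
    if action == "raise":
        cost = 2.0
        # Opponent's reply is drawn at random, biased by its (assumed) triple.
        opp_action = rng.choices(ACTIONS, weights=opp_triple, k=1)[0]
        if opp_action == "fold":
            return pot  # opponent gives up; we take the pot
    # Showdown: the opponent has matched our cost (toy simplification).
    if rng.random() < p_win:
        return pot + cost
    return -cost


def estimate_ev(action, trial, trials=2000, rng=None):
    """Average the trial outcomes to approximate the EV of one action."""
    rng = rng or random.Random()
    return sum(trial(action, rng) for _ in range(trials)) / trials


def choose_action(trial, trials=2000, rng=None):
    """Compare sampled EVs of call and raise against folding (EV = 0)."""
    rng = rng or random.Random()
    evs = {"fold": 0.0}
    for action in ("call", "raise"):
        evs[action] = estimate_ev(action, trial, trials, rng)
    return max(evs, key=evs.get), evs


best, evs = choose_action(toy_trial, rng=random.Random(0))
```

How good the estimate is depends on both the number of trials and on how well the opponent triple reflects the real opponent, which is the concern raised in the [Brett] note.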

7 Feb 2007 Brett Borghetti 7 Comparing Performance
Single measurement: small bets per hand (sb/hand)
– If you play 30 hands in a $10/20 game, an improvement of +0.10 sb/hand means you win an extra $1.00 per hand, for an extra $30 won overall.
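
The arithmetic in the example can be checked with a few lines of Python. The function name `extra_winnings` is hypothetical; the small bet in a $10/20 game is $10.

```python
def extra_winnings(sb_per_hand, small_bet, hands):
    """Dollars gained from an improvement measured in small bets per hand."""
    return sb_per_hand * small_bet * hands


per_hand = extra_winnings(0.10, 10, 1)   # about $1.00 per hand
overall = extra_winnings(0.10, 10, 30)   # about $30 over 30 hands
```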

7 Feb 2007 Brett Borghetti 8 Experiments
Examines each change from Loki-1 separately
– R: changing the reweighting
– B: changing from rule-based betting to the ‘action selector’ with randomization
– S: incorporating the simulation to compute EV in the action selection decision

7 Feb 2007 Brett Borghetti 9 Experiments (continued)
Self-play in a 10-seat game: added components one at a time and compared performance
B ~ R, B << S, R << S
– B alone and R alone are roughly equivalent and provide the least improvement; S alone provides the most improvement
B+R+S > S

7 Feb 2007 Brett Borghetti 10 Experiments (continued)
Player type comparisons in a 10-seat game
– Number of hands played to the flop: T = Tight, L = Loose
– How frequently the player bets and raises after the flop: A = Aggressive, C = Conservative

7 Feb 2007 Brett Borghetti 11 Issues
[Brett] At the core of Loki-2 is the weighting system that models the opponent.
– Is this system flexible and adaptable to rapid changes in opponent strategy, or do the weights have some kind of inertia that prevents the model from incorporating changes as quickly as they happen?
– Do the weight updates (belief updates) make sense?

7 Feb 2007 Brett Borghetti 12 Background Information

7 Feb 2007 Brett Borghetti 13 Texas Hold’em Heads-up Limit Poker Basics
2 players
4 betting rounds per hand
– Preflop (2 hole cards), Flop (3 community cards), Turn (1 cc), River (1 cc)
Action set = {fold, call (check), raise (bet)}
Up to 3 raises allowed per round
A round is over when either
– All players are even in the pot via a final call and each player has had at least one opportunity to act [go on to next round]
– One player folds [other player wins]

7 Feb 2007 Brett Borghetti 14 Requirements for a World Class Poker Player
Able to assess
– Hand strength
– Hand potential
– Opponent’s betting strategy (opponent model)
Has a strong
– Betting strategy
– Ability to play deceptively [bluff vs. slow play*]
– Ability to play unpredictably

7 Feb 2007 Brett Borghetti 15 Optimal vs Maximal play
An optimal player makes decisions based on game-theoretic probabilities without regard to specific context (opponent’s plays)
A maximal player takes into account the opponent’s sub-optimal tendencies and adjusts its play to exploit perceived weaknesses

7 Feb 2007 Brett Borghetti 16 Hand Assessment (Hand Strength = HS)
Pre-flop HS is determined from the 169 equivalence classes
– “Income rate” from 1M simulated poker hands
Flop HS is determined by comparing each of the 1081 possible opponent hands with ours and counting how many wins each player has
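
The two counts on this slide can be reproduced with a short Python sketch, along with one way to turn the win counts into a strength value. The `hand_strength` function and its treatment of ties as half a win are assumptions for illustration; the slide only says that wins are counted.

```python
from math import comb

# 169 preflop classes: 13 pairs, plus suited and offsuit versions of
# each of the C(13, 2) = 78 unpaired rank combinations.
PREFLOP_CLASSES = 13 + comb(13, 2) + comb(13, 2)

# On the flop, 52 - 2 (our hole cards) - 3 (board) = 47 cards are unseen,
# so the opponent holds one of C(47, 2) two-card hands.
OPPONENT_HANDS_ON_FLOP = comb(52 - 2 - 3, 2)


def hand_strength(ahead, tied, behind):
    """Fraction of opponent hands we beat; ties count as half (assumption)."""
    return (ahead + tied / 2) / (ahead + tied + behind)


print(PREFLOP_CLASSES, OPPONENT_HANDS_ON_FLOP)  # 169 1081
```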

7 Feb 2007 Brett Borghetti 17 Hand Potential (HP) at the Flop
PPot_1 = likelihood that our hand will improve with one card (the turn card)
PPot_2 = likelihood that our hand will improve with two cards (turn and river)
NPot_1 and NPot_2 = equivalent calculations of the likelihood that our opponent’s hand will become better than ours on the turn and/or river

7 Feb 2007 Brett Borghetti 18 Effective Hand Strength & Pot Odds
EHS = HS_n + (1 - HS_n) × PPot_n
– The chance that we are either ahead now or could pull ahead by the end of n=1 or n=2 cards from now
Pot odds = (cost to call) / (pot size + cost to call); calling is profitable when P(win) exceeds the pot odds
– Example: if your chance of winning is 25%, you would call a $4 bet to win a $16 pot. The pot odds are $4/$20 = 20%, and your expected return is 0.25 × $20 = $5 every time you pay $4, for an expected net gain of $1.00 per play.
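
Both formulas on this slide fit in a few lines of Python. The function names are hypothetical, and pot odds are written here as cost to call divided by the pot after calling, which is one common convention and matches the $20 total used in the example.

```python
def effective_hand_strength(hs, ppot):
    """EHS = HS + (1 - HS) * PPot: ahead now, or behind but improving."""
    return hs + (1 - hs) * ppot


def pot_odds(to_call, pot):
    """Cost to call as a fraction of the pot after calling (assumed convention)."""
    return to_call / (pot + to_call)


def call_ev(p_win, to_call, pot):
    """Expected net gain of calling: win the final pot p_win of the time."""
    return p_win * (pot + to_call) - to_call


odds = pot_odds(4, 16)       # 0.2, so a 25% chance of winning justifies a call
gain = call_ev(0.25, 4, 16)  # 0.25 * 20 - 4 = 1.0
ehs = effective_hand_strength(0.5, 0.2)  # 0.5 + 0.5 * 0.2 = 0.6
```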

7 Feb 2007 Brett Borghetti 19 Opponent Modeling
Uses an initial weighting scheme based on the original income rates for the 169 preflop card equivalence classes
Updates the weights generically on each hand, based on the betting used during that hand
Updates the weights specifically, based on the total betting history over all hands with this opponent
Weight updates are based on the mean and variance of call vs. raise vs. fold actions

7 Feb 2007 Brett Borghetti 20 Using the opponent model
Calculate a new weight for each possible starting card combo (1081) of the opponent based on initial weights, HS, EHS and opponent actions (generic and specific)
The weights for the possible hole-card pairs provide an ordering over the possible hands
Usually greatly reduces the uncertainty about what hands the opponent is playing… assuming they are not playing deceptively.
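
A minimal Python sketch of such a weight update follows. It assumes each candidate hand's weight is multiplied by the probability, taken from that hand's triple, of the action the opponent actually took, and that weights are then rescaled so the largest is 1. The function `reweight`, the two-hand table and the triple values are illustrative assumptions, not data or code from Loki-2.

```python
ACTION_INDEX = {"fold": 0, "call": 1, "raise": 2}


def reweight(weights, triples, action):
    """Scale each hand's weight by how likely that hand was to take `action`."""
    idx = ACTION_INDEX[action]
    updated = {hand: w * triples[hand][idx] for hand, w in weights.items()}
    top = max(updated.values())
    if top > 0:
        # Rescale so the most plausible hand has weight 1.0.
        updated = {hand: w / top for hand, w in updated.items()}
    return updated


# Hypothetical two-hand table: a strong hand and a weak hand.
weights = {"AA": 1.0, "72o": 1.0}
triples = {"AA": (0.0, 0.2, 0.8), "72o": (0.8, 0.15, 0.05)}

after_raise = reweight(weights, triples, "raise")
ranking = sorted(after_raise, key=after_raise.get, reverse=True)
```

After an observed raise, the strong hand keeps a high weight and the weak hand's weight drops sharply, which is the ordering effect described on the slide. A deceptive opponent who raises with weak hands would break this inference.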
