Using Probabilistic Knowledge And Simulation To Play Poker (Darse Billings ++ 1999 …) Presented by Brett Borghetti 7 Jan 2007.

7 Feb 2007, Brett Borghetti

Contributions of the work
New betting strategy using probability:
  – Propagates a “probability triple” knowledge representation
  – A triple is an atomic unit stating the likelihood of each action occurring in a given situation
Uses real-time simulations to generate a selective sample of the possible outcomes while a hand is in progress.

Old Loki (Loki-1)
Only carried the most likely action or the probability of playing the hand.
Uses “expert knowledge”:
  – Initial tables of income rates
  – Initial weight probabilities of opponent hands (how likely they are to play with these cards)
  – Re-weighting rules for opponent model updates
  – Hand evaluator: strength and potential
  – Rule-based betting module

New Loki (Loki-2)
All tables store probability triples
  – Propagating distributions allows distributed decision making in components
Simulator calculates the expected value over a selective sample of the ways the hand might play out
  – Eliminates some of the required “expert knowledge”

Probability Triples
Store the probability of 3 actions
  – [f,c,r] (fold, call, raise) such that f+c+r = 1.0
Used in 3 locations in Loki-2
  – Triple Generator: evaluates 2-card hands in the current context
  – Opponent Modeler: updates the weight tables in the opponent model
  – Action Selector (for choosing our next action): can adjust the selection based on desired play style
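A minimal sketch of a probability triple, assuming it is nothing more than three non-negative numbers that sum to 1. The class name `ProbabilityTriple` and its methods are hypothetical and do not come from Loki-2. It shows the two ways an action selector could use a triple: sample an action at random, or take the most likely one.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbabilityTriple:
    """Probabilities of fold, call, and raise; they must sum to 1 (hypothetical type)."""
    fold: float
    call: float
    raise_: float

    def __post_init__(self):
        values = (self.fold, self.call, self.raise_)
        if any(v < 0.0 for v in values):
            raise ValueError("probabilities must be non-negative")
        if abs(sum(values) - 1.0) > 1e-9:
            raise ValueError("fold + call + raise must equal 1.0")

    def sample(self, rng: random.Random) -> str:
        """Pick one action at random in proportion to the triple."""
        x = rng.random()
        if x < self.fold:
            return "fold"
        if x < self.fold + self.call:
            return "call"
        return "raise"

    def most_likely(self) -> str:
        """Deterministic alternative: the single most probable action."""
        pairs = [("fold", self.fold), ("call", self.call), ("raise", self.raise_)]
        return max(pairs, key=lambda p: p[1])[0]


triple = ProbabilityTriple(fold=0.1, call=0.6, raise_=0.3)
rng = random.Random(0)
counts = {"fold": 0, "call": 0, "raise": 0}
for _ in range(10_000):
    counts[triple.sample(rng)] += 1
```

Sampling makes play less predictable, while `most_likely` corresponds to carrying only the single most likely action, as the slides describe for Loki-1.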

Simulation-Based Betting Strategy
Calculates an approximate expected value (EV) of the return on investment for each possible betting action.
Since folding has EV = 0, only call and raise are considered from the current position, and the game tree is expanded from there.
Since the entire game tree would be intractable to search, uses selective sampling:
  – Simulated opponent actions are biased by their weight tables; a random number selects the actual action in each simulated hand
The authors claim this approach should be better than the static approach
  – [Brett] That would depend on how accurately the weighting scheme detects the true behavior of the opponent
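The idea of estimating EV by selective sampling can be sketched with a toy model. Everything here is an assumption made for illustration: the three opponent holdings, their weights, the win probabilities, the fold-or-call response to a raise, and the single-showdown payoff rule. Loki-2 plays out full hands, which this sketch does not attempt.

```python
import random

# Hypothetical weight table: relative likelihood of each opponent holding.
WEIGHTS = {"strong": 0.2, "medium": 0.5, "weak": 0.3}
# Assumed chance that our hand beats each holding at showdown.
WIN_PROB = {"strong": 0.15, "medium": 0.55, "weak": 0.90}
# Assumed opponent response to a raise, as (fold, call) probabilities.
RESPONSE = {"strong": (0.0, 1.0), "medium": (0.2, 0.8), "weak": (0.7, 0.3)}


def simulate_ev(action, pot, bet, trials, rng,
                weights=WEIGHTS, win_prob=WIN_PROB, response=RESPONSE):
    """Estimate our average net winnings for 'call' or 'raise' by sampling."""
    hands = list(weights)
    hand_weights = [weights[h] for h in hands]
    total = 0.0
    for _ in range(trials):
        # Selective sampling: likelier holdings are drawn more often.
        opp = rng.choices(hands, weights=hand_weights, k=1)[0]
        if action == "call":
            cost, prize = bet, pot
        else:  # raise: we pay two bets; the opponent folds or calls one more
            fold_p, _call_p = response[opp]
            if rng.random() < fold_p:
                total += pot  # opponent folds, we take the pot
                continue
            cost, prize = 2 * bet, pot + bet
        total += prize if rng.random() < win_prob[opp] else -cost
    return total / trials


rng = random.Random(1)
ev_call = simulate_ev("call", pot=40.0, bet=10.0, trials=20_000, rng=rng)
ev_raise = simulate_ev("raise", pot=40.0, bet=10.0, trials=20_000, rng=rng)
best_action = "raise" if ev_raise > ev_call else "call"
```

Folding is left out because its EV is 0 by definition; the selector only has to check whether the better of the two estimates is positive.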

Comparing Performance
Single measurement: small bets per hand (sb/hand)
  – In a $10/20 game the small bet is $10, so an improvement of +0.10 sb/hand means winning an extra $1.00 per hand; over 30 hands that is an extra $30.
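The conversion from sb/hand to dollars is a single product. The helper below is a hypothetical illustration of that arithmetic, with the small bet taken as the lower figure in the stakes ($10 in a $10/20 game).

```python
def extra_winnings(sb_per_hand: float, small_bet: float, hands: int) -> float:
    """Dollars gained from a win-rate improvement expressed in small bets per hand."""
    return sb_per_hand * small_bet * hands


# +0.10 sb/hand in a $10/20 game over 30 hands
gain = extra_winnings(0.10, 10.0, 30)
```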

Experiments
Examines each change from Loki-1 separately
  – R: changing the reweighting
  – B: changing from rule-based betting to the “action selector” with randomizing
  – S: incorporating the simulation to compute EV in the action selection decision

Experiments (continued)
Self-play in a 10-seat game: added components one at a time and compared performance
B ~ R, B << S, R << S
  – B alone and R alone are roughly equivalent and provide the least improvement; S alone provides the most
B+R+S > S

Experiments (continued)
Player type comparisons in a 10-seat game
  – Number of hands played to the flop: T = Tight, L = Loose
  – How frequently the player bets and raises after the flop: A = Aggressive, C = Conservative

Issues
[Brett] At the core of Loki-2 is the weighting system that models the opponent.
  – Is this system flexible and adaptable to rapid changes in opponent strategy, or do the weights have some kind of inertia that prevents the model from incorporating changes as quickly as they happen?
  – Do the weight updates (belief updates) make sense?

Background Information

Texas Hold’em Heads-up Limit Poker Basics
2 players
4 betting rounds per hand
  – Preflop (2 hole cards), Flop (3 community cards), Turn (1 community card), River (1 community card)
Action set = {fold, call (check), raise (bet)}
Up to 3 raises allowed per round
A round is over when either
  – all players are even in the pot via a final call and each player has had at least one opportunity to act [go on to the next round], or
  – one player folds [the other player wins]

Requirements for a World Class Poker Player
Able to assess
  – Hand strength
  – Hand potential
  – Opponent’s betting strategy (opponent model)
Has a strong
  – Betting strategy
  – Ability to play deceptively [bluff vs. slow play*]
  – Ability to play unpredictably

Optimal vs. Maximal Play
An optimal player makes decisions based on game-theoretic probabilities without regard to specific context (the opponent’s plays)
A maximal player takes into account the opponent’s sub-optimal tendencies and adjusts its play to exploit perceived weaknesses

Hand Assessment (Hand Strength = HS)
Pre-flop HS is determined from the 169 equivalence classes, using an “income rate” from 1M simulated poker hands
Flop HS is determined by comparing each of the 1081 possible opponent hands with ours and counting how many wins each player has
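Both counts on this slide follow from simple combinatorics, and the flop comparison reduces to a ratio of tallies. In the sketch below the function `hand_strength` and the choice to count a tie as half a win are assumptions; the slide only says that wins are counted for each player.

```python
from math import comb

RANKS = 13

# Pre-flop equivalence classes: suits only matter as "suited" or "offsuit".
pairs = RANKS                    # AA, KK, ..., 22
suited = comb(RANKS, 2)          # two different ranks, same suit
offsuit = comb(RANKS, 2)         # two different ranks, different suits
preflop_classes = pairs + suited + offsuit        # 169

# On the flop we see our 2 hole cards and 3 community cards.
unseen_on_flop = 52 - 2 - 3
opponent_hands_on_flop = comb(unseen_on_flop, 2)  # 1081


def hand_strength(ahead: int, tied: int, behind: int) -> float:
    """Fraction of opponent holdings we currently beat (ties count half; assumed)."""
    total = ahead + tied + behind
    if total == 0:
        raise ValueError("no opponent hands were compared")
    return (ahead + tied / 2.0) / total


# Hypothetical tallies over all 1081 opponent holdings.
hs = hand_strength(ahead=700, tied=81, behind=300)
```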

Hand Potential (HP) at the Flop
PPot_1 = likelihood that our hand will improve with one card (the turn card)
PPot_2 = likelihood that our hand will improve with two cards (turn and river)
NPot_1 and NPot_2 = equivalent calculations of the likelihood that our opponent’s hand will get better than ours on the turn and/or river

Effective Hand Strength & Pot Odds
EHS = HS_n + (1 - HS_n) x PPot_n
  – The chance that we either are ahead or could pull ahead by the end of n=1 or n=2 cards from now
Pot odds compare P(win) with the expected return on the pot
  – Example: if your chance of winning is 25%, you would call a $4 bet to win a $16 pot. After the call the pot is $20, so on average you get back 0.25 x $20 = $5 for every $4 you pay, an expected net gain of $1.00 per play.
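The EHS formula and the pot-odds example can be checked numerically. The function names below are hypothetical, and the break-even rule (call when the win probability exceeds cost / (pot + cost)) is the standard reading of pot odds rather than something stated on the slide.

```python
def effective_hand_strength(hs: float, ppot: float) -> float:
    """EHS = HS + (1 - HS) * PPot: ahead now, or behind now but improving."""
    return hs + (1.0 - hs) * ppot


def call_ev(p_win: float, pot: float, cost: float) -> float:
    """Expected net gain of calling: win the pot, or lose the amount called."""
    return p_win * pot - (1.0 - p_win) * cost


def break_even_probability(pot: float, cost: float) -> float:
    """Smallest win probability at which calling does not lose money."""
    return cost / (pot + cost)


# The slide's example: 25% to win, $4 to call, $16 already in the pot.
ev = call_ev(0.25, pot=16.0, cost=4.0)
threshold = break_even_probability(pot=16.0, cost=4.0)
should_call = 0.25 > threshold

# Hypothetical inputs for the EHS formula.
ehs = effective_hand_strength(hs=0.6, ppot=0.2)
```

The two views agree: a net gain of $1.00 per call is the same statement as a 25% win chance exceeding the 20% break-even threshold.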

Opponent Modeling
Uses an initial weighting scheme based on the original income rates of the 169 preflop card equivalence classes
Updates the weights generically on each hand, based on the betting used during that hand
Updates the weights specifically, based on the total betting history over all hands with this opponent
Weight updates are based on the mean and variance of call vs. raise vs. fold actions

Using the Opponent Model
Calculate a new weight for every possible opponent starting-card combination (1081) based on initial weights, HS, EHS, and opponent actions (generic and specific)
The weights for each possible hole-card pair provide an ordering over the possible hands
This usually greatly reduces the uncertainty about what hands the opponent is playing, assuming they are not playing deceptively.
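One way to picture the reweighting step is as a multiplicative update: each candidate hand's weight is scaled by how likely the observed action would be with that hand. The sketch below assumes that update rule, assumes the weights are renormalized afterwards, and uses made-up hands and triples; none of these details are confirmed by the slides.

```python
ACTION_INDEX = {"fold": 0, "call": 1, "raise": 2}


def reweight(weights, triples, observed_action):
    """Scale each hand's weight by the probability of the observed action.

    weights: hand label -> current weight
    triples: hand label -> (fold, call, raise) probabilities for that hand
    Returns new weights normalized to sum to 1 (normalization is an assumption).
    """
    index = ACTION_INDEX[observed_action]
    updated = {hand: w * triples[hand][index] for hand, w in weights.items()}
    total = sum(updated.values())
    if total == 0.0:
        raise ValueError("observed action is impossible under every hand")
    return {hand: w / total for hand, w in updated.items()}


# Hypothetical two-hand example: a premium pair and a very weak holding.
weights = {"AA": 1.0, "72o": 1.0}
triples = {"AA": (0.0, 0.2, 0.8), "72o": (0.8, 0.15, 0.05)}
after_raise = reweight(weights, triples, "raise")
ranking = sorted(after_raise, key=after_raise.get, reverse=True)
```

After an observed raise the premium pair dominates the ranking, which is the reduction in uncertainty described above. A deceptive opponent breaks this inference, because the triples would no longer describe how they really act.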