A Whist AI Jason Fong CS261A, Spring 2005
What is Whist? An old card game, driven into obscurity by Bridge. Similar to other trick-taking games: Bridge, Spades, Hearts. Many variations; no current official version. The variation studied is the one I play. Similar to Oh Hell!, a.k.a. Perpetual Aggravation, Oh Jerusalem, Oh Pshaw, Blackout, Screw Your Neighbor, Nomination Whist, or Animal.
Rules of Whist: Game structure. Played with 5 players. No partnerships. Rounds played with varying number of cards … … After hands are dealt, the next card is turned face up and determines the trump suit. Players take turns dealing. Highest cumulative score wins.
Rules of Whist: Trick-taking game. Each player plays one card per trick. The suit of the first card played is the lead suit. Subsequent players must play the same suit if possible; with no cards of that suit, they may play any card. The highest card of the lead suit wins the trick unless a trump is played. Trump suit: one suit is designated as the trump for each deal. The highest trump played always wins the trick. A player can only play trumps if void of the lead suit or if trump was led.
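The follow-suit and trump rules above can be sketched in a few lines. This is a minimal illustration, not the program from the talk: the `(rank, suit)` card encoding and the function names `legal_plays` and `trick_winner` are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Assumed encoding: rank 2..14 (14 = ace), suit one of "C", "D", "H", "S".
Card = Tuple[int, str]

def legal_plays(hand: List[Card], lead_suit: Optional[str]) -> List[Card]:
    """Cards a player may play: follow the lead suit if possible, else anything."""
    if lead_suit is None:          # this player leads the trick
        return list(hand)
    following = [c for c in hand if c[1] == lead_suit]
    return following or list(hand)

def trick_winner(plays: List[Card], trump: str) -> int:
    """Index (in play order) of the card that wins the trick."""
    lead_suit = plays[0][1]

    def strength(card: Card) -> Tuple[int, int]:
        rank, suit = card
        if suit == trump:
            return (2, rank)       # any trump beats any non-trump
        if suit == lead_suit:
            return (1, rank)       # otherwise the highest lead-suit card wins
        return (0, rank)           # off-suit discards never win

    return max(range(len(plays)), key=lambda i: strength(plays[i]))
```

For example, with spades as trump, `trick_winner([(14, "H"), (2, "S"), (10, "H")], "S")` picks the player at index 1, because the lowly two of spades trumps the ace of hearts.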
Rules of Whist: Bidding. Each player declares how many tricks they will take. The goal is to take exactly the number of tricks bid. Bidding starts with the person following the dealer. The bids must not add up to the number of cards dealt to each player, so at least one player must miss. The dealer bids last, so the dealer's bid value is restricted.
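The dealer restriction can be made concrete with a small helper. This is a sketch under assumptions: the name `legal_bids` is hypothetical, and bids are assumed to range from 0 to the number of cards in hand.

```python
from typing import List

def legal_bids(cards_per_hand: int, earlier_bids: List[int],
               num_players: int = 5) -> List[int]:
    """Bids available to the next bidder.

    Only the last bidder (the dealer) is restricted: the dealer may not
    choose the bid that would make all bids sum to cards_per_hand.
    """
    bids = list(range(cards_per_hand + 1))
    is_dealer = len(earlier_bids) == num_players - 1
    if is_dealer:
        forbidden = cards_per_hand - sum(earlier_bids)
        bids = [b for b in bids if b != forbidden]
    return bids
```

With 5 cards each and earlier bids of 1, 1, 1, 0, the dealer may bid anything except 2. If the earlier bids already exceed the number of cards, the forbidden value is negative and the dealer is effectively unrestricted.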
Rules of Whist: Scoring. One point for each trick taken. 10 points for making the bid exactly. No bonus for zero (nil) bids. Making or missing bids is the most important part of the score. Stopping other players' bids factors into strategy. Players also need to consider other players trying to stop them.
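As a quick check of how heavily the bid bonus dominates, here is a sketch of the per-deal score. The function name `deal_score` is hypothetical, and it assumes that "no bonus for zero bids" means a made zero bid earns the ordinary 10 points and nothing extra; the slides do not spell this out.

```python
def deal_score(bid: int, tricks_taken: int, bonus: int = 10) -> int:
    """Score for one deal: one point per trick, plus the bonus for an exact bid.

    Assumption: a made zero bid gets the same 10-point bonus as any other bid.
    """
    return tricks_taken + (bonus if tricks_taken == bid else 0)
```

Bidding 3 and taking 3 scores 13, while taking only 2 scores 2, so a single trick either way swings the result by about ten points.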
Making an AI for Whist: Perform some sort of search. Each node is a particular play order: internal nodes are the order of play for a partial game, and leaves are a complete order of play. The game has imperfect information.
A Naïve Approach: Assume opponents can have any card not seen yet. The search tree quickly explodes. Worst case, when 10 cards are dealt: depth of 50 (5 players times 10 cards). Opponent branching factor begins at 41 (52 cards minus the 10 in hand and the face-up trump card). The branching factor reduces by one with each opponent turn, but the tree is still too big. Number of leaves: 41! x 10! ~ 10^56.
Reducing Branching Factor: Assume we can cheat and peek at the opponents' cards. The branching factor becomes the number of cards in an opponent's hand. The tree is still too large to search completely. Number of leaves: (10!)^5 ~ 6 x 10^32. Cut off the search depth and use a heuristic.
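The two leaf counts, 41! x 10! for the naive tree and (10!)^5 when peeking, are easy to verify with exact integer arithmetic. The variable names below are chosen for this example only.

```python
import math

# Naive tree: opponents may hold any of the 41 unseen cards,
# and we order our own 10 cards.
naive_leaves = math.factorial(41) * math.factorial(10)

# Peeking: each of the 5 players orders a known 10-card hand.
peek_leaves = math.factorial(10) ** 5

print(f"naive: {naive_leaves:.1e}")   # roughly 1.2e+56
print(f"peek:  {peek_leaves:.1e}")    # roughly 6.3e+32
```

Peeking removes more than twenty orders of magnitude, yet the remaining tree is still far beyond exhaustive search, which is why a depth cutoff and a heuristic are needed.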
Heuristic: The most important part of the score is making bids. How likely is it to make the bid? How many tricks can I win? How many tricks can I duck? Calculating a heuristic value: classify cards based on their ability to win or duck a trick, consider the tricks needed to make the bid, and estimate the final number of tricks taken.
Heuristic: Classify cards in your hand. Absolute winners: you have the highest card of the suit, and no player void of the suit has a trump. Absolute losers: at least one player has the suit and their lowest card in it is higher than this card, or some player has all trumps. Likely winners / losers: consider whether an opponent wants to take the trick, and consider the opponents' tricks left after absolute winners/losers. Neutral cards: everything else.
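The two "absolute" classes can be tested directly once opponent hands are visible. The sketch below is one reading of the slide, not the talk's code: the function names are hypothetical, cards are assumed to be `(rank, suit)` tuples, and "some player has all trumps" is interpreted as "some opponent's hand consists only of trumps".

```python
from typing import List, Tuple

Card = Tuple[int, str]  # assumed encoding: (rank 2..14, suit letter)

def is_absolute_winner(card: Card, opponents: List[List[Card]], trump: str) -> bool:
    """True if no opponent can beat this card, by rank or by trumping."""
    rank, suit = card
    for hand in opponents:
        same_suit = [r for r, s in hand if s == suit]
        if any(r > rank for r in same_suit):
            return False                       # someone holds a higher card
        void = not same_suit
        if suit != trump and void and any(s == trump for _, s in hand):
            return False                       # someone can trump it
    return True

def is_absolute_loser(card: Card, opponents: List[List[Card]], trump: str) -> bool:
    """True if some opponent is forced to beat this card."""
    rank, suit = card
    for hand in opponents:
        same_suit = [r for r, s in hand if s == suit]
        if same_suit and min(same_suit) > rank:
            return True                        # must follow suit with a higher card
        if suit != trump and hand and all(s == trump for _, s in hand):
            return True                        # assumed reading: hand is all trumps
    return False
```

Cards that pass neither test would fall into the likely or neutral classes, which depend on what the opponents want and are not modeled here.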
Heuristic: Calculating a heuristic value. Start from the tricks needed to make the bid. Subtract absolute winners. Subtract some fraction of likely winners. If under the bid, subtract some fraction of neutrals; if over the bid, add some fraction of neutrals. This gives an estimate of the likelihood of making the bid.
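The calculation above can be sketched as a signed "distance from the bid", where 0 means the hand looks on track. The function name, the two weights, and the clamping at zero are all assumptions for illustration; the slides only say "some fraction".

```python
def bid_gap_estimate(bid: int, tricks_taken: int, absolute_winners: int,
                     likely_winners: int, neutrals: int,
                     likely_weight: float = 0.5,     # placeholder weight
                     neutral_weight: float = 0.25) -> float:  # placeholder weight
    """Estimated tricks still missing (positive) or in excess (negative)."""
    gap = float(bid - tricks_taken)       # tricks needed to make the bid
    gap -= absolute_winners               # these tricks are certain
    gap -= likely_weight * likely_winners
    flexibility = neutral_weight * neutrals
    if gap > 0:                           # under the bid: neutrals may win tricks
        gap = max(0.0, gap - flexibility)   # assumed: never overshoot past zero
    elif gap < 0:                         # over the bid: neutrals may be ducked
        gap = min(0.0, gap + flexibility)
    return gap
```

A value near zero suggests the bid is likely to be made, while a large magnitude in either direction suggests it is likely to be missed.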
Heuristic: Other considerations. Each trick taken adds a point: if you can't make your bid, you might as well take as many tricks as possible. Forcing opponents to miss a bid: if everybody misses their bid, then scores don't change much, so if you can't make your bid, take as many others down with you as possible.
Search Methodology: Max^n algorithm. Similar to minimax, but for n players. Evaluation of each node gives an n-tuple. Each node maximizes the moving player's element of the n-tuple. Back up the entire n-tuple to the parent. The best next move is the best child of the root.
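A depth-limited max^n search fits in a dozen lines. This is a generic sketch, not the talk's implementation: the callback names (`children`, `to_move`, `evaluate`) and the toy three-player tree are assumptions made so the example runs on its own.

```python
from typing import Any, Callable, List, Optional, Tuple

def maxn(state: Any, depth: int,
         children: Callable[[Any], List[Tuple[Any, Any]]],
         to_move: Callable[[Any], int],
         evaluate: Callable[[Any], Tuple[float, ...]]):
    """Return (n-tuple of values, best move) for the player to move."""
    successors = children(state)
    if depth == 0 or not successors:
        return evaluate(state), None       # leaf or cutoff: static n-tuple
    player = to_move(state)
    best_values: Optional[Tuple[float, ...]] = None
    best_move = None
    for move, child in successors:
        values, _ = maxn(child, depth - 1, children, to_move, evaluate)
        # The moving player only looks at their own element of the tuple...
        if best_values is None or values[player] > best_values[player]:
            best_values, best_move = values, move   # ...but backs up all of it
    return best_values, best_move

# Toy 3-player tree: dicts are internal nodes, tuples are leaf evaluations.
tree = {"player": 0, "children": {
    "a": {"player": 1, "children": {"x": (3, 1, 2), "y": (1, 4, 0)}},
    "b": {"player": 2, "children": {"p": (2, 2, 1), "q": (0, 0, 5)}},
}}
toy_children = lambda n: list(n["children"].items()) if isinstance(n, dict) else []
toy_to_move = lambda n: n["player"]
toy_evaluate = lambda n: n if isinstance(n, tuple) else (0, 0, 0)

print(maxn(tree, 5, toy_children, toy_to_move, toy_evaluate))  # ((1, 4, 0), 'a')
```

In the toy tree, player 1 would choose `y` and player 2 would choose `q`, so player 0 compares a payoff of 1 (via `a`) with 0 (via `b`) and moves to `a`.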
An Honest Whist AI: Nobody wants to play with a cheater, but gamblers are fine, so use Monte-Carlo simulation. Remember which cards have been seen. Take the unseen cards and randomly deal the opponents' hands. Peek at those cards and find the best move. Do this many times and count how often each of your cards was chosen as the best move. Play the card that came up most often. This gives pseudo-perfect information while searching, but does not need to see the actual cards in play.
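The sampling loop described above can be sketched as follows. The names are hypothetical: `best_move_given_deal` stands in for whatever perfect-information search is used on each sampled deal (for instance a max^n search), and cards are assumed to be `(rank, suit)` tuples from a standard 52-card deck.

```python
import random
from collections import Counter
from typing import Callable, List, Optional, Set, Tuple

Card = Tuple[int, str]

def monte_carlo_move(my_hand: List[Card], seen: Set[Card],
                     opponent_hand_sizes: List[int],
                     best_move_given_deal: Callable[[List[Card], List[List[Card]]], Card],
                     iterations: int = 100,
                     rng: Optional[random.Random] = None) -> Card:
    """Vote over random deals of the unseen cards; play the most frequent winner."""
    rng = rng or random.Random()
    deck = [(r, s) for s in "CDHS" for r in range(2, 15)]
    unseen = [c for c in deck if c not in seen and c not in my_hand]
    votes: Counter = Counter()
    for _ in range(iterations):
        rng.shuffle(unseen)                    # one random guess at the hidden cards
        hands, start = [], 0
        for size in opponent_hand_sizes:
            hands.append(unseen[start:start + size])
            start += size
        votes[best_move_given_deal(my_hand, hands)] += 1   # "peek" at the sample
    return votes.most_common(1)[0][0]
```

The search only ever sees sampled hands, never the real ones, so the player stays honest while still searching a perfect-information tree on each iteration.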
Optimization Considerations: The branching factor decreases as the game progresses, so search deeper later in the game. There is no choice of moves in the last trick, so automatically play the last 5 cards. It is possible to have only one legal move: don't search, just play the card.
Improving Quality: Depth vs. Iterations. Thinking time can be spent on searching deeper or on searching more iterations of deal variations. Many cards in hand: favor iterations. There are many more possible variations, so more samples are needed, and searching deeper is not cost-effective because additional depth is more expensive and heuristic values are not as accurate. Few cards in hand: favor depth. There are fewer possible variations, so fewer samples are needed, a complete search is possible more often, and the heuristic is more accurate when fewer cards are left to play. Adjust depth and iterations based on the cards left to play.
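One way to express this trade-off is a schedule that maps cards left in hand to a search depth and a sample count. Everything numeric below is a placeholder chosen for the example, not a value from the talk, and the function name `search_budget` is hypothetical.

```python
from typing import Tuple

def search_budget(cards_in_hand: int, max_cards: int = 10, num_players: int = 5,
                  base_depth: int = 5,          # placeholder: one full trick
                  min_iterations: int = 10,     # placeholder
                  max_iterations: int = 200) -> Tuple[int, int]:  # placeholder
    """Return (depth in plies, Monte-Carlo iterations) for the current hand size."""
    remaining_plies = cards_in_hand * num_players
    # Fewer cards left: search deeper, up to a complete search.
    depth = min(remaining_plies, base_depth * max_cards // cards_in_hand)
    # More cards left: more possible deals, so take more samples.
    fraction = cards_in_hand / max_cards
    iterations = round(min_iterations + fraction * (max_iterations - min_iterations))
    return depth, iterations
```

With these placeholder settings a full 10-card hand gets a one-trick lookahead and 200 samples, while a 2-card hand gets a complete 10-ply search and 48 samples.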
Evaluation Methodology: Non-standard game, so there are no existing programs to compare against. A subjective evaluation: played against a focus group of four human players, took their comments on blunders made by the AI, observed the percentage of games won, and observed the reasoning behind moves made. It is difficult to determine the effect of each component of the system.
Results: Play against the focus group suggests the AI plays acceptably well. It does not play well enough to consistently win, but plays well enough to not fall hopelessly behind. Trials were possibly affected by opponent bias: the human players focused on defeating the AI player. The AI was possibly hindered by not being paranoid enough.
Heuristic Improvements: Tune the weights of likely winners/losers. Divide the neutral card classification based on whether a card is more likely to win or lose: rank cards relative to the other remaining cards of the same suit, and count how many cards of the same suit have been played to gauge the likelihood of being trumped. Consider protected high cards: if you have low cards of the same suit as a high card, the high card is not as dangerous when you are close to getting too many tricks. Adjust the value of forcing missed bids. Learn a paranoia degree for each opponent.
Other Improvements: Deal opponent hands based on their bids. This needs to account for bids being incorrect when a player misses with a winner/loser card. Observe play and count possible missed winners/losers? This adds lots of complications.
Automating Bidding: Possible algorithm: consider all bid possibilities, get the backed-up value at the root for each, and choose the bid with the best backed-up value. Results were not very effective. Missing other factors: protected high cards, long/short suits, ability to keep or give away the lead, etc. Bids were entered manually instead: evaluation was based on live games, so there was no need to bid quickly on a large number of games, and card play could be studied without being affected by bid quality.
Conclusions: Produced a respectable Whist AI. Not a dominating player, but there are still many possibilities for improvement. The winner/loser card classifier is effective for Whist and possibly applicable to other trick-taking games. The results suggest that the Monte-Carlo / peek-at-cards method can be effective for other imperfect-information card games.
Future Work: Improve AI performance. Graphical interface. Online multiplayer version.