Games with Chance 2012/04/25
Nondeterministic Games: Backgammon
White moves clockwise toward 25; Black moves counterclockwise toward 0.
A piece can move to any position unless two or more opponent pieces occupy it; if exactly one opponent piece is there, it is captured and must start over.
White has rolled 6-5 and must choose among four legal moves: (5-10, 5-11), (5-11, 19-24), (5-10, 10-16), (5-11, 11-16).
Nondeterministic Games: Backgammon (cont.-1)
– White knows what his own legal moves are, but cannot determine Black's, because those depend on what Black rolls
– The game tree must therefore include chance nodes
– How to pick the best move? Minimax cannot be applied directly
Nondeterministic Games: Backgammon (cont.-2)
Chance nodes are included in the game tree.
Nondeterministic Games in General
In nondeterministic games, chance is introduced by dice, card shuffling, or face-down shuffling.
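Algorithm for Nondeterministic Games
Expectiminimax gives perfect play:
\[
\text{Expectiminimax}(n) =
\begin{cases}
\text{Utility}(n) & \text{if } n \text{ is a terminal node} \\
\max_{s \in \text{Successors}(n)} \text{Expectiminimax}(s) & \text{if } n \text{ is a MAX node} \\
\min_{s \in \text{Successors}(n)} \text{Expectiminimax}(s) & \text{if } n \text{ is a MIN node} \\
\sum_{s \in \text{Successors}(n)} P(s)\,\text{Expectiminimax}(s) & \text{if } n \text{ is a chance node}
\end{cases}
\]
The successor function for a chance node n augments the state of n with each possible dice roll to produce each successor s together with its probability P(s).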
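The four cases of Expectiminimax can be sketched in Python. This is a minimal illustration, not a backgammon engine: the tuple-based tree encoding and the small example tree are assumptions made for this sketch and do not come from the slides.

```python
# Hypothetical tree encoding (an assumption for this sketch):
#   a number                       -> terminal node with that utility
#   ("max", [child, ...])          -> MAX node
#   ("min", [child, ...])          -> MIN node
#   ("chance", [(p, child), ...])  -> chance node; p is the outcome probability P(s)

def expectiminimax(node):
    """Return the expectiminimax value of a node in the encoding above."""
    if isinstance(node, (int, float)):       # terminal: Utility(n)
        return float(node)
    kind, children = node
    if kind == "max":                         # MAX picks the largest value
        return max(expectiminimax(c) for c in children)
    if kind == "min":                         # MIN picks the smallest value
        return min(expectiminimax(c) for c in children)
    if kind == "chance":                      # probability-weighted average
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind!r}")

# Illustrative tree: MAX chooses between two chance nodes,
# each with two equally likely MIN nodes below it.
tree = ("max", [
    ("chance", [(0.5, ("min", [2, 4])), (0.5, ("min", [1, 3]))]),
    ("chance", [(0.5, ("min", [0, 5])), (0.5, ("min", [1, 6]))]),
])

print(expectiminimax(tree))  # 1.5
```

The first chance node averages MIN values 2 and 1 to 1.5, the second averages 0 and 1 to 0.5, so MAX prefers the first move.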
Pruning in Nondeterministic Game Trees
A version of α-β pruning is possible.
[Figure: step-by-step interval bounds on a game tree with chance nodes. Every node starts at [-∞, +∞]. The first chance node's MIN children resolve to [2, 2] and [1, 1], fixing that chance node at [1.5, 1.5]. The second chance node's MIN children narrow to [0, 0] and [-∞, 1], which bounds that chance node by [-∞, 0.5], below the 1.5 already available to MAX.]
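Pruning Contd.
More pruning occurs if we can bound the leaf values.
[Figure: the same tree with every leaf value bounded in [-2, 2]. The first chance node again resolves to [1.5, 1.5]. As soon as the second chance node's first MIN child narrows to [-2, 0], with the other still at [-2, 2], that chance node is bounded by [-2, 1], which is already below 1.5.]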
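The bound on a chance node follows from the bounds on its children: each end of the interval is the probability-weighted average of the children's corresponding ends. A minimal sketch, assuming equally likely outcomes (probability 0.5 each, which is consistent with the intervals shown) and using a hypothetical helper name `chance_bounds`:

```python
def chance_bounds(children):
    """children: list of (probability, (lo, hi)) pairs, one per outcome.

    Returns the (lo, hi) interval for the chance node itself.
    Assumes finite bounds, as in the bounded-leaf case.
    """
    lo = sum(p * child_lo for p, (child_lo, _) in children)
    hi = sum(p * child_hi for p, (_, child_hi) in children)
    return lo, hi

# Second chance node: first MIN child known to be in [-2, 0],
# second MIN child not yet examined, so still in [-2, 2].
lo, hi = chance_bounds([(0.5, (-2, 0)), (0.5, (-2, 2))])

alpha = 1.5  # value MAX can already secure via the first chance node
can_prune = hi <= alpha

print((lo, hi), can_prune)  # (-2.0, 1.0) True
```

Without the leaf bound the unexamined child would have an upper end of +∞, so no cutoff would be possible at this point.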
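Digression: Exact Values DO Matter
Behavior is preserved only by a positive linear transformation of Eval.
Hence Eval should be proportional to the expected payoff.
[Figure: two versions of the same chance tree with differently scaled leaf values; in one the move to A1 is best, in the other the move to A2 is best.]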
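A small numeric check makes the point concrete. The probabilities and leaf values below are illustrative assumptions, not read off the slide; the helper names are hypothetical. A positive linear rescaling keeps the choice, while an order-preserving but nonlinear rescaling flips it.

```python
def expected_value(outcomes):
    """outcomes: list of (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)

def best_move(moves, transform=lambda v: v):
    """Return the move whose expected transformed value is largest."""
    return max(
        moves,
        key=lambda name: expected_value([(p, transform(v)) for p, v in moves[name]]),
    )

# Assumed example: each move leads to a chance node with two outcomes.
moves = {
    "A1": [(0.9, 2), (0.1, 3)],   # expected value 2.1
    "A2": [(0.9, 1), (0.1, 4)],   # expected value 1.3
}

print(best_move(moves))                        # A1
print(best_move(moves, lambda v: 10 * v + 5))  # A1 (positive linear: 26 vs 18)
print(best_move(moves, lambda v: v ** 4))      # A2 (monotone but nonlinear: 22.5 vs 26.5)
```

Raising to the fourth power preserves the ordering of the leaves (1 < 2 < 3 < 4) yet changes which move has the higher expectation, which plain minimax without chance nodes would never notice.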
Nondeterministic games in practice
Games of Imperfect Information
E.g., card games, where the opponent's initial cards are unknown.
Typically we can calculate a probability for each possible deal.
Seems just like having one big dice roll at the beginning of the game.
Idea: averaging over clairvoyance
– compute the minimax value of each action for each possible deal of the cards
– choose the action with the highest expected value over all deals
Special case: if an action is optimal for all deals, it is optimal.
GIB (Ginsberg, 1999), current best bridge program, approximates this idea with a modified form of averaging over clairvoyance:
– generating 100 deals consistent with the bidding information
– picking the action that wins the most tricks on average
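The averaging idea can be sketched as follows. This is a toy illustration only: `minimax_value` stands in for a full minimax search of the game under a known deal, and the two deals, two actions, and their values are invented for the example.

```python
def average_over_clairvoyance(actions, deals, minimax_value):
    """Pick the action with the highest expected minimax value.

    actions: iterable of candidate actions
    deals: dict mapping each possible deal to its probability
    minimax_value: function (action, deal) -> value of that action if the
        deal were fully known (hypothetical stand-in for a real search)
    """
    def expected(action):
        return sum(p * minimax_value(action, deal) for deal, p in deals.items())
    return max(actions, key=expected)

# Invented example: two equally likely deals, two candidate actions.
deals = {"deal_1": 0.5, "deal_2": 0.5}
values = {
    ("safe", "deal_1"): 1, ("safe", "deal_2"): 1,
    ("risky", "deal_1"): 2, ("risky", "deal_2"): -1,
}

choice = average_over_clairvoyance(
    ["safe", "risky"], deals, lambda action, deal: values[(action, deal)]
)
print(choice)  # safe
```

A sampling variant in the spirit of the GIB description would replace the full `deals` dictionary with a fixed number of randomly generated deals of equal weight.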
Example
Four-card bridge/whist/hearts hand, MAX to play first
Proper analysis
α-β Algorithm
function ALPHA-BETA-SEARCH(state) returns an action
    inputs: state, current state in game
    v ← MAX-VALUE(state, -∞, +∞)
    return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state, α, β) returns a utility value
    inputs: state, current state in game
            α, the value of the best alternative for MAX along the path to state
            β, the value of the best alternative for MIN along the path to state
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← -∞
    for a, s in SUCCESSORS(state) do
        v ← MAX(v, MIN-VALUE(s, α, β))
        if v ≥ β then return v    // fail-high
        α ← MAX(α, v)
    return v
α-β Algorithm (cont.)
function MIN-VALUE(state, α, β) returns a utility value
    inputs: state, current state in game
            α, the value of the best alternative for MAX along the path to state
            β, the value of the best alternative for MIN along the path to state
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← +∞
    for a, s in SUCCESSORS(state) do
        v ← MIN(v, MAX-VALUE(s, α, β))
        if v ≤ α then return v    // fail-low
        β ← MIN(β, v)
    return v
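The pseudocode translates fairly directly into Python. The sketch below assumes a hypothetical nested-list encoding (a number is a terminal utility, a list is an internal node, and levels alternate between MAX and MIN starting with MAX at the root); the example tree is illustrative. The root loop tracks the index of the best child explicitly instead of looking up "the action with value v" afterwards.

```python
import math

def max_value(node, alpha, beta):
    if isinstance(node, (int, float)):   # terminal test
        return node
    v = -math.inf
    for child in node:
        v = max(v, min_value(child, alpha, beta))
        if v >= beta:                    # fail-high: MIN will avoid this node
            return v
        alpha = max(alpha, v)
    return v

def min_value(node, alpha, beta):
    if isinstance(node, (int, float)):   # terminal test
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child, alpha, beta))
        if v <= alpha:                   # fail-low: MAX will avoid this node
            return v
        beta = min(beta, v)
    return v

def alpha_beta_search(root):
    """Return (index of MAX's best child at the root, its value)."""
    best_index, best_value = None, -math.inf
    alpha = -math.inf
    for i, child in enumerate(root):
        v = min_value(child, alpha, math.inf)
        if v > best_value:               # strictly better than earlier moves
            best_index, best_value = i, v
        alpha = max(alpha, v)
    return best_index, best_value

# Illustrative two-ply tree: MAX at the root, three MIN nodes below.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alpha_beta_search(tree))  # (0, 3)
```

In this tree the second MIN node is cut off after its first leaf (2 ≤ α = 3), so leaves 4 and 6 are never examined.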
Negamax (B. Chen, 2010)