Games with Chance 2012/04/25 1. Nondeterministic Games: Backgammon White moves clockwise toward 25. Black moves counterclockwise toward 0. A piece can.


1 Games with Chance 2012/04/25

2 Nondeterministic Games: Backgammon. White moves clockwise toward 25. Black moves counterclockwise toward 0. A piece can move to any position unless there are multiple opponent pieces there; if there is exactly one opponent piece, it is captured and must start over. White has rolled 6-5 and must choose among four legal moves: (5-10, 5-11), (5-11, 19-24), (5-10, 10-16), (5-11, 11-16)

3 Nondeterministic Games: Backgammon (cont.-1). White knows what his own legal moves are, but cannot determine Black's, because those depend on what Black rolls. The game tree must therefore include chance nodes. How do we pick the best move? Minimax cannot be applied directly.

4 Nondeterministic Games: Backgammon (cont.-1). Chance nodes are included in the game tree.

5 Nondeterministic Games in General. In nondeterministic games, chance is introduced by dice, card-shuffling, or face-down shuffling.
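As a minimal sketch of where the probabilities at a dice-based chance node come from: two six-sided dice give 21 distinct unordered rolls, where each double has probability 1/36 and every other roll has probability 1/18. The function name `dice_roll_distribution` is an illustrative assumption and does not appear in the slides.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def dice_roll_distribution():
    """Map each unordered roll of two six-sided dice to its probability.

    A double (a, a) can occur in one way out of 36; a mixed roll (a, b)
    with a < b can occur in two ways (a-b or b-a).
    """
    return {
        (a, b): Fraction(1 if a == b else 2, 36)
        for a, b in combinations_with_replacement(range(1, 7), 2)
    }


dist = dice_roll_distribution()
print(len(dist))           # number of distinct rolls
print(dist[(5, 6)])        # a mixed roll such as 6-5
print(dist[(6, 6)])        # a double
print(sum(dist.values()))  # probabilities sum to 1
```

Exact fractions are used here so that the probabilities sum to exactly 1; a game engine would more likely use floats.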

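6 Algorithm for Nondeterministic Games
Expectiminimax gives perfect play.

\[
\text{Expectiminimax}(n) =
\begin{cases}
\text{Utility}(n) & \text{if } n \in \text{Terminal} \\
\max_{s \in \text{Successors}(n)} \text{Expectiminimax}(s) & \text{if } n \in \text{MAX} \\
\min_{s \in \text{Successors}(n)} \text{Expectiminimax}(s) & \text{if } n \in \text{MIN} \\
\sum_{s \in \text{Successors}(n)} P(s)\,\text{Expectiminimax}(s) & \text{if } n \in \text{Chance}
\end{cases}
\]

The successor function for a chance node n augments the state of n with each possible dice roll to produce a successor s and its probability P(s).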
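The Expectiminimax definition on slide 6 can be sketched in a few lines of Python. This is a minimal illustration, not the deck's implementation: the `Node` class, the `leaf` helper, and the example tree with its leaf values are all assumptions made up for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical game-tree node; `kind` is 'max', 'min', 'chance' or 'terminal'."""
    kind: str
    utility: float = 0.0                                  # used by terminal nodes only
    children: List["Node"] = field(default_factory=list)
    probs: List[float] = field(default_factory=list)      # P(s), chance nodes only


def leaf(u: float) -> Node:
    return Node("terminal", utility=u)


def expectiminimax(n: Node) -> float:
    if n.kind == "terminal":
        return n.utility
    values = [expectiminimax(s) for s in n.children]
    if n.kind == "max":
        return max(values)
    if n.kind == "min":
        return min(values)
    if n.kind == "chance":
        # Probability-weighted average over the chance outcomes.
        return sum(p * v for p, v in zip(n.probs, values))
    raise ValueError(f"unknown node kind: {n.kind!r}")


# Illustrative tree: MAX chooses between two chance nodes, each with two
# equally likely outcomes that lead to MIN nodes.
left = Node("chance", probs=[0.5, 0.5], children=[
    Node("min", children=[leaf(2), leaf(2)]),
    Node("min", children=[leaf(2), leaf(1)]),
])
right = Node("chance", probs=[0.5, 0.5], children=[
    Node("min", children=[leaf(0), leaf(2)]),
    Node("min", children=[leaf(1), leaf(1)]),
])
root = Node("max", children=[left, right])

print(expectiminimax(root))  # 1.5: left averages 2 and 1, right averages 0 and 1
```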

7 Pruning in Nondeterministic Game Trees. A version of α-β pruning is possible. [Figure: step-by-step interval bounds on a game tree with chance nodes. Every node starts at [-∞,+∞]; once the leaves 2, 2, 2, 1 have been seen, the first chance node is fixed at [1.5,1.5]; after the further leaves 0, 1, 1 the second chance node is bounded by [-∞,0.5].]

8 Pruning Contd. More pruning occurs if we can bound the leaf values. [Figure: the same kind of tree with every leaf value known to lie in [-2,2]. After the leaves 2, 2, 2, 1, 0 the first chance node is fixed at [1.5,1.5] and the second chance node is already bounded by [-2,1].]
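The effect of bounded leaf values can be sketched as follows. This is a simplified, one-sided version written for illustration, not the procedure animated on the slides: it only cuts off a chance node below a MAX root, and it evaluates each MIN child in full rather than pruning inside it. The tuple-based tree encoding, the function names, and the leaf values are assumptions.

```python
import math


def value(node, visited):
    """Plain expectiminimax; records every leaf it evaluates in `visited`."""
    if isinstance(node, (int, float)):
        visited.append(node)
        return float(node)
    kind, children = node
    if kind == "max":
        return max([value(c, visited) for c in children])
    if kind == "min":
        return min([value(c, visited) for c in children])
    return sum(p * value(c, visited) for p, c in children)  # chance node


def chance_value_or_cutoff(chance, alpha, hi, visited):
    """Evaluate a chance node; return None once it provably cannot beat alpha.

    `hi` is the known upper bound on any leaf value, so the outcomes not
    yet evaluated can contribute at most remaining_probability * hi.
    """
    _, outcomes = chance
    total, remaining = 0.0, 1.0
    for p, child in outcomes:
        total += p * value(child, visited)
        remaining -= p
        if total + remaining * hi <= alpha:
            return None
    return total


def best_move(root, hi, visited):
    """MAX root over chance nodes; returns (index of best move, its value)."""
    _, moves = root
    best, alpha = None, -math.inf
    for i, chance in enumerate(moves):
        v = chance_value_or_cutoff(chance, alpha, hi, visited)
        if v is not None and v > alpha:
            best, alpha = i, v
    return best, alpha


TREE = ("max", [
    ("chance", [(0.5, ("min", [2, 2])), (0.5, ("min", [2, 1]))]),
    ("chance", [(0.5, ("min", [0, 2])), (0.5, ("min", [1, 1]))]),
])

plain, pruned = [], []
print(value(TREE, plain), len(plain))            # 1.5 with all 8 leaves visited
print(best_move(TREE, 2, pruned), len(pruned))   # (0, 1.5) with 6 leaves visited
```

With leaves bounded above by 2, the second chance node is abandoned after its first outcome evaluates to 0, because 0.5 * 0 + 0.5 * 2 = 1 cannot exceed the 1.5 already secured.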

9 Digression: Exact Values DO Matter. Behavior is preserved only by a positive linear transformation of Eval. Hence Eval should be proportional to the expected payoff. [Figure: two game trees with chance nodes; in the first the move to A1 is best, in the second the move to A2 is best.]
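A small numeric sketch of this point, with made-up numbers: the two moves, their outcome probabilities (0.9 and 0.1), and the leaf values 1 to 4 are assumptions chosen for illustration. A positive linear transformation of the leaf values leaves the preferred move unchanged, while a transformation that merely preserves their order can flip it.

```python
# Each move leads to a chance node: a list of (probability, leaf value) pairs.
MOVES = {
    "A1": [(0.9, 2), (0.1, 3)],
    "A2": [(0.9, 1), (0.1, 4)],
}


def best_move(transform):
    """Return the move with the highest expected transformed leaf value."""
    def expected(move):
        return sum(p * transform(v) for p, v in MOVES[move])
    return max(MOVES, key=expected)


print(best_move(lambda v: v))          # A1: 2.1 versus 1.3
print(best_move(lambda v: 3 * v + 5))  # A1: positive linear, same choice
print(best_move(lambda v: v ** 4))     # A2: order-preserving but nonlinear
```

Under `v ** 4` the leaves keep their ordering (1 < 16 < 81 < 256), yet the expected values become 22.5 for A1 and 26.5 for A2, so the choice changes. With plain minimax and no chance nodes, only the ordering of leaf values would matter.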

10 Nondeterministic games in practice

11 Games of Imperfect Information
E.g., card games, where the opponent's initial cards are unknown.
Typically we can calculate a probability for each possible deal.
This seems just like having one big dice roll at the beginning of the game.
Idea: averaging over clairvoyancy
– compute the minimax value of each action for each possible deal of the cards
– choose the action with the highest expected value over all deals
Special case: if an action is optimal for all deals, it is optimal.
GIB (Ginsberg, 1999), the current best bridge program, approximates this idea by modifying averaging over clairvoyancy:
– generating 100 deals consistent with the bidding information
– picking the action that wins the most tricks on average
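The two steps of averaging over clairvoyancy can be sketched as below. This is only an illustration under stated assumptions: `minimax_value` stands in for a perfect-information solver that the caller must supply, and the deals, actions, and payoff table are invented toy data, not anything from GIB.

```python
def average_over_clairvoyance(actions, deals, minimax_value):
    """Pick the action with the highest expected perfect-information value.

    deals:          list of (probability, deal) pairs
    minimax_value:  hypothetical callable (action, deal) -> float giving the
                    minimax value of `action` when `deal` is fully visible
    """
    def expected(action):
        return sum(p * minimax_value(action, deal) for p, deal in deals)
    return max(actions, key=expected)


# Toy stand-in for a solver: a lookup table of perfect-information values.
TABLE = {
    ("safe", "deal1"): 1, ("safe", "deal2"): 1,
    ("risky", "deal1"): 3, ("risky", "deal2"): -2,
}
DEALS = [(0.5, "deal1"), (0.5, "deal2")]

choice = average_over_clairvoyance(
    ["safe", "risky"], DEALS, lambda action, deal: TABLE[(action, deal)]
)
print(choice)  # safe: expected 1.0 versus 0.5 for risky
```

A sampling variant in the spirit of the slide would replace `DEALS` with a fixed number of randomly generated deals, each weighted equally.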

12 Example: four-card bridge/whist/hearts hand, Max to play first

13 Proper analysis

14 α-β Algorithm

function ALPHA-BETA-SEARCH(state) returns an action
    inputs: state, current state in game
    v ← MAX-VALUE(state, −∞, +∞)
    return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state, α, β) returns a utility value
    inputs: state, current state in game
            α, the value of the best alternative for MAX along the path to state
            β, the value of the best alternative for MIN along the path to state
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← −∞
    for a, s in SUCCESSORS(state) do
        v ← MAX(v, MIN-VALUE(s, α, β))
        if v ≥ β then return v    // fail-high
        α ← MAX(α, v)
    return v

15 α-β Algorithm (cont.)

function MIN-VALUE(state, α, β) returns a utility value
    inputs: state, current state in game
            α, the value of the best alternative for MAX along the path to state
            β, the value of the best alternative for MIN along the path to state
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← +∞
    for a, s in SUCCESSORS(state) do
        v ← MIN(v, MAX-VALUE(s, α, β))
        if v ≤ α then return v    // fail-low
        β ← MIN(β, v)
    return v
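The α-β pseudocode on slides 14 and 15 translates fairly directly into Python. The sketch below assumes a toy representation that is not in the slides: a game tree is a nested list, and any non-list value is a terminal utility for MAX. Instead of searching for "the action with value v" afterwards, the root loop tracks the best move index as it goes.

```python
import math


def max_value(node, alpha, beta):
    if not isinstance(node, list):       # terminal test
        return node
    v = -math.inf
    for child in node:
        v = max(v, min_value(child, alpha, beta))
        if v >= beta:                    # fail-high: MIN will avoid this node
            return v
        alpha = max(alpha, v)
    return v


def min_value(node, alpha, beta):
    if not isinstance(node, list):       # terminal test
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child, alpha, beta))
        if v <= alpha:                   # fail-low: MAX will avoid this node
            return v
        beta = min(beta, v)
    return v


def alpha_beta_search(tree):
    """Return (index of the best root move, its value); MAX moves at the root.

    Assumes the root is an internal node (a non-empty list).
    """
    best_i, best_v, alpha = None, -math.inf, -math.inf
    for i, child in enumerate(tree):
        v = min_value(child, alpha, math.inf)
        if v > best_v:
            best_i, best_v = i, v
        alpha = max(alpha, v)
    return best_i, best_v


# Illustrative two-ply tree: MAX picks a MIN node, MIN picks a leaf.
print(alpha_beta_search([[3, 12, 8], [2, 4, 6], [14, 5, 2]]))  # (0, 3)
```

In this example the second MIN node is cut off after its first leaf (2 is already no better than the 3 MAX has secured), so the leaves 4 and 6 are never examined.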

16 Negamax (B. Chen, 2010)
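The slide names negamax without giving details, so the following is only a sketch of the commonly used formulation with α-β bounds, not necessarily the version presented in the cited source. It relies on the identity min(a, b) = −max(−a, −b), which lets one function serve both players. The nested-list tree encoding and the `color` parameter are assumptions made for the sketch.

```python
import math


def negamax(node, alpha=-math.inf, beta=math.inf, color=1):
    """Value of `node` from the viewpoint of the player to move.

    Leaves hold utilities from MAX's viewpoint; `color` is +1 when MAX is
    to move and -1 when MIN is to move.
    """
    if not isinstance(node, list):       # terminal node
        return color * node
    best = -math.inf
    for child in node:
        # The child's value is from the opponent's viewpoint, so negate it
        # and swap/negate the search window.
        best = max(best, -negamax(child, -beta, -alpha, -color))
        alpha = max(alpha, best)
        if alpha >= beta:                # cutoff, as in alpha-beta
            break
    return best


print(negamax([[3, 12, 8], [2, 4, 6], [14, 5, 2]]))  # 3
```

The single recursive function replaces the separate MAX-VALUE and MIN-VALUE routines, which is the usual reason for preferring this form.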

