Presentation is loading. Please wait.

Presentation is loading. Please wait.

Adversarial Search CMPT 463. When: Tuesday, April 5 3:30PM Where: RLC 105 Team based: one, two or three people per team Languages: Python, C++ and Java.

Similar presentations


Presentation on theme: "Adversarial Search CMPT 463. When: Tuesday, April 5 3:30PM Where: RLC 105 Team based: one, two or three people per team Languages: Python, C++ and Java."— Presentation transcript:

1 Adversarial Search CMPT 463

2 When: Tuesday, April 5 3:30PM Where: RLC 105 Team based: one, two or three people per team Languages: Python, C++ and Java IDEs: Python IDLE, Visual Studio, Eclipse, NetBeans Event Schedule 3:30 – 5:30 pm – Contest 5:30 pm – Award Ceremony 5:30 pm – Pizza Party Register your team online at http://goo.gl/forms/Ub65Df7pAe or in RLC 203. http://goo.gl/forms/Ub65Df7pAe Contact Dr. Tina Tian for questions.

3 Outline Game playing Game trees o Minimax o Alpha-beta pruning

4 Games vs. search problems competitive environments: agents’ goals are in conflict adversarial search problems (games) Games are interesting b/c: Time limits o Search tree of chess has about 10 154 nodes. o How to make the best possible use of time o Choose a good move when time is limited

5 Types of Games chess, checkers, go, othello backgammon, monopoly bridge, poker perfect information imperfect information deterministicchance

6 Games deterministic, fully-observable, turn-taking, two–player, zero-sum games o Utility values at the end are equal and opposite o Tic-tac-toe

7 Game Search Formulation Two players MAX and MIN take turns (with MAX playing first) S 0 : Player(s): Action(s): Result(s,a): Terminal-test(s): Utility(s,p):

8 Game Search Formulation S 0 : initial state Player(s): Action(s): Result(s,a): Terminal-test(s): Utility(s,p):

9 Game Search Formulation S 0 : initial state Player(s): which player has the move in a state Action(s): Result(s,a): Terminal-test(s): Utility(s,p):

10 Game Search Formulation S 0 : initial state Player(s): which player has the move in a state Action(s): set of legal moves in a state Result(s,a): Terminal-test(s): Utility(s,p):

11 Game Search Formulation S 0 : initial state Player(s): which player has the move in a state Action(s): set of legal moves in a state Result(s,a): transition model Terminal-test(s): Utility(s,p):

12 Game Search Formulation S 0 : initial state Player(s): which player has the move in a state Action(s): set of legal moves in a state Result(s,a): transition model Terminal-test(s): true/false (terminal states) Utility(s,p):

13 Game Search Formulation S 0 : initial state Player(s): which player has the move in a state Actions(s): set of legal moves in a state Result(s,a): transition model Terminal-test(s): true/false (terminal states) Utility(s,p): utility function defines the final value of a game that ends in terminal state s for a player p o zero-sum games: same total payoff

14 Game tree (1-player)

15

16 Partial Game Tree for Tic-Tac-Toe

17 Optimal strategies MAX uses search tree to determine next move. Assumption: Both players play optimally!! Given a game tree, the optimal strategy can be determined by using the minimax value of each node

18 Minimax The minimax value of a node is the utility (for Max) of being in the corresponding state, assuming that both players play optimally. Minimax(s) = o Utility (s)if Terminal-test(s) o max of Minimax(Result(s,a)) if Player(s) = Max o min of Minimax(Result(s,a))if Player(s) = Min

19 Optimal Play MAX MIN This is the optimal play 271 8 271 8 2 271 8 21 271 8 21 2 271 8

20 Two-Ply Game Tree

21

22

23 The minimax decision Minimax maximizes the worst-case outcome for max.

24 Minimax Tree MAX node MIN node f value value computed by minimax Minimax decision (backed up )

25 What if MIN does not play optimally? Definition of optimal play for MAX assumes MIN plays optimally: maximizes worst-case outcome for MAX. But if MIN does not play optimally, MAX can do even better.

26 Minimax Algorithm function MINIMAX-DECISION(state) returns an action inputs: state, current state in game v  MAX-VALUE(state) return the action in SUCCESSORS(state) with value v function MIN-VALUE(state) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v  ∞ for a,s in SUCCESSORS(state) do v  MIN(v,MAX-VALUE(s)) return v function MAX-VALUE(state) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v  -∞ for a,s in SUCCESSORS(state) do v  MAX(v,MIN-VALUE(s)) return v

27 Properties of minimax Complete? o Yes (if tree is finite) Optimal? o Yes (against an optimal opponent) Time complexity? o O(b m ) Space complexity? o O(bm) (depth-first exploration) o For chess, b ≈ 35, m ≈100 for "reasonable" games  exact solution is infeasible

28 Alpha-Beta Pruning Problem with minimax search: exponential in the depth of the tree Can we cut it in half? It is possible to compute the minimax decision without looking at every node. o pruning : eliminate some parts of the tree

29 Alpha-beta pruning We can improve on the performance of the minimax algorithm through alpha-beta pruning 271? MAX MIN

30 Alpha-beta pruning We can improve on the performance of the minimax algorithm through alpha-beta pruning 271? We don’t need to compute the value at this node. No matter what it is, it can’t affect the value of the root node. MAX MIN

31 Alpha-Beta Example [-∞, +∞] Range of possible values Do DFS until the first leaf

32 Alpha-Beta Example [-∞, +∞] Range of possible values Do DFS until first leaf

33 Alpha-Beta Example (continued) [-∞,3] [-∞,+∞]

34 Alpha-Beta Example (continued) [-∞,3] [-∞,+∞]

35 Alpha-Beta Example (continued) [3,3] [-∞,+∞]

36 Alpha-Beta Example (continued) [3,+∞] [3,3]

37 Alpha-Beta Example (continued) [-∞, ∞] [3,+∞] [3,3]

38 Alpha-Beta Example (continued) [-∞,2] [3,+∞] [3,3]

39 Alpha-Beta Example (continued) [-∞,2] [3,+∞] [3,3] This node is worse for MAX

40 Alpha-Beta Example (continued) [-∞,2] [3,14] [3,3][-∞, ∞],

41 Alpha-Beta Example (continued) [-∞,2] [3,14] [3,3][-∞,14],

42 Alpha-Beta Example (continued) [−∞,2] [3,5] [3,3][-∞,5],

43 Alpha-Beta Example (continued) [2,2] [−∞,2][3,3]

44 Alpha-Beta Example (continued) [2,2] [-∞,2] [3,3]

45 α-β pruning example Minimax(root) = max(min(3,12,8),min(2,x,y),min(14,5,2)) = max(3,min(2,x,y),2) = 3

46 α-β pruning We made the same minimax decision without ever evaluating two of the leaf nodes! o They are independent. It is possible to prune entire subtrees.

47 Why is it called α-β? α = value of the best choice found so far at any choice point along the path for max If v is worse than α, max will avoid it  prune that branch Define β similarly for min

48 Alpha-Beta Algorithm function ALPHA-BETA-SEARCH(state) returns an action inputs: state, current state in game v  MAX-VALUE(state, - ∞, + ∞ ) return the action in SUCCESSORS(state) with value v function MAX-VALUE(state, ,  ) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v  - ∞ for a,s in SUCCESSORS(state) do v  MAX(v,MIN-VALUE(s, ,  )) if v ≥  then return v   MAX( ,v) return v

49 Alpha-Beta Algorithm function MIN-VALUE(state, ,  ) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v  + ∞ for a,s in SUCCESSORS(state) do v  MIN(v,MAX-VALUE(s, ,  )) if v ≤  then return v   MIN( ,v) return v

50 Alpha-Beta Example 05-325-232-3033-501-3501-5532-35

51 Alpha-Beta Example 05-325-232-3033-501-3501-5532-35 1. 3. 4. 6. 5.8. 7. 9. 10. 11. 12. 13. 2.

52 Comments: Alpha-Beta Pruning Pruning does not affect the final results. Entire subtrees can be pruned. Good move ordering improves effectiveness of pruning. With “perfect ordering,” time complexity is O(b m/2 ) o Branching factor of sqrt(b) !! o Alpha-beta pruning can look twice as far as minimax in the same amount of time

53 Deterministic games in practice Checkers : Chinook ended 40-year-reign of human world champion Marion Tinsley in 1994. Used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions. Chess : Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply. Othello : Logistello defeated the human world champion. It is generally acknowledged that human are no match for computers at Othello.

54


Download ppt "Adversarial Search CMPT 463. When: Tuesday, April 5 3:30PM Where: RLC 105 Team based: one, two or three people per team Languages: Python, C++ and Java."

Similar presentations


Ads by Google