Games. P. Dunne, adapted from slides created by © 2000-2005 Franz Kurfess.


1 Single-Person Game
- conventional search problem
  - identify a sequence of moves that leads to a winning state
  - examples: Solitaire, dragons and dungeons, Rubik's cube
- little attention in AI
- some games can be quite challenging
  - some versions of solitaire
  - a heuristic for Rubik's cube was found by the Absolver program

2 Two-Person Game
- games with two opposing players
  - often called MAX and MIN
  - usually MAX moves first, then MIN
  - in game terminology, a move comprises one step (play) by each player
- typically you are MAX
- MAX wants a strategy to find a winning state
  - no matter what MIN does
- MAX must assume MIN does the same
  - or at least tries to prevent MAX from winning

3 Perfect Decisions
- optimal strategy for MAX
  - traverse all relevant parts of the search tree
    - this must include possible moves by MIN
  - identify a path that leads MAX to a winning state
- so MAX must work through all the possible moves (to a certain depth) from the current position and plan the best way forward to a win
- often impractical
  - time and space limitations

4 Building the game tree
[Figure: game tree under construction; levels alternate between Max's possible moves and Min's possible moves; the first leaf values shown are 4, 7, 9]
- nodes are discovered using DFS
- once leaf nodes are discovered, they are scored
- here Max is building a tree of possibilities
- which way should he play when the tree is finished?

5 Max-Min Example
[Figure: game tree. Terminal row: 4 7 9 6 9 8 8 5 6 7 5 2 3 2 5 4 9 3]
- terminal nodes: values calculated from the utility function

6 MiniMax Example
[Figure: game tree. Terminal row as before. Min row: 4 7 6 2 6 3 4 5 1 2 5 4 1 2 6 3 4 3]
- other nodes: values calculated via the minimax algorithm
- here the green (Min) nodes pick the minimum value from the nodes underneath

7 MiniMax Example
[Figure: game tree. Max row added: 7 6 5 5 6 4]
- other nodes: values calculated via the minimax algorithm
- here the red (Max) nodes pick the maximum value from the nodes underneath

8 MiniMax Example
[Figure: game tree. Min row added: 5 3 4]
- other nodes: values calculated via the minimax algorithm

9 MiniMax Example
[Figure: game tree. Root (Max) value added: 5]

10 MiniMax Example
[Figure: complete game tree with root value 5]
- moves by Max and countermoves by Min

11 MiniMax Observations
- the values of some of the leaf nodes are irrelevant for decisions at the next level
- this also holds for decisions at higher levels
- as a consequence, under certain circumstances, some parts of the tree can be disregarded
  - it is possible to still make an optimal decision without considering those parts

12 What is pruning?
- you don't have to look at every node in the tree
- pruning discards parts of the search tree
  - those that are guaranteed not to contain good moves
- results in substantial time and space savings
  - as a consequence, longer sequences of moves can be explored
  - the leftover part of the task may still be exponential, however

13 Alpha-Beta Pruning
- extension of the minimax approach
  - results in the same move as minimax, but with less overhead
  - prunes uninteresting parts of the search tree
- certain moves are not considered
  - they won't result in a better evaluation value than a move further up in the tree
  - they would lead to a less desirable outcome
- applies to moves by both players
  - α indicates the best choice for Max so far; it never decreases
  - β indicates the best choice for Min so far; it never increases
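
A minimal sketch of alpha-beta pruning, assuming the same hypothetical nested-list tree encoding as a plain minimax (leaves are integers, inner nodes are lists). The function name `alphabeta` and the test tree are illustrative assumptions, not taken from the slides.

```python
import math
from typing import List, Union

Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree, alpha: float, beta: float, maximizing: bool) -> float:
    """Minimax value of `node`, skipping branches that cannot matter.

    alpha: best value Max can already guarantee (never decreases).
    beta:  best value Min can already guarantee (never increases).
    """
    if isinstance(node, int):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:      # Min above would never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if beta <= alpha:          # Max above already has something better
            break
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # hypothetical example tree
print(alphabeta(tree, -math.inf, math.inf, True))  # 3, same as plain minimax
```

The result equals the plain minimax value; only the number of visited nodes changes.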

14 Note
[Figure: tree fragment with a node valued 5 and the nodes E, F and G]
- for the following example remember:
  - nodes are found with DFS
  - as a terminal or leaf node (or a temporary terminal node at a certain depth) is found, a utility function gives it a score
- so do we need to evaluate F and G once we find E? Can we prune the tree?

15 Alpha-Beta Example 1
[Figure: Max root with local range [-∞, +∞]; first Min successor with local range [-∞, +∞]]
- Step 1: we expand the tree a little
- we assume a depth-first, left-to-right search as the basic strategy
- the range ([-∞, +∞]) of the possible values for each node is indicated
  - initially the local values [-∞, +∞] reflect the values of the sub-trees in that node from Max's or Min's perspective
  - since we haven't expanded anything yet, they are infinite
- the global values α and β are the best overall choices so far for Max or Min
- global values: α (best choice for Max) = ?; β (best choice for Min) = ?

16 Alpha-Beta Example 2
[Figure: Max root [-∞, +∞]; Min node [-∞, 7] with first leaf 7]
- we evaluate a node
- Min obtains the first value from a successor node
- global values: α (best choice for Max) = ?; β (best choice for Min) = 7
- @ Min node: if value of node < value of parent then abandon ... NO, because there is no α yet

17 Alpha-Beta Example 3
[Figure: Max root [-∞, +∞]; Min node [-∞, 6] with leaves 7, 6]
- Min obtains the second value from a successor node
- global values: α (best choice for Max) = ?; β (best choice for Min) = 6
- @ Min node: if value of node < value of parent then abandon ... NO

18 Alpha-Beta Example 4
[Figure: Max root [5, +∞]; Min node with exact value 5 and leaves 7, 6, 5]
- Min obtains the third value from a successor node
- this is the last value from this sub-tree, and the exact value is known
- Min is finished on this branch
- Max now has a value for its first successor node, but hopes that something better might still come
- global values: α (best choice for Max) = 5; β (best choice for Min) = 5
- @ Min node: if value of node < value of parent then abandon ... no more nodes

19 Alpha-Beta Example 5
[Figure: Max root [5, +∞]; first Min node 5 (leaves 7, 6, 5); second Min node [-∞, 3] with first leaf 3]
- Min continues with the next sub-tree, and gets a better value
- Max has a better choice from its perspective (the '5') and should not consider a move into the sub-tree currently being explored by Min
- global values: α (best choice for Max) = 5; β (best choice for Min) = 3
- @ Min node: if value of node < value of parent then abandon ... YES!

20 Alpha-Beta Example 6
[Figure: as on slide 19, with the remaining successors of the second Min node marked as pruned]
- Max won't consider a move to this sub-tree; it would choose 5 first => abandon expansion
- this is a case of pruning, marked in the figure
- pruning means that we won't do a DFS into these nodes
  - no need to use the utility function to calculate the nodes' values
  - no need to explore that part of the tree further
- global values: α (best choice for Max) = 5; β (best choice for Min) = 3
- @ Min node: if value of node < value of parent then abandon ... YES!

21 Alpha-Beta Example 7
[Figure: Max root [5, +∞]; Min nodes 5, [-∞, 3] and [-∞, 6]; the third Min node has first leaf 6]
- Min explores the next sub-tree, and finds a value that is worse than the other nodes at this level
- Min knows this '6' is good for Max and will then evaluate the leaf nodes looking for a lower value (if it exists)
- if Min is not able to find something lower, then Max will choose this branch
- global values: α (best choice for Max) = 5; β (best choice for Min) = 3
- @ Min node: if value of node < value of parent then abandon ... NO

22 Alpha-Beta Example 8
[Figure: Max root [5, +∞]; Min nodes 5, [-∞, 3] and [-∞, 5]; the third Min node has leaves 6, 5]
- Min is lucky, and finds a value that is the same as the current worst value at this level
- Max can choose this branch, or the other branch with the same value
- global values: α (best choice for Max) = 5; β (best choice for Min) = 3
- @ Min node: if value of node < value of parent then abandon ... YES! (the value only equals the parent's 5 here, so this cut treats a tie as "no improvement for Max")

23 Alpha-Beta Example 9
[Figure: Max root with value 5; Min nodes 5, [-∞, 3] and [-∞, 5]]
- Min could continue searching this sub-tree to see if there is a value that is less than the current worst alternative, in order to give Max as few choices as possible
  - this depends on the specific implementation
- Max knows the best value for its sub-tree
- global values: α (best choice for Max) = 5; β (best choice for Min) = 3

24 Alpha-Beta Example Overview
[Figure: Max root 5; Min nodes 5 (leaves 7, 6, 5), <= 3 (leaves 3, 6, 4) and <= 5 (leaves 6, 5, 4)]
- some branches can be pruned because they would never be considered
  - after looking at one branch, Max already knows that they will not be of interest, since Min would choose a value that is less than what Max already has at its disposal
- global values: α (best choice for Max) = 5; β (best choice for Min) = 3
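
The walkthrough above can be replayed with a small tracing version of alpha-beta that records which leaves are actually scored. This is a sketch under assumptions: the leaf values (7, 6, 5), (3, 6, 4) and (6, 5, 4) are read off the overview figure, and the function name `alphabeta_trace` and its `strict` flag are hypothetical. The flag illustrates the "depends on the specific implementation" remark: with a tie counted as a cut the third sub-tree stops after its '5', with a strict test it continues to the '4'.

```python
import math


def alphabeta_trace(node, alpha, beta, maximizing, visited, strict=False):
    """Alpha-beta over a nested-list tree; scored leaves are appended to `visited`."""
    if isinstance(node, int):
        visited.append(node)           # the utility function is applied here
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        score = alphabeta_trace(child, alpha, beta, not maximizing, visited, strict)
        if maximizing:
            value = max(value, score)
            alpha = max(alpha, value)
        else:
            value = min(value, score)
            beta = min(beta, value)
        # Cut when the other player already has an alternative at least as good.
        cut = alpha > beta if strict else alpha >= beta
        if cut:
            break
    return value


example = [[7, 6, 5], [3, 6, 4], [6, 5, 4]]   # Max root, three Min nodes

seen = []
print(alphabeta_trace(example, -math.inf, math.inf, True, seen))  # 5
print(seen)          # [7, 6, 5, 3, 6, 5]

seen_strict = []
alphabeta_trace(example, -math.inf, math.inf, True, seen_strict, strict=True)
print(seen_strict)   # [7, 6, 5, 3, 6, 5, 4]
```

Either way the root value is 5 and the second sub-tree is abandoned after its first leaf.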

25 Properties of Alpha-Beta Pruning
- in the ideal case, the best successor node is examined first
  - alpha-beta can then look ahead twice as far as minimax
- assumes an idealized tree model
  - uniform branching factor, path length
  - random distribution of leaf evaluation values
- requires additional information for good players
  - game-specific background knowledge
  - empirical data
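
The "twice as far" claim can be made concrete with the standard leaf counts for the idealized model above. The symbols b (uniform branching factor) and d (uniform depth in plies) are introduced here for illustration, and the second formula holds only under perfect move ordering.

```latex
\[
N_{\text{minimax}}(b, d) = b^{d},
\qquad
N_{\alpha\beta}^{\text{best}}(b, d)
  = b^{\lceil d/2 \rceil} + b^{\lfloor d/2 \rfloor} - 1 .
\]
% Example: b = 10, d = 6 gives 10^6 = 1,000,000 leaves for minimax,
% but 10^3 + 10^3 - 1 = 1,999 leaves for alpha-beta in the best case.
```

Since the best case grows roughly like b^(d/2), the same leaf budget reaches about twice the depth.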

26 Imperfect Decisions
- does this mean I have to expand the tree all the way?
- complete search is impractical for most games
- alternative: search the tree only to a certain depth
  - requires a cutoff test to determine where to stop

27 Evaluation Function
- if I stop part of the way down, how do I score these nodes (which are not terminal nodes)?
- use an evaluation function to score these nodes
- must be consistent with the utility function
  - values for terminal nodes (or at least their order) must be the same
- tradeoff between accuracy and time cost
  - without time limits, minimax could be used
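
A sketch of how a cutoff test and an evaluation function fit into minimax. Everything game-specific here is a hypothetical stand-in chosen to keep the example short: a one-pile take-away game (remove 1 to 3 stones, whoever takes the last stone wins), with the helper names `nim_successors`, `nim_utility` and `nim_evaluate` invented for this illustration. The estimates are ±0.5 so that they order states the same way as the exact utilities ±1.

```python
def depth_limited_minimax(state, depth, max_to_move,
                          successors, is_terminal, utility, evaluate):
    """Minimax that stops at terminal states or when the depth budget is used up."""
    if is_terminal(state):
        return utility(state, max_to_move)     # exact score
    if depth == 0:                             # cutoff test
        return evaluate(state, max_to_move)    # estimated score
    values = [
        depth_limited_minimax(s, depth - 1, not max_to_move,
                              successors, is_terminal, utility, evaluate)
        for s in successors(state)
    ]
    return max(values) if max_to_move else min(values)


# --- hypothetical toy game: state = number of stones left ---
def nim_successors(n):
    return [n - k for k in (1, 2, 3) if n - k >= 0]


def nim_is_terminal(n):
    return n == 0


def nim_utility(n, max_to_move):
    # Pile is empty: the player who just moved took the last stone and won.
    return -1.0 if max_to_move else 1.0


def nim_evaluate(n, max_to_move):
    # Heuristic: the player to move is in trouble when n is a multiple of 4.
    mover_loses = n % 4 == 0
    if max_to_move:
        return -0.5 if mover_loses else 0.5
    return 0.5 if mover_loses else -0.5


def nim_value(pile, depth):
    return depth_limited_minimax(pile, depth, True, nim_successors,
                                 nim_is_terminal, nim_utility, nim_evaluate)


print(nim_value(5, 10))  # 1.0  (deep enough to reach terminal states)
print(nim_value(5, 1))   # 0.5  (cut off after one ply, estimate used)
```

With a generous depth the search returns exact utilities; with a shallow depth it returns estimates, which is the accuracy versus time tradeoff named on the slide.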

28 Example: Tic-Tac-Toe
- simple evaluation function E(s) = (rx + cx + dx) - (ro + co + do)
  - r, c, d are the numbers of rows, columns and diagonal lines available for a win for X (or O) in that state; x and o are the pieces of the two players
- 1-ply lookahead
  - start at the top of the tree
  - evaluate all 9 choices for player 1
  - pick the maximum E-value
- 2-ply lookahead
  - also looks at the opponent's possible moves
  - assuming that the opponent picks the minimum E-value

29 Tic-Tac-Toe 1-Ply
[Figure: the nine boards reachable by X's first move, each labelled with its E-value]
- E(s1:1) = 8 - 5 = 3; E(s1:2) = 8 - 6 = 2; E(s1:3) = 8 - 5 = 3
- E(s1:4) = 8 - 6 = 2; E(s1:5) = 8 - 4 = 4; E(s1:6) = 8 - 6 = 2
- E(s1:7) = 8 - 5 = 3; E(s1:8) = 8 - 6 = 2; E(s1:9) = 8 - 5 = 3
- E(s0) = Max{E(s1:1), ..., E(s1:n)} = Max{2, 3, 4} = 4
- notation: E(s1:1) is E of the state at depth level '1', number '1'
- simple evaluation function: E(s1:1) = (rx + cx + dx) - (ro + co + do)
  - 'X' has (3 rows + 3 cols + 2 diags); 'O' has (2r + 2c + 1d)
  - E(s1:1) = 8 - 5 = 3
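
The 1-ply values can be reproduced with a short sketch. It assumes a board stored as a list of nine cells in row-major order (so index 0 is s1:1 and index 4 is the centre, s1:5) and reads "available for a win" as "contains no opposing piece"; the names `LINES`, `open_lines`, `evaluate` and `one_ply_scores` are illustrative.

```python
# The 8 winning lines: 3 rows, 3 columns, 2 diagonals (cells indexed 0..8).
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
]


def open_lines(board, player):
    """Number of lines still available to `player` (no opposing piece on them)."""
    opponent = "O" if player == "X" else "X"
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))


def evaluate(board):
    """E(s) = lines open for X minus lines open for O."""
    return open_lines(board, "X") - open_lines(board, "O")


def one_ply_scores():
    """E-value of each of X's nine possible first moves."""
    scores = []
    for cell in range(9):
        board = [" "] * 9
        board[cell] = "X"
        scores.append(evaluate(board))
    return scores


scores = one_ply_scores()
print(scores)        # [3, 2, 3, 2, 4, 2, 3, 2, 3]
print(max(scores))   # 4, reached by the centre move
```

Corners score 3, edges 2 and the centre 4, matching the values on the slide.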

30 Tic-Tac-Toe 2-Ply
[Figure: 2-ply tree of X's first moves and O's replies, each board labelled with its E-value]
- level 1 (X moves): E(s1:1) = 8 - 5 = 3; E(s1:2) = 8 - 6 = 2; E(s1:3) = 8 - 5 = 3; E(s1:4) = 8 - 6 = 2; E(s1:5) = 8 - 4 = 4; E(s1:6) = 8 - 6 = 2; E(s1:7) = 8 - 5 = 3; E(s1:8) = 8 - 6 = 2; E(s1:9) = 8 - 5 = 3
- E(s0) = Max{E(s1:1), ..., E(s1:n)} = Max{2, 3, 4} = 4
- level 2, replies where O keeps 5 lines (X in a corner): E(s2:1) = 6 - 5 = 1; E(s2:2) = 5 - 5 = 0; E(s2:3) = 6 - 5 = 1; E(s2:4) = 4 - 5 = -1; E(s2:5) = 6 - 5 = 1; E(s2:6) = 5 - 5 = 0; E(s2:7) = 6 - 5 = 1; E(s2:8) = 5 - 5 = 0
- level 2, replies where O keeps 6 lines (X on an edge): E(s2:9) to E(s2:16), each 5 - 6 = -1
- level 2, replies where O keeps 4 lines (X in the centre): E(s2:41) = 5 - 4 = 1; E(s2:42) = 6 - 4 = 2; E(s2:43) = 5 - 4 = 1; E(s2:44) = 6 - 4 = 2; E(s2:45) = 6 - 4 = 2; E(s2:46) = 5 - 4 = 1; E(s2:47) = 6 - 4 = 2; E(s2:48) = 5 - 4 = 1
- note: for 2-ply we must expand all nodes in the 1st level, but as scores of 2 and 3 repeat a few times we only expand one of each

31 Tic-Tac-Toe 2-Ply
[Figure: the same 2-ply tree and E-values as on slide 30]
- it seems that the centre X (i.e. a move to s1:5) is the best, because the lowest value among its successor nodes is 1 (as opposed to 0 or, worse, -1)
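
The 2-ply conclusion can be checked with a sketch that scores every O reply to every X opening and keeps the opening whose worst reply is best. It assumes a row-major board of nine cells (index 4 is the centre) and the "lines with no opposing piece" reading of E(s); the function names are illustrative, and only the centre result is asserted here.

```python
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
    (0, 4, 8), (2, 4, 6),              # diagonals
]


def evaluate(board):
    """E(s): lines without an O minus lines without an X."""
    x_open = sum(all(board[i] != "O" for i in line) for line in LINES)
    o_open = sum(all(board[i] != "X" for i in line) for line in LINES)
    return x_open - o_open


def two_ply_best_move():
    """Return (cell, value): X's opening that maximizes the minimum E over O's replies."""
    best_move, best_value = None, None
    for x_cell in range(9):
        board = [" "] * 9
        board[x_cell] = "X"
        replies = []
        for o_cell in range(9):
            if board[o_cell] == " ":
                after_reply = board.copy()
                after_reply[o_cell] = "O"
                replies.append(evaluate(after_reply))
        worst_case = min(replies)          # O picks the minimum E-value
        if best_value is None or worst_case > best_value:
            best_move, best_value = x_cell, worst_case
    return best_move, best_value


print(two_ply_best_move())  # (4, 1): the centre, with worst-case value 1
```

This is minimax restricted to two plies: a min over O's replies nested inside a max over X's openings.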

32 Checkers Case Study
[Figure: checkers board with playable squares numbered 1 to 32]
- initial board configuration
  - Black: single on 20, single on 21, king on 31
  - Red: single on 23, king on 22
- evaluation function E(s) = (5 x1 + x2) - (5 r1 + r2), where x1 = black king advantage, x2 = black single advantage, r1 = red king advantage, r2 = red single advantage
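
A small sketch of this evaluation function. It assumes that each "advantage" term is simply the number of pieces of that kind on the board, which is the reading suggested by the worked value (5 + 2) - (0 + 1) = 6 on the next slide; the `PieceCount` class and `evaluate` name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PieceCount:
    """Piece counts for one board state (assumed meaning of x1, x2, r1, r2)."""
    black_kings: int     # x1
    black_singles: int   # x2
    red_kings: int       # r1
    red_singles: int     # r2


def evaluate(p: PieceCount) -> int:
    """E(s) = (5*x1 + x2) - (5*r1 + r2); kings are weighted five times a single."""
    return (5 * p.black_kings + p.black_singles) - (5 * p.red_kings + p.red_singles)


initial = PieceCount(black_kings=1, black_singles=2, red_kings=1, red_singles=1)
after_capture = PieceCount(black_kings=1, black_singles=2, red_kings=0, red_singles=1)

print(evaluate(initial))        # 1
print(evaluate(after_capture))  # 6, Black has taken the red king
```

Positive values favour Black (MAX) and negative values favour Red (MIN).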

33 Part 1: MiniMax using DFS
[Figure: checkers board and the first branch of the game tree. MAX moves: 20 -> 16, 21 -> 17, 31 -> 26, 31 -> 27. MIN move: 22 -> 17. MAX move: 21 -> 14 (takes red king), leaf value 6]
- typically you expand DFS-style to a certain depth and then evaluate the node
- E(s) = (5 x1 + x2) - (5 r1 + r2) = (5 + 2) - (0 + 1) = 6

34 Full tree
[Figure: checkers board and the game tree, levels labelled MAX and MIN. First-level MAX moves: 20 -> 16, 21 -> 17, 31 -> 26, 31 -> 27. Second-level MIN moves: 22 -> 17, 22 -> 18, 22 -> 25, 22 -> 26, 23 -> 26, 23 -> 27. Further move labels in the figure: 21 -> 14, 16 -> 11, 31 -> 27, 31 -> 24, 22 -> 13, 22 -> 31, 23 -> 30, 23 -> 32, 20 -> 16, 31 -> 26, 21 -> 17]
- we fast-forward and see the full tree to a certain depth

35 Scoring the leaves
[Figure: the same tree with leaf scores filled in; the values shown include 1, 2, 6, 0, -4 and -8]
- then we score leaf nodes

36 Propagating the scores
[Figure: the same tree with values on the inner nodes]
- then we propagate Min-Max scores upwards

37 Could we make this a little easier?
- this tree will really expand some considerable distance
- what if we introduce a little pruning?
- OK then ... but first we have to go back to the start

38 Checkers Min-Max with α-β Pruning
[Figure: first branch of the tree. MAX moves: 20 -> 16, 21 -> 17, 31 -> 26, 31 -> 27. MIN move: 22 -> 17. MAX move: 21 -> 14, value 6]
- temporarily propagate values up

39 Checkers Alpha-Beta Example
[Figure: partial tree, levels labelled MAX and MIN. MIN moves 22 -> 17 and 22 -> 18; MAX replies 21 -> 14, 16 -> 11 and 31 -> 27; node values 6 and 1]
- @ Max node: if value of node > value of parent then abandon ... No! (for each successor in turn)
- alas, no pruning below this Max node
- next node is expanded

40 Checkers Alpha-Beta Example
[Figure: checkers board and partial tree; MIN move 22 -> 25 added, with MAX replies 16 -> 11 and 31 -> 27; node values 6 and 1]
- notice the new value propagated up
- @ Max node: if value of node > value of parent then abandon ... No!
- so it's in the interest of the Max node to see if it can do better ... No! (and so on for each successor)

41 Checkers Alpha-Beta Example
[Figure: checkers board and the same partial tree, with the remaining successors marked as pruned]
- it's not in MAX's interest to bother expanding these nodes, as they have the same value as 16 -> 11, so we prune them

42 Checkers Alpha-Beta Example
[Figure: checkers board and partial tree; MIN move 22 -> 26 added, with MAX replies 31 -> 22 and 16 -> 11; node values 6 and 1]

43 Checkers Alpha-Beta Example
[Figure: partial tree; MIN move 23 -> 26 added, with MAX reply 31 -> 27; node values 6 and 1]

44 Search Limits
- search must be cut off because of time or space limitations
- strategies like depth-limited or iterative deepening search can be used
  - they don't take advantage of knowledge about the problem
- more refined strategies apply background knowledge
  - quiescence search
    - cut off only parts of the search space that don't exhibit big changes in the evaluation function

45 Games and Computers
- state of the art for some game programs
  - Chess
  - Checkers
  - Othello
  - Backgammon
  - Go

46 Chess
- Deep Blue, a special-purpose parallel computer, defeated the world champion Garry Kasparov in 1997
  - the human player didn't show his best game
  - some claim that the circumstances were questionable
  - Deep Blue used a massive database with games from the literature
- Fritz, a program running on an ordinary PC, held the world champion Vladimir Kramnik to a draw in an eight-game match in 2002
  - top programs and top human players are roughly equal

47 Checkers
- Arthur Samuel develops a checkers program in the 1950s that learns its own evaluation function
  - reaches an expert-level stage in the 1960s
- Chinook becomes world champion in 1994
  - human opponent, Dr. Marion Tinsley, withdraws for health reasons
    - Tinsley had been the world champion for 40 years
  - Chinook uses off-the-shelf hardware, alpha-beta search, and an endgame database for six-piece positions

48 Othello
- Logistello defeated the human world champion in 1997
- many programs play far better than humans
  - smaller search space than chess
  - little evaluation expertise available

49 Backgammon
- TD-Gammon, a neural-network-based program, ranks among the best players in the world
  - improves its own evaluation function through learning techniques
  - search-based methods are practically hopeless
    - chance elements, branching factor

