Download presentation
Presentation is loading. Please wait.
1
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 1 Single-Person Game conventional search problem identify a sequence of moves that leads to a winning state examples: Solitaire, dragons and dungeons, Rubik’s cube little attention in AI some games can be quite challenging some versions of solitaire a heuristic for Rubik’s cube was found by the Absolver program
2
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 2 Two-Person Game games with two opposing players often called MAX and MIN usually MAX moves first, then min in game terminology, a move comprises one step, or play by each player Typically you are Max MAX wants a strategy to find a winning state no matter what MIN does Max must assume MIN does the same or at least tries to prevent MAX from winning
3
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 3 Perfect Decisions Optimal strategy for MAX traverse all relevant parts of the search tree this must include possible moves by MIN identify a path that leads MAX to a winning state So MAX must do some work to estimate all the possible moves (to a certain depth) from the current position and try to plan the best way forward such that he will win often impractical time and space limitations
4
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 4 479 Nodes are discovered using DFS Once leaf nodes are discovered they are scored Here Max is building a tree of possibilities Which way should he play when the tree is finished? Max’s possible moves Mins’s possible moves Max’s moves Mins’s moves
5
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 5 Max-Min Example 479698856752325493 terminal nodes: values calculated from the utility function
6
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 6 MiniMax Example Min 479698856752325493 476263451254126343 other nodes: values calculated via minimax algorithm Here the green nodes pick the minimum value from the nodes underneath
7
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 7 MiniMax Example Max Min 479698856752325493 476263451254126343 765564 other nodes: values calculated via minimax algorithm Here the red nodes pick the maximum value from the nodes underneath
8
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 8 MiniMax Example Max Min 479698856752325493 476263451254126343 765564 534 other nodes: values calculated via minimax algorithm
9
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 9 MiniMax Example Max Min 479698856752325493 476263451254126343 765564 534 5
10
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 10 MiniMax Example Max Min 479698856752325493 476263451254126343 765564 534 5 moves by Max and countermoves by Min
11
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 11 MiniMax Observations the values of some of the leaf nodes are irrelevant for decisions at the next level this also holds for decisions at higher levels as a consequence, under certain circumstances, some parts of the tree can be disregarded it is possible to still make an optimal decision without considering those parts
12
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 12 What is pruning? You don’t have to look at every node in the tree discards parts of the search tree That are guaranteed not to contain good moves results in substantial time and space savings as a consequence, longer sequences of moves can be explored the leftover part of the task may still be exponential, however
13
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 13 Alpha-Beta Pruning extension of the minimax approach results in the same move as minimax, but with less overhead prunes uninteresting parts of the search tree certain moves are not considered won’t result in a better evaluation value than a move further up in the tree they would lead to a less desirable outcome applies to moves by both players indicates the best choice for Max so far never decreases indicates the best choice for Min so far never increases
14
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 14 Note For the following example remember: Nodes are found with DFS As a terminal or leaf node (or at a temporary terminal node at a certain depth) is found a utility function gives it a score So do we need to evaluate F and G once we find E? Can we prune the tree? 5 E FG
15
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 15 Alpha-Beta Example 1 Max Min [-∞, +∞] 5 Step 1 --- we expand the tree a little we assume a depth-first, left-to-right search as basic strategy the range ( [-∞, +∞] ) of the possible values for each node are indicated initially the local values [-∞, +∞] reflect the values of the sub-trees in that node from Max’s or Min’s perspective; Since we haven’t expanded they are infinite the global values and are the best overall choices so far for Max or Min [-∞, +∞] best choice for Max? best choice for Min? Global Values Local Values
16
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 16 Alpha-Beta Example 2 Max Min 7 [-∞, 7] 5 We evaluate a node Min obtains the first value from a successor node [-∞, +∞] best choice for Max? best choice for Min7 @ min node if value of node < value of parent then abandon. …NO because no yet
17
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 17 Alpha-Beta Example 3 Max Min 76 [-∞, 6] 5 Min obtains the second value from a successor node [-∞, +∞] best choice for Max? best choice for Min6 @ min node if value of node < value of parent then abandon. …NO
18
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 18 Alpha-Beta Example 4 Max Min 765 5 5 [5, +∞] best choice for Max5 best choice for Min5 Min obtains the third value from a successor node this is the last value from this sub-tree, and the exact value is known Min is finished on this branch Max now has a value for its first successor node, but hopes that something better might still come @ min node if value of node < value of parent then abandon. …No more nodes
19
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 19 Alpha-Beta Example 5 Max Min 765 5 5 Min continues with the next sub-tree, and gets a better value Max has a better choice from its perspective (the ‘5’) and should not consider a move into the sub-tree currently being explored by Min 3 [5, +∞] best choice for Max5 best choice for Min3 [-∞, 3] @ min node if value of node < value of parent then abandon. …YES!
20
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 20 Alpha-Beta Example 6 Max Min 765 5 5 Max won’t consider a move to this sub-tree it would choose 5 first =>abandon expansion this is a case of pruning, indicated by Pruning means that we won’t do a DFS into these nodes No need to use the utility function to calculate the nodes value No need to explore that part of the tree further. 3 [5, +∞] best choice for Max5 best choice for Min3 [-∞, 3] @ min node if value of node < value of parent then abandon. …YES!
21
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 21 Alpha-Beta Example 7 Max Min 7656 5 Min explores the next sub-tree, and finds a value that is worse than the other nodes at this level Min knows this ‘6’ is good for Max and will then evaluate the leaf nodes looking for a lower value (if it exists) if Min is not able to find something lower, then Max will choose this branch 3 best choice for Max5 best choice for Min3 5 [5, +∞] [-∞, 3][-∞, 6] @ min node if value of node < value of parent then abandon. …NO
22
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 22 Alpha-Beta Example 8 Max Min 7656 5 Min is lucky, and finds a value that is the same as the current worst value at this level Max can choose this branch, or the other branch with the same value 3 best choice for Max5 best choice for Min3 5 [5, +∞] [-∞, 3][-∞, 5] 5 @ min node if value of node < value of parent then abandon. …YES!
23
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 23 Alpha-Beta Example 9 Max Min 7656 5 Min could continue searching this sub-tree to see if there is a value that is less than the current worst alternative in order to give Max as few choices as possible this depends on the specific implementation Max knows the best value for its sub-tree 3 best choice for Max5 best choice for Min3 5 5 [-∞, 3][-∞, 5] 5
24
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 24 Alpha-Beta Example Overview Max Min 765654 5<=5 5 some branches can be pruned because they would never be considered after looking at one branch, Max already knows that they will not be of interest since Min would choose a value that is less than what Max already has at its disposal 364 <= 3 5 best choice for Max5 best choice for Min3
25
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 25 Properties of Alpha-Beta Pruning in the ideal case, the best successor node is examined first alpha-beta can look ahead twice as far as minimax assumes an idealized tree model uniform branching factor, path length random distribution of leaf evaluation values requires additional information for good players game-specific background knowledge empirical data
26
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 26 Imperfect Decisions Does this mean I have to expand the tree all the way??? complete search is impractical for most games alternative: search the tree only to a certain depth requires a cutoff-test to determine where to stop
27
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 27 Evaluation Function If I stop part of the way down how do I score these nodes (which are not terminal nodes!) Use an evaluation function to score these nodes! must be consistent with the utility function values for terminal nodes (or at least their order) must be the same tradeoff between accuracy and time cost without time limits, minimax could be used
28
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 28 Example: Tic-Tac-Toe simple evaluation function E(s) = (rx + cx + dx) - (ro + co + do) where r,c,d are the number of rows, columns and diagonals lines available for a win for X (or O) in that state; x and o are the pieces of the two players 1-ply lookahead start at the top of the tree evaluate all 9 choices for player 1 pick the maximum E-value 2-ply lookahead also looks at the opponents possible move assuming that the opponents picks the minimum E-value
29
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 29 E(s1:2) 8 - 6 = 2 E(s1:3) 8 - 5 = 3 E(s1:4) 8 - 6 = 2 E(s1:5) 8 - 4 = 4 E(s1:6) 8 - 6 = 2 E(s1:7) 8 - 5 = 3 E(s1:8) 8 - 6 = 2 E(s1:9) 8 - 5 = 3 Tic-Tac-Toe 1-Ply XXX XXX XXX E(s1:1) 8 - 5 = 3 E(s0) = Max{E(s11), E(s1n)} = Max{2,3,4} = 4 E(s1:1) == E of state @ depth-level ‘1’ number ‘1’ Simple evaluation function E(s1:1) = (rx + cx + dx) - (ro + co + do) E(s1:1) = ‘X’ has (3rows + 3 cols + 2 diags) – ‘O’ has (2r + 2c +1d) E(s11) = 8 -5 = 3
30
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 30 E(s2:16) 5 - 6 = -1 E(s2:15) 5 -6 = -1 E(s28) 5 - 5 = 0 E(s27) 6 - 5 = 1 E(s2:48) 5 - 4 = 1 E(s2:47) 6 - 4 = 2 E(s2:13) 5 - 6 = -1 E(s2:9) 5 - 6 = -1 E(s2:10) 5 -6 = -1 E(s2:11) 5 - 6 = -1 E(s2:12) 5 - 6 = -1 E(s2:14) 5 - 6 = -1 E(s25) 6 - 5 = 1 E(s21) 6 - 5 = 1 E(s22) 5 - 5 = 0 E(s23) 6 - 5 = 1 E(s24) 4 - 5 = -1 E(s26) 5 - 5 = 0 E(s1:6) 8 - 6 = 2 E(s1:7) 8 - 5 = 3 E(s1:8) 8 - 6 = 2 E(s1:9) 8 - 5 = 3 E(s1:5) 8 - 4 = 4 E(s1:3) 8 - 5 = 3 E(s1:2) 8 - 6 = 2 E(s1:1) 8 - 5 = 3 E(s2:45) 6 - 4 = 2 Tic-Tac-Toe 2-Ply XXX XXX XXX E(s0) = Max{E(s11), E(s1n)} = Max{2,3,4} = 4 E(s1:4) 8 - 6 = 2 XOX O X O E(s2:41) 5 - 4 = 1 E(s2:42) 6 - 4 = 2 E(s2:43) 5 - 4 = 1 E(s2:44) 6 - 4 = 2 E(s2:46) 5 - 4 = 1 OX O X O X O XX O X O X O X O XX O XOOXX O X O X O X O X O X O XOXOX O O Note: For 2-Ply we must expand all nodes in the 1 st level but as scores of 2 and 3 repeat a few times we only expand one of each.
31
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 31 E(s2:16) 5 - 6 = -1 E(s2:15) 5 -6 = -1 E(s28) 5 - 5 = 0 E(s27) 6 - 5 = 1 E(s2:48) 5 - 4 = 1 E(s2:47) 6 - 4 = 2 E(s2:13) 5 - 6 = -1 E(s2:9) 5 - 6 = -1 E(s2:10) 5 -6 = -1 E(s2:11) 5 - 6 = -1 E(s2:12) 5 - 6 = -1 E(s2:14) 5 - 6 = -1 E(s25) 6 - 5 = 1 E(s21) 6 - 5 = 1 E(s22) 5 - 5 = 0 E(s23) 6 - 5 = 1 E(s24) 4 - 5 = -1 E(s26) 5 - 5 = 0 E(s1:6) 8 - 6 = 2 E(s1:7) 8 - 5 = 3 E(s1:8) 8 - 6 = 2 E(s1:9) 8 - 5 = 3 E(s1:5) 8 - 4 = 4 E(s1:3) 8 - 5 = 3 E(s1:2) 8 - 6 = 2 E(s1:1) 8 - 5 = 3 E(s2:45) 6 - 4 = 2 Tic-Tac-Toe 2-Ply XXX XXX XXX E(s0) = Max{E(s11), E(s1n)} = Max{2,3,4} = 4 E(s1:4) 8 - 6 = 2 XOX O X O E(s2:41) 5 - 4 = 1 E(s2:42) 6 - 4 = 2 E(s2:43) 5 - 4 = 1 E(s2:44) 6 - 4 = 2 E(s2:46) 5 - 4 = 1 OX O X O X O XX O X O X O X O XX O XOOXX O X O X O X O X O X O XOXOX O O It seems that the centre X (i.e. a move to E(s1:5)) is the best because its successor nodes have a value of 1 (as opposed to 0 or worse -1)
32
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 32 31 Checkers Case Study initial board configuration Black single on 20 single on 21 king on 31 Redsingle on 23 king on 22 evaluation function E(s) = (5 x 1 + x 2 ) - (5r 1 + r 2 ) where x 1 = black king advantage, x 2 = black single advantage, r 1 = red king advantage, r 2 = red single advantage 1234 865 9101112 161413 17181920 242221 25262728 323029 7 15 23
33
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 33 6 20 -> 16 21 -> 17 31 -> 26 31 -> 27 22 -> 17 21 -> 14 31 1234 865 9101112 161413 17181920 242221 25262728 323029 7 15 23 MIN moves Part 1 MiniMax using DFS Typically you expand DFS style to a certain depth and then eval. node E(s) = (5 x1 + x2) - (5r1 + r2) = (5+2) –(0 +1) = 6 takes red king MAX moves
34
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 34 20 -> 16 21 -> 17 31 -> 26 31 -> 27 22 -> 17 22 -> 18 22 -> 25 22 -> 26 23 -> 26 23 -> 27 21 -> 14 16 -> 11 31 -> 27 16 -> 11 31 -> 27 16 -> 11 31 -> 27 31 -> 24 22 -> 13 22 -> 31 23 -> 30 23 -> 32 20 -> 16 31 -> 27 31 -> 26 21 -> 17 20 -> 16 21 -> 17 20 -> 16 21 -> 17 31 1234 865 9101112 161413 17181920 242221 25262728 323029 7 15 23 MAX MIN We fast-forward and see the full tree to a certain depth
35
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 35 11112611111111116000-4 -8 20 -> 16 21 -> 17 31 -> 26 31 -> 27 22 -> 17 22 -> 18 22 -> 25 22 -> 26 23 -> 26 23 -> 27 21 -> 14 16 -> 11 31 -> 27 16 -> 11 31 -> 27 16 -> 11 31 -> 27 31 -> 24 22 -> 13 22 -> 31 23 -> 30 23 -> 32 20 -> 16 31 -> 27 31 -> 26 21 -> 17 20 -> 16 21 -> 17 20 -> 16 21 -> 17 31 1234 865 9101112 161413 17181920 242221 25262728 323029 7 15 23 MAX MIN Then we score leaf nodes
36
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 36 1 1 1112 2 6 6 1 1 11111 1 11116 6 0 0 00-4 -8 10 1 20 -> 16 21 -> 17 31 -> 26 31 -> 27 22 -> 17 22 -> 18 22 -> 25 22 -> 26 23 -> 26 23 -> 27 21 -> 14 16 -> 11 31 -> 27 16 -> 11 31 -> 27 16 -> 11 31 -> 27 31 -> 24 22 -> 13 22 -> 31 23 -> 30 23 -> 32 20 -> 16 31 -> 27 31 -> 26 21 -> 17 20 -> 16 21 -> 17 20 -> 16 21 -> 17 31 1234 865 9101112 161413 17181920 242221 25262728 323029 7 15 23 MAX MIN Then we propagate Min-Max scores upwards
37
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 37 Could we make this a little easier? This tree will really expand some considerable distance What about if we introduce a little pruning OK then … but first we have to go back to the start.
38
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 38 6 20 -> 16 21 -> 17 31 -> 26 31 -> 27 22 -> 17 21 -> 14 MIN moves Checkers Min-Max With pruning MAX moves 6 6 Temporarily propagate values up
39
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 39 Checkers Alpha-Beta Example 6 6 1 1 1111 6 20 -> 16 21 -> 17 31 -> 26 31 -> 27 22 -> 17 22 -> 18 21 -> 14 16 -> 11 31 -> 27 MAX MIN @ max node if value of node > value of parent then abandon. …No! … No! … No! …. No …. No! Alas no pruning below this Max node! Next node is expanded
40
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 40 Checkers Alpha-Beta Example 6 6 1 1 11111 1 1111 1 20 -> 16 21 -> 17 31 -> 26 31 -> 27 22 -> 17 22 -> 18 22 -> 25 21 -> 14 16 -> 11 31 -> 27 16 -> 11 31 -> 27 31 1234 865 9101112 161413 17181920 242221 25262728 323029 7 15 23 MAX MIN Notice new value propagated up! @ max node if value of node > value of parent then abandon. …No! So its in the interest of Max node to see if it can do better … No! etc …. No etc …. No!
41
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 41 Checkers Alpha-Beta Example 6 6 1 1 11111 1 1111 1 1 20 -> 16 21 -> 17 31 -> 26 31 -> 27 22 -> 17 22 -> 18 22 -> 25 21 -> 14 16 -> 11 31 -> 27 16 -> 11 31 -> 27 31 1234 865 9101112 161413 17181920 242221 25262728 323029 7 15 23 Its not in MAXs interest to bother expanding these nodes as they have the same value as 16-> 11 so we prune them MAX MIN
42
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 42 Checkers Alpha-Beta Example 6 6 1 1 11111 1 11116 6 1 20 -> 16 21 -> 17 31 -> 26 31 -> 27 22 -> 17 22 -> 18 22 -> 25 22 -> 26 21 -> 14 16 -> 11 31 -> 27 16 -> 11 31 -> 27 31 -> 22 16 -> 11 31 1234 865 9101112 161413 17181920 242221 25262728 323029 7 15 23 MAX MIN
43
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 43 Checkers Alpha-Beta Example 1 1 1116 6 1 1 11111 1 11116 6 1 1 20 -> 16 21 -> 17 31 -> 26 31 -> 27 22 -> 17 22 -> 18 22 -> 25 22 -> 26 23 -> 26 21 -> 14 16 -> 11 31 -> 27 16 -> 11 31 -> 27 31 -> 22 16 -> 11 31 -> 27 MAX MIN
44
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 44 Search Limits search must be cut off because of time or space limitations strategies like depth-limited or iterative deepening search can be used don’t take advantage of knowledge about the problem more refined strategies apply background knowledge quiescent search cut off only parts of the search space that don’t exhibit big changes in the evaluation function
45
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 45 Games and Computers state of the art for some game programs Chess Checkers Othello Backgammon Go
46
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 46 Chess Deep Blue, a special-purpose parallel computer, defeated the world champion Gary Kasparov in 1997 the human player didn’t show his best game some claims that the circumstances were questionable Deep Blue used a massive data base with games from the literature Fritz, a program running on an ordinary PC, challenged the world champion Vladimir Kramnik to an eight-game draw in 2002 top programs and top human players are roughly equal
47
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 47 Checkers Arthur Samuel develops a checkers program in the 1950s that learns its own evaluation function reaches an expert level stage in the 1960s Chinook becomes world champion in 1994 human opponent, Dr. Marion Tinsley, withdraws for health reasons Tinsley had been the world champion for 40 years Chinook uses off-the-shelf hardware, alpha-beta search, end-games data base for six-piece positions
48
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 48 Othello Logistello defeated the human world champion in 1997 many programs play far better than humans smaller search space than chess little evaluation expertise available
49
P. Dunne Adapted from slides created by © 2000-2005 Franz Kurfess Games 49 Backgammon TD-Gammon, neural-network based program, ranks among the best players in the world improves its own evaluation function through learning techniques search-based methods are practically hopeless chance elements, branching factor
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.