Published by Everett Manning. Modified over 9 years ago.
1
Games, page 1 CSI 4106, Winter 2005 Games. Points: games and strategies; a logic-based approach to games; AND-OR trees/graphs; static evaluation functions; Tic-Tac-Toe; minimax; alpha-beta cutoff; extensions.
2
Games, page 2 CSI 4106, Winter 2005 Definitions We consider non-random or semi-random games with full information, zero-sum, two-person (dual), with rational players. That's mainly board games such as chess, chequers, go -- sometimes called strategic games; some paper-and-pencil games are of this kind too.
3
Games, page 3 CSI 4106, Winter 2005 Definitions (2) A (pure) strategy: a complete set of advance instructions that specifies a definite choice for every conceivable situation in which the player may be required to act. In a two-player game, a strategy allows the player to have a response to every move of the opponent. Game-playing programs implement a strategy as a software mechanism that supplies the right move on request.
4
Games, page 4 CSI 4106, Winter 2005 A logic-based approach to games Find a winning strategy by proving that the game can be won -- use backward chaining. A very simple game: a variant of nim (usually known as Grundy's game). Initially, there is one stack of chips; a move: select a stack and divide it into two unequal non-empty stacks; a player who cannot move loses the game. (Whether the player who moves first can win depends on the initial stack size; with 6 chips, as in the example on the following pages, the first player can win.)
5
Games, page 5 CSI 4106, Winter 2005... Games in logic (2)
6
Games, page 6 CSI 4106, Winter 2005... Games in logic (3) [Figure: AND/OR graph rooted at A_wins([6], A), with nodes A_wins([5], B), A_wins([4], B), A_wins([4], A), A_wins([3], A), A_wins([3], B), A_wins([], B), A_wins([], A). Caption: Player B loses.] The players are A and B. A_wins( P, X ) means "player X moves in position P and there is a winning continuation for A". A position is represented as a list of sizes of stacks with 3 or more chips (only those can still be divided). This is an AND/OR tree (actually a directed, acyclic AND/OR graph).
7
Games, page 7 CSI 4106, Winter 2005... Games in logic (4) [Figure: AND/OR graph rooted at A_wins([6], B), with nodes A_wins([5], A), A_wins([4], A), A_wins([4], B), A_wins([3], B), A_wins([3], A), A_wins([], A), A_wins([], B).] A winning strategy would always lead to a win. Here, such a strategy is described by a subgraph with one OR edge selected from each OR node. All leaves in the subgraph must represent wins for player A. Now we cannot find a winning strategy: why?
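The A_wins analysis on pages 6 and 7 can be reproduced mechanically. The following is a minimal Python sketch, not the course's own program: the names `moves` and `a_wins` and the tuple representation of positions are assumptions made for illustration. A position is a tuple of stack sizes of 3 or more, as on the slides; A's turns are OR nodes (`any`), B's turns are AND nodes (`all`).

```python
from functools import lru_cache


def moves(position):
    """Yield every position reachable by splitting one stack into two
    unequal non-empty stacks. Stacks smaller than 3 are dropped, because
    they can never be divided again."""
    for i, stack in enumerate(position):
        rest = position[:i] + position[i + 1:]
        # small < large guarantees the two parts are unequal
        for small in range(1, (stack - 1) // 2 + 1):
            large = stack - small
            parts = tuple(s for s in (small, large) if s >= 3)
            yield tuple(sorted(rest + parts, reverse=True))


@lru_cache(maxsize=None)
def a_wins(position, to_move):
    """True if A has a winning continuation when `to_move` ("A" or "B")
    moves in `position` (hypothetical helper mirroring A_wins(P, X))."""
    successors = set(moves(position))
    if to_move == "A":
        # OR node: A needs one good move; with no moves left, A loses
        return any(a_wins(p, "B") for p in successors)
    # AND node: A must win against every move of B; if B cannot move, A wins
    return all(a_wins(p, "A") for p in successors)


print(a_wins((6,), "A"))  # A moves first with 6 chips
print(a_wins((6,), "B"))  # B moves first with 6 chips
```

Memoisation with `lru_cache` is what turns the tree into the directed acyclic graph mentioned on page 6: each position/player pair is solved once.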
8
Games, page 8 CSI 4106, Winter 2005... Games in logic (5) This kind of analysis only works for very small game trees. [Figure: part of the graph rooted at A_wins([8], A), with nodes A_wins([7], B), A_wins([6], A), A_wins([5], A), A_wins([4, 3], A), A_wins([4], B).]
9
Games, page 9 CSI 4106, Winter 2005 The basic loop The basic loop in a game program: build as much of the complete tree as seems reasonable (for example, within a given time limit); evaluate the incomplete tree; prune unpromising or bad moves; make a move; get the opponent's move. Regularities: moves of player A sprout from OR nodes, moves of player B sprout from AND nodes. Seen from A's perspective, this means that A chooses one of the moves (the best move, if possible), and is ready to react to all of B's moves.
10
Games, page 10 CSI 4106, Winter 2005 Static evaluation A static evaluation function returns the value of a position without any look-ahead, that is, without simulating the rest of the game. Usually a static evaluation function returns positive values for positions advantageous to A, negative values for positions advantageous to B. If player A is rational, A will choose the move that leads to the maximal value. Player B will choose the move that leads to the minimal value.
11
Games, page 11 CSI 4106, Winter 2005 Static evaluation (2) If we can have (guess or calculate) the value of an internal node N, we can treat it as if it were a leaf. This is the basis of the minimax procedure. No tree would be necessary if we could evaluate the initial position statically. Normally we need a tree, and we need look-ahead into it. Positions deeper in the tree can be evaluated more precisely, because they carry more information and the search is more focussed. Minimax pays off most on large trees, but it can be useful even in mini-games such as tic-tac-toe.
12
Games, page 12 CSI 4106, Winter 2005 Tic-Tac-Toe Let player A be x and let open(x), open(o) mean the number of lines open to x and o. There are 8 lines. An evaluation function for position P: f(P) = -∞ if o wins; f(P) = +∞ if x wins; f(P) = open(x) - open(o) otherwise. Example: open(x) - open(o) = 6 - 4 [board diagram with one x and one o]. Assumptions: only one of symmetrical positions is generated; we build 2 levels of the game tree (one move -- one response) to have 2-ply lookahead.
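The evaluation function above is easy to state in code. This is a minimal Python sketch under stated assumptions: the board is a 9-character string in row-major order with "." for an empty cell, and the names `LINES`, `open_lines`, `wins` and `evaluate` are illustrative, not from the slides. The board diagram for the 6 - 4 example is not recoverable from the transcript; x in the centre and o on an edge is one position that gives those counts.

```python
import math

# The 8 lines of the board, as index triples into a row-major string
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def open_lines(board, player):
    """Number of lines still open to `player`: lines with no opponent mark."""
    opponent = "o" if player == "x" else "x"
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))


def wins(board, player):
    """True if `player` has three marks on some line."""
    return any(all(board[i] == player for i in line) for line in LINES)


def evaluate(board):
    """f(P) from the slide: -inf if o wins, +inf if x wins,
    open(x) - open(o) otherwise."""
    if wins(board, "o"):
        return -math.inf
    if wins(board, "x"):
        return math.inf
    return open_lines(board, "x") - open_lines(board, "o")


# Assumed example position: x in the centre, o in the middle of the top edge
board = ".o..x...."
print(open_lines(board, "x"), open_lines(board, "o"), evaluate(board))
```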
13
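Games, page 13 CSI 4106, Winter 2005 Tic-Tac-Toe (2) At each level 1 node, player B chooses the minimal value among its children; this minimum is the backed-up value of the node. Player A chooses the maximal backed-up value among level 1 nodes, and makes the move. Player B, as a rational agent, selects the optimal response. [Figure: 2-ply game tree with board diagrams; leaf evaluations as shown: 6-5, 5-5, 6-5, 5-5, 4-5, 5-4, 6-4 (first row); 5-6, 5-5, 5-6, 6-6, 4-6 (second row).]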
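The 2-ply procedure on this page can be sketched as a depth-limited minimax. This is an illustration under assumptions, not the course's code: the board is a 9-character row-major string with "." for empty cells, the helper names are hypothetical, and symmetric positions are not merged (which enlarges the tree but does not change the backed-up values).

```python
import math

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def evaluate(board):
    """Static evaluation: +/-inf for a win, else open(x) - open(o)."""
    def wins(p):
        return any(all(board[i] == p for i in line) for line in LINES)

    def lines_without(mark):
        return sum(all(board[i] != mark for i in line) for line in LINES)

    if wins("o"):
        return -math.inf
    if wins("x"):
        return math.inf
    # lines without an o are open to x, and vice versa
    return lines_without("o") - lines_without("x")


def successors(board, player):
    """All boards reachable by placing `player` on an empty cell."""
    return [board[:i] + player + board[i + 1:]
            for i, cell in enumerate(board) if cell == "."]


def minimax(board, depth, player):
    """Backed-up value of `board` with `player` to move, looking `depth` plies ahead."""
    value = evaluate(board)
    children = successors(board, player)
    if depth == 0 or not children or math.isinf(value):
        return value
    other = "o" if player == "x" else "x"
    values = [minimax(child, depth - 1, other) for child in children]
    return max(values) if player == "x" else min(values)  # x maximizes, o minimizes


def best_move(board, depth=2):
    """Board after A's (x's) best move under `depth`-ply lookahead."""
    return max(successors(board, "x"),
               key=lambda child: minimax(child, depth - 1, "o"))


print(best_move("........."))
```

With 2-ply lookahead from the empty board, the backed-up values of the corner, edge and centre openings come out as -1, -2 and 1, so the sketch selects the centre.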
14
Games, page 14 CSI 4106, Winter 2005 Tic-Tac-Toe (3) B's first three moves are blocking moves. Other moves lead to +∞ for A: the only finite value is the minimum. For A this is a three-way tie in the evaluation; the chance to get more information is to consider more plies. [Figure: game tree with board diagrams; leaf evaluations as shown: 3-3, 3-2, 4-3, 4-2, 3-2.]
15
Games, page 15 CSI 4106, Winter 2005 Tic-Tac-Toe (4) Now, what happens if B chooses a weaker move? The procedure finds a winning continuation: the best position ensures a win by forced moves. [Figure: game tree with board diagrams; leaf evaluations as shown: 2-2, 3-2, 4-2, 4-3, 3-3.]
16
Games, page 16 CSI 4106, Winter 2005 Tic-Tac-Toe (5) Building complete plies is usually not necessary. If we evaluate a position when it is generated, we may save a lot. Assume that we are at a minimizing level. If the evaluation function returns -∞, we do not need to consider other positions: -∞ will be the minimum. The same applies to +∞ at a maximizing level.
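The saving described here amounts to leaving the loop over children as soon as an infinite value appears. A minimal Python sketch for the minimizing level follows; the function name `backed_up_min` and the idea of passing the evaluation function as a parameter are assumptions for illustration.

```python
import math


def backed_up_min(children, evaluate):
    """Minimum over the children's values at a minimizing level,
    stopping as soon as -inf is seen (nothing can be smaller).
    Returns the minimum and the number of children actually evaluated."""
    best = math.inf
    examined = 0
    for child in children:
        examined += 1
        best = min(best, evaluate(child))
        if best == -math.inf:
            break  # the remaining positions need not be generated or evaluated
    return best, examined


# Toy usage: the "positions" are already their own values
print(backed_up_min([3, -math.inf, 5, 1], lambda v: v))
```

The maximizing level is symmetric: keep a running maximum and stop at +∞.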
17
Games, page 17 CSI 4106, Winter 2005 Tic-Tac-Toe (6) This is possible because of the special properties of the infinite values, but we can achieve a similar effect for finite values. [Figure: partial 2-ply game tree with board diagrams; leaf evaluations as shown: 6-5, 5-5, 6-5, 5-5, 4-5, 5-6.]
18
Games, page 18 CSI 4106, Winter 2005 Tic-Tac-Toe (7) The backed-up value of the first node at level 1 is -1, so the value of the (maximizing) root must be ≥ -1. When we see 5 - 6 = -1 at the first child of the next level 1 node, we know that the value of that (minimizing) node must be ≤ -1. The rest of the subtree sprouting from this node cannot contribute anything and should not even be built. [Figure: partial 2-ply game tree with board diagrams; leaf evaluations as shown: 6-5, 5-5, 6-5, 5-5, 4-5, 5-6.]
19
Games, page 19 CSI 4106, Winter 2005 Minimax with cut-off In general, we keep a provisional value in every node. This value can only increase in an OR (maximizing) node, and decrease in an AND (minimizing) node. If an AND node with the provisional value V has a child C with a provisional value greater than or equal to V, we abandon C: C is a maximizing node, so its value can only grow and cannot lower the minimum. If an OR node with the provisional value V has a child C with a provisional value less than or equal to V, we abandon C for the symmetric reason. Provisional values are established as soon as we "know something", and are propagated up the tree, from the leaves to the root.
20
Games, page 20 CSI 4106, Winter 2005 α-β cut-off (2) (i) We stop searching in and below a minimizing node N with a provisional value PV_N that is less than or equal to the provisional value of any of its maximizing ancestors. The final value for N is PV_N. (ii) We stop searching in and below a maximizing node N with a provisional value PV_N that is greater than or equal to the provisional value of any of its minimizing ancestors. The final value for N is PV_N.
21
Games, page 21 CSI 4106, Winter 2005 α-β cut-off (3) A provisional value of an OR node is called its alpha-value. A provisional value of an AND node is called its beta-value. During search, the alpha-value of a node is set to the currently largest of the final values for descendants; the beta-value of a node is set to the currently smallest of the final values for descendants. (i) is a shallow alpha-cutoff, (ii) is a shallow beta-cutoff.
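Rules (i) and (ii) are the two `break` statements in the usual recursive formulation of alpha-beta search. The sketch below is a generic Python version over an explicit tree, given as an assumption-laden illustration: a tree is a nested list whose leaves are numbers, the `visited` list exists only to show which leaves get evaluated, and the leaf values are modelled on the tic-tac-toe example of pages 17 and 18 (the values in the cut-off part of the second subtree are illustrative).

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node` with alpha-beta cut-offs.
    alpha: best value found so far for the maximizing ancestors;
    beta: best value found so far for the minimizing ancestors."""
    if not isinstance(node, list):      # leaf: its static value
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:                      # OR node: provisional value only increases
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:           # rule (ii): beta cut-off
                break
        return value
    value = math.inf                    # AND node: provisional value only decreases
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if beta <= alpha:               # rule (i): alpha cut-off
            break
    return value


# Root is maximizing; its three children are minimizing nodes.
tree = [[1, 0, 1, 0, -1], [-1, 0, -1, 0, -2], [1, 2]]
seen = []
print(alphabeta(tree, True, visited=seen), seen)
```

In this run the second subtree is abandoned after its first leaf, so 8 of the 12 leaves are evaluated, and the root value equals the plain minimax value.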
22
Games, page 22 CSI 4106, Winter 2005 α-β cut-off (4)
23
Games, page 23 CSI 4106, Winter 2005 Extensions, modifications "Waiting for quiescence" -- when we reach the depth limit in the middle of a dynamic exchange (large amplitude of values). Secondary search ("feedover") -- "double check" down a path that seems the best. Book moves -- "canned" continuations (openings, endgames), and forced moves. Disadvantages of minimax: relying on the optimality of the opponent's play; no spectacular sacrifices are possible (winning back beyond the search limit); the horizon effect. [Figure: the horizon effect; labels: Queen lost, Pawn lost, Search limit.]
24
Games, page 24 CSI 4106, Winter 2005 What next? This is a technology for one kind of games, and quite probably not the most popular (-:). Adventure games require a very different kind of Artificial Intelligence. Visit http://www.gameai.com/ai.html for a nearly professional perspective. And, in general, google it.