Download presentation
Presentation is loading. Please wait.
Published byBaldric Evans Modified over 8 years ago
1
Backtracking and Game Trees 15-211: Fundamental Data Structures and Algorithms April 8, 2004
2
Backtracking X O X O X O X O O CD B A
3
Backtracking An algorithm-design technique “Organized brute force” Explore the space of possible answers Do it in an organized way Example: maze Try S, E, N, W IN OUT
4
Backtracking is useful … … when a problem is too hard to be solved directly Maze traversal 8-queens problem Knight’s tour … when we have limited time and can accept a good but potentially not optimal solution Game playing (second part of lecture) Planning (not in this class)
5
Basic backtracking Develop answers by identifying a set of successive decisions. Maze Where do I go now: N, S, E or W? 8 Queens Where do I put the next queen? Knight’s tour Where do I jump next?
6
Basic backtracking 2 Develop answers by identifying a set of successive decisions. Decisions can be binary: ok or impossible (pure backtracking) Decisions can have goodness (heuristic backtracking): Good to sacrifice a pawn to take a bishop Bad to sacrifice the queen to take a pawn
7
ABCDE FGHIJ KLMNO PQRST UVXYZ Basic Backtracking 3 Can be implemented using a stack Stack can be implicit in recursive calls What if we get stuck? Withdraw the most recent choice Undo its consequences Is there a new choice? If so, try that If not, you are at another dead-end IN OUT A F K P U V X YS L G B C H M R Q N O
8
Basic Backtracking 4 No optimality guarantees If there are multiple solutions, simple backtracking will find one of them, but not necessarily the optimal No guarantees that we’ll reach a solution quickly
9
Backtracking Summary Organized brute force Formulate problem so that answer amounts to taking a set of successive decisions When stuck, go back, try the remaining alternatives Does not give optimal solutions
10
Games X O X O X O X O O
11
Why Make Computers Play Games? People have been fascinated by games since the dawn of civilization Idealization of real-world problems containing adversary situations Typically rules are very simple State of the world is fully accessible Example: auctions
12
No, not Quake (although interesting research there, too) Simpler Strategy Games: Deterministic Chess Checkers Othello Go Non-determinstic Poker Backgammon What Kind of Games?
13
So let’s take a simple game
14
A Tic Tac Toe Game Tree moves
15
A Tic Tac Toe Game Tree moves Nodes Denote board configurations Include a score Edges Denote legal moves Path Successive moves by the players
16
A path in a game tree KARPOV-KASPAROV, 1985 1.e4 c5 2.Nf3 e6 3.d4 cxd4 4.Nxd4 Nc6 5.Nb5 d6 6.c4 Nf6 7.N1c3 a6 8.Na3 d5!? 9.cxd5 exd5 10.exd5 Nb4 11.Be2!?N Bc5! 12.0-0 0-0 13.Bf3 Bf5 14.Bg5 Re8! 15.Qd2 b5 16.Rad1 Nd3! 17.Nab1? h6! 18.Bh4 b4! 19.Na4 Bd6 20.Bg3 Rc8 21.b3 g5!! 22.Bxd6 Qxd6 23.g3 Nd7! 24.Bg2 Qf6! 25.a3 a5 26.axb4 axb4 27.Qa2 Bg6 28.d6 g4! 29.Qd2 Kg7 30.f3 Qxd6 31.fxg4 Qd4+ 32.Kh1 Nf6 33.Rf4 Ne4 34.Qxd3 Nf2+ 35.Rxf2 Bxd3 36.Rfd2 Qe3! 37.Rxd3 Rc1!! 38.Nb2 Qf2! 39.Nd2 Rxd1+ 40.Nxd1 Re1+ White resigned
17
How to play? moves
18
Two-player games We can define the value (goodness) of a certain game state (board). What about the non-final board? Look at board, assign value Look at children in game tree, assign value 1 0 ?
19
How to play? moves 1 0 0001 1 0 0001 00 0 0
20
More generally Player A (us) maximize goodness. Player B (opponent) minimizes goodness Player A maximize Player B minimize Player A maximize 5 97 8 2 ab c d
21
Games Trees are useful… Provide “lookahead” to determine what move to make next Build whole tree = we know how to play 5 97 8 2 ab c d
22
But there’s one problem… Games have large search trees: Tic-Tac-Toe There are 9 ways we can make the first move and opponent has 8 possible moves he can make. Then when it is our turn again, we can make 7 possible moves for each of the 8 moves and so on…. 9! = 362,880
23
Or chess… Suppose we look at 16 possible moves at each step And suppose we explore to a depth of 50 turns Then the tree will have 16 50 =10 120 nodes! DeepThought (1990) searched to a depth of 10
24
So… Need techniques to avoid enumerating and evaluating the whole tree Heuristic search
25
Heuristic search 1.Organize search to eliminate large sets of possibilities Chess: Consider major moves first 2.Explore decisions in order of likely success Chess: Use a library of known strategies 3.Save time by guessing search outcomes Chess: Estimate the quality of a situation Count pieces Count pieces, using a weighting scheme Count pieces, using a weighting scheme, considering threats
26
Mini-Max Algorithm
27
Player A (us) maximize goodness. Player B (opponent) minimizes goodness Player A maximize (draw) Player B minimize(lose, draw) Player A maximize(lose, win) At a leaf (a terminal position) A wins, loses, or draws. Assign a score: 1 for win; 0 for draw; -1 for lose. At max layers, take node score as maximum of the child node scores At min layers, take nodes score as minimum of the child node scores 0 1 0
28
Let’s see it in action 102121627-5-80 1016 7-5 10-5 10 Max (Player A) Min (Player B) Evaluation function applied to the leaves!
29
Minimax, in reality Rarely do we reach an actual leaf Use estimator functions to statically guess goodness of missing subtrees MaxA moves MinB moves MaxA moves 27 18
30
Minimax, in reality Rarely do we reach an actual leaf Use estimator functions to statically guess goodness of missing subtrees MaxA moves MinB moves MaxA moves 2 2 7 1 1 8
31
Minimax, in reality Rarely do we reach an actual leaf Use estimator functions to statically guess goodness of missing subtrees MaxA moves MinB moves MaxA moves 2 2 2 7 1 1 8
32
Minimax Trade-off Quality vs. Speed Quality: deeper search Speed: use of estimator functions Balancing Relative costs of move generation and estimator functions Quality and cost of estimation function
33
Mini-Max Algorithm Definitions Terminal position is a position where the game is over In TicTacToe a game may be over with a tie, win or loss Each terminal position has some value The value of a non-terminal position P, is given by v(P) = max - v(P') P' in S(P) Where S(P) is the set of all successor positions of P Minus sign is there because the other player is moving into position P’
34
Mini-Max algorithm - pseudo code min-max(P){ if P is terminal, return v(P) m = - for each P' in S(P) v = -(min-max(P')) if m < v then m = v return m }
35
Game Tree search techniques Min-max search Assume optimal play on both sides Pruning The alpha-beta procedure: Next!
36
Alpha-Beta Pruning Track expectations. Use 2 variables and to prune the tree – Best score so far at a max node Value increases as we see more children – Best score so far at a min node Value decreases as we see more children
37
Alpha-beta What pruning is always possible? Max =2 (want >) Min Max 22 2 27
38
Alpha-beta What pruning is always possible? Max =2 (want >) Min =1 (want <) Max The root already has a value larger than the current minimizing value . Therefore there is no point in finding a better minimum. Prune! 22 2 27 11 1
39
10212 1012 10 Alpha Beta Example Max Min =10 = 12 > !
40
Alpha Beta Example 1021227 1012 7 107 Max Min = 10 =7 > !
41
alphaBeta (, ) The top level call: Return alphaBeta (- , )) alphaBeta (, ): At leaf level (depth limit is reached): Assign estimator function value (in the range (- .. ) to the leaf node. Return this value. At a min level (opponent moves): For each child, until : Set = min(, alphaBeta (, )) Return . At a max level (our move): For each child, until : Set = max(, alphaBeta (, )) Return .
42
Alpha-Beta Pseudo Code AB( , ,P){ if P is terminal, return v(P) 1 = for each P' in S(P) v = -AB(- , - 1, P') if 1 = then return 1 return 1 }
43
Heuristic search techniques Alpha-beta is one way to prune the game tree…
44
Heuristic search techniques 1. Organize search to eliminate large sets of possibilities Pruning strategies remove subtrees (alpha-beta) Transposition strategies combine multiple states, exploiting symmetries Memoizing techniques remember states previously explored Also, eliminate loops X O O X O X O O X O OXOX X O X O O
45
Heuristic search techniques 2. Explore decisions in order of likely success Guide search with estimator functions that correlate with likely search outcomes 3. Save time by guessing search outcomes Use estimator functions
46
Heuristic search techniques 4. Put resources into promising approaches Go deeper for more promising moves 5. Consider progressive deepening Breadth-first: find a most plausible move; then do deeper search to improve confidence.
47
Summary Backtracking Organized brute force Answer to problem = set of successive decisions When stuck, go back, try the remaining alternatives No optimality guarantees
48
Summary Game playing Game trees Mini-max algorithm Optimization Heuristics search Alpha-beta pruning
49
State-of-the-art: Backgammon Gerald Tesauro (IBM) Wrote a program which became “overnight” the best player in the world Not easy!
50
State-of-the-art: Backgammon Learned the evaluation function by playing 1,500,000 games against itself Temporal credit assignment using reinforcement learning Used Neural Network to learn the evaluation function 5 97 8 2 ab c d Neural Net 9 Learn!Predict!
51
State-of-the-art: Go Average branching factor 360 Regular search methods go bust ! People use higher level strategies Systems use vast knowledge bases of rules… some hope, but still play poorly $2,000,000 for first program to defeat a top-level player
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.