ADVERSARIAL GAME SEARCH: Min-Max Search

ADVERSARIAL GAME SEARCH: Min-Max Search
O X O Between 2 adversaries Input: A board position Output: Best move (actually, the score for moving there) Objective function: board position evaluated to a number (may be +-1, 0) higher the number better the move: maximization Task: To compute optimum value of objective function out of all possible legal moves i.e. Optimize for the highest board value X X O Computer X Evaluated best move value = 1 (win)

Adversarial Games: Min-Max algorithm
Computer move X 9 options Adversary move 8 options X O X O O X X O Leaf: return +1 Leaf: return -1 Search tree is going up to leaves (terminal board) to evaluate board May not be practically feasible for a given computation time 9! total worst case Most algorithms will search for a pre-assigned depth or “ply” & evaluate board Alternate move by Adversary/human & Algorithm/computer: Minimizes the objective function for adversary move Maximizes the objective function for self-move or computer move the same algorithm alternately calls maximizer and minimizer

Maximizing part of the Min-Max algorithm:
Input: A board position; Output: Best next move, with max evaluation value Function findCompMove( ) if( fullBoard( ) ) value = DRAW; else if( ( quickWinInfo = immediateCompWin( ) ) != null ) return quickWinInfo; // if the next move ends the game; recursion termination else value = COMP_LOSS; // initialize with lowest value, max-problem for( i = 1; i ≤ 9; i++ ) // try each square in tic-tac-toe if ( isEmpty( i) ) place( i, COMP ); // temporary placement on board, // as global variable responseValue = findHumanMove( ).value; unplace( i ); // Restore board: alg does not actually move if( responseValue > value ) // Update best move value = responseValue; bestMove = i; return new Movelnfo( bestMove, value );

Minimizing part of the Min-Max algorithm:
Input: A board position; Output: Best opponent move, with min evaluation value Function findHumanMove( ) if( fullBoard( ) ) value = DRAW; else if( ( quickWinInfo = immediateHumanWin( ) ) != null ) return quickWinInfo; // if the next move ends the game; recursion termination else value = COMP_WIN; // initialize with lowest value for( i = 1; i ≤ 9; i++ ) // Try each square in tic-tac-toe if ( isEmpty( i) ) place( i, HUMAN ); // temporary placement on board, // as global variable responseValue = findCompMove( ).value; unplace( i ); // Restore board: alg does not actually move if( responseValue < value ) // Update best move value = responseValue; bestMove = i; return new Movelnfo( bestMove, value ); Driver call?

But will not return >40, so Do not call other children
Alpha-beta pruning Max  [- ] a=44 Max: get me >44  Min Must get >44 But will not return >40, so Do not call other children 44 60>40, so call pruned 60 40 All branches pruned

Min-max algorithm with Alpha-beta pruning
 Alpha pruning Must be true: Max-node’s value ≥ min-node’s value Otherwise, useless to expand tree: Prun the branch  Beta pruning From Weiss’ text

Min-max algorithm From Weiss’ text
Variable Ply on Min-Max: When to stop a search tree path? Quiscent: Stable boards, returned values are close to each other, no higher ply needed Horizon effect: Maybe just below this there is a major event!  Avoiding horizon effect strategies exist, e.g. remember from past search or games Delay tactic: stay conservative, don’t take risk, push the “horizon” to see what develops but, may actually just delay an eventual loss Iterative deepening: use (n-1)-th ply’s best move values toward alpha-beta in n-th ply Transposition table: multiple moves get to same board, hash table to avoid such moves From Weiss’ text

Min-max may lead to “wrong” path: very conservative search: presumes opponent is exactly as rational
Text Figure 5.14 Max (min(99, 1000, 1000, 1000), min(100, 101, 102, 100) ) = Max (99, 100) = 100, is not the best, 1000’s on the other branch was But, that is the point, opponent will not let ‘you’ go there! However, the risk may be worth taking => utility driven search, not just on board evaluation ProbCut: Probabilistic alpha-beta pruning, weight heuristic function from past experience – machine (machine-learning) or human (knowledge-based) alpha-beta are on probability distribution, pruning of probably useless branches rather than provably useless From Weiss’ text

Forward Pruning: beam search
K-best nodes are explored simultaneously/iteratively Not pure depth first, rather k-best first, but the optimum may be beyond k-best at a particular ply From Weiss’ text

Lookup Table From Weiss’ text Database / Knowledge base of past games
Specially on start- or end-game in chess Knowledge-base: compressed information as strategy, if-then-else From Weiss’ text

Dice Games: Backgamon From Weiss’ text
Max-probability-Min-Probability-… Max-Min part “Or” branches, Dice throwing part “And” branches: Must consider all possibilities, but probabilistic weight may be added from past knowledge From Weiss’ text

Partially Observable Games: Card games
Each node is really a subset of nodes belief on what “may be” the current nodes are search should filter possibilities: player plays a probing hand Utility function, as in “game therory” in economics – heuristic function typically, embeds probability measures as well From Weiss’ text

ADVERSARIAL GAME SEARCH: Min-Max Search

Similar presentations

Presentation on theme: "ADVERSARIAL GAME SEARCH: Min-Max Search"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

ADVERSARIAL GAME SEARCH: Min-Max Search

Similar presentations

Presentation on theme: "ADVERSARIAL GAME SEARCH: Min-Max Search"— Presentation transcript:

Similar presentations

About project

Feedback