Adversarial Search and Game Playing (Where making good decisions requires respecting your opponent) R&N: Chap. 6.

Games like chess or Go are compact settings that mimic the uncertainty of interacting with the natural world. For centuries, humans have used them to exercise their intelligence. Recently, there has been great success in building game programs that challenge human supremacy.

Relation to Previous Lecture. Here, uncertainty is caused by the actions of another agent (MIN), which competes with our agent (MAX). MIN wants MAX to fail (and vice versa). No plan exists that guarantees MAX's success regardless of which actions MIN executes (the same is true for MIN). At each turn, the choice of which action to perform must be made within a specified time limit. The state space is enormous: only a tiny fraction of this space can be explored within the time limit.

Specific Setting. Two-player, turn-taking, deterministic, fully observable, zero-sum, time-constrained game. State space; initial state; successor function: it tells which actions can be executed in each state and gives the successor state for each action. MAX's and MIN's actions alternate, with MAX playing first in the initial state. Terminal test: it tells if a state is terminal and, if yes, whether it is a win or a loss for MAX, or a draw. All states are fully observable.

Game Tree. MAX nodes and MIN nodes alternate: MAX's play → MIN's play → … → terminal state (win for MAX). Here, symmetries have been used to reduce the branching factor.

Game tree (2-player, deterministic, turns)

Minimax. Perfect play for deterministic games. Idea: choose the move to the position with the highest minimax value = best achievable payoff against best play. E.g., 2-ply game:
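
The 2-ply figure itself did not survive the transcript; below is a minimal Python stand-in (the leaf values are assumed to match the textbook's example), whose backed-up minimax value is 3.

def minimax_value(node, maximizing):
    # A game tree as nested lists: a leaf is a number (MAX's utility),
    # an interior node is the list of its children.
    if isinstance(node, (int, float)):
        return node
    values = [minimax_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# 2-ply example: MAX moves, MIN replies, then terminal utilities.
print(minimax_value([[3, 12, 8], [2, 4, 6], [14, 5, 2]], True))  # prints 3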

Minimax algorithm. [Figure-only slide showing the pseudocode.]

[Figure-only slides: a step-by-step minimax trace. The tree has MAX root A, MIN children B, C, D, MAX nodes E-M below them, and leaves N-A1. At each step v is initialized to -∞ at a MAX node and +∞ at a MIN node and updated as children return their values. The MAX level backs up 15 and 3 under B, 6 and 12 under C, and 8, 30, and 20 under D; the MIN nodes therefore back up 3, 6, and 8, and the root returns v = 8 with the move to D.]
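
As a check on the trace, the sketch below recomputes the backed-up values from the MAX-level numbers shown on the summary slides (15 and 3 under B; 6 and 12 under C; 8, 30, and 20 under D); it is an illustration, not the original slide code.

max_level = {"B": [15, 3], "C": [6, 12], "D": [8, 30, 20]}

# Each MIN node backs up the minimum of its MAX children...
min_level = {name: min(vals) for name, vals in max_level.items()}
print(min_level)                    # {'B': 3, 'C': 6, 'D': 8}

# ...and the MAX root backs up the maximum, remembering the move.
best = max(min_level, key=min_level.get)
print(best, min_level[best])        # D 8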

Game Tree MAX’s play  MIN’s play  Terminal state (win for MAX)  In general, the branching factor and the depth of terminal states are large Chess: Number of states: ~1040 Branching factor: ~35 Number of total moves in a game: ~100

Choosing an Action: Basic Idea. Using the current state as the initial state, build the game tree uniformly to the maximal depth h (called the horizon) feasible within the time limit. Evaluate the states of the leaf nodes. Back up the results from the leaves to the root and pick the best action assuming the worst from MIN → the Minimax algorithm.

Evaluation Function. A function e: state s → number e(s). e(s) is a heuristic that estimates how favorable s is for MAX: e(s) > 0 means that s is favorable to MAX (the larger the better); e(s) < 0 means that s is favorable to MIN; e(s) = 0 means that s is neutral.

Example: Tic-Tac-Toe. e(s) = (number of rows, columns, and diagonals open for MAX) - (number of rows, columns, and diagonals open for MIN). [Figure: three example boards with e = 8-8 = 0, 6-4 = 2, and 3-3 = 0.]
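
A direct transcription of this evaluation function into Python might look as follows (an illustrative sketch; the 3x3 board encoding with "X" for MAX, "O" for MIN, and None for empty squares is an assumption):

def open_lines(board, opponent):
    # Rows, columns, and the two diagonals, as lists of (row, col) cells.
    lines = ([[(r, c) for c in range(3)] for r in range(3)] +
             [[(r, c) for r in range(3)] for c in range(3)] +
             [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]])
    # A line is still "open" if the opponent has no mark on it.
    return sum(1 for line in lines
               if all(board[r][c] != opponent for r, c in line))

def e(board):
    # Lines open for MAX minus lines open for MIN.
    return open_lines(board, "O") - open_lines(board, "X")

board = [[None] * 3 for _ in range(3)]   # the empty board
print(e(board))                          # 8 - 8 = 0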

Construction of an Evaluation Function. Usually a weighted sum of "features": e(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s). Features may include: number of pieces of each type, number of possible moves, number of squares controlled.
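
As a tiny illustration of such a weighted sum (the feature names and weights below are invented for the example):

# Each feature value is a MAX-minus-MIN count for the state.
WEIGHTS = {"material": 1.0, "mobility": 0.1, "squares_controlled": 0.05}

def e(features):
    return sum(WEIGHTS[name] * features[name] for name in WEIGHTS)

print(e({"material": 3, "mobility": -2, "squares_controlled": 4}))  # 3.0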

Backing up Values. Tic-Tac-Toe tree at horizon = 2. [Figure: leaf evaluations such as 6-5 = 1, 5-6 = -1, 5-5 = 0, 4-5 = -1, 6-4 = 2, 5-4 = 1, 6-6 = 0, and 4-6 = -2 are backed up through the MIN level; the best move leads to the child with the largest backed-up value (1).]

Continuation. [Figure-only slide: the backed-up values one level further up the same tree.]

Why use backed-up values? At each non-leaf node N, the backed-up value is the value of the best state that MAX can reach at depth h if MIN plays well (by the same criterion as MAX applies to itself). If e is to be trusted in the first place, then the backed-up value is a better estimate of how favorable STATE(N) is than e(STATE(N)).

Minimax Algorithm. Expand the game tree uniformly from the current state (where it is MAX's turn to play) to depth h. Compute the evaluation function at every leaf of the tree. Back up the values from the leaves to the root of the tree as follows: a MAX node gets the maximum of the evaluations of its successors; a MIN node gets the minimum of the evaluations of its successors. Select the move toward a MIN node that has the largest backed-up value. Horizon: needed to return a decision within the allowed time.
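
Putting these steps together, a horizon-limited minimax decision might look like the sketch below; the toy take-away game (remove 1-3 objects; whoever takes the last object wins) and its heuristic are stand-ins, not from the slides.

def successors(pile):
    return [pile - k for k in (1, 2, 3) if pile - k >= 0]

def evaluate(pile, maximizing):
    # Heuristic at the horizon: a pile that is a multiple of 4 is lost
    # for the player to move (scored from MAX's point of view).
    return -1 if ((pile % 4 == 0) == maximizing) else 1

def minimax(pile, depth, maximizing):
    if pile == 0:                   # previous player took the last object
        return -1 if maximizing else 1
    if depth == 0:                  # horizon reached: evaluate the leaf
        return evaluate(pile, maximizing)
    vals = [minimax(s, depth - 1, not maximizing) for s in successors(pile)]
    return max(vals) if maximizing else min(vals)

# Select the move toward the MIN node with the largest backed-up value.
print(max(successors(13), key=lambda s: minimax(s, 3, False)))  # 12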

Repeated States. Left as an exercise. [Distinguish between states on the same path and states on different paths.]

Game Playing (for MAX). Repeat until a terminal state is reached: select a move using Minimax; execute the move; observe MIN's move. Note that at each cycle the large game tree built to horizon h is used to select only one move. All of this is repeated at the next cycle (a sub-tree of depth h-2 can be re-used).
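
A sketch of this cycle on the same toy take-away game (illustrative; here MIN simply plays at random):

import random

def moves(pile):
    return [pile - k for k in (1, 2, 3) if pile - k >= 0]

def value(pile, maximizing):
    # The toy game is small enough to search to the end (no horizon).
    if pile == 0:
        return -1 if maximizing else 1
    vals = [value(s, not maximizing) for s in moves(pile)]
    return max(vals) if maximizing else min(vals)

pile = 13
while True:
    pile = max(moves(pile), key=lambda s: value(s, False))  # select and execute
    if pile == 0:
        print("MAX wins"); break
    pile = random.choice(moves(pile))                       # observe MIN's move
    if pile == 0:
        print("MIN wins"); break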

Can we do better? Yes! Much better! → Pruning: a part of the tree that can't have any effect on the value that will be backed up to the root need not be searched at all. [Figure-only example omitted.]

Example. The beta value of a MIN node is an upper bound on its final backed-up value; it can never increase. The alpha value of a MAX node is a lower bound on its final backed-up value; it can never decrease. Search can be discontinued below any MIN node whose beta value is less than or equal to the alpha value of one of its MAX ancestors. [Figure-only steps of the walkthrough omitted.]

Alpha-Beta Pruning. Explore the game tree to depth h in depth-first manner. Back up alpha and beta values whenever possible. Prune branches that can't lead to changing the final decision.

[Figure-only slides: a step-by-step alpha-beta walkthrough on a larger tree with leaf values 5, -3, 3, 3, -3, 2, -2, 3, 5, 2, 5, -5, 1, 5, 1, -3, -5, 5, -3, 3, 2. Backed-up values appear as the search proceeds and several subtrees are cut off; the root's final backed-up value is 1.]

Alpha-Beta Algorithm. Update the alpha/beta value of the parent of a node N when the search below N has been completed or discontinued. Discontinue the search below a MAX node N if its alpha value is ≥ the beta value of a MIN ancestor of N. Discontinue the search below a MIN node N if its beta value is ≤ the alpha value of a MAX ancestor of N.

The α-β algorithm. [Figure-only slide showing the pseudocode.]
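
A minimal Python sketch of the algorithm (an illustration over nested-list game trees, not the textbook's pseudocode):

import math

def alphabeta(node, alpha, beta, maximizing):
    # Leaves are numbers; interior nodes are lists of children.
    if isinstance(node, (int, float)):
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, v)
            if alpha >= beta:   # discontinue: a MIN ancestor won't allow this
                break
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True))
        beta = min(beta, v)
        if beta <= alpha:       # discontinue: a MAX ancestor won't allow this
            break
    return v

# On the earlier 2-ply example the value is still 3, but the leaves
# 4 and 6 are never visited (pruned right after the 2 is seen).
print(alphabeta([[3, 12, 8], [2, 4, 6], [14, 5, 2]], -math.inf, math.inf, True))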

[Figure-only slides: a step-by-step alpha-beta trace on a deeper tree whose top-level moves lead to subtrees T1, T2, and T3. Intervals (alpha, beta) such as (3, ∞) and (3, 5) are passed down the tree, v values are updated as children return, search below several nodes is discontinued by the cutoff tests, and the trace finally returns v = 4 with the associated move to T3.]

How much do we gain? Consider these two cases: [figure-only slide comparing how much is pruned depending on the order in which children are searched].

How much do we gain? Assume a game tree of uniform branching factor b. Minimax examines O(b^h) nodes, and so does alpha-beta in the worst case. The gain for alpha-beta is maximum when: the MIN children of a MAX node are ordered in decreasing backed-up values, and the MAX children of a MIN node are ordered in increasing backed-up values. Then alpha-beta examines O(b^(h/2)) nodes [Knuth and Moore, 1975]. But this requires an oracle (if we knew how to order nodes perfectly, we would not need to search the game tree). If nodes are ordered at random, then the average number of nodes examined by alpha-beta is ~O(b^(3h/4)).
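
For concreteness (a worked example, not from the slides): with chess-like b = 35 and h = 8, minimax examines on the order of 35^8 ≈ 2.3 x 10^12 nodes, while perfectly ordered alpha-beta examines about 35^4 ≈ 1.5 x 10^6, which is why alpha-beta can search roughly twice as deep in the same time.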

Heuristic Ordering of Nodes. Order the nodes below the root according to the values backed up at the previous iteration. Order MIN (resp. MAX) nodes by decreasing (resp. increasing) values of the evaluation function e computed at these nodes.
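
As a sketch, this ordering can be applied to the successor list before recursing (illustrative; e stands for whatever static evaluation the program uses):

def order_children(children, e, at_max_node):
    # At a MAX node, try the highest-valued (MIN) children first;
    # at a MIN node, try the lowest-valued (MAX) children first.
    return sorted(children, key=e, reverse=at_max_node)

print(order_children([2, 9, 5], e=lambda v: v, at_max_node=True))   # [9, 5, 2]
print(order_children([2, 9, 5], e=lambda v: v, at_max_node=False))  # [2, 5, 9]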

Other Improvements. Adaptive horizon + iterative deepening. Extended search: retain k > 1 best paths, instead of just one, and extend the tree at greater depth below their leaf nodes (to help deal with the "horizon effect"). Singular extension: if a move is obviously better than the others in a node at horizon h, then expand this node along this move. Use transposition tables to deal with repeated states. Null-move search.

Deterministic games in practice. Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994; it used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions. Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searched 200 million positions per second, used a very sophisticated evaluation function, and undisclosed methods for extending some lines of search up to 40 ply. Othello: human champions refuse to compete against computers, which are too good. Go: human champions refuse to compete against computers, which are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.