Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore, Percy Liang, Luke Zettlemoyer

Announcements I sent out an email to the class mailing list on Thursday (making the medium and big cheese mazes extra credit). HW2 will be released later this week.

Adversarial search (Chapter 5) World Champion chess player Garry Kasparov is defeated by IBM’s Deep Blue chess-playing computer in a six-game match in May 1997. (Image © Telegraph Group Unlimited 1997)

Why study games? Games are a traditional hallmark of intelligence Games are easy to formalize Games can be a good model of real-world competitive or cooperative activities –Military confrontations, negotiation, auctions, etc.

Types of game environments
                                               Deterministic          Stochastic
  Perfect information (fully observable)       Chess, checkers, Go    Backgammon, Monopoly
  Imperfect information (partially observable) Battleship             Scrabble, poker, bridge

Alternating two-player zero-sum games Players take turns Each game outcome or terminal state has a utility for each player (e.g., 1 for win, 0 for loss) The sum of both players’ utilities is a constant

Games vs. single-agent search We don’t know how the opponent will act –The solution is not a fixed sequence of actions from start state to goal state, but a strategy or policy (a mapping from state to best move in that state) Efficiency is critical to playing well –The time to make a move is limited –The branching factor, search depth, and number of terminal configurations are huge In chess, branching factor ≈ 35 and depth ≈ 100, giving a search tree of roughly 35^100 ≈ 10^154 nodes –Number of atoms in the observable universe ≈ 10^80 –This rules out searching all the way to the end of the game
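
A quick back-of-the-envelope check of the chess estimate above (a Python snippet added for illustration, not part of the slides): with branching factor 35 and depth 100 the tree has about 35^100 nodes, and taking base-10 logs shows how large that is.

```python
import math

# Size of the chess game tree, 35^100, expressed as a power of ten:
# log10(35^100) = 100 * log10(35)
exponent = 100 * math.log10(35)
print(round(exponent))  # ≈ 154, i.e. 35^100 ≈ 10^154 nodes,
                        # vastly more than the ~10^80 atoms in the universe
```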

Search problem components Initial state Actions Transition model –What state results from performing a given action in a given state? Goal state Path cost –Assume that it is a sum of nonnegative step costs The optimal solution is the sequence of actions that gives the lowest path cost for reaching the goal

Deterministic Games Many possible formulations, e.g.: States: S (start at initial state) Players: P = {1…N} (usually take turns) Actions: A (may depend on player/state) Transition Model: S × A → S Terminal Test: S → {t, f} Terminal Utilities: S × P → R Solution for a player is a policy: S → A
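
The formulation above can be made concrete with a toy game. The sketch below is an illustration added here, not the course’s code: single-pile Nim, where a state is (pile size, player to move), a move removes 1 or 2 objects, and the player who takes the last object wins.

```python
# Illustrative instance of the deterministic-game formulation: single-pile Nim.
class Nim:
    def __init__(self, pile=5):
        self.start = (pile, 1)                 # initial state

    def player(self, state):                   # P = {1, 2}, alternating turns
        return state[1]

    def actions(self, state):                  # A may depend on the state
        return [a for a in (1, 2) if a <= state[0]]

    def result(self, state, action):           # transition model: S x A -> S
        return (state[0] - action, 3 - state[1])

    def is_terminal(self, state):              # terminal test: S -> {t, f}
        return state[0] == 0

    def utility(self, state, player):          # terminal utilities: S x P -> R
        mover = 3 - state[1]                   # the player who just moved wins
        return 1 if player == mover else 0

game = Nim(pile=5)
print(game.actions(game.start))                # both moves are legal at the start
print(game.result((5, 1), 2))                  # taking 2 leaves (3, 2)
```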

Deterministic Single-Player Deterministic, single-player, perfect-information games (e.g. 8-puzzle, Rubik’s Cube): know the rules, know what actions do, know when you win … it’s just search! Slight reinterpretation: each node stores a value, the best outcome it can reach, which is the maximal value of its children. Note we don’t have path sums as before (utilities are at the end). After the search, pick moves that lead to the best outcome.

Game tree A game of tic-tac-toe between two players, “MAX” (X) and “MIN” (O)

A more abstract game tree Terminal utilities (for MAX) A two-ply game

A more abstract game tree Minimax value of a node: the utility (for MAX) of being in the corresponding state, assuming perfect play on both sides Minimax strategy: Choose the move that gives the best worst-case payoff

Computing the minimax value of a node Minimax(node) = Utility(node) if node is terminal; max over actions of Minimax(Succ(node, action)) if player = MAX; min over actions of Minimax(Succ(node, action)) if player = MIN
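
The recurrence above translates directly into code. A minimal sketch (the nested-list tree representation is an assumption for illustration, not the slides’ code): a tree is either a number, a terminal utility for MAX, or a list of child trees, with MAX and MIN alternating by level. The example tree is the two-ply tree from the slides, whose minimax value is 3.

```python
def minimax(node, is_max):
    """Return the minimax value of `node`; MAX moves when is_max is True."""
    if isinstance(node, (int, float)):   # terminal state: utility for MAX
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

# The two-ply example tree: three MIN nodes under a MAX root.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))  # 3: MIN values are 3, 2, 2, and MAX picks 3
```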

Optimality of minimax The minimax strategy is optimal against an optimal opponent What if your opponent is suboptimal? –Your utility can only be higher than if you were playing an optimal opponent! –A different strategy may work better for a suboptimal opponent, but it will necessarily be worse against an optimal opponent Example from D. Klein and P. Abbeel

More general games More than two players, non-zero-sum Utilities are now tuples Each player maximizes their own utility at their node Utilities get propagated (backed up) from children to parents

Alpha-beta pruning It is possible to compute the exact minimax decision without expanding every node in the game tree

Alpha-beta pruning (worked example) The slides step through the two-ply example tree, tracking at each node the range of values its minimax value can still take. Initially every node’s range is (−∞,+∞). After the first MIN node evaluates to 3, the root’s range becomes [3,+∞). At the second MIN node, the first leaf gives a range of (−∞,2]; this node is now worse for MAX, who is already guaranteed 3, so its remaining children are pruned. At the third MIN node, leaves 14 and 5 narrow the range to (−∞,14] and then [3,5] before its value 2 is found, and the root’s minimax value is 3.

General alpha-beta pruning Consider a node n in the tree. If the player has a better choice at: –the parent node of n –or any choice point further up then n will never be reached in play. Hence, as soon as that much is known about n, it can be pruned.

Alpha-beta Algorithm Depth-first search –only considers nodes along a single path from the root at any time α = highest-value choice found at any choice point of path for MAX (initially, α = −∞) β = lowest-value choice found at any choice point of path for MIN (initially, β = +∞) Pass current values of α and β down to child nodes during search. Update values of α and β during search: –MAX updates α at MAX nodes –MIN updates β at MIN nodes Prune remaining branches at a node when α ≥ β

When to Prune Prune whenever α ≥ β. –Prune below a MAX node whose alpha value becomes greater than or equal to the beta value of its ancestors. MAX nodes update alpha based on children’s returned values. –Prune below a MIN node whose beta value becomes less than or equal to the alpha value of its ancestors. MIN nodes update beta based on children’s returned values.

Alpha-Beta Example Revisited Initial values: α = −∞, β = +∞. Do DF-search until first leaf; α and β are passed to the kids.

Alpha-Beta Example (continued) MIN updates β based on kids: α = −∞, β = 3.

Alpha-Beta Example (continued) MIN updates β based on the remaining kids. No change: α = −∞, β = 3.

Alpha-Beta Example (continued) MAX updates α based on kids: α = 3, β = +∞. 3 is returned as node value.

Alpha-Beta Example (continued) α = 3, β = +∞ passed to kids.

Alpha-Beta Example (continued) MIN updates β based on kids: α = 3, β = 2.

Alpha-Beta Example (continued) α = 3, β = 2: α ≥ β, so prune.

Alpha-Beta Example (continued) 2 is returned as node value. MAX updates α based on kids. No change: α = 3, β = +∞.

Alpha-Beta Example (continued) α = 3, β = +∞ passed to kids.

Alpha-Beta Example (continued) MIN updates β based on kids: α = 3, β = 14.

Alpha-Beta Example (continued) MIN updates β based on kids: α = 3, β = 5.

Alpha-Beta Example (continued) MIN updates β based on kids: α = 3, β = 2.

Alpha-Beta Example (continued) 2 is returned as node value.

Alpha-Beta Example (continued) The search is complete: the root’s minimax value is 3.

Alpha-beta pruning α is the value of the best (highest value) choice for the MAX player found so far at any choice point above node n We want to compute the MIN-value at n As we loop over n’s children, the MIN-value decreases If it drops below α, MAX will never choose n, so we can ignore n’s remaining children

Alpha-beta pruning β is the value of the best (lowest value) choice for the MIN player found so far at any choice point above node n We want to compute the MAX-value at n As we loop over n’s children, the MAX-value increases If it goes above β, MIN will never choose n, so we can ignore n’s remaining children

Alpha-beta pruning
Function action = Alpha-Beta-Search(node)
  v = Min-Value(node, −∞, +∞)
  return the action from node with value v
α: best alternative available to the Max player
β: best alternative available to the Min player
Function v = Min-Value(node, α, β)
  if Terminal(node) return Utility(node)
  v = +∞
  for each action from node
    v = Min(v, Max-Value(Succ(node, action), α, β))
    if v ≤ α return v
    β = Min(β, v)
  end for
  return v

Alpha-beta pruning
Function action = Alpha-Beta-Search(node)
  v = Max-Value(node, −∞, +∞)
  return the action from node with value v
α: best alternative available to the Max player
β: best alternative available to the Min player
Function v = Max-Value(node, α, β)
  if Terminal(node) return Utility(node)
  v = −∞
  for each action from node
    v = Max(v, Min-Value(Succ(node, action), α, β))
    if v ≥ β return v
    α = Max(α, v)
  end for
  return v
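
The paired Max-Value/Min-Value functions above can be collapsed into one recursive routine. A minimal runnable sketch (using the same nested-list tree representation as an illustration, not the slides’ code):

```python
import math

def alpha_beta(node, alpha, beta, is_max):
    """Minimax value of `node` with alpha-beta pruning."""
    if isinstance(node, (int, float)):       # terminal: utility for MAX
        return node
    if is_max:
        v = -math.inf
        for child in node:
            v = max(v, alpha_beta(child, alpha, beta, False))
            if v >= beta:                    # MIN above will never allow this
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alpha_beta(child, alpha, beta, True))
        if v <= alpha:                       # MAX above will never choose this
            return v
        beta = min(beta, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # two-ply example tree
print(alpha_beta(tree, -math.inf, math.inf, True))  # 3, same as plain minimax
```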

Alpha-beta pruning Pruning does not affect final result Amount of pruning depends on move ordering

Example: which nodes can be pruned?

Answer to Example Answer: NONE! The most favorable nodes for both players are explored last (i.e., in the diagram, they are on the right-hand side).

Second Example (the exact mirror image of the first): which nodes can be pruned?

Answer to Second Example Answer: LOTS! The most favorable nodes for both players are explored first (i.e., in the diagram, they are on the left-hand side).

Alpha-beta pruning Pruning does not affect final result Amount of pruning depends on move ordering –Should start with the “best” moves (highest-value for MAX or lowest-value for MIN) –For chess, can try captures first, then threats, then forward moves, then backward moves –Can also try to remember “killer moves” from other branches of the tree With perfect ordering, the time to find the best move is reduced to O(b^(m/2)) from O(b^m) –Depth of search you can afford is effectively doubled