CS 188: Artificial Intelligence Fall 2008 Lecture 6: Adversarial Search 9/16/2008 Dan Klein – UC Berkeley Many slides over the course adapted from either Stuart Russell or Andrew Moore

Announcements
 Project 2 is up (Multi-Agent Pacman)
 Other announcements:
 No more Friday project deadlines (makes it hard to use slip days)
 After this week, we'll use section and drop box for written assignments rather than lecture
 Sanity checker issues – informal poll
 Looking for partners? Workload balanced for pairs

Local Search
 Queue-based algorithms keep fallback options (backtracking)
 Local search: improve what you have until you can't make it better
 Generally much more efficient (but incomplete)

Hill Climbing
 Simple, general idea:
 Start wherever
 Always choose the best neighbor
 If no neighbors have better scores than current, quit
 Why can this be a terrible idea?
 Complete?
 Optimal?
 What's good about it?
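The loop just described is compact enough to sketch directly. A minimal sketch, assuming hypothetical neighbors(state) and score(state) helpers, where score is to be maximized:

```python
def hill_climb(start, neighbors, score):
    """Greedy local search: repeatedly move to the best neighbor."""
    current = start
    while True:
        best = max(neighbors(current), key=score, default=None)
        if best is None or score(best) <= score(current):
            return current  # no strictly better neighbor: quit (local maximum)
        current = best
```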

Hill Climbing Diagram
 Random restarts?
 Random sideways steps?

Simulated Annealing
 Idea: Escape local maxima by allowing downhill moves
 But make them rarer as time goes on

Simulated Annealing
 Theoretical guarantee:
 Stationary distribution:
 If T decreased slowly enough, will converge to optimal state!
 Is this an interesting guarantee?
 Sounds like magic, but reality is reality:
 The more downhill steps you need to escape, the less likely you are to ever make them all in a row
 People think hard about ridge operators which let you jump around the space in better ways
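The stationary distribution alluded to above is the Boltzmann distribution, p(x) ∝ e^(E(x)/kT), which concentrates on optimal states as T → 0. As a concrete illustration of "allow downhill moves, but make them rarer as T cools," here is a minimal sketch; the neighbors, value, and schedule functions are assumptions, and neighbors is assumed to return a list:

```python
import math
import random

def simulated_annealing(start, neighbors, value, schedule, steps=10000):
    """Hill climbing that sometimes accepts downhill moves.

    A worse neighbor (dV < 0) is accepted with probability e^(dV/T),
    which shrinks as the temperature T cools over time.
    """
    current = start
    for t in range(1, steps + 1):
        T = schedule(t)            # e.g. lambda t: 1.0 / t
        if T <= 0:
            return current
        candidate = random.choice(neighbors(current))
        dV = value(candidate) - value(current)   # negative for downhill moves
        if dV > 0 or random.random() < math.exp(dV / T):
            current = candidate
    return current
```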

Beam Search
 Like hill-climbing search, but keep K states at all times:
 Variables: beam size, encourage diversity?
 The best choice in MANY practical settings
 Complete? Optimal?
 Why do we still need optimal methods?
 [Diagram: greedy search vs. beam search]
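A minimal sketch of "keep K states at all times," assuming hashable states and the same hypothetical neighbors/score helpers as above:

```python
import heapq

def beam_search(starts, neighbors, score, k, iterations=100):
    """Local beam search: expand every state in the beam, keep the k best."""
    beam = list(starts)
    for _ in range(iterations):
        candidates = set(beam)               # states must be hashable
        for state in beam:
            candidates.update(neighbors(state))
        beam = heapq.nlargest(k, candidates, key=score)
    return max(beam, key=score)
```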

Genetic Algorithms
 Genetic algorithms use a natural selection metaphor
 Like beam search (selection), but also have pairwise crossover operators, with optional mutation
 Probably the most misunderstood, misapplied (and even maligned) technique around!

Example: N-Queens
 Why does crossover make sense here?
 When wouldn't it make sense?
 What would mutation be?
 What would a good fitness function be?

Adversarial Search [DEMO: mystery pacman]

Game Playing State-of-the-Art
 Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. Used an endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 443,748,401,247 positions. Checkers is now solved!
 Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue examined 200 million positions per second, used very sophisticated evaluation and undisclosed methods for extending some lines of search up to 40 ply.
 Othello: human champions refuse to compete against computers, which are too good.
 Go: human champions refuse to compete against computers, which are too bad. In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.
 Pacman: unknown

GamesCrafters

Game Playing
 Many different kinds of games!
 Axes:
 Deterministic or stochastic?
 One, two or more players?
 Perfect information (can you see the state)?
 Want algorithms for calculating a strategy (policy) which recommends a move in each state

Deterministic Games
 Many possible formalizations, one is:
 States: S (start at s0)
 Players: P = {1...N} (usually take turns)
 Actions: A (may depend on player / state)
 Transition function: S × A → S
 Terminal test: S → {t, f}
 Terminal utilities: S × P → R
 Solution for a player is a policy: S → A
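One way to turn this formalization into code is an interface; the method names below are illustrative, not from the lecture:

```python
from typing import Hashable, Protocol, Sequence

class Game(Protocol):
    """Illustrative encoding of the formalization above."""
    def start(self) -> Hashable: ...            # s0
    def player(self, s) -> int: ...             # whose turn, in P = {1...N}
    def actions(self, s) -> Sequence: ...       # legal actions A in state s
    def result(self, s, a) -> Hashable: ...     # transition function S × A → S
    def is_terminal(self, s) -> bool: ...       # terminal test S → {t, f}
    def utility(self, s, p: int) -> float: ...  # terminal utility S × P → R
```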

Deterministic Single-Player?
 Deterministic, single player, perfect information:
 Know the rules
 Know what actions do
 Know when you win
 E.g. Freecell, 8-Puzzle, Rubik's cube
 … it's just search!
 Slight reinterpretation:
 Each node stores a value: the best outcome it can reach
 This is the maximal outcome of its children
 Note that we don't have path sums as before (utilities at end)
 After search, can pick move that leads to best node
 [Diagram: search tree with win / lose leaves]

Deterministic Two-Player
 E.g. tic-tac-toe, chess, checkers
 Minimax search
 A state-space search tree
 Players alternate
 Each layer, or ply, consists of a round of moves
 Choose move to position with highest minimax value = best achievable utility against best play
 Zero-sum games
 One player maximizes result
 The other minimizes result
 [Diagram: two-ply max/min tree over leaf values 8, 2, 5, 6]

Tic-tac-toe Game Tree

Minimax Example

Minimax Search
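The pseudocode on the original slide did not survive transcription; the following is a standard recursive rendering in Python, assuming the illustrative Game interface sketched earlier, with player 1 as MAX and utilities reported from MAX's point of view:

```python
def minimax_value(game, state):
    """Minimax value of a state (player 1 is MAX, player 2 is MIN)."""
    if game.is_terminal(state):
        return game.utility(state, 1)
    values = [minimax_value(game, game.result(state, a))
              for a in game.actions(state)]
    return max(values) if game.player(state) == 1 else min(values)

def minimax_decision(game, state):
    """Choose the move leading to the best minimax value for MAX."""
    return max(game.actions(state),
               key=lambda a: minimax_value(game, game.result(state, a)))
```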

Minimax Properties
 Optimal against a perfect player. Otherwise?
 Time complexity? O(b^m)
 Space complexity? O(bm)
 For chess, b ≈ 35, m ≈ 100
 Exact solution is completely infeasible
 But, do we need to explore the whole tree?
 [DEMO: minVsExp]

Resource Limits
 Cannot search to leaves
 Depth-limited search
 Instead, search a limited depth of tree
 Replace terminal utilities with an eval function for non-terminal positions
 Guarantee of optimal play is gone
 More plies makes a BIG difference
 [DEMO: limitedDepth]
 Example:
 Suppose we have 100 seconds, can explore 10K nodes / sec
 So can check 1M nodes per move
 α-β reaches about depth 8 – decent chess program
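A minimal sketch of depth-limited search: minimax cut off at a fixed depth, with a hypothetical evaluate(state) function standing in for terminal utilities:

```python
def depth_limited_value(game, state, depth, evaluate):
    """Minimax value, but cut off after `depth` plies."""
    if game.is_terminal(state):
        return game.utility(state, 1)
    if depth == 0:
        return evaluate(state)      # estimate; optimality guarantee is gone
    values = [depth_limited_value(game, game.result(state, a), depth - 1, evaluate)
              for a in game.actions(state)]
    return max(values) if game.player(state) == 1 else min(values)
```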

Evaluation Functions
 Function which scores non-terminals
 Ideal function: returns the utility of the position
 In practice: typically a weighted linear sum of features: Eval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)
 e.g. f1(s) = (num white queens – num black queens), etc.
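The weighted linear sum translates directly; the feature functions in the usage comment are hypothetical examples:

```python
def linear_eval(state, weights, features):
    """Eval(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s)."""
    return sum(w * f(state) for w, f in zip(weights, features))

# Hypothetical chess-like usage:
# features = [lambda s: s.white_queens - s.black_queens,
#             lambda s: s.white_pawns - s.black_pawns]
# evaluate = lambda s: linear_eval(s, weights=[9.0, 1.0], features=features)
```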

Evaluation for Pacman [DEMO: thrashing, smart ghosts]

Iterative Deepening
 Iterative deepening uses DFS as a subroutine:
 1. Do a DFS which only searches for paths of length 1 or less. (DFS gives up on any path of length 2)
 2. If "1" failed, do a DFS which only searches paths of length 2 or less.
 3. If "2" failed, do a DFS which only searches paths of length 3 or less.
 … and so on.
 This works for single-agent search as well!
 Why do we want to do this for multiplayer games?
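For game play, the payoff is an anytime algorithm: search deeper and deeper, keeping the best move from the last fully completed depth. A minimal sketch reusing depth_limited_value from above, with a hypothetical time_left() predicate:

```python
def iterative_deepening_move(game, state, evaluate, time_left):
    """Return the best move found at the deepest fully searched depth."""
    best_move, depth = None, 1
    while time_left():              # a real version would also abort mid-depth
        best_move = max(game.actions(state),
                        key=lambda a: depth_limited_value(
                            game, game.result(state, a), depth - 1, evaluate))
        depth += 1
    return best_move
```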

Pruning in Minimax Search
 [Diagram: minimax tree annotated with value intervals [−∞, +∞], [−∞, 3], [−∞, 2], [−∞, 14], [3, 3], [−∞, 5], [2, 2], [3, +∞], [3, 14], [3, 5], [3, 3]]

 -  Pruning Example 27

 -  Pruning  General configuration   is the best value that MAX can get at any choice point along the current path  If n becomes worse than , MAX will avoid it, so can stop considering n’s other children  Define  similarly for MIN Player Opponent Player Opponent  n 28

 -  Pruning Pseudocode  v 29

 -  Pruning Properties  Pruning has no effect on final result  Good move ordering improves effectiveness of pruning  With “perfect ordering”:  Time complexity drops to O(b m/2 )  Doubles solvable depth  Full search of, e.g. chess, is still hopeless!  A simple example of metareasoning, here reasoning about which computations are relevant 30

Non-Zero-Sum Games
 Similar to minimax:
 Utilities are now tuples
 Each player maximizes their own entry at each node
 Propagate (or back up) nodes from children
 [Diagram: tree with leaf utility tuples (1,2,6), (4,3,2), (6,1,2), (7,4,1), (5,1,1), (1,5,2), (7,7,1), (5,4,5)]
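A minimal sketch of the tuple backup, assuming a hypothetical game.utilities(state) accessor that returns one entry per player:

```python
def multi_player_value(game, state):
    """Back up utility tuples: the player to move maximizes their own entry."""
    if game.is_terminal(state):
        return game.utilities(state)        # e.g. (1, 2, 6) for three players
    p = game.player(state) - 1              # index of the mover's entry
    return max((multi_player_value(game, game.result(state, a))
                for a in game.actions(state)),
               key=lambda u: u[p])
```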

Stochastic Single-Player
 What if we don't know what the result of an action will be? E.g.,
 In solitaire, shuffle is unknown
 In minesweeper, mine locations
 In pacman, ghosts!
 Can do expectimax search
 Chance nodes, like actions except the environment controls the action chosen
 Calculate utility for each node
 Max nodes as in search
 Chance nodes take average (expectation) of value of children
 Later, we'll learn how to formalize this as a Markov Decision Process
 [DEMO: minVsExp]
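A minimal expectimax sketch. It assumes game.player(state) returns the string 'chance' at chance nodes and, for simplicity, that chance outcomes are uniformly likely; a weighted version would multiply each child's value by its probability:

```python
def expectimax_value(game, state):
    """Max nodes take the max; chance nodes take the average of children."""
    if game.is_terminal(state):
        return game.utility(state, 1)
    values = [expectimax_value(game, game.result(state, a))
              for a in game.actions(state)]
    if game.player(state) == 'chance':
        return sum(values) / len(values)    # expectation over outcomes
    return max(values)                      # the agent's own move
```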

Stochastic Two-Player
 E.g. backgammon
 Expectiminimax (!)
 Environment is an extra player that moves after each agent
 Chance nodes take expectations, otherwise like minimax

Stochastic Two-Player
 Dice rolls increase b: 21 possible rolls with 2 dice
 Backgammon: ≈ 20 legal moves
 Depth 4 = 20 × (21 × 20)^3 ≈ 1.2 × 10^9 nodes
 As depth increases, probability of reaching a given node shrinks
 So value of lookahead is diminished
 So limiting depth is less damaging
 But pruning is less possible…
 TDGammon uses depth-2 search + very good eval function + reinforcement learning: world-champion level play

What's Next?
 Make sure you know what:
 Probabilities are
 Expectations are
 Next topics:
 Dealing with uncertainty
 How to learn evaluation functions
 Markov Decision Processes