CS 188: Artificial Intelligence Fall 2006 Lecture 7: Adversarial Search 9/19/2006 Dan Klein – UC Berkeley Many slides over the course adapted from either.

Presentation transcript:

CS 188: Artificial Intelligence Fall 2006 Lecture 7: Adversarial Search 9/19/2006 Dan Klein – UC Berkeley Many slides over the course adapted from either Stuart Russell or Andrew Moore

Announcements
- Project 1.2 is up (Single-Agent Pacman)
- Critical update: make sure you have the most recent version!
- Reminder: you are allowed to work with a partner!
- Change to John’s section: M 3-4pm now in 4 Evans

Motion as Search
- Motion planning as a path-finding problem
- Problem: configuration space is continuous
- Problem: under-constrained motion
- Problem: configuration space can be complex
[Figure: Why are there two paths from 1 to 2?]

Decomposition Methods
- Break c-space into discrete regions
- Solve as a discrete problem

Approximate Decomposition
- Break c-space into a grid
- Search (A*, etc.)
- What can go wrong?
  - If no path is found, can subdivide and repeat
- Problems?
  - Still scales poorly
  - Incomplete*
  - Wiggly paths
[Figure: grid with start S and goal G]
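The grid approach on this slide can be sketched as follows. This is a minimal illustration, assuming a 4-connected occupancy grid with unit step costs and a Manhattan-distance heuristic; the name `grid_astar` and the 0/1 cell encoding are hypothetical choices, not taken from the lecture.

```python
import heapq

def grid_astar(grid, start, goal):
    """A* over a 4-connected grid; grid[r][c] == 1 marks an obstacle cell (assumed encoding)."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]   # entries are (f = g + h, g, cell)
    best_cost = {start: 0}
    parent = {start: None}
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:      # walk parent links back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue                     # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    parent[(nr, nc)] = cell
                    heapq.heappush(frontier, (new_cost + h((nr, nc)), new_cost, (nr, nc)))
    return None  # no path at this resolution; a finer grid might still find one
```

Returning `None` corresponds to the "subdivide and repeat" case: a failure at one resolution does not prove that no path exists in the continuous space.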

Hierarchical Decomposition
- But:
  - Not optimal
  - Not complete
  - Still hopeless above a small number of dimensions
- Actually used in practical systems

Skeletonization Methods
- Decomposition methods turn configuration space into a grid
- Skeletonization methods turn it into a set of points, with preset linear paths between them

Visibility Graphs
- Shortest paths:
  - No obstacles: straight line
  - Otherwise: will go from vertex to vertex
  - Fairly obvious, but somewhat awkward to prove
- Visibility methods:
  - All free vertex-to-vertex lines (visibility graph)
  - Search using, e.g., A*
  - Can be done in O(n^3) easily, O(n^2 log n) less easily
- Problems?
  - Bang, screech!
  - Not robust to control errors
  - Wrong kind of optimality?
[Figure: paths from q_start to q_goal]

Voronoi Decomposition
- Voronoi regions: points colored by closest obstacle
- Voronoi diagram: borders between regions
- Can be calculated efficiently for points (and polygons) in 2D
- In higher dimensions, some approximation methods
[Figure: Voronoi regions labeled R, G, B, Y]

Voronoi Decomposition
- Algorithm:
  - Compute the Voronoi diagram of the configuration space
  - Compute shortest path (line) from start to closest point on Voronoi diagram
  - Compute shortest path (line) from goal to closest point on Voronoi diagram
  - Compute shortest path from start to goal along Voronoi diagram
- Problems:
  - Hard in more than two dimensions, hard with complex obstacles
  - Can do weird things

Probabilistic Roadmaps
- Idea: just pick random points as nodes in a visibility graph
- This gives probabilistic roadmaps
  - Very successful in practice
  - Lets you add points where you need them
  - If insufficient points: incomplete, or weird paths

Roadmap Example

Potential Field Methods
- So far: implicit preference for short paths
- Rational agent should balance distance with risk!
- Idea: introduce cost for being close to an obstacle
- Can do this with discrete methods (how?)
- Usually most natural with continuous methods

Potential Fields
- Cost for:
  - Being far from goal
  - Being near an obstacle
- Go downhill
- What could go wrong?
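One way to make "go downhill" concrete is a greedy descent over integer grid points. The sketch below is an assumption-heavy illustration: the potential (Euclidean distance to the goal plus an inverse-distance penalty per obstacle), the `repulse` constant, and the function names are all hypothetical, and the lecture does not prescribe any of them.

```python
import math

def potential(p, goal, obstacles, repulse=2.0):
    """Cost for being far from the goal plus cost for being near any obstacle (assumed form)."""
    attract = math.dist(p, goal)
    repel = sum(repulse / (math.dist(p, o) + 1e-6) for o in obstacles)
    return attract + repel

def descend(start, goal, obstacles, max_steps=100):
    """Greedy downhill walk on an 8-connected integer grid."""
    path = [start]
    current = start
    for _ in range(max_steps):
        if current == goal:
            break
        x, y = current
        neighbors = [(x + dx, y + dy)
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0)]
        best = min(neighbors, key=lambda n: potential(n, goal, obstacles))
        if potential(best, goal, obstacles) >= potential(current, goal, obstacles):
            break  # no neighbor is lower: stuck in a local minimum
        current = best
        path.append(current)
    return path
```

The early `break` is the usual answer to "what could go wrong": the walk can stop in a local minimum that is not the goal.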

Adversarial Search [DEMO 1]

Game Playing State-of-the-Art
- Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley. It used an endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 443,748,401,247 positions.
- Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match. Deep Blue examined 200 million positions per second and used very sophisticated evaluation and undisclosed methods for extending some lines of search up to 40 ply.
- Othello: human champions refuse to compete against computers, which are too good.
- Go: human champions refuse to compete against computers, which are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.
- Pacman: unknown

Game Playing
- Axes:
  - Deterministic or stochastic?
  - One, two or more players?
  - Perfect information (can you see the state)?
- Want algorithms for calculating a strategy (policy) which recommends a move in each state

Deterministic Single-Player?
- Deterministic, single player, perfect information:
  - Know the rules
  - Know what actions do
  - Know when you win
- E.g. Freecell, 8-Puzzle, Rubik’s cube
- … it’s just search!
- Slight reinterpretation:
  - Each node stores the best outcome it can reach
  - This is the maximal outcome of its children
  - Note that we don’t store path sums as before
- After search, can pick move that leads to best node
[Figure: search tree with "win" and "lose" leaves]

Deterministic Two-Player
- E.g. tic-tac-toe, chess, checkers
- Minimax search
  - A state-space search tree
  - Players alternate
  - Each layer, or ply, consists of a round of moves
  - Choose move to position with highest minimax value = best achievable utility against best play
- Zero-sum games
  - One player maximizes result
  - The other minimizes result
[Figure: game tree with a max layer, a min layer, and leaf values 8, 2, 5, 6]

Tic-tac-toe Game Tree

Minimax Example

Minimax Search
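A minimal sketch of minimax search, assuming the game tree is given explicitly as nested Python lists whose leaves are terminal utilities for the maximizing player. That tree encoding and the names `minimax` and `best_move` are illustrative assumptions; a real game would generate successors from states instead.

```python
def minimax(node, maximizing=True):
    """Return the minimax value of a nested-list game tree."""
    if not isinstance(node, list):
        return node  # leaf: terminal utility
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def best_move(root):
    """Index of the root child with the highest minimax value (MAX moves at the root)."""
    return max(range(len(root)), key=lambda i: minimax(root[i], maximizing=False))

# Leaf values 8, 2, 5, 6 as in the two-player figure; pairing them as
# [[8, 2], [5, 6]] is an assumption about the tree shape.
tree = [[8, 2], [5, 6]]
print(minimax(tree), best_move(tree))
```

On this tree MIN would answer the first move with 2 and the second with 5, so MAX prefers the second move.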

Minimax Properties
- Optimal against a perfect player. Otherwise?
- Time complexity?
  - O(b^m)
- Space complexity?
  - O(bm)
- For chess, b ≈ 35, m ≈ 100
  - Exact solution is completely infeasible
  - But, do we need to explore the whole tree?
[DEMO2: minVsExp]

Resource Limits
- Cannot search to leaves
- Limited search
  - Instead, search a limited depth of the tree
  - Replace terminal utilities with an eval function for non-terminal positions
- Guarantee of optimal play is gone
- More plies makes a BIG difference
- [DEMO 3: limitedDepth]
- Example:
  - Suppose we have 100 seconds, can explore 10K nodes / sec
  - So can check 1M nodes per move
  - α-β reaches about depth 8: a decent chess program
[Figure: depth-limited tree with max and min layers and unknown (?) values below the cutoff]
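The depth cutoff and the slide's arithmetic can be sketched together. This is a rough illustration under stated assumptions: the nested-list tree encoding, the caller-supplied `evaluate` function, and the names `depth_limited_value` and `reachable_depth` are all hypothetical.

```python
import math

def depth_limited_value(state, depth, maximizing, evaluate):
    """Minimax that stops at `depth` plies and scores cut-off positions with `evaluate`."""
    if not isinstance(state, list):
        return state                  # true terminal utility
    if depth == 0:
        return evaluate(state)        # non-terminal at the cutoff: use the estimate
    values = [depth_limited_value(child, depth - 1, not maximizing, evaluate)
              for child in state]
    return max(values) if maximizing else min(values)

def reachable_depth(node_budget, branching):
    """Depth d such that branching**d equals the node budget (plain minimax)."""
    return math.log(node_budget) / math.log(branching)

budget = 100 * 10_000                 # 100 seconds at 10K nodes/sec = 1M nodes
plain = reachable_depth(budget, 35)   # roughly 3.9 plies for b = 35
print(round(plain, 1), round(2 * plain, 1))
```

With perfectly ordered α-β the effective cost is about O(b^(m/2)), which doubles the plain figure to just under 8 plies and matches the "about depth 8" on the slide.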

Evaluation Functions
- Function which scores non-terminals
- Ideal function: returns the utility of the position
- In practice: typically a weighted linear sum of features: Eval(s) = w_1 f_1(s) + w_2 f_2(s) + ... + w_n f_n(s)
  - e.g. f_1(s) = (num white queens - num black queens), etc.
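A weighted linear evaluation function can be sketched as below. The state encoding (piece counts per side) and the weights 9, 5 and 1 are illustrative assumptions only; the lecture gives just the queen-difference feature as an example.

```python
def material_feature(piece):
    """Build a feature f(s) = (num white <piece>) - (num black <piece>)."""
    return lambda s: s["white"].get(piece, 0) - s["black"].get(piece, 0)

# (weight, feature) pairs; the weights are hypothetical, not from the lecture.
WEIGHTED_FEATURES = [
    (9.0, material_feature("queen")),
    (5.0, material_feature("rook")),
    (1.0, material_feature("pawn")),
]

def evaluate(state, weighted_features=WEIGHTED_FEATURES):
    """Eval(s) = sum over i of w_i * f_i(s)."""
    return sum(w * f(state) for w, f in weighted_features)

position = {"white": {"queen": 1, "pawn": 6}, "black": {"rook": 1, "pawn": 8}}
print(evaluate(position))  # 9*1 + 5*(-1) + 1*(-2)
```

Keeping the features separate from the weights makes it straightforward to tune or learn the weights later without touching the feature code.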

Evaluation for Pacman [DEMO4: evalFunction]

Iterative Deepening
Iterative deepening uses DFS as a subroutine:
1. Do a DFS which only searches for paths of length 1 or less. (DFS gives up on any path of length 2.)
2. If “1” failed, do a DFS which only searches paths of length 2 or less.
3. If “2” failed, do a DFS which only searches paths of length 3 or less.
...and so on.
This works for single-agent search as well! Why do we want to do this for multiplayer games?
[Figure: search tree with branching factor b]
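The numbered steps above can be sketched for the single-agent case as follows, assuming the graph is an adjacency dict. The names `depth_limited_dfs` and `iterative_deepening` and the `max_limit` cap are illustrative assumptions.

```python
def depth_limited_dfs(graph, node, goal, limit, path=None):
    """DFS that gives up on any path longer than `limit` edges."""
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:
        return None
    for nxt in graph.get(node, []):
        if nxt not in path:                      # avoid cycles along the current path
            found = depth_limited_dfs(graph, nxt, goal, limit - 1, path)
            if found is not None:
                return found
    return None

def iterative_deepening(graph, start, goal, max_limit=20):
    """Retry the depth-limited DFS with limits 1, 2, 3, ... until one succeeds."""
    for limit in range(1, max_limit + 1):
        found = depth_limited_dfs(graph, start, goal, limit)
        if found is not None:
            return found
    return None

graph = {"S": ["A", "B"], "A": ["C"], "B": ["G"], "C": ["G"]}
print(iterative_deepening(graph, "S", "G"))
```

Plain DFS would return the three-edge path through A and C first; the deepening loop finds the two-edge path through B. In a game setting the same loop is useful because the result of the deepest completed iteration is always available when time runs out.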

Pruning in Minimax Search
[Figure: example tree annotated with value intervals as the search proceeds: [-∞, +∞], [-∞, 3], [-∞, 2], [-∞, 14], [3, 3], [-∞, 5], [2, 2], [3, +∞], [3, 14], [3, 5], [3, 3]]

 -  Pruning Example

 -  Pruning  General configuration   is the best value the Player can get at any choice point along the current path  If n is worse than , MAX will avoid it, so prune n’s branch  Define  similarly for MIN Player Opponent Player Opponent  n

 -  Pruning Pseudocode  v

 -  Pruning Properties  Pruning has no effect on final result  Good move ordering improves effectiveness of pruning  With “perfect ordering”:  Time complexity drops to O(b m/2 )  Doubles solvable depth  Full search of, e.g. chess, is still hopeless!  A simple example of metareasoning, here reasoning about which computations are relevant

Non-Zero-Sum Games
- Similar to minimax:
  - Utilities are now tuples
  - Each player maximizes their own entry at each node
  - Propagate (or back up) nodes from children
[Figure: three-player game tree with leaf utilities (1,2,6), (4,3,2), (6,1,2), (7,4,1), (5,1,1), (1,5,2), (7,7,1), (5,4,5)]
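Backing up utility tuples can be sketched as below. The tree shape (three players moving in turn, two choices each) is an assumption about the lost figure; only the eight leaf tuples come from the slide. The function name and the tuple-versus-list encoding are hypothetical.

```python
def multiplayer_value(node, player=0, num_players=3):
    """Back up utility tuples: the player to move picks the child best for themselves."""
    if isinstance(node, tuple):
        return node                                   # leaf: one utility per player
    next_player = (player + 1) % num_players
    child_values = [multiplayer_value(child, next_player, num_players) for child in node]
    return max(child_values, key=lambda u: u[player])  # ties go to the first child

# Assumed binary tree: player 0 at the root, then player 1, then player 2.
tree = [
    [[(1, 2, 6), (4, 3, 2)], [(6, 1, 2), (7, 4, 1)]],
    [[(5, 1, 1), (1, 5, 2)], [(7, 7, 1), (5, 4, 5)]],
]
print(multiplayer_value(tree))
```

Under this assumed shape the third player's choices yield (1,2,6), (6,1,2), (1,5,2) and (5,4,5); the second player then keeps (1,2,6) and (1,5,2), and the root breaks the tie on its own entry in favor of the first child.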

Stochastic Single-Player
- What if we don’t know what the result of an action will be? E.g.,
  - In solitaire, shuffle is unknown
  - In minesweeper, don’t know where the mines are
- Can do expectimax search
  - Chance nodes, like actions except the environment controls the action chosen
  - Calculate utility for each node
  - Max nodes as in search
  - Chance nodes take average (expectation) of value of children
- Later, we’ll learn how to formalize this as a Markov Decision Process
[Figure: tree with a max layer, an average (chance) layer, and leaf values 8, 2, 5, 6]
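A minimal expectimax sketch, assuming nodes are encoded as `("max", children)` or `("chance", children)` tuples with numeric leaves, and assuming every outcome at a chance node is equally likely. Both the encoding and the uniform probabilities are illustrative assumptions.

```python
def expectimax(node):
    """Max nodes take the best child; chance nodes take the average of their children."""
    if isinstance(node, (int, float)):
        return float(node)
    kind, children = node
    values = [expectimax(child) for child in children]
    if kind == "max":
        return max(values)
    if kind == "chance":
        return sum(values) / len(values)   # uniform outcomes assumed
    raise ValueError(f"unknown node kind: {kind}")

# Leaf values 8, 2, 5, 6 as in the figure; the pairing is an assumption.
tree = ("max", [("chance", [8, 2]), ("chance", [5, 6])])
print(expectimax(tree))
```

Compared with minimax on the same leaves, the first action is worth 5 on average rather than its worst case of 2, which changes how close the two actions are.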

Stochastic Two-Player
- E.g. backgammon
- Expectiminimax (!)
  - Environment is an extra player that moves after each agent
  - Chance nodes take expectations, otherwise like minimax
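Expectiminimax can be sketched by adding a chance case to minimax. The encoding below is an assumption: `("max", children)`, `("min", children)`, and `("chance", [(probability, child), ...])`, with numeric leaves; the example tree and its probabilities are made up for illustration.

```python
def expectiminimax(node):
    """Minimax with chance nodes that take a probability-weighted expectation."""
    if isinstance(node, (int, float)):
        return float(node)
    kind, children = node
    if kind == "max":
        return max(expectiminimax(child) for child in children)
    if kind == "min":
        return min(expectiminimax(child) for child in children)
    if kind == "chance":
        return sum(p * expectiminimax(child) for p, child in children)
    raise ValueError(f"unknown node kind: {kind}")

# Hypothetical tree: MAX moves, the environment "rolls", then MIN replies.
tree = ("max", [
    ("chance", [(0.5, ("min", [4, 8])), (0.5, ("min", [2, 6]))]),
    ("chance", [(0.9, ("min", [1, 9])), (0.1, ("min", [20, 30]))]),
])
print(expectiminimax(tree))
```

The second move has a much better best case, but its expectation (0.9 × 1 + 0.1 × 20) falls just short of the first move's 3.0.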

Stochastic Two-Player
- Dice rolls increase b: 21 possible rolls with 2 dice
  - Backgammon: ≈ 20 legal moves
  - Depth 4 = 20 × (21 × 20)^3, on the order of 10^9
- As depth increases, probability of reaching a given node shrinks
  - So value of lookahead is diminished
  - So limiting depth is less damaging
  - But pruning is less possible…
- TDGammon uses depth-2 search + very good eval function + reinforcement learning: world-champion-level play
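The branching arithmetic can be checked with a few lines. This sketch assumes the intended count is one move followed by three (roll, move) pairs, i.e. 20 × (21 × 20)^3; the exponent is not legible in the transcript, so treat it as an assumption. The function names are hypothetical.

```python
from itertools import product

def distinct_rolls(sides=6):
    """Number of distinguishable outcomes of two dice when order does not matter."""
    return len({tuple(sorted(roll)) for roll in product(range(1, sides + 1), repeat=2)})

def nodes_at_depth(moves, rolls, depth):
    """One move, then (depth - 1) rounds of a dice roll followed by a move."""
    return moves * (rolls * moves) ** (depth - 1)

rolls = distinct_rolls()
print(rolls, nodes_at_depth(20, rolls, 4))
```

The dice alone multiply the effective branching factor by 21 at every ply, which is why even depth 4 already costs on the order of a billion nodes.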

What’s Next?
- Make sure you know what:
  - Probabilities are.
  - Expectations are.
- Next:
  - How to learn evaluation functions
  - Markov Decision Processes