Game Playing Perfect decisions Heuristically based decisions Pruning search trees Games involving chance.

Differences from problem solving The opponent makes its own choices! Each choice the game-playing agent makes depends on the opponent's response. Playing quickly may be important, so we need a good way of approximating solutions and improving search.

Starting point: Look at entire tree

Minimax Decision Assign a utility value to each possible ending. Assures the best possible ending, assuming the opponent also plays perfectly (the opponent tries to give you the worst possible ending). A depth-first search tree traversal that updates utility values as it recurses back up the tree.

Simple game for example: Minimax decision MAX (player) MIN (opponent)

Simple game for example: Minimax decision MAX (player) MIN (opponent)

Properties of Minimax Time complexity O(b^m). Space complexity O(bm). Same complexity as depth-first search. For chess, b ≈ 35, m ≈ 100 for a "reasonable" game: completely intractable!

So what can you do? Cut off search early and apply a heuristic evaluation function. The evaluation function can assign point values to pieces, board position, and/or other characteristics. It represents, in some sense, the "probability" of winning. In practice, the evaluation function is often a weighted sum of features.
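A weighted-sum evaluation function might look like the following sketch. The features, piece values, and weights are assumptions chosen for illustration (loosely following conventional chess material values); a real engine tunes them:

```python
# Eval(s) = w1*f1(s) + w2*f2(s) + ...   Positive favors the player.
PIECE_VALUES = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

def evaluate(my_pieces, their_pieces, my_mobility, their_mobility):
    # Feature 1: material balance (sum of piece values, mine minus theirs)
    material = (sum(PIECE_VALUES[p] for p in my_pieces) -
                sum(PIECE_VALUES[p] for p in their_pieces))
    # Feature 2: mobility (number of legal moves, mine minus theirs)
    mobility = my_mobility - their_mobility
    return 1.0 * material + 0.1 * mobility   # weights are assumptions

# Up a rook, equal mobility:
print(evaluate(['R', 'P'], ['P'], 10, 10))   # 5.0
```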

How do you cut off search? Most straightforward: a depth limit... or even iterative deepening. Bad in some cases: what if a catastrophic move happens just beyond the depth limit? One fix: only apply the evaluation function to quiescent positions, i.e. those unlikely to have wild swings in the evaluation function. Example: no pieces about to be captured. Horizon problem: one piece runs away from another but must ultimately be lost. No generally good solution currently.
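One way to sketch such a cutoff test (the "capture pending" flag and the extension cap are hypothetical stand-ins for a real quiescence check):

```python
# Stop at the depth limit, but only in quiescent positions; otherwise
# search a bit deeper, up to a hard cap, to dodge swings just past
# the horizon.
def cutoff_test(depth, depth_limit, capture_pending, max_extension=2):
    if depth >= depth_limit + max_extension:
        return True                  # hard cap: stop regardless
    if depth >= depth_limit:
        return not capture_pending   # stop only if quiescent
    return False

print(cutoff_test(8, 8, capture_pending=True))   # False: extend the search
```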

How much lookahead for chess? Ply = half-move. Human novice: 4 ply. Typical PC, human master: 8 ply. Deep Blue, Kasparov: 12 ply. But if b = 35, m = 12: time ~ O(b^m) ≈ 3.4 x 10^18. Need to cut this down.
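The arithmetic behind that estimate is just b raised to the m-th power:

```python
# With branching factor b = 35 and a 12-ply search, plain minimax
# visits on the order of b**m positions.
b, m = 35, 12
positions = b ** m
print(f"{positions:.2e}")   # about 3.4e18
```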

Alpha-Beta Pruning: Example MAX (player) MIN (opponent)

Alpha-Beta Pruning: Example MAX (player) MIN (opponent) Stop right here when evaluating this node: the opponent takes the minimum of these nodes, while the player will take the maximum of the nodes above.

Alpha-Beta Pruning: Concept m n If m > n, Player would choose the m-node to get a guaranteed utility of at least m; the n-node would never be reached, so stop evaluation.

Alpha-Beta Pruning: Concept m n If m < n, Opponent would choose the m-node to get a guaranteed utility of at most m; the n-node would never be reached, so stop evaluation.

The Alpha and the Beta For a leaf, α = β = utility. At a max node: α = largest child utility found so far; β = β of parent. At a min node: α = α of parent; β = smallest child utility found so far. For any node: α <= utility <= β. "If I had to decide now, it would be..."
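The same bookkeeping can be sketched in Python (a fail-hard variant over an explicit tree; the tree values are invented for illustration):

```python
# alpha = best value MAX can guarantee so far on this path;
# beta = best value MIN can guarantee. Prune as soon as alpha >= beta.
def alphabeta(node, alpha=float('-inf'), beta=float('inf'), maximizing=True):
    if isinstance(node, (int, float)):   # leaf: alpha = beta = utility
        return node
    if maximizing:
        for child in node:
            alpha = max(alpha, alphabeta(child, alpha, beta, False))
            if alpha >= beta:
                break                    # MIN above will never allow this
        return alpha
    else:
        for child in node:
            beta = min(beta, alphabeta(child, alpha, beta, True))
            if beta <= alpha:
                break                    # MAX above will never allow this
        return beta

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree))   # 3: same value minimax returns, fewer leaves seen
```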

Originally from A: α = -inf, β = inf B: α = -inf, β = inf C: α = -inf, β = inf D: α = -inf, β = inf E: α = 10, β = 10 utility = 10

Originally from A: α = -inf, β = inf B: α = -inf, β = inf C: α = -inf, β = inf D: α = -inf, β = 10 E: α = 10, β = 10

Originally from A: α = -inf, β = inf B: α = -inf, β = inf C: α = -inf, β = inf D: α = -inf, β = 10 F: α = 11, β = 11

Originally from A: α = -inf, β = inf B: α = -inf, β = inf C: α = -inf, β = inf D: α = -inf, β = 10 utility = 10 F: α = 11, β = 11 utility = 11

Originally from A: α = -inf, β = inf B: α = -inf, β = inf C: α = 10, β = inf D: α = -inf, β = 10 utility = 10

Originally from A: α = -inf, β = inf B: α = -inf, β = inf C: α = 10, β = inf G: α = 10, β = inf

Originally from A: α = -inf, β = inf B: α = -inf, β = inf C: α = 10, β = inf G: α = 10, β = inf H: α = 9, β = 9 utility = 9

Originally from A: α = -inf, β = inf B: α = -inf, β = inf C: α = 10, β = inf G: α = 10, β = 9 utility = ? At an opponent node, with α > β: Stop here and backtrack (never visit I) H: α = 9, β = 9

Originally from A: α = -inf, β = inf B: α = -inf, β = inf C: α = 10, β = inf utility = 10 G: α = 10, β = 9 utility = ?

Originally from A: α = -inf, β = inf B: α = -inf, β = 10 C: α = 10, β = inf utility = 10

Originally from A: α = -inf, β = inf B: α = -inf, β = 10 J: α = -inf, β = 10... and so on!

How effective is alpha-beta in practice? Pruning does not affect the final result. With some extra heuristics (good move ordering): the effective branching factor becomes b^(1/2): 35 → 6. Can look ahead twice as far for the same cost. Can easily reach depth 8 and play good chess.
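The move-ordering claim can be checked empirically with a sketch like the one below. Everything here is invented for illustration: random trees, and an idealized "ordering heuristic" that cheats by sorting children on their exact minimax values (a real engine only has cheap guesses):

```python
import random

def minimax_value(node, maximizing):
    if isinstance(node, (int, float)):
        return node
    vals = [minimax_value(c, not maximizing) for c in node]
    return max(vals) if maximizing else min(vals)

def alphabeta_leaves(node, alpha, beta, maximizing, ordered):
    """Alpha-beta returning (value, number of leaves evaluated)."""
    if isinstance(node, (int, float)):
        return node, 1
    children = list(node)
    if ordered:   # best-first: descending for MAX, ascending for MIN
        children.sort(key=lambda c: minimax_value(c, not maximizing),
                      reverse=maximizing)
    leaves = 0
    for child in children:
        value, n = alphabeta_leaves(child, alpha, beta, not maximizing, ordered)
        leaves += n
        if maximizing:
            alpha = max(alpha, value)
        else:
            beta = min(beta, value)
        if alpha >= beta:
            break                # prune the remaining siblings
    return (alpha if maximizing else beta), leaves

def random_tree(depth, branching=5):
    if depth == 0:
        return random.randint(0, 99)
    return [random_tree(depth - 1, branching) for _ in range(branching)]

random.seed(1)
t = random_tree(6)               # up to 5**6 = 15625 leaves
NEG, POS = float('-inf'), float('inf')
value, plain = alphabeta_leaves(t, NEG, POS, True, ordered=False)
_, best_first = alphabeta_leaves(t, NEG, POS, True, ordered=True)
print(plain, best_first)         # far fewer leaves with good ordering
```

With perfect ordering the leaf count approaches the theoretical minimum of roughly b^(m/2), which is where the "look ahead twice as far" figure comes from.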

Deterministic games today Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. Used an endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 443,748,401,247 positions. Othello: human champions refuse to compete against computers, which are too good. Go: human champions refuse to compete against computers, which are too bad. In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.

Deterministic games today Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply.

More on Deep Blue Garry Kasparov, world champion, beat IBM's Deep Blue in 1996. In 1997, they played a rematch. Game 1: Kasparov won. Game 2: Kasparov resigned when he could have had a draw. Game 3: Draw. Game 4: Draw. Game 5: Draw. Game 6: Kasparov made some bad mistakes and resigned.

Kasparov said... "Unfortunately, I based my preparation for this match... on the conventional wisdom of what would constitute good anti-computer strategy. Conventional wisdom is -- or was until the end of this match -- to avoid early confrontations, play a slow game, try to out-maneuver the machine, force positional mistakes, and then, when the climax comes, not lose your concentration and not make any tactical mistakes. It was my bad luck that this strategy worked perfectly in Game 1 -- but never again for the rest of the match. By the middle of the match, I found myself unprepared for what turned out to be a totally new kind of intellectual challenge."

Some technical details on Deep Blue A 32-node IBM RS/6000 supercomputer. Each node has a Power Two Super Chip (P2SC) processor and 8 specialized chess processors, for a total of 256 chess processors working in parallel. Could calculate 60 billion moves in 3 minutes. The evaluation function (tuned via neural networks) considers material: how much the pieces are worth; position: how many safe squares pieces can attack; king safety: some measure of king safety; tempo: have you accomplished little while your opponent has improved their position? Written in C under the AIX operating system. Uses MPI to pass messages between nodes.

Alpha-Beta Pruning: Coding It
(defun max-value (state alpha beta)       ; no commas in Lisp lambda lists
  (if (cutoff-test state)
      (evaluate state)                    ; cutoff: return static evaluation
      (let ((node-value 0))
        (dolist (new-state (neighbors state) alpha)
          (setf node-value (min-value new-state alpha beta))
          (setf alpha (max alpha node-value))
          (if (>= alpha beta)
              (return beta))))))          ; prune: MIN above won't allow more

Alpha-Beta Pruning: Coding It
(defun min-value (state alpha beta)
  (if (cutoff-test state)
      (evaluate state)
      (let ((node-value 0))
        (dolist (new-state (neighbors state) beta)
          (setf node-value (max-value new-state alpha beta))
          (setf beta (min beta node-value))
          (if (<= beta alpha)
              (return alpha))))))         ; prune: MAX above won't allow less

Nondeterministic Games Games with an element of chance (e.g., dice, drawing cards), like backgammon, Risk, RoboRally, Magic, etc. Add chance nodes to the tree.

Example with coin flip instead of dice (simple)

Example with coin flip instead of dice (simple)

Expectimax Methodology For each chance node, determine the expected value. The evaluation function should be linear in value, otherwise expected-value calculations are wrong; evaluation should be linearly proportional to expected payoff. Complexity: O(b^m n^m), where n = number of random outcomes (distinct dice rolls). Alpha-beta pruning can be done, but it requires a bounded evaluation function (need to calculate upper/lower bounds on utilities) and is less effective.
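A minimal expectimax sketch in Python; the tree encoding (tagged tuples, with probability-subtree pairs under chance nodes) and the coin-flip values are invented for illustration:

```python
# A node is a number (utility), or ('max', children), ('min', children),
# or ('chance', [(probability, child), ...]).
def expectimax(node):
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == 'max':
        return max(expectimax(c) for c in children)
    if kind == 'min':
        return min(expectimax(c) for c in children)
    # chance node: probability-weighted average of outcomes
    return sum(p * expectimax(c) for p, c in children)

# A 50/50 coin flip before the opponent moves:
tree = ('max', [
    ('chance', [(0.5, ('min', [3, 5])), (0.5, ('min', [1, 9]))]),
    ('chance', [(0.5, ('min', [4, 6])), (0.5, ('min', [2, 8]))]),
])
print(expectimax(tree))   # max(0.5*3 + 0.5*1, 0.5*4 + 0.5*2) = 3.0
```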

Real World Most gaming systems start with these concepts, then apply various hacks and tricks to get around computability problems Databases of stored game configurations Learning (coming up next): Chapter 18