Rutgers CS440, Fall 2003 Lecture 6: Adversarial Search & Games Reading: Ch. 6, AIMA.

Adversarial search
So far we have studied single-agent search – no opponents or collaborators.
Multi-agent search:
– Playing a game against an opponent: adversarial search
– Economies: even more complex – societies of cooperative and non-cooperative agents
Game playing and AI:
– Games can be complex and (arguably) require human intelligence
– Play has to unfold in "real time"
– Well-defined problems
– Limited scope

Games and AI

                  Deterministic                  Chance
Perfect info      Checkers, Chess, Go, Othello   Backgammon, Monopoly
Imperfect info                                   Bridge, Poker, Scrabble

Games and search
Traditional search: a single agent searches for its own well-being, unobstructed.
Games: search against an opponent.
Consider a two-player board game:
– e.g., chess, checkers, tic-tac-toe
– board configuration: a unique arrangement of "pieces"
Representing a board game as a search problem:
– states: board configurations
– operators: legal moves
– initial state: the current board configuration
– goal states: winning/terminal board configurations

Wrong representation
We want to optimize our (agent's) goal, so we might build a search tree based only on our own possible moves/actions.
Problem: this ignores the opponent entirely.
[Figure: tic-tac-toe search tree expanding only the agent's X moves]

Better representation: game search tree
Include the opponent's actions as well. Levels alternate: agent move, opponent move, agent move, and so on; one agent move plus the opponent's reply makes up a full move. Utilities are assigned to the goal nodes.
[Figure: tic-tac-toe game tree with alternating agent/opponent levels and utilities at the terminal nodes]

Game search trees
How large are game search trees?
– O(b^d)
– Tic-tac-toe: 9! leaves (max depth = 9)
– Chess: ~35 legal moves per position, average game "depth" ~100, so b^d ~ 35^100 ~ 10^154 states – yet "only" ~10^40 distinct legal states
Too deep for exhaustive search!
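A quick sanity check on those magnitudes (the exponents are approximate; this snippet is illustrative, not part of the lecture):

```python
import math

# Tic-tac-toe: 9! move sequences from the empty board
print(math.factorial(9))        # 362880

# Chess: b^d with b ~ 35, d ~ 100, expressed as a power of ten
print(100 * math.log10(35))     # ~154.4, i.e. 35^100 ~ 10^154
```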

Utilities in search trees
Assign a utility to (terminal) states, describing how much each is worth to the agent:
– High utility – good for the agent
– Low utility – good for the opponent
[Figure: game tree with the computer's possible moves at the root, the opponent's possible moves below, board evaluations from the agent's perspective, and utilities at the terminal states]

Search strategy
Worst-case scenario: assume the opponent will always make their best move (i.e., the worst move for us).
Minimax search: maximize the utility for our agent while expecting the opponent to play their best moves:
1. High utility favors the agent => choose the move with maximal utility
2. Low utility favors the opponent => assume the opponent makes the move with the lowest utility
[Figure: two-ply tree – the computer's possible moves at the top, the opponent's possible moves below, terminal states at the leaves]

Minimax algorithm
1. Start with the utilities of the terminal nodes.
2. Propagate them back to the root node using the minimax strategy: take the minimum over children at the opponent's (min) levels and the maximum at the agent's (max) levels.
[Figure: the example tree with values propagated upward through a min step and then a max step]
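A minimal recursive sketch of this propagation, over a toy game-tree encoding (a leaf is a number, an internal node is a list of children); the encoding is an assumption for illustration, not the lecture's code:

```python
def minimax(node, maximizing):
    """Minimax value of a toy tree: leaf = utility, list = children."""
    if not isinstance(node, list):          # step 1: utility at a leaf
        return node
    values = [minimax(child, not maximizing) for child in node]
    # step 2: propagate upward -- max at agent levels, min at opponent levels
    return max(values) if maximizing else min(values)

# Two-ply example: agent moves, opponent replies at the min level
tree = [[3, 9, -6], [2, 0, 1], [-7, -5, 3]]
print(minimax(tree, True))                  # min values -6, 0, -7 -> max is 0
```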

Complexity of minimax algorithm
Utilities propagate up in a recursive fashion – a depth-first search (DFS).
Space complexity: O(bd)
Time complexity: O(b^d)
Problem: the time complexity – it's a game, and there is only finite time to make a move.

Reducing complexity of minimax (1)
Don't search to the full depth d – terminate early and prune bad paths.
Problem: we don't have utilities for non-terminal nodes.
Estimate the utility of non-terminal nodes instead:
– a static board evaluation (SBE) function is a heuristic that assigns a utility to a non-terminal node
– it reflects the computer's chances of winning from that node
– it must be easy to calculate from the board configuration
For example, in chess:
SBE = α * materialBalance + β * centerControl + γ * ...
material balance = value of white pieces - value of black pieces
(pawn = 1, rook = 5, queen = 9, etc.)
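A sketch of such an SBE. The board format (a string of piece codes, uppercase white, lowercase black), the weights, and the stubbed `center_control` term are assumptions for illustration:

```python
# Piece values from the slide: pawn = 1, rook = 5, queen = 9, etc.
PIECE_VALUE = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

def material_balance(board):
    """Value of white pieces minus value of black pieces."""
    score = 0
    for piece in board:
        value = PIECE_VALUE.get(piece.upper(), 0)   # kings etc. count as 0
        score += value if piece.isupper() else -value
    return score

def center_control(board):
    """Placeholder for a second heuristic term."""
    return 0

def sbe(board, alpha=1.0, beta=0.5):
    # weighted sum of heuristic terms, as on the slide
    return alpha * material_balance(board) + beta * center_control(board)

print(material_balance("QNPPPrrp"))   # white 9+3+3 = 15, black 5+5+1 = 11 -> 4
```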

Minimax with Evaluation Functions
Same as general minimax, except that it:
– only searches to depth m
– estimates values at the cutoff using the SBE function
How would this algorithm perform at chess?
– looking ahead ~4 pairs of moves (i.e., 8 ply), it would be consistently beaten by average players
– looking ahead ~8 pairs, as a typical PC can, it plays as well as a human master
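The change to the earlier minimax sketch is small: add a depth parameter and fall back to an estimate at the cutoff. On the toy trees, a deliberately crude stand-in for the SBE (the first reachable leaf) shows how cutoff estimates can differ from the exact value:

```python
def first_leaf(node):
    """Crude stand-in for an SBE: estimate a subtree by its first leaf."""
    while isinstance(node, list):
        node = node[0]
    return node

def minimax_cutoff(node, depth, maximizing):
    """Depth-limited minimax: search `depth` plies, then estimate."""
    if not isinstance(node, list):
        return node                          # true terminal: exact utility
    if depth == 0:
        return first_leaf(node)              # cutoff: SBE estimate instead
    values = [minimax_cutoff(child, depth - 1, not maximizing)
              for child in node]
    return max(values) if maximizing else min(values)

tree = [[3, 9, -6], [2, 0, 1], [-7, -5, 3]]
print(minimax_cutoff(tree, 1, True))   # estimates 3, 2, -7 -> 3 (exact value: 0)
```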

Reducing complexity of minimax (2)
Some branches of the tree will never be taken if the opponent plays cleverly. Can we detect them ahead of time and prune off paths that do not need to be explored?
Alpha-beta pruning. Keep track of two bounds while doing DFS of the game tree:
– at maximizing levels: alpha, the highest value seen so far – a lower bound on the node's evaluation/score
– at minimizing levels: beta, the lowest value seen so far – an upper bound on the node's evaluation/score

Alpha-Beta Example: minimax(A,0,4)
Begin DFS at the root A, a max node, with a depth limit of 4; α(A) is not yet set. Call stack: A.
[Figure: the example game tree used throughout this trace; among its nodes, N is a terminal state with value 4, W has value -3, and X has value -5]

Alpha-Beta Example: minimax(B,1,4)
Expand A's first child B, a min node; β(B) is not yet set. Call stack: A, B.

Alpha-Beta Example: minimax(F,2,4)
Expand B's first child F, a max node; α(F) is not yet set. Call stack: A, B, F.

Alpha-Beta Example: minimax(N,3,4)
Expand F's first child N, a terminal state with value 4. Call stack: A, B, F, N.

Alpha-Beta Example: minimax(F,2,4) resumes
N's value 4 is returned to F: α(F) = 4, the maximum seen so far. Call stack: A, B, F.

Alpha-Beta Example: minimax(O,3,4)
Expand F's next child O, a min node; β(O) is not yet set. Call stack: A, B, F, O.

Alpha-Beta Example: minimax(W,4,4)
Expand O's first child W, treated as a terminal state (the depth limit is reached) with value -3. Call stack: A, B, F, O, W.

Alpha-Beta Example: minimax(O,3,4) resumes
W's value -3 is returned to O: β(O) = -3, the minimum seen so far. Call stack: A, B, F, O.

Alpha-Beta Example: alpha cut-off at O
O's beta ≤ F's alpha (-3 ≤ 4): stop expanding O (alpha cut-off) – O's remaining child X (value -5) is never examined. Call stack: A, B, F, O.

Alpha-Beta Example: why is the cut-off safe?
A smart opponent at O will choose W or something worse, so O's value is at most -3 (an upper bound). The computer therefore shouldn't choose O (≤ -3), since N (4) is already better.

Alpha-Beta Example: minimax(F,2,4) resumes
O's value is returned to F; alpha is not changed (maximizing: 4 beats anything ≤ -3). Call stack: A, B, F.

Alpha-Beta Example: minimax(B,1,4) resumes
F's value 4 is returned to B: β(B) = 4, the minimum seen so far. Call stack: A, B.
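Putting the trace together, a sketch of alpha-beta over the same toy tree encoding (in a real game, the leaf test would be "terminal or depth limit", with sbe() estimating values at the cutoff); this is an illustration, not the lecture's own code:

```python
def alpha_beta(node, alpha, beta, maximizing):
    """Minimax with alpha-beta pruning over toy trees (leaf = utility).

    alpha: highest value the maximizer can guarantee so far (lower bound).
    beta:  lowest value the minimizer can guarantee so far (upper bound).
    """
    if not isinstance(node, list):
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alpha_beta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if beta <= alpha:               # beta cut-off
                break
        return value
    else:
        value = float('inf')
        for child in node:
            value = min(value, alpha_beta(child, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:               # alpha cut-off, as at node O above
                break
        return value

# Subtree F from the trace: max node with children N = 4 and O = [-3, -5].
# After N returns 4, O's first child W = -3 drives beta below alpha,
# so O's second child X = -5 is pruned without being examined.
print(alpha_beta([4, [-3, -5]], float('-inf'), float('inf'), True))   # -> 4
```

A root call mirrors minimax(A,0,4) in the trace, starting with alpha = -∞ and beta = +∞.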

Effectiveness of Alpha-Beta Search
Effectiveness depends on the order in which successors are examined: it is most effective when the best successors are examined first.
Worst case:
– successors ordered so that no pruning takes place
– no improvement over exhaustive search
Best case:
– each player's best move is evaluated first (left-most)
In practice, performance is closer to the best case than to the worst.

Effectiveness of Alpha-Beta Search
In practice we often get O(b^(d/2)) rather than O(b^d)
– the same as having a branching factor of √b, since (√b)^d = b^(d/2)
For example, chess:
– goes from b ~ 35 to b ~ 6
– permits a much deeper search in the same time
– makes computer chess competitive with humans
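A one-line check of that chess figure (illustrative only):

```python
import math
# Effective branching factor under good move ordering: sqrt(b)
print(math.sqrt(35))    # ~5.92, matching the "b ~ 35 to b ~ 6" claim
```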

Dealing with Limited Time
In real games there is usually a time limit T on making a move.
How do we take this into account?
– we cannot stop alpha-beta midway and expect to use its results with any confidence
– so we could set a conservative depth limit that guarantees we will find a move in time < T
– but then the search may finish early, and the opportunity to search deeper is wasted

Dealing with Limited Time
In practice, iterative deepening search (IDS) is used:
– run alpha-beta search with an increasing depth limit
– when the clock runs out, use the solution found by the last completed alpha-beta search (i.e., the deepest search that was completed)
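A sketch of that loop, reusing `minimax_cutoff` from the earlier sketch to pick a root move on the toy trees (`best_move_at_depth` and the depth cap are assumptions for illustration):

```python
import time

def best_move_at_depth(tree, depth):
    """One complete depth-limited search: index of the best root move."""
    values = [minimax_cutoff(child, depth - 1, False) for child in tree]
    return max(range(len(values)), key=values.__getitem__)

def ids_move(tree, time_limit, max_depth=20):
    """Deepen until the clock runs out; keep the last completed result."""
    deadline = time.monotonic() + time_limit
    best, depth = None, 1
    while depth <= max_depth and time.monotonic() < deadline:
        # a real engine would abort an in-progress search at the deadline
        # and discard its partial result; here we only check between depths
        best = best_move_at_depth(tree, depth)
        depth += 1
    return best
```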

The Horizon Effect
Sometimes disaster lurks just beyond the search depth:
– the computer captures a queen, but a few moves later the opponent checkmates (i.e., wins)
The computer has a limited horizon; it cannot see that this significant event could happen.
How do you avoid catastrophic losses due to such "short-sightedness"?
– quiescence search
– secondary search

The Horizon Effect
Quiescence search:
– when the evaluation is changing rapidly, look deeper than the limit
– look for a point where the game "quiets down"
Secondary search:
1. find the best move looking to depth d
2. look k steps beyond it to verify that it still looks good
3. if it doesn't, repeat step 2 for the next-best move
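A sketch of quiescence search in the conventional "stand pat" formulation (negamax-style: values are from the side to move's perspective and negated on recursion). The toy position encoding – `pos = (static_eval, capture_successors)` – is an assumption, with captures standing in for "noisy" moves generally:

```python
def quiescence(pos, alpha, beta):
    """At the depth limit, keep expanding noisy moves until quiet."""
    stand_pat, captures = pos
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)            # we may decline all captures
    for nxt in captures:                     # only non-quiet moves searched
        score = -quiescence(nxt, -beta, -alpha)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha

# Static eval says 2, but a pending capture actually makes the position 5:
pos = (2, [(-5, [])])
print(quiescence(pos, float('-inf'), float('inf')))   # -> 5
```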

Book Moves
Build a database of opening moves, end games, and studied configurations.
If the current state is in the database, use the database:
– to determine the next move
– to evaluate the board
Otherwise, do alpha-beta search.
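The lookup itself can be as simple as a table keyed by position; a minimal sketch on the toy trees (the book entry, the string key standing in for real board hashing, and the fallback to `ids_move` above are all assumptions):

```python
# Toy opening book: maps a position key to a known-good move index.
OPENING_BOOK = {
    "[[3, 9, -6], [2, 0, 1], [-7, -5, 3]]": 1,   # illustrative entry
}

def choose_move(tree, time_limit=1.0):
    key = str(tree)                       # stand-in for real board hashing
    if key in OPENING_BOOK:
        return OPENING_BOOK[key]          # database hit: play the book move
    return ids_move(tree, time_limit)     # otherwise, fall back to search
```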

Examples of Algorithms that Learn to Play Well
Checkers: A. L. Samuel, "Some Studies in Machine Learning Using the Game of Checkers," IBM Journal of Research and Development, 11(6), 1959.
– learned by playing a copy of itself thousands of times
– used only an IBM 704 with 10,000 words of RAM, magnetic tape, and a clock speed of 1 kHz
– was successful enough to compete well at human tournaments

Examples of Algorithms that Learn to Play Well
Backgammon: G. Tesauro and T. J. Sejnowski, "A Parallel Network that Learns to Play Backgammon," Artificial Intelligence, 39(3), 1989.
– also learns by playing copies of itself
– uses a non-linear evaluation function: a neural network
– rated one of the top three players in the world

Non-deterministic Games
Some games involve chance, for example:
– roll of dice
– spin of a game wheel
– deal of cards from a shuffled deck
How can we handle games with random elements?
The game tree representation is extended to include chance nodes:
1. agent moves
2. chance nodes
3. opponent moves

Non-deterministic Games
The game tree representation is extended:
[Figure: max node A at the root; chance nodes below it, each branch taken with probability 0.5 (a 50/50 event); min nodes B (β=2), C (β=6), D (β=0), E (β=-4), with terminal values beneath them]

Non-deterministic Games
Weight each score by the probability that the move occurs. Use the expected value for a move: the probability-weighted sum of the possible random outcomes.
[Figure: same tree; left chance node = 0.5·2 + 0.5·6 = 4, right chance node = 0.5·0 + 0.5·(-4) = -2]

Non-deterministic Games
Choose the move with the highest expected value.
[Figure: same tree; at the root, α(A) = max(4, -2) = 4]
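This max/chance/min recursion is commonly called expectiminimax. A sketch under an assumed toy encoding (leaf = number; otherwise `(kind, children)`, where a chance node's children carry probabilities); leaf values are chosen to reproduce the node values on the slides:

```python
def expectiminimax(node):
    """Minimax extended with chance nodes over a toy tree encoding."""
    if not isinstance(node, tuple):
        return node
    kind, children = node
    if kind == 'max':
        return max(expectiminimax(c) for c in children)
    if kind == 'min':
        return min(expectiminimax(c) for c in children)
    # chance node: expected value, weighting each outcome by its probability
    return sum(p * expectiminimax(c) for p, c in children)

# The slides' example: two 50/50 chance nodes over min nodes B, C, D, E
tree = ('max', [('chance', [(0.5, ('min', [7, 2])), (0.5, ('min', [9, 6]))]),
                ('chance', [(0.5, ('min', [5, 0])), (0.5, ('min', [3, -4]))])])
print(expectiminimax(tree))   # 0.5*2 + 0.5*6 = 4 vs 0.5*0 + 0.5*(-4) = -2 -> 4
```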

Non-deterministic Games
Non-determinism increases the branching factor:
– 21 distinct rolls with 2 dice
The value of lookahead diminishes: as depth increases, the probability of reaching a given node decreases, and alpha-beta pruning is less effective.
TD-Gammon:
– depth-2 search
– very good heuristic
– plays at world-champion level

Computers can play Grandmaster Chess
"Deep Blue" (IBM):
– parallel processor, 32 nodes
– each node has 8 dedicated VLSI "chess chips"
– can search 200 million configurations/second
– uses minimax, alpha-beta, and sophisticated heuristics
– it can currently search to 14 ply (i.e., 7 pairs of moves)
– can avoid the horizon effect by searching as deep as 40 ply
– uses book moves

Computers can play Grandmaster Chess
Kasparov vs. Deep Blue, May 1997: a 6-game, full-regulation chess match sponsored by ACM. Kasparov lost the match, scoring 1 win and 3 draws to Deep Blue's 2 wins and 3 draws. This was a historic achievement for computer chess: the first time a computer became the best chess player on the planet.
Note that Deep Blue plays by "brute force" (i.e., raw power from computer speed and memory); it uses relatively little that resembles human intuition and cleverness.

Status of Computers in Other Deterministic Games
Checkers/Draughts:
– the current world champion is Chinook
– it can beat any human (it defeated Tinsley in 1994)
– uses alpha-beta search, book moves, and an endgame database (> 443 billion positions)
Othello:
– computers can easily beat the world experts
Go:
– branching factor b ~ 360 (very large!)
– $2 million prize for any system that can beat a world expert