6. Fully Observable Game Playing


6. Fully Observable Game Playing 2012/03/28

Games vs. search problems

Game Theory Studied by mathematicians, economists, and finance researchers. In AI, we limit "games" to deterministic, turn-taking, two-player, zero-sum games of perfect information. This means deterministic, fully observable environments in which two agents' actions alternate and in which the utility values at the end of the game are always equal and opposite.

Types of Games
- deterministic, perfect information: Chess, Checkers (西洋跳棋), Go, Othello
- chance, perfect information: Backgammon (西洋雙陸棋)
- chance, imperfect information: Bridge, Poker
Game playing was one of the first tasks undertaken in AI. Machines have surpassed humans at checkers and Othello, and have defeated human champions in chess and backgammon. In Go, computers perform at the amateur level.

Checkers

Game as Search Problems Games offer pure, abstract competition. A chess-playing computer would be an existence proof of a machine doing something generally thought to require intelligence. Games are idealizations of worlds in which:
- the world state is fully accessible
- the (small number of) actions are well-defined
- uncertainty arises from the opponent's moves and from the complexity of the game

Game as Search Problems (cont.-1) Games are usually much too hard to solve. Example, chess:
- average branching factor ≈ 35
- average moves per player ≈ 50
- total number of nodes in the search tree ≈ 35^100, or about 10^154
- total number of different legal positions ≈ 10^40
There are time limits for making good decisions. Since we are unlikely to reach the goal, we must approximate.

Game as Search Problems (cont.-2)
- Initial State: how does the game start?
- Successor Function: a list of legal (move, state) pairs for each state
- Terminal Test: determines when the game is over
- Utility Function: provides a numeric value for all terminal states, e.g., win, lose, draw with +1, -1, 0
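The four components above can be sketched as a small Python interface, with a toy game instance for illustration (class and method names here are illustrative, not from any particular library):

```python
class Game:
    """Sketch of the game formulation: initial state, successor function,
    terminal test, utility function."""
    def initial_state(self):
        """How does the game start?"""
        raise NotImplementedError
    def successors(self, state):
        """A list of legal (move, state) pairs for this state."""
        raise NotImplementedError
    def terminal_test(self, state):
        """Is the game over in this state?"""
        raise NotImplementedError
    def utility(self, state):
        """Numeric value of a terminal state, e.g., +1 win, -1 loss, 0 draw."""
        raise NotImplementedError

# A toy instance: count down from 2 to 0; the game ends at 0 in a draw.
class Countdown(Game):
    def initial_state(self):
        return 2
    def successors(self, state):
        return [("dec", state - 1)]
    def terminal_test(self, state):
        return state == 0
    def utility(self, state):
        return 0
```

Real games (chess, tic-tac-toe) would plug their own state representation into the same four methods.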

Game Tree (2-player, deterministic, turns) E.g., for tic-tac-toe: game tree complexity 9! = 362880; game board complexity 3^9 = 19683

Minimax Strategy Assumption: both players are knowledgeable and play the best possible move.
MinimaxValue(n) =
  Utility(n)                                      if n is a terminal state
  max over s in Successors(n) of MinimaxValue(s)  if n is a MAX node
  min over s in Successors(n) of MinimaxValue(s)  if n is a MIN node

Minimax Strategy (cont.) Minimax is an optimal strategy: it leads to outcomes at least as good as any other strategy when playing an infallible opponent. Pick the option that minimizes the damage your opponent can do, i.e., maximize the worst-case outcome, because a skillful opponent will certainly find the most damaging move.

Minimax Perfect play for deterministic, perfect-information games. Idea: choose the move leading to the position with the highest minimax value, i.e., the best achievable payoff against best play.

Minimax – Animated Example [Figure: three-ply minimax tree with alternating MAX and MIN levels] The computer can obtain 6 by choosing the right-hand edge from the first node.

Minimax Algorithm

function MINIMAX-DECISION(state) returns an action
  inputs: state, current state in game
  v ← MAX-VALUE(state)
  return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← -∞
  for a, s in SUCCESSORS(state) do
    v ← MAX(v, MIN-VALUE(s))
  return v

function MIN-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for a, s in SUCCESSORS(state) do
    v ← MIN(v, MAX-VALUE(s))
  return v
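The pseudocode above translates almost directly into Python. This is a minimal sketch; the `game` argument is assumed to be any object exposing the successors/terminal_test/utility interface described earlier:

```python
import math

def minimax_decision(state, game):
    """Pick the move for MAX whose resulting state has the highest
    minimax value."""
    move, _ = max(game.successors(state),
                  key=lambda pair: min_value(pair[1], game))
    return move

def max_value(state, game):
    """Value of a MAX node: the best of the MIN values below it."""
    if game.terminal_test(state):
        return game.utility(state)
    v = -math.inf
    for _, s in game.successors(state):
        v = max(v, min_value(s, game))
    return v

def min_value(state, game):
    """Value of a MIN node: the worst (for MAX) of the MAX values below it."""
    if game.terminal_test(state):
        return game.utility(state)
    v = math.inf
    for _, s in game.successors(state):
        v = min(v, max_value(s, game))
    return v
```

On a two-ply tree whose MIN subtrees have leaf values (3, 12) and (2, 8), the root value is max(min(3, 12), min(2, 8)) = 3, so MAX moves toward the first subtree.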

Optimal Decisions in Multiplayer Games Extend the minimax idea to multiplayer games Replace the single value for each node with a vector of values

Minimax Algorithm (cont.)
- Generate the whole game tree
- Apply the utility function to each terminal state
- Propagate the utility of terminal states up one level at a time: Utility(n) = max / min (n.1, n.2, …, n.b)
- At the root, MAX chooses the move leading to the highest utility value

Analysis of Minimax
- Complete? Yes, if the tree is finite
- Optimal? Yes, against an optimal opponent
- Time? O(b^m), as a complete depth-first search (m: max depth, b: number of legal moves)
- Space? O(bm) if all successors are generated at once, or O(m) if successors are generated one at a time
For chess, b ≈ 35 and m ≈ 100 for reasonable games, so an exact solution is completely infeasible.

Complex Games What happens if minimax is applied to large, complex games? What happens to the search space? Example, chess:
- a decent amateur program examines ≈ 1000 moves / second
- at 150 seconds / move (tournament play), it looks at approx. 150,000 moves
- with chess's branching factor of 35, this generates trees only 3-4 ply deep
- resultant play: pure amateur

α-β Pruning The problem with minimax search: the number of states to examine is exponential in the number of moves. α-β pruning returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision.
- α: a lower bound, never decreasing — the value of the best (highest-value) choice found so far along the path for MAX
- β: an upper bound, never increasing — the value of the best (lowest-value) choice found so far along the path for MIN

α-β Pruning Example - 1

α-β Pruning Example - 1 (2nd Ed.) [Figure: pruning trace with leaf values 2, 5, 14 and a MIN node labeled with the range [-∞, 2]]

- Pruning (cont.)  cut-off  cut-off Search is discontinued below any MIN node with min-value     cut-off Search is discontinued below any MAX node with max-value    Order of considering successors matters (look at step f in previous slide) If possible, consider best successors first

α-β Pruning (cont.) If n is worse than α, MAX will avoid it, so prune that branch. If m is better than n for the player, we will never get to n in play, and can just prune it.

α-β Pruning Example - 2 [Figure: step-by-step α-β trace over nodes A-M with leaf values 6, 5, 8, 6, 2, 1, 5, 4, showing how each node's (α, β) window tightens as successors are examined]

α-β Pruning Example - 3 [Figure: α-β trace on a MAX/MIN tree with nodes a-g and leaf values 1-7; an accompanying table lists each node's α and β at every step, with cut-offs at nodes e and c]

[Trace table for Example 3, nodes A-G, listing α, β, v, and the returned value at each step; -∞ and +∞ denote negative and positive infinity. The last value in each cell is the final value assigned to that variable, e.g., at the end of the search node A's α = 3.]

α-β Algorithm

function ALPHA-BETA-SEARCH(state) returns an action
  inputs: state, current state in game
  v ← MAX-VALUE(state, -∞, +∞)
  return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state, α, β) returns a utility value
  inputs: state, current state in game
          α, the value of the best alternative for MAX along the path to state
          β, the value of the best alternative for MIN along the path to state
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← -∞
  for a, s in SUCCESSORS(state) do
    v ← MAX(v, MIN-VALUE(s, α, β))
    if v ≥ β then return v   // fail high
    α ← MAX(α, v)
  return v

α-β Algorithm (cont.)

function MIN-VALUE(state, α, β) returns a utility value
  inputs: state, current state in game
          α, the value of the best alternative for MAX along the path to state
          β, the value of the best alternative for MIN along the path to state
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for a, s in SUCCESSORS(state) do
    v ← MIN(v, MAX-VALUE(s, α, β))
    if v ≤ α then return v   // fail low
    β ← MIN(β, v)
  return v
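The two-function pseudocode can be sketched in Python as follows; as before, the `game` object is assumed to expose a successors/terminal_test/utility interface (names illustrative):

```python
import math

def alphabeta_search(state, game):
    """Return the same move minimax would, pruning branches that cannot
    influence the decision."""
    best_move, alpha = None, -math.inf
    for move, s in game.successors(state):
        v = min_value(s, game, alpha, math.inf)
        if v > alpha:
            alpha, best_move = v, move
    return best_move

def max_value(state, game, alpha, beta):
    if game.terminal_test(state):
        return game.utility(state)
    v = -math.inf
    for _, s in game.successors(state):
        v = max(v, min_value(s, game, alpha, beta))
        if v >= beta:          # fail high: MIN would never allow this node
            return v
        alpha = max(alpha, v)
    return v

def min_value(state, game, alpha, beta):
    if game.terminal_test(state):
        return game.utility(state)
    v = math.inf
    for _, s in game.successors(state):
        v = min(v, max_value(s, game, alpha, beta))
        if v <= alpha:         # fail low: MAX would never allow this node
            return v
        beta = min(beta, v)
    return v
```

On the classic three-subtree example with leaves (3, 12, 8), (2, 4, 6), (14, 5, 2), the second subtree fails low after its first leaf 2, so its remaining leaves are never examined.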

α-β Pruning Example - 4 [Figure: MAX/MIN tree with nodes a-n and leaf values 5, 8, 7, 4, 2, 1, 3, traced to show which subtrees are pruned]

α-β Pruning Example - 5 [Figure: MAX/MIN tree with nodes a-o and leaf values -1, 5, 1, 2, -5, 3, 4, -4, -3, traced to show which subtrees are pruned]

Analysis of - Search Pruning does not affect final result The effectiveness of - pruning is highly dependent on the order in which the successors are examined  It is worthwhile to try to examine first the successors that are likely to be best e.g., Example 1 (e,f) If successors of D is 2, 5, 14 (instead of 14, 5, 2) then 5, 14 can be pruned

Analysis of - Search (cont.) If best move first (perfect ordering), the total number of nodes examined = O(bm/2) effective branching factor = b1/2 for chess, 6 instead 35 i.e., - can look ahead roughly twice as far as minimax in the same amount of time If random order, the total number of nodes examined = O(b3m/4) for moderate b

Imperfect, Real-Time Decisions It is not practical to assume the program has time to search all the way to terminal states. Since moves must be made in a reasonable amount of time, we alter minimax or α-β in two ways:
- Evaluation Function (instead of the utility function): an estimate of the expected utility of the game from a given position
- Cutoff Test (instead of the terminal test): decides when to apply Eval, e.g., a depth limit (perhaps with quiescence search added)

Evaluation Functions A heuristic that estimates the expected utility. It should:
- preserve the ordering among terminal states in the same way as the true utility function, otherwise it can cause bad decision making
- not take too long to compute
- for nonterminal states, be strongly correlated with the actual chances of winning
Define features of the game state that assist in evaluation. What are features of chess? E.g., the number of pawns possessed, etc.
Weighted Linear Function: Eval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)
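The weighted linear function can be sketched in a few lines; the material-difference features and weights below are illustrative (not a recommended chess evaluation):

```python
def linear_eval(state, features, weights):
    """Weighted linear evaluation: Eval(s) = w1*f1(s) + ... + wn*fn(s)."""
    return sum(w * f(state) for w, f in zip(weights, features))

def material(piece):
    """Feature: my count of this piece minus the opponent's count.
    Here a state is a dict mapping piece name -> (my_count, opp_count)."""
    return lambda s: s[piece][0] - s[piece][1]

# Hypothetical chess-like material features with classic piece weights.
features = [material("pawn"), material("knight"), material("queen")]
weights = [1, 3, 9]

state = {"pawn": (6, 5), "knight": (2, 1), "queen": (1, 1)}
score = linear_eval(state, features, weights)  # 1*1 + 3*1 + 9*0 = 4
```

Note this sketch assumes feature independence; real evaluation functions also use positional features (king safety, mobility, pawn structure), and the weights are typically tuned from game data.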

Evaluation Functions (cont.-1) (a) Black has an advantage of a knight and two pawns and will win the game (b) Black will lose after white captures the queen

Evaluation Functions (cont.-2) Digression: exact values don't matter. Behavior is preserved under any monotonic transformation of Eval — only the order matters. The payoff in deterministic games acts as an ordinal utility function.

Cutting off Search When do you use evaluation functions?
  if Cutoff-Test(state, depth) then return Eval(state)
One way of controlling the amount of search is to set a fixed depth limit d: Cutoff-Test(state, depth) returns true for all depths greater than d, at which point the evaluation function is applied. Variants:
- cut off beyond a certain depth
- cut off if the state is stable (more predictable)
- cut off moves you know are bad (forward pruning)
Cutting off can have a disastrous effect if the evaluation function is not sophisticated enough; one should continue the search until a quiescent position is found.
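A minimal sketch of depth-limited minimax with a cutoff test, following the `if Cutoff-Test(state, depth) then return Eval(state)` pattern above (the `game` interface and names are illustrative; quiescence search is omitted):

```python
def h_minimax(state, depth, maximizing, game, eval_fn, depth_limit):
    """Depth-limited minimax: true utilities at terminals, the evaluation
    function at the depth cutoff, minimax backup everywhere else."""
    if game.terminal_test(state):
        return game.utility(state)
    if depth >= depth_limit:       # Cutoff-Test: fixed depth limit d
        return eval_fn(state)
    vals = [h_minimax(s, depth + 1, not maximizing, game,
                      eval_fn, depth_limit)
            for _, s in game.successors(state)]
    return max(vals) if maximizing else min(vals)
```

For example, on a toy game whose states are integers with successors 2s and 2s+1, no terminal states within reach, and Eval(s) = s, a search from state 1 with depth limit 2 backs up max(min(4, 5), min(6, 7)) = 6.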

Cutting off Search (cont.) Does it work in practice? With b^m = 10^6 and b = 35, we get m ≈ 4, and 4-ply lookahead is a hopeless chess player:
- 4-ply ≈ human novice
- 8-ply ≈ typical PC, human master
- 12-ply ≈ Deep Blue, Kasparov

Horizon Effect A series of checks by the black rook forces the inevitable queening move by White "over the horizon," making the position look like a win for Black when it is really a win for White. The horizon effect arises when the program is facing a move by the opponent that causes serious damage and is ultimately unavoidable. At present, no general solution has been found for the horizon problem.

Suggestions
- Improve the evaluation function, e.g., know that the bishop is trapped
- Make the search deeper
- Make the search depth more flexible: search deeper in lines where a pawn is being given away, and less deep in other lines

HW2 (deadline 4/12): Design evaluation functions for Chinese chess and Chinese dark chess.