DCP 1172 Introduction to Artificial Intelligence: Lecture Notes for Chap. 6 [AIMA] (Chang-Sheng Chen)

This time: Outline
Adversarial search: game playing
The minimax algorithm
Resource limits
Alpha-beta pruning
Elements of chance

Game Playing and Search
Why study games?
Why is search a good idea?

Why Study Games? (1)
Game playing was one of the first tasks undertaken in AI. By 1950, chess had already been studied by AI forerunners such as Claude Shannon and Alan Turing.
For AI researchers, the abstract nature of games makes them an appealing subject of study: the state of a game is easy to represent, and agents are usually restricted to a small number of actions whose outcomes are defined by precise rules.

Why Study Games? (2)
Games are interesting because they are too hard to solve. They require the ability to make some decision even when calculating the optimal decision is infeasible.
Games also penalize inefficiency severely. Game-playing research has therefore spawned a number of interesting ideas on how to make the best possible use of time.

Why is search a good idea?
Ignoring computational complexity, games are a perfect application for a complete search.
Some major assumptions we've been making:
- Only an agent's actions change the world.
- The world is deterministic and fully observable.
These are pretty much true in lots of games.
Of course, ignoring complexity is a bad idea, so games are a good place to study resource-bounded search.

What kind of games?
Abstraction: to describe a game we must capture every relevant aspect of it (e.g., chess, tic-tac-toe, ...).
Fully observable environments: such games are characterized by perfect information.
Search: game playing then consists of a search through possible game positions.
Unpredictable opponent: introduces uncertainty, so game playing must deal with contingency problems.

Searching for the next move
Complexity: many games have a huge search space.
Chess: b ≈ 35, m ≈ 100, so the full game tree has roughly 35^100 ≈ 10^154 nodes; if each node takes about 1 ns to explore, then each move would take on the order of 10^134 millennia to calculate.
Resource (e.g., time, memory) limits: the optimal solution is not feasible, so we must approximate.
1. Pruning: makes the search more efficient by discarding portions of the search tree that cannot improve the quality of the result.
2. Evaluation functions: heuristics to estimate the utility of a state without exhaustive search.
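A quick back-of-the-envelope check of those figures (b = 35 and m = 100 are the usual rough chess estimates quoted on the slide; the rest is arithmetic):

```python
import math

b, m = 35, 100           # rough chess estimates: branching factor and game length in plies
nodes = b ** m           # size of the full game tree
seconds = nodes * 1e-9   # at ~1 ns per node
millennia = seconds / (1000 * 365.25 * 24 * 3600)

print(f"nodes     ~ 10^{math.log10(nodes):.0f}")                # ~10^154
print(f"time/move ~ 10^{math.log10(millennia):.0f} millennia")  # ~10^134
```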

Two-player games
A game formulated as a search problem:
- Initial state: board position and whose turn it is
- Successor function: definition of the legal moves
- Terminal test: conditions for when the game is over
- Utility function: a numeric value that describes the outcome of the game, e.g., -1, 0, +1 for loss, draw, win (also known as a payoff function)
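As a rough illustration of how these four components fit together, here is a minimal sketch for tic-tac-toe; the class and method names (TicTacToe, successors, is_terminal, utility) are hypothetical and not taken from the lecture notes.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple

Board = Tuple[Optional[str], ...]   # 9 cells, row-major: 'X', 'O', or None

@dataclass(frozen=True)
class State:
    board: Board
    to_move: str                    # 'X' or 'O'

class TicTacToe:
    """A two-player game expressed as a search problem."""

    WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
                 (0, 3, 6), (1, 4, 7), (2, 5, 8),
                 (0, 4, 8), (2, 4, 6)]

    def initial_state(self) -> State:
        return State(board=(None,) * 9, to_move='X')

    def successors(self, s: State) -> Iterator[Tuple[int, State]]:
        """Yield (move, resulting state) for every legal move."""
        nxt = 'O' if s.to_move == 'X' else 'X'
        for i, cell in enumerate(s.board):
            if cell is None:
                board = s.board[:i] + (s.to_move,) + s.board[i + 1:]
                yield i, State(board, nxt)

    def is_terminal(self, s: State) -> bool:
        return self._winner(s.board) is not None or None not in s.board

    def utility(self, s: State) -> int:
        """Payoff from X's point of view: +1 win, 0 draw, -1 loss."""
        return {None: 0, 'X': 1, 'O': -1}[self._winner(s.board)]

    def _winner(self, b: Board) -> Optional[str]:
        for i, j, k in self.WIN_LINES:
            if b[i] is not None and b[i] == b[j] == b[k]:
                return b[i]
        return None
```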

Game vs. search problem

Example: Tic-Tac-Toe

Types of games (figure: games classified along two dimensions, deterministic vs. chance moves and perfect vs. imperfect information)

Generate Game Tree (figure, built up over four slides: the tic-tac-toe game tree generated level by level from the empty board; each level of the tree is one ply, i.e., one move by a single player)

A subtree (figure: a subtree of the tic-tac-toe game tree with its terminal positions labelled win, lose, or draw)

What is a good move? (figure: the same subtree; which of the available moves should be chosen, given the win/lose/draw outcomes beneath them?)

MiniMax
Perfect play for deterministic environments with perfect information.
From among the moves available to you, take the best one, where the best one is determined by a search using the MiniMax strategy.

The minimax algorithm
Basic idea: choose the move with the highest minimax value, i.e., the best achievable payoff against best play.
Algorithm:
1. Generate the game tree completely.
2. Determine the utility of each terminal state.
3. Propagate the utility values upward in the tree by applying the MIN and MAX operators to the nodes at the current level.
4. At the root node, use the minimax decision to select the move with the max (of the min) utility value.
Steps 2 and 3 assume that the opponent will play perfectly.

Minimax (figure, built up over four slides: a game tree with alternating MAX and MIN levels; maximize your chance at MAX nodes, while the opponent minimizes your chance at MIN nodes)

MiniMax = maximum of the minimum
I'll choose the best move for me (MAX, 1st ply); you'll choose the best move for you (MIN, 2nd ply).

Minimax: Recursive implementation
Complete: Yes, for finite state spaces.
Optimal: Yes.
Time complexity: O(b^m).
Space complexity: O(bm) (like DFS, it does not keep all nodes in memory).
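The recursive implementation shown on the original slide is not reproduced in this transcript; the sketch below is one way to write it against the hypothetical TicTacToe interface introduced above (depth-first, hence the O(bm) space bound), assuming utilities are given from MAX's (X's) point of view.

```python
def minimax_decision(game, state):
    """Return the minimax-optimal move for MAX at `state`.

    Assumes game.utility() is the payoff from MAX's point of view,
    as in the hypothetical TicTacToe sketch above.
    """
    def max_value(s):
        if game.is_terminal(s):
            return game.utility(s)
        return max(min_value(child) for _, child in game.successors(s))

    def min_value(s):
        if game.is_terminal(s):
            return game.utility(s)
        return min(max_value(child) for _, child in game.successors(s))

    # MAX moves at the root: pick the successor whose MIN value is largest.
    return max(game.successors(state), key=lambda mc: min_value(mc[1]))[0]

# Example: the first move chosen from the empty tic-tac-toe board.
# game = TicTacToe(); print(minimax_decision(game, game.initial_state()))
```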

Do We Have To Do All That Work? (figure, built up over four slides: at a MAX root, the first MIN child evaluates to 3; while expanding the second MIN child, a leaf of value 2 is found. Since 2 is smaller than 3, there is no need for further search below that child, and its remaining leaves are crossed out. More on this next time: α-β pruning.)

Ideal Case
Search all the way to the leaves (end-game positions).
Return the leaf (or leaves) that leads to a win (for me).
Anything wrong with that?

More Realistic
Search ahead to a non-leaf (non-goal) state and evaluate it somehow.
Chess:
- 4 ply is a novice
- 8 ply is a master
- 12 ply can compete at the highest level
In no sense can 12 ply be likened to a search of the whole space.

Move evaluation without complete search
Complete search is too complex and impractical.
Evaluation function: estimates the value of a state using heuristics, and the search is cut off early.
New MINIMAX:
- CUTOFF-TEST: a cutoff test to replace the terminal test (e.g., deadline, depth limit, etc.)
- EVAL: an evaluation function to replace the utility function (e.g., number of chess pieces taken)

Evaluation Functions
We need a numerical function that assigns a value to a non-goal state.
It has to capture the notion of a position being good for one player, and it has to be fast.
Typically it is a linear combination of simple metrics.

Evaluation functions
Weighted linear evaluation function, combining n heuristics: f = w1*f1 + w2*f2 + ... + wn*fn
E.g., the w's could be the values of the pieces (1 for a pawn, 3 for a bishop, etc.) and the f's could be the number of pieces of each type on the board.
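As a concrete sketch of such a weighted linear function, here is a crude chess material evaluator. The board representation and the helper name material_eval are assumptions made for the example; the pawn and bishop weights come from the slide, while the knight, rook, and queen weights are the usual conventional material values.

```python
# Piece weights w_i. Pawn = 1 and bishop = 3 are from the slide;
# knight, rook, and queen use the usual conventional values.
PIECE_WEIGHTS = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

def material_eval(board):
    """f = sum_i w_i * f_i, where f_i is the material difference
    (White's count minus Black's count) for piece type i.

    `board` is assumed to be an iterable of codes like 'wP', 'bQ', ...
    Positive scores are good for White.
    """
    score = 0
    for piece_type, weight in PIECE_WEIGHTS.items():
        white = sum(1 for p in board if p == 'w' + piece_type)
        black = sum(1 for p in board if p == 'b' + piece_type)
        score += weight * (white - black)
    return score

# Example: White is a rook up.
print(material_eval(['wR', 'wP', 'bP']))   # 5
```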

Note: exact values do not matter (figure: two trees whose leaf values differ by an order-preserving transformation; for deterministic games the minimax decision is the same in both, since only the relative order of the values matters)

Minimax with cutoff: viable algorithm?
Assume we have 100 seconds and can evaluate 10^4 nodes/s; then we can evaluate about 10^6 nodes per move. With b ≈ 35, that budget is roughly 35^4, i.e., about 4 ply of plain minimax lookahead, and roughly twice that with the α-β pruning described next.
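A small sanity check of that depth estimate (the 10^6 node budget and b = 35 come from the slides; halving the exponent under α-β is the rule of thumb quoted on the next slide):

```python
import math

budget = 10 ** 6                              # nodes we can afford per move
b = 35                                        # rough chess branching factor

plain_depth = math.log(budget, b)             # solve b**d = budget
alpha_beta_depth = math.log(budget, b ** 0.5) # effective branching factor ~ sqrt(b)

print(round(plain_depth, 1))       # ~3.9 ply with plain minimax
print(round(alpha_beta_depth, 1))  # ~7.8 ply with alpha-beta pruning
```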

DCP 1172, Ch  -  pruning: search cutoff Pruning: eliminating a branch of the search tree from consideration without exhaustive examination of each node  -  pruning: the basic idea is to prune portions of the search tree that cannot improve the utility value of the max or min node, by just considering the values of nodes seen so far. Does it work? Yes, in roughly cuts the branching factor from b to  b resulting in double as far look- ahead than pure minimax

DCP 1172, Ch  -  pruning: example  6 6 6 MAX 6128 MIN

DCP 1172, Ch  -  pruning: example  6 6 6 MAX  2 2 MIN

DCP 1172, Ch  -  pruning: example  6 6 6 MAX  2 2 5  5 5 MIN

DCP 1172, Ch  -  pruning: example  6 6 6 MAX  2 2 5  5 5 MIN Selected move

Properties of α-β (as noted above: pruning does not change the final minimax result, and it roughly cuts the effective branching factor from b to √b)

DCP 1172, Ch  -  pruning: general principle Player Opponent m n  v If  > v then MAX will chose m so prune tree under n Similar for  for MIN

Remember: Minimax recursive implementation

Alpha-beta Pruning Algorithm
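The pseudocode figure for this slide is not part of the transcript; the sketch below is one way to write α-β pruning on top of the hypothetical game interface used earlier, tracking α (the best choice so far for MAX) and β (the best choice so far for MIN) exactly as the following slides describe.

```python
import math

def alpha_beta_decision(game, state):
    """Minimax with α-β pruning; returns the best move for MAX at `state`.

    alpha: best value MAX can guarantee so far along the current path.
    beta:  best value MIN can guarantee so far along the current path.
    """
    def max_value(s, alpha, beta):
        if game.is_terminal(s):
            return game.utility(s)
        v = -math.inf
        for _, child in game.successors(s):
            v = max(v, min_value(child, alpha, beta))
            if v >= beta:           # MIN above would never let us get here
                return v            # prune the remaining children
            alpha = max(alpha, v)
        return v

    def min_value(s, alpha, beta):
        if game.is_terminal(s):
            return game.utility(s)
        v = math.inf
        for _, child in game.successors(s):
            v = min(v, max_value(child, alpha, beta))
            if v <= alpha:          # MAX above already has something better
                return v            # prune the remaining children
            beta = min(beta, v)
        return v

    best_move, alpha = None, -math.inf
    for move, child in game.successors(state):
        v = min_value(child, alpha, math.inf)
        if v > alpha:
            best_move, alpha = move, v
    return best_move
```

It returns the same move as plain minimax, only faster, because the pruned subtrees could never have changed the decision.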

DCP 1172, Ch More on the  -  algorithm Same basic idea as minimax, but prune (cut away) branches of the tree that we know will not contain the solution. Because minimax is depth-first, let’s consider nodes along a given path in the tree. Then, as we go along this path, we keep track of:  : Best choice so far for MAX  : Best choice so far for MIN

DCP 1172, Ch More on the  -  algorithm: start from Minimax Note: These are both Local variables. At the Start of the algorithm, We initialize them to  = -  and  = + 

DCP 1172, Ch More on the  -  algorithm … MAX MIN MAX  = -   = +  Min-Value loops over these In Min-Value:  = -   = 5  = -   = 5  = -   = 5 Max-Value loops over these

DCP 1172, Ch More on the  -  algorithm … MAX MIN MAX  = -   = +  In Max-Value:  = -   = 5  = -   = 5  = -   = 5  = 5  = +  Max-Value loops over these

DCP 1172, Ch In Min-Value: More on the  -  algorithm … MAX MIN MAX  = -   = +   = -   = 5  = -   = 5  = -   = 5  = 5  = +   = 5  = 2 Min-Value loops over these  < , End loop and return 5

DCP 1172, Ch In Max-Value: More on the  -  algorithm … MAX MIN MAX  = -   = +   = -   = 5  = -   = 5  = -   = 5  = 5  = +   = 5  = 2 End loop and return 5  = 5  = +  Max-Value loops over these

DCP 1172, Ch Operation of  -  pruning algorithm  < , End loop and return

Example

DCP 1172, Ch  -  algorithm:

Solution (table: a node-by-node trace of the α-β algorithm on the example tree, listing each node's type (Max/Min) together with its α and β bounds, initialized to -∞ and +∞, and the score returned as values propagate back up to the root A)

State-of-the-art for deterministic games

Stochastic games

Algorithm for stochastic games

Remember: Minimax algorithm

Stochastic games: the element of chance (figure: a game tree with CHANCE nodes, whose values are shown as '?', inserted between the MAX and MIN levels)
Expectimax and expectimin: the value of a chance node is the expected value over all possible outcomes.

Stochastic games: the element of chance (figure: the same tree with the chance nodes evaluated, e.g., 4 = 0.5*3 + 0.5*5; the chance levels are labelled Expectimax and Expectimin)
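A sketch of how the minimax recursion changes once chance nodes are added; the player() classifier and game.chance_outcomes() (yielding (probability, state) pairs) are hypothetical extensions of the earlier interface, not definitions from the notes.

```python
def expectiminimax(game, state, player):
    """Value of `state` in a game with chance nodes.

    `player(state)` is assumed to return 'MAX', 'MIN', or 'CHANCE';
    `game.chance_outcomes(state)` is assumed to yield (probability, state) pairs.
    """
    if game.is_terminal(state):
        return game.utility(state)
    kind = player(state)
    if kind == 'MAX':
        return max(expectiminimax(game, s, player) for _, s in game.successors(state))
    if kind == 'MIN':
        return min(expectiminimax(game, s, player) for _, s in game.successors(state))
    # CHANCE node: expected value over all outcomes (expectimax / expectimin).
    return sum(p * expectiminimax(game, s, player)
               for p, s in game.chance_outcomes(state))
```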

Evaluation functions: exact values DO matter
Order-preserving transformations of the leaf values do not necessarily behave the same once chance nodes are involved: the expected values, and hence the chosen move, can change.
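A tiny numeric illustration of the contrast with the earlier "exact values do not matter" slide (the particular numbers below are made up for the example): an order-preserving transformation never changes the minimax choice, but it can flip the choice when values are averaged at chance nodes.

```python
# Two candidate moves; each leads to a node with two leaf outcomes.
leaves = {'a': [3, 3], 'b': [1, 4]}
monotone = {1: 1, 3: 3, 4: 40}        # order-preserving: 1 < 3 < 4 and 1 < 3 < 40

def best(values, node_value):
    """Pick the move whose child node is worth the most under `node_value`."""
    return max(values, key=lambda m: node_value(values[m]))

rules = {
    'minimax': min,                              # children are MIN nodes
    'chance': lambda vs: sum(vs) / len(vs),      # children are 50/50 chance nodes
}

for name, rule in rules.items():
    before = best(leaves, rule)
    after = best({m: [monotone[v] for v in vs] for m, vs in leaves.items()}, rule)
    print(name, before, after)
# minimax a a   -- deterministic: the decision is unchanged by the transformation
# chance  a b   -- with chance nodes: the decision flips
```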

State-of-the-art for stochastic games

Summary