CSE 415 -- (c) S. Tanimoto, 2008 -- Alpha-Beta Search


CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 1 Alpha-Beta Search Outline: Two-person, zero-sum, perfect-information games. Static evaluation functions. Minimax search. Alpha-beta pruning. Checkers-playing issues.

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 2 Two-Person, Zero-Sum, Perfect Information Games 1. A two-person, zero-sum game is a game in which one player's gain is the other's loss: at most one player wins, and there are no "win-win" or "lose-lose" outcomes. There may be ties ("draws"). 2. Most 2PZS games involve turn taking: in each turn a player makes a move, and turns alternate between the players. 3. Perfect information: there is no hidden information or chance element, as there is in Poker or Bridge. 4. Examples of 2PZS games include Tic-Tac-Toe, Othello, Checkers, and Chess.

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 3 Why Study 2PZS Games in AI? 1. Games are idealizations of problems. 2. AI researchers can study the theory and (to some extent) practice of search algorithms in an easier information environment than, say, software for the design of the Space Shuttle. ("Pure Search")

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 4 Static Evaluation Functions In most of the interesting 2PZS games, there are too many possible game positions to exhaustively search every line of play to its end. To choose good moves, one can instead compute some real-valued function of the board that results from a sequence of moves; its value is high if the position is favorable to one player (the player we'll call Max) and low if it is favorable to the other player (whom we'll call Min). This function is called a static evaluation function. Example in Checkers: f(board) = 5*x1 + x2, where x1 = Max's king advantage and x2 = Max's single-man advantage.
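
The checkers example above can be sketched directly as code. The board representation below (a flat list with 'B'/'W' for single men and 'BK'/'WK' for kings) is a hypothetical stand-in, not the slide's.

```python
# A minimal sketch of f(board) = 5*x1 + x2 for checkers. The piece codes
# ('B', 'W', 'BK', 'WK') and the flat-list board are illustrative assumptions.
def static_value(board):
    x1 = board.count('BK') - board.count('WK')  # Max's king advantage
    x2 = board.count('B') - board.count('W')    # Max's single-man advantage
    return 5 * x1 + x2

print(static_value(['B', 'B', 'BK', 'W']))  # 5*1 + 1 = 6
```

The weight 5 encodes the judgment that a king advantage is worth about five single-man advantages.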

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 5 Tic-Tac-Toe Static Eval. Fn. f(board) = 100A + 10B + C - (100D + 10E + F) A = number of lines of 3 Xs in a row. B = number of lines of 2 Xs in a row (not blocked by an O). C = number of lines containing one X and no Os. D = number of lines of 3 Os in a row. E = number of lines of 2 Os in a row (not blocked by an X). F = number of lines containing one O and no Xs.

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 6 Minimax Search (Illustration)

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 7 Minimax Search (Rationale) If looking ahead one move, we generate all successors of the current state, apply the static evaluation function to each of them, and, if we are Max, make the move that leads to the state with the maximum score. If we are looking ahead two moves, we consider the positions our opponent can reach in one move from each of the positions we can reach in one move. Assuming the opponent plays rationally, the opponent, Min, will try to minimize the value of the resulting board. Therefore, instead of using the static value at each successor of the current state, we examine the successors of each of those, compute their static values, and take the minimum of those as the value of our successor.

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 8 Minimax Search (Algorithm)

Procedure minimax(board, whoseMove, plyLeft):
  if plyLeft == 0: return staticValue(board)
  if whoseMove == 'Max': provisional = -infinity
  else: provisional = +infinity
  for s in successors(board, whoseMove):
    newVal = minimax(s, other(whoseMove), plyLeft-1)
    if (whoseMove == 'Max' and newVal > provisional) \
       or (whoseMove == 'Min' and newVal < provisional):
      provisional = newVal
  return provisional
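
The procedure runs as written once specialized to a concrete game; here it is as runnable Python over a toy game tree of nested lists (the tree is made up for illustration, and integer leaves serve as their own static values).

```python
import math

def minimax(node, whose_move, ply_left):
    if ply_left == 0 or not isinstance(node, list):
        return node                      # staticValue of a leaf
    provisional = -math.inf if whose_move == 'Max' else math.inf
    other = 'Min' if whose_move == 'Max' else 'Max'
    for s in node:                       # successors of the current position
        new_val = minimax(s, other, ply_left - 1)
        if (whose_move == 'Max' and new_val > provisional) or \
           (whose_move == 'Min' and new_val < provisional):
            provisional = new_val
    return provisional

tree = [[3, 12], [2, 8], [14, 1]]        # Max to move; Min replies
print(minimax(tree, 'Max', 2))           # max(min(3,12), min(2,8), min(14,1)) = 3
```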

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 9 Checkers Example Black to move, White = "Min", Black = "Max"

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 10 Minimax Search Example

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 11 Alpha-Beta Cutoffs An alpha (beta) cutoff occurs at a maximizing (minimizing) node when it is known that the maximizing (minimizing) player already has a move achieving a value alpha (beta), and exploration of an alternative to that move reveals that the alternative gives the opponent the option of moving to a lower (higher) valued position. Any further exploration of the alternative can then be canceled.
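
A cutoff of this kind can be seen in a minimal alpha-beta sketch over the same style of toy tree as before (nested lists with integer leaves; the tree is illustrative).

```python
import math

def alphabeta(node, whose_move, ply_left, alpha=-math.inf, beta=math.inf):
    if ply_left == 0 or not isinstance(node, list):
        return node                      # leaf: its own static value
    if whose_move == 'Max':
        for s in node:
            alpha = max(alpha, alphabeta(s, 'Min', ply_left - 1, alpha, beta))
            if alpha >= beta:
                break   # cutoff: Min above will never allow this branch
        return alpha
    else:
        for s in node:
            beta = min(beta, alphabeta(s, 'Max', ply_left - 1, alpha, beta))
            if alpha >= beta:
                break   # cutoff: Max above already has a better option
        return beta

tree = [[3, 12], [2, 8], [14, 1]]
print(alphabeta(tree, 'Max', 2))  # 3, same as plain minimax; the 8 is never examined
```

In this tree, after the first branch establishes alpha = 3, the second branch is abandoned as soon as Min finds the 2.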

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 12 Strategy to Increase the Number of Cutoffs At each non-leaf level, perform a static evaluation of all successors of a node and order them best-first before doing the recursive calls. If the best move is tried first, the tendency is to get cutoffs when exploring the remaining ones. Alternatively, use iterative deepening, with ply limits increasing from, say, 1 to 15, and use the results of the last iteration to order moves in the next iteration.
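
The best-first ordering itself is a one-line sort. The sketch below uses integer scores as stand-ins for positions, with a toy static_value; in a real player the key would be the static evaluation of each successor board.

```python
# Order successors best-first for the player to move: Max wants high static
# values explored first, Min wants low ones first. static_value is a toy
# stand-in (positions are their own scores).
def static_value(position):
    return position

def ordered_successors(succs, whose_move):
    return sorted(succs, key=static_value, reverse=(whose_move == 'Max'))

print(ordered_successors([2, 9, 5], 'Max'))  # [9, 5, 2]
print(ordered_successors([2, 9, 5], 'Min'))  # [2, 5, 9]
```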

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 13 Another Performance Technique Avoid recomputing values for some states (especially those within 3 or 4 ply of the current state, which are relatively expensive to compute) by saving their values. Use a hash table to save [state, value, ply-used]. As the hashing function, use a Zobrist hashing function: for each piece on the board, exclusive-or the current key with a pre-generated random number. Hash values for similar boards are very different, and hash values can be computed efficiently with an incremental approach (in some games, like checkers and chess, at least).
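
The table can be sketched as a wrapper around any search function. Python's built-in hash stands in here for the Zobrist hash of the next slide, and fake_search is a hypothetical stand-in for minimax, used only to count how often the real search would run.

```python
# Transposition table storing (value, ply-searched) per (hash, mover) key.
# A cached value is reused only if it was computed at least as deep as now needed.
table = {}

def lookup_or_search(board, whose_move, ply_left, search_fn):
    key = (hash(tuple(board)), whose_move)
    entry = table.get(key)
    if entry is not None and entry[1] >= ply_left:
        return entry[0]                  # reuse a sufficiently deep result
    val = search_fn(board, whose_move, ply_left)
    table[key] = (val, ply_left)
    return val

calls = []
def fake_search(board, whose_move, ply_left):
    calls.append(ply_left)
    return 42

b = ['B', ' ', 'W']
lookup_or_search(b, 'Max', 3, fake_search)   # searched
lookup_or_search(b, 'Max', 2, fake_search)   # served from the table
print(len(calls))  # 1
```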

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 14 Zobrist Hashing in Python

# Set up a 64x2 array of random ints.
S = 64
P = 2
from random import randint
# Build independent rows; [[0]*P]*S would alias one row S times.
zobristnum = [[0] * P for _ in range(S)]
def myinit():
    global zobristnum
    for i in range(S):
        for j in range(P):
            # Upper bound lost in this transcript; any large range works.
            zobristnum[i][j] = randint(0, 2**32)
myinit()

# Hash the board to an int.
def zhash(board):
    global zobristnum
    val = 0
    for i in range(S):
        piece = None
        if board[i] == 'B': piece = 0
        if board[i] == 'W': piece = 1
        if piece is not None:
            val ^= zobristnum[i][piece]
    return val

# Testing:
b = [' '] * 64
b[0] = 'B'
b[1] = 'W'
print(zhash(b))

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 15 Game-Playing Issues Representing moves: a (Source, Destination) approach works for some games when the squares on the board have been numbered. Source: the number of the square the piece is moving from. Destination: the number of the square the piece is moving to. (For Othello, only the destination is needed.) Opening moves: some programs use an "opening book." Some competitions require that the first 3 moves be randomly selected from a set of acceptable opening moves, to make sure that players are "ready for anything." Machines typically search to a regular maximum ply, with extra ply allowed in certain situations. Static evaluation functions in checkers or chess may take 15 to 20 different features into consideration.

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 16 Learning a Scoring Polynomial from Experience Arthur Samuel: Some Studies in Machine Learning Using the Game of Checkers. IBM Journal of Research and Development, Vol. 3, pp. 210-229, 1959. Arthur Samuel: Some Studies in Machine Learning Using the Game of Checkers. II -- Recent Progress. IBM Journal of Research and Development, Vol. 11, pp. 601-617, 1967.

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 17 Scoring Polynomial f(s) = a1*ADV + a2*APEX + a3*BACK + ... + a16*THRET There are 16 terms at any one time. They are automatically selected from a set of 38 candidate terms. 26 of them are described in the following 3 slides.
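
The polynomial is just a dot product of coefficients with term values; the sketch below shows the form with three terms, and the numbers are illustrative, not Samuel's.

```python
# Scoring polynomial f(s) = a1*T1 + a2*T2 + ... over the currently selected
# terms. Coefficients and term values here are made-up examples.
def score(coeffs, terms):
    return sum(a * t for a, t in zip(coeffs, terms))

coeffs = [8, 2, 4]             # distinct powers of 2, per the coefficient scheme
terms = [1, -2, 3]             # e.g. ADV, BACK, THRET values for some position
print(score(coeffs, terms))    # 8*1 + 2*(-2) + 4*3 = 16
```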

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 18-20 Scoring Polynomial Terms (the term definitions on these three slides are graphics not captured in the transcript)

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 21 Scoring Polynomial Coefficient Adjustment Coefficients are powers of 2, ordered so that no two are equal at any time.

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 22 Polynomial Adjustment For each term, the program keeps track of whether its value was correlated with an improvement in the game position over a series of moves. If so, its coefficient goes up; if not, it goes down.

CSE 415 -- (c) S. Tanimoto, 2008 Alpha-Beta Search 23 Checkers: Computer vs. Human Samuel's program beat a human player in a widely publicized match. Later, a program called Chinook, developed by Jonathan Schaeffer at the Univ. of Alberta, became the nominal "Man vs. Machine Champion of the World." Checkers play was the vehicle through which much of the basic research in game playing was developed.