Published by Edwin Hutchinson. Modified over 9 years ago.
1
Game Playing
Evolve a strategy for two-person zero-sum games.
Help the user to determine the next move.
Constructing a game tree:
  Each node represents a state in the game.
  Each arc represents a legal move.
The minimax algorithm
Alpha-beta pruning
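The two algorithms named on this slide can be sketched as follows. This is a minimal illustration, not the presenter's code: the game tree is assumed to be a nested Python list whose leaves are payoffs for the maximizing player, and the example tree and its values are hypothetical.

```python
import math

def minimax(node, maximizing):
    """Plain minimax over a nested-list tree; leaves are numeric payoffs."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Minimax with alpha-beta pruning; returns the same value as minimax."""
    if not isinstance(node, list):
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizer above will never allow this branch
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizer above already has a better option
    return best

# Hypothetical two-level tree: the root maximizes, its children minimize.
tree = [[3, 5], [2, 9], [0, 7]]
print(minimax(tree, True), alphabeta(tree, True))
```

On this tree both functions return 3. The pruned version skips the leaves 9 and 7, because once a minimizing node has found a value no better than 3, its remaining children cannot change the root's choice.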
2
Example: Minimax Algorithm
Game Tree: We want to maximize player X's score. A value of 1 indicates a win for player X and a loss for player O. A value of 0 indicates a win for player O and a loss for player X.
[Figure: game tree with leaf values 1, 0, 0, 1, 1, 1, 1]
3
Heuristics
It is not viable to generate the entire game tree, so heuristics are used instead.
Example: Tic-Tac-Toe
Number of possible wins for X minus number of possible wins for O.
8 – 5 = 3
4 – 5 = -1
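The Tic-Tac-Toe heuristic can be sketched in a few lines. The board encoding (a 9-character string read row by row, with "." for an empty cell) and the function names are assumptions made for this illustration. A line counts as a "possible win" for a player if the opponent has no mark on it.

```python
# The eight winning lines of a 3x3 board, as cell indices 0..8.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def open_lines(board, player):
    """Count lines that contain no mark of the opponent."""
    opponent = "O" if player == "X" else "X"
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))

def heuristic(board):
    """Possible wins for X minus possible wins for O."""
    return open_lines(board, "X") - open_lines(board, "O")

print(heuristic("X........"))  # X in a corner, no O yet: 8 - 5 = 3
print(heuristic("X...O...."))  # X in a corner, O in the centre: 4 - 5 = -1
```

The two positions are assumed boards that happen to reproduce the figures on the slide. The slide's original diagrams are not available, so they may have shown different positions.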
4
Example: Minimax Algorithm
[Figure: game tree with the values 32, 16, 8, 24, 8, 16]
5
Game Tree
6
Operators
Terminals – legal moves, i.e. left (L) and right (R)
Functions: CXM1, CXM2, COM1, COM2
  XM1: first move made by player X
  XM2: second move made by player X
  OM1: first move made by player O
  OM2: second move made by player O
7
Fitness Cases
The fitness cases consist of the possible combinations of L and R for the moves that O can make.
Format: XM1, OM1, XM2, OM2
LLLL LRRR LLLR LRRL
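As a rough illustration of how such fitness cases might be enumerated, the sketch below pairs every combination of O's two moves with the moves chosen by an X strategy. The strategy shown (play L first, then copy O's first move) is hypothetical, chosen only because it happens to reproduce the four move sequences listed above.

```python
from itertools import product

def x_first_move():
    """Hypothetical X strategy: always open with L."""
    return "L"

def x_second_move(om1):
    """Hypothetical X strategy: copy O's first move."""
    return om1

def fitness_cases():
    """One move sequence XM1, OM1, XM2, OM2 per combination of O's moves."""
    cases = []
    for om1, om2 in product("LR", repeat=2):
        xm1 = x_first_move()
        xm2 = x_second_move(om1)
        cases.append(xm1 + om1 + xm2 + om2)
    return cases

print(fitness_cases())  # ['LLLL', 'LLLR', 'LRRL', 'LRRR']
```

With two moves for O and two choices per move there are four combinations, so each individual is scored on four games.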
8
Evaluation
The raw fitness of an individual is the sum of the payoffs for each fitness case.
The hits ratio is the number of fitness cases for which the individual receives a payoff at least as good as the minimax strategy.
What is the raw fitness and hits ratio of the following individuals?
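The two measures can be computed as in the sketch below. The payoff numbers are hypothetical, since the individuals shown on the slide are not available. The sketch also assumes payoffs are from player X's point of view, so "at least as good" means greater than or equal.

```python
def raw_fitness(payoffs):
    """Sum of the payoffs over all fitness cases."""
    return sum(payoffs)

def hits(payoffs, minimax_payoffs):
    """Number of fitness cases where the payoff is at least the minimax payoff."""
    return sum(1 for p, m in zip(payoffs, minimax_payoffs) if p >= m)

# Hypothetical payoffs for one individual over four fitness cases,
# and the payoff the minimax strategy would earn in each case.
individual_payoffs = [12, 8, 16, 10]
minimax_payoffs = [12, 12, 12, 12]

print(raw_fitness(individual_payoffs))            # 46
print(hits(individual_payoffs, minimax_payoffs))  # 2
```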
9
GP Parameters
Population size: 500
Max. no. of generations: 51
Initial population generation: the ramped half-and-half method, with an initial tree depth of six and a depth limit of seventeen on the size of trees created by the genetic operators.
Method of selection: fitness proportionate selection
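Fitness proportionate selection can be sketched with the standard library alone. This is a minimal illustration that assumes fitness values are non-negative and that larger is better. The function name and the sample population are hypothetical.

```python
import random

def fitness_proportionate_select(population, fitnesses, k=1, rng=None):
    """Draw k individuals with probability proportional to their fitness."""
    rng = rng or random.Random()
    return rng.choices(population, weights=fitnesses, k=k)

# Hypothetical population of three individuals.
population = ["a", "b", "c"]
fitnesses = [1.0, 3.0, 6.0]

parents = fitness_proportionate_select(population, fitnesses, k=2, rng=random.Random(0))
print(parents)
```

Here "c" is expected to be chosen about 60% of the time and "a" about 10%. An individual with zero fitness is never chosen.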
10
Evolved Solution
11
Simplified Solution
12
Pursuer - Evader
13
Game Parameters
The payoff for the pursuer is the time it takes to catch the evader.
The payoff for the evader is the time it remains free.
The information available at each stage of the game is the position of the pursuer and the evader.
A game-playing strategy will specify the angle at which the pursuer must move in order to catch the evader.
14
Terminals and Functions
T = { X, Y, R }
  X – x-coordinate of the position of the evader
  Y – y-coordinate of the position of the evader
  R – ephemeral constant in the range [-1, 1]
F = { +, -, /, EXP, IFLTZ }
  EXP – the exponential function
  IFLTZ – evaluates its first argument if its second argument is less than zero; otherwise it evaluates its third argument
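To make the primitive set concrete, the sketch below evaluates an expression tree built from these terminals and functions. Several details are assumptions not stated on the slide: trees are represented as nested tuples, division by zero returns 1 (a common "protected division" convention), and the argument of EXP is clamped to avoid overflow. IFLTZ follows the slide's wording literally.

```python
import math

def evaluate(expr, x, y):
    """Evaluate an expression tree at evader position (x, y)."""
    if isinstance(expr, str) and expr == "X":
        return x
    if isinstance(expr, str) and expr == "Y":
        return y
    if isinstance(expr, (int, float)):
        return float(expr)  # ephemeral constant R
    op, *args = expr
    if op == "IFLTZ":
        # As worded on the slide: take the first argument if the second is
        # less than zero, otherwise the third. Only one branch is evaluated.
        first, second, third = args
        if evaluate(second, x, y) < 0:
            return evaluate(first, x, y)
        return evaluate(third, x, y)
    vals = [evaluate(a, x, y) for a in args]
    if op == "+":
        return vals[0] + vals[1]
    if op == "-":
        return vals[0] - vals[1]
    if op == "/":
        # Assumption: protected division, returning 1 when dividing by zero.
        return vals[0] / vals[1] if vals[1] != 0 else 1.0
    if op == "EXP":
        # Assumption: clamp the argument so math.exp cannot overflow.
        return math.exp(min(vals[0], 50.0))
    raise ValueError(f"unknown function: {op}")

# Hypothetical individual: IFLTZ(X, Y - 0.5, X / Y)
individual = ("IFLTZ", "X", ("-", "Y", 0.5), ("/", "X", "Y"))
print(evaluate(individual, 2.0, 0.0))  # Y - 0.5 < 0, so the result is X
print(evaluate(individual, 2.0, 4.0))  # Y - 0.5 >= 0, so the result is X / Y
```

In the pursuer-evader setting, the returned number would be interpreted as the angle at which the pursuer moves.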
15
Evaluation
The fitness cases consist of 20 different positions of the evader on the plane, i.e. a set of (X, Y) coordinate values.
The raw fitness of an individual is the average time required to catch the evader over the 20 fitness cases.
An upper limit is set on the maximum time permitted.
The hits ratio is the number of fitness cases for which this time limit is not exceeded.