ICT2191 Topic 11 Genetic Algorithm Animat Design of an animat for defensive tactics Representation of Action Sequences Genetic Operators Genetic Algorithm Module Computing Fitness Getting it to Work Evaluation of Performance Reading: Champandard Chapter 34 Link to Genetic Algorithms on Website
ICT2192 Design of Animat for Defensive Tactics In combat game terms, defensive tactics is the sequence of actions carried out by an animat to protect itself when it comes under attack This is a natural choice for learning behaviour by genetic algorithm, because the animat is in a highly competitive situation with a survival mandate It should be possible to decide on the fittest behaviours and select for them in the evolving sequence of actions To keep things simple, we will focus on only two behaviours – dodging enemy fire and rocket jumping But the method could be extended to include other defensive moves, such as weaving and seeking cover
ICT2193 Representation of Action Sequences The representation is always important in AI problems, but particularly so for genetic algorithm solutions When attacked, we want our animat to perform an optimal action sequence that will preserve its health That means optimal ordering of a list of basic actions, optimal parameters to control them and optimal timing DodgingMove (3 weights) Rocket Jumping Look Fire - Jump - behaviour action(parameters)
ICT2194 Representation of Action Sequences The only complicated matter is the 3-weights for Move. Three pre-chosen vectors represent fleeing from: the projectile, the projected hit point given the current path, and the point of impact There could be reasons to move away from each of these points. Can assign a weight to each, then move along the path resulting from the combined vector potential collision predicted impact from impact from collision from projectile animat current path
ICT2195 Representation of Action Sequences Useful behaviour involves not only the right order of actions, with the right parameters, but also at the right time Each action in our ordered list of actions must be associated with a start-time This could be a relative offset, (say in msec) from the last action Or maybe absolute timing, with an offset from the start of the sequence – reduces dependencies in action timings move (x,y,z) look (up) jump () 400msec 200msec move (x,y,z) look (up) jump () 1500msec 2040msec 3050msec
ICT2196 Data Structure An action sequence will be an array of any size up to a maximum length. The array contains tokens representing any of the four possible actions: Each element of the array will be a structure containing - a symbol representing the action for that part of the sequence - a floating-point field for the absolute time offset in seconds from the beginning of sequence - if the action is Look, a direction (one of up, down, ahead) - if the action is Move, a Vec3f of three “fleeing weights” to be used as a steering vector To initialise, an array of random length will be assigned a random sequence of actions, parameterised by a random time offset
ICT2197 Genetic Operators We need operators for selection, crossover and mutation In the hypothetical example in Chapter 34, two parents are selected from a randomly-generated population with a probability proportional to their fitness (Note - Kanga actually uses tournament – best of 3 randomly selected individuals) For crossover, use the simplest type: one-point crossover. This chooses a random index on the arrays of two parents, then swaps the subarrays: move fire look jump move fire look move crossover index=3 move fire look jump move fire look move swap parents offspring
ICT2198 Genetic Operators Two forms of mutation are discussed in the example of Chapter 34: 1) There is a low probability that the length of a sequence may increase or decrease by 1 2) There is a (different) low probability that an action and its parameters, or the timing of an action will change randomly move fire look jump move original mutant move fire look move fire look move original mutant move fire move fire
ICT2199 Computing Fitness We are actually working with two behaviours. It follows that we should try to optimise two fitness functions (or one fuction with two components): Rocket Jumping – the point of this is to jump high, so we measure upward movement and try to optimise that. To avoid the loophole of being rewarded for climbing stairs, etc., this is only measured when the animats feet are off the ground We want the animat to discover that rocket jumping is the best method of jumping high. We will specially reward very high jumps by squaring the vertical distance off the ground Dodging - begin with distance from point of explosion. If the animat is hit, this will be zero; otherwise, simply linear distance increases with better performance To avoid the loophole of obtaining high fitness scores from simply standing far from an explosion, use average velocity - the change in distance (animat to crossing point) between time the weapon fired and time of explosion Some Quake 2 code is required to implement these two functions. Each returns a single floating point number.
ICT21910 Genetic Algorithm Module A standard module can be parameterised to handle most of the variations on the GA process – so most of the work is really building a set of helper functions to measure the fitness or the operators Champandard wants to abstract the application of an evolutionary style of computation away from the details of how the populations are generated, how fitness is computed, and even what kind of evolutionary computation is to be performed In the FEAR code this would ideally be done as a set of interfaces which lie between components being evolved and the (general purpose, evolutionary, genetic) algorithm. In Kanga, this idea has been shortcut. There are four main functions, into which many of the needed operations are mixed
ICT21911 The Kanga animat Consists of fairly hacky code to implement (part of) this process A degenerate form of genetic operator is used – does not do crossover, only mutation as described earlier The Think() function contains the following: constants: k_PopulationSize=8, k_MaxActionsPerSequnce=8, k_MaxSequenceLength=2.0f, kDodgesPerTrial=3 MonitorSituation() - Decides if a projectile has been fired, and tries to track it. This function also accumulates fitness for jumping when the animat is off the ground (should really be in AssignFitness!) GenerateSequence() - Creates a new sequence from a parent, applying the mutation at prob of 0.1 to change length, prob of 0.4 to change action. No crossover. ApplySequence() - Applies the generated sequence using ApplyAction(), a switch statement which calls embodied I/O functions according to the action symbol AssignFitness() - Evaluates the performance of dodging (every 3 dodges) and jumping using inverse change in fitness.
ICT21912 Evaluating the Tactical Performance Kanga can be tried out by putting it into a small environment such as arena1 and going after it as a player with something that launches rockets (not a machine gun) To automate the process, put Salty into the room – doesn’t move but fires rockets continuously! You should see a gradual improvement in the jumping and dodging performance. Only dodging fitness seems to be reported for the current population – this might be remedied by better printf statements It might take some time before the animat discovers rocket jumping Obviously more work needs to be done to properly evaluate the performance of animats in FEAR. Some kind of graphical tool which lets us keep track of measures continuously in real time would be useful