For games. 1. Control  Controllers for robotic applications.  Robot’s sensory system provides inputs and output sends the responses to the robot’s motor.

for games

1. Control  Controllers for robotic applications.  Robot’s sensory system provides inputs and output sends the responses to the robot’s motor control system.  How about in games?

2. Threat Assessment  Strategy/simulation type game  Use NN to predict the type of threat presented by the player at any time during gameplay.

3. Attack or Flee  RPG – to control how certain AI creatures behave  Handle AI creature’s decision making

 Using a 3-layer feed-forward neural network as example  Structure  Input  Hidden  Output : Feed-forward process

 Input  What to choose as input is problem-specific  Keeping inputs to a minimum set will make training easier.  Forms: Boolean, enumerated, continuous  The different scales used require the input values to be normalized

 Weights  “Synaptic connection” in a biological NN  Weights influence the strength of inputs  Determining the weights involve “training” or “evolving” a NN  Every connection between neurons has an associated weight – net input to a given neuron j is calculated from a set of input i neurons  Net input to a given neuron is a linear combination of weighted inputs from other neurons from previous layer

 Activation Functions  Takes the net input to a neuron, operates on it to produce an output for the neuron  Should be nonlinear functions, for NN to work as expected  Common: Logistic (or sigmoid) function

 Activation Functions (cont’d)  Other well-known activation functions: Step function and hyperbolic tangent function

 Bias  Each neuron (except from input layer) has a bias associated to it  Bias term shifts net input along horizontal axis of activation function, changing the threshold it activates  Value: always 1 or -1  Its weight also adjusted just like other weights

 Output  Choice is also problem-specific  Same rule of thumb: Keep number to minimum  Using a logistic function as output activation, an output around 0.9 is considered activated or true, 0.1 considered not activated or false  In practice, we may not even get close to these values! So, a threshold has to be set… Using midpoint of the function (0.5) is a simple choice  If more than one output neuron is used, more than one outputs could be activated – easier to select just one output by “winner-take-all” approach

 Hidden Layer  Some NNs have no hidden layers or a few hidden layers – design  The more hidden layers, the more features the network can handle, and vice versa  Increasing the number of features (dimensionality) can enable better fit to the expected function

 Back-propagation Training  Aim of training – to find values for the weights that connect the neurons such that the input data can generate the desired output values  Need a training set  Done iteratively  Optimization process – requires some measure of merit: Error measure that needs to be minimize  Error measures: Mean square error

 Finding optimum weights iteratively 1. Start with training set consisting of input data and desired outputs 2. Initialize the weights in the NN to some small random values 3. With each set of input data, feed network and calculate output 4. Compare calculated output with desired output, compute error 5. Adjust weights to reduce error, repeat process  Each iteration is known as an “epoch”

 Computing error  Most common error measure: Mean square error, or average of the square of difference between desired and calculated output:  Goal: To get the error value as small as possible  Iteratively adjust the weight values, by calculating error associated with each neuron in output and hidden layers

 Computing error (cont’d)  Output neuron error  Hidden-layer neuron error  No error is associated with input layer neurons because those neuron values are given  Can you observe how back-propagation is at work?

 Adjusting weights  Calculate suitable adjustments for each weight in the network.  Adjustment to each weight:  New weight = Old weight +  w  Adjustments are made for each individual weight  The learning rate p is a multiplier that affects how much each weight is adjusted.

 Adjusting weights (cont’d)  Setting p too high, might overshoot the optimum weights  Setting p too low, training might take too long  Special technique  Adding “momentum” (see textbook), or regularization (another technique)

 Earlier example  Flocking and Chasing – A flock of units chasing a player  Applying neural networks  To decide whether to chase the player, evade him, or flock with other AI units  Simplistic method: Creature always attack player, OR use a FSM “brain” (or other decision- making method) to decide between those actions based on conditions

 Neural Networks:  Advantage: Not only for making decisions but to adapt their behavior given their experience with attacking the player  A “feedback” mechanism is useful to model “experience”, so that subsequent decisions can be improved or made “smarter”.

 How it works (example)  Assume we have 20 AI units moving on the screen  Behaviors: Chase, Evade, Flock with other units  Combat mode  When player and AI units come within a specified radius of one another, assume to be in combat  Combat will not be simulated – but use a simple system whereby AI units will lose a number of HP every turn through the game loop  Player also loses a number of HP proportional to number of AI units  A unit dies when HP = 0, and is respawned

 “Brain”  All AI units share the same “brain”  The brain evolves as the unit gains experience with the player  Implement back-propagation so that the NN’s weights can be adjusted in real time  Assume all AI units evolve collectively  Expectations  AI become more aggressive if player is weak  AI become more withdrawn if player is strong  AI learns to stay in flock to have better chance of defeating player

 Initialize values for neural network  Number of neurons in each layer – 4 inputs, 3 hidden neurons, 3 output neurons

 Preparation for training  Initialize learning rate to 0.2 – tuned by trial-and- error with the aim of keeping the training time down while maintaining accuracy  Data is dumped into a text file so that it can be referred during debugging  Training loop – cycle through until… Calculated error is less than some specified value, OR Number of iterations reach a specified maximum

 Sample training data for NN double TrainingSet[14][7] = { //#Friends, Hit points, Enemy Engaged, Range, Chase, Flock, Evade 0, 1, 0, 0.2, 0.9, 0.1, 0.1, 0, 1, 1, 0.2, 0.9, 0.1, 0.1, 0, 1, 0, 0.8, 0.1, 0.1, 0.1,....  14 sets of input and output values  All data values are within range from 0.0-1.0, normalized  Use 0.1 for inactive (false) output and 0.9 for active (true) output – impractical to achieve 0 or 1 for NN output, so use reasonable target value

 Training data was chosen empirically  Assume a few arbitrary input conditions and then specified a reasonable response.  In practice, you are likely to design more training sets than what was shown in example  Training loop  Error initialize to 0, can calculated for each ‘epoch’ (once thru all 14 sets of inputs/outputs)  For each set of data, 1. Feed-forward performed 2. Error calculated and accumulated 3. Back-propagation to adjust connection weights  Average error calculated (divide by 14)

 Updating AI Units – cycle thru all  Calculate distance from the current unit to target  Check if target is killed. If it is, then check where current unit is in relation to target (if it is in the combat range). If it is, retrain NN to reinforce chase behavior (unit doing something right, so train it to be more aggressive). Otherwise, retraining NN will reinforce other behaviors.

 Use the trained NN for real-time decision- making  Under the current set of conditions in real-time, output will show which behavior the unit should take  REMEMBER: Input values have to be consistently normalized as well before feeding thru NN!  Feed-forward is applied  Output values are then examined to derive the proper choice of behavior Simple way – just select output with highest activation

 Some outcomes of this AI:  If target is left to die without inflicting much damage on the units  AI units will adapt to attack more often (target perceived as weak)  If target inflicts massive damage  AI units will adapt to avoid target more (target perceived as strong)  AI units also adapt to flock together if they are faced with strong target

 Some outcomes of this AI:  Interesting emergent behavior  Leaders emerge in flocks, intermediate and trailing units will follow the lead. Q: How is it possible to design such behaviors??

For games. 1. Control  Controllers for robotic applications.  Robot’s sensory system provides inputs and output sends the responses to the robot’s motor.

Similar presentations

Presentation on theme: "For games. 1. Control  Controllers for robotic applications.  Robot’s sensory system provides inputs and output sends the responses to the robot’s motor."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

For games. 1. Control  Controllers for robotic applications.  Robot’s sensory system provides inputs and output sends the responses to the robot’s motor.

Similar presentations

Presentation on theme: "For games. 1. Control  Controllers for robotic applications.  Robot’s sensory system provides inputs and output sends the responses to the robot’s motor."— Presentation transcript:

Similar presentations

About project

Feedback