CAP6938 Neuroevolution and Artificial Embryogeny
Neural Network Weight Optimization
Dr. Kenneth Stanley
January 18, 2006
Review
Remember: the values of the weights and the topology together determine the functionality. Given a topology, how are the weights optimized? Weights are just parameters on a structure.
[Figure: a fixed topology whose connection weights are all unknown, each labeled "?"]
Two Cases
–Output targets are known
–Output targets are not known
[Figure: feedforward network with inputs x1 and x2, hidden units h1 and h2, outputs out1 and out2, and weights w11, w21, w12]
Decision Boundaries
OR function (bipolar inputs):

  x1   x2 | output
   1    1 |   1
   1   -1 |   1
  -1    1 |   1
  -1   -1 |  -1

OR is linearly separable. Linearly separable problems do not require hidden nodes (nonlinearities).
[Figure: OR decision boundary, a single line separating the + points from the - point; a bias unit is included]
Decision Boundaries
XOR function (bipolar inputs):

  x1   x2 | output
   1    1 |  -1
   1   -1 |   1
  -1    1 |   1
  -1   -1 |  -1

XOR is not linearly separable: no single line separates the + points from the - points, so it requires at least one hidden node.
[Figure: XOR points in the plane; a bias unit is included]
Hebbian Learning
Change weights based on the correlation of connected neurons. Learning rules are local.
Simple Hebb Rule: Δw_ij = η x_i y_j
Works best when the relevance of inputs to outputs is independent.
The simple Hebb rule grows weights without bound.
Can be made incremental: w_ij(new) = w_ij(old) + η x_i y_j
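A minimal sketch of the incremental update in Python (NumPy); the layer shape, activations, and learning rate eta are illustrative assumptions, not the slide's exact setup:

```python
import numpy as np

def hebb_update(W, x, y, eta=0.1):
    """Simple Hebb rule: delta_w[i, j] = eta * x[i] * y[j],
    strengthening a connection when its two neurons fire together."""
    return W + eta * np.outer(x, y)

# One update for a 2-input, 1-output linear unit
rng = np.random.default_rng(0)
W = rng.uniform(-0.1, 0.1, size=(2, 1))
x = np.array([1.0, -1.0])     # presynaptic activations
y = x @ W                     # postsynaptic activation
W = hebb_update(W, x, y)      # repeated updates grow W without bound
```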
More Complex Local Learning Rules
Hebbian learning with a maximum magnitude:
–Separate excitatory and inhibitory update rules
–The second terms are decay terms: forgetting, which happens when the presynaptic node does not affect the postsynaptic node
Other rules are possible.
Videos: watch the connections change
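The slide's exact excitatory and inhibitory equations are not reproduced here; the sketch below shows one plausible bounded-Hebbian form of the stated ideas (growth saturating at a maximum magnitude w_max, decay when the presynaptic unit fires without driving the postsynaptic unit), assuming activations in [0, 1] and illustrative rate parameters:

```python
import numpy as np

def bounded_hebb_update(W, x, y, eta=0.1, w_max=1.0, lam=0.05):
    """Hebbian growth capped at magnitude w_max, plus a decay
    ('forgetting') term applied when the presynaptic unit is active
    but the postsynaptic unit does not respond."""
    growth = eta * np.outer(x, y) * (w_max - np.abs(W))  # saturates at |w| = w_max
    decay = lam * np.outer(x, 1.0 - y) * W               # forgetting term
    return W + growth - decay
```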
Perceptron Learning
Will converge on correct weights when the classes are linearly separable.
Single-layer learning rule: w_i ← w_i + η (t − y) x_i (and likewise for the bias)
The rule is applied until the boundary is learned.
[Figure: single-layer perceptron with a bias unit]
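A runnable sketch of the rule in Python (NumPy), trained on the bipolar OR function from the decision-boundary slide; the learning rate and epoch count are illustrative:

```python
import numpy as np

def train_perceptron(X, t, eta=0.1, epochs=20):
    """Perceptron rule: w <- w + eta * (target - output) * x,
    applied until the boundary is learned."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, t):
            y = 1.0 if x @ w + b > 0 else -1.0   # threshold activation
            w += eta * (target - y) * x          # updates only on error
            b += eta * (target - y)              # bias update
    return w, b

# Bipolar OR: linearly separable, so no hidden nodes needed
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
t = np.array([1, 1, 1, -1], dtype=float)
w, b = train_perceptron(X, t)
```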
Backpropagation
Designed for networks with at least one hidden layer.
First, activation propagates forward to the outputs. Then, errors are computed and assigned. Finally, weights are updated.
Sigmoid is a common activation function.
[Figure: network with inputs x1, x2, hidden units z1, z2, outputs y1, y2, targets t1, t2, layer-1 weights v11, v21, v12, v22, and layer-2 weights w11, w21, w12, w22]
Notation: x's are inputs, z's are hidden units, y's are outputs, t's are targets, v's are layer-1 weights, w's are layer-2 weights.
Backpropagation Algorithm
1) Initialize weights
2) While the stopping condition is false, for each training pair:
   a) Compute outputs by forward activation
   b) Backpropagate error:
      –For each output unit, error δ_k = (t_k − y_k) f′(y_in_k)   (target minus output, times the slope)
      –Weight correction Δw_jk = η δ_k z_j   (learning rate times error times hidden output)
      –Send the error back to the hidden units
      –Error contribution for each hidden unit: δ_j = f′(z_in_j) Σ_k δ_k w_jk
      –Weight correction Δv_ij = η δ_j x_i
3) Adjust weights by adding the weight corrections
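A compact sketch of the whole loop in Python (NumPy) on the XOR task from earlier, assuming sigmoid activations, 0/1 encodings, and illustrative layer sizes and learning rate:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)     # XOR targets

# 1) Initialize weights (and biases)
V = rng.uniform(-1, 1, (2, 2)); b_v = np.zeros(2)   # layer 1
W = rng.uniform(-1, 1, (2, 1)); b_w = np.zeros(1)   # layer 2
eta = 0.5

for epoch in range(10000):
    # 2a) Compute outputs by forward activation
    Z = sigmoid(X @ V + b_v)                # hidden activations
    Y = sigmoid(Z @ W + b_w)                # output activations
    # 2b) Backpropagate error
    d_out = (T - Y) * Y * (1 - Y)           # (t - y) * f'(y_in)
    d_hid = (d_out @ W.T) * Z * (1 - Z)     # error sent back to hidden units
    # 3) Adjust weights by adding the weight corrections
    W += eta * Z.T @ d_out;  b_w += eta * d_out.sum(axis=0)
    V += eta * X.T @ d_hid;  b_v += eta * d_hid.sum(axis=0)
```

Depending on the random initialization, a run like this can stall in a local optimum, which is exactly the disadvantage the next slide notes.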
Example Applications
Anything with a set of examples and known targets:
–XOR
–Character recognition
–NETtalk: reading English aloud
–Failure prediction
Disadvantage: can become trapped in local optima
Output Targets Often Not Available (Stone, Sutton, and Kuhlmann 2005)
One Approach: Value Function Reinforcement Learning
Divide the world into states and actions. Assign values to states. Gradually learn the most promising states and actions.
[Figure: grid world with Start and Goal cells; every state's value is 0 except the state next to the Goal, which is 1]
Learning to Navigate
[Figure: the same grid world shown at T=1, T=56, T=350, and T=703; the value of 1 at the Goal gradually propagates backward (values such as 0.5 and 0.9 appear along the way), until at T=703 the entire path from Start to Goal carries value 1]
How to Update State/Action Values
Q-learning rule: Q(s, a) ← Q(s, a) + α [r + γ max_a′ Q(s′, a′) − Q(s, a)]
Exploration increases the Q-values' accuracy. The best actions to take in different states become known. Works only in Markovian domains.
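A minimal tabular Q-learning sketch in Python (NumPy) on a small grid world like the one above; the 3x4 layout, reward scheme, and hyperparameters are illustrative assumptions:

```python
import numpy as np

n_states, n_actions = 12, 4         # 3x4 grid; actions: up, down, left, right
GOAL = 11                           # bottom-right corner
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1

def step(s, a):
    """Deterministic grid transitions; reward 1 only on reaching the goal."""
    row, col = divmod(s, 4)
    if a == 0:   row = max(row - 1, 0)
    elif a == 1: row = min(row + 1, 2)
    elif a == 2: col = max(col - 1, 0)
    else:        col = min(col + 1, 3)
    s2 = row * 4 + col
    return s2, (1.0 if s2 == GOAL else 0.0)

rng = np.random.default_rng(0)
for episode in range(500):
    s = 0                           # Start in the top-left corner
    while s != GOAL:
        # Epsilon-greedy exploration increases the Q-values' accuracy
        a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
        s2, r = step(s, a)
        # Q-learning rule: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2
```

As episodes accumulate, high values propagate backward from the goal, mirroring the navigation slide.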
Backprop in RL
The state/action table can be estimated by a neural network. The target learned by the network is the Q-value.
[Figure: a neural network that takes the state description and an action as input and outputs a value]
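A sketch of the combination, assuming the same small-network style as the backpropagation example; the input encoding, layer sizes, and hyperparameters are illustrative assumptions, not the slide's exact setup:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def q_value(V, W, x):
    """Forward pass: the network's output estimates Q(state, action)."""
    z = sigmoid(x @ V)
    return z, float(z @ W)          # linear output for an unbounded value

def train_step(V, W, x, r, x_next, eta=0.1, gamma=0.9):
    """One backprop step toward the Q-learning target
    r + gamma * max_a' Q(s', a'); x_next encodes the next state
    paired with its best action."""
    z, y = q_value(V, W, x)
    _, y_next = q_value(V, W, x_next)
    target = r + gamma * y_next     # bootstrapped Q-value target
    d_out = target - y              # linear output unit: slope is 1
    d_hid = d_out * W[:, 0] * z * (1 - z)
    W[:, 0] += eta * d_out * z      # in-place layer-2 update
    V += eta * np.outer(x, d_hid)   # in-place layer-1 update

rng = np.random.default_rng(0)
n_in, n_hid = 6, 8                  # state features + one-hot action
V = rng.uniform(-0.5, 0.5, (n_in, n_hid))
W = rng.uniform(-0.5, 0.5, (n_hid, 1))
x = rng.random(n_in)                # placeholder state/action encoding
train_step(V, W, x, r=0.0, x_next=rng.random(n_in))
```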
Next Week: Evolutionary Computation
For 1/23: Mitchell ch. 1 (pp. 1-31) and ch. 2 (pp. 35-80); note Section 2.3 is "Evolving Neural Networks"
For 1/25: Mitchell pp. 117-38 and the paper "No Free Lunch Theorems for Optimization" (1996) by David H. Wolpert and William G. Macready
–EC does not require targets
–EC can be a kind of RL
–EC is policy search
–EC is more than RL