1
Supervised learning
1. Early learning algorithms
2. First order gradient methods
3. Second order gradient methods
2
Early learning algorithms
Designed for single-layer neural networks; generally more limited in their applicability. Some of them are:
– Perceptron learning
– LMS or Widrow-Hoff learning
– Grossberg learning
3
Perceptron learning
1. Randomly initialize all of the network's weights.
2. Apply the inputs and compute the outputs (feedforward).
3. Compute the errors.
4. Update each weight with the perceptron learning rule (a code sketch follows below).
5. Repeat steps 2 to 4 until the errors reach a satisfactory level.
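The weight-update formula on the slide was not transcribed. Below is a minimal Python sketch of the standard perceptron rule, w_new = w_old + e*p and b_new = b_old + e with e = t - a, assuming a single hard-limit neuron; the helper name perceptron_train is hypothetical.

import numpy as np

def perceptron_train(X, t, epochs=100):
    # X: (n_samples, n_inputs) input patterns; t: targets in {0, 1}
    w = np.random.randn(X.shape[1])              # step 1: random initial weights
    b = np.random.randn()
    for _ in range(epochs):
        total_error = 0.0
        for p, target in zip(X, t):
            a = 1.0 if w @ p + b >= 0 else 0.0   # step 2: feedforward (hard-limit output)
            e = target - a                       # step 3: error e = t - a
            w += e * p                           # step 4: w_new = w_old + e*p
            b += e                               #         b_new = b_old + e
            total_error += abs(e)
        if total_error == 0:                     # step 5: stop when every pattern is correct
            break
    return w, b

# Example: learn the logical AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 0, 0, 1], dtype=float)
w, b = perceptron_train(X, t)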
4
Performance Optimization
Gradient-based methods
5
Basic Optimization Algorithm
6
Steepest Descent (first order Taylor expansion)
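The equations on this slide did not survive transcription. As a hedged reconstruction of the standard derivation: expanding the performance index F to first order about the current iterate,

F(x_k + \Delta x_k) \approx F(x_k) + g_k^T \Delta x_k, \qquad g_k = \nabla F(x)\big|_{x = x_k},

the decrease is largest when the step points opposite the gradient, giving the steepest descent update

x_{k+1} = x_k - \alpha_k g_k,

with learning rate \alpha_k > 0.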
7
Example
8
Plot
9
LMS or Widrow-Hoff learning
We first introduce the ADALINE (ADAptive LInear NEuron) network.
10
LMS or Widrow-Hoff learning (also called the Delta Rule)
The ADALINE network has the same basic structure as the perceptron network.
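The network diagram did not survive transcription. For context (a standard fact about ADALINE, not copied from the slide): the only structural difference is the transfer function, so the output is linear,

a = \mathrm{purelin}(W\mathbf{p} + \mathbf{b}) = W\mathbf{p} + \mathbf{b}.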
11
Approximate Steepest Descent
12
Approximate Gradient Calculation
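The derivation on this slide did not survive transcription. A hedged reconstruction of the standard LMS argument: the mean squared error F(x) = E[e^2] is approximated by the instantaneous squared error \hat{F}(x) = e^2(k), whose gradient can be computed exactly:

\frac{\partial e^2(k)}{\partial w_{1,j}} = 2 e(k) \frac{\partial e(k)}{\partial w_{1,j}} = -2 e(k)\, p_j(k), \qquad \frac{\partial e^2(k)}{\partial b} = -2 e(k),

since e(k) = t(k) - \big(\mathbf{w}^T \mathbf{p}(k) + b\big). The approximate gradient is therefore \hat{\nabla} F(x) = -2 e(k)\, \mathbf{z}(k), where \mathbf{z}(k) stacks the inputs and a 1 for the bias.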
13
LMS Algorithm
This algorithm is derived from the steepest descent algorithm, using the approximate gradient above (a code sketch follows below).
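The update equations were not transcribed. Below is a minimal Python sketch of the Widrow-Hoff update in the form w(k+1) = w(k) + 2*alpha*e(k)*p(k), b(k+1) = b(k) + 2*alpha*e(k) (Hagan's convention; some texts fold the factor 2 into alpha), for a single linear neuron; the helper name lms_train is hypothetical.

import numpy as np

def lms_train(X, t, alpha=0.05, epochs=50):
    # X: (n_samples, n_inputs) input patterns; t: (n_samples,) real-valued targets
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for p, target in zip(X, t):
            a = w @ p + b              # linear (ADALINE) output
            e = target - a             # instantaneous error
            w += 2 * alpha * e * p     # approximate steepest-descent step
            b += 2 * alpha * e
    return w, b

# Example: fit the linear map t = 2*p1 - p2 + 1 from noisy samples
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
t = 2 * X[:, 0] - X[:, 1] + 1 + 0.01 * rng.standard_normal(200)
w, b = lms_train(X, t)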
14
Multiple-Neuron Case
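The matrix form on this slide was not transcribed. A hedged reconstruction of the standard multiple-neuron LMS update:

W(k+1) = W(k) + 2\alpha\, \mathbf{e}(k)\, \mathbf{p}^T(k), \qquad \mathbf{b}(k+1) = \mathbf{b}(k) + 2\alpha\, \mathbf{e}(k),

where \mathbf{e}(k) = \mathbf{t}(k) - \mathbf{a}(k) is the error vector for the k-th presentation.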
15
Difference between perceptron learning and LMS learning: the derivative.
The linear activation function is differentiable, but the sign function (bipolar or unipolar hard limit) is not.
16
Grossberg learning (associative learning)
Sometimes known as instar and outstar training.
Updating rule: the weights are moved toward a target vector, which could be the desired input values (instar training, example: clustering) or the desired output values (outstar training), depending on the network structure.
Grossberg network (see Hagan for more details).
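The update rule itself was not transcribed. A hedged reconstruction from Hagan's treatment of associative learning:

\Delta\, {}_i\mathbf{w}(q) = \alpha\, a_i(q)\,\big(\mathbf{p}(q) - {}_i\mathbf{w}(q-1)\big) \quad \text{(instar)}, \qquad \Delta\, \mathbf{w}_j(q) = \alpha\, p_j(q)\,\big(\mathbf{a}(q) - \mathbf{w}_j(q-1)\big) \quad \text{(outstar)},

i.e., when a unit is active, its weight vector moves toward the input pattern (instar) or toward the desired output pattern (outstar).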
17
First order gradient method
Backpropagation
18
Multilayer Perceptron: R – S¹ – S² – S³ network
19
Example
20
Elementary Decision Boundaries
22
Total Network
23
Function Approximation Example
24
Nominal Response
25
Parameter Variations
26
Multilayer Network
27
Performance Index
28
Chain Rule
29
Gradient Calculation
30
Steepest Descent
31
Jacobian Matrix
32
Backpropagation (Sensitivities)
33
Initialization (Last Layer)
34
Summary
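The equations in this summary were not transcribed. A hedged reconstruction of the standard backpropagation summary in Hagan's notation, where \dot{F}^m(\mathbf{n}^m) is the diagonal matrix of transfer function derivatives at layer m:

Forward propagation: \mathbf{a}^0 = \mathbf{p}, \qquad \mathbf{a}^{m+1} = \mathbf{f}^{m+1}\big(W^{m+1}\mathbf{a}^m + \mathbf{b}^{m+1}\big), \quad m = 0, \dots, M-1.

Sensitivities (backward): \mathbf{s}^M = -2\,\dot{F}^M(\mathbf{n}^M)\,(\mathbf{t} - \mathbf{a}), \qquad \mathbf{s}^m = \dot{F}^m(\mathbf{n}^m)\,(W^{m+1})^T \mathbf{s}^{m+1}.

Weight update (approximate steepest descent): W^m(k+1) = W^m(k) - \alpha\, \mathbf{s}^m (\mathbf{a}^{m-1})^T, \qquad \mathbf{b}^m(k+1) = \mathbf{b}^m(k) - \alpha\, \mathbf{s}^m.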
35
Back-propagation training algorithm
Backprop adjusts the weights of the NN in order to minimize the network's total mean squared error.
– Network activation: forward step
– Error propagation: backward step
(A code sketch of both steps follows below.)
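As an illustration of the forward and backward steps (a sketch, not the deck's exact network): a two-layer network with a log-sigmoid hidden layer and a linear output, trained on one sample of the 1 + sin(pi*p/4) target used in the textbook's 1-2-1 example. The layer sizes, initial weights, and the helper name backprop_step are assumptions.

import numpy as np

def logsig(n):
    return 1.0 / (1.0 + np.exp(-n))

def backprop_step(p, t, W1, b1, W2, b2, alpha=0.1):
    # Forward step: propagate the input through both layers
    n1 = W1 @ p + b1
    a1 = logsig(n1)                       # hidden layer (log-sigmoid)
    n2 = W2 @ a1 + b2
    a2 = n2                               # output layer (linear)

    # Backward step: propagate sensitivities from the last layer back
    s2 = -2.0 * (t - a2)                  # s^2 = -2*F'(n^2)*(t - a); F' = I for a linear layer
    s1 = (a1 * (1.0 - a1)) * (W2.T @ s2)  # s^1 = F'(n^1)*(W^2)^T s^2; logsig' = a*(1 - a)

    # Approximate steepest-descent weight update
    W2 -= alpha * np.outer(s2, a1)
    b2 -= alpha * s2
    W1 -= alpha * np.outer(s1, p)
    b1 -= alpha * s1
    return W1, b1, W2, b2

# Example: one iteration on a 1-2-1 network with small random initial weights
rng = np.random.default_rng(0)
W1, b1 = rng.uniform(-0.5, 0.5, (2, 1)), rng.uniform(-0.5, 0.5, 2)
W2, b2 = rng.uniform(-0.5, 0.5, (1, 2)), rng.uniform(-0.5, 0.5, 1)
p, t = np.array([1.0]), np.array([1.0 + np.sin(np.pi / 4)])
W1, b1, W2, b2 = backprop_step(p, t, W1, b1, W2, b2)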
36
Example: Function Approximation
37
Network
38
Initial Conditions
39
Forward Propagation
40
Transfer Function Derivatives
41
Backpropagation
42
Weight Update
43
Choice of Architecture
44
Choice of Network Architecture
45
Convergence: global minimum (left), local minimum (right)
46
Generalization
47
Disadvantages of the BP algorithm
– Slow convergence speed
– Sensitivity to initial conditions
– Can become trapped in local minima
– Instability if the learning rate is too large
Note: despite the above disadvantages, it is widely used in the control community, and there are numerous extensions that improve the BP algorithm.
48
Improved BP algorithms (first order gradient methods)
1. BP with momentum (a sketch of the momentum update follows below)
2. Delta-bar-delta
3. Decoupled momentum
4. RProp
5. Adaptive BP
6. Trinary BP
7. BP with adaptive gain
8. Extended BP
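The update rules for these variants were not shown on the slide. As one illustration, a minimal sketch of the momentum modification, assuming Hagan's form \Delta W^m(k) = \gamma\, \Delta W^m(k-1) - (1-\gamma)\,\alpha\, \mathbf{s}^m (\mathbf{a}^{m-1})^T (other formulations omit the (1-\gamma) factor); the helper name momentum_update is hypothetical.

import numpy as np

def momentum_update(W, dW_prev, s, a_prev, alpha=0.1, gamma=0.9):
    # The weight change is a low-pass-filtered version of the steepest-descent
    # step, which damps oscillations and can allow a larger learning rate.
    dW = gamma * dW_prev - (1.0 - gamma) * alpha * np.outer(s, a_prev)
    return W + dW, dW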