1
Neural Network - 2 Mayank Vatsa
Lecture slides are prepared using several teaching resources and no authorship is claimed for any slides.
2
The Perceptron Binary classifier functions
Threshold activation function
3
The Perceptron: Threshold Activation Function
Binary classifier functions Threshold activation function
4
Derivation of GDR: the vector of derivatives that forms the gradient can be obtained by differentiating the error function E.
The weight update rule for standard gradient descent can be summarized as shown below.
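The update rule itself was rendered as an image on the original slide; a hedged reconstruction of the standard gradient-descent rule, assuming a sum-of-squared-errors cost E and learning rate η, is:

E = \frac{1}{2}\sum_j (t_j - y_j)^2, \qquad \Delta w_{ji} = -\eta\,\frac{\partial E}{\partial w_{ji}}, \qquad w_{ji} \leftarrow w_{ji} + \Delta w_{ji}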
5
Example: Input 0: x1 = 12; Input 1: x2 = 4; Weight 0: 0.5; Weight 1: -1
Input 0 * Weight 0 ⇒ 12 * 0.5 = 6; Input 1 * Weight 1 ⇒ 4 * -1 = -4; Sum = 6 + (-4) = 2; Output = sign(sum) ⇒ sign(2) ⇒ +1
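A minimal Python sketch of the same computation (the function and variable names are illustrative, not from the slide):

```python
def perceptron_output(inputs, weights):
    # Weighted sum of the inputs followed by a threshold (sign) activation.
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= 0 else -1

# Values from the slide: x1 = 12, x2 = 4, w0 = 0.5, w1 = -1
print(perceptron_output([12, 4], [0.5, -1]))  # 12*0.5 + 4*(-1) = 2 -> +1
```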
6
AB problem
7
XOR problem
8
XOR problem 1
9
XOR problem 2
10
XOR problem
11
XOR problem
12
XOR problem
13
XOR problem
14
Three-layer networks: inputs x1, x2, …, xn; hidden layers; output.
15
Feed-forward layered network
Layers (from input to output): input layer, 1st hidden layer, 2nd hidden layer, output layer.
16
Different Non-Linearly Separable Problems
Types of decision regions by network structure:
Single-layer: half plane bounded by a hyperplane.
Two-layer: convex open or closed regions.
Three-layer: arbitrary regions (complexity limited by the number of nodes).
(The exclusive-OR problem, class separation, and most general region shapes columns were shown as figures for each structure.)
17
In the perceptron / single-layer nets, we used gradient descent on the error function to find the correct weights: Δwji = (tj − yj) xi. Errors and updates are local to the node, i.e. the change in the weight from node i to output j (wji) is controlled by the input xi that travels along the connection and the error signal (tj − yj) from output j.
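As a concrete illustration, here is a small Python sketch of this local update rule applied to a single threshold unit; the learning rate eta and the AND-gate training data are illustrative assumptions, not from the slides:

```python
import numpy as np

def train_single_unit(X, t, eta=0.1, epochs=20):
    # Local update rule from the slide: dw_ji = eta * (t_j - y_j) * x_i
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, t):
            y = 1 if np.dot(w, x) + b >= 0 else 0
            w += eta * (target - y) * x   # input along the connection times error
            b += eta * (target - y)
    return w, b

# Illustrative linearly separable data: the AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 0, 0, 1])
print(train_single_unit(X, t))
```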
18
But with more layers, how are the weights for the first two layers found when the error is computed only at layer 3? There is no direct error signal for the first layers!
19
Objective of Multilayer NNet
Training set of examples indexed by k: inputs x = (x1, x2, …, xn), weights w1, …, wm; goal: the network output matches the desired output for all k.
20
Learn the Optimal Weight Vector
Training set of examples indexed by k: inputs x = (x1, x2, …, xn), weights w1, …, wm; goal: the network output matches the desired output for all k.
21
First Complex NNet Algorithm
Multilayer feedforward NNet
22
Training: Backprop algorithm
Backprop searches for weight values that minimize the total error of the network over the set of training examples. Training repeatedly performs the following two passes: Forward pass: compute the outputs of all units in the network and the error of the output layer. Backward pass: the network error is used to update the weights (the credit-assignment problem). Starting at the output layer, the error is propagated backwards through the network, layer by layer, by recursively computing the local gradient of each neuron.
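A minimal NumPy sketch of one forward/backward pass for a network with one hidden layer (sigmoid activations, mean-squared-error loss); layer sizes, variable names, and the loss choice are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative shapes: 2 inputs, 3 hidden units, 1 output
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
eta = 0.5

def train_step(x, t):
    global W1, b1, W2, b2
    # Forward pass: compute unit outputs layer by layer
    h = sigmoid(W1 @ x + b1)
    y = sigmoid(W2 @ h + b2)
    # Backward pass: local gradients (deltas), output layer first
    delta_out = (y - t) * y * (1 - y)              # error signal at the output
    delta_hid = (W2.T @ delta_out) * h * (1 - h)   # error propagated to hidden layer
    # Gradient-descent weight updates
    W2 -= eta * np.outer(delta_out, h); b2 -= eta * delta_out
    W1 -= eta * np.outer(delta_hid, x); b1 -= eta * delta_hid
    return float(0.5 * np.sum((y - t) ** 2))

# Example: a few steps on one training pair
for _ in range(5):
    err = train_step(np.array([1.0, 0.0]), np.array([1.0]))
print(err)
```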
23
Back-propagation training algorithm illustrated:
Backprop adjusts the weights of the NN in order to minimize the network's total mean squared error. Forward step: network activation and error computation. Backward step: error propagation.
24
Learning Algorithm: Backpropagation
The pictures below illustrate how signals propagate through the network. The symbol w(xm)n denotes the weight of the connection between network input xm and neuron n in the input layer; yn denotes the output signal of neuron n.
25
Learning Algorithm: Backpropagation
26
Learning Algorithm: Backpropagation
27
Learning Algorithm: Backpropagation
Propagation of signals through the hidden layer. The symbol wmn denotes the weight of the connection between the output of neuron m and the input of neuron n in the next layer.
28
Learning Algorithm: Backpropagation
29
Learning Algorithm: Backpropagation
30
Learning Algorithm: Backpropagation
Propagation of signals through the output layer.
31
Learning Algorithm: Backpropagation
In the next step of the algorithm, the output signal of the network, y, is compared with the desired output value (the target), which is found in the training data set. The difference is called the error signal δ of the output-layer neuron.
32
Learning Algorithm: Backpropagation
The idea is to propagate the error signal δ (computed in a single teaching step) back to all neurons whose output signals were inputs to the neuron under discussion.
33
Learning Algorithm: Backpropagation
The idea is to propagate the error signal δ (computed in a single teaching step) back to all neurons whose output signals were inputs to the neuron under discussion.
34
Learning Algorithm: Backpropagation
The weight coefficients wmn used to propagate errors back are the same as those used when computing the output value; only the direction of data flow is changed (signals are propagated from outputs to inputs, one layer after the other). This technique is used for all network layers. If the propagated errors come from several neurons, they are added. The illustration is below:
35
Learning Algorithm: Backpropagation
Once the error signal for each neuron has been computed, the weight coefficients of each neuron's input connections may be modified. In the formulas below, df(e)/de represents the derivative of the activation function of the neuron whose weights are being modified.
36
Learning Algorithm: Backpropagation
Once the error signal for each neuron has been computed, the weight coefficients of each neuron's input connections may be modified. In the formulas below, df(e)/de represents the derivative of the activation function of the neuron whose weights are being modified.
37
Learning Algorithm: Backpropagation
Once the error signal for each neuron has been computed, the weight coefficients of each neuron's input connections may be modified. In the formulas below, df(e)/de represents the derivative of the activation function of the neuron whose weights are being modified.
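The concrete formulas were shown as figures; a hedged reconstruction of the generic update they instantiate, with η the learning coefficient, δn the error signal of neuron n, df(e)/de the activation-function derivative, and ym the signal arriving on the connection being modified (a network input xm for the first layer), is:

w'mn = wmn + η · δn · (df(e)/de) · ym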
38
Bias in NNet
39
Bias in NNet
40
Bias in NNet. This example shows that "the bias term allows us to make affine transformations of the data."
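A small Python illustration of this point (the weight and bias values are made up for illustration): without a bias, a linear unit's decision boundary must pass through the origin; adding the bias shifts it, making the mapping affine rather than purely linear.

```python
import numpy as np

w = np.array([1.0, 1.0])
b = -1.5  # the bias shifts the decision boundary away from the origin

def fires(x, use_bias):
    z = np.dot(w, x) + (b if use_bias else 0.0)
    return int(z >= 0)

x = np.array([1.0, 0.0])
print(fires(x, use_bias=False))  # 1: w.x = 1 >= 0, boundary forced through the origin
print(fires(x, use_bias=True))   # 0: w.x + b = -0.5 < 0, boundary shifted by the bias
```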
41
Single-Hidden Layer NNet
Diagram: inputs x = (x1, x2, …, xn) feed m hidden units (1, 2, …, m), whose outputs are combined through weights w1, w2, …, wm.
42
Radial Basis Function Networks
Diagram: inputs x = (x1, x2, …, xn) feed m hidden units (1, 2, …, m); their outputs are combined through weights w1, w2, …, wm to produce the output y.
43
Non-Linear Models: weights adjusted by the learning process.
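The model equation on this slide appeared as an image; a hedged reconstruction of the non-linear model this section builds toward, with fixed basis functions φj and weights wj adjusted by the learning process, is:

f(\mathbf{x}) = \sum_{j=1}^{m} w_j\,\varphi_j(\mathbf{x})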
44
Typical Radial Functions
Gaussian; Hardy multiquadratic; inverse multiquadratic.
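The formulas were shown as images on the slide; the standard forms of these radial functions, with r the distance from the center and σ, c width parameters, are:

\varphi(r) = e^{-r^2/(2\sigma^2)} \quad \text{(Gaussian)}
\varphi(r) = \sqrt{r^2 + c^2} \quad \text{(Hardy multiquadratic)}
\varphi(r) = \frac{1}{\sqrt{r^2 + c^2}} \quad \text{(inverse multiquadratic)}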
45
Gaussian Basis Function (σ = 0.5, 1.0, 1.5)
46
Most General RBF
47
The Topology of RBF NNet
Diagram: inputs x1, x2, …, xn (feature vectors) → hidden units (subclasses) → output units y1, …, ym (classes).
48
Radial Basis Function Networks
Training set of examples indexed by k: inputs x = (x1, x2, …, xn), weights w1, …, wm; goal: the network output matches the desired output for all k.
49
Learn the Optimal Weight Vector
Training set of examples indexed by k: inputs x = (x1, x2, …, xn), weights w1, …, wm; goal: the network output matches the desired output for all k.
50
Regularization
Training set of examples indexed by k; goal: fit the desired output for all k while penalizing large weights. If regularization is not needed, set the regularization coefficient to zero.
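The objective itself appeared as an image; a hedged reconstruction of the usual regularized least-squares criterion, with λ the regularization coefficient (λ = 0 recovers the unregularized goal), is:

\min_{\mathbf{w}} \; \sum_{k}\Bigl(d_k - \sum_{j=1}^{m} w_j\,\varphi_j(\mathbf{x}_k)\Bigr)^{2} + \lambda\,\lVert\mathbf{w}\rVert^{2}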
51
Learn the Optimal Weight Vector
Minimize the (regularized) sum of squared errors over the training set.
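With the basis functions fixed, this minimization is a linear least-squares problem; its standard closed-form solution (Φ the design matrix with entries Φkj = φj(xk), d the vector of targets, λ the regularization coefficient) is:

\mathbf{w}^{*} = \bigl(\Phi^{\top}\Phi + \lambda I\bigr)^{-1}\Phi^{\top}\mathbf{d}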
52
Learning Kernel Parameters
Diagram: inputs x1, …, xn → l kernels (hidden units 1, …, l) → outputs y1, …, ym via weights w11, …, w1l, …, wm1, …, wml; learned from the training set.
53
What to Learn?
Weights wij's; centers μj's of the basis functions φj's; widths σj's of the φj's; number of φj's.
54
Two-Stage Training
Step 1 determines the centers μj's of the φj's, the widths σj's of the φj's, and the number of φj's.
Step 2 determines the weights wij's.
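A hedged Python sketch of this two-stage procedure, using a random subset of training points as centers, a simple width heuristic, and linear least squares for the output weights; these specific choices and all names are illustrative assumptions, not prescribed by the slides:

```python
import numpy as np

def train_rbf_two_stage(X, d, n_centers=10, seed=0):
    rng = np.random.default_rng(seed)

    # Stage 1: fix the basis functions - choose centers (here a random subset
    # of training points) and a shared width from the mean inter-center distance.
    centers = X[rng.choice(len(X), size=n_centers, replace=False)]
    dists = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    sigma = dists.mean() + 1e-8

    # Stage 2: build the matrix of Gaussian responses and solve for the weights.
    Phi = np.exp(-np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1) ** 2
                 / (2 * sigma ** 2))
    w, *_ = np.linalg.lstsq(Phi, d, rcond=None)
    return centers, sigma, w

# Illustrative usage: fit y = sin(x) on random 1-D inputs
X = np.sort(np.random.default_rng(1).uniform(-3, 3, size=(50, 1)), axis=0)
d = np.sin(X[:, 0])
centers, sigma, w = train_rbf_two_stage(X, d)
```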
55
Learning Rule: the backpropagation learning rule also applies to RBF networks.
56
Three-layer RBF neural network
57
Auto-encoders
58
Creating your own architecture
Variables: input; hidden-layer characteristics; output layer; parameters and weights to learn.
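As a closing illustration, a minimal NumPy sketch of turning these choices (input size, hidden-layer sizes and activation, output size) into a concrete network; all names and defaults here are illustrative assumptions:

```python
import numpy as np

def build_network(n_inputs, hidden_sizes, n_outputs, seed=0):
    # Create one (weight matrix, bias vector) pair per layer.
    rng = np.random.default_rng(seed)
    sizes = [n_inputs] + list(hidden_sizes) + [n_outputs]
    return [(rng.normal(scale=0.1, size=(m, n)), np.zeros(m))
            for n, m in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    # Forward pass: tanh on hidden layers, linear output layer.
    for i, (W, b) in enumerate(params):
        x = W @ x + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

net = build_network(n_inputs=4, hidden_sizes=[8, 8], n_outputs=2)
print(forward(net, np.ones(4)))
```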