1
Prof. Pushpak Bhattacharyya, IIT Bombay
CS 621 Artificial Intelligence
Lecture /10/05
Prof. Pushpak Bhattacharyya
Feedforward Nets
2
Perceptron
Cannot classify data that are not linearly separable
Real-life problems are typically non-linear
3
Basic Computing Paradigm
Setting up hyperplanes
Use higher power surfaces
Tolerate error
Use multiple perceptrons
4
A quadratic surface can separate the data. Difficult to train.
5
Pocket Algorithm
Algorithm evolved in 1985; essentially uses the PTA (perceptron training algorithm)
Basic idea: always preserve the best weights obtained so far in the “pocket”
Replace the pocket weights only if the changed weights are better (i.e. they result in reduced error)
Tolerates error
Used in connectionist expert systems
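The pocket idea can be sketched in a few lines. This is a simplified illustration, not the 1985 formulation: it assumes a threshold unit that fires when the weighted sum exceeds 0, folds the bias in as a last weight, cycles through the data in order, and compares total training error after every perceptron update. The names `predict`, `errors` and `pocket` are hypothetical.

```python
def predict(w, x):
    # Threshold unit; the bias is the last weight, paired with a constant input 1.
    s = sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))
    return 1 if s > 0 else 0

def errors(w, data):
    # Number of misclassified training examples.
    return sum(1 for x, t in data if predict(w, x) != t)

def pocket(data, epochs=20):
    w = [0.0] * (len(data[0][0]) + 1)
    best_w, best_err = list(w), errors(w, data)    # the "pocket"
    for _ in range(epochs):
        for x, t in data:
            y = predict(w, x)
            if y != t:
                # Ordinary perceptron update on a misclassified example.
                w = [wi + (t - y) * xi for wi, xi in zip(w, list(x) + [1.0])]
                e = errors(w, data)
                if e < best_err:                   # keep only strictly better weights
                    best_w, best_err = list(w), e
    return best_w, best_err

AND_DATA = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
XOR_DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
```

On linearly separable data such as AND the pocket ends up holding an error-free weight vector; on XOR it can only hold the least-bad weights seen, which is the sense in which the algorithm tolerates error.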
6
Multilayer Feedforward Network
Geometrically
[Figure: diagram labelled with y, h1, h2, x1, x2]
7
Algebraically
LINEARIZATION
X1 ⊕ X2 = X1 X2’ + X1’ X2 = OR (AND (X1 , X2’ ) , AND (X1’ , X2 ))
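The identity can be checked by enumerating the truth table. A small sketch, with the hypothetical helper name `xor_sum_of_products`, assuming 0/1 inputs:

```python
from itertools import product

def xor_sum_of_products(x1, x2):
    # OR(AND(x1, NOT x2), AND(NOT x1, x2)) on 0/1 inputs.
    return int((x1 and not x2) or (not x1 and x2))

# Truth table keyed by (x1, x2).
table = {(x1, x2): xor_sum_of_products(x1, x2)
         for x1, x2 in product((0, 1), repeat=2)}
```

Each AND term is linearly separable on its own, which is what lets threshold neurons compute them in a hidden layer.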
8
Example
1. Output layer neurons
2. Input layer neurons
3. Hidden layer neurons
1 & 3 are also called computation neurons
[Figure: network with inputs x1, x2 and output y; labels 0.5, 1, 1.5]
9
Hidden Layer Neurons
They contribute to the power of the network.
How many hidden layers?
How many neurons per layer?
Pure feed-forward network: no connections that jump over a layer
10
XOR Example
[Figure: two-layer threshold network for XOR. Output neuron: threshold 0.5, w1 = 1, w2 = 1. Two hidden neurons, each labelled 1, computing x1 x2’ and x1’ x2. Edge labels from inputs x1, x2: 1.5, -1, -1, 1.5]
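A minimal sketch of such a network follows. The output threshold 0.5 and output weights 1, 1 come from the slide; the assignment of the labels 1.5 and -1 to particular edges, and the hidden thresholds of 1, are assumptions chosen so that each hidden neuron computes one AND term. The function names are hypothetical.

```python
def step(net, theta):
    # Threshold neuron: output 1 iff the net input exceeds theta.
    return 1 if net > theta else 0

def xor_net(x1, x2):
    # Assumed weights: the hidden neurons fire only for (1, 0) and (0, 1) respectively.
    h1 = step(1.5 * x1 - 1.0 * x2, 1.0)      # x1 AND NOT x2
    h2 = step(-1.0 * x1 + 1.5 * x2, 1.0)     # NOT x1 AND x2
    return step(1.0 * h1 + 1.0 * h2, 0.5)    # OR of the two hidden outputs

outputs = [xor_net(x1, x2) for x1 in (0, 1) for x2 in (0, 1)]
```

For (1, 1) each hidden net input is 0.5, below the assumed threshold of 1, so neither hidden neuron fires and the output stays 0.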
11
Constraints
Constraints on neurons in multi-layer perceptrons:
The compute-neurons must be non-linear.
Non-linearity is the source of power.
12
Explanation
[Figure: network with inputs x1, x2, hidden neurons h1, h2 and output y; weights w1 to w6]
y = m1(h1.w1 + h2.w2) + c1
h1 = m2(w3.x1 + w4.x2) + c2
h2 = m3(w5.x1 + w6.x2) + c3
Substituting h1 & h2:
y = k1x1 + k2x2 + c’
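Carrying out the substitution gives k1 = m1(w1.m2.w3 + w2.m3.w5), k2 = m1(w1.m2.w4 + w2.m3.w6) and c’ = m1(w1.c2 + w2.c3) + c1. The sketch below checks this numerically; the parameter values are arbitrary and the function names are hypothetical.

```python
def two_layer_linear(x1, x2, p):
    # Forward pass through two hidden linear neurons and one linear output neuron.
    h1 = p["m2"] * (p["w3"] * x1 + p["w4"] * x2) + p["c2"]
    h2 = p["m3"] * (p["w5"] * x1 + p["w6"] * x2) + p["c3"]
    return p["m1"] * (h1 * p["w1"] + h2 * p["w2"]) + p["c1"]

def collapse(p):
    # Coefficients of the equivalent single linear neuron y = k1*x1 + k2*x2 + c.
    k1 = p["m1"] * (p["w1"] * p["m2"] * p["w3"] + p["w2"] * p["m3"] * p["w5"])
    k2 = p["m1"] * (p["w1"] * p["m2"] * p["w4"] + p["w2"] * p["m3"] * p["w6"])
    c = p["m1"] * (p["w1"] * p["c2"] + p["w2"] * p["c3"]) + p["c1"]
    return k1, k2, c

# Arbitrary illustrative values, not taken from the slides.
params = dict(m1=2.0, m2=0.5, m3=-1.0, w1=1.0, w2=3.0, w3=2.0,
              w4=-1.0, w5=0.5, w6=4.0, c1=0.25, c2=1.0, c3=-2.0)
k1, k2, c = collapse(params)
```

Whatever values are chosen, the two-layer output equals k1.x1 + k2.x2 + c, so the hidden layer adds nothing a single linear neuron could not already do.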
13
Explanation (Contd)
[Figure: linear characteristic y = mx + c with the levels yU and yL marked]
y > yU is regarded as y = 1
y < yL is regarded as y = 0
yU > yL
14
Linear Neuron
Can a linear neuron compute XOR?
We want the characteristic y = w1x1 + w2x2 + c to compute XOR.
[Figure: neuron with inputs x1, x2, weights w1, w2 and output y]
15
Linear Neuron (Contd 1)
For (1,1), (0,0): y < yL
For (0,1), (1,0): y > yU
yU > yL
Can (w1, w2, c) be found?
16
Linear Neuron (Contd 2)
Inputs are written as (x2, x1).
(0,0): y = w1.0 + w2.0 + c = c; y < yL gives c < yL - (1)
(0,1): y = w1.1 + w2.0 + c = w1 + c; y > yU gives w1 + c > yU - (2)
17
Linear Neuron (Contd 3)
Inputs are written as (x2, x1).
(1,0): y = w1.0 + w2.1 + c = w2 + c; y > yU gives w2 + c > yU - (3)
(1,1): y = w1.1 + w2.1 + c = w1 + w2 + c; y < yL gives w1 + w2 + c < yL - (4)
yU > yL - (5)
18
Linear Neuron (Contd 4)
c < yL - (1)
w1 + c > yU - (2)
w2 + c > yU - (3)
w1 + w2 + c < yL - (4)
yU > yL - (5)
Inconsistent
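One way to see the inconsistency is to add the inequalities in pairs. The derivation below uses only (1) to (5) as stated; the subscripted symbols are the same quantities as w1, w2, yU and yL above.

```latex
\begin{align*}
(2) + (3):&\quad w_1 + w_2 + 2c > 2\,y_U \\
(1) + (4):&\quad w_1 + w_2 + 2c < 2\,y_L \\
\Rightarrow&\quad 2\,y_U < w_1 + w_2 + 2c < 2\,y_L
\;\Rightarrow\; y_U < y_L
\end{align*}
```

This contradicts (5), so no choice of (w1, w2, c) satisfies all five conditions.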
19
Observations
A linear neuron cannot compute XOR.
A multilayer network with linear characteristic neurons is collapsible to a single linear neuron. Therefore addition of layers does not contribute to computing power.
Neurons in a feedforward network must be non-linear.
Threshold elements will do iff we can linearize a non-linear function.
20
Linearity
Linearization is not possible in general:
Need to know the function in closed form.
Very large space even for boolean data.
21
Training Algorithm
Looks at the pre-classified data
Arrives at weight values
22
Why won’t PTA do?
Since we do not know the desired outputs at the hidden layer neurons, PTA cannot be applied.
So apply a training method called GRADIENT DESCENT.
23
Minima
[Figure: error E plotted against the parameters (w1, w2, ...)]
24
Gradient Descent
Ensured by GRADIENT DESCENT
E: error
wmn: parameter
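In its usual textbook form, gradient descent changes each parameter against the gradient of the error: wmn is replaced by wmn - eta.dE/dwmn for a small learning rate eta. The sketch below illustrates that rule; the quadratic error surface, the learning rate and the function names are all assumptions made for the example, not taken from the slides.

```python
def gradient_descent(grad, w, eta=0.1, steps=100):
    # Repeatedly step each parameter against its partial derivative of E.
    for _ in range(steps):
        g = grad(w)
        w = [wi - eta * gi for wi, gi in zip(w, g)]
    return w

def grad_E(w):
    # Hypothetical error E(w) = (w1 - 3)^2 + (w2 + 1)^2, with its minimum at (3, -1).
    return [2 * (w[0] - 3), 2 * (w[1] + 1)]

w_final = gradient_descent(grad_E, [0.0, 0.0])
```

On this convex surface every step shrinks the distance to the minimum by a constant factor; on the error surface of a real network the same rule is only guaranteed to move downhill locally, so it may settle in a local minimum.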
25
Sigmoid neurons
Gradient descent needs a derivative computation, which is not possible in the perceptron because of the discontinuous step function used.
Sigmoid neurons with easy-to-compute derivatives are required. (Radial basis functions are also differentiable.)
Computing power comes from the non-linearity of the sigmoid function.
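The derivative is easy to compute because, for the logistic sigmoid s(x) = 1 / (1 + e^(-x)), it can be written in terms of the output itself: s'(x) = s(x)(1 - s(x)). A minimal sketch, assuming the logistic form of the sigmoid:

```python
import math

def sigmoid(x):
    # Logistic sigmoid: smooth, monotonic, output in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    # s'(x) = s(x) * (1 - s(x)), so the forward-pass output can be reused.
    s = sigmoid(x)
    return s * (1.0 - s)
```

Unlike the step function, this derivative is defined everywhere, and it is largest at x = 0, where it equals 0.25.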
26
Summary
Feed-forward network: pure or non-pure networks
XOR computed by multi-layer perceptron
Non-linearity: a must
Gradient Descent
Sigmoid