1 CS 621 Artificial Intelligence: Feedforward Nets
CS 621 Artificial Intelligence, Lecture /10/05. Topic: Feedforward Nets. Prof. Pushpak Bhattacharyya, IIT Bombay.

2 Perceptron
A single perceptron cannot compute non-linearly separable data, and real-life problems are typically non-linear.

3 Basic Computing Paradigm
The basic computing paradigm is setting up hyperplanes. To handle non-linearly separable data one can use higher-order (higher-power) surfaces, tolerate some error, or use multiple perceptrons.

4 A quadratic surface can separate the data, but is difficult to train.
A quadratic decision surface can separate data that a single hyperplane cannot, but such higher-order surfaces are difficult to train.

5 Pocket Algorithm
The pocket algorithm, developed around 1985, essentially uses the Perceptron Training Algorithm (PTA). Basic idea: always preserve the best weights obtained so far in the "pocket", and replace them only if changed weights give lower error on the training data. It thus tolerates error, and has been used in connectionist expert systems.
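A minimal sketch of the pocket idea, not the lecture's own code: the step activation, learning rate of 1, epoch count, and the XOR-with-bias dataset below are illustrative assumptions.

```python
# Sketch of the pocket algorithm: run ordinary PTA updates, but keep
# ("pocket") the best weight vector seen so far on the whole training set.
import numpy as np

def errors(w, X, y):
    """Number of misclassified examples for the threshold output step(w.x)."""
    return int(np.sum((X @ w >= 0).astype(int) != y))

def pocket(X, y, epochs=100, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    pocket_w, pocket_err = w.copy(), errors(w, X, y)
    for _ in range(epochs):
        i = rng.integers(len(X))                  # pick a training example
        pred = int(X[i] @ w >= 0)
        w = w + (y[i] - pred) * X[i]              # ordinary PTA update
        err = errors(w, X, y)
        if err < pocket_err:                      # better than the pocket?
            pocket_w, pocket_err = w.copy(), err  # then replace the pocket
    return pocket_w, pocket_err

# Example on non-separable XOR data (last column is a constant bias input):
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])
w, err = pocket(X, y)
print(w, err)   # prints the pocketed weights and their (lowest seen) error count
```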

6 Multilayer Feedforward Network
Geometrically: [figure: a multilayer feedforward network with inputs x1, x2, hidden units h1, h2, and output y, shown geometrically]

7 Algebraically: Linearization
Algebraically (linearization): X1 XOR X2 = X1.X2' + X1'.X2 = OR(AND(X1, X2'), AND(X1', X2))
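A quick check of this identity over all Boolean inputs (a sketch; the code and variable names are mine, not from the slides):

```python
# Verify: x1 XOR x2 == OR(AND(x1, NOT x2), AND(NOT x1, x2)) for all Boolean inputs.
from itertools import product

for x1, x2 in product([False, True], repeat=2):
    lhs = x1 != x2                                 # XOR
    rhs = (x1 and not x2) or ((not x1) and x2)     # linearized form
    assert lhs == rhs
print("identity holds on all four input combinations")
```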

8 Example
Example: [figure: a network with input layer neurons (x1, x2), hidden layer neurons, and an output layer neuron (y); thresholds 0.5, 1 and 1.5 appear in the figure]. The neurons labelled 1 & 3 in the figure are also called computation neurons.

9 Hidden Layer Neurons
Hidden layer neurons contribute to the power of the network. How many hidden layers? How many neurons per layer? In a pure feedforward network there is no jumping of connections: each layer feeds only the next.

10 XOR Example
XOR Example: [figure: a two-layer threshold network computing XOR; the output neuron has threshold 0.5 and weights w1 = 1, w2 = 1 from two hidden neurons computing x1x2' and x1'x2; the figure also shows hidden-layer parameters 1.5 and -1]
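One concrete choice of weights and thresholds realising this construction (my own illustrative values, which may differ from the ones drawn on the slide):

```python
# Two-layer threshold network computing XOR.
# h1 fires for x1 AND (NOT x2), h2 fires for (NOT x1) AND x2, and the output ORs them.
def step(v, theta):
    return 1 if v >= theta else 0

def xor_net(x1, x2):
    h1 = step(1 * x1 - 1 * x2, 0.5)    # x1 . x2'
    h2 = step(-1 * x1 + 1 * x2, 0.5)   # x1' . x2
    return step(1 * h1 + 1 * h2, 0.5)  # OR of the hidden units

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))   # prints 0 0 0 / 0 1 1 / 1 0 1 / 1 1 0
```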

11 Constraints
Constraints on neurons in multilayer perceptrons: the computation neurons must be non-linear; non-linearity is the source of the network's power.

12 Explanation
Suppose the neurons are linear, with y = m1(w1.h1 + w2.h2) + c1, h1 = m2(w3.x1 + w4.x2) + c2, and h2 = m3(w5.x1 + w6.x2) + c3. Substituting h1 and h2 gives y = k1.x1 + k2.x2 + c', i.e. the whole network reduces to a single linear map of the inputs. [figure: the corresponding network with weights w3..w6 into h1, h2 and weights w1, w2 into y]
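The substitution can also be checked symbolically; a small sketch using sympy (the symbol names follow the slide, the check itself is mine):

```python
# Show that a network of linear neurons collapses to a single linear map of its inputs.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
w1, w2, w3, w4, w5, w6 = sp.symbols('w1:7')
m1, m2, m3, c1, c2, c3 = sp.symbols('m1 m2 m3 c1 c2 c3')

h1 = m2 * (w3 * x1 + w4 * x2) + c2
h2 = m3 * (w5 * x1 + w6 * x2) + c3
y = m1 * (w1 * h1 + w2 * h2) + c1

y = sp.expand(y)
print(sp.degree(y, x1), sp.degree(y, x2))   # 1 1 : still linear in x1 and x2
print(sp.collect(y, [x1, x2]))              # of the form k1*x1 + k2*x2 + c'
```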

13 Explanation (contd.)
For a linear characteristic y = mx + c, the output is read through two thresholds with yU > yL: y > yU is regarded as y = 1, and y < yL is regarded as y = 0.

14 Linear Neuron
Can a linear neuron compute XOR? We want a neuron with characteristic y = w1.x1 + w2.x2 + c. [figure: a single neuron with inputs x1, x2, weights w1, w2, and output y]

15 Linear Neuron (contd. 1)
For XOR we need: for (1,1) and (0,0), y < yL; for (0,1) and (1,0), y > yU; with yU > yL. Can such (w1, w2, c) be found?

16 Linear Neuron (contd. 2)
For (0,0): y = w1.0 + w2.0 + c = c, and we need y < yL, so c < yL ... (1). For (1,0): y = w1.1 + w2.0 + c = w1 + c, and we need y > yU, so w1 + c > yU ... (2).

17 Linear Neuron (contd. 3)
For (0,1): y = w2 + c, so w2 + c > yU ... (3). For (1,1): y = w1 + w2 + c, so w1 + w2 + c < yL ... (4). And yU > yL ... (5).

18 Linear Neuron (contd. 4)
Collecting the constraints: c < yL ... (1); w1 + c > yU ... (2); w2 + c > yU ... (3); w1 + w2 + c < yL ... (4); yU > yL ... (5). These are inconsistent: adding (2) and (3) gives w1 + w2 + 2c > 2yU, while adding (1) and (4) gives w1 + w2 + 2c < 2yL, so yU < yL, contradicting (5).
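The contradiction can also be checked mechanically; a sketch using scipy's linear-programming solver (the margin eps standing in for the strict inequalities is my own assumption):

```python
# Ask an LP solver whether constraints (1)-(5) can hold simultaneously.
import numpy as np
from scipy.optimize import linprog

eps = 1e-3
# variables: [w1, w2, c, yL, yU]
A_ub = np.array([
    [0, 0, 1, -1, 0],    # (1)  c          < yL  ->  c - yL         <= -eps
    [-1, 0, -1, 0, 1],   # (2)  w1 + c     > yU  ->  yU - w1 - c    <= -eps
    [0, -1, -1, 0, 1],   # (3)  w2 + c     > yU  ->  yU - w2 - c    <= -eps
    [1, 1, 1, -1, 0],    # (4)  w1+w2+c    < yL  ->  w1+w2+c - yL   <= -eps
    [0, 0, 0, 1, -1],    # (5)  yU         > yL  ->  yL - yU        <= -eps
])
b_ub = -eps * np.ones(5)
res = linprog(c=np.zeros(5), A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * 5)
print(res.status)   # 2 = infeasible: no (w1, w2, c, yL, yU) satisfies all five
```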

19 Observations
A linear neuron cannot compute XOR. A multilayer network with linear characteristic neurons collapses to a single linear neuron, so adding layers does not add computing power. Therefore the neurons in a feedforward network must be non-linear. Threshold elements will do iff we can linearize the non-linear function.

20 Linearization
Linearization is not in general possible: we would need to know the function in closed form, and the space of functions is very large even for Boolean data.

21 Training Algorithm
The training algorithm looks at the pre-classified (labelled) data and arrives at the weight values.

22 Why won't PTA do?
Since we do not know the desired outputs at the hidden layer neurons, PTA cannot be applied there. So we apply a training method called gradient descent.

23 Minima
[figure: the error E plotted against the parameters (w1, w2, ...); training seeks a minimum of this error surface]

24 Gradient Descent
Movement towards a minimum of the error is ensured by gradient descent: each parameter is changed against the gradient of the error, Δw_mn = −η ∂E/∂w_mn, where E is the error and w_mn is a parameter (weight).
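A minimal numerical sketch of this update rule; the quadratic error surface and the learning rate below are illustrative assumptions, not the lecture's network:

```python
# Gradient descent on a simple error surface E(w1, w2) = (w1 - 3)^2 + (w2 + 1)^2.
import numpy as np

def E(w):
    return (w[0] - 3) ** 2 + (w[1] + 1) ** 2

def grad_E(w):
    return np.array([2 * (w[0] - 3), 2 * (w[1] + 1)])

w = np.zeros(2)
eta = 0.1                        # learning rate
for _ in range(100):
    w = w - eta * grad_E(w)      # delta w = -eta * dE/dw
print(w, E(w))                   # w approaches (3, -1), the minimum of E
```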

25 Sigmoid Neurons
Gradient descent needs a derivative computation, which is not possible with the perceptron because of the discontinuous step function it uses. Hence sigmoid neurons, whose derivatives are easy to compute, are required (radial basis functions are also differentiable). The computing power comes from the non-linearity of the sigmoid function.
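The sigmoid and its derivative, which is what makes gradient descent workable (a sketch; the use of numpy is an assumption):

```python
# Sigmoid activation and its derivative: sigma'(x) = sigma(x) * (1 - sigma(x)),
# cheap to compute once sigma(x) is known (unlike the step function, whose
# derivative is zero wherever it exists and undefined at the threshold).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

xs = np.array([-2.0, 0.0, 2.0])
print(sigmoid(xs))        # approximately [0.119 0.5 0.881]
print(sigmoid_prime(xs))  # peaks at 0.25 when x = 0
```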

26 Summary
Summary: feedforward networks (pure and non-pure); XOR computed by a multilayer perceptron; non-linearity is a must; gradient descent; sigmoid neurons.

