Prof. Pushpak Bhattacharyya, IIT Bombay
CS 621 Artificial Intelligence, Lecture /10/05
Linear Separability, Introduction of the Feedforward Network
Test for Linear Separability (LS)

Theorem: A function is linearly separable iff the vectors corresponding to the function do not have a Positive Linear Combination (PLC).

Absence of a PLC is both a necessary and sufficient condition.

X1, X2, ..., Xm: vectors of the function.
Y1, Y2, ..., Ym: the augmented negated set. Prepend -1 to each vector Xi; for a 0-class vector, additionally negate the result. This gives Yi.
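The construction of the augmented negated set can be sketched in code. This is an illustrative helper (the function name and truth-table representation are my own, not from the lecture), following the slide's convention: prepend -1, and negate 0-class vectors.

```python
# Sketch (not from the lecture): build the augmented negated set {Y_i}
# for a Boolean function given as a truth table.
# Convention from the slide: prepend -1 to every input vector;
# for a 0-class vector, negate the whole augmented vector.

def augmented_negated_set(truth_table):
    """truth_table: list of (input_tuple, output_bit) pairs."""
    ys = []
    for x, label in truth_table:
        y = [-1] + list(x)          # prepend -1 (threshold component)
        if label == 0:              # 0-class: negate the augmented vector
            y = [-c for c in y]
        ys.append(y)
    return ys

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
print(augmented_negated_set(AND))
# [[1, 0, 0], [1, 0, -1], [1, -1, 0], [-1, 1, 1]]
```

The output matches the vectors X1..X4 used in the AND example later in this lecture.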
Example (1): XNOR

The set {Yi} has a PLC if ∑ Pi Yi = 0, 1 ≤ i ≤ m, where
- each Pi is a non-negative scalar, and
- at least one Pi > 0.

Example: 2-bit even parity (the XNOR function):
X1 = (0,0): 1-class (+)  ->  Y1 = [-1, 0, 0]
X2 = (0,1): 0-class (-)  ->  Y2 = [1, 0, -1]
X3 = (1,0): 0-class (-)  ->  Y3 = [1, -1, 0]
X4 = (1,1): 1-class (+)  ->  Y4 = [-1, 1, 1]
Example (1): XNOR (contd)

P1 [-1,0,0]^T + P2 [1,0,-1]^T + P3 [1,-1,0]^T + P4 [-1,1,1]^T = [0,0,0]^T

All Pi = 1 gives the result.

For the parity function a PLC exists => not linearly separable.
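The XNOR check is easy to verify numerically. A quick illustrative script (vectors are the augmented negated set of XNOR, with Y1 = [-1,0,0] for the 1-class point (0,0) and so on):

```python
# Verify that all P_i = 1 gives a PLC for the XNOR (even parity) function:
# the augmented negated vectors sum to the zero vector, so XNOR is not
# linearly separable.
Y = [(-1, 0, 0), (1, 0, -1), (1, -1, 0), (-1, 1, 1)]
P = [1, 1, 1, 1]
total = [sum(p * y[j] for p, y in zip(P, Y)) for j in range(3)]
print(total)  # [0, 0, 0] -> PLC exists
```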
Example: AND

AND does not have a PLC. Suppose it did: then there would exist non-negative P1, P2, P3, P4, not all zero, such that

∑ Pi Xi^T = 0,  i = 1, ..., 4
AND (contd)

X1 = [1,0,0], X2 = [1,0,-1], X3 = [1,-1,0], X4 = [-1,1,1]

P1 [1,0,0]^T + P2 [1,0,-1]^T + P3 [1,-1,0]^T + P4 [-1,1,1]^T = [0,0,0]^T

P1 + P2 + P3 - P4 = 0    (1)
- P3 + P4 = 0            (2)
- P2 + P4 = 0            (3)
AND (contd)

From (2), P3 = P4; from (3), P2 = P4. Substituting into (1) gives P1 + P4 = 0, which for non-negative Pi forces P1 = P2 = P3 = P4 = 0. So no PLC exists (a PLC needs at least one Pi > 0), and AND is computable by a perceptron. However, testing for a PLC is not efficient.
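One way to mechanise the PLC test is as a linear-programming feasibility problem. This is a sketch of that idea (the LP formulation and the normalisation ∑Pi = 1 are my framing, not from the lecture; it assumes SciPy is available):

```python
from scipy.optimize import linprog

def has_plc(Y):
    """True iff sum_i P_i Y_i = 0 has a solution with P_i >= 0 and some
    P_i > 0. Normalising with sum(P_i) = 1 rules out the all-zero solution."""
    m, n = len(Y), len(Y[0])
    A_eq = [[Y[i][j] for i in range(m)] for j in range(n)]  # sum P_i Y_i = 0
    A_eq.append([1.0] * m)                                  # sum P_i = 1
    b_eq = [0.0] * n + [1.0]
    res = linprog(c=[0.0] * m, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * m, method="highs")
    return res.status == 0  # feasible => PLC exists => not linearly separable

AND_Y  = [(1, 0, 0), (1, 0, -1), (1, -1, 0), (-1, 1, 1)]
XNOR_Y = [(-1, 0, 0), (1, 0, -1), (1, -1, 0), (-1, 1, 1)]
print(has_plc(AND_Y))   # False -> AND is linearly separable
print(has_plc(XNOR_Y))  # True  -> XNOR is not
```

Framed this way, the test is polynomial-time via LP solvers, although the naive search over coefficients suggested by the definition is not.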
Exercise

Try to learn the SNNS package (available on the CS621 homepage).

Try the PLC test for:
1. Different Boolean functions
2. The majority function (1 if #1s > #0s)
3. The comparator function (1 if decimal(Y) > decimal(X))
4. Odd parity
5. The IRIS data
Study of Linear Separability

W · Xj = 0 defines a hyperplane in (n+1)-dimensional space => the W vector and the Xj vectors lying on the hyperplane are perpendicular to each other.

[Figure: a perceptron with inputs x1, ..., xn, weights w1, ..., wn and threshold θ]
Linear Separability

Positive set: w · Xj > 0 for j ≤ k
Negative set: w · Xj < 0 for k < j ≤ m

[Figure: points X1+, ..., Xk+ on one side of the separating hyperplane and Xk+1-, ..., Xm- on the other]
Linear Separability (contd)

w · Xj = 0 => w is normal to the hyperplane which separates the +ve points from the -ve points.

In this computing paradigm, computation means "placing hyperplanes".

Functions computable by the perceptron are called:
- "threshold functions", because ∑ wi xi is compared with θ (the threshold);
- "linearly separable", because linear surfaces are set up to separate the +ve and -ve points.
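A threshold function in action, as a small illustration (the weights w1 = w2 = 1 and threshold θ = 1.5 are chosen by hand here, not given in the slides): the line x1 + x2 = 1.5 separates (1,1) from the other three points, so AND is a threshold function.

```python
# AND as a threshold function: compare sum(w_i * x_i) with theta.
# Weights/threshold are hand-picked for illustration.
def perceptron(x, w=(1, 1), theta=1.5):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(x))
# only (1, 1) outputs 1
```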
Concept of Separability

All of these rest on the concept of separability:
- Perceptrons
- Support Vector Machines
- Feed-forward networks
SVMs

Variations on the idea of linear separability:
- Separating plane
- Vapnik: Statistical Learning Theory
Feed-Forward Networks

Motivation: most real-life data is not linearly separable. If you cannot separate the classes with a single plane, use more planes.
Feed-Forward Networks (contd)

[Figure: the four points (0,0), (0,1), (1,0), (1,1) labelled 0-class/1-class in the (x1, x2) plane, and a network with hidden units h1, h2 feeding the output y through hidden layers]
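The "more planes" idea can be sketched for the XNOR function from earlier in the lecture (the specific weights and the choice of XNOR as the target are my assumptions for illustration): two hidden threshold units each place one line, and the output unit ORs their half-planes, computing a function no single line can separate.

```python
# Sketch (hand-picked weights, not from the lecture): a two-hidden-unit
# feed-forward network of threshold units computing XNOR.
def step(s, theta):
    return 1 if s >= theta else 0

def xnor_net(x1, x2):
    h1 = step(x1 + x2, 1.5)     # line x1 + x2 = 1.5: fires only on (1,1)
    h2 = step(-x1 - x2, -0.5)   # line x1 + x2 = 0.5: fires only on (0,0)
    return step(h1 + h2, 0.5)   # output unit: OR of the two hidden units

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, xnor_net(*x))
# (0,0)->1, (0,1)->0, (1,0)->0, (1,1)->1
```

Each hidden unit contributes one separating line, and the output unit combines the resulting regions: this is exactly the sense in which a hidden layer lets the network place more than one hyperplane.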