
1 Topic 7 Support Vector Machine for Classification

2 Outline
– Linear Maximal Margin Classifier for Linearly Separable Data
– Linear Soft Margin Classifier for Overlapping Classes
– The Nonlinear Classifier

3 Linear Maximal Margin Classifier for Linearly Separable Data

4 Linear Maximal Margin Classifier for Linearly Separable Data
Goal: seek an optimal separating hyperplane. – That is, among all the hyperplanes that minimize the training error (empirical risk), find the one with the largest margin. A classifier with a larger margin can be expected to generalize better; conversely, a classifier with a smaller margin has a higher expected risk.

5 Canonical hyperplane 1. Minimize the training error
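(The equations on this slide were images and did not survive extraction. As a sketch of the standard definition, an assumption about what the slide showed: a hyperplane (w, b) is in canonical form with respect to the training set when

$$\min_i |\mathbf{w}^T\mathbf{x}_i + b| = 1,$$

so that zero training error corresponds to the constraints $y_i(\mathbf{w}^T\mathbf{x}_i + b) \ge 1$ for all $i$.)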

6 2. Maximize the margin: maximize margin → minimize w^T w
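(With the canonical scaling the margin between the two supporting hyperplanes is $2/\|\mathbf{w}\|$, which is why maximizing the margin is equivalent to minimizing $\mathbf{w}^T\mathbf{w}$. The resulting primal problem, in its standard form, reconstructed here since the slide's equations were lost:

$$\min_{\mathbf{w},b}\ \tfrac{1}{2}\,\mathbf{w}^T\mathbf{w} \quad \text{s.t.} \quad y_i(\mathbf{w}^T\mathbf{x}_i + b) \ge 1,\quad i = 1,\dots,n.)$$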

7

8 Linear Maximal Margin Classifier for Linearly Separable Data

9 Rosenblatt's Algorithm
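(The algorithm statement on this slide was an image. The rule below is the standard kernelized dual perceptron, reconstructed to match the numeric trace on slides 11-15; treat it as an inference from that trace rather than a verbatim copy of the slide. Starting from $\alpha = \mathbf{0}$, $b = 0$, sweep over the training set and, whenever

$$y_i\Big(\textstyle\sum_j \alpha_j y_j \langle \mathbf{x}_j, \mathbf{x}_i\rangle + b\Big) \le 0,$$

update $\alpha_i \leftarrow \alpha_i + 1$ and $b \leftarrow b + y_i R^2$, where $R = \max_j \|\mathbf{x}_j\|$, so $R^2 = 8$ for the data on the next slide.)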

10 Pattern (xi), Target (yi), and norm ||xi||:
x1 = [ 1  1], y1 =  1, ||x1|| = 1.4142
x2 = [ 1  2], y2 =  1, ||x2|| = 2.2361
x3 = [ 2 -1], y3 =  1, ||x3|| = 2.2361
x4 = [ 2  0], y4 =  1, ||x4|| = 2.0000
x5 = [-1  2], y5 = -1, ||x5|| = 2.2361
x6 = [-2  1], y6 = -1, ||x6|| = 2.2361
x7 = [-1 -1], y7 = -1, ||x7|| = 1.4142
x8 = [-2 -2], y8 = -1, ||x8|| = 2.8284  -> R = max ||xi|| = 2.8284
Gram matrix K(i,j) = xi . xj:
K = [  2  3  1  2  1 -1 -2 -4
       3  5  0  2  3  0 -3 -6
       1  0  5  4 -4 -5 -1 -2
       2  2  4  4 -2 -4 -2 -4
       1  3 -4 -2  5  4 -1 -2
      -1  0 -5 -4  4  5  1  2
      -2 -3 -1 -2 -1  1  2  4
      -4 -6 -2 -4 -2  2  4  8 ]

11 1st iteration: α=[0 0 0 0 0 0 0 0], b=0, R=2.8284 (so R^2=8)
x1=[1 1], y1=1, K(:,1)=[2 3 1 2 1 -1 -2 -4]: 1*[0+0]=0 → update: α=[1 0 0 0 0 0 0 0], b=0+8=8
x2=[1 2], y2=1, K(:,2)=[3 5 0 2 3 0 -3 -6]: 1*[1*3+8]=11>0
x3=[2 -1], y3=1, K(:,3)=[1 0 5 4 -4 -5 -1 -2]: 1*[1*1+8]=9>0
x4=[2 0], y4=1, K(:,4)=[2 2 4 4 -2 -4 -2 -4]: 1*[1*2+8]=10>0

12 1st iteration (cont.)
x5=[-1 2], y5=-1, K(:,5)=[1 3 -4 -2 5 4 -1 -2]: (-1)*[1*1+8]=-9 ≤ 0 → update: α=[1 0 0 0 1 0 0 0], b=8-8=0
x6=[-2 1], y6=-1, K(:,6)=[-1 0 -5 -4 4 5 1 2]: (-1)*[1*(-1)+(-1)*4+0]=5>0
x7=[-1 -1], y7=-1, K(:,7)=[-2 -3 -1 -2 -1 1 2 4]: (-1)*[1*(-2)+(-1)*(-1)+0]=1>0
x8=[-2 -2], y8=-1, K(:,8)=[-4 -6 -2 -4 -2 2 4 8]: (-1)*[1*(-4)+(-1)*(-2)+0]=2>0

13 2nd iteration: α=[1 0 0 0 1 0 0 0], b=0, R=2.8284
x1=[1 1], y1=1, K(:,1)=[2 3 1 2 1 -1 -2 -4]: 1*[1*2+(-1)*1+0]=1>0
x2=[1 2], y2=1, K(:,2)=[3 5 0 2 3 0 -3 -6]: 1*[1*3+(-1)*3+0]=0 → update: α=[1 1 0 0 1 0 0 0], b=0+8=8
x3=[2 -1], y3=1, K(:,3)=[1 0 5 4 -4 -5 -1 -2]: 1*[1*1+1*0+(-1)*(-4)+8]=13>0
x4=[2 0], y4=1, K(:,4)=[2 2 4 4 -2 -4 -2 -4]: 1*[1*2+1*2+(-1)*(-2)+8]=14>0

14 2nd iteration (cont.)
x5=[-1 2], y5=-1, K(:,5)=[1 3 -4 -2 5 4 -1 -2]: (-1)*[1*1+1*3+(-1)*5+8]=-7 ≤ 0 → update: α=[1 1 0 0 2 0 0 0], b=8-8=0
x6=[-2 1], y6=-1, K(:,6)=[-1 0 -5 -4 4 5 1 2]: (-1)*[1*(-1)+1*0+(-2)*4+0]=9>0
x7=[-1 -1], y7=-1, K(:,7)=[-2 -3 -1 -2 -1 1 2 4]: (-1)*[1*(-2)+1*(-3)+(-2)*(-1)+0]=3>0
x8=[-2 -2], y8=-1, K(:,8)=[-4 -6 -2 -4 -2 2 4 8]: (-1)*[1*(-4)+1*(-6)+(-2)*(-2)+0]=6>0

15 3rd iteration: α=[1 1 0 0 2 0 0 0], b=0, R=2.8284
x1=[1 1], y1=1, K(:,1)=[2 3 1 2 1 -1 -2 -4]: 1*[1*2+1*3+(-2)*1+0]=3>0
x2=[1 2], y2=1, K(:,2)=[3 5 0 2 3 0 -3 -6]: 1*[1*3+1*5+(-2)*3+0]=2>0
x3=[2 -1], y3=1, K(:,3)=[1 0 5 4 -4 -5 -1 -2]: 1*[1*1+1*0+(-2)*(-4)+0]=9>0
x4=[2 0], y4=1, K(:,4)=[2 2 4 4 -2 -4 -2 -4]: 1*[1*2+1*2+(-2)*(-2)+0]=8>0
x5=[-1 2], y5=-1, K(:,5)=[1 3 -4 -2 5 4 -1 -2]: (-1)*[1*1+1*3+(-2)*5+0]=6>0
x6=[-2 1], y6=-1, K(:,6)=[-1 0 -5 -4 4 5 1 2]: (-1)*[1*(-1)+1*0+(-2)*4+0]=9>0
x7=[-1 -1], y7=-1, K(:,7)=[-2 -3 -1 -2 -1 1 2 4]: (-1)*[1*(-2)+1*(-3)+(-2)*(-1)+0]=3>0
x8=[-2 -2], y8=-1, K(:,8)=[-4 -6 -2 -4 -2 2 4 8]: (-1)*[1*(-4)+1*(-6)+(-2)*(-2)+0]=6>0
All eight margins are positive, so no update is needed and the algorithm has converged.

16 Final classifier: f(x) = Σi αi yi (xi . x) + b = 1*(x1+x2) + 1*(x1+2x2) + 2*(-1)*(-x1+2x2) + 0 = 4x1 - x2. Check: yi f(xi) = 3, 2, 9, 8, 6, 9, 3, 6 on the eight patterns, matching the margins computed in the 3rd iteration; all positive.
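The hand trace on slides 11-16 can be reproduced with a short NumPy sketch. This is an illustrative reconstruction under the update rule inferred above, not the original course code:

```python
import numpy as np

# Data from slide 10
X = np.array([[1, 1], [1, 2], [2, -1], [2, 0],
              [-1, 2], [-2, 1], [-1, -1], [-2, -2]], dtype=float)
y = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)

K = X @ X.T                          # Gram matrix K(i,j) = xi . xj
R2 = np.max(np.sum(X**2, axis=1))    # R^2 = 8

alpha = np.zeros(len(X))
b = 0.0
changed = True
while changed:                       # one sweep = one "iteration" on the slides
    changed = False
    for i in range(len(X)):
        # functional margin of pattern i under the current dual model
        if y[i] * ((alpha * y) @ K[:, i] + b) <= 0:
            alpha[i] += 1.0          # mistake on pattern i: strengthen it
            b += y[i] * R2           # bias step uses R^2
            changed = True

print(alpha, b)                      # -> [1. 1. 0. 0. 2. 0. 0. 0.], 0.0
w = (alpha * y) @ X                  # recover w for the linear kernel
print(w)                             # -> [ 4. -1.], i.e. f(x) = 4*x1 - x2
```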

17 Linear Maximal Margin Classifier for Linearly Separable Data

18

19 Linear Soft Margin Classifier for Overlapping Classes: soft margin
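(The soft-margin equations on slides 19-20 were images. The standard formulation, with slack variables $\xi_i$ and penalty parameter $C$, is an assumption about what the slides showed:

$$\min_{\mathbf{w},b,\boldsymbol{\xi}}\ \tfrac{1}{2}\,\mathbf{w}^T\mathbf{w} + C\sum_i \xi_i \quad \text{s.t.} \quad y_i(\mathbf{w}^T\mathbf{x}_i + b) \ge 1 - \xi_i,\quad \xi_i \ge 0;$$

in the dual this caps every multiplier at $0 \le \alpha_i \le C$.)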

20

21 2-parameter Sequential Minimal Optimization (SMO) Algorithm
At every step, SMO chooses two Lagrange multipliers to jointly optimize, finds the optimal values for these two multipliers, and updates the SVM to reflect the new values.
Heuristic for choosing which multipliers to optimize:
– the first multiplier is that of the pattern with the largest current prediction error
– the second multiplier is that of the pattern with the smallest current prediction error

22 Step 1. Choose two multipliers α1 and α2.
Step 2. Define bounds [U, V] for α2 (the two cases y1 ≠ y2 and y1 = y2; see the reconstruction below and the case analysis on slides 31-34).
Step 3. Update α2.
Step 4. Update α1.
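(The formulas for Steps 2-4 were lost in extraction. The standard two-variable SMO updates, which reproduce the arithmetic on slides 25-29, are given here as a reconstruction. With prediction errors $E_i = f(\mathbf{x}_i) - y_i$:

$$[U, V] = \begin{cases} [\max(0,\ \alpha_2-\alpha_1),\ \min(C,\ C+\alpha_2-\alpha_1)] & y_1 \ne y_2 \\ [\max(0,\ \alpha_1+\alpha_2-C),\ \min(C,\ \alpha_1+\alpha_2)] & y_1 = y_2 \end{cases}$$

$$\eta = K(\mathbf{x}_1,\mathbf{x}_1) + K(\mathbf{x}_2,\mathbf{x}_2) - 2K(\mathbf{x}_1,\mathbf{x}_2), \qquad \alpha_2^{new} = \alpha_2 + \frac{y_2(E_1-E_2)}{\eta},$$

then clip $\alpha_2^{new}$ to $[U, V]$ and set $\alpha_1^{new} = \alpha_1 + y_1 y_2\,(\alpha_2 - \alpha_2^{new,clipped})$.)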

23

24 Pattern = [1 1; 1 2; 1 0; 2 -1; 2 0; -1 2; -2 1; 0 0; -1 -1; -2 -2]
Target = [1; 1; -1; 1; 1; -1; -1; 1; -1; -1]
C = 0.8
Gram matrix K(i,j) = xi . xj:
K = [  2  3  1  1  2  1 -1  0 -2 -4
       3  5  1  0  2  3  0  0 -3 -6
       1  1  1  2  2 -1 -2  0 -1 -2
       1  0  2  5  4 -4 -5  0 -1 -2
       2  2  2  4  4 -2 -4  0 -2 -4
       1  3 -1 -4 -2  5  4  0 -1 -2
      -1  0 -2 -5 -4  4  5  0  1  2
       0  0  0  0  0  0  0  0  0  0
      -2 -3 -1 -1 -2 -1  1  0  2  4
      -4 -6 -2 -2 -4 -2  2  0  4  8 ]

25 1st iteration
F(x)-Y = [0 -1.4 3.4 7.1 5.7 -8 -10.9 -2.9 -3.8 -6.7]'
α = [0.8 0 0 0.8 0 0.3 0.8 0 0 0]'
b = 1 - (0.8*1*2 + 0.8*1*1 + 0.3*(-1)*1 + 0.8*(-1)*(-1)) = 1 - 2.9 = -1.9
f(x) = Σi αi yi (xi . x) + b = 0.8*(x1+x2) + 0.8*(2x1-x2) - 0.3*(-x1+2x2) - 0.8*(-2x1+x2) - 1.9 = 4.3x1 - 1.4x2 - 1.9
e1 = 7.1 (pattern 4, largest error), e2 = -10.9 (pattern 7, smallest error)

26

27 U = 0, V = 0.8
η = K(4,4) + K(7,7) - 2*K(4,7) = 5 + 5 - 2*(-5) = 20
α2_new = 0.8 + (-1)*(7.1 - (-10.9))/20 = -0.1
α2_new,clipped = 0
α1_new = 0.8 + 1*(-1)*(0.8 - 0) = 0
α = [0.8 0 0 0 0 0.3 0 0 0 0]'
b = 1 - (0.8*1*2 + 0.3*(-1)*1) = 1 - 1.3 = -0.3
f(x) = 0.8*(x1+x2) - 0.3*(-x1+2x2) - 0.3 = 1.1x1 + 0.2x2 - 0.3
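One SMO step on the slide-24 data can be written out directly. The sketch below is illustrative, using the standard update formulas reconstructed above rather than any original course code, and it reproduces the numbers of slides 25-27:

```python
import numpy as np

# Data from slide 24
X = np.array([[1, 1], [1, 2], [1, 0], [2, -1], [2, 0],
              [-1, 2], [-2, 1], [0, 0], [-1, -1], [-2, -2]], dtype=float)
y = np.array([1, 1, -1, 1, 1, -1, -1, 1, -1, -1], dtype=float)
K = X @ X.T
C = 0.8

alpha = np.array([0.8, 0, 0, 0.8, 0, 0.3, 0.8, 0, 0, 0])
b = -1.9                                       # state shown on slide 25

E = (alpha * y) @ K + b - y                    # prediction errors F(x) - Y
i1, i2 = int(np.argmax(E)), int(np.argmin(E))  # heuristic pair: patterns 4 and 7

if y[i1] != y[i2]:                             # bounds on alpha2 (here y4=+1, y7=-1)
    U = max(0.0, alpha[i2] - alpha[i1])        # U = 0
    V = min(C, C + alpha[i2] - alpha[i1])      # V = 0.8
else:
    U = max(0.0, alpha[i1] + alpha[i2] - C)
    V = min(C, alpha[i1] + alpha[i2])

eta = K[i1, i1] + K[i2, i2] - 2 * K[i1, i2]             # = 20
a2 = alpha[i2] + y[i2] * (E[i1] - E[i2]) / eta          # = -0.1
a2_clip = min(max(a2, U), V)                            # clipped to 0
a1 = alpha[i1] + y[i1] * y[i2] * (alpha[i2] - a2_clip)  # = 0
alpha[i1], alpha[i2] = a1, a2_clip

b = y[0] - (alpha * y) @ K[:, 0]               # b from support vector x1: -0.3
print(alpha, b)                                # -> [0.8 0 0 0 0 0.3 0 0 0 0], -0.3
```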

28

29 2nd iteration
F(x)-Y = [0 0.2 1.8 0.7 0.9 0 -1.3 -1.3 -0.6 -1.9]'
α = [0.8 0 0 0 0 0.3 0 0 0 0]'
e1 = 1.8 (pattern 3, largest error), e2 = -1.9 (pattern 10, smallest error)
U = 0, V = 0 (y3 = y10 = -1 and α3 + α10 = 0)
η = K(3,3) + K(10,10) - 2*K(3,10) = 1 + 8 - 2*(-2) = 13
α2_new = 0 + (-1)*(1.8 - (-1.9))/13 = -0.28
α2_new,clipped = 0
α1_new = 0
α = [0.8 0 0 0 0 0.3 0 0 0 0]' (unchanged: with U = V = 0 the update is fully clipped, so the algorithm has converged)

30 Trained by Rosenblatt's Algorithm

31 Let α1*y1 + α2*y2 = R (held constant during the joint step).
Case 1: y1 = 1, y2 = 1 (α1 >= 0, α2 >= 0, α1 + α2 = R >= 0)
(figure: the segment of the line α1 + α2 = R, drawn for R = 0, C, 2C, inside the box [0,C] x [0,C])
If C < R < 2C: U = R - C, V = C.
If 0 < R <= C: U = 0, V = R.

32 Case 2: y1 = -1, y2 = 1 (R = -α1 + α2)
(figure: the line α2 = α1 + R, drawn for R = -C, 0, C, inside the box [0,C] x [0,C])
If -C < R < 0: U = 0, V = C + R.
If 0 <= R < C: U = R, V = C.

33 Case 3: y1 = -1, y2 = -1 (-α1 - α2 = R <= 0)
(figure: the line α1 + α2 = -R, drawn for R = 0, -C, -2C, inside the box [0,C] x [0,C])
If -2C < R < -C: U = -R - C, V = C.
If -C <= R < 0: U = 0, V = -R.

34 Case 4: y1 = 1, y2 = -1 (R = α1 - α2)
(figure: the line α2 = α1 - R, drawn for R = C, 0, -C, inside the box [0,C] x [0,C])
If 0 <= R < C: U = 0, V = C - R.
If -C < R < 0: U = -R, V = C.

35 The Nonlinear Classifier

36
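(Slide 36's content was an image and did not survive extraction. The idea, stated briefly as a standard reconstruction: map the patterns into a feature space via $\phi(\cdot)$ and replace every inner product with a kernel $K(\mathbf{x}_i, \mathbf{x}_j) = \phi(\mathbf{x}_i)^T \phi(\mathbf{x}_j)$, so the decision function keeps the same dual form $f(\mathbf{x}) = \sum_i \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}) + b$. For example, the linear Gram matrix in the sketches above can simply be swapped for a kernel Gram matrix; this snippet is illustrative, and the Gaussian kernel and gamma value are my choice, not taken from the slides:)

```python
import numpy as np

def rbf_gram(X, gamma=0.5):
    """Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||xi - xj||^2)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)   # pairwise squared distances
    return np.exp(-gamma * d2)

# Replacing `K = X @ X.T` with `K = rbf_gram(X)` in the earlier sketches turns
# the linear classifier into a nonlinear one; the training code is unchanged.
```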

