COMP5331 Other Classification Models: Support Vector Machine (SVM)
Prepared and presented by Raymond Wong (raywong@cse)
What we learnt for classification:
- Decision Tree
- Bayesian Classifier
- Nearest Neighbor Classifier
Other classification models:
- Support Vector Machine (SVM)
- Neural Network
- Recurrent Neural Network
Support Vector Machine (SVM):
- Linear Support Vector Machine
- Non-linear Support Vector Machine
Support Vector Machine advantages:
- Can be visualized
- Accurate when the data is well partitioned
Linear Support Vector Machine: a line w1x1 + w2x2 + b = 0 divides the (x1, x2) plane into two parts. Points on one side satisfy w1x1 + w2x2 + b > 0; points on the other side satisfy w1x1 + w2x2 + b < 0.
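The decision rule on this slide can be sketched in a few lines of Python; the weights below are made-up illustrative values, not from the lecture:

```python
# Classify a point by which side of the line w1*x1 + w2*x2 + b = 0 it falls on.
def classify(w1, w2, b, x1, x2):
    score = w1 * x1 + w2 * x2 + b
    if score > 0:
        return +1
    elif score < 0:
        return -1
    return 0  # exactly on the decision boundary

# Illustrative line x1 + x2 - 1 = 0 (i.e. w1 = w2 = 1, b = -1):
print(classify(1.0, 1.0, -1.0, 2.0, 2.0))  # point (2, 2): prints 1
print(classify(1.0, 1.0, -1.0, 0.0, 0.0))  # point (0, 0): prints -1
```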
Linear Support Vector Machine (figure-only slides)
Linear Support Vector Machine: the data points closest to the separating line are the support vectors, and the width of the band between the support vectors on the two sides is the margin. We want to maximize the margin. Why?
Linear Support Vector Machine: draw two lines parallel to the separating line w1x1 + w2x2 + b = 0 through the closest points on each side: w1x1 + w2x2 + b - D = 0 and w1x1 + w2x2 + b + D = 0. Rescaling w1, w2 and b by 1/D, we may take D = 1.
Linear Support Vector Machine: let y be the label of a point (+1 or -1). Points labelled +1 lie on the side where w1x1 + w2x2 + b - 1 ≥ 0; points labelled -1 lie on the side where w1x1 + w2x2 + b + 1 ≤ 0. The three parallel lines are w1x1 + w2x2 + b = 1, w1x1 + w2x2 + b = 0, and w1x1 + w2x2 + b = -1.
Linear Support Vector Machine: the two conditions combine into one. For a point with label y = +1, w1x1 + w2x2 + b ≥ 1; for a point with label y = -1, w1x1 + w2x2 + b ≤ -1. In both cases, y(w1x1 + w2x2 + b) ≥ 1.
Linear Support Vector Machine: every data point satisfies y(w1x1 + w2x2 + b) ≥ 1. The margin is the distance between the lines w1x1 + w2x2 + b - 1 = 0 and w1x1 + w2x2 + b + 1 = 0:
Margin = |(b+1) - (b-1)| / √(w1² + w2²) = 2 / √(w1² + w2²)
We want to maximize the margin.
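The margin formula 2/√(w1² + w2²) can be checked numerically; the weights below are illustrative values, not from the lecture:

```python
import math

# Margin between the two supporting lines w1*x1 + w2*x2 + b = +1 and = -1.
def margin(w1, w2):
    return 2.0 / math.sqrt(w1 * w1 + w2 * w2)

# Example: w = (3, 4) has length 5, so the margin is 2/5.
print(margin(3.0, 4.0))  # prints 0.4
```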
Linear Support Vector Machine. Maximize 2 / √(w1² + w2²), subject to y(w1x1 + w2x2 + b) ≥ 1 for each data point (x1, x2, y), where y is the label of the point (+1/-1).
Linear Support Vector Machine. Equivalently, minimize (w1² + w2²) / 2, subject to y(w1x1 + w2x2 + b) ≥ 1 for each data point (x1, x2, y), where y is the label of the point (+1/-1).
Linear Support Vector Machine. Minimize (w1² + w2²) / 2 (a quadratic objective), subject to y(w1x1 + w2x2 + b) ≥ 1 for each data point (linear constraints). This is a quadratic programming problem.
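In practice this quadratic program is handed to a solver or an SVM library. As a rough self-contained sketch, the snippet below instead minimizes the closely related regularized hinge-loss objective by subgradient descent (a soft-margin approximation, not the exact QP from the slide); the toy data, learning rate, and regularization strength are all made up for illustration:

```python
def train_linear_svm(points, epochs=200, lam=0.01, lr=0.1):
    """Subgradient descent on lam/2 * ||w||^2 + hinge loss.

    points: list of ((x1, x2), y) with y in {+1, -1}.
    Returns approximate weights (w1, w2, b)."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in points:
            margin = y * (w1 * x1 + w2 * x2 + b)
            # Regularization subgradient (b is conventionally not regularized).
            g1, g2, gb = lam * w1, lam * w2, 0.0
            if margin < 1:  # point violates the margin: add hinge subgradient
                g1 -= y * x1
                g2 -= y * x2
                gb -= y
            w1 -= lr * g1
            w2 -= lr * g2
            b -= lr * gb
    return w1, w2, b

# Linearly separable toy data (illustrative, not from the lecture).
data = [((2.0, 2.0), +1), ((3.0, 3.0), +1),
        ((0.0, 0.0), -1), ((-1.0, 0.5), -1)]
w1, w2, b = train_linear_svm(data)
# The learned line should classify every training point correctly.
assert all(y * (w1 * x1 + w2 * x2 + b) > 0 for (x1, x2), y in data)
```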
Linear Support Vector Machine. We have just described the 2-dimensional case, where a line divides the space into two parts. For n-dimensional space with n ≥ 2, we use a hyperplane to divide the space into two parts.
Support Vector Machine (SVM):
- Linear Support Vector Machine
- Non-linear Support Vector Machine
Non-linear Support Vector Machine (figure: data that cannot be separated by a single line in the original (x1, x2) space)
Non-linear Support Vector Machine: two steps.
Step 1: Transform the data into a higher-dimensional space using a "nonlinear" mapping.
Step 2: Use the linear Support Vector Machine in this higher-dimensional space.
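Step 1 can be made concrete with the standard degree-2 polynomial feature map (an illustrative choice, not necessarily the mapping used in the lecture): data separated by a circle in the plane becomes separable by a plane after lifting.

```python
import math

# Degree-2 polynomial feature map: lifts a 2-D point into 3-D, where a
# circular boundary x1^2 + x2^2 = r^2 in the plane becomes a linear one
# (the plane z1 + z3 = r^2 in the lifted space).
def phi(x1, x2):
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

# Toy data: an inner class near the origin and an outer class on a ring.
# No single line separates them in the original (x1, x2) space.
inner = [(0.3, 0.1), (-0.2, 0.4), (0.1, -0.3)]
outer = [(2.0, 0.0), (0.0, 2.0), (-1.5, 1.3), (1.4, -1.4)]

# After the mapping, the linear rule z1 + z3 < 1 vs. z1 + z3 > 1 separates
# the classes, so Step 2 can run a linear SVM on the transformed points.
for x1, x2 in inner + outer:
    z1, z2, z3 = phi(x1, x2)
    side = "inner" if z1 + z3 < 1 else "outer"
    print((x1, x2), "->", side)
```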
Non-linear Support Vector Machine (figure: the transformed data, now linearly separable in the higher-dimensional space)