SVMs in a Nutshell.


1 SVMs in a Nutshell

2 What is an SVM? Support Vector Machine
More accurately called a support vector classifier
Separates training data into two classes so that the classes are maximally far apart

3 Simpler version Suppose the data is linearly separable
Then we could draw a line between the two classes

4 Simpler version But what is the best line? An SVM uses the maximum-margin hyperplane

5 Maximum Margin Hyperplane

6 What if it’s non-linear?

7 Higher dimensions An SVM uses a kernel function to map the data into a different space, where it can be linearly separated
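The mapping idea can be illustrated with an explicit feature map (the toy data and the map x → (x, x²) below are made up for illustration, and an explicit map is a simplification of the kernel trick): 1-D points that no single threshold can separate become linearly separable after mapping.

```python
import numpy as np

# Toy 1-D data: class +1 near the origin, class -1 farther out.
# No single threshold on x alone separates the two classes.
x = np.array([-3.0, -2.5, -0.5, 0.5, 2.5, 3.0])
c = np.array([-1, -1, 1, 1, -1, -1])

# Explicit feature map phi(x) = (x, x^2): in the mapped 2-D space
# the classes are split by the horizontal line x2 = 4.
phi = np.column_stack([x, x ** 2])

pred = np.where(phi[:, 1] < 4.0, 1, -1)
print(pred)  # matches c
```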

8 What if it’s not separable?
Use linear separation, but allow training errors
This is called using a “soft margin”
A higher cost for errors creates a more accurate model on the training data, but it may not generalize
The choice of parameters (kernel and cost) determines the accuracy of the SVM model
To avoid over- or under-fitting, use cross-validation to choose the parameters
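One way to sketch the soft margin in code is subgradient descent on the hinge-loss objective (an illustrative NumPy sketch, not a real QP solver; the dataset, learning rate, and epoch count are arbitrary choices for this example):

```python
import numpy as np

def train_soft_margin(X, c, C=1.0, lr=0.01, epochs=2000):
    """Minimize (1/2)|w|^2 + C * sum_i max(0, 1 - c_i(w.x_i - b))
    by batch subgradient descent (a sketch, not a QP solver)."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = c * (X @ w - b)
        viol = margins < 1  # points violating the margin
        grad_w = w - C * (c[viol, None] * X[viol]).sum(axis=0)
        grad_b = C * c[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Tiny made-up 2-D dataset; a larger C penalizes margin violations more.
X = np.array([[2.0, 1.0], [3.0, 2.0], [-1.0, 0.0], [-2.0, 1.0]])
c = np.array([1, 1, -1, -1])
w, b = train_soft_margin(X, c, C=1.0)
pred = np.sign(X @ w - b)
```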

9 Some math Data: {(x1, c1), (x2, c2), …, (xn, cn)}
xi is a vector of attributes/features, scaled; ci is the class of the vector (−1 or +1)
Dividing hyperplane: w·x − b = 0
Linearly separable means there is a hyperplane such that w·xi − b > 0 for every positive example and w·xi − b < 0 for every negative example
w points perpendicular to the hyperplane
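The classification rule this implies, assigning the class by the sign of w·x − b, can be written directly; the w and b values here are made up for illustration:

```python
import numpy as np

# Hypothetical hyperplane parameters (made up for illustration).
w = np.array([2 / 3, 0.0])
b = 1 / 3

def classify(x):
    """Assign class +1 if w.x - b > 0, else -1."""
    return 1 if np.dot(w, x) - b > 0 else -1

print(classify(np.array([3.0, 1.0])))   # 1
print(classify(np.array([-1.0, 2.0])))  # -1
```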

10 More math The dividing hyperplane is w·x − b = 0; the support vectors lie on the parallel hyperplanes w·x − b = 1 and w·x − b = −1
The distance between these hyperplanes is 2/|w|, so to maximize the margin, minimize |w|
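A quick numeric check of the 2/|w| formula (w and b are made-up example values): starting from a point on the w·x − b = 1 hyperplane and stepping a distance of 2/|w| along the unit normal −w/|w| should land exactly on w·x − b = −1.

```python
import numpy as np

w = np.array([2 / 3, 0.0])  # example weight vector (made up)
b = 1 / 3
margin = 2 / np.linalg.norm(w)  # distance between the planes: 3.0

x_plus = np.array([2.0, 5.0])   # lies on w.x - b = 1
assert np.isclose(w @ x_plus - b, 1.0)

# Step across the margin along the unit normal -w/|w|.
x_minus = x_plus - margin * w / np.linalg.norm(w)
print(w @ x_minus - b)          # -1.0
```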

11 More math For all i, either w·xi − b ≥ 1 or w·xi − b ≤ −1
Can be rewritten: ci(w·xi − b) ≥ 1
Minimize (1/2)|w|² subject to ci(w·xi − b) ≥ 1 for all i
This is a quadratic programming problem and can be solved in polynomial time
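This quadratic program is small enough to hand to a general-purpose constrained optimizer. Below is a minimal sketch using SciPy's SLSQP method (an assumption of this example, since the slides do not prescribe a solver, and the four data points are made up):

```python
import numpy as np
from scipy.optimize import minimize

X = np.array([[2.0, 1.0], [3.0, 2.0], [-1.0, 0.0], [-2.0, 1.0]])
c = np.array([1, 1, -1, -1])

# Variables z = (w1, w2, b); minimize (1/2)|w|^2.
objective = lambda z: 0.5 * (z[0] ** 2 + z[1] ** 2)

# One constraint per point: c_i * (w.x_i - b) - 1 >= 0.
cons = [{"type": "ineq",
         "fun": lambda z, i=i: c[i] * (X[i] @ z[:2] - z[2]) - 1.0}
        for i in range(len(c))]

res = minimize(objective, x0=np.zeros(3), constraints=cons, method="SLSQP")
w, b = res.x[:2], res.x[2]
print(2 / np.linalg.norm(w))  # the resulting maximum margin 2/|w|
```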

12 A few more details So far, assumed linearly separable
To get to higher dimensions, use a kernel function instead of the plain dot product; the kernel may be a nonlinear transform
The Radial Basis Function is a commonly used kernel: k(x, x′) = exp(−γ‖x − x′‖²) [need to choose γ]
So far, no errors allowed; with a soft margin: minimize (1/2)|w|² + C Σi ξi subject to ci(w·xi − b) ≥ 1 − ξi, with ξi ≥ 0
C is the error penalty
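The RBF kernel formula translates directly to NumPy; the gamma value below is an arbitrary example, since choosing it well is exactly the cross-validation problem mentioned earlier:

```python
import numpy as np

def rbf_kernel(x, x_prime, gamma=0.5):
    """k(x, x') = exp(-gamma * ||x - x'||^2); gamma is an arbitrary example value."""
    return np.exp(-gamma * np.sum((x - x_prime) ** 2))

a = np.array([1.0, 2.0])
b = np.array([2.0, 0.0])
print(rbf_kernel(a, a))  # 1.0: a point is maximally similar to itself
print(rbf_kernel(a, b))  # decays toward 0 with squared distance
```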

