1
Support Vector Machine (SVM) Based on Nello Cristianini's tutorial: http://www.support-vector.net/tutorial.html
2
Basic Idea Use a Linear Learning Machine (LLM). Overcome the linearity constraint: map the data non-linearly to a higher-dimensional space. Select between hyperplanes using the margin as the criterion; generalization depends on the margin.
3
General idea [Figure: the original problem and the transformed (embedded) problem]
4
Kernel-Based Algorithms Two separate components: a learning algorithm that works in the embedded space, and a kernel function that performs the embedding.
5
Basic Example: Kernel Perceptron Hyperplane classification: f(x) = ⟨w, x⟩ + b, h(x) = sign(f(x)). Perceptron Algorithm: sample (x_i, t_i) with t_i ∈ {−1, +1}. IF t_i (⟨w_k, x_i⟩ + b) ≤ 0 THEN /* error */ w_{k+1} = w_k + t_i x_i; k = k + 1.
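A minimal NumPy sketch of the primal perceptron update above, assuming a data matrix X with labels t in {−1, +1}; the epoch count and the bias update are illustrative choices, not part of the slide:

```python
import numpy as np

def perceptron(X, t, epochs=10):
    """Primal perceptron: w is updated only on misclassified points."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        for i in range(n):
            if t[i] * (X[i] @ w + b) <= 0:   # mistake on x_i
                w += t[i] * X[i]             # w_{k+1} = w_k + t_i x_i
                b += t[i]
    return w, b
```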
6
Recall The margin γ of a hyperplane w, and the perceptron mistake bound: if all inputs lie in a ball of radius R and the data is separable with margin γ, the perceptron makes at most (R/γ)² mistakes.
7
Observations The solution is a linear combination of the inputs: w = Σ_i α_i t_i x_i with α_i ≥ 0. Mistake driven: only points on which we make a mistake have influence! Support vectors: the points with non-zero α_i.
8
Dual representation Rewrite the basic function: f(x) = ⟨w, x⟩ + b = Σ_i α_i t_i ⟨x_i, x⟩ + b, since w = Σ_i α_i t_i x_i. Change the update rule: IF t_j (Σ_i α_i t_i ⟨x_i, x_j⟩ + b) ≤ 0 THEN α_j = α_j + 1. Observation: the data appears only inside inner products!
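A matching sketch of the dual perceptron, assuming the same X and t as before; note that the data enters only through the Gram matrix of inner products:

```python
import numpy as np

def dual_perceptron(X, t, epochs=10):
    """Dual perceptron: keep one coefficient a_j per training point."""
    n = X.shape[0]
    G = X @ X.T                       # G[i, j] = <x_i, x_j>
    a, b = np.zeros(n), 0.0
    for _ in range(epochs):
        for j in range(n):
            if t[j] * (np.sum(a * t * G[:, j]) + b) <= 0:   # mistake on x_j
                a[j] += 1
                b += t[j]
    return a, b
```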
9
Limitation of Perceptron Only linear separations Only converges for linearly separable data Only defined on vectorial data
10
The idea of a Kernel Embed the data in a different space, possibly of higher dimension, where it becomes linearly separable. [Figure: original problem vs. transformed problem]
11
Kernel Mapping We only need to compute inner products. Mapping: M(x). Kernel: K(x, y) = ⟨M(x), M(y)⟩. The dimensionality of M(x) is unimportant: we only need to compute K(x, y). To work in the embedded space, replace ⟨x, y⟩ by K(x, y).
12
Example x = (x_1, x_2); z = (z_1, z_2); K(x, z) = (⟨x, z⟩)² = (x_1 z_1 + x_2 z_2)² = ⟨M(x), M(z)⟩ for the feature map M(x) = (x_1², x_2², √2·x_1 x_2).
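A small numerical check (the vectors are illustrative values only) that this degree-2 polynomial kernel equals the inner product of the explicit feature map M:

```python
import numpy as np

def K(x, z):
    """Degree-2 polynomial kernel on R^2: K(x, z) = (<x, z>)^2."""
    return (x @ z) ** 2

def M(x):
    """Explicit feature map whose inner product reproduces K."""
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

x, z = np.array([1.0, 2.0]), np.array([3.0, -1.0])
assert np.isclose(K(x, z), M(x) @ M(z))   # both equal (1*3 + 2*(-1))^2 = 1
```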
13
Polynomial Kernel [Figure: the original problem and the problem transformed by the polynomial feature map]
14
Kernel Matrix
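A minimal sketch of how the kernel (Gram) matrix of all pairwise evaluations K(x_i, x_j) could be built, assuming a sample X and an arbitrary kernel function; for a valid kernel the result is symmetric and positive semi-definite:

```python
import numpy as np

def kernel_matrix(X, kernel):
    """Gram matrix G[i, j] = K(x_i, x_j) over the whole sample."""
    n = X.shape[0]
    return np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])

X = np.random.randn(5, 2)
G = kernel_matrix(X, lambda x, z: (x @ z) ** 2)
print(np.allclose(G, G.T), np.all(np.linalg.eigvalsh(G) > -1e-9))  # symmetric, PSD
```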
15
Examples of Basic Kernels Polynomial: K(x, z) = (⟨x, z⟩)^d. Gaussian: K(x, z) = exp{−||x − z||² / 2σ²}.
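Hedged NumPy sketches of these two kernels; the default degree d and width sigma are arbitrary placeholders:

```python
import numpy as np

def poly_kernel(x, z, d=3):
    """Polynomial kernel: K(x, z) = (<x, z>)^d."""
    return (x @ z) ** d

def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel: K(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))
```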
16
Kernels: Closure Properties K(x, z) = K_1(x, z) + c (c ≥ 0); K(x, z) = c·K_1(x, z) (c ≥ 0); K(x, z) = K_1(x, z) · K_2(x, z); K(x, z) = K_1(x, z) + K_2(x, z). Create new kernels from basic ones!
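An illustrative sketch of these closure properties, assuming k1 and k2 are valid kernels and the constants are non-negative:

```python
import numpy as np

k1 = lambda x, z: (x @ z) ** 2                          # polynomial kernel
k2 = lambda x, z: np.exp(-np.sum((x - z) ** 2) / 2.0)   # Gaussian kernel

shifted = lambda x, z: k1(x, z) + 1.0        # K = K1 + c
scaled  = lambda x, z: 3.0 * k1(x, z)        # K = c * K1
product = lambda x, z: k1(x, z) * k2(x, z)   # K = K1 * K2
summed  = lambda x, z: k1(x, z) + k2(x, z)   # K = K1 + K2
```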
17
Support Vector Machines Linear Learning Machines (LLMs) in dual representation, working in the kernel-induced feature space: f(x) = Σ_i α_i t_i K(x_i, x) + b. Which hyperplane should we select?
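A minimal sketch of this dual decision function, assuming the support vectors, their labels, the coefficients α_i, the bias b, and a kernel are already available from training:

```python
import numpy as np

def svm_decision(x, support_X, support_t, alphas, b, kernel):
    """f(x) = sum_i a_i t_i K(x_i, x) + b, summed over the support vectors."""
    return sum(a * t * kernel(xi, x)
               for a, t, xi in zip(alphas, support_t, support_X)) + b

def svm_classify(x, *params):
    """h(x) = sign(f(x))."""
    return np.sign(svm_decision(x, *params))
```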
18
Generalization of SVM PAC theory: error = O(VCdim / m). Problem: VCdim >> m in the embedded space, and there is no preference between consistent hyperplanes.
19
Margin-based bounds H: basic hypothesis class. conv(H): finite convex combinations of functions in H. D: distribution over X × {+1, −1}. S: sample of size m drawn from D.
20
Margin-based bounds THEOREM: with probability at least 1 − δ over the sample S, for every f in conv(H) and every margin γ > 0, P_D[t·f(x) ≤ 0] ≤ P_S[t·f(x) ≤ γ] + O(√(VCdim(H)·log²m / (γ² m) + log(1/δ)/m)).
21
Maximal Margin Classifier Maximizes the margin, which minimizes the overfitting due to hyperplane selection. Increases the margin rather than reducing the dimensionality.
22
SVM: Support Vectors
23
Margins Geometric margin: min_i t_i f(x_i) / ||w||. Functional margin: min_i t_i f(x_i).
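A small sketch computing both margins for a linear classifier, assuming a sample (X, t) and parameters (w, b):

```python
import numpy as np

def margins(X, t, w, b):
    """Margins of the linear classifier f(x) = <w, x> + b on the sample (X, t)."""
    f = X @ w + b
    functional = np.min(t * f)                   # min_i t_i f(x_i)
    geometric = functional / np.linalg.norm(w)   # min_i t_i f(x_i) / ||w||
    return functional, geometric
```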
24
Main trick in SVM Insist on a functional margin of at least 1; the support vectors have functional margin exactly 1. Then the geometric margin is 1 / ||w||. Proof: dividing the functional margin (= 1) by ||w|| gives the geometric margin.
25
SVM criterion Find a hyperplane (w, b) that minimizes ||w||² = ⟨w, w⟩, subject to: for all i, t_i (⟨w, x_i⟩ + b) ≥ 1.
26
Quadratic Programming Quadratic objective function. Linear constraints. Unique optimum (the problem is convex). Polynomial-time algorithms.
27
Dual Problem Maximize W(α) = Σ_i α_i − ½ Σ_{i,j} α_i α_j t_i t_j K(x_i, x_j), subject to Σ_i α_i t_i = 0 and α_i ≥ 0.
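In practice the dual QP is handed to an off-the-shelf solver. A hedged sketch using scikit-learn's SVC, which solves a soft-margin dual internally; a large C approximates the hard-margin problem stated here, and the synthetic data is purely illustrative:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(+2, 1, (20, 2))])
t = np.hstack([-np.ones(20), np.ones(20)])

clf = SVC(kernel="linear", C=1e6).fit(X, t)   # large C ~ hard margin
print(clf.support_)      # indices of the support vectors (alpha_i > 0)
print(clf.dual_coef_)    # alpha_i * t_i for the support vectors
```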
28
Applications: Text Classify a text into given categories: sports, news, business, science, … Feature space: bag of words, a huge sparse vector!
29
Applications: Text Practicalities: M_w(x) = tf_w · log(idf_w) / K, where tf_w = term frequency of w in the document, idf_w = inverse document frequency = (# documents) / (# documents containing w), and K is a normalization constant. Inner products of sparse vectors are cheap to compute. The SVM finds a hyperplane in “document space”.
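A hedged end-to-end sketch using scikit-learn; its TF-IDF weighting is a smoothed variant of the tf_w · log(idf_w) scheme above, and the tiny corpus and labels are purely illustrative:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

docs = ["the team won the match", "stocks fell sharply today",
        "new physics results published", "coach praises the players"]
labels = ["sports", "business", "science", "sports"]

# Bag-of-words TF-IDF features + a linear SVM in "document space".
clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(docs, labels)
print(clf.predict(["the match ended in a draw"]))
```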