1
Introduction to Machine Learning 236756
Prof. Nir Ailon
Lecture 5: Support Vector Machines (SVM)
2
Linear Separators
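The slide body is not in the transcript; for reference, the standard definition is as follows. A linear separator (halfspace) over \(\mathbb{R}^d\) is a hypothesis
\[
h_{w,b}(x) = \operatorname{sign}(\langle w, x\rangle + b), \qquad w \in \mathbb{R}^d,\ b \in \mathbb{R},
\]
and a sample \(S = \{(x_i, y_i)\}_{i=1}^m\) with \(y_i \in \{\pm 1\}\) is linearly separable if some \((w,b)\) satisfies \(y_i(\langle w, x_i\rangle + b) > 0\) for all \(i\).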
3
Which Is Better?
4
Margin
The margin of a linear separator is defined as the distance from the closest instance point to the separating hyperplane. Large margins are intuitively more stable: if noise is added to the data, it is more likely to remain separated.
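For reference (a standard formulation; the slide's own formula is not in the transcript): for a separating hyperplane \((w,b)\) and sample \(S=\{(x_i,y_i)\}_{i=1}^m\), the margin is
\[
\gamma(w,b) \;=\; \min_{i\in[m]} \frac{y_i\,(\langle w, x_i\rangle + b)}{\lVert w\rVert}.
\]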
5
The Margin
6
Hard-SVM
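The standard Hard-SVM rule (supplied here for reference, since the slide's formula is not in the transcript) returns the separating hyperplane of maximal margin:
\[
\operatorname*{argmax}_{(w,b)\,:\,\lVert w\rVert = 1}\ \min_{i\in[m]}\ y_i\,(\langle w, x_i\rangle + b).
\]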
8
Hard-SVM Equivalent Formulation
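A standard equivalent formulation (for reference; the slide's formula is not in the transcript) is the quadratic program
\[
(w_0, b_0) = \operatorname*{argmin}_{(w,b)} \lVert w\rVert^2 \quad \text{s.t.} \quad y_i\,(\langle w, x_i\rangle + b) \ge 1 \ \ \forall i,
\]
with output \(\hat{w} = w_0/\lVert w_0\rVert\), \(\hat{b} = b_0/\lVert w_0\rVert\); the achieved margin is \(1/\lVert w_0\rVert\).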
9
Sample Complexity With Margin
10
NO! The margin must be measured relative to the scale of the data (one could take any dataset with a tiny margin and blow it up for free).
11
Sample Complexity With Margin
13
Shattering With a Margin
[Figure: two point configurations, one separated with a large margin, the other separated, but not with a large margin.]
14
Sample Complexity of Hard-SVM with Margin
What does this replace?
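For reference, the standard margin-based bound (the exact constants on the slide are not in the transcript): if the data lies in a ball of radius \(\rho\) and is separable with margin \(\gamma\), then the sample complexity of Hard-SVM scales as
\[
m(\epsilon, \delta) = O\!\left(\frac{(\rho/\gamma)^2 + \log(1/\delta)}{\epsilon^2}\right),
\]
replacing the dimension \(d\) in the VC-based bound for halfspaces with the dimension-free quantity \((\rho/\gamma)^2\).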
15
Soft-SVM
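The standard Soft-SVM program (given here for reference, as the slide's formula is not in the transcript) relaxes the hard constraints with slack variables \(\xi_i\):
\[
\min_{w,b,\xi}\ \lambda \lVert w\rVert^2 + \frac{1}{m}\sum_{i=1}^m \xi_i \quad \text{s.t.} \quad y_i\,(\langle w, x_i\rangle + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0.
\]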
16
Soft-SVM: Equivalent Definition
This is an instance of SRM (structural risk minimization): the hypothesis is penalized by its norm.
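Equivalently (standard form, for reference), Soft-SVM minimizes the norm-regularized empirical hinge loss:
\[
\min_{w,b}\ \lambda \lVert w\rVert^2 + \frac{1}{m}\sum_{i=1}^m \max\{0,\ 1 - y_i(\langle w, x_i\rangle + b)\}.
\]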
17
Sample Complexity for Soft-SVM
No dimensionality dependence.
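For reference, the standard guarantee (constants may differ from the slide): if \(\lVert x\rVert \le \rho\) and we compete with predictors of norm at most \(B\), then Soft-SVM with an appropriate \(\lambda\) satisfies
\[
\mathbb{E}\big[L_{\mathcal{D}}^{0\text{-}1}(A(S))\big] \;\le\; \min_{\lVert w\rVert \le B} L_{\mathcal{D}}^{\mathrm{hinge}}(w) \;+\; O\!\left(\sqrt{\frac{\rho^2 B^2}{m}}\right),
\]
so the sample complexity depends on \(\rho^2 B^2\) but not on the dimension \(d\).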
18
What About Computational Complexity in High Dimension?
19
The Representer Theorem
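The theorem (standard statement, supplied for reference since the slide body is not in the transcript): for any objective of the form \(\min_w f(\langle w, \psi(x_1)\rangle, \dots, \langle w, \psi(x_m)\rangle) + R(\lVert w\rVert)\) with \(R\) nondecreasing, there is an optimal solution of the form
\[
w^\star = \sum_{i=1}^m \alpha_i\, \psi(x_i), \qquad \alpha \in \mathbb{R}^m,
\]
so the search can be carried out over the \(m\) coefficients \(\alpha_i\) instead of over a high-dimensional \(w\).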
22
The Gram Matrix
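Standard facts, supplied for reference: the Gram matrix of the sample is \(G \in \mathbb{R}^{m\times m}\) with
\[
G_{ij} = \langle \psi(x_i), \psi(x_j)\rangle .
\]
Writing \(w = \sum_i \alpha_i \psi(x_i)\) as in the representer theorem, the objective depends on the data only through \(G\): \(\langle w, \psi(x_j)\rangle = (G\alpha)_j\) and \(\lVert w\rVert^2 = \alpha^\top G \alpha\).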
23
The Kernel
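A kernel is a function computing inner products in feature space,
\[
K(x, x') = \langle \psi(x), \psi(x')\rangle,
\]
and the kernel trick is to optimize over \(\alpha\) using only evaluations of \(K\), without ever materializing \(\psi(x)\). (Standard definition, supplied since the slide body is not in the transcript.)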
24
Polynomial Kernels
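For reference, the degree-\(k\) polynomial kernel over \(\mathbb{R}^d\) is
\[
K(x, x') = (1 + \langle x, x'\rangle)^k,
\]
which equals the inner product under a feature map containing (with suitable coefficients) all monomials of degree at most \(k\); \(k = 2\) covers the ellipse example below.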
25
Gaussian Kernels (RBF: Radial Basis Functions)
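The Gaussian (RBF) kernel with width \(\sigma\) is
\[
K(x, x') = \exp\!\left(-\frac{\lVert x - x'\rVert^2}{2\sigma^2}\right),
\]
corresponding to an infinite-dimensional feature map. (Standard form; the slide's parameterization may differ.)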
26
Kernels As Prior Knowledge
If we think that the positive examples can (almost) be separated by some ellipse, then we should use polynomials of degree 2. What should we do if we believe that we can classify a text message using the words of a dictionary? (See the sketch below.) A kernel encodes a measure of similarity between objects, and it must be a valid inner product function.
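As an illustration of the text-message question above, here is a minimal sketch (not from the slides; the function name and the bag-of-words choice are assumptions) of a kernel that measures similarity as the number of shared dictionary words. It is a valid inner product because it equals the dot product of binary bag-of-words vectors:

def bow_kernel(msg_a: str, msg_b: str) -> float:
    """Counts shared words: the inner product of binary bag-of-words vectors."""
    words_a = set(msg_a.lower().split())
    words_b = set(msg_b.lower().split())
    return float(len(words_a & words_b))

# e.g. bow_kernel("free prize now", "claim your prize now") == 2.0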
27
Solving SVMs Efficiently
28
SGD for SVM
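A minimal sketch of SGD for the Soft-SVM objective (Pegasos-style; the slide's exact pseudocode is not in the transcript, and the function name, step-size schedule, and homogeneous no-bias form are assumptions):

import numpy as np

def sgd_svm(X, y, lam=0.1, n_iters=10_000, seed=0):
    """Pegasos-style SGD for the Soft-SVM objective
        (lam / 2) * ||w||^2 + (1/m) * sum_i max(0, 1 - y_i * <w, x_i>),
    with labels y_i in {-1, +1}; the factor 1/2 on the regularizer is the
    Pegasos convention and only rescales lam.
    """
    rng = np.random.default_rng(seed)
    m, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.integers(m)                   # sample one example uniformly
        eta = 1.0 / (lam * t)                 # decaying step size 1/(lam*t)
        active = y[i] * (X[i] @ w) < 1.0      # hinge subgradient at current w
        w = (1.0 - eta * lam) * w             # step on the regularizer part
        if active:
            w += eta * y[i] * X[i]            # step on the hinge part
    return w

Prediction is then sign(x @ w); a bias term can be handled by appending a constant feature to X.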