Support Vector Machines, a.k.a. Whirlwind o' Vector Algebra. Reading: Sec. 6.3; SVM Tutorial by C. Burges (on the class "resources" page)
Administrivia: reminder about the straw poll (RL or unsupervised learning?)
Nonlinear data projection. Suppose you have a "projection function" Φ mapping the original feature space to a "projected" space: Φ: R^d -> F, usually with dim(F) >> d. Do learning with a linear model in F. Ex: a degree-k polynomial expansion, where Φ(x) contains all products of up to k components of x.
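A minimal Python sketch (my own illustration, not from the slides) of one such projection function Φ; the helper name phi, the use of numpy/itertools, and the degree-2 example are assumptions:

# Sketch of a polynomial "projection" Phi: it maps x in R^d to the vector of
# all monomials of its components up to degree k.
import itertools
import numpy as np

def phi(x, k=2):
    feats = [1.0]                                   # degree-0 term
    for degree in range(1, k + 1):
        for idx in itertools.combinations_with_replacement(range(len(x)), degree):
            feats.append(np.prod(x[list(idx)]))     # one monomial per index tuple
    return np.array(feats)

x = np.array([2.0, 3.0])
print(phi(x, k=2))    # [1. 2. 3. 4. 6. 9.] = 1, x1, x2, x1^2, x1*x2, x2^2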
The catch... How many dimensions does Φ(x) have? For degree-k polynomial expansions, dim(F) grows combinatorially (roughly on the order of d^k / k! distinct monomials). E.g., for k=4, d=256 (16x16 images), that is on the order of 10^8 dimensions. Yike! For "radial basis functions", dim(F) is infinite.
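To make the blow-up concrete, here is a tiny Python check (my own sketch); it assumes the "all monomials of degree <= k" counting convention, C(d+k, k):

# Sketch: count the features produced by a degree-k polynomial expansion of
# d inputs, assuming the "all monomials of degree <= k" convention: C(d+k, k).
from math import comb

def poly_dim(d, k):
    return comb(d + k, k)

print(poly_dim(256, 4))   # 186043585 -- roughly 1.9e8 features for 16x16 images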
Linear surfaces for cheap. Can't directly find linear surfaces in the projected space F (too many dimensions). Have to find a clever "method" for finding them indirectly. It'll take (quite) a bit of work to get there... Will need a different criterion than the ones we've used so far. We'll look for the "maximum margin" classifier: a surface s.t. class 1 ("true") data falls as far as possible on one side; class -1 ("false") data falls as far as possible on the other.
Max margin hyperplanes. [Figure: a separating hyperplane between the two classes, with its margin shown on either side.]
Max margin is unique. [Figure: the unique maximum-margin hyperplane and its margin.]
Exercise. Given a hyperplane defined by a weight vector w and offset w0: What is the equation for points on the surface of the hyperplane? What are the equations for points on the two margins? Give an expression for the distance between a point and the hyperplane (and/or either margin). What is the role of w0?
5 minutes of math... A dot product (inner product) is a projection of one vector onto another: the projection of X onto w has length (w·X)/||w||. When the projection of X onto w is equal to -w0/||w||, i.e., w·X + w0 = 0, then X falls exactly onto the hyperplane. [Figure: the weight vector w, the hyperplane, and a point X.]
5 minutes of math... BTW, are we sure that the hyperplane is perpendicular to w? Why? Consider any two vectors X1 and X2 falling exactly on the hyperplane. Then w·X1 + w0 = 0 and w·X2 + w0 = 0, so w·(X1 - X2) = 0. (X1 - X2) is some vector lying in the hyperplane, so w is perpendicular to any vector in the hyperplane.
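A quick numerical sanity check of the perpendicularity claim (my own sketch; the particular w, w0, and the 2-d setting are arbitrary choices):

# Sketch: check numerically that w is perpendicular to any vector lying in the
# hyperplane {x : w·x + w0 = 0}; w and w0 are arbitrary choices.
import numpy as np

w, w0 = np.array([3.0, 4.0]), -5.0

def point_on_plane(t):
    # start at the point of the plane closest to the origin, then move along a
    # direction orthogonal to w (so we stay in the plane)
    base = -w0 * w / np.dot(w, w)
    direction = np.array([-w[1], w[0]])
    return base + t * direction

x1, x2 = point_on_plane(1.0), point_on_plane(-2.5)
print(np.dot(w, x1) + w0, np.dot(w, x2) + w0)   # both ~0: x1, x2 are on the plane
print(np.dot(w, x1 - x2))                       # ~0: w is perpendicular to x1 - x2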
5 minutes of math... Projections on one side of the hyperplane have dot products w·X + w0 > 0... and on the other, w·X + w0 < 0. [Figure: the hyperplane, w, and a point X on each side.]
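The same fact in code (my own sketch, with arbitrary w, w0, and test points): the sign of w·X + w0 tells us which side of the hyperplane X falls on.

# Sketch: the sign of w·x + w0 says which side of the hyperplane x is on.
# w, w0, and the test points are arbitrary illustrative values.
import numpy as np

w, w0 = np.array([3.0, 4.0]), -5.0
points = np.array([[2.0, 2.0],     # w·x + w0 =  9 -> positive side
                   [0.0, 0.0],     # w·x + w0 = -5 -> negative side
                   [3.0, -1.0]])   # w·x + w0 =  0 -> on the hyperplane
print(points @ w + w0)             # [ 9. -5.  0.]
print(np.sign(points @ w + w0))    # [ 1. -1.  0.]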
5 minutes of math... What is the distance r from any vector X to the hyperplane? [Figure: a point X off the hyperplane, with the unknown distance labeled r.] Idea: write X as a point on the plane plus an offset perpendicular to the plane: X = X_p + r·(w/||w||), where X_p lies on the hyperplane.
5 minutes of math... Now: w·X + w0 = w·(X_p + r·w/||w||) + w0 = (w·X_p + w0) + r·(w·w)/||w|| = 0 + r·||w||. Solving for r: r = (w·X + w0)/||w||.
5 minutes of math... Theorem: the distance, r, from any point X to the hyperplane defined by w and w0 is given by r = (w·X + w0)/||w||. Lemma: the distance from the origin to the hyperplane is given by w0/||w||. Also: r > 0 for points on one side of the hyperplane; r < 0 for points on the other.
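A small Python sketch of the theorem (my own illustration; w, w0, and the test points are arbitrary):

# Sketch of the theorem: signed distance from X to the hyperplane {x : w·x + w0 = 0}.
# w, w0, and the test points are arbitrary.
import numpy as np

def signed_distance(X, w, w0):
    return (np.dot(w, X) + w0) / np.linalg.norm(w)

w, w0 = np.array([3.0, 4.0]), -5.0
print(signed_distance(np.array([2.0, 2.0]), w, w0))   # 1.8: positive side
print(signed_distance(np.array([0.0, 0.0]), w, w0))   # -1.0: the origin, i.e. w0/||w||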
Back to SVMs & margins. The margins are parallel to the hyperplane, so they are defined by the same w plus constant offsets: w·x + w0 = +b and w·x + w0 = -b. Want to ensure that all data points are "outside" the margins. [Figure: the hyperplane with its two margins, at offset b on either side.]
Maximizing the margin. So now we have a learning criterion function: pick w to make the margins as wide as possible (maximize b), s.t. all points still satisfy y_i·(w·x_i + w0) >= b. Note: w.l.o.g. we can rescale w (and w0, b) arbitrarily without moving the hyperplane or the margins (why?), so we can fix b = 1; the geometric margin is then 1/||w||, and maximizing it means minimizing ||w||. So we can formulate the full problem as: Minimize: (1/2)||w||^2 Subject to: y_i·(w·x_i + w0) >= 1 for all i. But how do you do that? And how does this help?
Quadratic programming. Problems of the form Minimize: (1/2)·zᵀQz + cᵀz Subject to: linear constraints Az >= d are called "quadratic programming" (QP) problems. There are off-the-shelf methods to solve them. Actually solving this is way, way beyond the scope of this class; consider it a black box. If a solution exists, it will be found & be unique. Expensive, but not intractably so.
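As an illustration of the "black box" point (my own sketch, not the course's code): hand the max-margin problem to a generic off-the-shelf constrained optimizer, here scipy.optimize.minimize with SLSQP; the toy dataset and variable names are assumptions.

# Sketch: solve the hard-margin SVM QP with a generic off-the-shelf solver.
# Decision variable z = (w, w0); minimize 0.5*||w||^2 s.t. y_i*(w·x_i + w0) >= 1.
# The toy data and the SLSQP solver choice are my own; any QP solver would do.
import numpy as np
from scipy.optimize import minimize

X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])
d = X.shape[1]

objective = lambda z: 0.5 * np.dot(z[:d], z[:d])            # 0.5*||w||^2
constraints = [{'type': 'ineq',
                'fun': lambda z, i=i: y[i] * (np.dot(X[i], z[:d]) + z[d]) - 1}
               for i in range(len(y))]

sol = minimize(objective, x0=np.zeros(d + 1), method='SLSQP',
               constraints=constraints)
w, w0 = sol.x[:d], sol.x[d]
print(w, w0)                  # the max-margin hyperplane for the toy data
print(np.sign(X @ w + w0))    # recovers the labels: [ 1.  1. -1. -1.]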
Nonseparable data. What if the data isn't linearly separable? Project into a higher-dimensional space (we'll get there), or allow some "slop" in the system: allow margins to be violated "a little". [Figure: a few points falling on the wrong side of their margin.]
The new "slackful" QP. The ξ_i are "slack variables": they allow margins to be violated a little. Still want to minimize margin violations, so add them to the QP instance: Minimize: (1/2)||w||^2 + C·Σ_i ξ_i (for some penalty weight C) Subject to: y_i·(w·x_i + w0) >= 1 - ξ_i and ξ_i >= 0 for all i.