Slide 2: Day 17: Duality and Nonlinear SVM. Kristin P. Bennett, Mathematical Sciences Department, Rensselaer Polytechnic Institute.
Slide 3: Best Linear Separator: the Supporting-Plane Method. Maximize the distance between two parallel supporting planes; this distance is the "margin".
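The slide's equations were lost in extraction; as a hedged reconstruction, the standard supporting-plane (hard-margin) formulation, in the usual notation with weight vector $w$ and offset $b$ (symbols not named on the slide), is:

```latex
\min_{w,b}\ \tfrac{1}{2}\|w\|^2
\quad\text{s.t.}\quad y_i\,(w^\top x_i + b)\ \ge\ 1,\qquad i = 1,\dots,m,
```

where the two supporting planes are $w^\top x + b = \pm 1$ and the margin between them is $2/\|w\|$.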
Slide 4: Soft-Margin SVM. Just add a non-negative error (slack) vector z.
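Continuing the same notation (an assumption, since the slide's formula image was lost), the soft-margin version adds the non-negative error vector $z$ and a trade-off parameter $C$:

```latex
\min_{w,b,z}\ \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{m} z_i
\quad\text{s.t.}\quad y_i\,(w^\top x_i + b)\ \ge\ 1 - z_i,\quad z_i \ge 0.
```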
Slide 5: Method 2: Find the Closest Points in the Convex Hulls (labeled c and d in the slide's figure).
Slide 6: The Separating Plane Bisects the Closest Points d and c.
Slide 7: The closest points are found with a quadratic program; many existing and new QP solvers apply.
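A minimal sketch of the closest-points QP in Python, using scipy's general-purpose SLSQP routine rather than a specialized QP solver; the function name `closest_hull_points` and the toy data are illustrative assumptions, not from the slides:

```python
import numpy as np
from scipy.optimize import minimize

def closest_hull_points(A, B):
    """Find the closest points c in conv(A) and d in conv(B).

    A, B: (m, n) arrays of points from the two classes.
    Minimizes ||A^T u - B^T v||^2 over convex-combination weights u, v
    (each non-negative and summing to one).
    """
    mA, mB = len(A), len(B)

    def objective(w):
        u, v = w[:mA], w[mA:]
        diff = A.T @ u - B.T @ v
        return diff @ diff

    cons = [
        {"type": "eq", "fun": lambda w: np.sum(w[:mA]) - 1.0},
        {"type": "eq", "fun": lambda w: np.sum(w[mA:]) - 1.0},
    ]
    w0 = np.concatenate([np.full(mA, 1.0 / mA), np.full(mB, 1.0 / mB)])
    res = minimize(objective, w0, bounds=[(0, 1)] * (mA + mB),
                   constraints=cons, method="SLSQP")
    u, v = res.x[:mA], res.x[mA:]
    return A.T @ u, B.T @ v  # the closest points c and d

# Two small separable point clouds in the plane.
A = np.array([[2.0, 2.0], [3.0, 2.0], [2.0, 3.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
c, d = closest_hull_points(A, B)
w = c - d                   # normal of the bisecting plane
b = (c @ c - d @ d) / 2.0   # plane w.x = b bisects the segment cd
```

The bisecting plane w·x = b is exactly the separator of the previous slide: perpendicular to the segment cd through its midpoint.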
Slide 8: The Dual of the Closest-Points Method Is the Support-Plane Method. The solution depends only on the support vectors.
Slide 9: One bad example? The convex hulls intersect, so the same argument won't work.
Slide 10: Don't trust a single point! Each point must depend on at least two actual data points.
Slides 11–14: Depend on ≥ two points (animation steps repeating the same constraint: each point must depend on at least two actual data points).
Slide 15: Final Reduced/Robust Set. Each point must depend on at least two actual data points; the resulting set is called the reduced convex hull.
Slide 16: Reduced Convex Hulls Don't Intersect. Each hull is reduced by adding an upper bound D on the combination weights.
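The reduced convex hull has a standard definition; as a sketch in the usual notation (the multipliers $\alpha_i$ are not named on the slide):

```latex
\operatorname{RCH}(X, D) \;=\; \Bigl\{\, \textstyle\sum_i \alpha_i x_i \;:\; \sum_i \alpha_i = 1,\; 0 \le \alpha_i \le D \,\Bigr\},
```

so with $D < 1$ every point of the reduced hull combines at least $\lceil 1/D \rceil$ data points, and $D = 1$ recovers the ordinary convex hull.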
Slide 17: Find the Closest Points, Then Bisect. The formulation is unchanged except for D, and D determines the number of support vectors.
Slide 18: The Dual of the Closest-Points Method Is the Soft-Margin Method. The solution depends only on the support vectors.
Slide 19: What will a linear SVM do?
Slide 20: The Linear SVM Fails.
Slide 21: The High-Dimensional Mapping Trick. http://www.slideshare.net/ankitksharma/svm-37753690
Slide 23: Nonlinear Classification: Map to a Higher-Dimensional Space. IDEA: map each point into a higher-dimensional feature space and construct a linear discriminant in that space. The dual SVM is then written in terms of the mapped points.
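The dual referenced above, as a hedged reconstruction in the standard form (the slide's equation image was lost; $\phi$ denotes the feature map):

```latex
\max_{\alpha}\ \sum_i \alpha_i \;-\; \tfrac{1}{2}\sum_{i,j} \alpha_i \alpha_j\, y_i y_j\, \phi(x_i)^\top \phi(x_j)
\quad\text{s.t.}\quad \sum_i \alpha_i y_i = 0,\quad 0 \le \alpha_i \le C.
```

The mapped points enter only through the inner products $\phi(x_i)^\top\phi(x_j)$, which is what makes the kernel substitution on the following slides possible.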
Slide 24: The Kernel Calculates an Inner Product. The kernel K(x, y) returns the inner product of the mapped points without ever forming the feature map explicitly.
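A small check that a kernel computes an inner product in feature space, using the homogeneous quadratic kernel on 2-D inputs (function names and the sample vectors are illustrative):

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for 2-D input (one standard choice)."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def poly_kernel(x, y):
    """Homogeneous quadratic kernel: computes phi(x).phi(y) implicitly."""
    return (x @ y) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
implicit = poly_kernel(x, y)   # kernel evaluation in the input space
explicit = phi(x) @ phi(y)     # same value via the explicit 3-D map
```

The kernel evaluation costs one 2-D dot product, yet agrees exactly with the inner product in the 3-D feature space; that saving is the point of the trick.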
Slide 25: Final Classification via Kernels. In the dual SVM, every inner product of mapped points is replaced by a kernel evaluation, so the classifier becomes f(x) = sign(Σ_i α_i y_i K(x_i, x) + b).
Slide 26: Generalized Inner Products via Hilbert–Schmidt Kernels (Courant and Hilbert 1953): K(x, y) acts as an inner product for certain feature maps φ and kernels K, e.g. polynomial and Gaussian (RBF) kernels. Kernels also exist for non-vector data such as strings, histograms, and DNA sequences.
Slide 27: Final SVM Algorithm. Solve the dual SVM QP; recover the primal variable b; classify a new point x. The solution depends only on the support vectors.
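The three steps above can be sketched end to end in Python. This is a toy illustration using scipy's SLSQP on the XOR problem with an RBF kernel; all names, the data, and the parameter values (C=10, gamma=1) are assumptions for the sketch, not values from the slides:

```python
import numpy as np
from scipy.optimize import minimize

def rbf(X1, X2, gamma=1.0):
    """Gaussian (RBF) kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def train_dual_svm(X, y, C=10.0, gamma=1.0):
    """Step 1-2: solve the dual QP, then recover the primal variable b.

    Dual: max sum(a) - 0.5 a^T (yy^T * K) a  s.t.  y^T a = 0, 0 <= a <= C.
    """
    m = len(y)
    K = rbf(X, X, gamma)
    Q = (y[:, None] * y[None, :]) * K

    def neg_dual(a):
        return 0.5 * a @ Q @ a - a.sum()

    cons = [{"type": "eq", "fun": lambda a: a @ y}]
    res = minimize(neg_dual, np.zeros(m), bounds=[(0, C)] * m,
                   constraints=cons, method="SLSQP")
    alpha = res.x
    # Recover b from a margin support vector (0 < alpha_i < C).
    sv = (alpha > 1e-6) & (alpha < C - 1e-6)
    i = int(np.argmax(sv))
    b = y[i] - (alpha * y) @ K[:, i]
    return alpha, b

def predict(Xnew, X, y, alpha, b, gamma=1.0):
    """Step 3: classify new points; only points with alpha_i > 0 matter."""
    return np.sign(rbf(Xnew, X, gamma) @ (alpha * y) + b)

# XOR: not linearly separable, but easy for an RBF-kernel SVM.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
alpha, b = train_dual_svm(X, y)
pred = predict(X, X, y, alpha, b)
```

A production solver would exploit the QP's structure (as slide 36 notes), but the convexity of the dual means this generic solver reaches the same optimum on small problems.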
Slide 28: SVM AMPL Dual Model.
Slide 30: S5: Recall the linear solution.
Slide 31: RBF Kernel Results on the Sample Data.
Slide 32: Parameters Must Be Chosen: the Effect of C.
Slide 33: The Effect of the RBF Parameter.
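A sketch of what the RBF parameter gamma does to the kernel matrix (variable names, data, and values are illustrative): a tiny gamma makes every pair of points look similar, giving very smooth, nearly constant decision functions, while a huge gamma makes each point similar only to itself, which risks memorizing the training data:

```python
import numpy as np

def rbf_matrix(X, gamma):
    """RBF kernel matrix exp(-gamma * ||x_i - x_j||^2)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))

K_small = rbf_matrix(X, gamma=1e-3)  # nearly all-ones: everything "similar"
K_large = rbf_matrix(X, gamma=1e3)   # nearly the identity matrix

# Mean of the off-diagonal entries summarizes each regime.
mask = ~np.eye(20, dtype=bool)
off_small = K_small[mask].mean()   # close to 1
off_large = K_large[mask].mean()   # close to 0
```

This is why the RBF parameter, like C, must be tuned (e.g. by cross-validation): both extremes produce a kernel matrix that carries almost no usable information about the data's geometry.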
Slide 34: General Kernel Methodology.
1. Pick a learning task.
2. Start with a linear function and data.
3. Define a loss function.
4. Define regularization.
5. Formulate the optimization problem in the dual / inner-product space.
6. Construct an appropriate kernel.
7. Solve the problem in the dual space.
Slide 35: Extensions: Many Inference Tasks. Regression; one-class classification (novelty detection); ranking; clustering; multi-task learning; learning kernels; canonical correlation analysis; principal component analysis.
Slide 36: Algorithms. Two types: general-purpose solvers (CPLEX by ILOG, the MATLAB optimization toolkit) and special-purpose solvers that exploit the structure of the problem. The best linear-SVM solvers take time linear in the number of training points; the best kernel-SVM solvers take time quadratic in it. Good news: since the problem is convex, the choice of algorithm doesn't really matter as long as it can be solved.
Slide 37: Hallelujah! Generalization theory and practice meet. A general methodology for many types of inference problems: same program + new kernel = new method. No problems with local minima. Few model parameters; avoids overfitting. Robust optimization methods. Applicable to non-vector problems. Easy to use and tune. Successful applications. BUT…
Slide 38: Catches. Will SVMs beat my best hand-tuned method Z on problem X? Do SVMs scale to massive datasets? How to choose C and the kernel? How to transform the data? How to incorporate domain knowledge? How to interpret the results? Are linear methods enough?