PEGASOS Primal Estimated sub-GrAdient Solver for SVM


1 PEGASOS Primal Estimated sub-GrAdient Solver for SVM
Ming TIAN

2 References
[1] Shalev-Shwartz, S., Singer, Y., & Srebro, N. (2007). Pegasos: Primal Estimated sub-GrAdient SOlver for SVM. ICML. Extended version: Mathematical Programming, Series B, 127(1):3-30, 2011.
[2] Wang, Z., Crammer, K., & Vucetic, S. (2010). Multi-Class Pegasos on a Budget. ICML.
[3] Crammer, K., & Singer, Y. (2001). On the algorithmic implementation of multiclass kernel-based vector machines. JMLR, 2.
[4] Crammer, K., Kandola, J., & Singer, Y. (2004). Online classification on a budget. NIPS, 16.

3 Outline
Review of SVM optimization
The Pegasos algorithm
Multi-Class Pegasos on a Budget
Further work

4 Outline
Review of SVM optimization
The Pegasos algorithm
Multi-Class Pegasos on a Budget
Further work

5 Review of SVM optimization
The SVM objective balances two terms: a regularization term and the empirical loss (see the reconstruction below).
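A reconstruction of the standard primal SVM objective from [1], with the two terms of the slide labeled (a sketch of the standard formulation, not the slide's exact typography):

    \[
    \min_{w}\;
    \underbrace{\frac{\lambda}{2}\|w\|^2}_{\text{regularization term}}
    \;+\;
    \underbrace{\frac{1}{m}\sum_{i=1}^{m}\max\bigl\{0,\,1 - y_i\langle w, x_i\rangle\bigr\}}_{\text{empirical loss}}
    \]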

6 Review of SVM optimization

7 Review of SVM optimization
Dual-based methods:
Interior point methods — memory: m², time: m³ log(log(1/ε))
Decomposition methods — memory: m, time: super-linear in m
Online learning & stochastic gradient:
Memory: O(1), time: 1/ε² (linear kernel)
Memory: 1/ε², time: 1/ε⁴ (non-linear kernel)
Typically, online learning algorithms do not converge to the optimal solution of the SVM. Better rates exist for finite-dimensional instances (Murata, Bottou).

8 Outline Review of SVM optimization The Pegasos algorithm
Multi-Class Pegasos on a Budget Further works

9 PEGASOS
At iteration t, choose a subset A_t ⊆ S of training examples and take a subgradient step on the instantaneous objective, followed by a projection:
w_{t+1/2} = (1 − η_t λ) w_t + (η_t/|A_t|) Σ_{(x,y) ∈ A_t : y⟨w_t,x⟩ < 1} y x, with η_t = 1/(λt)
w_{t+1} = min{1, (1/√λ)/||w_{t+1/2}||} · w_{t+1/2} (projection onto the ball of radius 1/√λ)
Choosing A_t = S recovers the subgradient method; |A_t| = 1 gives stochastic gradient descent.
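A minimal Python sketch of this procedure for the linear case with |A_t| = 1 (one stochastic step plus the optional projection); the function name pegasos and its arguments are illustrative, not taken from the paper's code:

    import numpy as np

    def pegasos(X, y, lam, T, rng=np.random.default_rng(0)):
        """X: (m, n) data matrix, y: (m,) labels in {-1, +1}."""
        m, n = X.shape
        w = np.zeros(n)
        for t in range(1, T + 1):
            i = rng.integers(m)            # pick one random example (|A_t| = 1)
            eta = 1.0 / (lam * t)          # step size eta_t = 1/(lambda*t)
            if y[i] * X[i].dot(w) < 1.0:   # margin violated: hinge loss active
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
            else:                          # only the regularizer contributes
                w = (1 - eta * lam) * w
            norm = np.linalg.norm(w)       # project onto ball of radius 1/sqrt(lam)
            if norm > 0:
                w *= min(1.0, 1.0 / (np.sqrt(lam) * norm))
        return w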

10 Run-Time of Pegasos
Choosing |A_t| = 1 and a linear kernel over R^n, the run-time required for Pegasos to find an ε-accurate solution with probability 1−δ is Õ(n/(λε)).
The run-time does not depend on the number of examples; it depends on the "difficulty" of the problem (λ and ε).

11 Formal Properties
Definition: w is ε-accurate if f(w) − min_w′ f(w′) ≤ ε.
Theorem 1: Pegasos finds an ε-accurate solution w.p. ≥ 1−δ after at most Õ(1/(δλε)) iterations.
Theorem 2: Pegasos finds log(1/δ) candidate solutions such that, w.p. ≥ 1−δ, at least one of them is ε-accurate, after Õ(log(1/δ)/(λε)) iterations.

12 Proof Sketch
A second look at the update step: with η_t = 1/(λt), the update
w_{t+1} = (1 − η_t λ) w_t + η_t 1[y⟨w_t, x⟩ < 1] y x
is exactly a subgradient step w_{t+1} = w_t − η_t ∇_t on the instantaneous objective f(w; i_t) = (λ/2)||w||² + ℓ(w; (x, y)), with ∇_t = λ w_t − 1[y⟨w_t, x⟩ < 1] y x.

13 Proof Sketch
Denote by f(w; i_t) the instantaneous objective at iteration t. The logarithmic-regret bound for online convex programming (OCP) with strongly convex functions bounds the average instantaneous objective of the iterates. Take expectation over the random choice of examples; since f(w_r) − f(w*) ≥ 0 for a randomly chosen iterate w_r, Markov's inequality gives that w.p. ≥ 1−δ the bound degrades only by a factor of 1/δ. Finally, amplify the confidence by running log(1/δ) times and keeping the best solution.
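The chain of inequalities behind this sketch, reconstructed from the analysis in [1] (a sketch, not the slide's exact formulas):

    % Logarithmic regret for OCP with lambda-strongly convex functions:
    \[ \frac{1}{T}\sum_{t=1}^{T} f(w_t; i_t) - \min_{w} \frac{1}{T}\sum_{t=1}^{T} f(w; i_t) \le O\!\left(\frac{\log T}{\lambda T}\right) \]
    % Taking expectation over the random examples, with w_r a uniformly random iterate:
    \[ \mathbb{E}\bigl[f(w_r)\bigr] - f(w^\star) \le O\!\left(\frac{\log T}{\lambda T}\right) \]
    % Since f(w_r) - f(w^\star) \ge 0, Markov's inequality gives, w.p. at least 1 - \delta:
    \[ f(w_r) - f(w^\star) \le O\!\left(\frac{\log T}{\delta \lambda T}\right) \]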

14 Proof Sketch

15 Proof Sketch
A function f is called λ-strongly convex if f(w) − (λ/2)||w||² is a convex function. In particular, the Pegasos objective is λ-strongly convex, since the hinge-loss term is convex.

16 Proof Sketch

17 Proof Sketch

18 Experiments
3 datasets (provided by Joachims):
Reuters CCAT (800k examples, 47k features)
Physics ArXiv (62k examples, 100k features)
Covertype (581k examples, 54 features)
4 competing algorithms:
SVM-Light (Joachims)
SVM-Perf (Joachims '06)
Norma (Kivinen, Smola, Williamson '02)
Zhang '04 (stochastic gradient descent)

19 Training Time (in seconds)

                   Pegasos   SVM-Perf   SVM-Light
    Reuters              2         77      20,075
    Covertype            6         85      25,514
    Astro-Physics                   5          80

20 Compare to Norma (on Physics)
(plots: objective value and test error)

21 Compare to Zhang (on Physics)
(plot: objective value)
But tuning the parameter is more expensive than learning…

22 Effect of k = |A_t| when T is fixed
(plot: objective value)

23 Effect of k = |A_t| when kT is fixed
(plot: objective value)

24 Bias term
Popular approaches, each with a drawback (a sketch of the first one follows this list):
1. Increase the dimension of x by a constant feature. Con: we "pay" for b in the regularization term.
2. Calculate subgradients w.r.t. w and w.r.t. b. Con: convergence rate degrades to 1/ε².
3. Define the loss jointly over the whole batch A_t. Con: |A_t| needs to be large.
4. Search for b in an outer loop. Con: each evaluation of the objective costs 1/ε².
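A minimal Python sketch of approach 1; the toy data and the reference to the earlier pegasos() sketch are illustrative assumptions:

    import numpy as np

    X = np.random.default_rng(0).normal(size=(5, 3))   # toy data
    # Append a constant feature so the bias b is learned as the last
    # coordinate of w; note b is then also shrunk by the regularizer
    # (the stated con of this approach).
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
    # train with the pegasos() sketch above on X_aug; w[-1] plays the role of b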

25 Outline
Review of SVM optimization
The Pegasos algorithm
Multi-Class Pegasos on a Budget
Further work

26 Multi-class SVM (Crammer & Singer, 2001)
Multi-class model: one weight vector w_i per class; predict ŷ = argmax_{i ∈ Y} ⟨w_i, x⟩.

27 Multi-class SVM (Crammer & Singer, 2001)
Multi-class SVM objective function:
min_W (λ/2)||W||² + (1/m) Σ_{i=1}^m ℓ(W; (x_i, y_i)),
where W = (w_1, …, w_k) stacks the per-class weight vectors and ||W||² = Σ_j ||w_j||², and the multi-class hinge-loss function is defined as:
ℓ(W; (x, y)) = max{0, 1 + max_{r ≠ y} ⟨w_r, x⟩ − ⟨w_y, x⟩},
where y is the correct class of x.

28 Multi-class Pegasos
Use the instantaneous objective function: f(W; i_t) = (λ/2)||W||² + ℓ(W; (x_t, y_t)).
Multi-class Pegasos works by iteratively executing the two-step update:
Step 1 (subgradient step): W_{t+1/2} = (1 − η_t λ) W_t + η_t Δ_t,
where η_t = 1/(λt), r_t = argmax_{r ≠ y_t} ⟨w_r, x_t⟩ is the most violating class, and Δ_t is the negative subgradient of the loss.

29 Multi-class Pegasos
If the loss is equal to zero, then: W_{t+1/2} = (1 − η_t λ) W_t.
Else: W_{t+1/2} = (1 − η_t λ) W_t + η_t Δ_t, where Δ_t adds x_t to the row w_{y_t} and subtracts x_t from the row w_{r_t}.
Step 2: project the weights W_{t+1/2} into the closed convex set B = {W : ||W|| ≤ 1/√λ}:
W_{t+1} = min{1, (1/√λ)/||W_{t+1/2}||} · W_{t+1/2}.
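A Python sketch of one such two-step iteration for the linear case; the function name and the stacked weight matrix W of shape (k, n) are illustrative assumptions, not the paper's code:

    import numpy as np

    def multiclass_pegasos_step(W, x, y, lam, t):
        """One two-step update: subgradient step, then projection."""
        eta = 1.0 / (lam * t)
        scores = W.dot(x)
        others = scores.copy()
        others[y] = -np.inf
        r = int(np.argmax(others))            # most violating wrong class r_t
        loss = max(0.0, 1.0 + scores[r] - scores[y])
        W = (1.0 - eta * lam) * W             # Step 1: shrink all class vectors
        if loss > 0.0:                        # push toward y, away from r
            W[y] += eta * x
            W[r] -= eta * x
        norm = np.linalg.norm(W)              # Step 2: project into
        if norm > 0:                          # {W : ||W|| <= 1/sqrt(lam)}
            W *= min(1.0, 1.0 / (np.sqrt(lam) * norm))
        return W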

30 Budgeted Multi-Class Pegasos

31 Budget Maintenance Strategies
Budget maintenance through removal: the optimal removal always selects the oldest SV.
Budget maintenance through projection: project the removed SV onto all the remaining SVs, which results in smaller weight degradation.
Budget maintenance through merging: merge two SVs into a newly created one; the total cost of finding the optimal merging of the m-th and n-th SV is O(1).
A sketch of the removal strategy follows.
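A minimal Python sketch of budget maintenance through removal, assuming the SVs are kept in arrival order; the deque bookkeeping and function name are illustrative assumptions:

    from collections import deque

    def maintain_budget(svs, new_sv, budget):
        """svs: deque of (x, alpha) pairs, ordered oldest -> newest."""
        svs.append(new_sv)        # admit the new support vector
        if len(svs) > budget:     # budget exceeded:
            svs.popleft()         # remove the oldest SV (the optimal choice
                                  # under Pegasos' uniform weight decay)
        return svs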

32 Experiments

33 Outline
Review of SVM optimization
The Pegasos algorithm
Multi-Class Pegasos on a Budget
Further work

34 Further work
Distribution-aware Pegasos?
Online structural regularized SVM?

35 Thanks! Q&A


