1
Learning Structural SVMs with Latent Variables Xionghao Liu
2
Annotation Mismatch. Input x, annotation y, latent variable h. Example: action classification, with y = “jumping”. There is a mismatch between the desired and the available annotations: the exact value of the latent variable is not “important”, and the desired output at test time is y.
3
Outline – Annotation Mismatch: Latent SVM, Optimization, Practice, Extensions. References: Andrews et al., NIPS 2001; Smola et al., AISTATS 2005; Felzenszwalb et al., CVPR 2008; Yu and Joachims, ICML 2009.
4
Weakly Supervised Data. Input x, output y ∈ {-1, +1}, hidden variable h. Example: x with y = +1 and hidden h.
5
Weakly Supervised Classification. Feature vector Φ(x,h); joint feature vector Ψ(x,y,h).
6
Weakly Supervised Classification. Feature vector Φ(x,h); joint feature vector Ψ(x,+1,h) = [Φ(x,h); 0].
7
Weakly Supervised Classification. Feature vector Φ(x,h); joint feature vector Ψ(x,-1,h) = [0; Φ(x,h)].
8
Weakly Supervised Classification. Feature vector Φ(x,h); joint feature vector Ψ(x,y,h). Scoring function f: Ψ(x,y,h) → (-∞, +∞). Prediction optimizes the score over all possible y and h.
9
Latent SVM. Scoring function: w^T Ψ(x,y,h), with parameters w. Prediction: (y(w), h(w)) = argmax_{y,h} w^T Ψ(x,y,h).
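A minimal sketch of this scoring and prediction rule for the binary weakly supervised case above, assuming a user-supplied feature function phi(x, h) and a small finite set of latent values; the names joint_feature, predict, phi, and latent_values are illustrative and not from the slides.

```python
import numpy as np

def joint_feature(phi, x, y, h, d):
    """Joint feature vector Psi(x, y, h): phi(x, h) occupies the first
    block when y = +1 and the second block when y = -1, as on the
    earlier joint-feature slides."""
    psi = np.zeros(2 * d)
    block = 0 if y == +1 else 1
    psi[block * d:(block + 1) * d] = phi(x, h)
    return psi

def predict(w, phi, x, latent_values, d):
    """Latent SVM prediction: maximize w^T Psi(x, y, h) jointly over y and h."""
    best = (-np.inf, None, None)
    for y in (+1, -1):
        for h in latent_values:
            score = w @ joint_feature(phi, x, y, h, d)
            if score > best[0]:
                best = (score, y, h)
    return best[1], best[2]   # (y(w), h(w))
```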
10
Learning Latent SVM. Training data: {(x_i, y_i), i = 1, 2, …, n} (annotation mismatch). Empirical risk minimization: min_w Σ_i Δ(y_i, y_i(w)). No restriction on the loss function Δ.
11
Learning Latent SVM. The empirical risk min_w Σ_i Δ(y_i, y_i(w)) is non-convex and the parameters cannot be regularized directly, so we look for a regularization-sensitive upper bound.
12
Learning Latent SVM. Add and subtract the score of the prediction: Δ(y_i, y_i(w)) = w^T Ψ(x_i, y_i(w), h_i(w)) + Δ(y_i, y_i(w)) - w^T Ψ(x_i, y_i(w), h_i(w)).
13
Learning Latent SVM (y i, y i (w)) w T (x i,y i (w),h i (w)) + - max h i w T (x i,y i,h i ) y(w),h(w) = argmax y,h w T Ψ(x,y,h)
14
Learning Latent SVM. min_w ||w||² + C Σ_i ξ_i, s.t. max_{y,h} [w^T Ψ(x_i,y,h) + Δ(y_i,y)] - max_{h_i} w^T Ψ(x_i,y_i,h_i) ≤ ξ_i. The parameters can now be regularized. Is this also convex?
15
Learning Latent SVM. min_w ||w||² + C Σ_i ξ_i, s.t. max_{y,h} [w^T Ψ(x_i,y,h) + Δ(y_i,y)] - max_{h_i} w^T Ψ(x_i,y_i,h_i) ≤ ξ_i. The first max term is convex in w, and it is followed by the negative of another convex term, so this is a difference-of-convex (DC) program.
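To make the DC structure explicit, the constraint's left-hand side splits as below (a restatement in the deck's notation, not an extra assumption):

```latex
\underbrace{\max_{y,h}\bigl[\, w^{\top}\Psi(x_i,y,h) + \Delta(y_i,y) \,\bigr]}_{\text{convex in } w \text{ (max of affine functions)}}
\;+\;
\underbrace{\Bigl(-\max_{h_i} w^{\top}\Psi(x_i,y_i,h_i)\Bigr)}_{\text{concave in } w}
```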
16
Learning Recap. Scoring function: w^T Ψ(x,y,h). Prediction: (y(w), h(w)) = argmax_{y,h} w^T Ψ(x,y,h). Learning: min_w ||w||² + C Σ_i ξ_i, s.t. w^T Ψ(x_i,y,h) + Δ(y_i,y) - max_{h_i} w^T Ψ(x_i,y_i,h_i) ≤ ξ_i for all y, h.
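As a concrete illustration of the recap, the sketch below evaluates the per-example slack ξ_i for a fixed w by exhaustive enumeration; joint_feature(x, y, h), delta, labels, and latent_values are assumed helpers (for instance the earlier joint feature map with phi bound in), not part of the original slides.

```python
def slack(w, joint_feature, delta, x_i, y_i, labels, latent_values):
    """xi_i = max_{y,h} [w^T Psi(x_i,y,h) + Delta(y_i,y)]
              - max_{h_i} w^T Psi(x_i,y_i,h_i)."""
    # Loss-augmented inference over all (y, h) pairs.
    loss_augmented = max(
        w @ joint_feature(x_i, y, h) + delta(y_i, y)
        for y in labels for h in latent_values
    )
    # Best latent completion of the ground-truth label y_i.
    best_completion = max(
        w @ joint_feature(x_i, y_i, h) for h in latent_values
    )
    # Nonnegative whenever Delta(y_i, y_i) = 0, since the maximization
    # above already includes the pair (y_i, argmax_h).
    return loss_augmented - best_completion
```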
17
Outline – Annotation Mismatch: Latent SVM, Optimization, Practice, Extensions.
18
Learning Latent SVM. min_w ||w||² + C Σ_i ξ_i, s.t. max_{y,h} [w^T Ψ(x_i,y,h) + Δ(y_i,y)] - max_{h_i} w^T Ψ(x_i,y_i,h_i) ≤ ξ_i, a difference-of-convex (DC) program.
19
Concave-Convex Procedure. Objective term: max_{y,h} [w^T Ψ(x_i,y,h) + Δ(y_i,y)] - max_{h_i} w^T Ψ(x_i,y_i,h_i). Step 1: construct a linear upper bound of the concave part.
20
Concave-Convex Procedure. Objective term: max_{y,h} [w^T Ψ(x_i,y,h) + Δ(y_i,y)] - max_{h_i} w^T Ψ(x_i,y_i,h_i). Step 2: optimize the resulting convex upper bound.
21
Concave-Convex Procedure. Objective term: max_{y,h} [w^T Ψ(x_i,y,h) + Δ(y_i,y)] - max_{h_i} w^T Ψ(x_i,y_i,h_i). Step 1 again: construct a new linear upper bound of the concave part at the updated estimate.
22
Concave-Convex Procedure. Objective term: max_{y,h} [w^T Ψ(x_i,y,h) + Δ(y_i,y)] - max_{h_i} w^T Ψ(x_i,y_i,h_i). Repeat the two steps until convergence.
23
Concave-Convex Procedure. Objective term: max_{y,h} [w^T Ψ(x_i,y,h) + Δ(y_i,y)] - max_{h_i} w^T Ψ(x_i,y_i,h_i). How do we construct the linear upper bound?
24
Linear Upper Bound. Let w_t be the current estimate and h_i* = argmax_{h_i} w_t^T Ψ(x_i,y_i,h_i). Then -w^T Ψ(x_i,y_i,h_i*) ≥ -max_{h_i} w^T Ψ(x_i,y_i,h_i), so -w^T Ψ(x_i,y_i,h_i*) is a linear upper bound of the concave part.
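The argument behind this bound is one line (a restatement in the deck's notation): for every w,

```latex
\max_{h_i} w^{\top}\Psi(x_i, y_i, h_i) \;\ge\; w^{\top}\Psi(x_i, y_i, h_i^{*})
\quad\Longrightarrow\quad
-\,w^{\top}\Psi(x_i, y_i, h_i^{*}) \;\ge\; -\max_{h_i} w^{\top}\Psi(x_i, y_i, h_i),
```

with equality at w = w_t, where h_i* attains the max; the bound -w^T Ψ(x_i,y_i,h_i*) is linear in w, which is exactly what CCCP needs.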
25
CCCP for Latent SVM. Start with an initial estimate w_0. Repeat until convergence: (1) impute the latent variables, h_i* = argmax_{h_i ∈ H} w_t^T Ψ(x_i,y_i,h_i); (2) update w_{t+1} as the ε-optimal solution of min ||w||² + C Σ_i ξ_i, s.t. w^T Ψ(x_i,y_i,h_i*) - w^T Ψ(x_i,y,h) ≥ Δ(y_i,y) - ξ_i.
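A compact sketch of this CCCP loop, assuming the same illustrative joint_feature, delta, labels, and latent_values helpers as in the earlier sketches; for the inner ε-optimal structural-SVM update, which the slide leaves abstract, a plain subgradient loop on the convex upper bound stands in here, a simplification rather than the solver the slide refers to.

```python
import numpy as np

def cccp_latent_svm(data, joint_feature, delta, labels, latent_values,
                    dim, C=1.0, outer_iters=20, inner_iters=100, lr=1e-3):
    """Sketch of CCCP for a latent SVM.
    data: list of (x_i, y_i) pairs; joint_feature(x, y, h) -> ndarray of
    length dim; delta(y_true, y) -> loss value."""
    w = np.zeros(dim)                          # initial estimate w_0
    for _ in range(outer_iters):
        # Step 1: impute latent variables with the current estimate w_t.
        imputed = [max(latent_values,
                       key=lambda h: w @ joint_feature(x, y, h))
                   for x, y in data]
        # Step 2: approximately solve the convex structural-SVM problem
        #   min ||w||^2 + C * sum_i ( max_{y,h} [w^T Psi(x_i,y,h) + Delta(y_i,y)]
        #                             - w^T Psi(x_i, y_i, h_i*) )
        # via subgradient descent (stand-in for the eps-optimal solver).
        for _ in range(inner_iters):
            grad = 2.0 * w                     # gradient of ||w||^2
            for (x, y_true), h_star in zip(data, imputed):
                # Loss-augmented inference over (y, h).
                y_hat, h_hat = max(
                    ((y, h) for y in labels for h in latent_values),
                    key=lambda yh: w @ joint_feature(x, *yh) + delta(y_true, yh[0]))
                grad += C * (joint_feature(x, y_hat, h_hat)
                             - joint_feature(x, y_true, h_star))
            w -= lr * grad
    return w
```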
26
Thanks & QA