1
Learning to Segment with Diverse Data M. Pawan Kumar Stanford University
2
Semantic Segmentation car road grass tree sky
3
Segmentation Models (car, road, grass, tree, sky). MODEL: image x, labeling y, parameters w, distribution P(x,y; w). Learn accurate parameters w; predict y* = argmax_y P(x,y; w). Since P(x,y; w) ∝ exp(-E(x,y; w)), equivalently y* = argmin_y E(x,y; w).
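The energy-based formulation above can be sketched in a few lines. This is a minimal, hypothetical Python model (not from the talk): the energy is a dot product of parameters with joint features, and y* is found by brute-force enumeration; real segmentation models use graph cuts or LP relaxations instead.

```python
import itertools

def energy(w, psi):
    # E(x, y; w) = w^T Psi(x, y): dot product of parameters and joint features
    return sum(wi * pi for wi, pi in zip(w, psi))

def predict(w, features_of, labels, n_sites):
    """y* = argmin_y E(x, y; w), by exhaustive enumeration of labelings.

    features_of(y) returns the joint feature vector Psi(x, y) for a labeling y
    of n_sites sites; feasible only for toy-sized problems.
    """
    best_y, best_e = None, float("inf")
    for y in itertools.product(labels, repeat=n_sites):
        e = energy(w, features_of(y))
        if e < best_e:
            best_y, best_e = y, e
    return best_y, best_e
```

For instance, with features = [number of sites given label 1, number of neighbouring disagreements], positive weights favour smooth, mostly label-0 labelings.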
4
Fully Supervised Data
5
“Fully” Supervised Data Specific foreground classes, generic background class PASCAL VOC Segmentation Datasets
6
“Fully” Supervised Data Specific background classes, generic foreground class Stanford Background Datasets
7
Supervised Learning: generic classes, burdensome annotation. J. Gonfaus et al. Harmony Potentials for Joint Classification and Segmentation. CVPR, 2010. S. Gould et al. Multi-Class Segmentation with Relative Location Prior. IJCV, 2008. S. Gould et al. Decomposing a Scene into Geometric and Semantically Consistent Regions. ICCV, 2009. X. He et al. Multiscale Conditional Random Fields for Image Labeling. CVPR, 2004. S. Konishi et al. Statistical Cues for Domain Specific Image Segmentation with Performance Analysis. CVPR, 2000. L. Ladicky et al. Associative Hierarchical CRFs for Object Class Image Segmentation. ICCV, 2009. F. Li et al. Object Recognition as Ranking Holistic Figure-Ground Hypotheses. CVPR, 2010. J. Shotton et al. TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation. ECCV, 2006. J. Verbeek et al. Scene Segmentation with Conditional Random Fields Learned from Partially Labeled Images. NIPS, 2007. Y. Yang et al. Layered Object Detection for Multi-Class Segmentation. CVPR, 2010.
8
Weakly Supervised Data: bounding boxes for objects. Thousands of images: PASCAL VOC Detection Datasets.
9
Weakly Supervised Data: image-level labels (“Car”). Thousands of images: ImageNet, Caltech…
10
Weakly Supervised Learning: binary segmentation, limited data. B. Alexe et al. ClassCut for Unsupervised Class Segmentation. ECCV, 2010. H. Arora et al. Unsupervised Segmentation of Objects Using Efficient Learning. CVPR, 2007. L. Cao et al. Spatially Coherent Latent Topic Model for Concurrent Segmentation and Classification of Objects and Scenes. ICCV, 2007. J. Winn et al. LOCUS: Learning Object Classes with Unsupervised Segmentation. ICCV, 2005.
11
Diverse Data “Car”
12
Diverse Data Learning Avoid “generic” classes Take advantage of – Cleanliness of supervised data – Vast availability of weakly supervised data
13
Outline Model Energy Minimization Parameter Learning Results Future Work
14
Region-Based Model (Gould, Fulton and Koller, ICCV 2009): pixels grouped into regions. Unary Potential: θ_r(i) = w_i^T Ψ_r(x), where Ψ_r(x) are features extracted from region r of image x. For example, Ψ_r(x) = average [R G B], with w_water = [0 0 -10], w_grass = [0 -10 0]. Pairwise Potential: θ_rr'(i,j) = w_ij^T Ψ_rr'(x). For example, Ψ_rr'(x) = constant > 0, with w_"car above ground" << 0 and w_"ground above car" >> 0.
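The slide's RGB example can be made concrete with a small sketch (assumed, not the talk's code): the unary potential is just a dot product between a per-class weight vector and the region features, and lower energy means a better class fit.

```python
def unary_potential(w_class, psi_region):
    # theta_r(i) = w_i^T Psi_r(x): one energy value per (region, class) pair
    return sum(wi * pi for wi, pi in zip(w_class, psi_region))

# Class weights from the slide; features Psi_r(x) = average [R G B] of region r
weights = {"water": [0, 0, -10], "grass": [0, -10, 0]}

def best_class(avg_rgb):
    # These are energies, not probabilities: pick the class of MINIMUM energy
    return min(weights, key=lambda c: unary_potential(weights[c], avg_rgb))
```

A bluish region (high B) gets lowest energy under w_water; a greenish one under w_grass.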
15
Region-Based Model: E(x,y) ∝ -log P(x,y) = Unaries + Pairwise, i.e. E(x,y) = w^T Ψ(x,y). Two questions: best segmentation y of an image x? Accurate parameters w?
16
Outline Model Energy Minimization Parameter Learning Results Future Work Kumar and Koller, CVPR 2010
17
Move-Making: Besag. On the Statistical Analysis of Dirty Pictures, JRSS, 1986. Boykov et al. Fast Approximate Energy Minimization via Graph Cuts, PAMI, 2001. Komodakis et al. Fast, Approximately Optimal Solutions for Single and Dynamic MRFs, CVPR, 2007. Lempitsky et al. Fusion Moves for Markov Random Field Optimization, PAMI, 2010. Message-Passing: T. Minka. Expectation Propagation for Approximate Bayesian Inference, UAI, 2001. Murphy. Loopy Belief Propagation: An Empirical Study, UAI, 1999. J. Winn et al. Variational Message Passing, JMLR, 2005. J. Yedidia et al. Generalized Belief Propagation, NIPS, 2001. Convex Relaxations: Chekuri et al. Approximation Algorithms for Metric Labeling, SODA, 2001. M. Goemans et al. Improved Approximate Algorithms for Maximum-Cut, JACM, 1995. M. Muramatsu et al. A New SOCP Relaxation for Max-Cut, JORJ, 2003. Ravikumar et al. QP Relaxations for Metric Labeling, ICML, 2006. Hybrid Algorithms: K. Alahari et al. Dynamic Hybrid Algorithms for MAP Inference, PAMI 2010. P. Kohli et al. On Partial Optimality in Multilabel MRFs, ICML, 2008. C. Rother et al. Optimizing Binary MRFs via Extended Roof Duality, CVPR, 2007. Which one is the best relaxation?
18
Convex Relaxations over time: LP (1976), SOCP (2003), QP (2006). We expect tightness to improve with each newer relaxation, but Kumar, Kolmogorov and Torr (NIPS, 2007) show the LP relaxation is provably better (tighter) than the QP and SOCP relaxations. Use LP!!
19
Energy Minimization Find Regions Find Labels Fixed Regions LP Relaxation
20
Energy Minimization: find regions, find labels. The number of putative regions is super-exponential in the number of pixels. Can we prune regions? Good region: homogeneous appearance and texture. Bad region: inhomogeneous appearance and texture. Use low-level segmentation to propose candidate regions.
21
Energy Minimization Spatial Bandwidth = 10 Mean-Shift Segmentation
22
Energy Minimization Spatial Bandwidth = 20 Mean-Shift Segmentation
23
Energy Minimization Spatial Bandwidth = 30 Mean-Shift Segmentation
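Mean-shift itself is easy to illustrate in one dimension. This is a generic sketch, not the talk's implementation: each point hill-climbs to the mean of its bandwidth neighbourhood, points that settle on the same mode form one segment, and a larger bandwidth yields fewer, larger segments, mirroring the bandwidth sweep on the slides.

```python
def mean_shift_1d(points, bandwidth, iters=50):
    """Flat-kernel mean shift: repeatedly move each point to the mean of its
    neighbours within `bandwidth`; the fixed points are the density modes."""
    modes = []
    for x in points:
        for _ in range(iters):
            neigh = [p for p in points if abs(p - x) <= bandwidth]
            x = sum(neigh) / len(neigh)
        modes.append(round(x, 3))   # round so equal modes compare equal
    return modes
```

Points [1.0, 1.1, 1.2, 5.0, 5.1] yield two modes at a small bandwidth and a single mode once the bandwidth spans the whole set.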
24
Energy Minimization “Combine” Multiple Segmentations Car
25
Dictionary of Regions: select regions and assign classes. Variables y_r(i) ∈ {0,1} for i = 0 (not selected), 1, 2, …, C. Constraints: selected regions cover the entire image; no two selected regions overlap. Objective: min Σ_r,i θ_r(i) y_r(i) + Σ_rr',ij θ_rr'(i,j) y_r(i) y_r'(j). Kumar and Koller, CVPR 2010: efficient dual decomposition (Komodakis and Paragios, CVPR, 2009).
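On a toy dictionary the selection problem can be brute-forced to make the constraints concrete. This hypothetical sketch keeps only unary terms and uses exhaustive search; the talk's method also handles pairwise terms and solves a tight LP relaxation via dual decomposition.

```python
import itertools

def select_regions(regions, unary, n_pixels):
    """Brute-force the region-selection program on a tiny dictionary:
    choose regions and classes minimizing total unary energy, subject to
    (a) selected regions cover every pixel, (b) no two selected regions overlap.
    regions: list of sets of pixel ids; unary[r][c]: energy of class c for region r.
    """
    best, best_e = None, float("inf")
    for k in range(1, len(regions) + 1):
        for subset in itertools.combinations(range(len(regions)), k):
            pix = [p for r in subset for p in regions[r]]
            if len(pix) != len(set(pix)):          # two selected regions overlap
                continue
            if set(pix) != set(range(n_pixels)):   # selection is not a cover
                continue
            # With unary terms only, each region takes its cheapest class
            labels = [min(unary[r], key=unary[r].get) for r in subset]
            e = sum(unary[r][c] for r, c in zip(subset, labels))
            if e < best_e:
                best, best_e = list(zip(subset, labels)), e
    return best, best_e
```

With candidate regions that overlap, the non-overlap constraint forces a consistent partition even when a single big region is individually cheap.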
26
Comparison of energy and accuracy: IMAGE / GOULD / OUR. Parameters learned using Gould, Fulton and Koller, ICCV 2009. Statistically significant improvement (paired t-test).
27
Outline Model Energy Minimization Parameter Learning Results Future Work Kumar, Turki, Preston and Koller, In Submission
28
Supervised Learning: training pairs (x_1, y_1), (x_2, y_2), … with P(x,y) ∝ exp(-E(x,y)) = exp(-w^T Ψ(x,y)). Learning makes P(y|x_1) peak at y_1 and P(y|x_2) peak at y_2. Well-studied problem, efficient solutions.
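For intuition, here is a hypothetical structured-perceptron stand-in for that supervised step (the talk's actual learning is likelihood/max-margin based, and all names here are assumptions): whenever the current argmin labeling differs from the ground truth, shift w so the ground truth's energy drops relative to the prediction's.

```python
import itertools

def predict(w, x, feat, labels, n_sites):
    # y* = argmin_y w^T Psi(x, y), by exhaustive search (toy sizes only)
    return min(itertools.product(labels, repeat=n_sites),
               key=lambda y: sum(a * b for a, b in zip(w, feat(x, y))))

def fit_supervised(data, feat, labels, n_sites, epochs=200):
    """Structured perceptron for E(x,y) = w^T Psi(x,y): on a mistake, move w
    by Psi(prediction) - Psi(ground truth), lowering the ground truth's energy."""
    w = [0.0] * len(feat(data[0][0], data[0][1]))
    for _ in range(epochs):
        for x, y_gt in data:
            y_pred = predict(w, x, feat, labels, n_sites)
            if tuple(y_pred) != tuple(y_gt):
                w = [wi + fp - fg for wi, fp, fg
                     in zip(w, feat(x, y_pred), feat(x, y_gt))]
    return w
```

With features [label-observation mismatches, neighbour disagreements], the learned w reproduces the training labelings.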
29
Diverse Data Learning: image x, annotation a, missing (latent) information h. Generic Class Annotation.
30
Diverse Data Learning: image x, annotation a, missing (latent) information h. Bounding Box Annotation.
31
Diverse Data Learning: image x, annotation a = “Cow”, missing (latent) information h. Image-Level Annotation.
32
Learning with Missing Information. Expectation Maximization (computationally inefficient): A. Dempster et al. Maximum Likelihood from Incomplete Data via the EM Algorithm. JRSS, 1977. M. Jamshidian et al. Acceleration of the EM Algorithm by Using Quasi-Newton Methods. JRSS, 1997. R. Neal et al. A View of the EM Algorithm that Justifies Incremental, Sparse, and Other Variants. LGM, 1999. R. Sundberg. Maximum Likelihood Theory for Incomplete Data from an Exponential Family. SJS, 1974. Latent Support Vector Machine (hard EM; only requires an energy minimization algorithm): P. Felzenszwalb et al. A Discriminatively Trained, Multiscale, Deformable Part Model. CVPR, 2008. C.-N. Yu et al. Learning Structural SVMs with Latent Variables. ICML, 2009.
33
Latent SVM (Felzenszwalb et al., NIPS 2007; Yu et al., ICML 2009): min_w Σ_i ξ_i + λ||w||², s.t. min_{h_i} w^T Ψ(x_i, a_i, h_i) ≤ w^T Ψ(x_i, a, h) - Δ(a_i, a, h) + ξ_i. The energy of the ground-truth (under its best latent completion) must undercut the energy of every other labeling by a user-defined loss Δ(a_i, a, h), here the number of disagreements, up to slack ξ_i. The min over h_i makes this a difference-of-convex problem, solved by CCCP.
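To make the constraint concrete, here is a small assumed helper (feature map, loss, and variable ranges are all hypothetical) that computes the slack ξ_i for one training pair by enumerating annotations and latent values:

```python
def latent_slack(w, psi, x, a_gt, annotations, latents, delta):
    """xi_i for one example: the ground truth's best latent completion,
    min_h w^T Psi(x, a_gt, h), must undercut w^T Psi(x, a, h) by the loss
    Delta(a_gt, a, h) for every (a, h); the slack absorbs any violation."""
    energy = lambda a, h: sum(wi * p for wi, p in zip(w, psi(x, a, h)))
    e_gt = min(energy(a_gt, h) for h in latents)
    worst = max(delta(a_gt, a, h) + e_gt - energy(a, h)
                for a in annotations for h in latents)
    return max(0.0, worst)
```

A w that ranks the ground-truth annotation lowest in energy gets zero slack; flipping its sign makes the constraint maximally violated.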
34
CCCP (Felzenszwalb et al., NIPS 2007; Yu et al., ICML 2009). Start with an initial estimate w_0. Impute the latent variables: h_i = argmin_h w_t^T Ψ(x_i, a_i, h) (an energy minimization). Update w_{t+1} by solving the convex problem: min Σ_i ξ_i + λ||w||², s.t. w^T Ψ(x_i, a_i, h_i) - w^T Ψ(x_i, a, h) ≤ ξ_i - Δ(a_i, a, h).
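A runnable toy version of this alternation (illustrative only: the real step 2 is a structural-SVM solve, here replaced by a few subgradient steps, and every problem ingredient below is an assumption):

```python
def cccp(data, psi, annotations, latents, delta, dim,
         rounds=30, inner=10, lr=0.1, lam=0.01):
    """Toy CCCP for the latent SVM. data: [(x, a_gt)]; psi(x, a, h) -> features."""
    w = [0.0] * dim
    energy = lambda x, a, h: sum(wi * p for wi, p in zip(w, psi(x, a, h)))
    for _ in range(rounds):
        # Step 1: impute latents, h_i = argmin_h E(x_i, a_i, h) at current w
        imputed = [min(latents, key=lambda h: energy(x, a, h)) for x, a in data]
        # Step 2: subgradient steps on the now-convex hinge upper bound
        for _ in range(inner):
            grad = [2 * lam * wi for wi in w]
            for (x, a_gt), h_gt in zip(data, imputed):
                # most violated (a, h): maximizes Delta - E (loss-augmented inference)
                a_v, h_v = max(((a, h) for a in annotations for h in latents),
                               key=lambda ah: delta(a_gt, *ah) - energy(x, *ah))
                if delta(a_gt, a_v, h_v) + energy(x, a_gt, h_gt) - energy(x, a_v, h_v) > 0:
                    grad = [g + pg - pv for g, pg, pv in
                            zip(grad, psi(x, a_gt, h_gt), psi(x, a_v, h_v))]
            w = [wi - lr * g for wi, g in zip(w, grad)]
    return w
```

On a separable toy problem the annotation-mismatch weight grows until the margin constraints are satisfied.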
35
Generic Class Annotation Generic background with specific background Generic foreground with specific foreground
36
Bounding Box Annotation Every row “contains” the object Every column “contains” the object
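One way to read this condition as code (a hypothetical helper; the talk encodes it as constraints on the latent segmentation during learning, not as a post-hoc check):

```python
def box_constraint_ok(mask, top, left, bottom, right):
    """The slide's bounding-box condition on a binary object mask: inside the
    box (inclusive bounds), every row and every column must contain at least
    one object pixel; otherwise the box would not be tight."""
    rows_ok = all(any(mask[r][c] for c in range(left, right + 1))
                  for r in range(top, bottom + 1))
    cols_ok = all(any(mask[r][c] for r in range(top, bottom + 1))
                  for c in range(left, right + 1))
    return rows_ok and cols_ok
```

A plus-shaped object fills its box's rows and columns; a vertical bar leaves box columns empty and fails the check.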
37
Image Level Annotation The image “contains” the object “Cow”
38
CCCP (Felzenszwalb et al., NIPS 2007; Yu et al., ICML 2009). Start with an initial estimate w_0. Impute h_i = argmin_h w_t^T Ψ(x_i, a_i, h) (energy minimization). Update w_{t+1} by solving the convex problem: min Σ_i ξ_i + λ||w||², s.t. w^T Ψ(x_i, a_i, h_i) - w^T Ψ(x_i, a, h) ≤ ξ_i - Δ(a_i, a, h). Bad Local Minimum!!
39
White sky Grey road EASY Green grass
40
White sky Blue water Green grass EASY
41
Cow? Cat? Horse? HARD
42
Red Sky? Black Mountain? All images are not equal HARD
43
Real Numbers, Imaginary Numbers, e^{iπ} + 1 = 0. “Math is for losers!!”
44
Real Numbers, Imaginary Numbers, e^{iπ} + 1 = 0. “Euler was a genius!!” Self-Paced Learning.
45
Easy vs. Hard Easy for human Easy for machine Simultaneously estimate easiness and parameters
46
Self-Paced Learning (Kumar, Packer and Koller, NIPS 2010). Start with an initial estimate w_0. Impute h_i = argmin_h w_t^T Ψ(x_i, a_i, h). Update w_{t+1} by solving: min Σ_i v_i ξ_i + λ||w||² - Σ_i v_i / K, s.t. w^T Ψ(x_i, a_i, h_i) - w^T Ψ(x_i, a, h) ≤ ξ_i - Δ(a_i, a, h), with v_i = 1 for easy examples and v_i = 0 for hard examples; v_i ∈ {0,1} is relaxed to v_i ∈ [0,1]. This is a biconvex optimization, solved by alternate convex search.
47
Self-Paced Learning (Kumar, Packer and Koller, NIPS 2010). Start with an initial estimate w_0. Impute h_i = argmin_h w_t^T Ψ(x_i, a_i, h) (energy minimization). Update w_{t+1} by solving the biconvex problem: min Σ_i v_i ξ_i + λ||w||² - Σ_i v_i / K, s.t. w^T Ψ(x_i, a_i, h_i) - w^T Ψ(x_i, a, h) ≤ ξ_i - Δ(a_i, a, h). Decrease K between iterations so that harder examples are gradually selected. As simple as CCCP!!
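The alternation is easy to demonstrate on a problem much simpler than segmentation. This assumed sketch does self-paced 1-D least squares: examples whose loss is below 1/K count as easy, the model is refit on the easy set, and decreasing K raises the threshold 1/K so harder examples enter later; here that keeps an outlier out until the fit is already good.

```python
def fit(xs, ys, v):
    # Weighted least squares through the origin, restricted to selected examples
    num = sum(vi * x * y for vi, x, y in zip(v, xs, ys))
    den = sum(vi * x * x for vi, x in zip(v, xs))
    return num / den

def self_paced_fit(xs, ys, k0=1.0, anneal=0.5, rounds=4):
    """Alternate convex search: fix w and pick easy examples (loss < 1/K);
    fix the selection v and refit w; then decrease K to admit harder examples."""
    v = [1] * len(xs)
    w, k = fit(xs, ys, v), k0
    for _ in range(rounds):
        losses = [(y - w * x) ** 2 for x, y in zip(xs, ys)]
        v = [1 if l < 1.0 / k else 0 for l in losses]
        if not any(v):          # guard: never fit on an empty easy set
            v = [1] * len(xs)
        w = fit(xs, ys, v)
        k *= anneal             # threshold 1/K grows round by round
    return w
```

On data with slope 2 plus one outlier, the plain all-data fit is pulled off the true slope while the self-paced fit recovers it.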
48
Self-Paced Learning (Kumar, Packer and Koller, NIPS 2010). Image classification: image x, annotation a = “Deer”, latent h; test error plot. Motif finding: a = -1 or +1, h = motif position; test error plot.
49
Learning to Segment CCCP SPL
50
Learning to Segment CCCP SPL Iteration 1
51
Learning to Segment CCCP SPL Iteration 3
52
Learning to Segment CCCP SPL Iteration 6
53
Learning to Segment CCCP SPL
54
Learning to Segment CCCP SPL Iteration 1
55
Learning to Segment CCCP SPL Iteration 2
56
Learning to Segment CCCP SPL Iteration 4
57
Outline Model Energy Minimization Parameter Learning Results Future Work
58
Dataset: Stanford Background (7 background classes + generic foreground class) + PASCAL VOC 2009 (20 foreground classes + generic background class).
59
Dataset splits. Stanford Background: train 572 images, validation 53 images, test 90 images. PASCAL VOC 2009: train 1274 images, validation 225 images, test 750 images.
60
Baseline Results for SBD (Gould, Fulton and Koller, ICCV 2009). Overlap scores per class: Foreground 36.0%, Road 70.1%, Mountain 0%. CLL Average: 53.1%.
61
Improvement for SBD (SPL vs. CLL). Road 75.5% (+5.4), Foreground 39.1% (+3.1). CLL Average 53.1%, SPL Average 54.3%. Input, CLL, and SPL segmentations shown.
62
Baseline Results for VOC (Gould, Fulton and Koller, ICCV 2009). Overlap scores per class: Aeroplane 32.1%, TV 23.6%, Bird 9.5%. CLL Average: 24.7%.
63
Improvement for VOC (SPL vs. CLL). Aeroplane 41.4% (+9.3), TV 31.3% (+7.7). CLL Average 24.7%, SPL Average 26.9%. Input, CLL, and SPL segmentations shown.
64
Weakly Supervised Dataset: ImageNet (image-level data) + VOC Detection 2009 (bounding-box data); training sets of 1564 and 1000 images.
65
Improvement for SBD with weak data (All vs. Generic). Foreground 41.3% (+2.2), Water 60.1% (+5.0). Generic Average 54.3%, All Average 55.3%. Input, Generic, and All segmentations shown.
66
Improvement for VOC with weak data (All vs. Generic). Motorbike 40.4% (+6.9), Person 42.2% (+4.9). Generic Average 26.9%, All Average 28.8%. Input, Generic, and All segmentations shown.
67
Improvement over CCCP (difference SPL-CCCP per class). VOC: CCCP 24.7%, SPL 28.8%. SBD: CCCP 53.8%, SPL 55.3%. No improvement with CCCP. SPL is essential!!
68
Summary. Energy minimization for region-based model: tight LP relaxation of integer program. Self-paced learning: simultaneously select examples and learn parameters. Even weak annotation is useful.
69
Outline Model Energy Minimization Parameter Learning Results Future Work
70
Learning with Diverse Data: noise in labels; size of problem.
71
Learning Diverse Tasks Object Detection Action Recognition Pose Estimation 3D Reconstruction
72
Daphne Koller, Stephen Gould, Ben Packer, Haithem Turki, Dan Preston, Andrew Zisserman, Phil Torr, Vladimir Kolmogorov
73
Summary. Energy minimization for region-based model: tight LP relaxation of integer program. Self-paced learning: simultaneously select examples and learn parameters. Even weak annotation is useful. Questions?