1
Learning to Segment from Diverse Data
M. Pawan Kumar, Daphne Koller, Haithem Turki, Dan Preston
2
Aim: Learn accurate parameters for a segmentation model
- Segmentation without generic foreground or background classes
- Train using both strongly and weakly supervised data
3
Data in Vision “Strong” Supervision “Car” “Weak” Supervision “One hand tied behind the back…”
4
Data for Vision “Car” “Strong” Supervision “Weak” Supervision
5
Types of Data: Specific foreground classes, generic background class (PASCAL VOC Segmentation Datasets)
6
Types of Data: Specific background classes, generic foreground class (Stanford Background Dataset)
7
Types of Data: Bounding boxes for objects (PASCAL VOC Detection Datasets). Thousands of freely available images. Current methods only use small, controlled datasets.
8
Types of Data: Image-level labels, e.g. “Car” (ImageNet, Caltech, …). Thousands of freely available images.
9
Types of Data: Noisy data from web search (Google Image, Flickr, Picasa, …). Millions of freely available images.
10
Outline: Region-based Segmentation Model, Problem Formulation, Inference, Results
11
Region-based Segmentation Model: Object Models, Pixels, Regions
12
Outline: Region-based Segmentation Model, Problem Formulation, Inference, Results
13
Problem Formulation: Treat missing information as latent variables. Image $x$, annotation $y$, complete annotation $(y, h)$. Joint feature vector $\Psi(x, y, h)$: region features, detection features, pairwise contrast, pairwise context.
14
Problem Formulation: Treat missing information as latent variables. Image $x$, annotation $y$, complete annotation $(y, h)$. Latent Structural SVM: $(y^*, h^*) = \arg\max_{(y, h)} w^\top \Psi(x, y, h)$. Trained by minimizing overlap loss $\Delta$.
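The prediction rule on this slide can be illustrated with a minimal sketch that enumerates all (y, h) pairs and keeps the highest-scoring one. The function names, the toy joint feature vector, and the tiny label and latent sets are assumptions for illustration only; the slides do not specify how the argmax is computed at this point, and exhaustive enumeration is only feasible for toy problems.

```python
import itertools
import numpy as np

def predict(w, x, labels, latents, joint_feature):
    """Return the (y, h) pair maximizing w^T Psi(x, y, h), found by enumeration."""
    best_pair, best_score = None, -np.inf
    for y, h in itertools.product(labels, latents):
        score = float(w @ joint_feature(x, y, h))
        if score > best_score:
            best_pair, best_score = (y, h), score
    return best_pair, best_score

def toy_joint_feature(x, y, h):
    """Hypothetical Psi: puts the feature of latent candidate h into the slot of label y."""
    psi = np.zeros(2)
    psi[y] = x[h]
    return psi

w = np.array([1.0, -1.0])
x = np.array([0.2, 0.9, -0.5])  # one toy feature per latent candidate
pair, score = predict(w, x, labels=[0, 1], latents=[0, 1, 2],
                      joint_feature=toy_joint_feature)
print(pair, score)
```

In the toy setting, label 0 with latent candidate 1 wins because it pairs the positive weight with the largest feature value.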
15
Self-Paced Learning (Kumar, Packer and Koller, 2010)
Start with an initial estimate $w_0$.
Update $w_{t+1}$ by solving a biconvex problem:
$\min \|w\|^2 + C \sum_i v_i \xi_i - K \sum_i v_i$
s.t. $w^\top \Psi(x_i, y_i, h_i) - w^\top \Psi(x_i, y, h) \ge \Delta(y_i, y, h) - \xi_i$ (loss-augmented inference)
Update $h_i = \arg\max_{h \in H} w_t^\top \Psi(x_i, y_i, h)$ (annotation-consistent inference)
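To make the role of the selection variables concrete, here is a minimal sketch of the two simple updates in the alternation, written against the objective as it appears on the slide. With w and the slacks fixed, the objective is linear in each v_i, so v_i = 1 exactly when C times the slack of sample i is smaller than K. The function names and toy numbers are assumptions; the update of w itself (a structural SVM solve) is not shown.

```python
import numpy as np

def select_easy(slacks, C, K):
    """For fixed w: minimizing C*v_i*xi_i - K*v_i over v_i in {0, 1}
    gives v_i = 1 iff C*xi_i < K (the sample is currently 'easy')."""
    return (C * np.asarray(slacks) < K).astype(int)

def impute_latent(w, x, y, latents, joint_feature):
    """Annotation-consistent inference: best h for the given annotation y."""
    return max(latents, key=lambda h: float(w @ joint_feature(x, y, h)))

slacks = [0.1, 0.8, 2.5, 0.0]  # hypothetical slack values xi_i
for K in (0.5, 1.0, 3.0):      # a larger K admits harder samples in this form
    print(K, select_easy(slacks, C=1.0, K=K).tolist())

# Toy joint feature (an assumption): feature of candidate h in the slot of label y.
toy_psi = lambda x, y, h: np.array([x[h] * (y == 0), x[h] * (y == 1)], dtype=float)
h_best = impute_latent(np.array([1.0, -1.0]), np.array([0.2, 0.9, -0.5]),
                       y=1, latents=[0, 1, 2], joint_feature=toy_psi)
```

In the form written on the slide, raising K over iterations gradually includes samples with larger slack until all of them are used.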
16
Outline: Region-based Segmentation Model, Problem Formulation, Inference, Results
17
Generic Classes (Kumar and Koller, 2010): Start from the current regions and over-segmentations. Build a dictionary of regions $D$: merge and intersect with segments to form putative regions. Select regions: $\min \theta^\top y$ s.t. $y \in \mathrm{SELECT}(D)$. Iterate until convergence.
18
Generic Classes
Binary $y_r(0) = 1$ iff $r$ is not selected. Binary $y_r(1) = 1$ iff $r$ is selected.
Minimize the energy: $\min_y \sum_r \theta_r(i) y_r(i) + \sum_{rs} \theta_{rs}(i,j) y_{rs}(i,j)$
s.t. $y_r(0) + y_r(1) = 1$ (assign one label to $r$ from $L$)
$y_{rs}(i,0) + y_{rs}(i,1) = y_r(i)$ and $y_{rs}(0,j) + y_{rs}(1,j) = y_s(j)$ (ensure $y_{rs}(i,j) = y_r(i) y_s(j)$)
$\sum_{r \text{ “covers” } u} y_r(1) = 1$ (each super-pixel is covered by exactly one selected region)
$y_r(i), y_{rs}(i,j) \in \{0,1\}$ (binary variables)
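The selection problem above can be checked on a toy instance by brute force. The sketch below enumerates every selected/not-selected assignment, discards those that violate the exact-cover constraint, and evaluates the energy with the product y_r(i) y_s(j) computed directly instead of through the auxiliary variables y_rs. The dictionary, costs, and function name are made up for illustration, and exhaustive search is an assumption of this sketch, not the solver used in the talk.

```python
import itertools

def select_regions(regions, num_superpixels, unary, pairwise):
    """regions: list of sets of super-pixel ids.
    unary[r][i]: cost theta_r(i), with i = 1 for selected and i = 0 for not selected.
    pairwise[(r, s)][i][j]: cost theta_rs(i, j) for a pair of regions r, s."""
    best_y, best_energy = None, float("inf")
    for y in itertools.product((0, 1), repeat=len(regions)):
        # Exact cover: each super-pixel lies in exactly one selected region.
        cover = [sum(y[r] for r, reg in enumerate(regions) if u in reg)
                 for u in range(num_superpixels)]
        if any(count != 1 for count in cover):
            continue
        energy = sum(unary[r][y[r]] for r in range(len(regions)))
        energy += sum(theta[y[r]][y[s]] for (r, s), theta in pairwise.items())
        if energy < best_energy:
            best_y, best_energy = y, energy
    return best_y, best_energy

# Hypothetical dictionary over four super-pixels.
regions = [{0, 1}, {2, 3}, {0, 1, 2, 3}, {0}, {1}]
unary = [[0.0, 1.0], [0.0, 1.0], [0.0, 2.5], [0.0, 0.25], [0.0, 0.25]]
pairwise = {(0, 1): [[0.0, 0.0], [0.0, 0.25]]}  # extra cost if both 0 and 1 are selected
best_y, best_energy = select_regions(regions, 4, unary, pairwise)
print(best_y, best_energy)
```

Only three selections cover every super-pixel exactly once here: regions {0, 1}, region {2} alone, and regions {1, 3, 4}. The last has the lowest energy.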
19
Generic Classes (Kumar and Koller, 2010): Start from the current regions and over-segmentations. Build a dictionary of regions $D$: merge and intersect with segments to form putative regions. Select regions: $\min \theta^\top y$ s.t. $y \in \mathrm{SELECT}(D)$, $\Delta_{\text{new}} \le \Delta_{\text{prev}}$. Iterate until convergence. Simultaneous region selection and labeling.
20
Examples: Iteration 1, Iteration 3, Iteration 6
21
Examples: Iteration 1, Iteration 3, Iteration 6
22
Examples: Iteration 1, Iteration 3, Iteration 6
23
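Bounding Boxes: $\min \theta^\top y$ s.t. $y \in \mathrm{SELECT}(D)$, $\Delta_{\text{new}} \le \Delta_{\text{prev}}$, $z_a \in \{0,1\}$, $z_a \le \sum_{r \text{ “covers” } a} y_r(c)$, with the term $K_a (1 - z_a)$ added to the objective. Each row and each column of the bounding box is covered.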
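The row and column coverage idea can be illustrated at the pixel level. The sketch below takes a label map (the labels induced by the selected regions), sets z_a = 1 for every row and column of the box that contains at least one pixel of class c, and charges a penalty for each uncovered one. The pixel-level view, the function name, the box convention, and the single constant K shared by all rows and columns are assumptions; the slide works with regions and per-row/column constants K_a.

```python
import numpy as np

def box_coverage_penalty(label_map, box, c, K):
    """label_map: 2D integer array of pixel labels.
    box: (top, left, bottom, right), with bottom and right exclusive.
    Returns (z_rows, z_cols, penalty), where z = 1 marks a covered row/column."""
    top, left, bottom, right = box
    inside = label_map[top:bottom, left:right] == c
    z_rows = inside.any(axis=1).astype(int)  # one indicator per row of the box
    z_cols = inside.any(axis=0).astype(int)  # one indicator per column of the box
    penalty = K * ((1 - z_rows).sum() + (1 - z_cols).sum())
    return z_rows, z_cols, penalty

label_map = np.zeros((4, 4), dtype=int)
label_map[1:3, 1] = 1  # class 1 occupies a thin vertical strip
z_rows, z_cols, penalty = box_coverage_penalty(label_map, box=(1, 1, 3, 3), c=1, K=10)
print(z_rows.tolist(), z_cols.tolist(), penalty)
```

In this toy case both rows of the box are covered but the second column is not, so the labeling is charged one penalty of K.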
24
Examples: Iteration 1, Iteration 2, Iteration 4
25
Examples: Iteration 1, Iteration 2, Iteration 4
26
Examples: Iteration 1, Iteration 2, Iteration 4
27
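Image-Level Labels: $\min \theta^\top y$ s.t. $y \in \mathrm{SELECT}(D)$, $\Delta_{\text{new}} \le \Delta_{\text{prev}}$, $z \in \{0,1\}$, $z \le \sum_r y_r(c)$, with the term $K (1 - z)$ added to the objective. Image must contain the specified object.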
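The image-level constraint reduces to a single indicator: z can be 1 only if at least one selected region carries the specified class, and otherwise the penalty K is paid. A minimal sketch, assuming the selected regions are summarized by a list of their labels (the function name and the example labels are hypothetical):

```python
def image_label_penalty(region_labels, c, K):
    """Return (z, penalty): z = 1 iff some selected region is labeled c."""
    z = int(any(label == c for label in region_labels))
    return z, K * (1 - z)

with_car = image_label_penalty(["road", "car", "sky"], c="car", K=5.0)
without_car = image_label_penalty(["road", "sky"], c="car", K=5.0)
print(with_car, without_car)
```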
28
Outline: Region-based Segmentation Model, Problem Formulation, Inference, Results
29
Dataset: Stanford Background (generic foreground class, 7 background classes) + PASCAL VOC 2009 (generic background class, 20 foreground classes)
30
Dataset: Stanford Background + PASCAL VOC 2009
Stanford Background: Train - 572 images, Validation - 53 images, Test - 90 images
PASCAL VOC 2009: Train - 1274 images, Validation - 225 images, Test - 750 images
Baseline: Closed-loop learning (CLL), Gould et al., 2009
31
Results (improvement over CLL)
PASCAL VOC 2009: CLL - 24.7%, LSVM - 26.9%
SBD: CLL - 53.1%, LSVM - 54.3%
32
Dataset: Stanford Background + PASCAL VOC 2009 + 2010
Stanford Background: Train - 572 images, Validation - 53 images, Test - 90 images
PASCAL VOC 2009: Train - 1274 images, Validation - 225 images, Test - 750 images
PASCAL VOC 2010: Bounding Boxes - 1564 images
33
Results (improvement over CLL)
PASCAL VOC 2009: CLL - 24.7%, LSVM - 26.9%, BOX - 28.3%
SBD: CLL - 53.1%, LSVM - 54.3%, BOX - 54.8%
34
Dataset: Stanford Background + PASCAL VOC 2009 + 2010 + ImageNet
Stanford Background: Train - 572 images, Validation - 53 images, Test - 90 images
PASCAL VOC 2009: Train - 1274 images, Validation - 225 images, Test - 750 images
PASCAL VOC 2010: Bounding Boxes - 1564 images
ImageNet: 1000 image-level labels
35
Results (improvement over CLL)
PASCAL VOC 2009: CLL - 24.7%, LSVM - 26.9%, BOX - 28.3%, LABEL - 28.8%
SBD: CLL - 53.1%, LSVM - 54.3%, BOX - 54.8%, LABEL - 55.3%
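The improvement over CLL can be read off by subtracting the baseline from each reported number. The short sketch below does that arithmetic, assuming the first group of percentages on the slide belongs to PASCAL VOC 2009 and the second to SBD (the order in which the datasets are listed); the dictionary layout and function name are illustrative only.

```python
results = {
    "PASCAL VOC 2009": {"CLL": 24.7, "LSVM": 26.9, "BOX": 28.3, "LABEL": 28.8},
    "SBD": {"CLL": 53.1, "LSVM": 54.3, "BOX": 54.8, "LABEL": 55.3},
}

def improvement_over_baseline(scores, baseline="CLL"):
    """Absolute gain in percentage points of each method over the baseline."""
    return {method: round(score - scores[baseline], 1)
            for method, score in scores.items() if method != baseline}

gains = {dataset: improvement_over_baseline(scores)
         for dataset, scores in results.items()}
for dataset, gain in gains.items():
    print(dataset, gain)
```

Under that assignment, each additional source of weak supervision adds a further gain on both datasets.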
36
Examples
38
Failure Modes
39
Examples
40
Types of Data: Specific foreground classes, generic background class (PASCAL VOC Segmentation Datasets)
41
Types of Data: Specific background classes, generic foreground class (Stanford Background Dataset)
42
Types of Data: Bounding boxes for objects (PASCAL VOC Detection Datasets). Thousands of freely available images.
43
Types of Data: Image-level labels, e.g. “Car” (ImageNet, Caltech, …). Thousands of freely available images.
44
Types of Data: Noisy data from web search (Google Image, Flickr, Picasa, …). Millions of freely available images.
45
Two Problems: The “Noise” Problem (Self-Paced Learning). The “Size” Problem (Self-Paced Learning).
46
Questions?