Learning to Segment from Diverse Data. M. Pawan Kumar, Daphne Koller, Haithem Turki, Dan Preston
Aim Learn accurate parameters for a segmentation model - Segmentation without generic foreground or background classes - Train using both strongly and weakly supervised data
Data in Vision “Strong” Supervision “Car” “Weak” Supervision “One hand tied behind the back…”
Data for Vision “Car” “Strong” Supervision “Weak” Supervision
Types of Data Specific foreground classes, generic background class PASCAL VOC Segmentation Datasets
Types of Data Specific background classes, generic foreground class Stanford Background Dataset
Types of Data Bounding boxes for objects PASCAL VOC Detection Datasets Thousands of freely available images Current methods only use small, controlled datasets
Types of Data Image-level labels ImageNet, Caltech … Thousands of freely available images “Car”
Types of Data Noisy data from web search Google Image, Flickr, Picasa ….. Millions of freely available images
Outline: Region-based Segmentation Model, Problem Formulation, Inference, Results
Region-based Segmentation Model Object Models Pixels Regions
Problem Formulation
Treat missing information as latent variables.
Image x, Annotation y, Complete Annotation (y,h)
Joint Feature Vector $\Psi(x,y,h)$: region features, detection features, pairwise contrast, pairwise context
Problem Formulation
Treat missing information as latent variables.
Image x, Annotation y, Complete Annotation (y,h)
Latent Structural SVM: $(y^*,h^*) = \arg\max_{y,h} w^\top \Psi(x,y,h)$
Trained by minimizing overlap loss $\Delta$
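The prediction rule on this slide can be illustrated with a minimal sketch that scores every (y, h) pair and keeps the best one. The feature map `toy_psi`, the weights, and the tiny label and latent sets are hypothetical, chosen only so the example runs; a real segmentation model would use the region, detection, and pairwise features listed earlier and a structured inference routine rather than exhaustive search.

```python
from itertools import product

def predict(w, x, labels, latents, psi):
    """Return ((y*, h*), score) maximizing w^T psi(x, y, h) by exhaustive search."""
    best_pair, best_score = None, float("-inf")
    for y, h in product(labels, latents):
        score = sum(wi * fi for wi, fi in zip(w, psi(x, y, h)))
        if score > best_score:
            best_pair, best_score = (y, h), score
    return best_pair, best_score

def toy_psi(x, y, h):
    """Hypothetical joint feature vector: puts x[h] in the slot for label y."""
    feat = [0.0, 0.0]
    feat[y] = x[h]
    return feat

x = [0.2, 0.9, 0.4]          # three candidate latent choices (assumed data)
w = [1.0, -1.0]              # assumed weights
(y_star, h_star), score = predict(w, x, labels=[0, 1], latents=[0, 1, 2], psi=toy_psi)
print(y_star, h_star, score)
```

Exhaustive search is only feasible here because the toy label and latent spaces are tiny.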
Self-Paced Learning (Kumar, Packer and Koller, 2010)
Start with an initial estimate $w_0$.
Update $w_{t+1}$ by solving a biconvex problem (loss-augmented inference):
$\min \|w\|^2 + C \sum_i v_i \xi_i - K \sum_i v_i$
s.t. $w^\top \Psi(x_i,y_i,h_i) - w^\top \Psi(x_i,y,h) \geq \Delta(y_i,y,h) - \xi_i$
Update $h_i = \arg\max_{h \in H} w_t^\top \Psi(x_i,y_i,h)$ (annotation-consistent inference).
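For a fixed w, the sample-selection part of the objective as written on this slide separates over samples: each term is $v_i (C \xi_i - K)$, so a binary $v_i$ is best set to 1 exactly when $C \xi_i < K$. The sketch below shows only that selection step, assuming binary $v_i$; the slack values and the schedule of K are made up for illustration, and the w update (which needs a structural SVM solver) is not shown.

```python
def select_easy(slacks, C, K):
    """Return v with v_i = 1 when C * xi_i < K, i.e. when including
    sample i lowers the term C * v_i * xi_i - K * v_i."""
    return [1 if C * xi < K else 0 for xi in slacks]

# Hypothetical slack values for four training samples under the current w.
slacks = [0.1, 0.5, 2.0, 0.05]
C = 1.0

# Under this form of the objective, a larger K admits harder samples.
schedule = {K: select_easy(slacks, C, K) for K in (0.2, 1.0, 4.0)}
for K, v in schedule.items():
    print(K, v)
```

Growing K over iterations moves from training on the easiest samples to training on all of them.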
Generic Classes (Kumar and Koller, 2010)
Current regions and over-segmentations: merge and intersect with segments to form putative regions.
Dictionary of regions D.
Select regions: $\min \theta^\top y$ s.t. $y \in \mathrm{SELECT}(D)$
Iterate until convergence.
Generic Classes
Binary $y_r(0) = 1$ iff r is not selected. Binary $y_r(1) = 1$ iff r is selected.
Minimize the energy: $\min_y \sum_r \theta_r(i)\, y_r(i) + \sum_{rs} \theta_{rs}(i,j)\, y_{rs}(i,j)$
s.t. $y_r(0) + y_r(1) = 1$ (assign one label to r from L)
$y_{rs}(i,0) + y_{rs}(i,1) = y_r(i)$ and $y_{rs}(0,j) + y_{rs}(1,j) = y_s(j)$ (ensure $y_{rs}(i,j) = y_r(i)\, y_s(j)$)
$\sum_{r \text{ covers } u} y_r(1) = 1$ (each super-pixel is covered by exactly one selected region)
$y_r(i),\, y_{rs}(i,j) \in \{0,1\}$ (binary variables)
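To make the coverage constraint concrete, here is a brute-force sketch of the selection problem on a tiny dictionary. It keeps only the unary terms and the "exactly one selected region per super-pixel" constraint; the pairwise terms are omitted for brevity, and the regions and costs are hypothetical. It enumerates all selections, which is only practical for a handful of regions and is not the solver used for real images.

```python
from itertools import product

def select_regions(regions, theta_sel, theta_unsel, num_superpixels):
    """regions: list of sets of super-pixel indices.
    Returns (selection tuple, energy) with minimum unary energy such that
    every super-pixel lies in exactly one selected region."""
    best, best_energy = None, float("inf")
    for y in product((0, 1), repeat=len(regions)):
        counts = [0] * num_superpixels
        for r, selected in enumerate(y):
            if selected:
                for u in regions[r]:
                    counts[u] += 1
        if any(c != 1 for c in counts):
            continue  # violates the coverage constraint
        energy = sum(theta_sel[r] if s else theta_unsel[r] for r, s in enumerate(y))
        if energy < best_energy:
            best, best_energy = y, energy
    return best, best_energy

# Hypothetical dictionary over three super-pixels, with assumed costs.
regions = [{0}, {1}, {2}, {0, 1}, {1, 2}, {0, 1, 2}]
theta_sel = [1.0, 1.0, 1.0, 1.5, 2.5, 3.2]
theta_unsel = [0.0] * len(regions)
best, energy = select_regions(regions, theta_sel, theta_unsel, num_superpixels=3)
print(best, energy)
```

In this toy instance the cheapest valid cover merges super-pixels 0 and 1 into one region and keeps super-pixel 2 on its own.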
Generic Classes (Kumar and Koller, 2010)
Current regions and over-segmentations: merge and intersect with segments to form putative regions.
Dictionary of regions D.
Select regions: $\min \theta^\top y$ s.t. $y \in \mathrm{SELECT}(D)$, $\Delta_{new} \leq \Delta_{prev}$
Simultaneous region selection and labeling.
Iterate until convergence.
Examples: Iteration 1, Iteration 3, Iteration 6
Bounding Boxes
$\min \theta^\top y + \sum_a K_a (1 - z_a)$
s.t. $y \in \mathrm{SELECT}(D)$, $\Delta_{new} \leq \Delta_{prev}$
$z_a \leq \sum_{r \text{ covers } a} y_r(c)$, $z_a \in \{0,1\}$
Each row and each column of bounding box is covered.
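The coverage requirement for a bounding box can be checked on a pixel label grid, as in the sketch below. It assumes a simplification: a row or column a counts as covered ($z_a = 1$) when at least one pixel of class c lies on it inside the box, and every uncovered row or column pays the same penalty K. The grid, the box coordinates, and the function names are hypothetical.

```python
def box_coverage(labels, box, c):
    """labels: 2D list of class ids; box: (top, left, bottom, right), inclusive.
    Returns (rows, cols), lists of 0/1 flags, one per row and column of the box."""
    top, left, bottom, right = box
    rows = [int(any(labels[i][j] == c for j in range(left, right + 1)))
            for i in range(top, bottom + 1)]
    cols = [int(any(labels[i][j] == c for i in range(top, bottom + 1)))
            for j in range(left, right + 1)]
    return rows, cols

def box_penalty(labels, box, c, K=1.0):
    """Sum of K * (1 - z_a) over all rows and columns a of the box."""
    rows, cols = box_coverage(labels, box, c)
    return K * sum(1 - z for z in rows + cols)

# Hypothetical 4x4 labeling with class 1 as the boxed object.
labels = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
box = (1, 1, 3, 2)
rows, cols = box_coverage(labels, box, c=1)
penalty = box_penalty(labels, box, c=1, K=2.0)
print(rows, cols, penalty)
```

Here the last row of the box contains no object pixel, so the labeling pays one penalty term.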
Examples: Iteration 1, Iteration 2, Iteration 4
Image-Level Labels
$\min \theta^\top y + K (1 - z)$
s.t. $y \in \mathrm{SELECT}(D)$, $\Delta_{new} \leq \Delta_{prev}$
$z \leq \sum_r y_r(c)$, $z \in \{0,1\}$
Image must contain the specified object.
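The image-level requirement reduces to a single indicator: z can be 1 only if some region takes class c, and otherwise the labeling pays K. A minimal sketch, assuming the labeling is given as a list of per-region class names (the names and the value of K are hypothetical):

```python
def image_label_penalty(region_labels, c, K=1.0):
    """Return K * (1 - z), where z = 1 iff some region is labeled c."""
    z = 1 if c in region_labels else 0
    return K * (1 - z)

missing = image_label_penalty(["sky", "road"], "car", K=5.0)
present = image_label_penalty(["sky", "car", "road"], "car", K=5.0)
print(missing, present)
```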
Dataset
PASCAL VOC: 20 foreground classes, generic background class
Stanford Background: 7 background classes, generic foreground class
Dataset
Stanford Background: Train images, Validation - 53 images, Test - 90 images
PASCAL VOC: Train images, Validation images, Test images
Baseline: Closed-loop learning (CLL), Gould et al., 2009
Results (PASCAL VOC 2009 and SBD): improvement over CLL. Columns: CLL %, LSVM %
Dataset
Stanford Background: Train images, Validation - 53 images, Test - 90 images
PASCAL VOC: Train images, Validation images, Test images
Bounding Boxes: images
Results (PASCAL VOC 2009 and SBD): improvement over CLL. Columns: CLL %, LSVM %, BOX %
Dataset
Stanford Background: Train images, Validation - 53 images, Test - 90 images
PASCAL VOC: Train images, Validation images, Test images
Bounding Boxes: images
Image-level labels (ImageNet)
Results (PASCAL VOC 2009 and SBD): improvement over CLL. Columns: CLL %, LSVM %, BOX %, LABEL %
Examples
Failure Modes
Examples
Types of Data Noisy data from web search Google Image, Flickr, Picasa ….. Millions of freely available images
Two Problems
The “Noise” Problem: Self-Paced Learning
The “Size” Problem: Self-Paced Learning
Questions?