Learning to Segment from Diverse Data. M. Pawan Kumar, Daphne Koller, Haithem Turki, Dan Preston

Aim Learn accurate parameters for a segmentation model - Segmentation without generic foreground or background classes - Train using both strongly and weakly supervised data

Data in Vision “Strong” Supervision “Car” “Weak” Supervision “One hand tied behind the back…”

Data for Vision “Car” “Strong” Supervision “Weak” Supervision 

Types of Data Specific foreground classes, generic background class PASCAL VOC Segmentation Datasets

Types of Data Specific background classes, generic foreground class Stanford Background Dataset

Types of Data Bounding boxes for objects PASCAL VOC Detection Datasets Thousands of freely available images Current methods only use small, controlled datasets

Types of Data Image-level labels ImageNet, Caltech … Thousands of freely available images “Car”

Types of Data Noisy data from web search Google Image, Flickr, Picasa ….. Millions of freely available images

Outline Region-based Segmentation Model Problem Formulation Inference Results

Region-based Segmentation Model  Object Models Pixels Regions

Problem Formulation: Treat missing information as latent variables. Image $x$, Annotation $y$, Complete Annotation $(y,h)$. Joint Feature Vector $\Psi(x,y,h)$: Region features, Detection features, Pairwise contrast, Pairwise context

Problem Formulation: Treat missing information as latent variables. Image $x$, Annotation $y$, Complete Annotation $(y,h)$. Prediction: $(y^*,h^*) = \arg\max_{(y,h)} w^T \Psi(x,y,h)$. Latent Structural SVM, trained by minimizing overlap loss $\Delta$
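The prediction rule above can be illustrated with a minimal sketch that enumerates a small candidate set of (y, h) pairs and returns the highest-scoring one. The names `predict`, `dot` and `toy_features`, the two-dimensional feature map and the weights are hypothetical and chosen only for illustration; the talk's actual feature vector combines region, detection and pairwise terms, and real inference does not enumerate all candidates.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def dot(w: Sequence[float], psi: Sequence[float]) -> float:
    """Inner product w^T Psi."""
    return sum(wi * pi for wi, pi in zip(w, psi))


def predict(
    w: Sequence[float],
    x: int,
    candidates: Iterable[Tuple[int, int]],
    joint_features: Callable[[int, int, int], List[float]],
) -> Tuple[int, int]:
    """Return the (y, h) pair that maximizes w^T Psi(x, y, h)."""
    return max(candidates, key=lambda yh: dot(w, joint_features(x, yh[0], yh[1])))


def toy_features(x: int, y: int, h: int) -> List[float]:
    # Hypothetical feature map: one entry rewards y agreeing with x,
    # the other rewards the latent variable h agreeing with x.
    return [1.0 if y == x else 0.0, 1.0 if h == x else 0.0]


# Exhaustive candidate set over a binary annotation and a binary latent variable.
candidates = [(y, h) for y in (0, 1) for h in (0, 1)]
best = predict([2.0, 1.0], 1, candidates, toy_features)
```

With these toy weights the pair (1, 1) scores 3.0, which is the maximum over the four candidates.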

Self-Paced Learning (Kumar, Packer and Koller, 2010)
Start with an initial estimate $w_0$
Update $w_{t+1}$ by solving a biconvex problem (Loss Augmented Inference):
$\min \|w\|^2 + C \sum_i v_i \xi_i - K \sum_i v_i$
s.t. $w^T \Psi(x_i,y_i,h_i) - w^T \Psi(x_i,y,h) \ge \Delta(y_i,y,h) - \xi_i$
Update $h_i = \arg\max_{h \in H} w_t^T \Psi(x_i,y_i,h)$ (Annotation Consistent Inference)
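As a rough illustration of the sample-selection half of the biconvex problem: for fixed w, the term C v_i ξ_i - K v_i is minimized over a binary v_i by setting v_i = 1 exactly when C ξ_i < K, so only "easy" images (small slack) are used. The sketch below assumes the objective with the sign convention shown on this slide; the function name `select_easy` and the slack values are hypothetical, and the update of w itself is not shown.

```python
from typing import List, Sequence


def select_easy(slacks: Sequence[float], C: float, K: float) -> List[int]:
    """Closed-form choice of the binary selection variables v for fixed w.

    C * v_i * xi_i - K * v_i is minimized over v_i in {0, 1}
    by v_i = 1 exactly when C * xi_i < K.
    """
    return [1 if C * xi < K else 0 for xi in slacks]


# Hypothetical per-image slack values under the current w.
slacks = [0.1, 0.5, 2.0]

# Under this sign convention, a larger K admits harder images.
schedule = [select_easy(slacks, C=1.0, K=K) for K in (0.3, 1.0, 3.0)]
```

Here `schedule` moves from selecting only the first image to selecting all three as K grows, which is the easy-to-hard ordering that self-paced learning relies on.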

Generic Classes: DICTIONARY OF REGIONS D; MERGE AND INTERSECT WITH SEGMENTS TO FORM PUTATIVE REGIONS; SELECT REGIONS; ITERATE UNTIL CONVERGENCE. Current Regions, Over-Segmentations. $\min \theta^T y$ s.t. $y \in \mathrm{SELECT}(D)$. Kumar and Koller, 2010

Generic Classes
Binary $y_r(0) = 1$ iff $r$ is not selected; binary $y_r(1) = 1$ iff $r$ is selected
Minimize the energy: $\min_y \sum_r \sum_i \theta_r(i) y_r(i) + \sum_{r,s} \sum_{i,j} \theta_{rs}(i,j) y_{rs}(i,j)$
s.t. $y_r(0) + y_r(1) = 1$ (Assign one label to $r$ from $L$)
$y_{rs}(i,0) + y_{rs}(i,1) = y_r(i)$ and $y_{rs}(0,j) + y_{rs}(1,j) = y_s(j)$ (Ensure $y_{rs}(i,j) = y_r(i) y_s(j)$)
$\sum_{r \text{ covers } u} y_r(1) = 1$ (Each super-pixel is covered by exactly one selected region)
$y_r(i), y_{rs}(i,j) \in \{0,1\}$ (Binary variables)
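To make the selection program concrete, here is a brute-force sketch on a toy dictionary of three regions over two super-pixels. It enumerates every binary labeling, discards those that violate the exactly-one-cover constraint, and keeps the lowest-energy one. Instead of the linearized pairwise variables it evaluates the product y_r(i) y_s(j) directly, which is equivalent for binary labels. The function name, the region layout and all cost values are hypothetical; a dictionary of realistic size would need an LP relaxation or an ILP solver rather than enumeration.

```python
from itertools import product
from typing import Dict, List, Optional, Sequence, Set, Tuple


def select_regions(
    covers: Sequence[Set[int]],
    unary: Sequence[Sequence[float]],
    pairwise: Dict[Tuple[int, int], Sequence[Sequence[float]]],
    num_superpixels: int,
) -> Tuple[Optional[Tuple[int, ...]], Optional[float]]:
    """Exhaustively minimize the energy over labels in {0: not selected, 1: selected}."""
    best_labels, best_energy = None, None
    for labels in product((0, 1), repeat=len(covers)):
        # Each super-pixel must be covered by exactly one selected region.
        counts = [0] * num_superpixels
        for r, lab in enumerate(labels):
            if lab == 1:
                for u in covers[r]:
                    counts[u] += 1
        if any(c != 1 for c in counts):
            continue
        # Unary terms theta_r(i) plus pairwise terms theta_rs(i, j).
        energy = sum(unary[r][lab] for r, lab in enumerate(labels))
        energy += sum(
            costs[labels[r]][labels[s]] for (r, s), costs in pairwise.items()
        )
        if best_energy is None or energy < best_energy:
            best_labels, best_energy = labels, energy
    return best_labels, best_energy


# Toy dictionary: regions 0 and 1 are single super-pixels, region 2 is their union.
covers: List[Set[int]] = [{0}, {1}, {0, 1}]
unary = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.5]]
# Selecting regions 0 and 1 together earns a bonus of -1.0.
pairwise = {(0, 1): [[0.0, 0.0], [0.0, -1.0]]}

labels, energy = select_regions(covers, unary, pairwise, num_superpixels=2)
```

Only two labelings are feasible here: selecting the two small regions (energy 1.0) or the merged region alone (energy 1.5), so the two small regions are kept.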

Generic Classes: DICTIONARY OF REGIONS D; MERGE AND INTERSECT WITH SEGMENTS TO FORM PUTATIVE REGIONS; SELECT REGIONS; ITERATE UNTIL CONVERGENCE. Current Regions, Over-Segmentations. $\min \theta^T y$ s.t. $y \in \mathrm{SELECT}(D)$, $\Delta_{new} \le \Delta_{prev}$. Simultaneous region selection and labeling. Kumar and Koller, 2010

Examples: Iteration 1, Iteration 3, Iteration 6

Bounding Boxes: $\min \theta^T y + \sum_a K_a (1 - z_a)$ s.t. $y \in \mathrm{SELECT}(D)$, $\Delta_{new} \le \Delta_{prev}$, $z_a \le \sum_{r \text{ covers } a} y_r(c)$, $z_a \in \{0,1\}$. Each row and each column of bounding box is covered
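The row-and-column coverage idea can be sketched at the pixel level. The example below assumes that K(1 - z) acts as a penalty for each row or column a of the box that contains no pixel of class c, which is one reading of the slide rather than something it states outright. It also works on a pixel label map instead of region variables, and uses a single constant K for all rows and columns; the function name `box_penalty` and the toy label map are hypothetical.

```python
from typing import List, Sequence, Tuple


def box_penalty(
    label_map: Sequence[Sequence[int]],
    box: Tuple[int, int, int, int],
    c: int,
    K: float = 1.0,
) -> float:
    """Sum of K * (1 - z_a) over the rows and columns a of a bounding box.

    box is (top, left, bottom, right) with inclusive indices.
    z_a is 1 when row or column a contains at least one pixel labeled c.
    """
    top, left, bottom, right = box
    rows: List[List[int]] = [
        list(label_map[i][left:right + 1]) for i in range(top, bottom + 1)
    ]
    cols: List[List[int]] = [
        [label_map[i][j] for i in range(top, bottom + 1)]
        for j in range(left, right + 1)
    ]
    z = [1 if c in line else 0 for line in rows + cols]
    return sum(K * (1 - za) for za in z)


# Toy 3x3 labeling: class 1 occupies two pixels in the middle row.
label_map = [
    [0, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
]

loose = box_penalty(label_map, box=(0, 1, 2, 2), c=1)  # top and bottom rows are empty
tight = box_penalty(label_map, box=(1, 1, 1, 2), c=1)  # every row and column is covered
```

A box that fits the object tightly incurs no penalty, while a loose box is penalized once for each uncovered row or column.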

Examples: Iteration 1, Iteration 2, Iteration 4

Image-Level Labels: $\min \theta^T y + K (1 - z)$ s.t. $y \in \mathrm{SELECT}(D)$, $\Delta_{new} \le \Delta_{prev}$, $z \le \sum_r y_r(c)$, $z \in \{0,1\}$. Image must contain the specified object

Dataset: Stanford Background (generic foreground class, 7 background classes) + PASCAL VOC 2009 (generic background class, 20 foreground classes)

Dataset: Stanford Background (Train - 572 images, Validation - 53 images, Test - 90 images) + PASCAL VOC 2009 (Train - 1274 images, Validation - 225 images, Test - 750 images). Baseline: Closed-loop learning (CLL), Gould et al., 2009

Results (improvement over CLL). PASCAL VOC 2009: CLL - 24.7%, LSVM - 26.9%. SBD: CLL - 53.1%, LSVM - 54.3%

Dataset: Stanford Background (Train - 572 images, Validation - 53 images, Test - 90 images) + PASCAL VOC 2009 (Train - 1274 images, Validation - 225 images, Test - 750 images) + 2010 (Bounding Boxes - 1564 images)

Results (improvement over CLL). PASCAL VOC 2009: CLL - 24.7%, LSVM - 26.9%, BOX - 28.3%. SBD: CLL - 53.1%, LSVM - 54.3%, BOX - 54.8%

Dataset: Stanford Background (Train - 572 images, Validation - 53 images, Test - 90 images) + PASCAL VOC 2009 (Train - 1274 images, Validation - 225 images, Test - 750 images) + 2010 (Bounding Boxes - 1564 images) + 1000 image-level labels (ImageNet)

Results (improvement over CLL). PASCAL VOC 2009: CLL - 24.7%, LSVM - 26.9%, BOX - 28.3%, LABEL - 28.8%. SBD: CLL - 53.1%, LSVM - 54.3%, BOX - 54.8%, LABEL - 55.3%

Examples

Failure Modes

Examples

Types of Data Specific foreground classes, generic background class PASCAL VOC Segmentation Datasets

Types of Data Specific background classes, generic foreground class Stanford Background Dataset

Types of Data Bounding boxes for objects PASCAL VOC Detection Datasets Thousands of freely available images

Types of Data Image-level labels ImageNet, Caltech … Thousands of freely available images “Car”

Types of Data Noisy data from web search Google Image, Flickr, Picasa ….. Millions of freely available images

Two Problems The “Noise” Problem Self-Paced Learning The “Size” Problem Self-Paced Learning

Questions?
