Learning to Segment from Diverse Data. M. Pawan Kumar, Daphne Koller, Haithem Turki, Dan Preston.

Presentation transcript:

Learning to Segment from Diverse Data. M. Pawan Kumar, Daphne Koller, Haithem Turki, Dan Preston

Aim: Learn accurate parameters for a segmentation model
- Segmentation without generic foreground or background classes
- Train using both strongly and weakly supervised data

Data in Vision: “Strong” Supervision, “Weak” Supervision (“Car”). “One hand tied behind the back…”

Data for Vision “Car” “Strong” Supervision “Weak” Supervision 

Types of Data Specific foreground classes, generic background class PASCAL VOC Segmentation Datasets

Types of Data Specific background classes, generic foreground class Stanford Background Dataset

Types of Data Bounding boxes for objects PASCAL VOC Detection Datasets Thousands of freely available images Current methods only use small, controlled datasets

Types of Data Image-level labels ImageNet, Caltech … Thousands of freely available images “Car”

Types of Data Noisy data from web search Google Image, Flickr, Picasa ….. Millions of freely available images

Outline Region-based Segmentation Model Problem Formulation Inference Results

Region-based Segmentation Model  Object Models Pixels Regions

Outline Region-based Segmentation Model Problem Formulation Inference Results

Problem Formulation: Treat missing information as latent variables. Image x, annotation y, complete annotation (y, h). Joint feature vector $\Psi(x, y, h)$: region features, detection features, pairwise contrast, pairwise context.

Problem Formulation: Treat missing information as latent variables. Image x, annotation y, complete annotation (y, h).
Latent Structural SVM: $(y^*, h^*) = \arg\max_{y,h} w^\top \Psi(x, y, h)$
Trained by minimizing the overlap loss $\Delta$.
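As an illustration of the prediction rule on this slide, the sketch below picks the annotation and latent value with the highest linear score by plain enumeration. The names `predict` and `toy_psi`, the toy feature layout, and the weights are hypothetical and chosen only for illustration; the segmentation model in the talk needs a dedicated inference procedure rather than enumeration.

```python
from itertools import product

def predict(w, psi, x, labels, latents):
    """Return the (y, h) pair that maximizes the linear score w . psi(x, y, h).

    Enumeration is only feasible for tiny label and latent spaces.
    """
    def score(y, h):
        return sum(wi * fi for wi, fi in zip(w, psi(x, y, h)))
    return max(product(labels, latents), key=lambda yh: score(*yh))

# Hypothetical toy feature map: one indicator slot per (y, h) combination.
def toy_psi(x, y, h):
    feats = [0.0] * 4
    feats[2 * y + h] = x
    return feats

w = [0.1, 0.5, 0.9, 0.2]  # assumed weights, one per feature slot
best = predict(w, toy_psi, 1.0, labels=[0, 1], latents=[0, 1])
```

With these assumed weights the slot for y = 1, h = 0 has the largest score, so `best` is that pair.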

Self-Paced Learning (Kumar, Packer and Koller, 2010)
Start with an initial estimate $w_0$.
Update $w_{t+1}$ by solving a biconvex problem: $\min \|w\|^2 + C \sum_i v_i \xi_i - K \sum_i v_i$ subject to $w^\top \Psi(x_i, y_i, h_i) - w^\top \Psi(x_i, y, h) \geq \Delta(y_i, y, h) - \xi_i$ (loss-augmented inference).
Update $h_i = \arg\max_{h \in H} w_t^\top \Psi(x_i, y_i, h)$ (annotation-consistent inference).
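One piece of the biconvex problem can be made concrete. With w (and hence the slacks) held fixed, the objective min ||w||^2 + C sum_i v_i xi_i - K sum_i v_i decouples over the binary indicators v_i, so each sample is kept exactly when C * xi_i < K. The sketch below shows only this selection step; the function name `select_easy` and the slack values are hypothetical, and the update of w itself would need a structural SVM solver that is not shown.

```python
def select_easy(slacks, C, K):
    """Choose v_i in {0, 1} for fixed slacks xi_i.

    Each sample contributes v_i * (C * xi_i - K) to the objective, so setting
    v_i = 1 lowers the objective exactly when C * xi_i < K.
    """
    return [1 if C * xi < K else 0 for xi in slacks]

slacks = [0.1, 0.6, 2.0]  # assumed slack values for three training samples
easy_small_K = select_easy(slacks, C=1.0, K=0.5)
easy_large_K = select_easy(slacks, C=1.0, K=1.0)
```

Under this form of the objective, a larger K admits more samples, so the second call selects a superset of the first.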

Outline Region-based Segmentation Model Problem Formulation Inference Results

Generic Classes (Kumar and Koller, 2010): Start from the current regions and over-segmentations. Build a dictionary of regions D: merge and intersect with segments to form putative regions. Select regions: $\min \theta^\top y$ s.t. $y \in \mathrm{SELECT}(D)$. Iterate until convergence.

Generic Classes
Binary $y_r(0) = 1$ iff r is not selected; binary $y_r(1) = 1$ iff r is selected.
Minimize the energy: $\min_y \sum \theta_r(i) y_r(i) + \sum \theta_{rs}(i,j) y_{rs}(i,j)$
s.t. $y_r(0) + y_r(1) = 1$ (assign one label to r from L)
$y_{rs}(i,0) + y_{rs}(i,1) = y_r(i)$ and $y_{rs}(0,j) + y_{rs}(1,j) = y_s(j)$ (ensure $y_{rs}(i,j) = y_r(i) y_s(j)$)
$\sum_{r \text{ covers } u} y_r(1) = 1$ (each super-pixel is covered by exactly one selected region)
$y_r(i), y_{rs}(i,j) \in \{0,1\}$ (binary variables)
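To make the selection program concrete, the sketch below solves a tiny instance by exhaustive search: each region gets label 0 (not selected) or 1 (selected), every super-pixel must lie in exactly one selected region, and the energy is the sum of unary and pairwise terms. The function name `select_regions`, the toy dictionary, and all potential values are hypothetical; brute force stands in for whatever solver the authors use and only works for a handful of regions.

```python
from itertools import product

def select_regions(regions, unary, pairwise, superpixels):
    """Exhaustively minimize the selection energy over binary region labels.

    regions:  list of sets of super-pixel ids
    unary:    unary[r][i] is the cost of giving region r label i
    pairwise: {(r, s): table} where table[i][j] is the cost of labels (i, j)
    """
    best, best_energy = None, float("inf")
    for y in product((0, 1), repeat=len(regions)):
        # Coverage constraint: each super-pixel is in exactly one selected region.
        if any(sum(y[r] for r, reg in enumerate(regions) if u in reg) != 1
               for u in superpixels):
            continue
        energy = sum(unary[r][y[r]] for r in range(len(regions)))
        energy += sum(table[y[r]][y[s]] for (r, s), table in pairwise.items())
        if energy < best_energy:
            best, best_energy = y, energy
    return best, best_energy

# Toy dictionary: region 0 covers super-pixels {0, 1}; regions 1 and 2 split it.
regions = [{0, 1}, {0}, {1}, {2}]
unary = [[0.0, 1.0], [0.0, 0.8], [0.0, 0.7], [0.0, 0.3]]
pairwise = {(0, 3): [[0.0, 0.0], [0.0, 0.4]]}
selection, energy = select_regions(regions, unary, pairwise, superpixels=[0, 1, 2])
```

In this toy instance only two selections satisfy the coverage constraint, and the merged region 0 together with region 3 has the lower energy.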

Generic Classes (Kumar and Koller, 2010): Start from the current regions and over-segmentations. Build a dictionary of regions D: merge and intersect with segments to form putative regions. Select regions: $\min \theta^\top y$ s.t. $y \in \mathrm{SELECT}(D)$, $\Delta_{new} \leq \Delta_{prev}$. Iterate until convergence. Simultaneous region selection and labeling.

Examples (three slides): Iteration 1, Iteration 3, Iteration 6

Bounding Boxes: $\min \theta^\top y + \sum_a K_a (1 - z_a)$ s.t. $y \in \mathrm{SELECT}(D)$, $\Delta_{new} \leq \Delta_{prev}$, $z_a \in \{0,1\}$, $z_a \leq \sum_{r \text{ covers } a} y_r(c)$. Each row and each column of the bounding box is covered.
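The row and column coverage condition can be checked directly on a label mask. The sketch below assumes a binary mask marking pixels assigned to the box's class c, sets z_a = 1 for every row or column a of the box that contains at least one such pixel, and charges a penalty for each uncovered one. The function name, the inclusive box convention, and the single shared weight `K` (in place of a per-row or per-column weight) are assumptions made for illustration.

```python
def box_coverage_penalty(mask, box, K=1.0):
    """Return (z, penalty) for one bounding box.

    mask: 2D list with 1 where a pixel is labeled with the box's class c
    box:  (top, left, bottom, right), inclusive pixel coordinates
    z holds one 0/1 entry per box row, followed by one per box column.
    """
    top, left, bottom, right = box
    rows = [any(mask[i][j] for j in range(left, right + 1))
            for i in range(top, bottom + 1)]
    cols = [any(mask[i][j] for i in range(top, bottom + 1))
            for j in range(left, right + 1)]
    z = [int(covered) for covered in rows + cols]
    return z, K * sum(1 - za for za in z)

mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
z, penalty = box_coverage_penalty(mask, box=(1, 1, 2, 3), K=2.0)
```

Here both rows and the first two columns of the box are covered, while the last column is not, so exactly one penalty term is paid.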

Examples (three slides): Iteration 1, Iteration 2, Iteration 4

Image-Level Labels: $\min \theta^\top y + K (1 - z)$ s.t. $y \in \mathrm{SELECT}(D)$, $\Delta_{new} \leq \Delta_{prev}$, $z \in \{0,1\}$, $z \leq \sum_r y_r(c)$. Image must contain the specified object.
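The image-level constraint reduces to a single indicator: z may be 1 only if at least one selected region carries the labeled class c, and a penalty weighted by K applies otherwise. The sketch below is one reading of that rule; the function name, the string class labels, and the value of `K` are hypothetical.

```python
def image_label_penalty(region_labels, c, K=1.0):
    """Return (z, penalty) for an image annotated with class c.

    z = 1 if some selected region is labeled c, else 0; penalty = K * (1 - z).
    """
    z = 1 if any(label == c for label in region_labels) else 0
    return z, K * (1 - z)

with_car = image_label_penalty(["sky", "road", "car"], c="car", K=5.0)
without_car = image_label_penalty(["sky", "road"], c="car", K=5.0)
```

A segmentation that contains the labeled object pays nothing, while one that omits it pays the full weight K.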

Outline Region-based Segmentation Model Problem Formulation Inference Results

Dataset: Stanford Background (generic foreground class, 7 background classes); PASCAL VOC (generic background class, 20 foreground classes)

Dataset: Stanford Background: Train images, Validation - 53 images, Test - 90 images. PASCAL VOC: Train images, Validation images, Test images. Baseline: Closed-loop learning (CLL), Gould et al., 2009

Results PASCAL VOC 2009 SBD Improvement over CLL CLL % LSVM % CLL % LSVM %

Dataset: Stanford Background: Train images, Validation - 53 images, Test - 90 images. PASCAL VOC: Train images, Validation images, Test images. Bounding Boxes images

Results PASCAL VOC 2009 SBD Improvement over CLL CLL % LSVM % BOX % CLL % LSVM % BOX %

Dataset: Stanford Background: Train images, Validation - 53 images, Test - 90 images. PASCAL VOC: Train images, Validation images, Test images. Bounding Boxes images. Image-level labels (ImageNet)

Results PASCAL VOC 2009 SBD Improvement over CLL CLL % LSVM % BOX % LABEL % CLL % LSVM % BOX % LABEL %

Examples

Failure Modes

Examples

Types of Data Specific foreground classes, generic background class PASCAL VOC Segmentation Datasets

Types of Data Specific background classes, generic foreground class Stanford Background Dataset

Types of Data Bounding boxes for objects PASCAL VOC Detection Datasets Thousands of freely available images

Types of Data Image-level labels ImageNet, Caltech … Thousands of freely available images “Car”

Types of Data Noisy data from web search Google Image, Flickr, Picasa ….. Millions of freely available images

Two Problems: The “Noise” Problem (Self-Paced Learning); The “Size” Problem (Self-Paced Learning)

Questions?