Learning object shape. Gal Elidan, Geremy Heitz, Daphne Koller. February 12th, 2006, PAIL.

Localization vs. Recognition. Traditional question: “Is there an object of type X in this image?” (Airplane? NO. Human? YES. Dog? YES.) Our question: “Where in this image is the object of type X?” (Figure labels: MAN, DOG. The man is walking the dog.)

Outline: Registration; Shape Learning; Shape and Appearance Learning.

Outline: Landmark-based shape model; Registration as inference; Learning; Results; 3D preview.

Shape Model. Set of landmarks. Outline is defined via a piecewise-linear contour. Features of individual landmarks. Features of pairs of landmarks. (Example landmarks: tail, nose.)

“Registering” the Model to an Image. Task: assign each landmark l ∈ L to a pixel p_l ∈ P. Basic tool: local matches. Local score: feature + p_l.

“Shape aware” features: Shape Template; Patch Appearance (Foreground/Background).

“Shape aware” features. Shape template: vector of expected pixel locations relative to the landmark. (Figure: good match, bad match.)
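One way to read the shape template is as a list of pixel offsets that should land on edges when the landmark sits at the right place. The sketch below scores a candidate pixel that way. The slides do not give the scoring rule, so the hit-fraction score and the names `template_match_score`, `edges`, and `template` are assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

Pixel = Tuple[int, int]

def template_match_score(center: Pixel,
                         template: Iterable[Pixel],
                         edges: Set[Pixel]) -> float:
    """Fraction of template offsets that land on an edge pixel
    when the landmark is placed at `center` (assumed scoring rule)."""
    offsets = list(template)
    if not offsets:
        return 0.0
    hits = sum((center[0] + dx, center[1] + dy) in edges for dx, dy in offsets)
    return hits / len(offsets)

# Toy image: a horizontal edge along y = 5.
edges = {(x, 5) for x in range(10)}
# Hypothetical template: expects edge pixels to the left, at, and to the right.
template = [(-1, 0), (0, 0), (1, 0)]

good = template_match_score((4, 5), template, edges)  # candidate on the edge
bad = template_match_score((4, 7), template, edges)   # candidate off the edge
```

A candidate on the contour matches all three offsets, while one two rows away matches none, which is the good match / bad match contrast in the figure.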

FG/BG appearance feature. Histogram of various appearance components (A). Includes RGB, HSV, Texture components. (Figure: histograms over components a_1 = H, a_2 = T, … for FG and for BG.)

FG/BG appearance feature. Histogram of various appearance components (A). Includes RGB, HSV, Texture components. (Figure: histograms over components a_1 = H, a_2 = T, … for I and for M.)
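A minimal sketch of a histogram-based appearance comparison, for a single appearance component with values scaled to [0, 1). The slides do not say how histograms are compared; histogram intersection is used here as one common choice, and the bin count and function names are illustrative assumptions.

```python
from collections import Counter
from typing import Dict, Sequence

def normalized_histogram(values: Sequence[float], bins: int = 4) -> Dict[int, float]:
    """Histogram over `bins` equal-width bins on [0, 1), normalized to sum to 1."""
    if not values:
        raise ValueError("need at least one value")
    counts = Counter(min(int(v * bins), bins - 1) for v in values)
    return {b: counts[b] / len(values) for b in range(bins)}

def histogram_intersection(h1: Dict[int, float], h2: Dict[int, float]) -> float:
    """Similarity in [0, 1]; 1 means identical histograms (assumed measure)."""
    return sum(min(h1[b], h2[b]) for b in h1)

# Hypothetical hue samples: model foreground vs. two image regions.
model_fg = normalized_histogram([0.1, 0.1, 0.2, 0.9])
similar_region = normalized_histogram([0.15, 0.05, 0.2, 0.8])
different_region = normalized_histogram([0.9, 0.9, 0.9, 0.9])

sim_same = histogram_intersection(model_fg, similar_region)
sim_diff = histogram_intersection(model_fg, different_region)
```

The same comparison would be run separately for the foreground and background histograms and for each component in A.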

Pairwise Landmark Features. (Figure: landmarks l and m at pixels p_l and p_m, with offsets dX and dY along the X and Y axes.)
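The pairwise feature can be sketched as a score on the offset (dX, dY) between two landmark pixels. The Model Construction slide mentions Gaussian moments for the pairwise parameters, so the sketch assumes independent Gaussians on dX and dY; that independence, and the function names, are assumptions rather than something the slides state.

```python
import math
from typing import Tuple

Pixel = Tuple[int, int]

def gaussian_log_density(x: float, mean: float, var: float) -> float:
    """Log of the univariate Gaussian density N(x; mean, var)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def pairwise_log_score(p_l: Pixel, p_m: Pixel,
                       mean_d: Tuple[float, float],
                       var_d: Tuple[float, float]) -> float:
    """Log score of the offset from landmark l to landmark m
    (assumes dX and dY are independent Gaussians)."""
    dx = p_m[0] - p_l[0]
    dy = p_m[1] - p_l[1]
    return (gaussian_log_density(dx, mean_d[0], var_d[0])
            + gaussian_log_density(dy, mean_d[1], var_d[1]))

# Hypothetical parameters: landmark m expected 10 pixels to the right of l.
expected = pairwise_log_score((0, 0), (10, 0), (10.0, 0.0), (4.0, 4.0))
unlikely = pairwise_log_score((0, 0), (3, 5), (10.0, 0.0), (4.0, 4.0))
```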

Registration. Are local cues enough? Need to jointly consider all cues (features). The “correct” pixel is often not the best match! Inference using max-product.

Registration: Markov Random Field. Random variables = landmarks; domain = image pixels; potentials = scores of features. Inference using max-product. Registration = most likely assignment. (Figure: variables L_nose, L_tail, L_under, L_cockpit.)
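To make the inference step concrete, here is a minimal max-product sketch in log space (max-sum) for the special case where landmarks form a chain. The chain structure is an assumption made to keep the example exact and short; the slides do not specify the graph, and on a graph with loops max-product is only approximate. All names (`max_product_chain`, `local`, the toy potentials) are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Pixel = Tuple[int, int]

def max_product_chain(domains: Sequence[Sequence[Pixel]],
                      unary: Callable[[int, Pixel], float],
                      pairwise: Callable[[int, Pixel, Pixel], float]
                      ) -> Tuple[List[Pixel], float]:
    """Most likely assignment on a chain of landmarks.
    unary(i, p): log score of landmark i at pixel p.
    pairwise(i, p, q): log score of landmark i at p and landmark i+1 at q."""
    best = [[unary(0, p) for p in domains[0]]]
    back: List[List[int]] = []
    for i in range(1, len(domains)):
        scores, ptrs = [], []
        for q in domains[i]:
            cands = [best[i - 1][k] + pairwise(i - 1, p, q)
                     for k, p in enumerate(domains[i - 1])]
            k_best = max(range(len(cands)), key=cands.__getitem__)
            scores.append(cands[k_best] + unary(i, q))
            ptrs.append(k_best)
        best.append(scores)
        back.append(ptrs)
    # Trace back from the best final state.
    k = max(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][k]
    assignment = [domains[-1][k]]
    for i in range(len(domains) - 1, 0, -1):
        k = back[i - 1][k]
        assignment.append(domains[i - 1][k])
    assignment.reverse()
    return assignment, total

# Toy example: landmark 0 ("nose") and landmark 1 ("tail"), two candidates each.
local = [{(0, 0): -1.0, (5, 0): 0.0}, {(10, 0): 0.0, (6, 0): -1.0}]
domains = [list(d) for d in local]
assignment, score = max_product_chain(
    domains,
    unary=lambda i, p: local[i][p],
    # Hypothetical pairwise term: tail expected 10 pixels to the right of nose.
    pairwise=lambda i, p, q: -abs((q[0] - p[0]) - 10),
)
```

In this toy case the best local match for the nose is (5, 0), but the joint optimum places it at (0, 0), which illustrates why the locally best pixel is not always the correct one.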

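Registration Likelihood. Given an assignment of landmarks to pixels, the likelihood is defined as a combination of three groups of terms: local features, pairwise features, and rotations. (The equation itself is not preserved in this transcript.)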

Domain Pruning. Can we use all pixels as the domain? Average image = 60,000 pixels. Pairwise potentials: 60K × 60K = 3.6B entries. First, consider only edge pixels: on the order of 1K Canny edge pixels per image. Then consider only the top matches of local features: the correct match is generally in the top 50.
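The arithmetic behind the pruning can be checked directly, along with a sketch of keeping only the top-scoring candidates per landmark. The function names and the toy local score are illustrative assumptions; only the sizes (60K, 1K, 50) come from the slide.

```python
import heapq
from typing import Callable, Iterable, List, Tuple

Pixel = Tuple[int, int]

def pairwise_table_entries(domain_size: int) -> int:
    """Entries in one pairwise potential table between two landmarks."""
    return domain_size * domain_size

def prune_domain(candidates: Iterable[Pixel],
                 local_score: Callable[[Pixel], float],
                 k: int = 50) -> List[Pixel]:
    """Keep the k candidates with the highest local score."""
    return heapq.nlargest(k, candidates, key=local_score)

all_pixels = pairwise_table_entries(60_000)  # 3,600,000,000 entries
edge_pixels = pairwise_table_entries(1_000)  # 1,000,000 entries
top_matches = pairwise_table_entries(50)     # 2,500 entries

# Toy example: 1,000 edge candidates, local score peaks at x = 500.
candidates = [(x, 0) for x in range(1000)]
kept = prune_domain(candidates, lambda p: -abs(p[0] - 500), k=50)
```

Going from all pixels to the top 50 candidates shrinks each pairwise table by a factor of more than a million.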

Learning Challenge. Hand labeling: costly and time-consuming. Hidden variables: where to start? Local optima problem. Bootstrap from simple instances where outlining is easy (cartoons / drawings): no confusing background; the outline (shape) is easily recovered using a snake.

Learning from Cartoon Drawings: Registration; Shape Learning; Shape and Appearance Learning.

Contour-Contour Registration. Treat one contour as an image. Build a model from the other contour (with a prior on variances). The registration technique/code is the same.

Model Construction. Shape template: sampled along the contour, rotated to match, averaged over instances. Appearance template: average of inside/outside masks. Location and pairwise parameters: maximum likelihood estimation of Gaussian moments. (Figure: Instance 1 + Instance 2 = Model, shown for the shape template and for the IN/OUT masks.)
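The maximum likelihood estimates of Gaussian moments are the sample mean and the (biased, divide-by-N) sample variance. A minimal sketch for one scalar parameter, such as the dX offset between two landmarks across registered instances; the sample values and the function name are made up for illustration.

```python
from typing import Sequence, Tuple

def ml_gaussian_moments(samples: Sequence[float]) -> Tuple[float, float]:
    """Maximum likelihood mean and variance of a univariate Gaussian.
    The ML variance divides by N, not N - 1."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, var

# Hypothetical dX offsets between two landmarks in three training instances.
mean_dx, var_dx = ml_gaussian_moments([9.0, 10.0, 11.0])

# With a single instance the ML variance collapses to zero.
_, var_single = ml_gaussian_moments([10.0])
```

The zero variance from a single instance is one plausible reason the contour-to-contour step uses a prior on variances, although the slides do not spell this out.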

Phase I: Learning from Cartoons. Extract a high-resolution contour using a snake. Create a shape-based model from the training contours. Pairwise merging of models. Selection of landmarks. (Figure labels: Registration Pyramid, Final Shape Model.)

Learning from Cartoons. Extract a high-resolution contour using a snake. Create a shape-based model (with prior variances). Pairwise merging of models. Selection of landmarks. Final shape model.

Outline: Landmark-based shape model; Registration as inference; Learning from cartoon drawings; Results; 3D preview.

Measure of success. Typical prediction measure: has nothing to do with localization / outlining. (Figure: automatic outlining vs. true outline, OMR; example score 0.77.)
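The later slides report overlap scores between the automatic outline and the true outline, but the transcript does not preserve the definition. The sketch below assumes intersection over union of the two filled regions, which is one common overlap measure and may differ from the one used in the talk; `overlap_score` and `filled_rectangle` are hypothetical helpers.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]

def overlap_score(predicted: Set[Pixel], truth: Set[Pixel]) -> float:
    """Intersection over union of two pixel regions (assumed definition)."""
    union = predicted | truth
    if not union:
        return 1.0  # two empty regions agree trivially
    return len(predicted & truth) / len(union)

def filled_rectangle(x0: int, x1: int, y0: int, y1: int) -> Set[Pixel]:
    """Pixels with x0 <= x < x1 and y0 <= y < y1."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}

# Toy regions: the predicted outline is shifted 2 pixels from the true one.
truth = filled_rectangle(0, 10, 0, 10)
predicted = filled_rectangle(2, 12, 0, 10)
score_shifted = overlap_score(predicted, truth)  # 80 shared / 120 total
score_exact = overlap_score(truth, truth)
```

Unlike a yes/no recognition measure, this kind of score drops as soon as the outline drifts, even when the object is still detected.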

Localization Results. (Figure: training cartoons and sample registrations, with per-image scores 0.84, 0.75, 0.84, 0.72, 0.18, 0.81, 0.66, 0.77, 0.40.)

Localization Results - 2

Cartoon vs. Hand Segmentation. (Plot: Mean Overlap Score, 0.1 to 0.9, vs. Number of Training Instances, 0 to 5; curves: Learned from Drawings, Hand Constructed, Human Inter-Observer. Image labels: cartoon, hand segmented.) Learning shape from cartoons is competitive with hand segmentation!

Prediction. (ROC plot: True positive rate vs. False positive rate, each from 0 to 1.) Object recognition (TP=FP rate): car side 86%, cougar 86%, airplane 86%, buddha 84%, bass 76%, rooster 73%. Comparable to constellation w/ 5 instances (Fei-Fei et al.). Leading (discriminative) methods require many instances.

How hard is the problem? (Figure captions: registration of shape appears absurd; support of edges is not what we expect.)

Effect of different features. (Figure panels: shape only; with appearance; with location prior.)

Learning Appearance. (Figure panels: No Appearance; FG/BG Appearance. Plot: Average overlap, 0.46 to 0.64, vs. # images in phase II, 0 to 10; curves: shape, shape + appearance.)

Other approaches (other approach vs. our approach).
Discriminative [Grauman/Darrell, Serre]. Other: ignorant of shape; rely on image stats; many instances. Ours: shape aware; rely on object stats; few instances.
Geometric [Fergus, Fei-Fei, Quattoni]. Other: good recognition; localization: not great. Ours: good precise outlining; competitive recognition.
Landmark based [Berg]. Other: match to template rather than model; leverage image stats. Ours: model shape, match to object in image.
Outlining [Cootes, Coughlan, Kumar]. Other: limited scenarios; few D.O.F. Ours: varied classes; flexible model.

3D model to 2D images. Capture all properties of an object. Model projection invariants. (Figure label: decimation.)

3D to 2D registration. (Figure: projections P1, P2, P3.) Given the projection, this is just 2D registration.

Learning 3D to 2D invariants. (Figure: initial (prior) model; projections P1, P2, P3; probabilistic 3D model.) Distances: EM with least-squares problem. Appearance: ???

Summary and Future Work. Summary: flexible probabilistic shape model; effective outlining of object; novel learning from cartoons. Future work: develop a better appearance model; generalize to 3D models; learn projection invariants.

Thanks! Gal Elidan, Geremy Heitz, Daphne Koller. Stanford University.

Phase II: Learning from Images. Correspond the initial model to training images. Training set selection: select the best correspondences (high score rather than low score) as training instances. Learn the final shape- and appearance-based model. (Figure: Cartoon Phase Model, Transfer, Natural Image Model.)

Transfer of Object Shape. Transfer of shape speeds up learning. (Plot “Benefit of shape transfer”: Average overlap, 0 to 0.6, vs. # images in phase II, 0 to 10; curves: transfer, no transfer.)

Training Instance Selection. (Plot: Average overlap, 0.3 to 0.7, vs. # images in phase II, 0 to 20; curves: AUTO, HAND, PICKED.)

Training Instance Selection. (Plot: Average overlap, 0.5 to 0.7, vs. # images in phase II, 0 to 20; curves: AUTO, HAND, PICKED.)
