Slide 1: Context Aware Spatial Priors using Entity Relations (CASPER)
Geremy Heitz, Jonathan Laserson, Daphne Koller
DAGS, December 10th, 2007
Slide 2: Outline
- Goal – Scene Understanding
- Existing Methods
- CASPER
- Preliminary Experiments
- Future Direction – Going Discriminative
Slide 3: [Figure: street scene with labeled objects: Building, Tree, Car]
Slide 4: Representation
[Figure: street scene with labeled objects: Building, Tree, Car, Building, Car]
- l = the bag of object categories
- ρ = the locations of the object centroids
- We model P(ρ, l). Why? Because we use a generative model:
  P(ρ, l | I) ∝ P(ρ, l) P(I | ρ, l), where I is the image
Slide 5: [Figure: two candidate labelings of the same scene over Building, Tree, Car objects]
- Which one makes more sense?
- Does context matter?
Slide 6: Can context help object recognition?
- LOOPS
Slide 7: Outline
- Goal – Scene Understanding
- Existing Methods
- CASPER
- Preliminary Experiments
- Future Direction – Going Discriminative
Slide 8: Fixed-Order Model
- Each image has the same bag of objects (example: 1 car, 2 buildings, 1 tree)
- Object centroids are drawn jointly:
  P(ρ, l) = 1{l = l_fixed_order} P(ρ | l)
- Similar to constellation models (Fergus)
- Problem: we don't always know the exact set of objects
Slide 9: TDP (Sudderth, 2005)
- Each image has a different bag of objects
- Object centroids are drawn independently:
  P(ρ, l) = P(l) Π_i P(ρ_i | l_i)
- Problems:
  - This doesn't take pairwise constraints into account
  - We have lost context (see the sketch below)
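To make the independence concrete, here is a minimal Python sketch; the class list, means, and covariances are invented for illustration. The joint score decomposes into per-object terms, so no pairwise constraint can ever enter, which is exactly the lost context:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical per-class Gaussians over absolute centroid location
# (all parameters are illustrative, not learned values from the talk).
CLASS_PRIOR = {
    "car":      multivariate_normal(mean=[0.5, 0.8], cov=0.02 * np.eye(2)),
    "building": multivariate_normal(mean=[0.5, 0.3], cov=0.05 * np.eye(2)),
}

def independent_log_prob(centroids, labels, log_p_l=0.0):
    """log P(rho, l) = log P(l) + sum_i log P(rho_i | l_i).
    Each centroid is scored on its own; no pairwise terms appear."""
    return log_p_l + sum(CLASS_PRIOR[l].logpdf(rho)
                         for rho, l in zip(centroids, labels))
```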
Slide 10: Outline
- Goal – Scene Understanding
- Existing Methods
- CASPER
- Preliminary Experiments
- Future Direction – Going Discriminative
Slide 11: CASPER
- Each image has a different bag of objects
- Object centroids are drawn jointly given l:
  P(ρ, l) = P(l) P(ρ | l)
- Questions:
  - How do we represent P(l)?
  - How do we represent P(ρ | l)?
  - How do we learn? How do we infer?
Slide 12: P(l)
- Dirichlet Process (we don't want to get into that now)
- Other options: Multinomial, Uniform
Slide 13: P(ρ | l) – Desiderata
- Correlations between the ρ's
- Sharing of parameters between l's
- Intuitive parameterization
- A continuous multivariate distribution
- Easy to learn parameters, easy to evaluate likelihood, easy to condition
- Gaussian?
Slide 14: Multivariate Gaussian – Options
- Learn a different Gaussian for every l
  - Can't share parameters
  - Large (infinite) number of possible l's
- Gaussian Process: ρ(x) ~ GP(μ(x), K(x, x'))
  - Every finite set of x's produces a Gaussian: [ρ(x_1), ρ(x_2), …, ρ(x_k)] ~ Gaussian
  - x_t is a hidden function of the class l_t
  - μ(x_t) = A x_t
  - K(x_t, x_t') = c exp(-||B(x_t - x_t')||^2)
  - Two objects of the same class -> same x? (see the sketch below)
  - Is correlation the natural space?
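A small numpy sketch of the GP option under the kernel above; the latent codes X and the matrices A and B are invented for illustration. Note that giving two same-class objects the same x drives their correlation to 1 and makes K singular, hence the jitter term:

```python
import numpy as np

def gp_cov(X, B, c=1.0):
    """K(x_t, x_t') = c * exp(-||B(x_t - x_t')||^2) on all pairs of rows of X."""
    BX = X @ B.T
    d2 = ((BX[:, None, :] - BX[None, :, :]) ** 2).sum(-1)
    return c * np.exp(-d2)

# Hypothetical latent codes for [car, car, building]; the two cars share x.
X = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
A = np.array([[0.5, 0.2], [0.1, 0.6]])
mu = X @ A.T                                   # mean mu(x_t) = A x_t, one row per object
K = gp_cov(X, B=np.eye(2)) + 1e-6 * np.eye(3)  # jitter: duplicate x's make K singular
rng = np.random.default_rng(0)
rho_x = rng.multivariate_normal(mu[:, 0], K)   # sample the centroids' x-coordinates
```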
Slide 15: Car Spatial Distribution – Options
- "Singleton expert" P(ρ_i | l_i): a Gaussian over absolute object location
- "Pairwise expert" P(ρ_i - ρ_j | l_i, l_j): a Gaussian over the offset between objects
- Each expert can be one of K mixture components (sketched below)
[Figure: Tree/Car offset components, k = 1, k = 2, k = 1]
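A hedged sketch of the two expert types in code; the example offset component at the bottom is invented to mirror the figure, not a learned value:

```python
import numpy as np
from scipy.stats import multivariate_normal

class SingletonExpert:
    """Gaussian over the absolute centroid of a single object: P(rho_i | l_i)."""
    def __init__(self, mean, cov):
        self.dist = multivariate_normal(mean, cov)
    def logpdf(self, rho_i):
        return self.dist.logpdf(rho_i)

class PairwiseExpert:
    """Gaussian over the offset rho_i - rho_j: P(rho_i - rho_j | l_i, l_j).
    One of K such components can be chosen per object pair."""
    def __init__(self, mean, cov):
        self.dist = multivariate_normal(mean, cov)
    def logpdf(self, rho_i, rho_j):
        return self.dist.logpdf(np.asarray(rho_i) - np.asarray(rho_j))

# Illustrative component: a tree sitting about 0.2 units left of a car.
tree_left_of_car = PairwiseExpert(mean=[-0.2, 0.0], cov=0.01 * np.eye(2))
```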
Slide 16: CASPER P(ρ | l)
- How do we use the experts? Introduce an auxiliary variable d: P(ρ | d, l)
- d tells us which experts are 'on'
- For each edge e = (l_i, l_j), d_e indexes all possible experts for this edge; the default is a uniform expert
- P(ρ | d, l) ∝ POE_d, where POE_d = Π P(ρ_i | l_i) Π P(ρ_i - ρ_j | d_ij, l_i, l_j)
- A product of Gaussians is a Gaussian (assembled in code below)
[Figure: scene graph over Building, Tree, Car, Building, Car]
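Because every expert is Gaussian, POE_d is Gaussian, and its (μ_d, Σ_d) can be assembled in information (precision) form. This is one plausible implementation for 2-D centroids, not the talk's code:

```python
import numpy as np

def poe_joint(n, singletons, pairwise):
    """Joint Gaussian over n 2-D centroids implied by a product of experts.
    singletons: {i: (mean, cov)} on rho_i; pairwise: {(i, j): (mean, cov)}
    on rho_i - rho_j. Returns (mu_d, Sigma_d)."""
    Lam = np.zeros((2 * n, 2 * n))        # joint precision: experts' precisions add
    eta = np.zeros(2 * n)                 # joint information vector
    for i, (m, C) in singletons.items():
        P = np.linalg.inv(C)
        Lam[2*i:2*i+2, 2*i:2*i+2] += P
        eta[2*i:2*i+2] += P @ np.asarray(m)
    for (i, j), (m, C) in pairwise.items():
        P = np.linalg.inv(C)
        for a, sa in ((i, 1.0), (j, -1.0)):
            for b, sb in ((i, 1.0), (j, -1.0)):
                Lam[2*a:2*a+2, 2*b:2*b+2] += sa * sb * P
            eta[2*a:2*a+2] += sa * (P @ np.asarray(m))
    # Each offset expert alone is rank-deficient (it only constrains a
    # difference); enough experts together make Lam invertible.
    Sigma = np.linalg.inv(Lam)
    return Sigma @ eta, Sigma
```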
Slide 17: CASPER P(ρ | d, l)
- POE_d = Z_d N(ρ; μ_d, Σ_d)
- P(ρ | d, l) = N(ρ; μ_d, Σ_d) = (1/Z_d) POE_d
- P(d | l) ∝ Z_d (Multinomial)
- P(ρ, d | l) ∝ POE_d
- Example (a Car1–Car2–Car3 chain): P(ρ, d | l) ∝ P(ρ_2 - ρ_1 | d_12) P(ρ_3 - ρ_2 | d_32)
- Two assignments d_1 and d_2 can satisfy P(ρ | d_1, l) = P(ρ | d_2, l) while Z_{d_2} > Z_{d_1}, hence POE_{d_2} > POE_{d_1}
[Figure: two expert assignments d_1, d_2 over the three-car chain]
Slide 18: Learning the Experts
- Training set with supervised (ρ, l) pairs (one pair for each image)
- Gibbs sampling over the hidden variables d_e:
  - Loop over the edges
  - Update the experts' sufficient statistics with each update (one sweep is sketched below)
- Does it converge? Not as well as we want it to; work in progress
[Figure: scene graph over Building, Tree, Car, Building, Car]
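A sketch of one Gibbs sweep over the edge variables. The experts' logpdf/add/remove methods (incremental sufficient-statistic updates) are hypothetical stand-ins for the actual implementation:

```python
import numpy as np

def gibbs_sweep_edges(edges, experts, rho, assign, rng):
    """Resample each edge's expert index d_e given the current offsets,
    refreshing that expert's sufficient statistics as we go.
    edges: list of (i, j); assign[e]: current expert index for edge e."""
    for e, (i, j) in enumerate(edges):
        offset = rho[i] - rho[j]
        experts[assign[e]].remove(offset)   # take this edge out of its stats
        logp = np.array([ex.logpdf(offset) for ex in experts])
        p = np.exp(logp - logp.max())       # normalize in log space for stability
        assign[e] = rng.choice(len(experts), p=p / p.sum())
        experts[assign[e]].add(offset)      # fold the edge back in
    return assign
```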
Slide 19: Outline
- Goal – Scene Understanding
- Existing Methods
- CASPER
- Preliminary Experiments
- Future Direction – Going Discriminative
Slide 20: Preliminary Experiments
- LabelMe datasets: STREETS and BEDROOMS
Slide 21: Features and Instances
[Figure: interest points (*) over a scene; one car instance with centroid ρ_t]
- FEATURES: Harris interest operator -> y_i; SIFT descriptor -> w_i; instance membership -> t_i
- INSTANCES: centroid -> ρ_t; class label -> l_t
- Observed: (y_i, w_i, t_i), (ρ_t, l_t)
- P(I | ρ, l) = P(y, w | ρ, l) (feature pipeline sketched below)
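A hedged OpenCV sketch of the feature pipeline on this slide (Harris interest points, then SIFT descriptors at those points); the threshold and keypoint size are illustrative defaults, not the talk's settings:

```python
import cv2
import numpy as np

def extract_features(gray):
    """Harris interest operator -> locations y_i; SIFT -> descriptors w_i.
    `gray` is an 8-bit grayscale image."""
    resp = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)
    ys, xs = np.where(resp > 0.01 * resp.max())          # keep strong corners
    kps = [cv2.KeyPoint(float(x), float(y), 8.0) for x, y in zip(xs, ys)]
    sift = cv2.SIFT_create()
    kps, desc = sift.compute(gray, kps)                  # w_i: 128-D SIFT
    return np.array([kp.pt for kp in kps]), desc         # y_i, w_i
```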
Slide 22: What do the true ρ's look like?
[Figure: empirical offset scatter plots: Car -> Car, Lamp -> Lamp, Bed -> Lamp]
Slide 23: Learning/Inference in the Full Model
- TDP-style three-stage Gibbs (skeleton below):
  - Assign features to instances (sample t_i for every feature)
  - Assign expert components (sample d_e for every edge)
  - Assign instances to classes (sample l_t, ρ_t for every instance)
- Training: supervise the (t, l) variables; Gibbs over d and ρ
- Testing: introduce new images; Gibbs over (t, l, d, ρ) of the new images
- Independent-TDP: the ρ's are independent
- CASPER-TDP: the ρ's are distributed according to CASPER
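The three stages written out as skeleton Python; the image container and the three conditional samplers are placeholders, since the talk does not spell out their forms:

```python
def gibbs_iteration(images, model, rng):
    """One pass of the three-stage sampler. `model.sample_*` are
    hypothetical conditional samplers; supervised variables are
    simply never resampled."""
    for img in images:
        for i in img.features:                  # stage 1: feature -> instance
            img.t[i] = model.sample_t(img, i, rng)
        for e in img.edges:                     # stage 2: expert components
            img.d[e] = model.sample_d(img, e, rng)
        for t in img.instances:                 # stage 3: instance class + centroid
            img.l[t], img.rho[t] = model.sample_rho_l(img, t, rng)
```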
Slide 24: Learned Experts
Slide 25: [Figure: sampled interest points and their instance memberships (y_i, w_i, t_i)]
Slide 26: [Figure panels: IMAGE, GROUND TRUTH, IND – N = 0.1, IND – N = 0.5]
Slide 27: Evaluation – Generative Model
- "Synthetic appearance": visual words give a strong indicator of the class
- Evaluated on detection performance: precision/recall; F1 score for centroid and class identification (one plausible metric is sketched below)
- Results here are with Independent-TDP
- Can we hope to do this well?

F1 by appearance-noise level N:

Class      N = 0.1   N = 0.3   N = 0.5
Bed        0.6111    0.6286    0.5882
Lamp       0.3077    0.1667    0.0000
Painting   0.5333    0.3333    0.2857
Window     0.9091    0.7692    0.5455
Table      0.6667    0.4211    0.3529
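One plausible reading of the detection F1 (centroid and class identification); the matching radius is an assumption, since the talk does not state one:

```python
import numpy as np

def detection_f1(pred, truth, max_dist=0.1):
    """Greedy matching: a prediction counts as a true positive if an unused
    ground-truth object of the same class lies within max_dist of its
    centroid. pred/truth: lists of (class_label, centroid ndarray)."""
    used, tp = set(), 0
    for cls, rho in pred:
        for k, (tcls, trho) in enumerate(truth):
            if k not in used and tcls == cls and np.linalg.norm(rho - trho) <= max_dist:
                used.add(k)
                tp += 1
                break
    prec = tp / len(pred) if pred else 0.0
    rec = tp / len(truth) if truth else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0
```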
Slide 28: Evaluation – Context
- Independent-TDP vs. CASPER-TDP at N = 0.5
- Why isn't context helping here?

F1 at N = 0.5:

Class      Independent   CASPER
Bed        0.5882        0.5714
Lamp       0.0000        0.0000
Painting   0.2857        0.1333
Window     0.5455        0.4000
Table      0.3529        0.1250
Slide 29: Problems with this Setup
- Bad feelings about the supervised setting (detection):
  - Our model is not trained to maximize detection ability
  - We will lose to many/most discriminative approaches
  - Context is NOT the main reason why TDP fails
- Unsupervised setting:
  - Likelihood? Does anyone care?
  - Object discovery? Context is a lower-order consideration
- How would we show that CASPER > Independent?
Slide 30: Outline
- Goal – Scene Understanding
- Existing Methods
- CASPER
- Preliminary Experiments
- Future Direction – Going Discriminative
Slide 31: Going Discriminative
- Up to now we have been generative: P(I, ρ, l) = P(I | ρ, l) P(ρ, l)
- How do we convert this into a discriminative model?
  - Include the CASPER distribution over (ρ, l)
  - Include a term with boosted object detectors
  - Slap on a partition function:
    P(ρ, l | I) = (1/Z) × CASPER × DETECTORS
Slide 32: Discriminative Framework
- Boosted detectors "over-detect"
- Each "candidate" has a location ρ_t, a class variable l_t, and a detection score D_I(l_t)
- P(ρ, l | I) ∝ P(ρ, l) Π_t D_I(l_t) (scoring sketched below)
- Goal: reassign the detection candidates to classes, respecting both the "detection strength" and the context between objects
[Figure: two face candidates, D_I(face) = 0.92 and D_I(face) = 0.09]
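A minimal sketch of the unnormalized score being maximized; casper_log_prior stands in for CASPER's joint log-density, and allowing a 'background' label to switch a candidate off is our assumption, not something stated on the slide:

```python
import numpy as np

def log_posterior(labels, rho, casper_log_prior, det_scores):
    """Unnormalized log P(rho, l | I) = log P(rho, l) + sum_t log D_I(l_t).
    det_scores[t][c]: boosted-detector score of candidate t for class c
    (possibly including a hypothetical 'background' class)."""
    return casper_log_prior(rho, labels) + sum(
        np.log(det_scores[t][c]) for t, c in enumerate(labels))
```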
Slide 33: Similarities to Steve's Work
- "Over-detection" using boosted detectors, but some detections don't make sense in context
- 3D information allows him to "sort out" which detections are correct
Slide 34: CASPER Learning/Inference
- Gibbs inference (one conditional is sketched below):
  - Loop over images
  - Loop over detection candidates t: sample (l_t | everything else)
  - Loop over pairs of candidates: sample (d_e | everything else)
- Training: l_t is known; Gibbs over the d_e
- Evaluation: precision/recall for detections
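A sketch of the per-candidate conditional for l_t, trading off detector strength against pairwise context; every name here is illustrative:

```python
import numpy as np

def sample_label(t, labels, rho, det_scores, pair_logp, classes, rng):
    """Sample (l_t | everything else). pair_logp(c, c2, offset) stands in
    for the currently selected experts' log-density on rho_t - rho_u."""
    logp = []
    for c in classes:
        lp = np.log(det_scores[t][c])           # detection strength
        for u, c2 in enumerate(labels):
            if u != t:                          # context with other candidates
                lp += pair_logp(c, c2, rho[t] - rho[u])
        logp.append(lp)
    logp = np.array(logp)
    p = np.exp(logp - logp.max())
    return classes[rng.choice(len(classes), p=p / p.sum())]
```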
Slide 35: Possible Datasets
Slide 36: Short-Term Plan
- Learn the boosted detectors
- Determine our baseline performance
- Add Gibbs inference
- Submit to a conference that is far, far away… (ICML = Helsinki, Finland)
Slide 38: Alternate Names
- Spatial Priors for Arbitrary Groups of Objects
Slide 39: Product of Experts – Precision-Space View
- P_1(x) = N(x; a, A), P_2(x) = N(x; b, B)
- P_1(x) P_2(x) = Z N(x; c, C), where:
  - Z = N(a; b, A + B)
  - C^-1 = A^-1 + B^-1
  - c = C (A^-1 a + B^-1 b)
- What does this mean? The precision matrices of the experts ADD
- Even if each expert has a singular precision A^-1, the sum is still PSD
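The identity is easy to check numerically; the sketch below verifies the product rule at one test point with made-up parameters:

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

a, A = np.array([0.0, 0.0]), np.array([[1.0, 0.3], [0.3, 1.0]])
b, B = np.array([1.0, -1.0]), np.array([[0.5, 0.0], [0.0, 2.0]])

C = np.linalg.inv(np.linalg.inv(A) + np.linalg.inv(B))   # precisions add
c = C @ (np.linalg.inv(A) @ a + np.linalg.inv(B) @ b)
Z = mvn(mean=b, cov=A + B).pdf(a)                        # Z = N(a; b, A + B)

x = np.array([0.3, 0.7])
lhs = mvn(a, A).pdf(x) * mvn(b, B).pdf(x)
rhs = Z * mvn(c, C).pdf(x)
assert np.isclose(lhs, rhs)                              # P1 * P2 = Z * N(c, C)
```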
Slide 40: CASPER Detection
- Detection-strength component: D_I(l_t) = P(l_t | I[ρ_t])
- Occurrence component: P(l) = Π_t P(l_t), with l_t ~ Multinomial
- CASPER component: P(ρ, d | l) ∝ POE_d