1
Transferring information using Bayesian priors on object categories
Li Fei-Fei¹, Rob Fergus², Pietro Perona¹
¹ California Institute of Technology, ² University of Oxford
2
S. Savarese, 2003
3
P. Bruegel, 1562
4
Constellation model of object categories
Fischler & Elschlager 1973; Yuille '91; Brunelli & Poggio '93; Lades, v.d. Malsburg et al. '93; Cootes, Lanitis, Taylor et al. '95; Amit & Geman '95, '99; Perona et al. '95, '96, '98, '00, '03; many more recent works…
5
X (location): the (x, y) coordinates of the region center.
A (appearance): normalize an 11×11 patch and project it onto a fixed PCA basis, giving coefficients (c1, c2, …, c10).
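To make the two region descriptors concrete, here is a minimal Python sketch (not the authors' code); `image` is assumed to be a 2-D grayscale numpy array, `(cx, cy)` a detected region center, and `basis` a hypothetical precomputed 10-component PCA basis for flattened 11×11 patches:

```python
import numpy as np

def region_features(image, cx, cy, basis):
    # X (location): the (x, y) coordinates of the region center.
    location = np.array([cx, cy], dtype=float)

    # A (appearance): cut out an 11x11 patch around the center...
    patch = image[cy - 5:cy + 6, cx - 5:cx + 6].astype(float).ravel()

    # ...normalize it (zero mean, unit norm is one common choice)...
    patch -= patch.mean()
    patch /= np.linalg.norm(patch) + 1e-12

    # ...and project onto the 10 PCA components: (c1, ..., c10).
    appearance = basis @ patch  # basis assumed to have shape (10, 121)
    return location, appearance
```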
6
The Generative Model
[Graphical model: image I, hypothesis h, and nodes X (location) and A (appearance) with parameters µX, ΣX and µA, ΣA.]
7
The hypothesis (h) node
h is a mapping from interest regions to parts. [Figure: ten detected interest regions, indexed 1–10.] E.g. h_i = [3, 5, 9].
8
The hypothesis (h) node
A different hypothesis over the same ten interest regions: e.g. h_j = [2, 4, 8].
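A minimal sketch of this hypothesis space, using the slides' 1-based region indices and the P = 3 parts of the examples: every ordered choice of P distinct regions is one hypothesis.

```python
from itertools import permutations

def all_hypotheses(num_regions, num_parts):
    # Every ordered assignment of `num_parts` distinct regions to the
    # model parts is one hypothesis h. The space grows as N!/(N-P)!,
    # which is why practical systems prune it.
    return list(permutations(range(1, num_regions + 1), num_parts))

hyps = all_hypotheses(num_regions=10, num_parts=3)
assert (3, 5, 9) in hyps and (2, 4, 8) in hyps  # the slides' examples
```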
9
The spatial node
X collects the coordinates of the hypothesized parts: (x1, y1), (x2, y2), (x3, y3).
10
The spatial parameters node
Joint Gaussian (µX, ΣX): a joint density over the locations of all parts.
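A minimal sketch of the spatial term: the part locations, stacked into one 2P-dimensional vector, are scored under a single joint Gaussian. `mu_x` and `sigma_x` stand in for the learned µX, ΣX.

```python
import numpy as np
from scipy.stats import multivariate_normal

def shape_log_density(locations, mu_x, sigma_x):
    # locations: list of (x, y) pairs for the P hypothesized parts.
    x = np.asarray(locations, dtype=float).ravel()  # (x1, y1, ..., xP, yP)
    return multivariate_normal.logpdf(x, mean=mu_x, cov=sigma_x)
```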
11
The appearance node
PCA coefficients on a fixed basis, one vector per part: Part 1: (c1, c2, c3, …); Part 2: (c1, c2, c3, …); Part 3: (c1, c2, c3, …).
12
The appearance parameter node
Gaussian (µA, ΣA) over the PCA coefficients on the fixed basis; independence is assumed between the P parts.
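A minimal sketch of the appearance term: one Gaussian per part over its PCA coefficients, with the P parts assumed independent so their log-densities simply add. `mu_a[p]` and `sigma_a[p]` stand in for the learned per-part parameters.

```python
import numpy as np
from scipy.stats import multivariate_normal

def appearance_log_density(coeffs, mu_a, sigma_a):
    # coeffs: array of shape (P, 10) -- PCA coefficients per part.
    # Independence between parts => sum of per-part log-densities.
    return sum(
        multivariate_normal.logpdf(coeffs[p], mean=mu_a[p], cov=sigma_a[p])
        for p in range(len(coeffs))
    )
```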
13
Maximum Likelihood interpretation
Observed variables: X, A. Hidden variable: h. Parameters: θ = {µX, ΣX, µA, ΣA}. There is also a background model, which is constant for a given image.
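A minimal sketch of the likelihood this interpretation optimizes: sum the joint density over the hidden hypothesis h (a uniform p(h) is assumed and folded into a constant, and the background terms are omitted since the slide notes they are constant per image). It reuses the hypothetical helpers sketched above.

```python
import numpy as np
from scipy.special import logsumexp

def image_log_likelihood(locations, coeffs, hyps, mu_x, sigma_x, mu_a, sigma_a):
    # p(X, A | theta) ∝ sum_h p(X, A, h | theta), computed in log space.
    scores = []
    for h in hyps:
        idx = [i - 1 for i in h]  # region indices are 1-based on the slides
        scores.append(
            shape_log_density([locations[i] for i in idx], mu_x, sigma_x)
            + appearance_log_density(np.asarray([coeffs[i] for i in idx]),
                                     mu_a, sigma_a)
        )
    return logsumexp(scores)
```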
14
MAP solution
Introduce priors over the parameters. Choose a conjugate form, the Normal–Wishart distribution:
p(µ, Γ) = p(µ | Γ) p(Γ) = N(µ | m, βΓ) W(Γ | a, B)
Prior hyperparameters: {m0X, β0X, a0X, B0X, m0A, β0A, a0A, B0A}. Observed variables: X, A; hidden variable: h.
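A minimal sketch of drawing parameters from this Normal–Wishart prior, reading Γ as a precision matrix and βΓ as the precision of the conditional Normal (a standard convention; the slide's stripped symbols are reconstructed under that assumption):

```python
import numpy as np
from scipy.stats import wishart

def sample_normal_wishart(m, beta, a, B, rng=None):
    rng = rng or np.random.default_rng()
    # Gamma ~ W(Gamma | a, B): Wishart with a degrees of freedom, scale B.
    gamma = wishart.rvs(df=a, scale=B, random_state=rng)
    # mu | Gamma ~ N(mu | m, (beta * Gamma)^-1).
    cov = np.linalg.inv(beta * gamma)
    mu = rng.multivariate_normal(np.asarray(m, dtype=float), cov)
    return mu, gamma
```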
15
Variational Bayesian model
Estimate the posterior distribution over the parameters, approximated by a Normal–Wishart with parameters {mX, βX, aX, BX, mA, βA, aA, BA}. Observed variables: X, A; hidden variable: h; priors as above.
16
ML/MAP Learning
Learn θ = {µX, ΣX, µA, ΣA} from training images 1, 2, …, n. Performed by EM. (Weber et al. '98, '00; Fergus et al. '03)
17
Bayesian Variational Learning
Parameters to estimate: {mX, βX, aX, BX, mA, βA, aA, BA}, i.e. the parameters of the Normal–Wishart distribution. Performed by Variational Bayesian EM. (Fei-Fei et al. '03, '04)
18
Variational EM (Attias, Hinton, Beal, etc.)
Start from a random initialization and prior knowledge of p(θ). Alternate the E-step and the M-step; each iteration produces a new estimate of p(θ | train) and new θ's.
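A skeleton of that loop (a sketch, not the authors' code): `e_step`, `m_step`, and `init` are hypothetical callables, where the E-step computes responsibilities q(h) under the current parameter posterior and the M-step updates the Normal–Wishart hyperparameters using the prior.

```python
def variational_em(data, prior, init, e_step, m_step, n_iters=50):
    # `e_step(data, posterior)` is assumed to return responsibilities q(h)
    # for each training image; `m_step(data, resp, prior)` is assumed to
    # return updated {m, beta, a, B} for both shape and appearance.
    posterior = init(prior)  # random initialization
    for _ in range(n_iters):
        resp = e_step(data, posterior)         # E-step: q(h) given q(theta)
        posterior = m_step(data, resp, prior)  # M-step: new p(theta | train)
    return posterior
```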
19
Weakly supervised learning
No labeling. No segmentation. No alignment.
20
Experiments
Datasets: The Caltech-101 Object Categories (foreground and background), www.vision.caltech.edu/feifeili/Datasets.htm
Training: 1–6 images (randomly drawn)
Detection test: 50 fg / 50 bg images, object present/absent
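A minimal sketch of how the present/absent test could be scored; the likelihood-ratio decision at threshold 0 is an assumed protocol detail, and `log_lik_obj` / `log_lik_bg` are hypothetical scoring functions for the object and background models.

```python
def detect(images, log_lik_obj, log_lik_bg, threshold=0.0):
    # "Present" whenever the object model beats the background model.
    return [log_lik_obj(im) - log_lik_bg(im) > threshold for im in images]

def detection_accuracy(fg_images, bg_images, log_lik_obj, log_lik_bg):
    hits = sum(detect(fg_images, log_lik_obj, log_lik_bg))
    rejections = len(bg_images) - sum(detect(bg_images, log_lik_obj, log_lik_bg))
    return (hits + rejections) / (len(fg_images) + len(bg_images))
```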
24
The prior
Captures commonality between different classes. Crucial when training from few images. Constructed by (see the sketch after this list):
– Learning lots of ML models from other classes
– Each model is a point in θ space
– Fitting a Normal–Wishart distribution to these points using moment matching, i.e. estimating {m0X, β0X, a0X, B0X, m0A, β0A, a0A, B0A}
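A minimal sketch of that moment-matching step, under assumed matching rules (the exact moment equations are not given on the slide): each previously learned ML model contributes one (µ, Σ) point, and Normal–Wishart hyperparameters are fit to those points.

```python
import numpy as np

def fit_normal_wishart(mus, sigmas):
    mus = np.asarray(mus, dtype=float)           # (K, d) ML means
    precs = [np.linalg.inv(S) for S in sigmas]   # treat Gamma = Sigma^-1
    d = mus.shape[1]

    m0 = mus.mean(axis=0)                        # match E[mu]
    mean_prec = np.mean(precs, axis=0)           # match E[Gamma] = a0 * B0
    a0 = d + 2.0                                 # weak df choice (> d - 1)
    B0 = mean_prec / a0
    # crudely match Cov[mu] ~ (beta0 * E[Gamma])^-1 with a scalar beta0:
    cov_mu = np.cov(mus.T)
    beta0 = np.trace(np.linalg.inv(mean_prec)) / max(np.trace(cov_mu), 1e-12)
    return m0, beta0, a0, B0
```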
25
What do the priors tell us? 1. Means
[Figure: likely vs. unlikely examples of shape and appearance means.]
26
What do the priors tell us? 2. Variability
[Figure: likely vs. unlikely shape and appearance variability, illustrated with portraits: Renoir; Picasso, 1951; Picasso, 1936; Miro, 1949; Warhol, 1967; Magritte, 1928; Arcimboldo, 1590; Da Vinci, 1507.]
27
The prior on Appearance
[Plot. Blue: Airplane; Green: Leopards; Red: Faces; Magenta: Background.]
28
The prior on Shape
[Plots over the x- and y-coordinates. Blue: Airplane; Green: Leopards; Red: Faces; Magenta: Background.]
29
Motorbikes
6 training images. Classification task (object present/absent).
30
Grand piano
31
Cougar faces
32
Number of classes in prior
33
How good is the prior alone?
34
Performance over all 101 classes
35
Conclusions
– Hierarchical Bayesian parts and structure model
– Learning and recognition of new classes is assisted by transferring information from unrelated object classes
– Variational Bayes is superior to MAP
37
Visualization of learning
38
Sensitivity to quality of feature detector
39
Discriminative evaluation
Mean on the diagonal: 18%. More recent work by Holub, Welling & Perona reaches 40% using a generative/discriminative hybrid.