1
Transferring information using Bayesian priors on object categories
Li Fei-Fei¹, Rob Fergus², Pietro Perona¹
¹ California Institute of Technology, ² University of Oxford
2
S. Savarese, 2003
3
P. Bruegel, 1562
4
Constellation model of object categories
Fischler & Elschlager 1973; Yuille '91; Brunelli & Poggio '93; Lades, v.d. Malsburg et al. '93; Cootes, Lanitis, Taylor et al. '95; Amit & Geman '95, '99; Perona et al. '95, '96, '98, '00, '03; many more recent works…
5
X (location): the (x, y) coordinates of the region center.
A (appearance): normalize an 11×11 patch and project it onto a fixed PCA basis, giving coefficients (c1, c2, …, c10).
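To make the two region descriptors concrete, here is a minimal Python sketch (not the authors' code); `image` is assumed to be a 2-D grayscale numpy array, `(cx, cy)` a detected region center, and `basis` a hypothetical precomputed 10-component PCA basis for flattened 11×11 patches:

```python
import numpy as np

def region_features(image, cx, cy, basis):
    # X (location): the (x, y) coordinates of the region center.
    location = np.array([cx, cy], dtype=float)

    # A (appearance): cut out an 11x11 patch around the center...
    patch = image[cy - 5:cy + 6, cx - 5:cx + 6].astype(float).ravel()

    # ...normalize it (zero mean, unit norm is one common choice)...
    patch -= patch.mean()
    patch /= np.linalg.norm(patch) + 1e-12

    # ...and project onto the 10 PCA components: (c1, ..., c10).
    appearance = basis @ patch  # basis assumed to have shape (10, 121)
    return location, appearance
```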
6
The Generative Model
[Graphical model: image I, hypothesis h, and nodes X (location) and A (appearance) with parameters µX, ΣX and µA, ΣA.]
7
The hypothesis (h) node
h is a mapping from interest regions to parts. [Figure: ten detected interest regions, indexed 1–10.] E.g. h_i = [3, 5, 9].
8
The hypothesis (h) node
A different hypothesis over the same ten interest regions: e.g. h_j = [2, 4, 8].
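A minimal sketch of this hypothesis space, using the slides' 1-based region indices and the P = 3 parts of the examples: every ordered choice of P distinct regions is one hypothesis.

```python
from itertools import permutations

def all_hypotheses(num_regions, num_parts):
    # Every ordered assignment of `num_parts` distinct regions to the
    # model parts is one hypothesis h. The space grows as N!/(N-P)!,
    # which is why practical systems prune it.
    return list(permutations(range(1, num_regions + 1), num_parts))

hyps = all_hypotheses(num_regions=10, num_parts=3)
assert (3, 5, 9) in hyps and (2, 4, 8) in hyps  # the slides' examples
```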
9
The spatial node
X collects the coordinates of the hypothesized parts: (x1, y1), (x2, y2), (x3, y3).
10
The spatial parameters node
Joint Gaussian (µX, ΣX): a joint density over the locations of all parts.
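A minimal sketch of the spatial term: the part locations, stacked into one 2P-dimensional vector, are scored under a single joint Gaussian. `mu_x` and `sigma_x` stand in for the learned µX, ΣX.

```python
import numpy as np
from scipy.stats import multivariate_normal

def shape_log_density(locations, mu_x, sigma_x):
    # locations: list of (x, y) pairs for the P hypothesized parts.
    x = np.asarray(locations, dtype=float).ravel()  # (x1, y1, ..., xP, yP)
    return multivariate_normal.logpdf(x, mean=mu_x, cov=sigma_x)
```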
11
The appearance node
PCA coefficients on a fixed basis, one vector per part: Part 1: (c1, c2, c3, …); Part 2: (c1, c2, c3, …); Part 3: (c1, c2, c3, …).
12
The appearance parameter node
Gaussian (µA, ΣA) over the PCA coefficients on the fixed basis; independence is assumed between the P parts.
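A minimal sketch of the appearance term: one Gaussian per part over its PCA coefficients, with the P parts assumed independent so their log-densities simply add. `mu_a[p]` and `sigma_a[p]` stand in for the learned per-part parameters.

```python
import numpy as np
from scipy.stats import multivariate_normal

def appearance_log_density(coeffs, mu_a, sigma_a):
    # coeffs: array of shape (P, 10) -- PCA coefficients per part.
    # Independence between parts => sum of per-part log-densities.
    return sum(
        multivariate_normal.logpdf(coeffs[p], mean=mu_a[p], cov=sigma_a[p])
        for p in range(len(coeffs))
    )
```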
13
Maximum Likelihood interpretation
Observed variables: X, A. Hidden variable: h. Parameters: θ = {µX, ΣX, µA, ΣA}. There is also a background model, which is constant for a given image.
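A minimal sketch of the likelihood this interpretation optimizes: sum the joint density over the hidden hypothesis h (a uniform p(h) is assumed and folded into a constant, and the background terms are omitted since the slide notes they are constant per image). It reuses the hypothetical helpers sketched above.

```python
import numpy as np
from scipy.special import logsumexp

def image_log_likelihood(locations, coeffs, hyps, mu_x, sigma_x, mu_a, sigma_a):
    # p(X, A | theta) ∝ sum_h p(X, A, h | theta), computed in log space.
    scores = []
    for h in hyps:
        idx = [i - 1 for i in h]  # region indices are 1-based on the slides
        scores.append(
            shape_log_density([locations[i] for i in idx], mu_x, sigma_x)
            + appearance_log_density(np.asarray([coeffs[i] for i in idx]),
                                     mu_a, sigma_a)
        )
    return logsumexp(scores)
```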
14
MAP solution
Introduce priors over the parameters. Choose a conjugate form, the Normal–Wishart distribution:
p(µ, Γ) = p(µ | Γ) p(Γ) = N(µ | m, βΓ) W(Γ | a, B)
Prior hyperparameters: {m0X, β0X, a0X, B0X, m0A, β0A, a0A, B0A}. Observed variables: X, A; hidden variable: h.
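A minimal sketch of drawing parameters from this Normal–Wishart prior, reading Γ as a precision matrix and βΓ as the precision of the conditional Normal (a standard convention; the slide's stripped symbols are reconstructed under that assumption):

```python
import numpy as np
from scipy.stats import wishart

def sample_normal_wishart(m, beta, a, B, rng=None):
    rng = rng or np.random.default_rng()
    # Gamma ~ W(Gamma | a, B): Wishart with a degrees of freedom, scale B.
    gamma = wishart.rvs(df=a, scale=B, random_state=rng)
    # mu | Gamma ~ N(mu | m, (beta * Gamma)^-1).
    cov = np.linalg.inv(beta * gamma)
    mu = rng.multivariate_normal(np.asarray(m, dtype=float), cov)
    return mu, gamma
```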
15
Variational Bayesian model
Estimate the posterior distribution over the parameters, approximated by a Normal–Wishart with parameters {mX, βX, aX, BX, mA, βA, aA, BA}. Observed variables: X, A; hidden variable: h; priors as above.
16
ML/MAP Learning
Learn θ = {µX, ΣX, µA, ΣA} from training images 1, 2, …, n. Performed by EM. (Weber et al. '98, '00; Fergus et al. '03)
17
Bayesian Variational Learning
Parameters to estimate: {mX, βX, aX, BX, mA, βA, aA, BA}, i.e. the parameters of the Normal–Wishart distribution. Performed by Variational Bayesian EM. (Fei-Fei et al. '03, '04)
18
Variational EM (Attias, Hinton, Beal, etc.)
Start from a random initialization and prior knowledge of p(θ). Alternate the E-step and the M-step; each iteration produces a new estimate of p(θ | train) and new θ's.
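A skeleton of that loop (a sketch, not the authors' code): `e_step`, `m_step`, and `init` are hypothetical callables, where the E-step computes responsibilities q(h) under the current parameter posterior and the M-step updates the Normal–Wishart hyperparameters using the prior.

```python
def variational_em(data, prior, init, e_step, m_step, n_iters=50):
    # `e_step(data, posterior)` is assumed to return responsibilities q(h)
    # for each training image; `m_step(data, resp, prior)` is assumed to
    # return updated {m, beta, a, B} for both shape and appearance.
    posterior = init(prior)  # random initialization
    for _ in range(n_iters):
        resp = e_step(data, posterior)         # E-step: q(h) given q(theta)
        posterior = m_step(data, resp, prior)  # M-step: new p(theta | train)
    return posterior
```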
19
Weakly supervised learning
No labeling. No segmentation. No alignment.
20
Experiments
Datasets: The Caltech-101 Object Categories (foreground and background), www.vision.caltech.edu/feifeili/Datasets.htm
Training: 1–6 images (randomly drawn)
Detection test: 50 fg / 50 bg images, object present/absent
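A minimal sketch of how the present/absent test could be scored; the likelihood-ratio decision at threshold 0 is an assumed protocol detail, and `log_lik_obj` / `log_lik_bg` are hypothetical scoring functions for the object and background models.

```python
def detect(images, log_lik_obj, log_lik_bg, threshold=0.0):
    # "Present" whenever the object model beats the background model.
    return [log_lik_obj(im) - log_lik_bg(im) > threshold for im in images]

def detection_accuracy(fg_images, bg_images, log_lik_obj, log_lik_bg):
    hits = sum(detect(fg_images, log_lik_obj, log_lik_bg))
    rejections = len(bg_images) - sum(detect(bg_images, log_lik_obj, log_lik_bg))
    return (hits + rejections) / (len(fg_images) + len(bg_images))
```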
24
The prior
Captures commonality between different classes. Crucial when training from few images. Constructed by (see the sketch after this list):
– Learning lots of ML models from other classes
– Each model is a point in θ space
– Fitting a Normal–Wishart distribution to these points using moment matching, i.e. estimating {m0X, β0X, a0X, B0X, m0A, β0A, a0A, B0A}
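A minimal sketch of that moment-matching step, under assumed matching rules (the exact moment equations are not given on the slide): each previously learned ML model contributes one (µ, Σ) point, and Normal–Wishart hyperparameters are fit to those points.

```python
import numpy as np

def fit_normal_wishart(mus, sigmas):
    mus = np.asarray(mus, dtype=float)           # (K, d) ML means
    precs = [np.linalg.inv(S) for S in sigmas]   # treat Gamma = Sigma^-1
    d = mus.shape[1]

    m0 = mus.mean(axis=0)                        # match E[mu]
    mean_prec = np.mean(precs, axis=0)           # match E[Gamma] = a0 * B0
    a0 = d + 2.0                                 # weak df choice (> d - 1)
    B0 = mean_prec / a0
    # crudely match Cov[mu] ~ (beta0 * E[Gamma])^-1 with a scalar beta0:
    cov_mu = np.cov(mus.T)
    beta0 = np.trace(np.linalg.inv(mean_prec)) / max(np.trace(cov_mu), 1e-12)
    return m0, beta0, a0, B0
```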
25
What do the priors tell us? 1. Means
[Figure: likely vs. unlikely examples of shape and appearance means.]
26
What do the priors tell us? 2. Variability
[Figure: likely vs. unlikely shape and appearance variability, illustrated with portraits: Renoir; Picasso, 1951; Picasso, 1936; Miro, 1949; Warhol, 1967; Magritte, 1928; Arcimboldo, 1590; Da Vinci, 1507.]
27
The prior on Appearance
[Plot. Blue: Airplane; Green: Leopards; Red: Faces; Magenta: Background.]
28
The prior on Shape
[Plots over the x- and y-coordinates. Blue: Airplane; Green: Leopards; Red: Faces; Magenta: Background.]
29
Motorbikes
6 training images. Classification task (object present/absent).
30
Grand piano
31
Cougar faces
32
Number of classes in prior
33
How good is the prior alone?
34
Performance over all 101 classes
35
Conclusions
– Hierarchical Bayesian parts and structure model
– Learning and recognition of new classes is assisted by transferring information from unrelated object classes
– Variational Bayes is superior to MAP
37
Visualization of learning
38
Sensitivity to quality of feature detector
39
Discriminative evaluation
Mean on the diagonal: 18%. More recent work by Holub, Welling & Perona reaches 40% using a generative/discriminative hybrid.