1
Unsupervised Learning for Recognition Pietro Perona California Institute of Technology & Università di Padova 11th British Machine Vision Conference – Manchester, September 2001
2
Representation and Learning for Visual Object Recognition Pietro Perona California Institute of Technology & Università di Padova First SIAM-EMS Conference – Berlin, 6 Sept. 2001
3
Representation and Learning for Visual Object Recognition Pietro Perona California Institute of Technology & Università di Padova University of Plymouth, 10 Sept. 2001
5
OBJECTS ANIMALS INANIMATE PLANTS MAN-MADE NATURAL VERTEBRATE ….. MAMMALS BIRDS GROUSE BOAR TAPIR CAMERA
6
S. Thorpe et al., Nature 1996; J. Braun et al., J. Neurosci. 1998; Fei Fei Li et al., unpublished. Animal / not animal
8
Issues: Representation Recognition Learning
9
Meet the xyz
10
Spot the xyz
11
Meet the Boletus Edulis
12
Object categories individual objects `visual’ categories `functional’ categories *
13
Variability within a category Intrinsic Deformation
14
Part similarity
15
Importance of `mutual position’
16
SVD
17
SVD (2)
18
Model: constellation of parts
Fischler & Elschlager, 1973; Yuille, ‘91; Brunelli & Poggio, ‘93; Lades, v.d. Malsburg et al. ‘93; Cootes, Lanitis, Taylor et al. ‘95; Amit & Geman, ‘95, ‘99; Perona et al. ‘95, ‘96, ’98, ‘00; Tanaka et al., 1993; Perrett & Oram, 1993
19
A B D C Deformations
20
Presence / Absence of Features occlusion
21
Background clutter
22
Generative probabilistic model
Model (parameters) and example:
1. Object part positions. Object shape pdf, e.g. p(x) = G(x | μ, Σ)
2. Part absence. Detector specification and prob. of detection: 0.8, 0.9, 0.6
3a. N false detections (clutter pdf, prob. of N detections): p_Poisson(N_1 | λ_1), p_Poisson(N_2 | λ_2), p_Poisson(N_3 | λ_3)
3b. Position of false detections (pdf of location): p(x) = A^-1 (uniform)
Final image
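The generative steps on this slide can be sketched as a sampler. This is a minimal illustration, not the authors' implementation: it assumes each part position is an independent isotropic Gaussian (the slide's shape pdf is a joint Gaussian), and the names `sample_poisson`, `sample_image`, `part_std` and `clutter_rates` are hypothetical.

```python
import math
import random

def sample_poisson(rng, rate):
    """Draw a Poisson(rate) count with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_image(rng, part_means, part_std, p_detect, clutter_rates, width, height):
    """Return, for each part type, the list of (x, y) detections in one image."""
    detections = []
    for mean, p, rate in zip(part_means, p_detect, clutter_rates):
        points = []
        # 1. Object part position: Gaussian around the part's mean location
        #    (assumption: independent, isotropic; the slide uses a joint pdf).
        x = rng.gauss(mean[0], part_std)
        y = rng.gauss(mean[1], part_std)
        # 2. Part absence: the part is detected only with probability p.
        if rng.random() < p:
            points.append((x, y))
        # 3a. Number of false detections: Poisson with this detector's rate.
        # 3b. Their positions: uniform over the image area A = width * height.
        for _ in range(sample_poisson(rng, rate)):
            points.append((rng.uniform(0, width), rng.uniform(0, height)))
        # The observed image does not reveal which detection is the true part.
        rng.shuffle(points)
        detections.append(points)
    return detections
```

A call such as `sample_image(random.Random(0), [(20, 30), (50, 30), (35, 60)], 2.0, [0.8, 0.9, 0.6], [2.0, 1.0, 3.0], 100, 100)` uses the slide's detection probabilities; the means, standard deviation, clutter rates and image size in that call are made-up values.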
23
Affine Shape
Translation, rotation and scaling: Euclidean shape
Add weak perspective projection: affine shape
What is the probability density for the affine shape variables?
Figure labels: feature space, Euclidean shape, affine shape
24
Affine Shape Density [Leung, Burl & Perona ’98] Gaussian figure space density: Affine shape density: (1) exact if N is odd; (2) good approximation if the probability that basis points flip sign is low. Good / Careful!
25
Example Affine Shape Densities Model points Shape density (ground truth) Shape density (approximation)
26
Generative probabilistic model
Model (parameters) and example:
1. Object part positions. Foreground pdf, e.g. p(x) = G(x | μ, Σ)
2. Part absence. Prob. of detection: 0.8, 0.9
3a. N false detections (background pdf, prob. of N detections): p_Poisson(N_1 | λ_1), p_Poisson(N_2 | λ_2), p_Poisson(N_3 | λ_3)
3b. Position of false detections (pdf of location): p(x) = A^-1 (uniform)
Final image
27
Detection by likelihood ratio: P(object | data) vs. P(clutter | data) [From Burl et al. – ICCV’95, CVPR’96]
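As a rough sketch of the comparison on this slide, the function below scores one candidate constellation by the log ratio of an object likelihood to a clutter likelihood. It is a simplification under stated assumptions: part positions are independent isotropic Gaussians, clutter is uniform over an image of area `area`, and the detection-probability and false-detection-count terms of the full model are left out. All names are hypothetical.

```python
import math

def log_gaussian_2d(point, mean, std):
    """Log density of an isotropic 2-D Gaussian at `point`."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return -(dx * dx + dy * dy) / (2.0 * std * std) - math.log(2.0 * math.pi * std * std)

def log_likelihood_ratio(constellation, part_means, std, area):
    """log p(points | object) - log p(points | clutter) for one candidate constellation."""
    # Object hypothesis: each point is the corresponding part near its mean.
    log_object = sum(log_gaussian_2d(p, m, std) for p, m in zip(constellation, part_means))
    # Clutter hypothesis: each point is a false detection, uniform over the image.
    log_clutter = -len(constellation) * math.log(area)
    return log_object - log_clutter
```

A positive value favours the object hypothesis; in practice the decision threshold would be tuned on validation data.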
28
Learning Models `Manually’: obtain set of training images; label parts by hand, train detectors; learn model from labeled parts; choose parts
29
Unsupervised learning
30
Unsupervised detector training - 1: highly textured neighborhoods are selected automatically; this produces 100-1000 patterns per image
31
Unsupervised detector training - 2 “Pattern Space” (100+ dimensions)
32
Unsupervised detector training - 3 100-1000 images ~100 detectors
33
Parameter Estimation Take training images. Consider set of detectors… Apply detectors…..
34
Parameter Estimation: Signal? Clutter? Correspondence? Chicken-and-egg problem with shape and correspondence. Use EM. Optimize for representation (ML on generative models).
35
ML using EM
1. Current estimate of the pdf...
2. Assign probabilities to constellations in each image (Image 1, Image 2, ..., Image i): some get a large P, others a small P
3. Use probabilities as weights to reestimate parameters. Example: (large P) · x + (small P) · x + … = new estimate of the mean
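The three EM steps above can be illustrated on a toy problem. The sketch assumes a one-dimensional position, exactly one true part among the candidates in every image, uniformly distributed clutter, and a known standard deviation, so only the mean is re-estimated. The function name `em_part_mean` and the data in the usage note are hypothetical.

```python
import math

def em_part_mean(images, initial_mean, std, iterations=20):
    """Estimate a part's mean position when it is unknown which candidate is the part.

    images: list of lists of candidate 1-D positions (assumed: one true part per image).
    """
    mean = initial_mean
    for _ in range(iterations):
        weighted_sum = 0.0
        for candidates in images:
            # Step 2 (E-step): probability that each candidate is the true part.
            # The uniform clutter density is identical for every hypothesis and cancels.
            exponents = [-(x - mean) ** 2 / (2.0 * std ** 2) for x in candidates]
            largest = max(exponents)  # subtract the max to avoid underflow
            scores = [math.exp(e - largest) for e in exponents]
            total = sum(scores)
            # Step 3 (M-step): accumulate positions weighted by those probabilities.
            weighted_sum += sum(s / total * x for s, x in zip(scores, candidates))
        mean = weighted_sum / len(images)
    return mean
```

For example, `em_part_mean([[10.2, 50.0, 80.0], [9.8, 35.0, 70.0]], initial_mean=14.0, std=3.0)` moves the estimate toward the candidates that agree across images.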
36
Final Part Selection
Preselected parts ( 100)
Choice 1: parameter estimation, Model 1
Choice 2: parameter estimation, Model 2
Predict / measure model performance (validation set or directly from model)
37
Frontal Views of Faces 200 Images (100 training, 100 testing) 30 people, different for training and testing
38
Face images
39
Background images
40
Learned face model. Test error: 6% (4 parts). Panels: preselected parts; parts in model; model foreground pdf; sample detection
41
Rear Views of Cars 200 Images (100 training, 100 testing) Only one image per car High-pass filtered
42
Learned model. Test error: 13% (5 parts). Panels: preselected parts; parts in model; model foreground pdf; sample detection
43
Detections of Cars
44
Background Images
45
“Wildcard” Parts
46
Parts Shape Context
47
Dilbert vs. 77 examples 125 examples
48
Dilbert model. Test error: 15% (4 parts). Panels: preselected parts; parts in model; model foreground pdf; sample detection
49
Manual vs. Automatic Part Design & Selection
Task: `E’ vs. no `E’
Manual: 16% error
Automatic: 7% error
Part annotations: similar to manual; used in best models
Note (Markus Weber): move task up left color thicker
50
“Strictly Unsupervised” Learning (Single Class)
Training set 100% faces (so far): test error 6%
Training set 66% faces: test error 10%
Training set 50% faces: test error 12%
51
Which Part Size and Scale? 1:2, 1:4, 1:8, 1:16
Note (Markus Weber): Trade-off: informativity, occlusion sensitivity
52
Multi-Scale Experiment: Gaussian pyramid (levels 1-6), preselected parts
53
Multi-Scale: Detection Performance. Test error: single scale 6% (4 parts); multi-scale 11% (5 parts)
54
Occlusion Experiment: are learning and detection possible under partial occlusion?
Test error: no occlusion 6% (4 parts); occlusion 18% (5 parts)
Note (Markus Weber): Say what we do here. Occlusion in TRAINING and TESTING. Is this possible? Fewer Errors below.
55
View - Based 3D Model
56
Background Examples
57
Test Images with Faces
58
3D Orientation Tuning
Views: 0°, 45°, 90°; labels: Frontal, Profile
Angle ranges: -15° °, 30° - 60°, 75° - 105°, -15° - 105°
Plot "Orientation Tuning": % correct (50-100) against angle in degrees (0-100)
Note (Markus Weber): Canonical views add axes info
60
Johansson’s experiments [‘70s]
61
What is your brain doing? Input: X_i(t). Output. Combinatorial; missing features; noise
62
From trajectories to labels. Input: x_i, v_i. Output: L_i = EL, i = 1,…,M
63
Representation dilemma X WL (t) ??? 2 PROPOSALS: A B
64
What is this???
65
Probabilistic approach to learning
Learn joint p.d.f. Pr(data | labels); labelling by maximizing likelihood
Unfortunately:
– High-dimensional p.d.f. is cumbersome (62 variables -> 10^3 - 10^4 parameters); need lots of learning examples
– Search cost: M! (try all labellings). E.g. M=16 -> 16! ≈ 2×10^13
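The two counts on this slide can be checked with a few lines of arithmetic. The labelling count is exactly M!. The parameter count assumes, for illustration only, that the joint p.d.f. is a full-covariance Gaussian; the slide does not say which family is used, and the function names are hypothetical.

```python
import math

def labelling_count(num_points):
    """Number of ways to assign M distinct labels to M points: M!."""
    return math.factorial(num_points)

def full_gaussian_param_count(num_variables):
    """Free parameters of a Gaussian with a full covariance matrix (assumed family)."""
    mean_params = num_variables
    covariance_params = num_variables * (num_variables + 1) // 2
    return mean_params + covariance_params

print(labelling_count(16))            # 20922789888000, about 2e13
print(full_gaussian_param_count(62))  # 2015, within the 10^3 - 10^4 range
```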
66
Approximate decomposition
Human body as kinematic chain
Markov property: Pr(A, B, C, D, E) = Pr(A, B, C) Pr(D | B, C) Pr(E | C, D)
Fewer parameters
Find global max with dynamic programming
– polynomial cost
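The dynamic-programming search over this decomposition can be sketched as follows. Each of the parts A to E is assigned one of M candidate points; E is maximised out first, then D, then the root clique (A, B, C). The sketch is illustrative only: the three log-probability functions are hypothetical callables supplied by the caller, and it does not enforce that the five parts use distinct points.

```python
import itertools

def best_labelling(num_points, log_abc, log_d_given_bc, log_e_given_cd):
    """Maximise log Pr(A,B,C) + log Pr(D|B,C) + log Pr(E|C,D) over point assignments."""
    pts = range(num_points)
    # Maximise out E: best (score, e) for every (c, d).
    best_e = {(c, d): max((log_e_given_cd(e, c, d), e) for e in pts)
              for c in pts for d in pts}
    # Maximise out D: best (score, d) for every (b, c), reusing best_e.
    best_d = {(b, c): max((log_d_given_bc(d, b, c) + best_e[(c, d)][0], d) for d in pts)
              for b in pts for c in pts}
    # Root clique: best (a, b, c) given the partial maxima.
    score, a, b, c = max((log_abc(a, b, c) + best_d[(b, c)][0], a, b, c)
                         for a in pts for b in pts for c in pts)
    # Backtrack to recover the d and e that achieved the maximum.
    d = best_d[(b, c)][1]
    e = best_e[(c, d)][1]
    return score, (a, b, c, d, e)

def brute_force_score(num_points, log_abc, log_d_given_bc, log_e_given_cd):
    """Reference answer: try all M^5 assignments (only feasible for small M)."""
    return max(log_abc(a, b, c) + (log_d_given_bc(d, b, c) + log_e_given_cd(e, c, d))
               for a, b, c, d, e in itertools.product(range(num_points), repeat=5))
```

Each stage loops over three indices, so the work grows as M^3 per stage instead of M^5 for the exhaustive search over the same five parts.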
67
Triangulated decomposition (by hand)
Figure (a): labels LE, LS, LH, H, N, LK, LA, LF, LW; numbering 1-14
10^2 - 10^3 parameters
Markov property
Solve in O(M^4)
[See also recent results on turbo-decoding and Bayesian inference]
68
Training sequences
69
Unsupervised model (parts A-L). Panels: means; correlations
70
Positive example
71
Negative example 1
72
Negative example 2
73
Person walking left-to-right?
74
Learning for visual recognition Supervised [Manual alignment/correspondence of training examples] Unsupervised (1 class) [Training images contain examples of 1 class + clutter] Unsupervised (multi-class) [Turn your camera on, come back one year later]
75
OBJECTS ANIMALS INANIMATE PLANTS MAN-MADE NATURAL VERTEBRATE ….. MAMMALS BIRDS GROUSE BOAR TAPIR CAMERA
76
Discovering multiple classes Cars (rear and side view) Leaves (three species) Human Heads (90° viewing range)
77
Preselected Parts for Mixture Models: Heads, Cars, Leaves
78
Mixture Model of Heads
79
Tuning of Mixture Models
81
Summary Probabilistic constellation models Learning based on Maximum Likelihood Unsupervised learning of object categories 3D invariance Biological motion
82
Main accomplices Markus Weber Thomas Leung Max Welling Yang Song Michael Burl
84
References [available from: www.vision.caltech.edu] CVPR98 (affine shape) FG00 (viewpoint invariance) ECCV00 (EM algor. for unsupervised learning) CVPR00 (learning of multiple classes) ECCV00, CVPR00, NIPS01, CVPR01 (biological motion) Funded by: National Science Foundation Sloan Foundation INTEL