
1 Unsupervised Learning for Recognition Pietro Perona California Institute of Technology & Università di Padova 11th British Machine Vision Conference – Manchester, September 2001

2 Representation and Learning for Visual Object Recognition Pietro Perona California Institute of Technology & Università di Padova First SIAM-EMS Conference – Berlin, 6 Sept. 2001

3 Representation and Learning for Visual Object Recognition Pietro Perona California Institute of Technology & Università di Padova University of Plymouth, 10 Sept. 2001

4

5 [Taxonomy tree] OBJECTS → ANIMALS / PLANTS / INANIMATE; INANIMATE → MAN-MADE / NATURAL; ANIMALS → VERTEBRATE → ….. → MAMMALS / BIRDS; MAMMALS → BOAR, TAPIR; BIRDS → GROUSE; MAN-MADE → CAMERA

6 animal / not animal [S. Thorpe et al., Nature 1996; J. Braun et al., J. Neurosci. 1998; Fei-Fei Li et al., unpublished]

7

8 Issues: Representation Recognition Learning

9 Meet the xyz

10 Spot the xyz

11 Meet the Boletus Edulis

12 Object categories: individual objects; `visual’ categories; `functional’ categories

13 Variability within a category Intrinsic Deformation

14 Part similarity

15 Importance of `mutual position’

16 SVD

17 SVD (2)

18 Model: constellation of parts [Fischler & Elschlager 1973; Yuille ‘91; Brunelli & Poggio ‘93; Lades, v.d. Malsburg et al. ‘93; Cootes, Lanitis, Taylor et al. ‘95; Amit & Geman ‘95, ‘99; Perona et al. ‘95, ‘96, ’98, ‘00; Tanaka et al. 1993; Perrett & Oram 1993]

19 Deformations [figure: parts A, B, C, D in varying mutual positions]

20 Presence / Absence of Features occlusion

21 Background clutter

22 Generative probabilistic model (model parameters vs. example). 1. Object part positions: object shape pdf, e.g. p(x) = G(x | μ, Σ). 2. Part absence: detector specification and probability of detection (e.g. 0.8, 0.9, 0.6). 3a. Number of false detections N1, N2, N3: clutter pdf p_Poisson(N_k | λ_k). 3b. Position of false detections: p(x) = 1/A (uniform over the image). → Final image.
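
The three ingredients on this slide (a Gaussian shape pdf over the part positions, per-part detection probabilities, and a Poisson number of uniformly placed false detections) can be read as a recipe for sampling synthetic detection data. Below is a minimal Python sketch of that sampling process; the part layout, detection probabilities and clutter rates are illustrative stand-ins, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3-part constellation model (all numbers are illustrative).
mu = np.array([[10.0, 10.0], [30.0, 12.0], [20.0, 30.0]])   # mean part positions
sigma = 2.0                                                  # isotropic shape std dev
p_detect = np.array([0.8, 0.9, 0.6])                         # per-part detection prob.
lam = np.array([2.0, 2.0, 2.0])                              # Poisson clutter rate per detector
image_size = (64.0, 64.0)                                    # uniform background over area A

def sample_image():
    """Sample the detections of one image from the generative model."""
    detections = {k: [] for k in range(3)}
    # 1. Object part positions: Gaussian shape pdf p(x) = G(x | mu, sigma^2 I).
    true_pos = mu + rng.normal(0.0, sigma, size=mu.shape)
    # 2. Part absence: each part is detected with probability p_detect[k].
    for k in range(3):
        if rng.random() < p_detect[k]:
            detections[k].append(true_pos[k])
    # 3a/3b. Clutter: Poisson number of false detections per detector,
    #        each placed uniformly over the image, p(x) = 1/A.
    for k in range(3):
        for _ in range(rng.poisson(lam[k])):
            detections[k].append(rng.uniform([0.0, 0.0], image_size))
    return detections

print(sample_image())
```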

23 Affine shape. Feature space → (mod translation, rotation and scaling) → Euclidean shape → (add weak perspective projection) → affine shape. What is the probability density for the affine shape variables?

24 Affine shape density [Leung, Burl & Perona ’98]. A Gaussian density in figure space induces a closed-form affine shape density: (1) exact if N is odd; (2) a good approximation if the probability that the basis points flip sign is low. [Figure annotations: good / careful!]
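
One standard way to obtain shape variables that are unchanged by an affine map of the figure is to express the remaining feature positions in the affine frame spanned by a few basis points. The sketch below is in that spirit; the particular choice of three basis points is an assumption made here for illustration, not necessarily the parameterization used in the paper.

```python
import numpy as np

def affine_shape(points):
    """Affine shape coordinates of an (N, 2) point set.

    The first three points define an affine basis; the remaining points are
    expressed in that basis, which removes translation and any linear map,
    so the coordinates are invariant to affine transformations of the figure.
    """
    points = np.asarray(points, dtype=float)
    origin = points[0]
    B = np.column_stack([points[1] - origin, points[2] - origin])  # 2 x 2 basis
    rest = (points[3:] - origin).T                                  # 2 x (N-3)
    return np.linalg.solve(B, rest).T                               # (N-3) x 2

# Check: a random affine map of the figure leaves the shape coordinates unchanged.
rng = np.random.default_rng(1)
pts = rng.uniform(0, 10, size=(6, 2))
A = rng.normal(size=(2, 2))          # random linear part
t = rng.normal(size=2)               # random translation
pts_mapped = pts @ A.T + t
print(np.allclose(affine_shape(pts), affine_shape(pts_mapped)))  # True
```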

25 Example Affine Shape Densities Model points Shape density (ground truth) Shape density (approximation)

26 Generative probabilistic model (model parameters vs. example). 1. Object part positions: foreground pdf, e.g. p(x) = G(x | μ, Σ). 2. Part absence: probability of detection (e.g. 0.8, 0.9). 3a. Number of false detections N1, N2, N3: background pdf p_Poisson(N_k | λ_k). 3b. Position of false detections: p(x) = 1/A (uniform over the image). → Final image.

27 Detection by likelihood ratio: compare P(object | data) vs. P(clutter | data). [From Burl et al. – ICCV’95, CVPR’96]
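
In code, the test amounts to scoring candidate correspondences between detections and model parts under the object model and under the clutter-only model, then thresholding the log ratio. The sketch below is a heavily simplified illustration: it keeps only the best single correspondence hypothesis, ignores occlusion and the Poisson count terms, and reuses the toy parameters of the sampling sketch above.

```python
import numpy as np
from itertools import product

def log_gauss(x, mu, sigma):
    """Log density of an isotropic 2-D Gaussian."""
    d = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    return -np.log(2 * np.pi * sigma**2) - d @ d / (2 * sigma**2)

def detect(detections, mu, sigma, area, threshold=0.0):
    """Simplified likelihood-ratio test.

    detections[k] lists candidate locations found by detector k.  Each
    hypothesis assigns one candidate to each part; under the object
    hypothesis those candidates follow the Gaussian shape pdf, under the
    clutter hypothesis every detection is uniform over the image (1/area).
    """
    best = -np.inf
    for hyp in product(*detections):                      # one candidate per part
        llr = sum(log_gauss(x, m, sigma) - np.log(1.0 / area)
                  for x, m in zip(hyp, mu))
        best = max(best, llr)
    return best > threshold, best

# Toy usage with the 3-part model from the earlier sketch.
mu = [(10, 10), (30, 12), (20, 30)]
cands = [[(9.5, 10.2), (50, 5)], [(29, 12.5)], [(21, 31), (3, 60)]]
print(detect(cands, mu, sigma=2.0, area=64.0 * 64.0))
```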

28 Learning models `manually’: obtain a set of training images; label parts by hand and train detectors; learn the model from the labeled parts; choose parts.

29 Unsupervised learning

30 Unsupervised detector training - 1. Highly textured neighborhoods are selected automatically; this produces 100-1000 patterns per image.

31 Unsupervised detector training - 2 “Pattern Space” (100+ dimensions)

32 Unsupervised detector training - 3. From 100-1000 images to ~100 detectors.
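
A plausible reading of this step is to vector-quantize the patterns collected from the training images and keep one candidate detector per cluster center. The clustering method is an assumption here; the sketch uses scikit-learn's KMeans on random stand-in data rather than the original implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in for the extracted patterns: 5,000 patches of 11x11 pixels,
# each flattened into a 121-dimensional vector ("pattern space").
patterns = rng.normal(size=(5000, 11 * 11))

# Vector-quantize the patterns; each cluster center becomes a candidate
# part detector (matched against new images by correlation, say).
kmeans = KMeans(n_clusters=100, n_init=10, random_state=0).fit(patterns)
detectors = kmeans.cluster_centers_          # ~100 detectors, 121-dim each
print(detectors.shape)                       # (100, 121)
```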

33 Parameter estimation. Take training images; consider a set of detectors; apply the detectors…

34 Parameter estimation. Signal? Clutter? Correspondence? Chicken-and-egg problem with shape and correspondence → use EM; optimize for representation (ML on generative models).

35 ML using EM. 1. Start from the current estimate (Image 1, Image 2, …, Image i). 2. Assign probabilities to candidate constellations (large P, small P). 3. Use the probabilities as weights to re-estimate the parameters; e.g. the new estimate of μ is the weighted combination (large P)·x + (small P)·x + … over the candidate positions.
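
The re-estimation step is just a probability-weighted average: each candidate constellation contributes its part positions in proportion to the posterior probability assigned to it in step 2. A one-dimensional toy illustration (the numbers are invented for the example):

```python
import numpy as np

# Candidate part positions hypothesised in several training images, together
# with the posterior probability (the E-step "weight") that each candidate
# constellation is the true object rather than clutter.
positions = np.array([9.8, 10.3, 42.0, 10.1, 55.5])   # e.g. x-coordinate of one part
weights   = np.array([0.9, 0.8, 0.05, 0.85, 0.02])    # P(constellation | current model)

# M-step: the new mean is the weighted average of the candidates;
# high-probability constellations dominate, clutter barely contributes.
mu_new = np.sum(weights * positions) / np.sum(weights)
print(mu_new)   # close to 10
```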

36 Final part selection. Preselected parts (~100) → part choice 1 / choice 2 → parameter estimation → model 1 / model 2 → predict or measure model performance (validation set or directly from the model).

37 Frontal Views of Faces 200 Images (100 training, 100 testing) 30 people, different for training and testing

38 Face images

39 Background images

40 Learned face model. [Figure panels: preselected parts; parts in model; model foreground pdf; sample detection.] Test error: 6% (4 parts).

41 Rear Views of Cars 200 Images (100 training, 100 testing) Only one image per car High-pass filtered

42 Learned car model. [Figure panels: preselected parts; parts in model; model foreground pdf; sample detection.] Test error: 13% (5 parts).

43 Detections of Cars

44 Background Images

45 “Wildcard” Parts

46 Parts Shape Context

47 Dilbert vs. 77 examples 125 examples

48 Dilbert model. [Figure panels: preselected parts; parts in model; model foreground pdf; sample detection.] Test error: 15% (4 parts).

49 Manual vs. automatic part design & selection. Task: `E’ vs. no `E’. Manual parts: ~16% error. Automatic parts: ~7% error; similar to the manual parts; used in the best models.

50 “Strictly unsupervised” learning (single class). Training set 100% faces (so far): test error 6%; 66% faces: 10%; 50% faces: 12%.

51 Which part size and scale? Candidate sizes 1:2, 1:4, 1:8, 1:16. Trade-off: informativeness vs. occlusion sensitivity.

52 Multi-scale experiment. Preselected parts taken from levels 1-6 of a Gaussian pyramid.
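
A minimal sketch of the Gaussian pyramid used to preselect parts at several scales, assuming scipy for the blurring; the number of levels and the blur width are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(image, levels=6, sigma=1.0):
    """Return a list of progressively blurred and 2x-downsampled images."""
    pyramid = [image]
    for _ in range(levels - 1):
        blurred = gaussian_filter(pyramid[-1], sigma)
        pyramid.append(blurred[::2, ::2])        # decimate by 2 in each direction
    return pyramid

image = np.random.default_rng(0).random((256, 256))
for level, im in enumerate(gaussian_pyramid(image), start=1):
    print(level, im.shape)
```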

53 Multi-scale: detection performance. Test error: single scale 6% (4 parts); multi-scale 11% (5 parts).

54 Occlusion experiment. Are learning and detection possible under partial occlusion (occlusion in both training and testing)? Test error: no occlusion 6% (4 parts); with occlusion 18% (5 parts).

55 View - Based 3D Model

56 Background Examples

57 Test Images with Faces

58 3D orientation tuning. Canonical views: 0° (frontal), 45°, 90° (profile); test views spanning −15° to 105°. [Plot: orientation tuning, % correct vs. angle in degrees.]

59

60 Johansson’s experiments [‘70s]

61 What is your brain doing? Input: trajectories X_i(t). Output: ? Difficulties: combinatorial matching, missing features, noise.

62 From trajectories to labels. Input: x_i, v_i. Output: labels L_i (e.g. L_i = elbow), i = 1, …, M.

63 Representation dilemma: how to represent X_WL(t)? Two proposals: A and B.

64 What is this???

65 Probabilistic approach to learning. Learn the joint p.d.f. Pr(data | labels); labelling by maximizing likelihood. Unfortunately: (1) a high-dimensional p.d.f. is cumbersome (62 variables → 10^3 - 10^4 parameters) and needs lots of learning examples; (2) the search cost is M! (try all labellings), e.g. M = 16 → 16! ≈ 2·10^13.

66 Approximate decomposition. Human body as a kinematic chain; Markov property, e.g. Pr(A, B, C, D, E) = Pr(A, B, C) Pr(D | B, C) Pr(E | C, D). Fewer parameters; find the global maximum with dynamic programming (polynomial cost).

67 Triangulated decomposition (by hand). Body-part graph over labels LE, LS, LH, H, N, LK, LA, LF, LW, triangulated into cliques 1-14. 10^2 - 10^3 parameters; Markov property; solve in O(M^4). [See also recent results on turbo-decoding and Bayesian inference.]
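
Once the joint density factors over the triangles of a decomposable graph, the best labelling can be found by dynamic programming rather than by trying all M! assignments. The sketch below runs max-product elimination over a chain of triangles with random toy potentials; it omits the constraint that two body parts cannot claim the same candidate point (part of why the slide quotes O(M^4)), so it illustrates the technique rather than the implementation used in the talk.

```python
import numpy as np

def best_labelling(factors):
    """Max-product dynamic programming over a chain-triangulated model.

    factors[t] is an (M, M, M) array of log-potentials on variables
    (t, t+1, t+2), so log P(x) = sum_t factors[t][x_t, x_{t+1}, x_{t+2}]
    (the first factor plays the role of Pr(A, B, C), the others of the
    conditionals Pr(D | B, C), Pr(E | C, D), ...).  Returns the labelling
    maximizing log P and its score, in O(n * M^3) time.
    """
    M = factors[0].shape[0]
    msg = np.zeros((M, M))            # message over the two shared variables
    argmax = []                       # back-pointers for eliminated variables
    for f in factors:
        g = msg[:, :, None] + f                  # (x_t, x_{t+1}, x_{t+2})
        argmax.append(np.argmax(g, axis=0))      # best x_t given (x_{t+1}, x_{t+2})
        msg = np.max(g, axis=0)
    # Choose the last two variables, then backtrack through the chain.
    i, j = np.unravel_index(np.argmax(msg), msg.shape)
    labels = [i, j]
    for t in range(len(factors) - 1, -1, -1):
        labels.insert(0, argmax[t][labels[0], labels[1]])
    return labels, msg[i, j]

# Toy usage: 9 body parts, M = 5 candidate assignments each, random potentials.
rng = np.random.default_rng(0)
M, n_parts = 5, 9
factors = [rng.normal(size=(M, M, M)) for _ in range(n_parts - 2)]
labels, score = best_labelling(factors)
print(labels, score)
```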

68 Training sequences

69 Unsupervised model. [Figure: estimated means and correlations over parts A-L.]

70 Positive example

71 Negative example 1

72 Negative example 2

73 Person walking left-to-right?

74 Learning for visual recognition Supervised [Manual alignment/correspondence of training examples] Unsupervised (1 class) [Training images contain examples of 1 class + clutter] Unsupervised (multi-class) [Turn your camera on, come back one year later]

75 [Taxonomy tree, repeated] OBJECTS → ANIMALS / PLANTS / INANIMATE; INANIMATE → MAN-MADE / NATURAL; ANIMALS → VERTEBRATE → ….. → MAMMALS / BIRDS; MAMMALS → BOAR, TAPIR; BIRDS → GROUSE; MAN-MADE → CAMERA

76 Discovering multiple classes: cars (rear and side view), leaves (three species), human heads (90° viewing range).

77 Preselected Parts for Mixture Models HeadsCarsLeaves

78 Mixture Model of Heads

79 Tuning of Mixture Models

80

81 Summary Probabilistic constellation models Learning based on Maximum Likelihood Unsupervised learning of object categories 3D invariance Biological motion

82 Main accomplices Markus Weber Thomas Leung Max Welling Yang Song Michael Burl

83

84 References [available from: www.vision.caltech.edu]: CVPR98 (affine shape); FG00 (viewpoint invariance); ECCV00 (EM algorithm for unsupervised learning); CVPR00 (learning of multiple classes); ECCV00, CVPR00, NIPS01, CVPR01 (biological motion). Funded by: National Science Foundation, Sloan Foundation, Intel.

