Transformed Component Analysis: Joint Estimation of Image Components and Transformations Brendan J. Frey Computer Science, University of Waterloo, Canada.


1 Transformed Component Analysis: Joint Estimation of Image Components and Transformations. Brendan J. Frey, Computer Science, University of Waterloo, Canada; Beckman Institute & ECE, University of Illinois at Urbana-Champaign. Nebojsa Jojic, Beckman Institute, University of Illinois at Urbana-Champaign.

2 Subspace models of images. Example: an image z in R^1200 is modeled as f(y), where y in R^2 is a low-dimensional subspace point whose axes capture "frown" and "shut eyes".

3 Generative density modeling. Find a probability model that reflects the desired structure, randomly generates plausible images, and represents the data by its parameters. Fit by ML estimation; p(image|class) can then be used for recognition, detection, ...

4 Factor analysis (generative PCA). The density of the subspace point y is p(y) = N(y; 0, I).

5 Factor analysis (generative PCA). The density of the pixel intensities z given subspace point y is p(z|y) = N(z; μ + Λy, Ψ), with p(y) = N(y; 0, I). Manifold: f(y) = μ + Λy, linear.

6 Factor analysis (generative PCA). Parameters μ, Λ represent the manifold. Observing z induces a Gaussian posterior p(y|z): COV[y|z] = (ΛᵀΨ⁻¹Λ + I)⁻¹, E[y|z] = COV[y|z] ΛᵀΨ⁻¹(z − μ), where p(z|y) = N(z; μ + Λy, Ψ) and p(y) = N(y; 0, I).
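The two posterior formulas above can be written in a few lines of NumPy. This is an illustrative sketch, not the authors' MATLAB scripts; the function name `fa_posterior` is ours:

```python
import numpy as np

def fa_posterior(z, mu, Lam, Psi):
    """Posterior p(y|z) for factor analysis:
    z = mu + Lam @ y + noise, y ~ N(0, I), noise ~ N(0, Psi).
    Returns E[y|z] and COV[y|z]."""
    K = Lam.shape[1]
    Psi_inv = np.linalg.inv(Psi)
    cov = np.linalg.inv(Lam.T @ Psi_inv @ Lam + np.eye(K))  # COV[y|z]
    mean = cov @ Lam.T @ Psi_inv @ (z - mu)                 # E[y|z]
    return mean, cov
```

With Λ = 0 the data carry no information about y, so the posterior collapses back to the prior N(0, I), which is a quick sanity check on the formulas.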

7 Example: Hand-crafted model. μ = the mean face; Λ = [Frn, SE], one column for frowning and one for shut eyes. p(z|y) = N(z; μ + Λy, Ψ), p(y) = N(y; 0, I).

8 Example: Simulation. Sample y ~ N(y; 0, I), then z ~ N(z; μ + Λy, Ψ); each draw yields a plausible face with a random degree of frowning and eye closure. (Slides 9-15 repeat this sampling animation.)

16 Example: Inference. Given an image z from the data set, infer the Gaussian posterior p(y|z) over subspace coordinates; its mean locates the image on the frown / shut-eyes axes. (Slides 17-21 repeat this inference animation.)

22 EM algorithm for ML learning. Initialize μ, Λ and Ψ to small, random values. E step: for each training case z(t), infer q(t)(y) = p(y|z(t)). M step: compute μnew, Λnew and Ψnew that maximize Σt E[log p(y) p(z(t)|y)], where E[·] is taken w.r.t. q(t)(y). Each iteration increases log p(Data).
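The E and M steps above can be sketched as a short NumPy loop. This is a minimal sketch of EM for factor analysis under the stated model, not the authors' code; the function name `fa_em`, the initialization scales, and the variance floor are our assumptions:

```python
import numpy as np

def fa_em(Z, K, n_iter=50, seed=0):
    """EM for ML factor analysis. Z: (T, N) data; K: number of factors.
    Returns mean mu, loading matrix Lam, diagonal noise variances psi."""
    rng = np.random.default_rng(seed)
    T, N = Z.shape
    mu = Z.mean(axis=0)
    Lam = 0.01 * rng.standard_normal((N, K))   # small, random init
    psi = np.full(N, 0.01)                     # diagonal of Psi
    Zc = Z - mu
    for _ in range(n_iter):
        # E step: q(y) = p(y|z) is Gaussian with shared covariance
        Psi_inv = 1.0 / psi
        cov = np.linalg.inv((Lam.T * Psi_inv) @ Lam + np.eye(K))
        Ey = Zc @ (Psi_inv[:, None] * Lam) @ cov        # (T, K) means
        Eyy = T * cov + Ey.T @ Ey                       # sum_t E[y y^T]
        # M step: maximize sum_t E[log p(y) p(z|y)]
        Lam = np.linalg.solve(Eyy, Ey.T @ Zc).T
        psi = np.mean(Zc**2, axis=0) - np.mean(Zc * (Ey @ Lam.T), axis=0)
        psi = np.maximum(psi, 1e-6)   # numerical floor (our addition)
    return mu, Lam, psi
```

Because μ is the ML mean regardless of the other parameters, it is set once from the data rather than re-estimated each iteration.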

23 Kind of data we're interested in. Even after tracking, the features still have unknown positions, rotations, scales, levels of shearing, ...

24 Problem: factor analysis and PCA are sensitive to spatial transformations, e.g. translation. (Figure: plotting pixel 1 against pixel 2, swapping pixels 1 and 2 in half of the cases scatters the data off the linear subspace.)

25 One approach: normalize the images by hand, then run pattern analysis on the normalized images. Drawback: labor-intensive.

26 Another approach: apply every transformation to each image, then run pattern analysis on the resulting huge data set. Drawbacks: assumes transformations are equally likely; noise gets copied; analysis is more complex.

27 Yet another approach: extract transformation-invariant features, then run pattern analysis on the transformation-invariant data. Drawbacks: difficult to work with; may hide useful features.

28 Our approach: joint normalization and pattern analysis of the images.

29 What transforming an image does in the vector space of pixel intensities: a continuous transformation moves an image along a continuous curve. Our subspace model should assign images near this nonlinear manifold to the same point in the subspace.

30 Tractable approaches to modeling the transformation manifold: linear approximation (good locally); discrete approximation (good globally).

31 Related work. Generative models. Local invariance: PCA, Turk, Moghaddam, Pentland (96); factor analysis, Hinton, Revow, Dayan, Ghahramani (96); Frey, Colmenarez, Huang (98). Layered motion: Adelson, Black, Blake, Jepson, Wang, Weiss. Learning discrete representations of generative manifolds: generative topographic maps, Bishop, Svensen, Williams (98). Discriminative models. Local invariance: tangent distance, tangent prop, Simard, Le Cun, Denker, Victorri (92-93). Global invariance: convolutional neural networks, Le Cun et al. (98); multiresolution tangent distance, Vasconcelos et al. (98).

32 Adding "transformation" as a discrete latent variable. Say there are N pixels. We assume we are given a set of sparse N x N transformation-generating matrices G 1,…,G l,…,G L. Applying G l to a latent image z generates the transformed point G l z.
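For concreteness, a transformation-generating matrix for integer translation can be built as follows. This is an illustrative sketch (the function name `shift_matrix` is ours, and the matrix is stored densely for clarity, although in practice these matrices are sparse):

```python
import numpy as np

def shift_matrix(h, w, dy, dx):
    """N x N matrix G (N = h*w) that shifts an h-by-w image,
    flattened row-major, by dy rows and dx columns,
    with zeros shifted in at the border."""
    N = h * w
    G = np.zeros((N, N))
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx   # source pixel for output (y, x)
            if 0 <= sy < h and 0 <= sx < w:
                G[y * w + x, sy * w + sx] = 1.0
    return G
```

Usage: for a flattened latent image z, the transformed point is simply x = G @ z, matching the slide's G l z.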

33 Transformed Component Analysis. The density of the subspace point y is p(y) = N(y; 0, I).

34 Transformed Component Analysis. The density of the latent image z given subspace point y is p(z|y) = N(z; μ + Λy, Ψ), with p(y) = N(y; 0, I).

35 Transformed Component Analysis. The probability of transformation l = 1, 2, … is P(l) = ρ l.

36 Transformed Component Analysis. The density of the observed image x given z and l is p(x|z,l) = N(x; G l z, Φ).
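The full generative process built up on slides 33-36 can be sampled in order: y, then z, then l, then x. A minimal sketch, assuming diagonal Ψ and Φ passed as vectors of variances (`tca_sample` is our name, not from the source):

```python
import numpy as np

def tca_sample(mu, Lam, psi, Gs, rho, phi, rng):
    """One draw from the TCA generative model:
    y ~ N(0, I); z ~ N(mu + Lam y, diag(psi));
    l ~ rho;     x ~ N(G_l z, diag(phi))."""
    K = Lam.shape[1]
    y = rng.standard_normal(K)
    z = mu + Lam @ y + np.sqrt(psi) * rng.standard_normal(mu.size)
    l = rng.choice(len(Gs), p=rho)
    x = Gs[l] @ z + np.sqrt(phi) * rng.standard_normal(mu.size)
    return y, z, l, x
```

Ancestral sampling like this is what the simulation slides below animate: the same latent face z can emerge shifted different ways depending on the sampled l.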

37 Example: Hand-crafted model. μ = mean face; Λ = [Frn, SE]. G 1 = shift left & up, G 2 = I, G 3 = shift right & up; l = 1, 2, 3; ρ 1 = ρ 2 = ρ 3 = 0.33.

38 Example: Simulation. Sample y, then z ~ N(μ + Λy, Ψ), then a transformation l, then x ~ N(G l z, Φ): the sampled face appears shifted left & up (l=1), unshifted (l=2), or shifted right & up (l=3). (Slides 39-47 repeat this sampling animation.)

48 Example: Inference. Given a training image x, jointly infer the transformation l, the latent image z and the subspace point y. (Slide 49 repeats this animation.)

50 Example: Inference. For each candidate transformation l, invert G l and infer the latent image. The wrong transformations (l=1 and l=3 here) yield garbage latent images, so the posterior concentrates on the correct one: P(l=2|x) ≈ 1 while P(l=1|x) and P(l=3|x) ≈ 0.

51 EM algorithm for TCA. Initialize μ, Λ, Ψ, ρ, Φ to random values. E step: for each training case x(t), infer q(t)(l,z,y) = p(l,z,y|x(t)). M step: compute μnew, Λnew, Ψnew, ρnew, Φnew to maximize Σt E[log p(y) p(z|y) P(l) p(x(t)|z,l)], where E[·] is taken w.r.t. q(t)(l,z,y). Each iteration increases log p(Data).
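The discrete part of this E step has a closed form: marginalizing y and z, the observed image under transformation l is Gaussian with mean G l μ and covariance G l (ΛΛᵀ + Ψ) G lᵀ + Φ, so P(l|x) follows from Bayes' rule. A sketch assuming diagonal Ψ and Φ given as variance vectors (the function name is ours):

```python
import numpy as np

def transformation_posterior(x, mu, Lam, psi, Gs, rho, phi):
    """P(l | x) for TCA, with y and z marginalized out:
    x | l ~ N(G_l mu, G_l (Lam Lam^T + diag(psi)) G_l^T + diag(phi))."""
    C = Lam @ Lam.T + np.diag(psi)        # marginal cov of latent image z
    log_post = np.empty(len(Gs))
    for l, G in enumerate(Gs):
        m = G @ mu
        S = G @ C @ G.T + np.diag(phi)
        d = x - m
        _, logdet = np.linalg.slogdet(S)
        log_post[l] = (np.log(rho[l]) - 0.5 * logdet
                       - 0.5 * d @ np.linalg.solve(S, d))
    log_post -= log_post.max()            # stabilize before exponentiating
    p = np.exp(log_post)
    return p / p.sum()
```

These are the responsibilities that, on the earlier hand-crafted example, put nearly all of the posterior mass on the correct shift.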

52 A tough toy problem. 144 9 x 9 images; 1 shape (pyramid); 3-D lighting; cluttered background; 25 possible locations.

53 First 8 principal components vs. TCA with 3 components and 81 transformations (9 horizontal shifts x 9 vertical shifts), 10 iterations of EM. The TCA model generates realistic examples. (Λ:1, Λ:2, Λ:3 show the learned components.)

54 Expression modeling. 100 16 x 24 training images; variation in expression; imperfect alignment.

55 PCA: mean + first 10 principal components. Factor analysis: mean + 10 factors after 70 iterations of EM. TCA: mean + 10 factors after 70 iterations of EM.

56 Fantasies from the FA model vs. fantasies from the TCA model.

57 Modeling handwritten digits. 200 8 x 8 images of each digit; preprocessing normalizes vertical/horizontal translation and scale; different writing angles (shearing) remain (see the "7").

58 TCA: 29 shearing + translation combinations; 10 components per digit; 30 iterations of EM per digit. Shown: the mean of each digit and the transformed means.

59 FA: mean + 10 components per digit. TCA: mean + 10 components per digit.

60 Classification performance. Training: 200 cases/digit, 20 components, 50 EM iterations. Testing: 1000 cases; p(x|class) used for classification. Results:

Method                              Error rate
k-nearest neighbors (optimized k)   7.6%
Factor analysis                     3.2%
Transformed component analysis      2.7%

Bonus: P(l|x) infers the writing angle!

61 Wrap-up. MATLAB scripts: www.cs.uwaterloo.ca/~frey. Other domains: audio, bioinformatics, ... Other latent image models p(z): clustering (mixture of Gaussians) (CVPR99); mixtures of factor analyzers (NIPS99); time series (CVPR00).

62 Wrap-up. Discrete + linear combination: set some components equal to derivatives of μ with respect to the transformations. Multiresolution approach. Fast variational methods, belief propagation, ...

