Epitomic Location Recognition A generative approach for location recognition K. Ni, A. Kannan, A. Criminisi and J. Winn In proc. CVPR 2008. Anchorage,


1 Epitomic Location Recognition A generative approach for location recognition K. Ni, A. Kannan, A. Criminisi and J. Winn In proc. CVPR 2008. Anchorage, Alaska.

2

3 Location Recognition Where am I? Instance recognition Category recognition (more difficult) Lobby? Cubicle? Hallway? Kitchen?

4

5 Geometry Based Recognition SLAM & structure from motion Why do we need metric reconstruction? Lose the flexibility to do class recognition. F. Schaffalitzky and A. Zisserman Training Images Testing Image Geometry & Labels Features Local Feature Database G. Schindler, M. Brown, R. Szeliski

6 Appearance Based Recognition Capture global appearance information Gaussian mixture model used by A. Torralba et al. Training Images Preprocessing (e.g. PCA) Image Vectors Training Appearance Model A. Torralba, K. Murphy, W. T. Freeman and M. A. Rubin M. Cummins and P. Newman

7 Appearance or Geometry? Can we do better by fusing both kinds of information? A small example with 2 location labels: cubicle and corridor

8 The Simplest Model Nearest neighbor classification Naive but still effective with enough samples. A small shift may disrupt the recognition. Does not capture uncertainty.
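
As a minimal sketch of the nearest neighbor baseline on this slide, the Python below classifies a query vector by its closest training vector. The toy 8-element "images", the squared Euclidean distance, and the helper names (`nearest_neighbor_label`, `circular_shift`) are illustrative assumptions, not taken from the talk. The example is built so that shifting a cubicle image by two pixels makes it match the corridor image, which is the "small shift" failure the slide mentions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbor_label(query, training_vectors, training_labels):
    """Return the label of the training vector closest to the query."""
    best = min(range(len(training_vectors)),
               key=lambda i: squared_distance(query, training_vectors[i]))
    return training_labels[best]


def circular_shift(vector, offset):
    """Shift a vector to the right by `offset` positions with wrap-around."""
    offset %= len(vector)
    return vector[-offset:] + vector[:-offset] if offset else list(vector)


# Hypothetical toy data: one flattened "image" per location label.
cubicle = [0, 0, 1, 1, 0, 0, 0, 0]
corridor = [0, 0, 0, 0, 1, 1, 0, 0]
vectors = [cubicle, corridor]
labels = ["cubicle", "corridor"]

unshifted = nearest_neighbor_label(cubicle, vectors, labels)
shifted = nearest_neighbor_label(circular_shift(cubicle, 2), vectors, labels)
```

Here `unshifted` is "cubicle" while `shifted` is "corridor": the same scene, translated by two pixels, is assigned to the wrong location.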

9 How to Incorporate Translation Invariance? We need something better than a “bag of frames” model Training images Testing image

10 Panorama It models both appearance & geometry Adapts to camera rotation and focal length change M. Brown and D. G. Lowe Generative An image is a patch “extracted” from the panorama

11 Cons of Panoramas Not easy to build a panorama due to parallax Do not capture uncertainty Only work for location instance recognition No compact representation for repetitive scenes

12 Gaussian Mixture Model Six mixtures trained as in Torralba et al.'s paper Handles uncertainties but no translation invariance Means Variances Remove boundaries Much more blurred

13 A Weak Panorama 3D motions can be roughly modeled by 2D translation + scaling. 2D translation Scaling

14 Epitome = Panorama + GMM Generative model for image patches / video frames Captures repetitive patterns in the original image Mapping = 2D translation + scaling A source image Image patches Epitome N. Jojic et al., ICCV 2003; N. Petrovic et al., CVPR 2006

15 Epitome as Probabilistic Panorama Model 3D scenes rather than a single 2D image Environment = Virtual panorama Location Epitome: Means, Variances

16 Learning the Location Epitome Initialize epitome randomly EM iterations E-step: infer the posteriors over all mappings M-step: use the posteriors as weights to update the mean and variance of epitome pixels Plot: free energy vs. EM iterations
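
The EM loop on this slide can be sketched on a toy problem. The Python below is a simplified 1D illustration, not the authors' implementation: it assumes the epitome is a ring of pixels with one Gaussian each, that a mapping is only a start position (translation, no scaling), and that the prior over mappings is uniform. All function names, the variance floor `min_variance`, and the initial variance of 1.0 are assumptions made for the sketch.

```python
import math
import random


def log_likelihood(patch, means, variances, start):
    """Log p(patch | mapping) for a patch placed at `start`, with wrap-around."""
    size = len(means)
    total = 0.0
    for k, value in enumerate(patch):
        j = (start + k) % size
        total -= 0.5 * (math.log(2 * math.pi * variances[j])
                        + (value - means[j]) ** 2 / variances[j])
    return total


def e_step(patches, means, variances):
    """Posterior over all start positions for each patch (uniform prior)."""
    posteriors = []
    for patch in patches:
        logs = [log_likelihood(patch, means, variances, t)
                for t in range(len(means))]
        peak = max(logs)  # subtract the peak before exponentiating
        weights = [math.exp(v - peak) for v in logs]
        norm = sum(weights)
        posteriors.append([w / norm for w in weights])
    return posteriors


def m_step(patches, posteriors, means, variances, min_variance=1e-2):
    """Posterior-weighted update of each epitome pixel's mean and variance."""
    size = len(means)
    weight = [0.0] * size
    first = [0.0] * size
    second = [0.0] * size
    for patch, posterior in zip(patches, posteriors):
        for t, q in enumerate(posterior):
            for k, value in enumerate(patch):
                j = (t + k) % size
                weight[j] += q
                first[j] += q * value
                second[j] += q * value * value
    new_means, new_variances = list(means), list(variances)
    for j in range(size):
        if weight[j] > 1e-12:  # keep the old values where nothing maps
            new_means[j] = first[j] / weight[j]
            spread = second[j] / weight[j] - new_means[j] ** 2
            new_variances[j] = max(spread, min_variance)
    return new_means, new_variances


def learn_epitome(patches, size, iterations=20, seed=0):
    """Random initialization followed by a fixed number of EM iterations."""
    rng = random.Random(seed)
    low = min(min(p) for p in patches)
    high = max(max(p) for p in patches)
    means = [rng.uniform(low, high) for _ in range(size)]
    variances = [1.0] * size
    for _ in range(iterations):
        posteriors = e_step(patches, means, variances)
        means, variances = m_step(patches, posteriors, means, variances)
    return means, variances
```

A possible usage: cut overlapping patches from a repetitive signal, for example `patches = [signal[i:i + 3] for i in range(len(signal) - 2)]`, and call `learn_epitome(patches, size=4)`. Because many patches share the same few epitome pixels, the learned model is much smaller than the set of patches it explains.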

17 Model Comparison Epitome is a smart mixture of Gaussians model with parameter sharing among components For the same number of parameters, the epitome generalizes better

18

19 Build Label Maps The label maps are the posterior of the label given the mapping Epitome Label maps Corridor label map Cubicle label map
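
One way to read "the posterior of the label given the mapping" is as a normalized, posterior-weighted count of training labels at each mapping position. The sketch below follows that reading for a 1D epitome. The function name `build_label_maps`, the `smoothing` constant, and the input layout (one mapping posterior and one label per training frame) are assumptions for illustration, not details from the talk.

```python
def build_label_maps(posteriors, labels, smoothing=1e-6):
    """Estimate p(label | mapping position) from training frames.

    posteriors[n][t] is the posterior that training frame n maps to
    position t; labels[n] is the location label of frame n.
    """
    classes = sorted(set(labels))
    size = len(posteriors[0])
    # A small smoothing count avoids division by zero at unused positions.
    counts = {c: [smoothing] * size for c in classes}
    for posterior, label in zip(posteriors, labels):
        for t, q in enumerate(posterior):
            counts[label][t] += q
    maps = {c: [0.0] * size for c in classes}
    for t in range(size):
        total = sum(counts[c][t] for c in classes)
        for c in classes:
            maps[c][t] = counts[c][t] / total
    return maps


# Hypothetical mapping posteriors for three training frames.
example_posteriors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.5, 0.5]]
example_labels = ["corridor", "cubicle", "cubicle"]
example_maps = build_label_maps(example_posteriors, example_labels)
```

In this example position 0 is claimed almost entirely by the corridor label and positions 1 and 2 by the cubicle label, so the two maps partition the epitome between the locations.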

20 Recognition from Location Epitomes Fast correlation: infer the best mapping region Sum the pixel-wise votes Temporal smoothing using HMM Input testing image Best matching patch Corridor label map Cubicle label map Location epitome
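
The last two steps on this slide, vote summation and temporal smoothing, can be sketched as follows. This is a hedged illustration: the slide does not say which HMM inference is used, so the code assumes simple forward filtering with a "sticky" transition model. The names `sum_votes`, `hmm_filter`, and the `stay_probability` parameter are hypothetical, and the correlation step that finds the best mapping region is taken as given.

```python
def sum_votes(label_maps, start, length):
    """Sum per-position label votes inside the best-matching region."""
    return {label: sum(values[start:start + length])
            for label, values in label_maps.items()}


def hmm_filter(frame_scores, labels, stay_probability=0.9):
    """Forward-filter per-frame label scores and return the decoded labels."""
    count = len(labels)
    switch = (1.0 - stay_probability) / (count - 1) if count > 1 else 0.0
    belief = {label: 1.0 / count for label in labels}
    decoded = []
    for scores in frame_scores:
        # Predict: the location tends to persist between frames.
        predicted = {
            label: sum(belief[prev] * (stay_probability if prev == label else switch)
                       for prev in labels)
            for label in labels
        }
        # Update: weight the prediction by this frame's votes.
        updated = {label: predicted[label] * scores[label] for label in labels}
        total = sum(updated.values())
        belief = ({label: v / total for label, v in updated.items()}
                  if total > 0 else predicted)
        decoded.append(max(labels, key=lambda label: belief[label]))
    return decoded


# Hypothetical normalized votes; the fourth frame is a noisy outlier.
scores = [
    {"corridor": 0.8, "cubicle": 0.2},
    {"corridor": 0.8, "cubicle": 0.2},
    {"corridor": 0.8, "cubicle": 0.2},
    {"corridor": 0.4, "cubicle": 0.6},
    {"corridor": 0.8, "cubicle": 0.2},
]
smoothed = hmm_filter(scores, ["corridor", "cubicle"])
```

Taken frame by frame, the fourth frame would be labeled "cubicle". With filtering, the accumulated evidence for "corridor" outweighs the single noisy frame and the decoded sequence stays on "corridor" throughout.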

21

22 Color is not always the best feature Other features besides RGB For example, stereo feature captures the depth info. Do not need high stereo accuracy (efficient DP here) Corridor Cubicle Kitchen

23 Integrating Multiple Features Stack multiple feature “channels” B G R Stereo

24 Local Histograms Enable better translation invariance and more generalization Error rate: 0.49 → 0.36 on a 4-class test dataset Improves efficiency dramatically: 30 times speed-up
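
The slide does not define how the local histograms are computed, so the following is only a sketch under stated assumptions: the image is a 2D grid of integer intensities, it is tiled into non-overlapping blocks, and each block is summarized by a normalized intensity histogram. The function name and the parameters `block_rows`, `block_cols`, `bins`, and `max_value` are hypothetical.

```python
def local_histograms(image, block_rows, block_cols, bins, max_value=256):
    """Replace each block of pixels with a normalized intensity histogram."""
    rows, cols = len(image), len(image[0])
    grid = []
    for r0 in range(0, rows - block_rows + 1, block_rows):
        grid_row = []
        for c0 in range(0, cols - block_cols + 1, block_cols):
            histogram = [0.0] * bins
            for r in range(r0, r0 + block_rows):
                for c in range(c0, c0 + block_cols):
                    # Quantize the intensity into one of `bins` buckets.
                    index = min(image[r][c] * bins // max_value, bins - 1)
                    histogram[index] += 1.0
            area = block_rows * block_cols
            grid_row.append([h / area for h in histogram])
        grid.append(grid_row)
    return grid


# Hypothetical 4x4 image reduced to a 2x2 grid of 2-bin histograms.
toy_image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 255, 0, 255],
    [255, 0, 255, 0],
]
toy_grid = local_histograms(toy_image, block_rows=2, block_cols=2, bins=2)
```

Pixels that move around inside a block leave its histogram unchanged, which is one source of the added translation invariance, and the grid of histograms has far fewer spatial positions than the original image.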

25 Supervised Learning Incorporates training image labels Helps discriminate images with similar features but different location labels. An example epitome An example label feature A monitor in the cubicle A microwave in the kitchen Discriminative features

26

27 MIT Image Database Created by Antonio Torralba et al. 17 sequences, 62 locations, 7 categories, 72077 images

28 Results on Recognizing Location Instances Location epitome vs. GMM, 10% better on average

29 Results on Recognizing Location Classes Location Epitome vs. GMM, 10%-20% better

30 MSRC Data Set Captured with a stereo camera 5409 images collected at the speed of 4 fps 11 sequences and 7 classes: corridor_visionlab, cubicle_mlp, kitchen-fl2-north, lectureroom-large, lectureroom-small, stairs-1st-to-2nd, stairs-2nd-to-1st

31 Integrate Depth Cues corridor_visionlab, cubicle_mlp, kitchen-fl2-north, lectureroom-large, lectureroom-small, stairs-1st-to-2nd, stairs-2nd-to-1st

32 Instance Recognition with Multiple Features RGB & Stereo outperforms the other features Learning: 5.7 fps Recognition: 116 fps = 29 times the capture speed
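
The "29 times" figure follows from the 4 fps capture speed given on the MSRC Data Set slide. As a worked check (the learning ratio is derived here and does not appear on the slide):

```latex
\frac{116\ \text{fps (recognition)}}{4\ \text{fps (capture)}} = 29,
\qquad
\frac{5.7\ \text{fps (learning)}}{4\ \text{fps (capture)}} \approx 1.4
```

On these numbers, both recognition and learning run faster than the images were captured.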

33 Summary A generative model for the recognition of both location instances and classes Fast: capable of real-time applications Flexible: capable of integrating various features Probabilistic: capable of capturing uncertainties Future applications Navigation for visually impaired people Appearance-based loop closing for SLAM problems

34 Epitomic Location Recognition A generative approach for location recognition K. Ni, A. Kannan, A. Criminisi and J. Winn Thank you!

35 Local Histograms (2) Improves efficiency (both training and testing) The bottleneck: convolving epitome and images Compression rate: $3 (C_1 C_2)^2 / 50 = 2400$ Learning: 3 hours → 6 mins, 30 times faster Convolve 3-dimension RGB features: $M \times N$ image * $M_e \times N_e$ epitome Convolve 50-dimension local histograms: $(M/C_1) \times (N/C_2)$ image * $(M_e/C_1) \times (N_e/C_2)$ epitome
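
The compression rate can be reproduced with a short derivation. This is a sketch under one assumption that the slide does not state: that the cost of convolving the image with the epitome scales as (number of channels) × (image pixels) × (epitome pixels). The symbols are the ones used on the slide.

```latex
\begin{aligned}
\text{cost}_{\text{RGB}} &\propto 3 \, (M N)(M_e N_e) \\
\text{cost}_{\text{hist}} &\propto 50 \, \frac{M}{C_1}\frac{N}{C_2} \cdot \frac{M_e}{C_1}\frac{N_e}{C_2}
  = \frac{50 \, (M N)(M_e N_e)}{(C_1 C_2)^2} \\
\frac{\text{cost}_{\text{RGB}}}{\text{cost}_{\text{hist}}} &= \frac{3 \, (C_1 C_2)^2}{50} = 2400
  \;\Longrightarrow\; C_1 C_2 = 200
\end{aligned}
```

Under this assumption the quoted rate of 2400 corresponds to blocks of 200 pixels. The measured learning speed-up on the slide is smaller: 3 hours to 6 minutes, or 30 times.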

