Epitomic Location Recognition A generative approach for location recognition K. Ni, A. Kannan, A. Criminisi and J. Winn In proc. CVPR 2008. Anchorage, Alaska.

Location Recognition: Where am I? Instance recognition. Category recognition (more difficult): Lobby? Cubicle? Hallway? Kitchen?

Geometry-Based Recognition: SLAM and structure from motion. Why do we need metric reconstruction? It loses the flexibility to do class recognition. [Diagram: Training Images, Testing Image, Features, Local Feature Database, Geometry & Labels] References: F. Schaffalitzky and A. Zisserman; G. Schindler, M. Brown and R. Szeliski.

Appearance-Based Recognition: Capture global appearance information. Gaussian mixture model used by A. Torralba et al. [Diagram: Training Images, Preprocessing (e.g. PCA), Image Vectors, Training, Appearance Model] References: A. Torralba, K. Murphy, W. T. Freeman and M. A. Rubin; M. Cummins and P. Newman.

Appearance or Geometry? Can we do better by fusing both kinds of information? A small example with 2 location labels: cubicle and corridor.

The Simplest Model: Nearest neighbor classification. Naive but still effective with enough samples. A small shift may disrupt the recognition. Does not capture uncertainty.
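A minimal sketch of the nearest neighbor baseline, to make the "small shift" problem concrete. The 1-D "images", the labels, and the helper names are all hypothetical; the slides do not say which distance was used, so squared Euclidean distance is assumed here.

```python
def squared_distance(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_neighbor_label(query, training_set):
    """Return the label of the training vector closest to `query`.

    `training_set` is a list of (vector, label) pairs.
    """
    _, best_label = min(
        training_set, key=lambda pair: squared_distance(query, pair[0])
    )
    return best_label

# Hypothetical 1-D "images": a bright edge at different positions.
training_set = [
    ([0, 0, 9, 9, 0, 0, 0, 0], "corridor"),
    ([0, 0, 0, 0, 0, 9, 9, 0], "cubicle"),
]

exact = [0, 0, 9, 9, 0, 0, 0, 0]    # identical to the corridor sample
shifted = [0, 0, 0, 0, 9, 9, 0, 0]  # corridor pattern shifted right by two pixels

label_exact = nearest_neighbor_label(exact, training_set)
label_shifted = nearest_neighbor_label(shifted, training_set)
```

In this toy case the shifted corridor image ends up closer to the cubicle sample, and the classifier returns a hard label with no measure of confidence.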

How to Incorporate Translation Invariance? We need something better than a “bag of frames” model. [Figure: training images, testing image]

Panorama: It models both appearance and geometry. Adapts to camera rotation and focal length change (M. Brown and D. G. Lowe). Generative: an image is a patch “extracted” from the panorama.

Cons of Panoramas: Not easy to build a panorama due to parallax. Do not capture uncertainty. Only work for location instance recognition. No compact representation for repetitive scenes.

Gaussian Mixture Model: Six mixtures trained as in Torralba et al.'s paper. Handles uncertainties but no translation invariance. [Figure: means, variances] Remove boundaries. Much more blurred.

A Weak Panorama: 3D motions can be roughly modeled by 2D translation + scaling. [Figure: 2D translation, scaling]

Epitome = Panorama + GMM. Epitome: a generative model for image patches/video frames. Captures repetitive patterns in the original image. Mapping = 2D translation + scaling. [Figure: a source image, image patches, epitome] References: N. Jojic et al., ICCV 2003; N. Petrovic et al., CVPR 2006.

Epitome as Probabilistic Panorama: Model 3D scenes rather than a single 2D image. Environment = virtual panorama. [Figure: location epitome, means and variances]

Learning the Location Epitome: Initialize the epitome randomly, then run EM iterations. E-step: infer the posteriors over all mappings. M-step: use the posteriors as weights to update the mean and variance of the epitome pixels. [Plot: free energy vs. EM iterations]
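The EM loop on this slide can be sketched on a toy problem. This is an illustration under stated assumptions, not the authors' implementation: the epitome is 1-D, mappings are translations only (with wrap-around, no scaling), the prior over mappings is uniform, and the scene, patch length, epitome size, and variance floor `min_var` are all hypothetical choices.

```python
import math
import random

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def mapping_scores(patch, mean, var):
    """Log-likelihood of `patch` under every mapping (start offset, wrap-around)."""
    size = len(mean)
    return [
        sum(log_gauss(x, mean[(t + i) % size], var[(t + i) % size])
            for i, x in enumerate(patch))
        for t in range(size)
    ]

def e_step(patches, mean, var):
    """Posterior over all mappings for each patch (uniform prior assumed)."""
    posteriors = []
    for patch in patches:
        scores = mapping_scores(patch, mean, var)
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        posteriors.append([w / total for w in weights])
    return posteriors

def m_step(patches, posteriors, mean, var, min_var=0.01):
    """Posterior-weighted update of the mean and variance of each epitome pixel."""
    size = len(mean)
    s0, s1, s2 = [0.0] * size, [0.0] * size, [0.0] * size
    for patch, q in zip(patches, posteriors):
        for t, w in enumerate(q):
            for i, x in enumerate(patch):
                j = (t + i) % size
                s0[j] += w
                s1[j] += w * x
                s2[j] += w * x * x
    new_mean, new_var = list(mean), list(var)
    for j in range(size):
        if s0[j] > 0.0:  # keep the old value if no patch maps to this pixel
            new_mean[j] = s1[j] / s0[j]
            new_var[j] = max(s2[j] / s0[j] - new_mean[j] ** 2, min_var)
    return new_mean, new_var

def log_likelihood(patches, mean, var):
    """Data log-likelihood; its negative equals the free energy after an exact E-step."""
    total = 0.0
    for patch in patches:
        scores = mapping_scores(patch, mean, var)
        top = max(scores)
        total += top + math.log(sum(math.exp(s - top) for s in scores) / len(mean))
    return total

# Hypothetical 1-D "scene" with repetitive structure, and overlapping patches.
scene = [0, 0, 5, 5, 0, 0, 9, 9, 0, 0, 5, 5, 0, 0, 9, 9]
patches = [scene[i:i + 4] for i in range(len(scene) - 3)]

random.seed(0)
mean = [random.uniform(0, 9) for _ in range(8)]  # random initialization
var = [10.0] * 8
ll_before = log_likelihood(patches, mean, var)
for _ in range(20):
    posteriors = e_step(patches, mean, var)
    mean, var = m_step(patches, posteriors, mean, var)
ll_after = log_likelihood(patches, mean, var)
```

Because every epitome pixel is shared by all mappings that cover it, the 8-pixel epitome here summarizes a 16-pixel scene. The log-likelihood does not decrease across iterations, which corresponds to the falling free-energy curve on the slide.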

Model Comparison: The epitome is a smart mixture-of-Gaussians model with parameter sharing among components. For the same number of parameters, the epitome generalizes better.

Build Label Maps: The label maps are the posterior of the label given the mapping. [Figure: epitome and label maps (corridor label map, cubicle label map)]
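One way to read "posterior of the label given the mapping" is to accumulate, per epitome position, the mapping posteriors of the training images that carry each label, then normalize across labels. The sketch below follows that reading; the function name, the 1-D layout, and the example posteriors are hypothetical, and the slides do not confirm that the authors compute the maps exactly this way.

```python
def build_label_maps(posteriors, labels):
    """Estimate p(label | mapping) from per-image mapping posteriors.

    posteriors[n][t] is q(mapping t | training image n);
    labels[n] is the location label of training image n.
    """
    size = len(posteriors[0])
    names = sorted(set(labels))
    counts = {name: [0.0] * size for name in names}
    for q, label in zip(posteriors, labels):
        for t, weight in enumerate(q):
            counts[label][t] += weight
    maps = {name: [0.0] * size for name in names}
    for t in range(size):
        total = sum(counts[name][t] for name in names)
        for name in names:
            # Fall back to a uniform posterior where no image maps to t.
            maps[name][t] = counts[name][t] / total if total > 0 else 1.0 / len(names)
    return maps

# Hypothetical posteriors over three epitome positions for three training images.
posteriors = [
    [0.9, 0.1, 0.0],  # corridor image, maps mostly to position 0
    [0.8, 0.2, 0.0],  # corridor image
    [0.0, 0.1, 0.9],  # cubicle image, maps mostly to position 2
]
labels = ["corridor", "corridor", "cubicle"]
label_maps = build_label_maps(posteriors, labels)
```

Position 0 ends up voting almost entirely for corridor and position 2 for cubicle, while position 1, which both kinds of image reach, gets a softer vote.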

Recognition from Location Epitomes: Fast correlation: infer the best mapping region. Sum the pixel-wise votes. Temporal smoothing using an HMM. [Figure: input testing image, best matching patch, location epitome, corridor label map, cubicle label map]
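The temporal smoothing step can be illustrated with standard HMM forward filtering over per-frame label scores. This is a sketch only: the slides do not give the transition model, so a single hypothetical `stay_prob` parameter is assumed, and the per-frame likelihoods below are made-up numbers standing in for the summed votes.

```python
def hmm_filter(frame_likelihoods, stay_prob=0.9):
    """Forward filtering: return the most probable label for each frame.

    frame_likelihoods is a list of dicts mapping label -> likelihood.
    The transition model (stay with probability `stay_prob`, otherwise
    switch uniformly) is an assumption made for this sketch.
    """
    labels = list(frame_likelihoods[0])
    switch_prob = (1.0 - stay_prob) / (len(labels) - 1)
    belief = {label: 1.0 / len(labels) for label in labels}
    smoothed = []
    for likelihood in frame_likelihoods:
        # Predict: propagate the previous belief through the transition model.
        predicted = {
            label: sum(
                belief[prev] * (stay_prob if prev == label else switch_prob)
                for prev in labels
            )
            for label in labels
        }
        # Update: weight by the current frame's evidence and renormalize.
        unnormalized = {label: predicted[label] * likelihood[label] for label in labels}
        total = sum(unnormalized.values())
        belief = {label: value / total for label, value in unnormalized.items()}
        smoothed.append(max(belief, key=belief.get))
    return smoothed

# Hypothetical per-frame scores; the third frame is a noisy outlier.
frames = [
    {"corridor": 0.8, "cubicle": 0.2},
    {"corridor": 0.8, "cubicle": 0.2},
    {"corridor": 0.4, "cubicle": 0.6},
    {"corridor": 0.8, "cubicle": 0.2},
]
raw_labels = [max(frame, key=frame.get) for frame in frames]
smoothed_labels = hmm_filter(frames)
```

Taken frame by frame, the third frame would be labeled cubicle; with the temporal prior, the sequence stays in the corridor throughout.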

Color is not always the best feature: Other features besides RGB. For example, a stereo feature captures the depth information. High stereo accuracy is not needed (efficient DP here). [Figure: Corridor, Cubicle, Kitchen]

Integrating Multiple Features: Stack multiple feature “channels”. [Figure: R, G, B, Stereo]

Local Histograms: Enable better translation invariance and more generalization. Error rate: 0.49 → 0.36 in a test on a 4-class dataset. Improve the efficiency dramatically: 30 times speed-up.
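A small sketch of why local histograms help with translation invariance: each cell of the image is replaced by a histogram of its values, so a shift that stays inside a cell leaves the feature unchanged. The cell size, bin count, and toy images are hypothetical, and the slides do not specify how the authors' 50-dimensional histograms are built.

```python
def local_histograms(image, cell, bins, max_value=256):
    """Replace each cell x cell block with a normalized intensity histogram.

    `image` is a list of rows of integer intensities in [0, max_value).
    Returns a grid of histograms, one per block.
    """
    rows, cols = len(image), len(image[0])
    grid = []
    for r0 in range(0, rows - cell + 1, cell):
        grid_row = []
        for c0 in range(0, cols - cell + 1, cell):
            hist = [0.0] * bins
            for r in range(r0, r0 + cell):
                for c in range(c0, c0 + cell):
                    hist[image[r][c] * bins // max_value] += 1.0
            grid_row.append([h / (cell * cell) for h in hist])
        grid.append(grid_row)
    return grid

# Two hypothetical 4x4 images: one bright pixel, shifted by one column.
image_a = [
    [0, 255, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
image_b = [
    [255, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

features_a = local_histograms(image_a, cell=2, bins=2)
features_b = local_histograms(image_b, cell=2, bins=2)
```

The two images differ pixel by pixel, yet their histogram grids are identical. The grid is also smaller than the image (2x2 cells instead of 4x4 pixels), which is where the efficiency gain comes from.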

Supervised Learning: Incorporates training image labels. Helps discriminate images with similar features but different location labels. [Figure: an example epitome, an example label feature] Discriminative features: a monitor in the cubicle, a microwave in the kitchen.

MIT Image Database: Created by Antonio Torralba et al. 17 sequences, 62 locations, 7 categories, 72077 images.

Results on Recognizing Location Instances: Location epitome vs. GMM, 10% better on average.

Results on Recognizing Location Classes: Location epitome vs. GMM, 10%-20% better.

MSRC Data Set: Captured with a stereo camera. 5409 images collected at the speed of 4 fps. 11 sequences and 7 classes: corridor_visionlab, cubicle_mlp, kitchen-fl2-north, lectureroom-large, lectureroom-small, stairs-1st-to-2nd, stairs-2nd-to-1st.

Integrate Depth Cues. [Figure panels: corridor_visionlab, cubicle_mlp, kitchen-fl2-north, lectureroom-large, lectureroom-small, stairs-1st-to-2nd, stairs-2nd-to-1st]

Instance Recognition with Multiple Features: RGB & stereo outperforms the other features. Learning: 5.7 fps. Recognition: 116 fps = 29 times the capture speed.

Summary: A generative model for the recognition of both location instances and classes. Fast: capable of real-time applications. Flexible: capable of integrating various features. Probabilistic: capable of capturing uncertainties. Future applications: navigation for visually impaired people; appearance-based loop closing for SLAM problems.

Epitomic Location Recognition: A generative approach for location recognition. K. Ni, A. Kannan, A. Criminisi and J. Winn. Thank you!

Local Histograms (2): Improves efficiency (both training and testing). The bottleneck: convolving the epitome with the images. Compression rate: $3(C_1 C_2)^2/50 = 2400$. Learning: 3 hours → 6 mins, 30 times faster. [Figure: convolving 3-dimensional RGB features (an $M \times N$ image with an $M_e \times N_e$ epitome) vs. convolving 50-dimensional local histograms (an $M/C_1 \times N/C_2$ image with an $M_e/C_1 \times N_e/C_2$ epitome)]
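The compression rate on this slide can be unpacked as a cost ratio. The derivation below assumes (the slide does not state it) that the cost of correlating an image with an epitome is proportional to the number of feature channels times the image size times the epitome size, and that $C_1$ and $C_2$ are the histogram cell sizes along the two image axes.

```latex
\[
\frac{\text{cost}_{\text{RGB}}}{\text{cost}_{\text{hist}}}
= \frac{3 \, M N \, M_e N_e}
       {50 \, \dfrac{M}{C_1} \dfrac{N}{C_2} \dfrac{M_e}{C_1} \dfrac{N_e}{C_2}}
= \frac{3 \, (C_1 C_2)^2}{50} = 2400
\quad\Longrightarrow\quad C_1 C_2 = 200 .
\]
```

The slide does not give $C_1$ and $C_2$ individually; only their product follows from the quoted rate. The measured training speed-up is smaller than this ratio: 3 hours is 180 minutes, and $180 / 6 = 30$.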
