Goggle: Gist on the Google Phone. A content-based image retrieval system for the Google phone. Manu Viswanathan, Chin-Kai Chang, Ji Hyun Moon.

1 Goggle: Gist on the Google Phone. A content-based image retrieval system for the Google phone. Manu Viswanathan, Chin-Kai Chang, Ji Hyun Moon.

2 MESA BRIDGES Project Outline: A content-based image retrieval system on an Android phone. Finding similar images that match the image captured on the cell phone. Gist algorithm.

3 Gist & Scene Categorization. Requirements: Accuracy: should retain enough information to be able to make broad categorizations. Speed: should be able to quickly perform gist transformation and exemplar matching. Diagram: a source image (160 x 120 pixels, 19,200 grayscale numbers) is reduced to a gist vector (~100 numbers); some new scene is compared with category exemplars.
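
The exemplar matching named in the requirements above can be sketched as a nearest-neighbour search over gist vectors. This is only an illustration: the Euclidean distance, the category names, and the 4-number vectors are assumptions (real gist vectors would have roughly 100 entries), and later slides describe an SVM classifier rather than this simple rule.

```python
import math

# 160 x 120 grayscale pixels give the 19,200 numbers quoted on the slide.
assert 160 * 120 == 19200


def nearest_exemplar(gist, exemplars):
    """Return (category, distance) of the exemplar closest to `gist`.

    `exemplars` maps a category name to a list of gist vectors.
    Euclidean distance is an assumption made for this sketch.
    """
    best_label, best_dist = None, math.inf
    for label, vectors in exemplars.items():
        for vector in vectors:
            dist = math.dist(gist, vector)
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label, best_dist


# Hypothetical exemplars with toy 4-number gist vectors.
exemplars = {
    "beach": [[0.9, 0.1, 0.0, 0.2]],
    "street": [[0.1, 0.8, 0.7, 0.1]],
}
label, dist = nearest_exemplar([0.8, 0.2, 0.1, 0.2], exemplars)
print(label)
```

Working on ~100 numbers instead of 19,200 is what keeps this comparison cheap, which is the speed requirement on the slide.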

4 Project Design: Client-Server application. Components: Camera, Image Recorder, Gist Estimator, Http Handler, User Interface, Web Server, PHP handler, Perl Module, C++ SVM Classifier, Image Database. Messages: Http Request, Http Response.
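
A minimal sketch of the Http Request step, assuming the phone sends an already computed gist vector to the server. The slides do not give the server URL or the payload format, so the endpoint, the JSON body, and the field name `gist` below are all hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request

# Hypothetical endpoint; the real server address is not given in the slides.
SERVER_URL = "http://example.com/goggle/classify.php"


def build_request(gist_vector):
    """Wrap a gist vector in an HTTP POST request without sending it."""
    body = json.dumps({"gist": gist_vector}).encode("utf-8")
    return urllib.request.Request(
        SERVER_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request([0.12, 0.5, 0.33])
# Sending it would look like: urllib.request.urlopen(request).read()
print(request.get_method(), request.full_url)
```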

5 Lazebnik Algorithm. Steps: Compute SIFT grid, Feature Extraction, Spatial Pyramid, Spatial Histogram, Compute Gist Vector, SVM Classification.
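
The spatial pyramid and spatial histogram steps can be sketched as follows, using the level weights from Lazebnik et al. (1/2^L for level 0 and 1/2^(L-l+1) for level l >= 1). The input format, with each feature given as a normalized position plus a visual-word index, is an assumption for this sketch. The SIFT extraction and SVM steps are not shown.

```python
def spatial_pyramid_histogram(features, vocab_size, levels):
    """Build a concatenated, level-weighted spatial histogram.

    features: iterable of (x, y, word), with x and y in [0, 1) and
              0 <= word < vocab_size (assumed input format).
    levels:   L, the finest pyramid level; level l has 2^l x 2^l cells.
    """
    vector = []
    for level in range(levels + 1):
        cells = 2 ** level
        if level == 0:
            weight = 1 / 2 ** levels
        else:
            weight = 1 / 2 ** (levels - level + 1)
        hist = [0.0] * (cells * cells * vocab_size)
        for x, y, word in features:
            cx = min(int(x * cells), cells - 1)  # column of the grid cell
            cy = min(int(y * cells), cells - 1)  # row of the grid cell
            hist[(cy * cells + cx) * vocab_size + word] += weight
        vector.extend(hist)
    return vector


# Toy input: three features, a 2-word vocabulary, L = 2.
features = [(0.1, 0.1, 0), (0.9, 0.9, 1), (0.2, 0.8, 0)]
vec = spatial_pyramid_histogram(features, vocab_size=2, levels=2)
print(len(vec))
```

The vector length is M(1 + 4 + ... + 4^L), so a vocabulary of M = 200 with L = 2 gives 4,200 numbers per image.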

6 Lazebnik Algorithm: Feature Extraction. Weak features: edge points at 8 orientations and 2 scales. These channels are the vocabulary. Vocabulary size M = 16. Strong features: SIFT on 16 x 16 pixel patches. Vocabulary from K-means on SIFT descriptors. Typically, M = 200 or 400.
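
Building a vocabulary with K-means and mapping descriptors to visual words might look like the sketch below. It uses plain Lloyd iterations with a naive initialization (the first k descriptors) and toy 2-D points in place of SIFT descriptors. These are simplifications for illustration, not the project's implementation.

```python
import math


def assign(descriptor, centroids):
    """Index of the nearest centroid, i.e. the descriptor's visual word."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(descriptor, centroids[i]))


def kmeans(descriptors, k, iterations=10):
    """Plain Lloyd's algorithm; returns k centroids (the vocabulary)."""
    centroids = [list(d) for d in descriptors[:k]]  # naive initialization
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[assign(d, centroids)].append(d)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids


# Toy 2-D "descriptors" forming two obvious groups.
descriptors = [[0, 0], [10, 10], [0, 1], [10, 11], [1, 0], [11, 10]]
vocabulary = kmeans(descriptors, k=2)  # here M = 2
words = [assign(d, vocabulary) for d in descriptors]
print(words)
```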

7 Lazebnik Algorithm: Spatial Matching. The idea is to “contextualize” the visual words by performing a sort of geometric match. X_m and Y_m are sets of 2D vectors representing the positions of the visual words in the input and training images. For each word, we apply the pyramid match kernel K^L to these position vectors. Categorization is done with an SVM trained using the one-versus-all rule.
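
For a single visual word, the pyramid match kernel can be sketched as below, following the form in Lazebnik et al.: K^L = I^L + sum over l = 0..L-1 of (I^l - I^(l+1)) / 2^(L-l), where I^l is the histogram intersection of the two position sets on a 2^l x 2^l grid. Positions normalized to [0, 1) are an assumption of this sketch. The full kernel would sum this value over all words m.

```python
from collections import Counter


def cell_counts(points, level):
    """Count points per grid cell at the given pyramid level."""
    cells = 2 ** level
    return Counter(
        (min(int(x * cells), cells - 1), min(int(y * cells), cells - 1))
        for x, y in points
    )


def intersection(points_x, points_y, level):
    """Histogram intersection I^l: matches found at this resolution."""
    hx = cell_counts(points_x, level)
    hy = cell_counts(points_y, level)
    return sum(min(count, hy[cell]) for cell, count in hx.items())


def pyramid_match(points_x, points_y, levels):
    """K^L for one visual word; matches new at coarser levels weigh less."""
    inter = [intersection(points_x, points_y, l) for l in range(levels + 1)]
    kernel = float(inter[levels])
    for l in range(levels):
        kernel += (inter[l] - inter[l + 1]) / 2 ** (levels - l)
    return kernel


# Toy position sets X_m and Y_m for one word.
x_m = [(0.1, 0.1), (0.6, 0.6)]
y_m = [(0.2, 0.2), (0.9, 0.9)]
print(pyramid_match(x_m, y_m, levels=2))
```

In this toy case one pair of points matches at the finest level and the other pair only matches one level coarser, so it contributes half as much.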

8 Lazebnik Algorithm: Pyramid Matching

9 Experimental Setup. Caltech 101: 100%-0%, 75%-25%, 50%-50%. 8 categories: Car Side, Cellphone, Chair, Cup, Faces, Laptop, Motorbikes, Pizza. Vocabulary size: 25, 50, 100, 200. Training is done on the server side.
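
A per-category split of the images into training and testing sets could be sketched as follows. The slide does not say how the percentages were applied, so the random shuffle, the fixed seed, the rounding rule, and the file names are all assumptions.

```python
import random


def split_category(images, train_fraction, seed=0):
    """Shuffle one category's images and cut them into (train, test)."""
    shuffled = list(images)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def split_dataset(images_by_category, train_fraction, seed=0):
    """Apply the same fraction to every category independently."""
    return {
        category: split_category(images, train_fraction, seed)
        for category, images in images_by_category.items()
    }


# Hypothetical file names for two of the eight categories.
dataset = {
    "Cup": [f"cup_{i}.jpg" for i in range(8)],
    "Pizza": [f"pizza_{i}.jpg" for i in range(12)],
}
splits = split_dataset(dataset, train_fraction=0.25)
cup_train, cup_test = splits["Cup"]
print(len(cup_train), len(cup_test))
```

Splitting inside each category keeps the class proportions the same in both sets.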

10 Testing Result. 25% Training, 75% Testing. Vocabulary size 200. 57.3% overall classification accuracy. Rows are ground truth; columns are the assigned category.

Ground Truth  Car Side  Cellphone  Chair  Cup  Faces  Laptop  Motorbikes  Pizza  Unknown
Car Side            87          0      0    0      0       0           0      0        5
Cellphone            0         40      0    0      0       0           0      0        4
Chair                0          0      0    0      0       0           2      0       44
Cup                  1          0      0    0      4       0           0      0       37
Faces                0          0      0    0    321       0           0      0        5
Laptop               0          1      0    0      0      16           1      0       42
Motorbikes           0          0      0    0      0       0         576      0       22
Pizza                0          0      0    0      1       0           1     20       17
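
The accuracy figure can be checked against the confusion matrix with a short calculation. The matrix below is one reading of the slide's table, whose digits run together in the transcript, so the way each row is split into nine counts is an assumption. Under this reading, the mean of the per-class accuracies comes to about 57.3%, matching the slide, while the pooled accuracy over all test images is about 85%.

```python
LABELS = ["Car Side", "Cellphone", "Chair", "Cup",
          "Faces", "Laptop", "Motorbikes", "Pizza"]

# Rows: ground truth. Columns: the 8 categories in LABELS order, then "Unknown".
# The digit grouping is an assumed reading of the flattened slide table.
CONFUSION = [
    [87, 0, 0, 0, 0, 0, 0, 0, 5],
    [0, 40, 0, 0, 0, 0, 0, 0, 4],
    [0, 0, 0, 0, 0, 0, 2, 0, 44],
    [1, 0, 0, 0, 4, 0, 0, 0, 37],
    [0, 0, 0, 0, 321, 0, 0, 0, 5],
    [0, 1, 0, 0, 0, 16, 1, 0, 42],
    [0, 0, 0, 0, 0, 0, 576, 0, 22],
    [0, 0, 0, 0, 1, 0, 1, 20, 17],
]

# Per-class accuracy: correct count on the diagonal over the row total.
per_class = {LABELS[i]: row[i] / sum(row) for i, row in enumerate(CONFUSION)}

# Mean of per-class accuracies: every category counts equally.
mean_per_class = sum(per_class.values()) / len(per_class)

# Pooled accuracy: every test image counts equally.
correct = sum(row[i] for i, row in enumerate(CONFUSION))
pooled = correct / sum(sum(row) for row in CONFUSION)

for name, acc in per_class.items():
    print(f"{name:<10} {acc:.1%}")
print(f"mean per-class {mean_per_class:.1%}, pooled {pooled:.1%}")
```

The gap between the two numbers comes from Faces and Motorbikes, which have many test images and high accuracy, while Chair and Cup are never classified correctly.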

11 Speed vs. Accuracy (figure not reproduced in the transcript; points labeled 1, 2, 3)

12 Result 3: Pyramid Matching. Edge points at 8 orientations and 2 scales. These channels are the vocabulary. Vocabulary size M = 16. SIFT on 16 x 16 pixel patches. Vocabulary from K-means on SIFT descriptors. Typically, M = 200 or 400.

13 Discussion & Conclusions. The client-server design makes the application easy to port to different embedded systems. Computing the gist vector is an expensive process on an embedded system. Reducing the vocabulary size improves processing speed at the cost of some accuracy.

14 Lazebnik, S., Schmid, C., Ponce, J., "Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories," CVPR, 2006. Gordon, I. and Lowe, D. G., "Scene modelling, recognition and tracking with invariant image features," International Symposium on Mixed and Augmented Reality (ISMAR), 2004. http://ilab.usc.edu/wiki/index.php/Goggle or http://ilab.usc.edu/~mviswana/Goggle or http://ilab.usc.edu/~kai/Goggle

