Goggle Gist on the Google Phone: A content-based image retrieval system for the Google phone. Manu Viswanathan, Chin-Kai Chang, Ji Hyun Moon.

Presentation transcript:


MESA BRIDGES. Project Outline: a content-based image retrieval system on an Android phone; finding similar images that match the image captured on the cell phone; the gist algorithm.

Gist & Scene Categorization: A source image of 160 x 120 pixels (19,200 grayscale numbers) is reduced to a gist vector of about 100 numbers. The gist of a new scene is compared against category exemplars. Requirements: Accuracy: the gist should retain enough information to make broad categorizations. Speed: the gist transformation and exemplar matching should both be fast.
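The reduction and the exemplar matching step can be illustrated with a minimal sketch. The slide does not name a distance metric, so the Euclidean distance below is an assumption, as are the function names and the toy two-number "gist" vectors in the usage lines.

```python
import math

def raw_size(width, height):
    """Number of grayscale values in a width x height source image."""
    return width * height

def nearest_exemplar(gist, exemplars):
    """Return the category label whose exemplar vector is closest to `gist`.

    `exemplars` maps a label to one gist vector. Euclidean distance is an
    assumption; the slides do not say which metric is used.
    """
    return min(exemplars, key=lambda label: math.dist(gist, exemplars[label]))

# Sizes quoted on the slide: 160 x 120 pixels, reduced to about 100 numbers.
pixels = raw_size(160, 120)
print(pixels, "raw values, roughly", pixels // 100, "times the gist length")

# Hypothetical two-dimensional exemplars, for illustration only.
toy_exemplars = {"Cup": [1.0, 0.0], "Chair": [0.0, 1.0]}
print(nearest_exemplar([0.9, 0.1], toy_exemplars))
```

A real system would hold several exemplars per category and, as later slides describe, use an SVM rather than a plain nearest-neighbour rule.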

Project Design: Client-server application. Components: Camera, Image Recorder, Gist Estimator, Http Handler, User Interface, Web Server, PHP handler, Perl Module, C++ SVM Classifier, Image Database. Communication: Http Request, Http Response.

Lazebnik Algorithm: compute SIFT grid, feature extraction, spatial pyramid, spatial histogram, compute gist vector, SVM classification.
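The spatial pyramid and spatial histogram steps can be sketched as follows. This is an illustrative reading of the pipeline, not the authors' code: it assumes each feature has already been quantized to a visual word index and that positions are normalized to [0, 1). The function name and input layout are hypothetical.

```python
def pyramid_histogram(features, vocab_size, levels):
    """Concatenate per-cell visual-word histograms for pyramid levels 0..levels.

    `features` is a list of (x, y, word) tuples with x, y in [0, 1) and
    word in range(vocab_size). Level l splits the image into 2**l x 2**l
    cells. No level weighting is applied in this sketch.
    """
    vector = []
    for level in range(levels + 1):
        cells = 2 ** level
        hist = [0] * (cells * cells * vocab_size)
        for x, y, word in features:
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hist[(row * cells + col) * vocab_size + word] += 1
        vector.extend(hist)
    return vector

# Three hypothetical features, a vocabulary of 2 words, levels 0 and 1.
toy = [(0.1, 0.1, 0), (0.9, 0.9, 1), (0.2, 0.8, 0)]
print(pyramid_histogram(toy, vocab_size=2, levels=1))
```

The resulting vector is what gets passed to the classifier; its length grows with both the vocabulary size and the number of pyramid levels.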

Lazebnik Algorithm: Feature Extraction. Weak features: edge points at 8 orientations and 2 scales. These channels are the vocabulary, so the vocabulary size is M = 16. Strong features: SIFT on 16 x 16 pixel patches. The vocabulary comes from K-means on SIFT descriptors. Typically, M = 200 or 400.
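Building a vocabulary with K-means and then mapping a descriptor to its visual word can be sketched as below. This is a plain Lloyd's iteration in pure Python, seeded with the first k points for reproducibility; the seeding rule, the iteration count, and the two-dimensional toy "descriptors" are assumptions made for illustration (real SIFT descriptors have 128 dimensions).

```python
import math

def nearest(point, centers):
    """Index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def kmeans(points, k, iterations=10):
    """Plain Lloyd's k-means; the first k points seed the centers."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                centers[i] = [sum(dim) / len(group) for dim in zip(*group)]
    return centers

# Hypothetical 2-D descriptors forming two obvious clusters.
descriptors = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
vocabulary = kmeans(descriptors, k=2)
print(vocabulary)
print(nearest((1.0, 1.0), vocabulary))  # visual word index for a new descriptor
```

With M = 200 or 400 and many thousands of descriptors, an optimized library implementation would be the practical choice; the sketch only shows the structure of the computation.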

Lazebnik Algorithm: Spatial Matching. The idea is to "contextualize" the visual words by performing a sort of geometric match. X_m and Y_m are sets of 2D vectors representing the positions of visual word m in the input and training images. For each word, we apply the pyramid match kernel K^L to these position sets. Categorization is done with an SVM trained using the one-versus-all rule.
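A minimal sketch of the pyramid match kernel on position sets follows, using the weighting from the Lazebnik et al. paper: K^L = I^0 / 2^L + sum over l = 1..L of I^l / 2^(L-l+1), where I^l is the histogram intersection at level l. Positions are assumed to be normalized to [0, 1); the function names and the dictionary layout (word index to list of positions) are hypothetical.

```python
def cell_counts(points, level):
    """Count points per cell on a 2**level x 2**level grid over [0, 1)^2."""
    cells = 2 ** level
    counts = [0] * (cells * cells)
    for x, y in points:
        col = min(int(x * cells), cells - 1)
        row = min(int(y * cells), cells - 1)
        counts[row * cells + col] += 1
    return counts

def intersection(points_x, points_y, level):
    """Histogram intersection I^l of two point sets at one pyramid level."""
    hx = cell_counts(points_x, level)
    hy = cell_counts(points_y, level)
    return sum(min(a, b) for a, b in zip(hx, hy))

def pyramid_match(points_x, points_y, levels):
    """K^L = I^0 / 2^L + sum_{l=1..L} I^l / 2^(L - l + 1)."""
    total = intersection(points_x, points_y, 0) / 2 ** levels
    for level in range(1, levels + 1):
        total += intersection(points_x, points_y, level) / 2 ** (levels - level + 1)
    return total

def spatial_pyramid_kernel(words_x, words_y, levels):
    """Sum the pyramid match over visual words; inputs map word -> positions."""
    return sum(pyramid_match(words_x[m], words_y.get(m, []), levels)
               for m in words_x)

# Hypothetical positions of one visual word in two images.
X = [(0.1, 0.1), (0.9, 0.9)]
Y = [(0.2, 0.2), (0.6, 0.1)]
print(pyramid_match(X, Y, levels=1))
```

Matches found in finer cells receive larger weights, so two images score highly only when the same visual words appear in roughly the same places.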

Lazebnik Algorithm: Pyramid Matching

Experimental Setup: Dataset: Caltech. Splits: %-0%, 75%-25%, 50%-50%. 8 categories: Car Side, Cellphone, Chair, Cup, Faces, Laptop, Motorbikes, Pizza. Vocabulary sizes: 25, 50, 100, 200. Training is done on the server side.

Testing Result: 25% training, 75% testing, vocabulary size 200: 57.3% overall classification accuracy. [Confusion matrix: rows are the ground-truth categories Car Side, Cellphone, Chair, Cup, Faces, Laptop, Motorbikes, Pizza; columns are the same categories plus Unknown.]
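An overall accuracy figure of this kind is normally the share of test images on the diagonal of the confusion matrix. The sketch below shows that calculation under the assumption that the matrix holds counts. The numbers in the usage lines are made up for illustration and are not the experiment's results.

```python
def overall_accuracy(confusion, labels):
    """Fraction of test images whose predicted label equals the ground truth.

    `confusion[truth][predicted]` holds counts; `predicted` may also be
    "Unknown", which always counts as an error.
    """
    correct = sum(confusion[label].get(label, 0) for label in labels)
    total = sum(sum(row.values()) for row in confusion.values())
    return correct / total

# Hypothetical counts for two of the eight categories.
toy_confusion = {
    "Cup": {"Cup": 3, "Chair": 1},
    "Chair": {"Chair": 2, "Unknown": 2},
}
print(overall_accuracy(toy_confusion, ["Cup", "Chair"]))
```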

Speed vs. Accuracy [figure: panels 1, 2, 3]

Result 3: Pyramid Matching. Edge points at 8 orientations and 2 scales. These channels are the vocabulary, so the vocabulary size is M = 16. SIFT on 16 x 16 pixel patches. The vocabulary comes from K-means on SIFT descriptors. Typically, M = 200 or 400.

Discussion & Conclusions: The client-server design makes the application easy to port to different embedded systems. Computing the gist vector is an expensive process on an embedded system. Reducing the vocabulary size improves processing speed at the cost of some accuracy.
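One reason vocabulary size affects speed is that the spatial pyramid descriptor length grows linearly with it. In the Lazebnik et al. formulation the descriptor has M * (4^(L+1) - 1) / 3 entries for vocabulary size M and finest level L. The sketch below tabulates this for the vocabulary sizes used in the experiments; L = 2 is an assumption, since the slides do not state the number of levels used.

```python
def pyramid_dim(vocab_size, levels):
    """Descriptor length: vocab_size * (1 + 4 + ... + 4**levels)."""
    return vocab_size * ((4 ** (levels + 1) - 1) // 3)

# Vocabulary sizes from the experimental setup; levels=2 is assumed.
for m in (25, 50, 100, 200):
    print(m, pyramid_dim(m, levels=2))
```

Halving the vocabulary halves the vector that must be built, transmitted, and classified, which is consistent with the speed and accuracy trade-off noted above.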

Lazebnik, S., Schmid, C., Ponce, J. "Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories," CVPR, 2006.
Gordon, I., Lowe, D. G. "Scene modelling, recognition and tracking with invariant image features," International Symposium on Mixed and Augmented Reality (ISMAR).