HOGgles: Visualizing Object Detection Features (to appear in ICCV 2013). Carl Vondrick, Aditya Khosla, Tomasz Malisiewicz and Antonio Torralba, MIT. Presented by: Yonatan Dishon, Nov 2013
Today's talk: Motivation, Related Work, HOGgles under the hood (3 baselines + main algorithm), Limitations, Quantitative + Qualitative evaluation, Human+HOG detector, Paper conclusions, Future development
Links: PASCAL 2 challenge, Deformable Part Model
So why HOGgles? Image from: C. Vondrick, A. Khosla, T. Malisiewicz, A. Torralba. "HOGgles: Visualizing Object Detection Features" 2013
So why HOGgles? !? Maybe I should put on my HOGgles! HOGgles visualization
So why HOGgles? This is a visualization of the descriptor space – this is what a classification/detection algorithm sees! So which of the following would you classify as a car? Remember: humans are the "perfect" classifier! Object categories (PASCAL VOC): Airplane, Bicycle, Bird, Boat, Bottle, Bus, Car, Cat, Chair, Cow, Table, Dog, Horse, Motorbike, Person, Potted Plant, Sheep, Sofa, Train, TV/Monitor
Motivation. So why did my detector fail? The training set – maybe not a good one? The learning algorithm – maybe it should have been different? The features – maybe there are better features for this kind of problem? Visualizing the feature space can give us an intuitive understanding of our detection system's limitations and failures.
HOGgles contributions: A tool to explain some of the failures of object detection systems. An algorithm to visualize the feature space of object detectors (general features – not only HOG!). 4 different algorithms are presented. Public feature visualization toolbox: http://web.mit.edu/vondrick/ihog/#code
Today's talk: Motivation, Related Work, HOGgles under the hood (3 baselines + main algorithm), Limitations, Quantitative + Qualitative evaluation, Human+HOG detector, Paper conclusions, Future development
Related Work (1). Iterative process for recovering images from GIST features. A. Oliva and A. Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV, 2001. A computational model for recognizing real-world scenes, based on the Spatial Envelope.
Related Work (1)
Reconstruct an image given its SIFT keypoints, based on a huge database. Image from: P. Weinzaepfel, H. Jégou, and P. Pérez. Reconstructing an image from its local descriptors. In CVPR, 2011.
Related Work (1)
Pipeline labels: original image; calculate SIFT on elliptic regions of interest; affine normalization to square patches; copied patches; blending; interpolation. Reconstructing an image from its local descriptors, Philippe Weinzaepfel, Hervé Jégou and Patrick Pérez, Proc. IEEE CVPR'11. Slide credit: Ezgi Mercan
15
Related Work (1) Image from: P. Weinzaepfel, H. J´egou, and P. P´erez. Reconstructing an image from its local descriptors. In CVPR, 2011. Website for more examples : http://www.irisa.fr/texmex/people/jegou/projects/reconstructing/index.html http://www.irisa.fr/texmex/people/jegou/projects/reconstructing/index.html
16
Related Work (2) Reconstruct an image given only LBP features E. d’Angelo, A. Alahi, and P. Vandergheynst. Beyond bits: Reconstructing images from local binary descriptors. ICPR, 2012. 2 Using LBD descriptors (Local Binary Descriptors) – BRIEF and FREAK No external information! Reconstruction as a regularized inverse problem A. Alahi, R. Ortiz, and P. Vandergheynst. FREAK: Fast Retina Keypoint. In IEEE Conference on Computer Vision and Pattern Recognition (To Appear), 2012. M. Calonder, V. Lepetit, C. Strecha, and P. Fua. BRIEF: Binary Robust Independent Elementary Features. Computer Vision–ECCV 2010, pages 778–792, 2010.
17
Related Work (2) Image reconstruction results of FREAK (top row) and BRIEF (bottom row) descriptors. E. d’Angelo, A. Alahi, and P. Vandergheynst. Beyond bits: Reconstructing images from local binary descriptors. ICPR, 2012. 2
18
Related Work (3) Human Debugging of Object detectors - D. Parikh and C. L. Zitnick. The role of features, algorithms and data in visual recognition. In CVPR, 2010. 2, 7
19
Related Work (4) Have We reached Bayes risk for HOG features? X. Zhu, C. Vondrick, D. Ramanan, and C. Fowlkes. Do we need more training data or better models for object detection? BMVC, 2012. 2 Do existing detectors for object recognition are improve as the training set size continue to grow?
20
Related Work (4) SVM are sensitive to noise and should be trained with “clean” data. The HOG features are limited but there are still a performance to squeeze out of them.
21
Related Work (4) RMP is better than one template that contain all training data
22
HOGgles added value to related work The algorithm is feature independent The algorithms are Fast! - under a second for visualizing on a desktop computer – interactive debugging process. The visualization can be easily understood by humans!
23
Today talk Motivation Related Work HOGgles under the hood. 3 Baselines + main algorithm Limitations Quantative + Qualitative evaluation Human+HOG detector Paper Conclusion Future Development
24
HOGgles – under the hood The problem of feature visualization as a feature inversion problem. Given a feature vector - what was the image/patch that created it? Let be an image and be the corresponding HOG feature descriptor. is a many to one function. The inversion problem cannot be solved analytically!
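To make the many-to-one nature of φ concrete, here is a minimal NumPy sketch of a HOG-style descriptor for a single cell (a gradient-orientation histogram). This is an illustrative toy, not the full Dalal–Triggs HOG and not the paper's code; `hog_cell` is a hypothetical helper. Because the descriptor keeps only orientation statistics, two patches with different pixel values can map to the same y:

```python
import numpy as np

def hog_cell(patch, n_bins=9):
    """Toy HOG-style descriptor for one cell: a histogram of
    gradient orientations, weighted by gradient magnitude."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    # unsigned orientation in [0, 180)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0
    bins = (ang / (180.0 / n_bins)).astype(int) % n_bins
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), mag.ravel())
    # L2 normalization, as in real HOG blocks
    return hist / (np.linalg.norm(hist) + 1e-9)

# Two different patches with the same edge structure but different
# brightness produce (nearly) identical descriptors: phi is many-to-one.
ramp = np.tile(np.arange(8.0), (8, 1))   # horizontal intensity ramp
brighter = ramp + 100.0                  # same gradients, shifted values
print(np.allclose(hog_cell(ramp), hog_cell(brighter)))  # True
```

Since distinct images collapse onto one descriptor, any inverse must pick a representative preimage, which is exactly what the algorithms below do.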
HOGgles under the hood. The problem is formalized as an optimization problem: given a descriptor y, seek an image x that minimizes the reconstruction error, x* = argmin_x ||φ(x) − y||². This objective is not convex, and attempts to find a minimum with ordinary optimization algorithms (steepest descent and Newton's method) did not work.
Today's talk: Motivation, Related Work, HOGgles under the hood (3 baselines + main algorithm), Limitations, Quantitative + Qualitative evaluation, Human+HOG detector, Paper conclusions, Future development
HOGgles under the hood. 4 algorithms are shown – 3 as baselines and one as the main algorithm. Baseline 1: Exemplar LDA. Baseline 2: Ridge Regression. Baseline 3: Direct Optimization. Main algorithm: Paired Dictionary.
HOGgles under the hood (Baseline 1 – ELDA). Baseline 1: Exemplar LDA (B. Hariharan, J. Malik & D. Ramanan, ECCV 2012). Run an exemplar-LDA detector built from the query HOG template over a large database; the HOG inverse is the average of its top-K detections in RGB space.
HOGgles under the hood (Baseline 1 -ELDA) Slide credit: Ezgi Mercan
HOGgles under the hood (Baseline 1 – ELDA). PROs: Simple. Surprisingly accurate results, even when the database doesn't contain the category of the HOG template! CONs: Computationally expensive – it runs an object detector over a large database – and it yields blurred results.
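The final step of the ELDA baseline can be sketched in a few lines, assuming the detector has already returned its top-K matching patches; `invert_elda` and its synthetic inputs are illustrative stand-ins, not the paper's implementation:

```python
import numpy as np

def invert_elda(top_k_patches):
    """ELDA-style inverse: average the top-K detected patches.

    top_k_patches: array of shape (K, H, W, 3) holding the RGB patches
    that scored highest under an exemplar-LDA detector whose template
    is the query HOG descriptor.
    """
    patches = np.asarray(top_k_patches, dtype=float)
    return patches.mean(axis=0)  # per-pixel average -> blurred inverse

# Averaging K noisy views of the same structure keeps the shared
# structure and washes out the differences (hence the blur).
rng = np.random.default_rng(0)
base = rng.random((16, 16, 3))
patches = base + 0.1 * rng.standard_normal((50, 16, 16, 3))
recon = invert_elda(patches)
print(np.abs(recon - base).mean() < 0.05)  # True: close to the shared structure
```

The per-pixel mean is exactly why this baseline is accurate on average yet blurred: anything the top detections disagree on is averaged away.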
HOGgles under the hood (Baseline 2 – Ridge Regression). Computes the statistically most probable grayscale image given its HOG feature. Model x and y as jointly Gaussian; the HOG inverse (the visualization) is the conditional mean, x* = μ_x + Σ_xy Σ_yy⁻¹ (y − μ_y). The means and covariances are estimated on a large database. Inversion is a single matrix multiplication!
HOGgles under the hood (Baseline 2 – Ridge Regression). PROs: Simple – inversion is one matrix multiplication! Very fast – under a second per inversion. CONs: Inversion yields blurry images.
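The ridge-regression baseline can be sketched end to end, assuming paired training data (flattened images X, descriptors Y) is available: estimate means and covariances once, then each inversion is one matrix product. The helper names and the small regularizer are illustrative, not the paper's exact settings:

```python
import numpy as np

def fit_ridge_inverse(X, Y, reg=1e-6):
    """Fit a Gaussian-conditional-mean inverse from paired samples.
    X: (n, dx) flattened grayscale patches; Y: (n, dy) HOG descriptors.
    Returns (A, b) such that the inverse image is A @ y + b."""
    mu_x, mu_y = X.mean(0), Y.mean(0)
    Xc, Yc = X - mu_x, Y - mu_y
    n = X.shape[0]
    cov_xy = Xc.T @ Yc / n
    cov_yy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])  # ridge term
    A = cov_xy @ np.linalg.inv(cov_yy)
    b = mu_x - A @ mu_y
    return A, b

# Sanity check on synthetic data where x depends linearly on y:
rng = np.random.default_rng(1)
Y = rng.standard_normal((2000, 10))
W = rng.standard_normal((10, 25))
X = Y @ W + 3.0                      # ground-truth linear relation
A, b = fit_ridge_inverse(X, Y)
y_new = rng.standard_normal(10)
x_hat = A @ y_new + b                # inversion: single matrix multiplication
print(np.allclose(x_hat, y_new @ W + 3.0, atol=1e-2))  # True
```

The conditional mean averages over all images consistent with y, which is also an intuition for why this baseline, like ELDA, produces blurred inversions.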
HOGgles under the hood (Baseline 3 – Direct Optimization). Describe a natural image basis U; any image can be encoded by coefficients ρ in this basis, x = Uρ, and we wish to find ρ* = argmin_ρ ||φ(Uρ) − y||².
HOGgles under the hood (Baseline 3 – Direct). PROs: Recovers high frequencies. CONs: Adds noise to the image.
HOGgles under the hood (the main algorithm – Paired Dictionary). Let x be an image and y be its HOG descriptor. Suppose we write x and y in terms of bases U and V respectively, x ≈ Uα and y ≈ Vα, where α are shared coefficients. Then y can be projected onto the basis V to recover α, and α reconstructs the image through the image basis U.
HOGgles under the hood (Main Algorithm – PairDict)
HOGgles under the hood (Main Algorithm – PairDict). How are the bases U and V found? By solving a paired dictionary learning problem. The objective simplifies to a standard sparse coding and dictionary learning problem, optimized with SPAMS. J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online dictionary learning for sparse coding. In ICML, 2009. SPAMS – SPArse Modeling Software, code available at http://spams-devel.gforge.inria.fr/
HOGgles under the hood (Main Algorithm – PairDict). Dictionary optimization takes a few hours (offline). U and V are estimated with a fixed dictionary size, using training samples from a large database. Example U & V pairs show the correlation between the two dictionaries.
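PairDict inference can be sketched under the shared-coefficients assumption: sparse-code the descriptor y against V, then reconstruct the image with the paired basis U. Here a small ISTA loop stands in for SPAMS, and the dictionaries, sizes, and the `ista` solver are illustrative stand-ins, not the paper's learned ones:

```python
import numpy as np

def ista(V, y, lam=0.1, n_iter=300):
    """Sparse coding: alpha = argmin 0.5*||y - V a||^2 + lam*||a||_1,
    solved with iterative shrinkage-thresholding (ISTA)."""
    L = np.linalg.norm(V, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(V.shape[1])
    for _ in range(n_iter):
        g = V.T @ (V @ a - y)              # gradient of the quadratic term
        z = a - g / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return a

def pairdict_invert(U, V, y, lam=0.1):
    """Invert descriptor y: code against V, reconstruct with paired U."""
    alpha = ista(V, y, lam)
    return U @ alpha

# Toy check: build y from a sparse code of V; applying the same code
# to the paired dictionary U should recover the "image".
rng = np.random.default_rng(2)
U = rng.standard_normal((64, 8))           # image basis
V = rng.standard_normal((36, 8))           # descriptor basis
alpha_true = np.zeros(8)
alpha_true[[1, 5]] = [1.5, -2.0]
y = V @ alpha_true
x_hat = pairdict_invert(U, V, y, lam=0.01)
print(np.linalg.norm(x_hat - U @ alpha_true)
      / np.linalg.norm(U @ alpha_true) < 0.2)  # True
```

Learning U and V jointly (the offline, hours-long step) is what makes a sparse code of y transfer to a plausible image; at test time only the cheap sparse-coding step above remains, which is why inversion is fast.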
HOGgles visual results
Today's talk: Motivation, Related Work, HOGgles under the hood (3 baselines + main algorithm), Limitations, Quantitative + Qualitative evaluation, Human+HOG detector, Paper conclusions, Future development
HOGgles – Limitations. A better visualization than paired dictionaries may exist, although it may not be tractable to construct. With a recursive, iterative solution, some high frequencies are lost.
HOGgles – Limitations (cont.). Inversion is sensitive to the dimensionality of the HOG descriptor.
Today's talk: Motivation, Related Work, HOGgles under the hood (3 baselines + main algorithm), Limitations, Quantitative + Qualitative evaluation, Human+HOG detector, Paper conclusions, Future development
Evaluation of inversions. PASCAL VOC 2011 dataset; inverting patches that correspond to objects. Quantitative evaluation: how well is each pixel of x reconstructed from y by each algorithm? Qualitative evaluation: how well is the high-level content preserved? (Human study using the MTurk platform.)
Evaluation of inversions – quantitative. Mean normalized cross-correlation of the inverse image against the ground truth (higher is better; the maximum is 1).
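The quantitative metric can be sketched as follows, assuming a zero-mean, unit-norm formulation of normalized cross-correlation (the paper's exact normalization may differ); `ncc` and `mean_ncc` are illustrative helpers:

```python
import numpy as np

def ncc(a, b, eps=1e-9):
    """Normalized cross-correlation between two images.
    1.0 means identical up to brightness/contrast; near 0 means unrelated."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def mean_ncc(inverses, originals):
    """Average NCC over a set of (inverse, ground-truth) image pairs."""
    return float(np.mean([ncc(x, g) for x, g in zip(inverses, originals)]))

rng = np.random.default_rng(3)
img = rng.random((32, 32))
print(ncc(img, img))                               # ~1.0: perfect reconstruction
print(ncc(img, 2.0 * img + 0.5))                   # ~1.0: invariant to contrast/brightness
print(abs(ncc(img, rng.random((32, 32)))) < 0.2)   # True: unrelated images score near 0
```

The brightness/contrast invariance is convenient here: an inversion that recovers the structure of the patch but not its absolute intensities still scores well.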
Evaluation of inversions – qualitative. MTurk workers were asked to classify inversions into 1 of 20 categories. MIT PhD students in computer vision served as experts. The same task was also tried with HOG glyphs. (*) Numbers are the percentage classified correctly; chance is 0.05.
Evaluation of inversions – qualitative. Graphs credit: Ezgi Mercan
Evaluation of inversions – qualitative. Glyphs vs. HOGgles.
Today's talk: Motivation, Related Work, HOGgles under the hood (3 baselines + main algorithm), Limitations, Quantitative + Qualitative evaluation, Human+HOG detector, Paper conclusions, Future development
Human+HOG detector. Goal: gain insight into the performance of HOG paired with the perfect learning algorithm (people). A large human experiment consisting of: MTurk workers; dataset – top detections from DPM on PASCAL VOC 2007; 5000 windows per category; 20% are true positives; 25 votes per window.
Human+HOG detector Results
Paper conclusions. DPM operates very close to the performance limit of HOG. HOG may be too lossy a descriptor for high-performance object detection. The features we use, rather than the learning algorithms, are the bottleneck in current object detection systems. To advance recognition to the next level, features that capture finer details and higher-level information need to be built.
Assessment of the paper. Pros: well written; well referenced; novel solution; large and detailed human experiment; website. Cons: details/examples of the baseline algorithms are lacking; the chosen algorithm settings are under-specified; some conclusions from the Human+HOG experiment are questionable.
Future development. Applying the visualization to other descriptors. Color reconstruction from HOG features. Developing new, more discriminative features. Developing algorithms that better model the interaction of the simple atoms of the image.
Links: PASCAL 2 challenge, Deformable Part Model, HOGgles