Computer Vision Group University of California Berkeley On Visual Recognition Jitendra Malik UC Berkeley.

Presentation transcript:

From Pixels to Perception
[Figure: tiger image annotated with region labels (Tiger, Grass, Water, Sand), scene labels (outdoor, wildlife), and part labels (tail, eye, legs, head, back, shadow, mouse)]

Object Category Recognition

Defining Categories
What is a “visual category”?
– Not semantic
– Working hypothesis: two instances of the same category must have “correspondence” (i.e. one can be morphed into the other), e.g. four-legged animals
– Biederman’s estimate of 30,000 basic visual categories

Facts from Biological Vision
– Timing
– Abstraction/Generalization
– Taxonomy and Partonomy

Detection can be very fast
On a task of judging animal vs. no animal, humans can make mostly correct saccades in 150 ms (Kirchner & Thorpe, 2006)
– Comparable to synaptic delay in the retina, LGN, V1, V2, V4, IT pathway
– Doesn’t rule out feedback, but shows that feedforward processing alone is very powerful

As Soon as You Know It Is There, You Know What It Is
(Grill-Spector & Kanwisher, Psychological Science, 2005)

Abstraction/Generalization
– Configurations of oriented contours
– Considerable tolerance for small deformations

Attneave’s Cat (1954)
Line drawings convey most of the information

Taxonomy and Partonomy
Taxonomy: e.g. cats are in the family Felidae, which in turn is in the class Mammalia
– Recognition can be at multiple levels of categorization, or be identification at the level of specific individuals, as in faces.
Partonomy: objects have parts, which have subparts, and so on. The human body contains the head, which in turn contains the eyes.
These notions apply equally well to scenes and to activities.
Psychologists have argued that there is a “basic level” at which categorization is fastest (Eleanor Rosch et al.).
In a partonomy each level contributes useful information for recognition.

Matching with Exemplars
– Use exemplars as templates
– Correspond features between query and exemplar
– Evaluate similarity score
[Figure: Query Image, Database of Templates]
Best matching template is a helicopter
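
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the talk: each image is assumed to be a list of small feature histograms, correspondence is a greedy nearest-feature assignment (not one-to-one), and the chi-square cost, the function names `chi2`, `match_cost`, `best_template`, and the toy database are all hypothetical.

```python
def chi2(h1, h2):
    """Chi-square distance between two non-negative histograms."""
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def match_cost(query_feats, exemplar_feats):
    """Greedy correspondence: every query feature picks its cheapest
    exemplar feature; the score is the mean of those costs."""
    costs = [min(chi2(q, e) for e in exemplar_feats) for q in query_feats]
    return sum(costs) / len(costs)


def best_template(query_feats, database):
    """Return (label, cost) of the exemplar with the lowest matching cost."""
    scored = ((label, match_cost(query_feats, feats))
              for label, feats in database.items())
    return min(scored, key=lambda pair: pair[1])


# Toy database of templates (made-up feature histograms).
database = {
    "helicopter": [[4, 0, 1], [0, 3, 2]],
    "cup": [[1, 1, 4], [2, 2, 2]],
}
query = [[4, 0, 2], [0, 3, 1]]
label, cost = best_template(query, database)
print(label, round(cost, 4))
```

A real system would replace the greedy assignment with a proper correspondence step and the toy histograms with local shape or appearance descriptors.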

3D objects using multiple 2D views
View selection algorithm from Belongie, Malik & Puzicha (2001)

Error vs. Number of Views

Three Big Ideas
– Correspondence based on local shape/appearance descriptors
– Deformable Template Matching
– Machine learning for finding discriminative features

Comparing Pointsets

Shape Context
– Count the number of points inside each bin, e.g. Count = 4
– Compact representation of the distribution of points relative to each point
(Belongie, Malik & Puzicha, 2001)
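
The bin counting described above can be sketched as a log-polar histogram. This is a minimal illustration, not the authors' code: the function name `shape_context`, the bin counts (5 radial, 12 angular), the radial range, and the normalisation by mean pairwise distance are assumptions chosen for the example.

```python
import math


def shape_context(points, index, n_r=5, n_theta=12, r_min=0.125, r_max=2.0):
    """Log-polar histogram (n_r x n_theta counts) of all other points
    as seen from points[index]."""
    n = len(points)
    # Normalise radii by the mean pairwise distance (scale invariance).
    pair_dists = [math.hypot(points[a][0] - points[b][0],
                             points[a][1] - points[b][1])
                  for a in range(n) for b in range(a + 1, n)]
    mean_dist = sum(pair_dists) / len(pair_dists)
    # Outer edges of the radial bins, spaced logarithmically.
    edges = [r_min * (r_max / r_min) ** (k / (n_r - 1)) for k in range(n_r)]
    hist = [[0] * n_theta for _ in range(n_r)]
    px, py = points[index]
    for k, (x, y) in enumerate(points):
        if k == index:
            continue
        dx, dy = x - px, y - py
        r = math.hypot(dx, dy) / mean_dist
        r_bin = next((b for b, edge in enumerate(edges) if r <= edge), None)
        if r_bin is None:  # farther than r_max: not counted
            continue
        angle = math.atan2(dy, dx) % (2 * math.pi)
        t_bin = int(angle / (2 * math.pi / n_theta)) % n_theta
        hist[r_bin][t_bin] += 1
    return hist


pts = [(0.0, 0.0), (1.0, 0.2), (0.3, 1.1), (1.4, 1.3), (-0.5, 0.7)]
h = shape_context(pts, 0)
for row in h:
    print(row)
```

Because offsets are taken relative to the reference point and radii are normalised, the histogram in this sketch is unchanged when the point set is translated or uniformly scaled.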

Shape Context

Geometric Blur (Local Appearance Descriptor)
Geometric Blur Descriptor:
– Compute sparse channels from image
– Extract a patch in each channel
– Apply spatially varying blur and sub-sample
(Idealized signal)
Descriptor is robust to small affine distortions
Berg & Malik '01
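
The idea of a spatially varying blur can be illustrated on a one-dimensional sparse signal. This is a simplified sketch, not the published descriptor: the function name `geometric_blur_1d`, the Gaussian kernel, and the linear rule `sigma = alpha * |offset| + beta` are assumptions made for the example. The blur grows with distance from the descriptor centre, so a feature far from the centre can shift a little without changing the sampled value much.

```python
import math


def geometric_blur_1d(signal, center, offsets, alpha=0.5, beta=0.5):
    """Sample `signal` at center + offset, blurring with a Gaussian
    whose width grows linearly with |offset|."""
    descriptor = []
    for off in offsets:
        sigma = alpha * abs(off) + beta
        pos = center + off
        weights = [math.exp(-0.5 * ((u - pos) / sigma) ** 2)
                   for u in range(len(signal))]
        norm = sum(weights)
        descriptor.append(sum(w * s for w, s in zip(weights, signal)) / norm)
    return descriptor


def spike(position, length=41):
    """Sparse channel with a single unit response at `position`."""
    return [1.0 if u == position else 0.0 for u in range(length)]


center = 20
# Move a feature by one sample near the centre and far from it.
near_change = abs(geometric_blur_1d(spike(20), center, [0])[0]
                  - geometric_blur_1d(spike(21), center, [0])[0])
far_change = abs(geometric_blur_1d(spike(26), center, [6])[0]
                 - geometric_blur_1d(spike(27), center, [6])[0])
print(near_change, far_change)
```

In this toy setting the same one-sample shift changes the far sample much less than the near one, which is the kind of tolerance to small distortions the slide refers to.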

Modeling shape variation in a category
D’Arcy Thompson: On Growth and Form, 1917
– studied transformations between shapes of organisms

Matching Example
[Figure: model, target]

Handwritten Digit Recognition
MNIST (error rates):
– linear: 12.0%
– 40 PCA + quad: 3.3%
– 1000 RBF + linear: 3.6%
– K-NN: 5%
– K-NN (deskewed): 2.4%
– K-NN (tangent dist.): 1.1%
– SVM: 1.1%
– LeNet 5: 0.95%
MNIST (distortions):
– LeNet 5: 0.8%
– SVM: 0.8%
– Boosted LeNet 4: 0.7%
MNIST:
– K-NN, Shape Context matching: 0.63%

EZ-Gimpy Results
171 of 192 images correctly identified: 92%
Example words: horse, smile, canvas, spade, join, here

Discriminative learning (Frome, Singer, Malik, 2006)
– weights on patch features in training images
– distance functions from training images to any other images
– browsing, retrieval, classification
[Figure labels: 83/400, 79/400]

Triplets
– learn from relative similarity
– want: image-to-image distances based on feature-to-image distances
– compare image-to-image distances
[Figure labels: image i, image j, image k]

Focal image version
[Figure labels: image i (focal), image j, image k]
$x_{ijk} = d_{ik} - d_{ij}$

Large-margin formulation
– slack variables like soft-margin SVM
– w constrained to be positive
– L2 regularization
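
A rough sketch of how such a formulation could be optimised is given below. It is not the solver used by Frome, Singer and Malik: it assumes that each triplet supplies two feature-to-image distance vectors, `d_ij` (focal image to a similar image) and `d_ik` (focal image to a dissimilar image), forms their difference `d_ik - d_ij`, and minimises an L2-regularised hinge loss by projected subgradient descent, clipping the weights at zero. The function name `learn_weights`, the hyperparameters, and the toy data are all hypothetical.

```python
def learn_weights(triplets, n_features, lam=0.1, lr=0.05, epochs=200):
    """Minimise 0.5*lam*||w||^2 + sum(max(0, 1 - w.x)) subject to w >= 0,
    where x = d_ik - d_ij for each (d_ij, d_ik) triplet."""
    diffs = [[b - a for a, b in zip(d_ij, d_ik)] for d_ij, d_ik in triplets]
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = [lam * wi for wi in w]            # L2 regularisation term
        for x in diffs:
            margin = sum(wi * xi for wi, xi in zip(w, x))
            if margin < 1.0:                     # hinge active (slack > 0)
                for m in range(n_features):
                    grad[m] -= x[m]
        # Gradient step followed by projection onto w >= 0.
        w = [max(0.0, wi - lr * g) for wi, g in zip(w, grad)]
    return w


# Toy triplets: feature 0 separates similar from dissimilar images,
# feature 1 is misleading.
triplets = [
    ([0.1, 0.9], [0.9, 0.2]),
    ([0.2, 0.5], [1.0, 0.4]),
    ([0.1, 0.8], [0.8, 0.1]),
]
w = learn_weights(triplets, n_features=2)
print(w)
```

On this toy data the misleading feature ends up with zero weight, which shows how the non-negativity constraint together with the hinge loss selects discriminative features.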

Caltech-101 [Fei-Fei et al. 04]
102 classes, images/class

Retrieval example
[Figure: query image, retrieval results]

Caltech 101 classification results
(see Manik Varma’s talks for the best yet)

15 training/class, 63.2%

Conclusion
– Correspondence based on local shape/appearance descriptors
– Deformable Template Matching
– Machine learning for finding discriminative features
– Integrating Perceptual Organization and Recognition