Object Recognition
Object Classes
Individual Recognition
Object parts: full interpretation
[Figure labels: headlight, window, door knob, back wheel, mirror, front wheel, bumper]
Action recognition (except 2)
Class / Non-class
Is this an airplane?
Features and Classifiers
- Same features with different classifiers
- Same classifier with different features
Generic Features
- Simple (wavelets)
- Complex (geons)
Marr-Nishihara
Mental Rotation
3-D Parts
- Implementations: poor results
- View-specific recognition
- fMRI studies
- Instead: using image patches
Class-specific Features: Common Building Blocks
Optimal Class Components?
- Large features are too rare
- Small features are found everywhere
- Find features that carry the highest amount of information
Mutual information
I(C;F) = H(C) - H(C|F)
[Figure: H(C), compared with H(C) when F=1 and H(C) when F=0]
Mutual Information I(C,F) Class: Feature: I(F,C) = H(C) – H(C|F)
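The formula above can be evaluated directly from a joint probability table of class and feature. The following is a minimal sketch, assuming a binary class C and a binary feature F; the function names and the numbers in `joint` are hypothetical and chosen only for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(C;F) = H(C) - H(C|F), where joint[c][f] = P(C=c, F=f)."""
    p_c = [sum(row) for row in joint]        # marginal over the class
    p_f = [sum(col) for col in zip(*joint)]  # marginal over the feature
    h_c_given_f = 0.0
    for f, pf in enumerate(p_f):
        if pf > 0:
            conditional = [joint[c][f] / pf for c in range(len(joint))]
            h_c_given_f += pf * entropy(conditional)
    return entropy(p_c) - h_c_given_f

# Hypothetical joint table (illustrative numbers, not from the slides).
# Rows: C=0 (non-class), C=1 (class). Columns: F=0, F=1.
joint = [[0.45, 0.05],
         [0.10, 0.40]]
mi = mutual_information(joint)
print(f"I(C;F) = {mi:.3f} bits")
```

A feature that is independent of the class gives zero bits, and a feature that determines a balanced binary class gives one bit, so candidate fragments can be ranked by this value.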
Horse-class features, car-class features
- Pictorial features
- Learned from examples
Star model
- Detected fragments ‘vote’ for the center location
- Find the location with the maximal vote
- In variations, a popular state-of-the-art scheme
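The voting step of the star model can be sketched as follows. This is an illustrative sketch only: it assumes each detected fragment comes with a learned offset to the object center, and it accumulates votes in coarse bins. The bin size, the data layout, and the function name are assumptions, not details from the slides.

```python
from collections import Counter

def vote_for_center(detections, bin_size=8):
    """Each detection is ((x, y), (dx, dy)): the fragment location and its
    assumed learned offset to the object center. Returns the center of the
    winning bin and its number of votes."""
    votes = Counter()
    for (x, y), (dx, dy) in detections:
        cx, cy = x + dx, y + dy                      # predicted center
        votes[(cx // bin_size, cy // bin_size)] += 1
    (bx, by), count = votes.most_common(1)[0]        # location with maximal vote
    center = (bx * bin_size + bin_size // 2, by * bin_size + bin_size // 2)
    return center, count

# Hypothetical detections: three fragments agree, one is an outlier.
detections = [((10, 20), (30, 12)),
              ((60, 40), (-18, -6)),
              ((35, 5), (6, 28)),
              ((100, 100), (5, 5))]
center, count = vote_for_center(detections)
print(center, count)
```

Binning makes the vote tolerant to small disagreements between fragments; the outlier lands in a different bin and does not affect the winner.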
Recognition Features in the Brain
fMRI: Functional Magnetic Resonance Imaging
Images of brain activity
V1: early processing; LO: object recognition
Class-fragments and Activation (Malach et al., 2008)
Bag of words
Bag of visual words
- A large collection of image patches
- Each class has its own histogram of words
- Limited or no geometry
- Simple and popular
- Visual words are used, but not as a full recognition model
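A word histogram of this kind can be sketched in a few lines. The example assumes a visual vocabulary is already available (in practice it is often obtained by clustering patch descriptors, which is not shown here) and that each patch is assigned to its nearest word. The array values and the function name are hypothetical.

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Assign each patch descriptor to its nearest visual word and return
    the normalized word histogram. All geometry is discarded."""
    # Squared Euclidean distances, shape (n_patches, n_words).
    diff = descriptors[:, None, :] - vocabulary[None, :, :]
    nearest = (diff ** 2).sum(axis=2).argmin(axis=1)
    hist = np.bincount(nearest, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

# Hypothetical 2-D descriptors and a 3-word vocabulary.
vocabulary = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
descriptors = np.array([[0.1, 0.0], [0.9, 0.1], [1.1, -0.1], [0.0, 0.8]])
hist = bow_histogram(descriptors, vocabulary)
print(hist)
```

Two images can then be compared, or a classifier trained, on these histograms alone, which is why the representation carries limited or no geometry.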
HoG Descriptor
Dalal, N. & Triggs, B. Histograms of Oriented Gradients for Human Detection
- SIFT: Scale-Invariant Feature Transform
- MSER: Maximally Stable Extremal Regions
- SURF: Speeded-Up Robust Features
- Cross correlation
- ...
HoG and SIFT are the most widely used.
DPM (Felzenszwalb)
Felzenszwalb, McAllester, Ramanan. A Discriminatively Trained, Multiscale, Deformable Part Model. CVPR.
Many implementation details; we will describe the main points.
HoG descriptor
Using patches with HoG descriptors and classification by SVM
Person model: HoG
Object model using HoG
- A bicycle and its ‘root filter’
- The root filter is a patch of HoG descriptors
- The image is partitioned into 8x8 pixel cells
- In each cell we compute a histogram of gradient orientations
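The per-cell histogram step can be sketched as below. This is a simplified illustration, not the full descriptor: it assumes 9 unsigned orientation bins over 0 to 180 degrees with magnitude-weighted votes, and it omits the block normalization and bin interpolation used in practice. The function name and parameters are hypothetical.

```python
import numpy as np

def cell_orientation_histograms(image, cell=8, bins=9):
    """Histogram of gradient orientations for each cell x cell pixel region.
    Returns an array of shape (rows, cols, bins)."""
    image = np.asarray(image, dtype=float)
    gy, gx = np.gradient(image)                      # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    bin_index = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)
    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            window = (slice(r * cell, (r + 1) * cell),
                      slice(c * cell, (c + 1) * cell))
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[r, c] = np.bincount(bin_index[window].ravel(),
                                     weights=magnitude[window].ravel(),
                                     minlength=bins)
    return hist

# Hypothetical test image: a horizontal intensity ramp, 16x16 pixels.
ramp = np.tile(np.arange(16, dtype=float), (16, 1))
hog_cells = cell_orientation_histograms(ramp)
print(hog_cells.shape)
```

On the ramp, every gradient points along the x axis, so all the weight in each cell falls into the first orientation bin.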
Dealing with scale: multi-scale analysis
- The filter is searched on a pyramid of HoG descriptors, to deal with unknown scale
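A pyramid for multi-scale search can be sketched as follows. This sketch builds only a plain image pyramid by repeated 2x2 averaging; it assumes a scale step of 2 per level for simplicity (finer steps are common in practice), and HoG descriptors would then be computed on each level. The function names and `min_size` are hypothetical.

```python
import numpy as np

def downsample2(img):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    img = img[:h, :w]                                # drop an odd last row/column
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def build_pyramid(img, min_size):
    """Levels from full resolution down to the smallest level whose
    shorter side is still at least min_size pixels."""
    levels = [np.asarray(img, dtype=float)]
    while min(levels[-1].shape) // 2 >= min_size:
        levels.append(downsample2(levels[-1]))
    return levels

# Hypothetical 16x16 image; a fixed-size filter is then slid over every level,
# so a match at level k corresponds to an object 2**k times larger.
levels = build_pyramid(np.zeros((16, 16)), min_size=4)
print([lvl.shape for lvl in levels])
```

Searching the same fixed-size filter on every level is what lets one filter detect the object at several sizes.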
Adding Parts
A part is $P_i = (F_i, v_i, s_i, a_i, b_i)$, where:
- $F_i$ is the filter for the i-th part
- $v_i$ is the center of a box of possible positions for part i, relative to the root position
- $s_i$ is the size of this box
- $a_i$ and $b_i$ are two-dimensional vectors specifying the coefficients of a quadratic function that scores each possible placement of the i-th part
That is, $a_i = (a_1, a_2)$ and $b_i = (b_1, b_2)$ are two numbers each, and the penalty for a deviation $(\Delta x, \Delta y)$ from the expected location is
$a_1 \Delta x + a_2 \Delta y + b_1 \Delta x^2 + b_2 \Delta y^2$
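The placement of a single part can be sketched as a search over the box of allowed positions, trading filter response against the quadratic deformation penalty. This is a brute-force illustration under assumptions: the box is taken to be a square of side s centered at v, and `response` is a hypothetical callable returning the part filter score at a position.

```python
def deformation_penalty(dx, dy, a, b):
    """Quadratic penalty a1*dx + a2*dy + b1*dx^2 + b2*dy^2."""
    return a[0] * dx + a[1] * dy + b[0] * dx ** 2 + b[1] * dy ** 2

def best_part_placement(response, v, s, a, b):
    """Search the box (assumed square, side s, centered at v) for the
    position maximizing filter response minus deformation penalty."""
    half = s // 2
    best = None
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            pos = (v[0] + dx, v[1] + dy)
            score = response(pos) - deformation_penalty(dx, dy, a, b)
            if best is None or score > best[0]:
                best = (score, pos)
    return best

# Hypothetical filter response: strong at (12, 10), zero elsewhere.
response = lambda pos: 5.0 if pos == (12, 10) else 0.0
best = best_part_placement(response, v=(10, 10), s=5, a=(0.0, 0.0), b=(1.0, 1.0))
print(best)
```

With these numbers the part moves two pixels from its expected location because the response gain (5) exceeds the penalty (4); doubling the b coefficients would keep it at the expected location instead.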
Bicycle model: root, parts, spatial map
Person model