Object Recognition Vision Class
Object Classes
Individual Recognition
Brief History: Recognition
Mental Rotation
Three-point alignment: Huttenlocher, D. & Ullman, S. Recognizing solid objects by alignment with an image. Int. J. Computer Vision 5(3), 195-212, 1990.
Object Alignment: Given three model points $P_1, P_2, P_3$ and three image points $p_1, p_2, p_3$, there is a unique transformation (rotation $R$, translation $d$, scale $S$) that aligns the model with the image: $(SR + d)P_i = p_i$, $i = 1, 2, 3$.
Alignment -- comments The projection is orthographic projection (combined with scaling). The 3 points are required to be non-collinear. The transformation is determined up to a reflection of the points about the image plane and translation in depth.
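To make the role of the three non-collinear points concrete, here is a minimal sketch of the planar part of the problem only: three 2D point correspondences determine a unique 2D affine map, and the solution breaks down exactly when the model points are collinear. This is a simplified illustration, not the full method of Huttenlocher and Ullman, which goes on to recover the 3D rotation and scale. The function names `affine_from_three_points` and `apply_affine` are hypothetical.

```python
def affine_from_three_points(model, image):
    """Return (a, b, c, d, tx, ty) with image = [[a, b], [c, d]] @ model + [tx, ty].

    model and image are lists of three (x, y) pairs in corresponding order.
    """
    (x1, y1), (x2, y2), (x3, y3) = model
    # Twice the signed area of the model triangle; zero means collinear points.
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if abs(det) < 1e-12:
        raise ValueError("model points are collinear; alignment is not unique")

    def solve(u1, u2, u3):
        # Find (p, q, r) with u = p*x + q*y + r at all three model points
        # (Cramer's rule on the differences relative to the first point).
        p = ((u2 - u1) * (y3 - y1) - (u3 - u1) * (y2 - y1)) / det
        q = ((x2 - x1) * (u3 - u1) - (x3 - x1) * (u2 - u1)) / det
        r = u1 - p * x1 - q * y1
        return p, q, r

    a, b, tx = solve(*(pt[0] for pt in image))
    c, d, ty = solve(*(pt[1] for pt in image))
    return a, b, c, d, tx, ty


def apply_affine(params, point):
    """Map a model point into the image with the recovered transformation."""
    a, b, c, d, tx, ty = params
    x, y = point
    return a * x + b * y + tx, c * x + d * y + ty


# Example: image points produced by scale 2, rotation by 90 degrees, shift (3, 4).
model_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
image_pts = [(3.0, 4.0), (3.0, 6.0), (1.0, 4.0)]
params = affine_from_three_points(model_pts, image_pts)
```

Once the transformation is recovered from three points, any further model point can be mapped with `apply_affine` and compared against the image, which is the verification step that alignment relies on.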
Car Recognition
Car Models
Alignment: Cars
Alignment: Mismatch
Brief History: Classification
RBC
Structural Description [figure: components G1, G2, G3, G4 linked by the relations Above, Right-of, Left-of, Touch]
Classification: Current Approaches
Visual Class: Similar Arrangement of Shared Components
Optimal Class Components? Large features are too rare. Small features are found everywhere. Find features that carry the highest amount of information.
Entropy: $H = -\sum_i p(x_i) \log_2 p(x_i)$. Example: $x \in \{0, 1\}$ with $p = (0.5, 0.5)$; $H = ?$
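The entropy formula can be checked numerically with a short sketch. The helper name `entropy` is an assumption for illustration; outcomes with zero probability are skipped, following the usual convention that 0 log 0 = 0. For the fair binary variable on this slide the function returns exactly 1 bit.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution given as a list."""
    # Terms with p == 0 contribute nothing (0 * log 0 is taken as 0).
    return sum(-p * math.log2(p) for p in probs if p > 0)


h_fair = entropy([0.5, 0.5])      # the slide's example
h_certain = entropy([1.0, 0.0])   # no uncertainty at all
h_skewed = entropy([0.8, 0.2])    # less uncertain than the fair case
```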
Mutual information: $I(C;F) = H(C) - H(C|F)$ [figure: $H(C)$ overall, compared with $H(C)$ when $F=1$ and $H(C)$ when $F=0$]
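Mutual Information I: For X alone, $p(x) = (0.5, 0.5)$ and $H(X) = 1.0$. For X given Y: when $Y = 0$, $p(x) = (0.8, 0.2)$ and $H = 0.72$; when $Y = 1$, $p(x) = (0.1, 0.9)$ and $H = 0.47$. $H(X|Y) = 0.5 \cdot 0.72 + 0.5 \cdot 0.47 = 0.595$. $I(X;Y) = H(X) - H(X|Y) = 1 - 0.595 = 0.405$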
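The arithmetic of this example can be reproduced with a minimal sketch. It assumes Y = 0 and Y = 1 are equally likely, as the 0.5 weights on the slide suggest; the variable names are illustrative. One caveat worth noting: with equal weights, the marginal implied by these two conditionals is (0.45, 0.55) rather than exactly (0.5, 0.5), so the slide's value of 0.405, which takes H(X) = 1, is a close approximation of the exact figure computed as `mi_exact`.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes are skipped."""
    return sum(-p * math.log2(p) for p in probs if p > 0)


p_y = [0.5, 0.5]                          # assumed: both values of Y equally likely
p_x_given_y = [[0.8, 0.2], [0.1, 0.9]]    # rows: p(x | Y=0), p(x | Y=1)

# H(X|Y) is the average of the per-row entropies, weighted by p(y).
h_x_given_y = sum(w * entropy(row) for w, row in zip(p_y, p_x_given_y))

# Slide's calculation: H(X) is taken to be 1.0.
mi_slide = 1.0 - h_x_given_y

# Exact calculation: H(X) from the marginal implied by p_y and p_x_given_y.
p_x = [sum(w * row[i] for w, row in zip(p_y, p_x_given_y)) for i in range(2)]
mi_exact = entropy(p_x) - h_x_given_y
```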
Mutual Information II
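Computing MI from Examples: Mutual information can be measured from examples. With 100 faces and 100 non-faces, the feature is detected 44 times in the faces and 6 times in the non-faces. Mutual information: $H(C) = 1$, $H(C|F) \approx 0.85$, so $I(C;F) = H(C) - H(C|F) \approx 0.15$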
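Measuring mutual information from detection counts can be sketched as follows. The function and parameter names are hypothetical; the only inputs are how many class and non-class images there are and in how many of each the feature was detected. For the slide's counts (44 of 100 faces, 6 of 100 non-faces) it gives H(C) = 1 bit and H(C|F) of roughly 0.85 bits.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a two-outcome variable with probabilities p and 1 - p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def mi_from_counts(hits_class, n_class, hits_nonclass, n_nonclass):
    """Return (H(C), H(C|F), I(C;F)) estimated from detection counts."""
    n = n_class + n_nonclass
    h_c = binary_entropy(n_class / n)

    n_f1 = hits_class + hits_nonclass    # images where the feature fired
    n_f0 = n - n_f1                      # images where it did not
    h_c_given_f = 0.0
    if n_f1:
        h_c_given_f += n_f1 / n * binary_entropy(hits_class / n_f1)
    if n_f0:
        h_c_given_f += n_f0 / n * binary_entropy((n_class - hits_class) / n_f0)
    return h_c, h_c_given_f, h_c - h_c_given_f


h_c, h_c_given_f, mi = mi_from_counts(44, 100, 6, 100)
```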
Fragment Selection: For a set of training images: generate candidate fragments; measure $p(F|C)$ and $p(F|NC)$; compute mutual information; select the optimal fragment. After k fragments: select the next fragment by maximizing the minimal addition in mutual information with respect to each of the first k fragments.
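The selection loop described above can be sketched as a greedy max-min search. This is one possible reading of the slide, under stated assumptions: each candidate fragment is reduced to a 0/1 detection per training image, and the "addition in mutual information" of a candidate F with respect to a selected fragment S is taken to be I(C; F, S) - I(C; S). The names `mutual_information` and `select_fragments` and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """I(X;Y) in bits, estimated from paired samples of hashable values."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def select_fragments(features, labels, k):
    """features maps a fragment name to its 0/1 detections, one per image."""
    remaining = list(features)
    # First fragment: highest mutual information with the class on its own.
    first = max(remaining, key=lambda f: mutual_information(features[f], labels))
    selected = [first]
    remaining.remove(first)

    def min_gain(f):
        # Smallest information added by f over any already selected fragment.
        return min(
            mutual_information(list(zip(features[f], features[s])), labels)
            - mutual_information(features[s], labels)
            for s in selected
        )

    while remaining and len(selected) < k:
        best = max(remaining, key=min_gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: 4 class images followed by 4 non-class images.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "A": [1, 1, 1, 0, 0, 0, 0, 0],   # informative
    "B": [1, 1, 1, 0, 0, 0, 0, 0],   # duplicate of A, adds nothing new
    "C": [0, 0, 0, 1, 0, 0, 0, 0],   # weak alone, but covers what A misses
}
chosen = select_fragments(features, labels, 2)
```

In the toy data, the duplicate fragment has high mutual information on its own but adds nothing once A is chosen, so the max-min criterion prefers the complementary fragment C.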
Highly Informative Face Fragments
Horse-class features Car-class features
Fragment 'Weight': Likelihood ratio: $p(F|C) / p(F|NC)$. Weight of F: $w_F = \log \frac{p(F|C)}{p(F|NC)}$. Decision: $\sum_i w_i F_i > \theta$
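The decision rule can be illustrated with a small sketch. It assumes each weight is the logarithm of the fragment's likelihood ratio p(F|C) / p(F|NC), a common naive-Bayes choice that the slide text does not spell out, and it uses base-2 logarithms for consistency with the entropy slides. The function names, probabilities, and threshold are hypothetical.

```python
import math


def fragment_weight(p_f_given_c, p_f_given_nc):
    """Log likelihood ratio of a fragment (assumed form of the weight)."""
    return math.log2(p_f_given_c / p_f_given_nc)


def classify(detections, weights, theta):
    """Return True when the summed weights of detected fragments exceed theta.

    detections holds F_i = 1 if fragment i was found in the image, else 0.
    """
    return sum(w * f for w, f in zip(weights, detections)) > theta


# Two illustrative fragments: one strongly class-specific, one weakly so.
weights = [fragment_weight(0.44, 0.06), fragment_weight(0.30, 0.20)]
is_class_both = classify([1, 1], weights, theta=2.0)
is_class_weak_only = classify([0, 1], weights, theta=2.0)
```

A fragment that is equally likely in class and non-class images gets weight zero and so has no influence on the decision.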
Combining fragments: detectors $D_1, D_2, \ldots, D_k$ with weights $w_1, w_2, \ldots, w_k$. Feature detection: within a region, $S(F, I) >$ threshold
Fragment-based Classification: Leibe, Schiele 2003; Fergus, Perona, Zisserman 2003; Agarwal, Roth 2002
Recognition: ROC Curves
Training & Test Images: Frontal faces without distinctive features (Korean: 496, Western: 385). Background minimized by cropping. Training images for extraction: 32 for each class. Training images for evaluation: 100 for each class. Test images: 253 for Western and 364 for Korean.
Training – Fragment Extraction
Extracted Fragments [table: Western fragments and Korean fragments, each listed with columns Fragment, Score, Weight]
Classifying novel images: Detect fragments (Korean fragments $k_F$, Western fragments $w_F$), compare summed weights, decision: Westerner, Korean, or Unknown
Effect of Number of Fragments: 7 fragments: 95%; 80 fragments: 100%. Inherent redundancy of the features. Slight violation of the independence assumption.
Harris Corner Detection: $\sum \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}$
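Harris Corner Operator: $H = \begin{pmatrix} \langle I_x^2 \rangle & \langle I_x I_y \rangle \\ \langle I_x I_y \rangle & \langle I_y^2 \rangle \end{pmatrix}$, where $\langle \cdot \rangle$ denotes averages within a neighborhood. Corner: the two eigenvalues $\lambda_1, \lambda_2$ are both large. Indirectly: 'Corner' $= \det(H) - k \, \mathrm{trace}^2(H)$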
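The corner measure can be sketched in plain Python on a small synthetic image. Several choices here are assumptions rather than part of the slides: gradients are central differences, the matrix entries are sums rather than averages over the window (this only rescales H), and k = 0.04 is a commonly used value that the slides do not specify. The function name `harris_response` is hypothetical.

```python
def harris_response(img, x, y, radius=1, k=0.04):
    """Harris response det(H) - k * trace(H)**2 at pixel (x, y).

    img is a list of rows of floats. H is accumulated over a square window
    of side 2 * radius + 1 centered on (x, y).
    """
    height, width = len(img), len(img[0])
    margin = radius + 1  # central differences need one extra pixel
    if not (margin <= x < width - margin and margin <= y < height - margin):
        raise ValueError("window too close to the image border")

    sxx = sxy = syy = 0.0
    for v in range(y - radius, y + radius + 1):
        for u in range(x - radius, x + radius + 1):
            ix = (img[v][u + 1] - img[v][u - 1]) / 2.0   # horizontal gradient
            iy = (img[v + 1][u] - img[v - 1][u]) / 2.0   # vertical gradient
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy

    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace


# Synthetic 9x9 image: a bright square occupying x >= 4 and y >= 4.
image = [[1.0 if (c >= 4 and r >= 4) else 0.0 for c in range(9)]
         for r in range(9)]
corner = harris_response(image, 4, 4)   # at the corner of the square
edge = harris_response(image, 6, 4)     # on the top edge of the square
flat = harris_response(image, 6, 6)     # inside the square
```

The response is positive at the corner, where both eigenvalues are large, negative on the edge, where only one is, and zero in the flat region.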
Harris Corner Examples
SIFT descriptor: David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004). Example: 4×4 sub-regions, with a histogram of 8 orientations in each, giving V = 128 values: $g_{1,1}, \ldots, g_{1,8}, \ldots, g_{16,1}, \ldots, g_{16,8}$
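The 4×4×8 layout of the descriptor can be illustrated with a deliberately simplified sketch. It keeps only the histogram structure: it omits the Gaussian weighting, interpolation between bins, and orientation normalization of the full SIFT method, and it assumes an 18×18 input patch so that central differences are defined on the inner 16×16 pixels. The function name `sift_like_descriptor` is hypothetical.

```python
import math


def sift_like_descriptor(patch):
    """Return 128 values: 16 sub-regions times 8 orientation bins.

    patch is an 18x18 list of rows; gradients are taken on the inner 16x16.
    """
    hist = [0.0] * 128
    for y in range(1, 17):
        for x in range(1, 17):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)
            orientation = int(angle / (2 * math.pi) * 8) % 8   # 8 bins of 45 degrees
            cell = ((y - 1) // 4) * 4 + (x - 1) // 4          # which 4x4-pixel sub-region
            hist[cell * 8 + orientation] += magnitude
    # Normalize to unit length so the descriptor is less sensitive to contrast.
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


# A horizontal intensity ramp: every gradient points along +x (orientation bin 0).
ramp = [[float(x) for x in range(18)] for _ in range(18)]
descriptor = sift_like_descriptor(ramp)
```

For the ramp, all gradient energy falls into orientation bin 0 of each of the 16 sub-regions, which makes the indexing of the 128 values easy to inspect.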
Constellation of Patches: Using interest points. Fergus, Perona, Zisserman 2003
A CAPTCHA™ is a program that can generate and grade tests that most humans can pass, but current computer programs can't pass. (© 2004 Carnegie Mellon University, all rights reserved.)
Classification: Class Examples