1
Object Recognition
2
So what does object recognition involve?
3
Verification: is that a bus?
4
Detection: are there cars?
5
Identification: is that a picture of Mao?
6
Object categorization sky building flag wall banner bus cars bus face street lamp
7
Challenges 1: view point variation Michelangelo 1475-1564
8
Challenges 2: illumination slide credit: S. Ullman
9
Challenges 3: occlusion Magritte, 1957
10
Challenges 4: scale
11
Challenges 5: deformation Xu, Beihong 1943
12
Challenges 7: intra-class variation
13
Two main approaches Part-based Global sub-window
14
Global Approaches: aligned images are treated as vectors (x1, x2, x3, ...) in a high-dimensional space.
15
Global Approaches: Training. Images are vectors (x1, x2, x3, ...) in a high-dimensional space; training involves some dimensionality reduction and produces a detector.
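The slide does not say which dimensionality reduction is used. As a minimal sketch, assuming PCA (the eigenface-style choice of Turk and Pentland), flattened aligned images can be projected onto a few principal directions. The names `fit_pca` and `project`, and the random stand-in data, are hypothetical.

```python
import numpy as np

def fit_pca(images, k):
    """images: (n_samples, n_pixels) array of aligned, flattened images."""
    mean = images.mean(axis=0)
    centered = images - mean
    # Rows of vt are orthonormal principal directions, sorted by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def project(images, mean, basis):
    """Coordinates of each image in the k-dimensional subspace."""
    return (images - mean) @ basis.T

rng = np.random.default_rng(0)
train = rng.normal(size=(20, 64))   # assumption: 20 aligned 8x8 images (random stand-ins)
mean, basis = fit_pca(train, k=5)
coeffs = project(train, mean, basis)
```

A detector would then be trained on `coeffs` rather than on raw pixels.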
16
Detection –Scale / position range to search over
17
Detection –Scale / position range to search over
18
Detection –Scale / position range to search over
19
Detection –Combine detection over space and scale.
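The search over position and scale can be sketched as a brute-force sliding window. This is only an illustration under simple assumptions: windows are square, the scale range is a list of window sizes, and detections are combined by keeping the single best score. `toy_score` is a hypothetical stand-in for a trained detector.

```python
import numpy as np

def sliding_window_detect(image, score_fn, window_sizes, stride=1):
    """Return (score, row, col, size) of the best-scoring square window."""
    best = None
    height, width = image.shape
    for size in window_sizes:                           # scale range
        for r in range(0, height - size + 1, stride):   # position range
            for c in range(0, width - size + 1, stride):
                score = score_fn(image[r:r + size, c:c + size])
                if best is None or score > best[0]:
                    best = (score, r, c, size)
    return best

def toy_score(window):
    # Hypothetical detector: mean brightness of the window.
    return float(window.mean())

image = np.zeros((12, 12))
image[3:7, 5:9] = 1.0                                   # a bright 4x4 "object"
best = sliding_window_detect(image, toy_score, window_sizes=(4, 6))
```

A practical detector would usually keep several high-scoring windows and merge overlapping ones instead of taking one maximum.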
20
PROJECT 1
21
Turk and Pentland, 1991; Belhumeur et al. 1997; Schneiderman et al. 2004; Viola and Jones, 2000; Keren et al. 2001; Osadchy et al. 2004; Amit and Geman, 1999; LeCun et al. 1998; Belongie and Malik, 2002; Schneiderman et al. 2004; Agarwal and Roth, 2002; Poggio et al. 1993
22
Object Detection Problem: Locate instances of an object category in a given image. Asymmetric classification problem!
Background | Object (Category)
Very large | Relatively small
Complex (thousands of categories) | Simple (single category)
Large prior to appear in an image | Small prior
Easy to collect (not easy to learn from examples) | Hard to collect
23
Intuition: Denote by H the acceptance region of a classifier. We have a prior on the distribution of all natural images. We propose to minimize Pr(all images) (approximately Pr(background)) inside H, while H still contains the object samples. [Figure: two candidate regions H around the object class inside the set of all images; the black H is better.]
24
Distribution of Natural Images – Boltzmann distribution. Image smoothness measure; less smooth images have lower probability. In frequency domain:
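The formulas on this slide were lost in extraction. As a hedged illustration only, the sketch below assumes a roughness energy that weights each DCT coefficient by its squared frequency, and an unnormalized Boltzmann log-probability equal to minus that energy. The exact form and all function names are assumptions, not the slide's definition.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def roughness(image):
    """Assumed energy: sum over frequencies (k, l) of (k^2 + l^2) * coeff^2."""
    n_rows, n_cols = image.shape
    coeffs = dct_matrix(n_rows) @ image @ dct_matrix(n_cols).T
    k = np.arange(n_rows)[:, None]
    l = np.arange(n_cols)[None, :]
    return float(np.sum((k ** 2 + l ** 2) * coeffs ** 2))

def log_prior(image):
    """Unnormalized Boltzmann log-probability: higher for smoother images."""
    return -roughness(image)

rng = np.random.default_rng(0)
ramp = np.outer(np.linspace(0.0, 1.0, 8), np.ones(8))   # smooth image
noise = rng.normal(size=(8, 8))                         # rough image
ramp /= np.linalg.norm(ramp)
noise /= np.linalg.norm(noise)
```

Under this assumed energy, the smooth ramp receives a higher log-prior than white noise of the same norm.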
25
Antiface [Figure: object images Ω, detector d, and the acceptance region; arrow marks the direction of lower probability]
26
Main Idea. Claim: for random natural images I viewed as unit vectors, $|d \cdot I|$ is large on average if d is smooth. The Anti-Face detector is defined as a vector d satisfying: – $|d \cdot x|$ is small for all x in the positive class – d is smooth, so $|d \cdot I|$ is large on average for a random natural image I.
27
Discrimination: if x is an image and $\Omega$ is the target class, $|d \cdot x|$ is SMALL when $x \in \Omega$ and LARGE otherwise.
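A simplified stand-in for such a detector, not the published Anti-Face optimization: pick the unit vector d that minimizes the sum of squared inner products with the training samples plus a roughness penalty. Vectors are assumed to be expressed in a frequency basis, so `freq_weights` (hypothetical) penalizes high-frequency components of d. The minimizer is the eigenvector with the smallest eigenvalue.

```python
import numpy as np

def antiface_like_detector(samples, freq_weights, lam=1e-3):
    """samples: (n, dim) unit vectors in a frequency basis; returns a unit vector d."""
    m = samples.T @ samples + lam * np.diag(freq_weights)
    _, eigvecs = np.linalg.eigh(m)   # eigenvalues in ascending order
    return eigvecs[:, 0]             # minimizes d^T m d over unit vectors

def accepts(d, x, threshold):
    """Accept x as a target-class candidate when |d . x| is small."""
    return abs(float(d @ x)) < threshold

rng = np.random.default_rng(1)
dim = 16
samples = rng.normal(size=(3, dim))  # assumption: random stand-ins for object images
samples /= np.linalg.norm(samples, axis=1, keepdims=True)
freq_weights = np.arange(dim) ** 2   # assumed penalty: grows with frequency index
d = antiface_like_detector(samples, freq_weights)
responses = np.abs(samples @ d)
```

Each entry of `responses` is small, so all training samples fall inside the acceptance region for a threshold of 0.1.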
28
Cascade of Independent Detectors 7 inner products 4 inner products
29
PROJECT 2 Detect road signs in video 1) Use the antiface method to learn a road sign under viewpoint variation. 2) Use the sign's spatial location in the frame as an additional cue. 3) Use scale change as an additional cue. 4) Use evidence integration to combine evidence of sign presence in the video stream.
30
Training with a small number of examples. The majority of object detection methods require a large number of training examples. Goal: to design a classifier that can learn from a small number of examples. Training existing classifiers on few examples leads to overfitting: the classifier learns the training examples by heart and performs poorly on unseen examples.
31
Linear SVM Maximal margin Enough training data Class 1 Class 2 Not Enough training data
32
Linear SVM –Detection Task Class 1 Class 2
33
MM with prior Object class
34
Other Priors? The current prior uses the simplest features – DCT. These features are not robust to deformations. State-of-the-art features – SIFT: local image features that are invariant to translation, rotation, and scale, and robust to minor variations in illumination and viewpoint.
35
SIFT – Scale Invariant Feature Transform Descriptor overview: – Determine the scale and the local orientation (the dominant gradient direction). Use this scale and orientation to make all further computations invariant to scale and rotation. – Compute gradient orientation histograms of several small windows (128 values for each point). – Normalize the descriptor to make it invariant to intensity change. David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
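The 128 values come from a 4x4 grid of windows with 8 orientation bins each. The sketch below illustrates only that histogram-and-normalize step on a fixed 16x16 patch. It omits scale selection, orientation alignment, Gaussian weighting and interpolation, so it is not Lowe's full descriptor. The function name is hypothetical.

```python
import numpy as np

def sift_like_descriptor(patch, cells=4, bins=8):
    """Gradient-orientation histograms over a cells x cells grid, L2-normalized."""
    gy, gx = np.gradient(patch.astype(float))      # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx) % (2 * np.pi)       # orientation in [0, 2*pi)
    bin_idx = np.minimum((angle / (2 * np.pi) * bins).astype(int), bins - 1)
    step = patch.shape[0] // cells
    desc = np.zeros((cells, cells, bins))
    for r in range(cells):
        for c in range(cells):
            window = (slice(r * step, (r + 1) * step),
                      slice(c * step, (c + 1) * step))
            # Magnitude-weighted orientation histogram of this window.
            desc[r, c] = np.bincount(bin_idx[window].ravel(),
                                     weights=magnitude[window].ravel(),
                                     minlength=bins)
    desc = desc.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc

rng = np.random.default_rng(2)
patch = rng.random((16, 16))                       # assumption: random stand-in patch
descriptor = sift_like_descriptor(patch)
brighter = sift_like_descriptor(2.0 * patch + 5.0) # gain and offset change
```

Because gradients ignore a constant offset and the final normalization removes gain, `descriptor` and `brighter` agree up to rounding.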
36
PROJECT 3 SIFT statistics: The goal of the project is to learn the statistics of state-of-the-art SIFT features in order to design a prior for recognition of images represented by SIFT descriptors.
37
Patch-Based Face Representation. A patch-based representation of a human face has several advantages: – It can be used in privacy-preserving applications where the identity of the person, specifically their photo, is classified. – It can be used in face identification with occlusions, such as glasses, facial hair, etc. – Since local patches can be assumed planar, it can also remove the effect of illumination change.
38
Patch-Based Face Representation A face is represented by a collection of informative patches: Assume that the face is represented by N patches. Patch centers Patch size –could vary
39
Gallery Public database of faces – M faces 1 2 N
40
Indexing 14
41
V = [1, 14, 28, …, 5] (N entries). The resulting vector V can be used for face recognition, but the picture of the person is not saved, so it cannot be misused.
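One plausible reading of the indexing step, stated here as an assumption: for each of the N patch locations, V stores the index of the gallery face whose patch at that location is closest to the probe's patch. The array layout and the name `index_vector` are hypothetical.

```python
import numpy as np

def index_vector(probe_patches, gallery_patches):
    """probe_patches: (N, D); gallery_patches: (M, N, D). Returns N gallery indices."""
    # dists[m, j] = distance between gallery face m and the probe at patch j.
    dists = np.linalg.norm(gallery_patches - probe_patches[None], axis=2)
    return dists.argmin(axis=0)

rng = np.random.default_rng(3)
gallery = rng.normal(size=(30, 10, 25))   # assumption: M=30 faces, N=10 patches, 5x5 pixels each
probe = gallery[3]                        # a probe identical to gallery face 3
v = index_vector(probe, gallery)
```

For a real probe the entries of `v` would typically differ from patch to patch, as in the example vector on the slide.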
42
Recognition: Enrolled people are stored as V1, V2, …, Vk. A query vector V is matched to each Vi (i = 1..k) using the Hamming distance. This can be made more robust; see the project description.
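Matching by Hamming distance can be sketched in a few lines. The enrolled vectors and the query below are made-up values for illustration, and `identify` is a hypothetical helper name.

```python
def hamming(u, v):
    """Number of positions at which two equal-length index vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a != b for a, b in zip(u, v))

def identify(query, enrolled):
    """Return (index, distance) of the enrolled vector closest to the query."""
    distances = [hamming(query, v) for v in enrolled]
    best = min(range(len(enrolled)), key=distances.__getitem__)
    return best, distances[best]

enrolled = [
    [1, 14, 28, 7, 5],    # V1
    [3, 14, 2, 9, 5],     # V2
    [8, 8, 28, 7, 11],    # V3
]
query = [1, 14, 28, 9, 5]
match, distance = identify(query, enrolled)
```

Here the query differs from V1 in one position, so V1 (index 0) is the nearest enrolled vector.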
43
PROJECT 4 “Clusteron” This project investigates a new patch-based representation of a human face and applies it to face identification.
44
Lighting changes an object's appearance.
45
Specular Lambertian How do we recognize these objects?
46
Few Definitions: Reflection Reflection - The scattering of light from an object. Two extreme cases: diffuse reflection and specular reflection. Real objects reflect light as a mixture of these two extremes.
47
Few Definitions: Lambertian Reflection Surface reflects equally in all directions. –Examples: chalk, clay, cloth, matte paint Brightness doesn’t depend on viewpoint. Amount of light striking surface proportional to cos θ. $I = \rho \, (n \cdot l)$, where I is the intensity, $\rho$ the albedo, n the surface normal, and l = (light intensity) * (light direction).
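The Lambertian rule can be evaluated directly. In this small sketch the light vector is the unit light direction scaled by the light intensity, and the clamp to zero for surfaces facing away from the light is a standard addition that the slide does not mention. The function name is hypothetical.

```python
import numpy as np

def lambertian_intensity(albedo, normal, light):
    """albedo * max(0, n . l), with n normalized and l = intensity * direction."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    l = np.asarray(light, dtype=float)
    return albedo * max(0.0, float(n @ l))

theta = np.pi / 3                                     # light 60 degrees off the normal
head_on = lambertian_intensity(0.8, [0, 0, 1], [0, 0, 2.0])
tilted = lambertian_intensity(0.8, [0, 0, 1],
                              2.0 * np.array([np.sin(theta), 0.0, np.cos(theta)]))
behind = lambertian_intensity(0.8, [0, 0, 1], [0, 0, -2.0])
```

Since cos 60° = 0.5, `tilted` is half of `head_on`, and the result does not depend on any viewing direction.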
48
Few Definitions: Specular Reflection Specular surfaces reflect light more strongly in some directions than in others. Appearance of a surface depends on the direction L of the light source, direction of the surface normal N, and direction V of viewing. The vectors L, N and R (the mirror-reflection direction) all lie in one plane.
49
Few Definitions: Specular Reflection Perfect mirror: The angle of incidence equals the angle of reflection (L and R make the same angle θ with N). Rough specular: Most specular surfaces reflect energy in a tight distribution (or lobe) centered on the optical reflection direction –Examples: metals, glass [Figure: mirror and rough specular reflection, with vectors L, N, R]
50
Few Definitions: Phong Model Determine the angle α between the direction V of viewing and the direction R of reflection by an ideal mirror. Assume the intensity of reflected light is proportional to $\cos^n(\alpha)$. The exponent n (“shine”) is determined empirically. Large values of n make the surface behave more like an ideal mirror. [Figure: vectors N, L, R, V]
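A small sketch of the Phong specular term: the mirror direction is R = 2(N·L)N − L, and the highlight strength is cos α raised to the exponent n, where α is the angle between R and V. Clamping negative cosines to zero is a common convention assumed here, and the function names are hypothetical.

```python
import numpy as np

def unit(v):
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def phong_specular(normal, light, view, shininess):
    """cos(alpha) ** shininess, where alpha is the angle between R and V."""
    n, l, v = unit(normal), unit(light), unit(view)
    r = 2.0 * float(n @ l) * n - l            # ideal mirror reflection of L about N
    cos_alpha = max(0.0, float(r @ v))
    return cos_alpha ** shininess

normal, light = [0, 0, 1], [1, 0, 1]
mirror_view = [-1, 0, 1]                      # exactly along R
off_view = [0, 0, 1]                          # 45 degrees away from R
peak = phong_specular(normal, light, mirror_view, 50)
dull = phong_specular(normal, light, off_view, 1)
shiny = phong_specular(normal, light, off_view, 50)
```

Away from the mirror direction the value drops much faster for n = 50 than for n = 1, which is the narrowing of the highlight as n grows.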
51
Phong’s exponent controls how fast the highlight falls off.
52
Main Approaches: Lambertian. 2D methods: based on quasi-invariance to lighting. Model-based methods (3D to 2D): 3D image rendering, or a low-dimensional representation of an object’s image set under different lightings; the result is compared with the image.
53
Main Approaches: Specular. 2D methods: will be distracted by highlights and lack of real edges. 3D methods: specular objects cannot be well approximated by low-dimensional linear subspaces. Apply Lambertian methods and treat specularities as noise?
54
Use specularities for recognition
55
Mapping: image to Gaussian sphere [Figure: vectors N, L, R, V]
56
Finding Specularity query map onto the sphere consistent specularity disk map back recovered highlights threshold specular candidates
57
Wrong Match query inconsistent map onto the sphere specularity disk map back recovered highlights threshold specular candidates
58
PROJECT 5 “Specularity detection” Assume that there are two types of points on a 3D sphere. A plane intersects a sphere in a disk. 1) Find a plane that separates the points into two regions, a disk and the rest of the sphere, with the minimal number of misclassifications (classification algorithm). 2) Test it on specularities obtained from images of real objects using mapping via 3D normals (scan models using a 3D scanner and take their pictures under different lighting directions).
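As a starting point for step 1, a plane can be written as a · p = t with unit normal a, so the disk side is the set of sphere points with a · p ≥ t. The sketch below is a brute-force baseline, not a guaranteed optimum: it tries a supplied list of candidate axes and every threshold between consecutive projections, keeping the plane with the fewest errors. The synthetic points and the name `fit_cap` are assumptions.

```python
import numpy as np

def fit_cap(points, labels, candidate_axes):
    """Return (errors, axis, offset) for the best plane axis . p = offset found."""
    best = None
    for axis in candidate_axes:
        axis = axis / np.linalg.norm(axis)
        proj = points @ axis
        s = np.sort(proj)
        # Thresholds below all points, between neighbours, and above all points.
        thresholds = np.concatenate(([s[0] - 1.0], (s[:-1] + s[1:]) / 2, [s[-1] + 1.0]))
        for t in thresholds:
            errors = int(np.sum((proj >= t) != labels))
            if best is None or errors < best[0]:
                best = (errors, axis, t)
    return best

def sphere_point(polar, azimuth):
    return np.array([np.sin(polar) * np.cos(azimuth),
                     np.sin(polar) * np.sin(azimuth),
                     np.cos(polar)])

azimuths = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
inside = np.array([sphere_point(0.2, a) for a in azimuths])    # near the north pole
outside = np.array([sphere_point(1.0, a) for a in azimuths])   # farther from the pole
points = np.vstack([inside, outside])
labels = np.array([True] * 8 + [False] * 8)
candidate_axes = [inside.mean(axis=0)] + list(inside)
errors, axis, offset = fit_cap(points, labels, candidate_axes)
```

On noisy specular candidates the minimum error will generally be above zero, and a denser set of candidate axes or a linear classifier on the 3D points may be needed.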