Face Recognition: An Introduction

Face Recognition
Face is the most common biometric used by humans. Applications range from static mug-shot verification to dynamic, uncontrolled face identification in a cluttered background. Challenges: automatically locate the face; recognize the face from a general viewpoint under different illumination conditions, facial expressions, and aging effects. Face recognition can be categorized into appearance-based, geometry-based, and hybrid approaches.

Authentication vs Identification
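Face authentication/verification is 1:1 matching: a probe face is compared against one claimed identity. Face identification/recognition is 1:N matching: a probe face is searched for among N enrolled identities.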

Applications
Access control.

Applications
Video surveillance (on-line or off-line); face scan at airports.

Why is Face Recognition Hard?
Many faces of Madonna

Face Recognition Difficulties
Identify similar faces (inter-class similarity). Accommodate intra-class variability due to head pose, illumination conditions, expressions, facial accessories, and aging effects. Figure: cartoon faces.

Inter-class Similarity
Different persons may have very similar appearance, for example twins, or father and son. Image source: news.bbc.co.uk/hi/english/in_depth/americas/2000/us_elections

Intra-class Variability
Faces with intra-subject variations in pose, illumination, expression, accessories, color, occlusions, and brightness.

Sketch of a Pattern Recognition Architecture
Image (window) -> Feature Extraction -> Feature Vector -> Classification -> Object Identity

Example: Face Detection
Scan a window over the image. A classifier labels each window as either face or non-face.
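A minimal sketch of the scan-and-classify loop, assuming a grayscale image stored as a 2-D NumPy array. The names `sliding_windows`, `detect`, and `toy_classifier` are hypothetical, and the brightness rule is only a stand-in for a trained face/non-face classifier.

```python
import numpy as np

def sliding_windows(image, size, step):
    """Yield (row, col, window) for every size x size window in the image."""
    rows, cols = image.shape
    for r in range(0, rows - size + 1, step):
        for c in range(0, cols - size + 1, step):
            yield r, c, image[r:r + size, c:c + size]

def detect(image, classify, size=4, step=2):
    """Return top-left corners of the windows the classifier labels as face."""
    return [(r, c) for r, c, w in sliding_windows(image, size, step) if classify(w)]

def toy_classifier(window):
    """Hypothetical stand-in: 'face' means mostly bright pixels."""
    return window.mean() > 0.5

image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0              # a bright 4x4 patch playing the role of a face
hits = detect(image, toy_classifier)
print(hits)
```

A real detector would repeat the scan over several image scales and use a learned classifier in place of `toy_classifier`.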

Detection Test Sets

Profile views Schneiderman’s Test set

Face Detection: Experimental Results
Test sets: two CMU benchmark data sets. Test set 1: 125 images with 483 faces. Test set 2: 20 images with 136 faces. [See also work by Viola & Jones, Rehg, and more recent work by Schneiderman.]

Example: Finding Skin (Non-parametric Representation of Class-Conditional Densities)
Skin has a very small range of (intensity-independent) colors, and little texture. Compute an intensity-independent color measure, check whether the color is in this range, and check whether there is little texture (median filter). See this as a classifier: we can set up the tests by hand, or learn them. To learn them, get class-conditional densities (histograms) and priors from data (counting). The classifier then follows from Bayes' rule, labeling a pixel as skin when its posterior probability of skin is high enough.
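A hedged sketch of the learned version: histograms as class-conditional densities, a prior obtained by counting, and Bayes' rule for the posterior. The normalized (r, g) chromaticity is one common intensity-independent color measure and is an assumption here, as are the function names and the made-up training pixels. The texture test (median filter) is left out.

```python
import numpy as np

def chromaticity(rgb):
    """Intensity-independent color: (r, g) = (R, G) / (R + G + B)."""
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=-1, keepdims=True)
    return rgb[..., :2] / np.maximum(total, 1e-9)

def fit_histogram(pixels, bins):
    """Class-conditional density p(color | class) as a normalized 2-D histogram."""
    rg = chromaticity(pixels)
    hist, _, _ = np.histogram2d(rg[:, 0], rg[:, 1], bins=bins, range=[[0, 1], [0, 1]])
    return hist / hist.sum()

def posterior_skin(pixel, h_skin, h_other, prior_skin, bins):
    """P(skin | color) by Bayes' rule from the two histograms and the prior."""
    r, g = chromaticity(pixel)
    i = min(int(r * bins), bins - 1)
    j = min(int(g * bins), bins - 1)
    num = h_skin[i, j] * prior_skin
    den = num + h_other[i, j] * (1.0 - prior_skin)
    return num / den if den > 0 else prior_skin   # unseen color: fall back to the prior

# Made-up labeled pixels, for illustration only.
bins = 8
skin_px = np.array([[200, 120, 100], [220, 140, 110], [180, 110, 90]])
other_px = np.array([[50, 120, 200], [60, 200, 70], [100, 100, 100]])
h_skin, h_other = fit_histogram(skin_px, bins), fit_histogram(other_px, bins)
prior_skin = len(skin_px) / (len(skin_px) + len(other_px))   # prior by counting
p = posterior_skin([210, 130, 105], h_skin, h_other, prior_skin, bins)
print(p)
```

Thresholding `p` gives the skin/non-skin decision; moving the threshold trades false positives against false negatives.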

Face Detection

Face Detection Algorithm
Face localization: Input Image -> Lighting Compensation -> Color Space Transformation -> Skin Color Detection -> Variance-based Segmentation -> Connected Component & Grouping. Facial feature detection: Eye/Mouth Detection -> Face Boundary Detection -> Verifying/Weighting Eyes-Mouth Triangles -> Output Image.

Canon Powershot

Face Recognition: 2-D and 3-D
Diagram: recognition data (2-D, 3-D, time/video) is compared against a face database (2-D, 3-D), using prior knowledge of the face class. Green denotes topics covered in subsequent slides.

Taxonomy of Face Recognition Algorithms
Pose-dependency: pose-dependent (viewer-centered images) or pose-invariant (object-centered models). Face representation / matching features: appearance-based (holistic), e.g. PCA, LDA; hybrid, e.g. LFA; feature-based (analytic), e.g. EGBM. Works cited on the slide: Gordon et al., 1995; Lengagne et al., 1996; Atick et al., 1996; Yan et al., 1996; Zhao et al., 2000; Zhang et al., 2000.

Image as a Feature Vector
Consider an n-pixel image to be a point in an n-dimensional space, $x \in \mathbb{R}^n$. Each pixel value is a coordinate of $x$.

Nearest Neighbor Classifier
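$\{R_j\}$ is the set of training images. A test image $I$ is assigned the identity of the nearest training image: $\mathrm{ID} = \arg\min_j \mathrm{dist}(R_j, I)$.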
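A small sketch of the nearest-neighbor rule, assuming images are NumPy arrays of equal size and using Euclidean (L2) distance. The function name and the labels "alice" and "bob" are hypothetical.

```python
import numpy as np

def nearest_neighbor(test_image, train_images, labels):
    """Return the label of the training image closest to test_image (L2 distance)."""
    x = np.asarray(test_image, dtype=float).ravel()        # image -> point in R^n
    R = np.stack([np.asarray(r, dtype=float).ravel() for r in train_images])
    dists = np.linalg.norm(R - x, axis=1)                   # one distance per R_j
    return labels[int(np.argmin(dists))]

train = [np.array([[0, 0], [0, 0]]), np.array([[9, 9], [9, 9]])]
labels = ["alice", "bob"]
print(nearest_neighbor(np.array([[8, 9], [7, 9]]), train, labels))
```

Swapping `np.linalg.norm(R - x, axis=1)` for `np.abs(R - x).sum(axis=1)` gives the L1 variant.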

Comments
Sometimes called “template matching”. Variations on the distance function are possible (e.g. L1, robust distances). Multiple templates per class, perhaps many training images per class. Expensive to compute k distances, especially when each image is big (N-dimensional). May not generalize well to unseen examples of a class. Some solutions: Bayesian classification; dimensionality reduction.

Eigenface (Turk, Pentland, 91) -1
Use Principal Component Analysis (PCA) to determine the most discriminating features between images of faces.

Eigenfaces: linear projection
An n-pixel image xRn can be projected to a low-dimensional feature space yRm by y = Wx where W is an n by m matrix. Recognition is performed using nearest neighbor in Rm. How do we choose a good W?

Eigenfaces: Principal Component Analysis (PCA)
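The main point is that the first principal component is the direction of largest variance (usually proved on the whiteboard). Some details: use singular value decomposition; the “trick” described in the text computes the basis when n << d, that is, when the number of training images n is much smaller than the number of pixels d per image.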
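One common form of the small-sample trick, sketched under the assumption that the data matrix X holds one mean-subtracted image per column (d rows, n columns). If $X^T X v = \lambda v$, then $X X^T (Xv) = \lambda (Xv)$, so the eigenvectors of the large d x d matrix come from an n x n problem. The random data is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 500, 5                        # d pixels per image, n images (n << d)
X = rng.normal(size=(d, n))
X -= X.mean(axis=1, keepdims=True)   # subtract the mean image from each column

# Solve the small n x n eigenproblem instead of the d x d one.
evals, V = np.linalg.eigh(X.T @ X)   # eigenvalues in ascending order
order = np.argsort(evals)[::-1][: n - 1]   # centering leaves at most n-1 nonzero ones
evals, V = evals[order], V[:, order]

U = X @ V                            # map each small eigenvector back to image space
U /= np.linalg.norm(U, axis=0)       # unit-length principal directions

# Check: every column of U is an eigenvector of the big matrix X X^T.
residual = np.abs((X @ X.T) @ U - U * evals).max()
print(U.shape, residual)
```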

How do you construct Eigenspace?
Construct the data matrix $[x_1\ x_2\ x_3\ x_4\ x_5]$ by stacking vectorized images as columns, then apply singular value decomposition (SVD) to obtain $W$.

Matrix Decompositions
Definition: the factorization of a matrix $M$ into two or more matrices $M_1, M_2, \ldots, M_n$ such that $M = M_1 M_2 \cdots M_n$. Many decompositions exist: QR decomposition, LU decomposition, LDU decomposition, etc.

Singular Value Decomposition
Excellent reference: “Matrix Computations,” Golub, Van Loan. Any m by n matrix $A$ may be factored such that $A = U \Sigma V^T$, with dimensions [m x n] = [m x m][m x n][n x n]. $U$: m by m orthogonal matrix; its columns are the eigenvectors of $AA^T$. $V$: n by n orthogonal matrix; its columns are the eigenvectors of $A^TA$. $\Sigma$: m by n, diagonal with non-negative entries $\sigma_1, \sigma_2, \ldots, \sigma_s$, with $s = \min(m, n)$, called the singular values. The singular values are the square roots of the s largest eigenvalues of both $AA^T$ and $A^TA$. The SVD algorithm returns them sorted: $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_s$.
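A quick numerical check of these statements, sketched with NumPy's `np.linalg.svd` rather than the Matlab call the slides use. The random matrix is only a placeholder.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 3))               # m = 5, n = 3

U, s, Vt = np.linalg.svd(A)               # full SVD: U is 5x5, Vt is 3x3
Sigma = np.zeros_like(A)                  # m x n matrix with s on its diagonal
Sigma[:3, :3] = np.diag(s)

reconstruction_ok = np.allclose(A, U @ Sigma @ Vt)
sorted_ok = bool(np.all(s[:-1] >= s[1:]))  # sigma_1 >= sigma_2 >= ...
# Singular values squared equal the eigenvalues of A^T A.
eig_ok = np.allclose(np.sort(np.linalg.eigvalsh(A.T @ A))[::-1], s ** 2)
print(reconstruction_ok, sorted_ok, eig_ok)
```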

SVD Properties
In Matlab, [u s v] = svd(A), and you can verify that A = u*s*v'. r = rank(A) = number of non-zero singular values. U and V give orthonormal bases for the four fundamental subspaces of A: the first r columns of U span the column space of A; the last m - r columns of U span the left nullspace of A; the first r columns of V span the row space of A; the last n - r columns of V span the nullspace of A. For d <= r, the first d columns of U provide the best d-dimensional basis for the columns of A in the least-squares sense.

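Thin SVD
Any m by n matrix $A$ may be factored such that $A = U \Sigma V^T$, with dimensions [m x n] = [m x n][n x n][n x n]. If m > n, one can view the full $\Sigma$ as $\begin{bmatrix} \Sigma' \\ 0 \end{bmatrix}$, where $\Sigma' = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_s)$ with $s = \min(m, n)$, and the lower block is an (m - n) by n matrix of zeros. Alternatively, you can write $A = U' \Sigma' V^T$, where $U'$ holds the first n columns of $U$. In Matlab, the thin SVD is [U S V] = svd(A, 'econ'); svds(A, k) instead returns only the k largest singular values and vectors.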
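For comparison, a NumPy sketch of full versus thin SVD. In NumPy the thin form is requested with `full_matrices=False`; the matrix is random and only illustrates the shapes.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(6, 2))                           # m = 6 > n = 2

U_full, s, Vt = np.linalg.svd(A, full_matrices=True)
U_thin, s_thin, Vt_thin = np.linalg.svd(A, full_matrices=False)

shapes = (U_full.shape, U_thin.shape)                 # (6, 6) versus (6, 2)
thin_ok = np.allclose(A, U_thin @ np.diag(s_thin) @ Vt_thin)
print(shapes, thin_ok)
```

For image data, where m is the number of pixels, the thin form avoids building an m by m matrix.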

Application: Pseudoinverse
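Given $y = Ax$, $x = A^+ y$. For square, invertible $A$, $A^+ = A^{-1}$. For any $A$, $A^+ = V \Sigma^{-1} U^T$, where only the non-zero singular values are inverted. $A^+$ is called the pseudoinverse of $A$. $x = A^+ y$ is the least-squares solution of $y = Ax$. Alternative to previous solution.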
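A short check of the pseudoinverse formula, assuming a full-column-rank A so that every singular value is non-zero. `np.linalg.pinv` and `np.linalg.lstsq` are NumPy's own routines and serve only as a reference.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(6, 3))               # overdetermined system y = A x
y = rng.normal(size=6)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T    # A+ = V Sigma^{-1} U^T (all s > 0 here)

x = A_pinv @ y                            # least-squares solution
x_lstsq = np.linalg.lstsq(A, y, rcond=None)[0]
print(np.allclose(x, x_lstsq))
```

For a rank-deficient A, the zero singular values must be skipped rather than inverted, which `np.linalg.pinv` handles with a tolerance.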

Performing PCA with SVD
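The singular values of $A$ are the square roots of the eigenvalues of both $AA^T$ and $A^TA$, and the columns of $U$ are the corresponding eigenvectors of $AA^T$. The covariance matrix is $C = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^T = \frac{1}{n} AA^T$, where $A$ is the data matrix whose columns are $x_i - \mu$. So, ignoring the factor 1/n: subtract the mean image $\mu$ from each input image, create the data matrix, and perform a thin SVD on the data matrix.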
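The recipe can be sketched as follows, assuming images arrive as rows of an array and keeping the convention that the rows of W are the basis vectors (so y = Wx). The function name `pca_svd` and the toy data are hypothetical.

```python
import numpy as np

def pca_svd(images, k):
    """images: (num_images, num_pixels). Returns (mean, W) with W of shape (k, num_pixels)."""
    X = np.asarray(images, dtype=float)
    mean = X.mean(axis=0)
    A = (X - mean).T                                   # one centered image per column
    U, s, _ = np.linalg.svd(A, full_matrices=False)    # thin SVD
    return mean, U[:, :k].T                            # rows of W: top-k principal directions

rng = np.random.default_rng(4)
# Toy data: 50 three-pixel "images" that vary mostly along the direction (1, 1, 0).
t = rng.normal(size=(50, 1))
images = t * np.array([1.0, 1.0, 0.0]) + 0.01 * rng.normal(size=(50, 3))
mean, W = pca_svd(images, k=1)
alignment = abs(W[0] @ np.array([1.0, 1.0, 0.0]) / np.sqrt(2))
print(alignment)
```

The first principal direction lines up with the direction of largest variance, up to sign.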

Mean and First Principal Component
Figure: the mean and the first principal component (direction of maximum variance).

Eigenfaces
Modeling: given a collection of n labeled training images, (1) compute the mean image and covariance matrix; (2) compute the k eigenvectors (note that these are images) of the covariance matrix corresponding to the k largest eigenvalues; (3) project the training images to the k-dimensional eigenspace. Recognition: given a test image, (1) project it to the eigenspace; (2) classify it against the projected training images.
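A compact sketch of the modeling and recognition steps, assuming vectorized images and nearest-neighbor classification in the eigenspace. The function names, the dictionary layout, and the four-pixel toy "faces" are all hypothetical.

```python
import numpy as np

def train_eigenfaces(images, labels, k):
    """Modeling: mean image, k eigenfaces, and projected training images."""
    X = np.asarray(images, dtype=float)
    mean = X.mean(axis=0)
    U, _, _ = np.linalg.svd((X - mean).T, full_matrices=False)
    W = U[:, :k].T                                   # k eigenfaces as rows
    return {"mean": mean, "W": W, "Y": (X - mean) @ W.T, "labels": list(labels)}

def recognize(model, image):
    """Recognition: project the test image, then take the nearest training projection."""
    y = model["W"] @ (np.asarray(image, dtype=float) - model["mean"])
    nearest = np.argmin(np.linalg.norm(model["Y"] - y, axis=1))
    return model["labels"][int(nearest)]

# Two made-up people, two four-pixel images each.
faces = [[10, 10, 0, 0], [11, 9, 0, 1], [0, 0, 10, 10], [1, 0, 9, 11]]
names = ["A", "A", "B", "B"]
model = train_eigenfaces(faces, names, k=2)
result = recognize(model, [10, 11, 1, 0])
print(result)
```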

Eigenfaces: Training Images
[Turk, Pentland 91]

Eigenfaces Mean Image Basis Images

Variable Lighting

Projection, and reconstruction
An n-pixel image xRn can be projected to a low-dimensional feature space yRm by y = Wx From yRm , the reconstruction of the point is WTy The error of the reconstruction is: ||x-WTWx||

Reconstruction using Eigenfaces
Given image on left, project to Eigenspace, then reconstruct an image (right).

Underlying assumptions
The background is not cluttered (or else we only look at the interior of the object). Lighting in the test image is similar to that in the training images. No occlusion. The size of the training image (window) is the same as the window in the test image.

Face detection using “distance to face space”
Scan a window $x$ across the image, and classify the window as face/not face as follows: project the window to the subspace, and reconstruct as described earlier. Compute the distance between $x$ and its reconstruction. Local minima of the distance over all image locations that are less than some threshold are taken as locations of faces. Repeat at different scales. Possibly normalize the window intensity so that $\|x\| = 1$.
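A simplified sketch of this detector, assuming a face space given by a mean and a matrix W with orthonormal rows. It thresholds the reconstruction error directly and leaves out the local-minimum search, the multi-scale loop, and intensity normalization. The one-dimensional toy face space is made up for illustration.

```python
import numpy as np

def distance_from_face_space(x, mean, W):
    """Reconstruction error of the centered window, for W with orthonormal rows."""
    centered = np.asarray(x, dtype=float).ravel() - mean
    reconstruction = W.T @ (W @ centered)
    return float(np.linalg.norm(centered - reconstruction))

def detect_faces(image, mean, W, size, threshold, step=1):
    """Return (row, col, distance) for windows whose distance is below threshold."""
    hits = []
    rows, cols = image.shape
    for r in range(0, rows - size + 1, step):
        for c in range(0, cols - size + 1, step):
            d = distance_from_face_space(image[r:r + size, c:c + size], mean, W)
            if d < threshold:
                hits.append((r, c, d))
    return hits

# Toy face space: a 2x2 mean "face" plus one brightness direction.
mean = np.array([1.0, 1.0, -1.0, -1.0])
W = np.array([[0.5, 0.5, 0.5, 0.5]])
image = np.ones((4, 4))
image[1:3, 1:3] = [[1, 1], [-1, -1]]      # plant the mean face at row 1, column 1
hits = detect_faces(image, mean, W, size=2, threshold=0.5)
print(hits)
```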

Difficulties with PCA
Projection may suppress important detail: the smallest-variance directions may not be unimportant. The method does not take the discriminative task into account: typically, we wish to compute features that allow good discrimination, which is not the same as largest variance.

Fig 22.7: Principal components will give a very poor representation of this data set.

Fig 22.10
Two classes indicated by * and o; the first principal component captures all the variance, but completely destroys any ability to discriminate. The second is close to what’s required.

Illumination Variability
Same person under variable lighting “The variations between the images of the same face due to illumination and viewing direction are almost always larger than image variations due to change in face identity.” -- Moses, Adini, Ullman, ECCV ‘94

Fisherfaces: Class specific linear projection
P. Belhumeur, J. Hespanha, D. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,” PAMI, July 1997. An n-pixel image $x \in \mathbb{R}^n$ can be projected to a low-dimensional feature space $y \in \mathbb{R}^m$ by $y = Wx$, where $W$ is an m by n matrix. Recognition is performed using nearest neighbor in $\mathbb{R}^m$. How do we choose a good $W$?

PCA & Fisher’s Linear Discriminant
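Between-class scatter: $S_B = \sum_{i=1}^{c} |\chi_i| (\mu_i - \mu)(\mu_i - \mu)^T$. Within-class scatter: $S_W = \sum_{i=1}^{c} \sum_{x_k \in \chi_i} (x_k - \mu_i)(x_k - \mu_i)^T$. Total scatter: $S_T = \sum_{k} (x_k - \mu)(x_k - \mu)^T = S_B + S_W$. Here $c$ is the number of classes, $\mu_i$ is the mean of class $\chi_i$, $\mu$ is the mean of all samples, and $|\chi_i|$ is the number of samples of $\chi_i$.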
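The three scatter matrices can be computed directly from labeled samples. This sketch assumes samples are stored as rows, uses random data, and checks that total scatter equals between-class plus within-class scatter. The function name is hypothetical.

```python
import numpy as np

def scatter_matrices(X, labels):
    """X: (N, n) samples as rows. Returns (S_B, S_W, S_T)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    mu = X.mean(axis=0)                      # mean of all samples
    n = X.shape[1]
    S_B = np.zeros((n, n))
    S_W = np.zeros((n, n))
    for cls in np.unique(labels):
        Xi = X[labels == cls]                # samples of class i
        mu_i = Xi.mean(axis=0)
        diff = (mu_i - mu)[:, None]
        S_B += len(Xi) * (diff @ diff.T)     # weighted by class size
        S_W += (Xi - mu_i).T @ (Xi - mu_i)
    S_T = (X - mu).T @ (X - mu)
    return S_B, S_W, S_T

rng = np.random.default_rng(6)
X = rng.normal(size=(12, 3))
labels = np.repeat([0, 1, 2], 4)
S_B, S_W, S_T = scatter_matrices(X, labels)
identity_holds = np.allclose(S_T, S_B + S_W)
print(identity_holds)
```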

PCA & Fisher’s Linear Discriminant
PCA (Eigenfaces) maximizes the projected total scatter. Fisher’s Linear Discriminant (FLD) maximizes the ratio of projected between-class scatter to projected within-class scatter. The point to emphasize is that PCA maximizes the projected total scatter, i.e., it preserves the information in the training set and is optimal in a least-squares sense for reconstruction. But in dropping dimensions, it may smear classes together. FLD trades off two desirable effects for recognition: (A) the within-class scatter over all classes is minimized, which makes classification using nearest neighbor effective; (B) the between-class scatter is maximized, which causes classes to be far apart in the feature space.

Computing the Fisher Projection Matrix
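The rows $w_i$ of $W$ are the generalized eigenvectors of $S_B$ and $S_W$ with the largest generalized eigenvalues: $S_B w_i = \lambda_i S_W w_i$. They can be scaled to unit length but, unlike the PCA basis, are not mutually orthogonal in general. There are at most c-1 non-zero generalized eigenvalues, so m <= c-1. They can be computed with eig in Matlab.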
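A hedged NumPy sketch of the computation, standing in for the Matlab eig call. It assumes $S_W$ is invertible and rewrites the generalized problem as an ordinary eigenproblem for $S_W^{-1} S_B$. The synthetic two-class data (elongated along x, separated along y) and the function name are assumptions for illustration.

```python
import numpy as np

def fisher_directions(X, labels, m):
    """Rows of the (m, n) result: top-m generalized eigenvectors of (S_B, S_W)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    mu = X.mean(axis=0)
    n = X.shape[1]
    S_B, S_W = np.zeros((n, n)), np.zeros((n, n))
    for cls in np.unique(labels):
        Xi = X[labels == cls]
        d = (Xi.mean(axis=0) - mu)[:, None]
        S_B += len(Xi) * (d @ d.T)
        S_W += (Xi - Xi.mean(axis=0)).T @ (Xi - Xi.mean(axis=0))
    # S_B w = lambda S_W w  <=>  (S_W^{-1} S_B) w = lambda w, assuming S_W is invertible.
    evals, evecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
    order = np.argsort(evals.real)[::-1][:m]
    W = evecs[:, order].real.T
    return W / np.linalg.norm(W, axis=1, keepdims=True)

rng = np.random.default_rng(5)
N = 200
X0 = np.column_stack([5 * rng.normal(size=N), -1 + 0.1 * rng.normal(size=N)])
X1 = np.column_stack([5 * rng.normal(size=N), 1 + 0.1 * rng.normal(size=N)])
X = np.vstack([X0, X1])
labels = np.array([0] * N + [1] * N)

w_fld = fisher_directions(X, labels, m=1)[0]
U, _, _ = np.linalg.svd((X - X.mean(axis=0)).T, full_matrices=False)
w_pca = U[:, 0]                     # first principal direction, for comparison
print(w_fld, w_pca)
```

On this data the first principal direction follows the x axis, where the variance is, while the Fisher direction follows the y axis, where the classes separate.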

Fisherfaces
Since $S_W$ has rank at most N-c (N training images, c classes), project the training set to the subspace spanned by the first N-c principal components of the training set. Apply FLD to the (N-c)-dimensional subspace, yielding a (c-1)-dimensional feature space. Fisher’s Linear Discriminant projects away the within-class variation (lighting, expressions) found in the training set, while preserving the separability of the classes. This is how $W$ is actually calculated. The rows of $W$ have the dimensions of images and, like the “eigenfaces”, can be viewed as “fisherfaces.”

PCA vs. FLD

Experimental Results - 1
Variation in facial expression, eyewear, and lighting. Input: 160 images of 16 people. Train: 159 images. Test: 1 image. Conditions: with glasses, without glasses, 3 lighting conditions, 5 expressions.

Performance Evaluation
Leave-one-out evaluation of PCA and LDA on the Yale Face Database [Belhumeur, Hespanha, Kriegman 97]

Approach           Dim. of the subspace   Error rate (close crop)   Error rate (full face)
Eigenface (PCA)    30                     24.4%                     19.4%
Fisherface (LDA)   15                     7.3%                      0.6%

Experimental Results - 2

Harvard Face Database
10 individuals, 66 images per person. Train on 6 images at 15°. Test on remaining images (60°).

Recognition Results: Lighting Extrapolation
Training on the near-frontal subset (within 15 degrees of frontal), and testing on more extreme lighting. As expected, Correlation performs slightly better than Eigenfaces. Removing the first three principal components works better, since they capture the lighting variation, but they also contain useful discriminatory information, which is lost. Fisherface performs better than the other methods.
