Recognition with Expression Variations
Pattern Recognition Theory – Spring 2003
Prof. Vijayakumar Bhagavatula
Derek Hoiem, Tal Blum
Method Overview
Training and test images (N variables) are passed through dimensionality reduction to obtain a reduced representation (m < N variables); classification is 1-NN with Euclidean distance in the reduced space.
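A minimal numpy sketch of this pipeline, assuming images are stored as flattened row vectors and that a projection matrix W has already been learned by one of the methods on the following slides; the function and variable names are illustrative, not taken from the project code.

```python
import numpy as np

def nearest_neighbor_classify(W, X_train, y_train, X_test):
    """1-NN classification with Euclidean distance in the reduced space.

    W       : (D, m) projection matrix (PCA, LDA, or PCA+LDA), m < N
    X_train : (N, D) training images as row vectors
    y_train : (N,)   class labels
    X_test  : (T, D) test images as row vectors
    """
    Z_train = X_train @ W                      # N x m reduced training features
    Z_test = X_test @ W                        # T x m reduced test features
    # Squared Euclidean distance from every test sample to every training sample
    d2 = ((Z_test[:, None, :] - Z_train[None, :, :]) ** 2).sum(axis=2)
    return y_train[np.argmin(d2, axis=1)]      # label of the nearest training sample
```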
Principal Components Analysis
Minimize representational error in a lower-dimensional subspace of the input.
Choose the eigenvectors corresponding to the m largest eigenvalues of the total scatter as the weight matrix:

$S_T = \sum_{k=1}^{N} (x_k - \mu)(x_k - \mu)^T$

$W_{opt} = \arg\max_W \left| W^T S_T W \right| = [w_1\ w_2\ \ldots\ w_m]$
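A sketch of the PCA step under the same row-vector assumption; pca_weights is an illustrative name. For face-sized images the D x D scatter is large, and in practice the equivalent N x N eigenproblem is solved instead, but the direct version keeps the idea clear.

```python
import numpy as np

def pca_weights(X, m):
    """Top-m eigenvectors of the total scatter S_T = sum_k (x_k - mu)(x_k - mu)^T.

    X : (N, D) training images as row vectors
    m : number of principal components to keep (m < N)
    Returns W of shape (D, m).
    """
    mu = X.mean(axis=0)
    Xc = X - mu                                 # center the data
    S_T = Xc.T @ Xc                             # total scatter matrix (D x D)
    eigvals, eigvecs = np.linalg.eigh(S_T)      # symmetric eigendecomposition, ascending order
    return eigvecs[:, ::-1][:, :m]              # eigenvectors of the m largest eigenvalues
```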
Linear Discriminant Analysis
Maximize the ratio of the between-class scatter to the within-class scatter in a lower-dimensional space than the input.
Choose the top m eigenvectors of the generalized eigenvalue solution:

$S_B = \sum_{i=1}^{c} (\mu_i - \mu)(\mu_i - \mu)^T$

$S_W = \sum_{i=1}^{c} \sum_{x_k \in \omega_i} (x_k - \mu_i)(x_k - \mu_i)^T$

$W_{opt} = \arg\max_W \frac{\left| W^T S_B W \right|}{\left| W^T S_W W \right|} = [w_1\ w_2\ \ldots\ w_m]$

$S_B w_i = \lambda_i S_W w_i$
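A corresponding sketch of the LDA step, again with illustrative names; it assumes scipy is available and that S_W is nonsingular, which is exactly what the next two slides address.

```python
import numpy as np
from scipy.linalg import eigh

def lda_weights(X, y, m):
    """Top-m eigenvectors of the generalized problem S_B w = lambda S_W w.

    X : (N, D) samples as row vectors, y : (N,) class labels, typically m <= c - 1
    Returns W of shape (D, m).
    """
    mu = X.mean(axis=0)
    D = X.shape[1]
    S_B = np.zeros((D, D))                      # between-class scatter
    S_W = np.zeros((D, D))                      # within-class scatter
    for ci in np.unique(y):
        Xi = X[y == ci]
        mu_i = Xi.mean(axis=0)
        S_B += np.outer(mu_i - mu, mu_i - mu)
        S_W += (Xi - mu_i).T @ (Xi - mu_i)
    # Generalized symmetric-definite eigenproblem; requires S_W to be nonsingular
    eigvals, eigvecs = eigh(S_B, S_W)           # eigenvalues returned in ascending order
    return eigvecs[:, ::-1][:, :m]              # keep the top-m discriminant directions
```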
LDA: Avoiding Singularity
For N samples and c classes:
- Reduce dimensionality to N - c using PCA
- Apply LDA to the reduced space
- Combine the weight matrices:

$W_{opt}^T = W_{LDA}^T W_{PCA}^T$
Discriminant Analysis of Principal Components
For N samples and c classes:
- Reduce dimensionality to m < N - c using PCA
- Apply LDA to the reduced space
- Combine the weight matrices:

$W_{opt}^T = W_{LDA}^T W_{PCA}^T$
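Combining the two, a sketch of the PCA-then-LDA projection; it reuses the illustrative pca_weights and lda_weights functions above. With m_pca = N - c it matches the singularity fix on the previous slide; with m_pca < N - c it drops the lowest-ranked principal components as in discriminant analysis of principal components.

```python
import numpy as np

def pca_lda_weights(X, y, m_pca, m_lda):
    """PCA to m_pca dimensions, then LDA to m_lda dimensions.

    Returns the combined projection W of shape (D, m_lda),
    i.e. W_opt^T = W_LDA^T W_PCA^T.
    """
    W_pca = pca_weights(X, m_pca)               # D x m_pca
    Z = (X - X.mean(axis=0)) @ W_pca            # training data in the PCA subspace
    W_lda = lda_weights(Z, y, m_lda)            # m_pca x m_lda
    return W_pca @ W_lda                        # combined D x m_lda projection matrix
```

For example, pca_lda_weights(X_train, y_train, m_pca=30, m_lda=12) followed by nearest_neighbor_classify would mirror the PCA (30) and PCA+LDA (12) configurations reported in the results slides.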
When PCA+LDA Can Help
- Test includes subjects not present in training set
- Very few (1-3) examples available per class
- Test samples vary significantly from training samples
Why Would PCA+LDA Help?
- Allows more freedom of movement for maximizing between-class scatter
- Removes potentially noisy low-ranked principal components when determining the LDA projection
- Goal is improved generalization to non-training samples
PCA Projections [figure: best 2-D projection of training and testing samples]
LDA Projections [figure: best 2-D projection of training and testing samples]
PCA+LDA Projections [figure: best 2-D projection of training and testing samples]
Processing Time
Training time: < 3 seconds (Matlab, 1.8 GHz)
Testing time: O(d * (N + T))

Method       | images/sec
PCA (30)     | 224
LDA (12)     | 267
PCA+LDA (12) |
Results

Dimensions | PCA    | LDA    | PCA+LDA
1          | 37.52% | 11.28% | 14.15%
2          | 2.56%  | 1.13%  | 1.03%
3          | 0.41%  | 0.00%  | 0.72%
4          |        |        |
Sensitivity of PCA+LDA to Number of PCA Vectors Removed
13: 23.69%  11.28%
14: 7.08%
15: 14.15%
16: 9.33%
17: 6.56%
Conclusions
- Recognition under varying expressions is an easy problem
- LDA and PCA+LDA produce better subspaces for discrimination than PCA
- Simply removing the lowest-ranked PCA vectors may not be a good strategy for PCA+LDA
- Maximizing the minimum between-class distance may be a better strategy than maximizing the Fisher ratio
References
M. Turk and A. Pentland, “Face Recognition Using Eigenfaces,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 586-591, 1991.
P.N. Belhumeur, J.P. Hespanha, and D.J. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,” in Proc. European Conf. on Computer Vision, April 1996.
W. Zhao, R. Chellappa, and P.J. Phillips, “Discriminant Analysis of Principal Components for Face Recognition,” in Proc. International Conf. on Automatic Face and Gesture Recognition, pp. 336-341, 1998.