
A Statistical Assessment of Subject Factors in the PCA Recognition of Human Subjects
Geof Givens*, J. Ross Beveridge, Bruce A. Draper & David Bolme
Computer Science, Colorado State University
*Statistics, Colorado State University
CVPR Workshop: Statistical Analysis in Computer Vision, June 22, 2003

Some Human Subjects Are Harder to Recognize than Others. Why?

Better Formulate the Question
Choose an algorithm:
– Something standard and simple.
– PCA using our publicly released version.
Choose a distance measure:
– Yambor Angle is good and similar to Moon-Phillips FERET.
– Much better than standards such as L2.
Define & collect covariates:
– Few covariates were collected initially with FERET images.
– One person at CSU, David Bolme, scored all images.
Big questions:
– What to measure?
– Covariates in isolation or together?

Some Basics: Image Preprocessing
Refinement of NIST preprocessing used in FERET.
Integer to float conversion
– Converts 256 gray levels to single-precision floats.
Geometric normalization
– Aligns human-chosen eye coordinates.
Masking
– Crop with an elliptical mask, leaving only the face visible.
Histogram equalization
– Histogram-equalizes the unmasked pixels: 256 levels.
Pixel normalization
– Shift and scale pixel values so the mean pixel value is zero and the standard deviation over all pixels is one.
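The last two preprocessing steps can be sketched in a few lines. This is a minimal illustration, not the NIST or CSU code: the image is assumed to be a flat list of integer gray levels, the mask a list of booleans (True = face pixel), the function names are hypothetical, and the statistics are assumed to be computed over the unmasked pixels only. Geometric normalization is omitted.

```python
def equalize_masked(pixels, mask, levels=256):
    """Histogram-equalize only the pixels where mask is True."""
    inside = [p for p, m in zip(pixels, mask) if m]
    hist = [0] * levels
    for p in inside:
        hist[p] += 1
    # Cumulative count of gray levels inside the mask.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    n = len(inside)
    # Map each gray level to its cumulative fraction, rescaled to 0..levels-1.
    lut = [round((levels - 1) * c / n) for c in cdf]
    return [float(lut[p]) if m else 0.0 for p, m in zip(pixels, mask)]


def normalize_pixels(values, mask):
    """Shift and scale the unmasked pixels to mean 0 and standard deviation 1.

    Assumes the unmasked pixels are not all identical (nonzero deviation).
    """
    inside = [v for v, m in zip(values, mask) if m]
    mean = sum(inside) / len(inside)
    std = (sum((v - mean) ** 2 for v in inside) / len(inside)) ** 0.5
    return [(v - mean) / std if m else 0.0 for v, m in zip(values, mask)]


# Tiny synthetic example: six pixels, the last one outside the mask.
pixels = [0, 50, 100, 150, 200, 255]
mask = [True, True, True, True, True, False]
equalized = equalize_masked(pixels, mask)
normalized = normalize_pixels(equalized, mask)
```

Pixels outside the mask are set to zero here so that they contribute nothing to later projections; that choice is also an assumption of the sketch.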

More Basics: Standard PCA Algorithm
[Diagram. Training: training images → eigenspace. Testing: PCA space projection → distance matrix.]
Remove the in/out-of-training issue by training on all images.
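The training and testing pipeline can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the released CSU implementation: the function names are hypothetical, the images are random stand-ins, and plain L2 distance is used only because its definition is unambiguous (the talk itself prefers the Yambor Angle measure).

```python
import numpy as np


def train_eigenspace(images, n_components):
    """images: (n_images, n_pixels) array of preprocessed, flattened images."""
    mean = images.mean(axis=0)
    centered = images - mean
    # Rows of vt are the principal directions of the centered training set.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(images, mean, basis):
    """Coordinates of each image in the PCA subspace."""
    return (images - mean) @ basis.T


def distance_matrix(coords):
    """Pairwise L2 distances between projected images (placeholder measure)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


rng = np.random.default_rng(0)
images = rng.normal(size=(12, 50))      # synthetic stand-in for face images
mean, basis = train_eigenspace(images, n_components=5)
coords = project(images, mean, basis)   # train and test on the same images
dist = distance_matrix(coords)
```

Training and projecting the same set mirrors the slide's choice to train on all images, so no image is "outside" the training set.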

What Covariates?
Glasses, Bangs, Facial Hair, Mouth, Smiling?, Eyes, Age, Gender, Race

The Final Set of Subject Covariates

Collecting the Covariates

What to Measure? Recognition Rate on Partitioned Data
Measure recognition rate for partitioned images.
– Partition images by covariate, e.g. male versus female.
– Compare recognition rate on the different sets.
Good
– Answers a very specific version of the question.
– Recognition rate is a standard performance measure.
Bad
– Fails to adjust for, i.e. control for, other covariates.
– Recognition rate is for a probe set, not a single subject.
– Hidden dependence on the gallery image set.
– Statistical significance is hard to interpret.
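The partitioned measurement can be sketched as follows. The function names and the tiny distance table are hypothetical and purely illustrative; each row holds one probe's distances to every gallery image, and smaller means closer.

```python
def recognition_rank(probe_to_gallery, correct_index):
    """Rank of the correct gallery image for one probe (1 = closest match)."""
    target = probe_to_gallery[correct_index]
    return 1 + sum(1 for d in probe_to_gallery if d < target)


def rank1_rate_by_group(distances, correct, groups):
    """Fraction of probes matched at rank 1, reported per covariate level."""
    hits, totals = {}, {}
    for row, idx, group in zip(distances, correct, groups):
        totals[group] = totals.get(group, 0) + 1
        if recognition_rank(row, idx) == 1:
            hits[group] = hits.get(group, 0) + 1
    return {g: hits.get(g, 0) / totals[g] for g in totals}


# Four synthetic probes against a three-image gallery.
distances = [
    [0.1, 0.5, 0.9],
    [0.7, 0.2, 0.4],
    [0.3, 0.6, 0.2],
    [0.8, 0.9, 0.1],
]
correct = [0, 1, 0, 2]                      # index of the true match per probe
groups = ["male", "male", "female", "female"]
rates = rank1_rate_by_group(distances, correct, groups)
```

The sketch makes two of the listed drawbacks visible: the rate exists only per probe set, and every rank depends on which other images happen to be in the gallery.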

What to Measure? Intrapersonal Image Pair Distance
Measure distance (similarity) between images.
– For two images of one subject, closer is better.
Good
– Measure is independent of other subjects, probe sets, etc.
– Measure is continuous, so analysis of variance is appropriate.
– Linear model accounts for all covariates at one time.
– Standard tests of statistical significance apply.
Bad
– The connection from distance to recognition rate is indirect.
– Supplemental analysis is required to establish the linkage, made through an intermediate measure: recognition rank.

A Glimpse of Distance and Rank Data

Image Pairs for Three Example Subjects Shown Above
Recognition Rank = 1
Distance = -299*, Distance = -206*, Distance = -110*
* Distance x 100,000

Linear Model Relating Distance to Subject Covariates
$Y_i$ = distance metric for image pair $i$.
$X_i$ = human covariate factors for image pair $i$.
$\beta$ = parameters quantifying factor effects.
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \varepsilon_i, \qquad \varepsilon_i \sim \text{iid Normal}(0, \sigma^2)$$
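A least-squares fit of such a model can be sketched with NumPy. Everything below is synthetic and assumed for illustration: the two 0/1 covariates, the effect sizes (-30 and +10), and the noise level are made up and are not the study's estimates; `fit_linear_model` is a hypothetical helper.

```python
import numpy as np


def fit_linear_model(X, y):
    """Least-squares fit of y = b0 + X b + error; returns coefficients and R^2."""
    design = np.column_stack([np.ones(len(y)), X])   # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    total = y - y.mean()
    r2 = 1.0 - (residuals @ residuals) / (total @ total)
    return beta, r2


# Synthetic stand-in data: two dummy-coded covariates per image pair.
rng = np.random.default_rng(1)
n = 200
glasses = rng.integers(0, 2, n)          # 1 = glasses always on (assumed coding)
old = rng.integers(0, 2, n)              # 1 = older subject (assumed coding)
X = np.column_stack([glasses, old]).astype(float)
y = -200.0 - 30.0 * glasses + 10.0 * old + rng.normal(0.0, 5.0, n)

beta, r2 = fit_linear_model(X, y)        # beta = [intercept, glasses, old]
```

Each coefficient is the change in distance for one covariate level with the other covariates held fixed, and `r2` is the share of variation in distance that the covariates explain.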

Comments on ANOVA
1,072 subject image pairs.
– Each image pair corresponds to a unique person.
Both images are from the same day.
– Delay between images is known to make recognition harder.
Subjects did not add or remove glasses.
– Pilot studies included this case; it is harder.
ANOVA yielded R² = 0.39.
– Covariates explain 39% of observed variation.
– Notable given 75% of subjects can be recognized at rank 1.

FERET Subject Covariates: Summary of Results
[Bar chart: change in similarity measure for each covariate level relative to the base case. Axis: Change in Similarity Measure, -50% to 50%, with ends labeled Harder to Recognize and Easier to Recognize.]
Covariate levels shown:
– Glasses: Off, Always On
– Age: Young, Old
– Eyes: Open, Open/Closed, Always Closed
– Expression: Neutral, Always Non-neutral, Changes
– Race: White, Asian, African-Amer., Other
– Facial Hair: None, Always, Changes
– Gender: Male, Female
– Makeup: None, Always, Changes
– Mouth: Closed, Always Open, Changes
– Bangs: None, Always, Change
– Skin: Clear, Not Clear

Two Possible Reservations
Unbalanced training biases conclusions?
“Asians are closer to each other than whites because the algorithm was trained on very few of them. The unbalanced training means less well represented sets of samples are not as well separated.”
Choice of Y = distance renders results irrelevant?
“We care about recognition rank, or rank-k recognition rate. I don’t know whether your response variable is strongly related to this. Even if it is related overall, does the relationship hold for specific groups?”

Plot of Distance versus the Log of Recognition Rank
[Scatter plot. Horizontal axis: Distance (x 100,000), -350 to 0. Vertical axis: Log of Recognition Rank, -0.5 to 3.5.]
845 out of 1,072 subject image pairs are closest matches, i.e. rank is 1.
Visual inspection suggests a strong relationship between distance and rank.

Modeling the Relationship Between Distance and Recognition Rank
$Y_i$ = Was the $i$th image pair matched at rank 1? (i.e. $Y_i = 1$ if $R_i = 0$ and otherwise $Y_i = 0$)
$X_i$ = distance metric for image pair $i$.
$\beta$ = parameters quantifying the relationship.
$$g(\mu_{Y_i|X_i}) = X_i'\beta = \beta_0 + \beta_1 X_i, \qquad Y_i \mid X_i \sim f(\mu_{Y_i|X_i}) \text{ independently}$$
Now: $g(z) = \log\left(z/(1-z)\right)$, $f(\mu_{Y_i|X_i}) = \text{Bernoulli}(\mu_{Y_i|X_i})$
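A logistic regression of this form, with a single predictor, can be fit by Newton-Raphson in plain Python. This is a sketch under stated assumptions: `fit_logistic` is a hypothetical helper, the twelve data points are synthetic (distances on an arbitrary standardized scale, with some overlap between the classes so that the maximum-likelihood estimate is finite), and no p-value is computed.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Newton-Raphson fit of logit(P(y = 1)) = b0 + b1 * x for one predictor."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))   # fitted probability
            w = mu * (1.0 - mu)                             # Bernoulli variance
            g0 += yi - mu
            g1 += (yi - mu) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Solve the 2x2 Newton system (information matrix) * step = gradient.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic data: rank-1 matches (y = 1) are more common at small distances.
x = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, -1.0, 0.0, 1.0]
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
b0, b1 = fit_logistic(x, y)


def prob_rank1(distance):
    """Fitted probability of a rank-1 match at the given distance."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * distance)))
```

A negative `b1` means the fitted probability of a rank-1 match falls as distance grows, which is the direction of the relationship the model is meant to test.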

Logistic Regression of Rank Indicator on Distance
[Plot. Horizontal axis: Distance (x 100,000), -350 to 0. Vertical axis: $Z_i$, one if rank is one, -0.2 to 1.2.]
The probability of a rank 1 match decreases sharply with increasing distance; $\beta_1$ is -10585.3 with p-value = 0.049.
Similar when grouped for race, gender, age, skin and glasses.

Supplemental Balanced Training Experiment for Race
Distribution of race in the 1,072 subjects is not balanced: 720 White, 143 Asian, 121 Black, 88 Other.
Follow-up experiment replicates the first, but with carefully balanced training.
Held fixed in all three experiments: Glasses = Off, Eyes = Open, Skin = Clear. Balance Age.

Experiment  Test   Compare  Young (per group)  Old (per group)  Total Images
1           Asian  White    89                 6                380
2           Black  White    78                 16               376
3           Other  White    62                 6                272

PCA Dim. values on the slide: 78, 75, 102.
p-values on the slide: 0.0104, 0.0064, 0.0249.
Confirm: √ for all three experiments.
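One way to build such a balanced set is to draw equal numbers of subjects from the two groups within every level of the covariate being balanced. The sketch below is an assumption about procedure, not the authors' code: the helper name, the dictionary keys, and the synthetic subject records are all hypothetical.

```python
import random
from collections import defaultdict


def balanced_subsample(subjects, group_key, strata_key, target, reference, seed=0):
    """Pick equal numbers of target and reference subjects within each stratum."""
    rng = random.Random(seed)
    pools = defaultdict(list)
    for s in subjects:
        pools[(s[group_key], s[strata_key])].append(s)
    chosen = []
    for stratum in sorted({s[strata_key] for s in subjects}):
        a = pools[(target, stratum)]
        b = pools[(reference, stratum)]
        k = min(len(a), len(b))          # the smaller pool sets the sample size
        chosen += rng.sample(a, k) + rng.sample(b, k)
    return chosen


# Synthetic subject records: far more White than Asian subjects.
subjects = (
    [{"race": "Asian", "age": "Young"}] * 5
    + [{"race": "Asian", "age": "Old"}] * 1
    + [{"race": "White", "age": "Young"}] * 20
    + [{"race": "White", "age": "Old"}] * 10
)
sample = balanced_subsample(subjects, "race", "age", "Asian", "White")
counts = defaultdict(int)
for s in sample:
    counts[(s["race"], s["age"])] += 1
```

After sampling, both groups have the same number of subjects in each age level, so neither race nor age composition can favor one group during training.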

Supplemental Balanced Training Experiments for Other Covariates

Experiment 4, to test Age. Held fixed: Glasses = Off, Eyes = Open, Race = White. Balance Skin.
Compare  Clear  Other
Old      131    57
Young    131    57
Total Images: 752

Experiment 5, to test Skin. Held fixed: Glasses = Off, Eyes = Open, Race = White. Balance Age.
Compare  Old  Young
Clear    72   57
Other    72   57
Total Images: 516

Experiment 6, to test Glasses. Held fixed: Eyes = Open, Race = White, Skin = Clear.
Compare  Old  Young
Off      14   16
On       14   16
Total Images: 120

PCA Dim. values on the slide: 50, 117, 130.
p-values on the slide: 0.0001, 0.0122, 0.0005.
Confirm: √ for all three experiments.

Sampling of Related Work
Nicholas Furl, P. Jonathon Phillips and Alice J. O’Toole. Face recognition algorithms and the other-race effect: computational mechanisms for a developmental contact hypothesis.
– PCA-based study, recall-style experiment; looked at the White vs. Asian distinction in FERET; training bias like that of humans.
Jeffrey F. Cohn, Ralph Gross, Jianbo Shi. Quo vadis face recognition?: The current state of the art in face recognition, 2002.
– Partitioned probe set of 1,119 subjects: 87.6 vs. 93.7 recognition rate for females vs. males. Used FaceIt and not FERET data. Found men easier.
P. Jonathon Phillips, Patrick Grother, Ross J. Micheals, Duane M. Blackburn, Elham Tabassi, Mike Bone. Face Recognition Vendor Test 2002, Evaluation Report.
– Large data set, eight systems, CMC and ROC analysis. Gender test partitioned the probe set and found men easier.

More on the Gender Effect, or Lack of Gender Effect
[Figure from the FRVT 2002 Evaluation Report.]
These results are based upon a simple partition of the probe set.

What Happens if We Replicate the Simpler Partition on Gender?
Fit our data with a simple one-way ANOVA linear model on Gender.
– Analogous to a partition of male vs. female probes.
Result
– “Statistically significant” gender effect.
– Male subjects about 13% easier: p < 0.0001.
But
– We already know this result is wrong!
– Full analysis with all our covariates shows no gender effect.
This exercise illustrates a classic mistake:
It is far too much to hope that a ‘sample of convenience’ will be balanced with respect to every confounding variable.
What might be confounding?
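The mechanism behind this mistake can be reproduced with a toy data set. Everything here is invented for illustration: distance is constructed to depend on glasses only, glasses are simply more common among the synthetic male subjects, and glasses stand in for whatever the real confounder may be. Comparing within each glasses level is a simple stand-in for fitting the full multi-covariate model.

```python
def mean(values):
    return sum(values) / len(values)


# (gender, glasses, distance): by construction, distance depends on glasses only.
data = (
    [("male", 1, -240.0)] * 60 + [("male", 0, -200.0)] * 40
    + [("female", 1, -240.0)] * 20 + [("female", 0, -200.0)] * 80
)


def gender_gap(rows):
    """Mean male distance minus mean female distance."""
    males = [d for g, _, d in rows if g == "male"]
    females = [d for g, _, d in rows if g == "female"]
    return mean(males) - mean(females)


# One-way comparison: ignores glasses, so the imbalance leaks into the gap.
unadjusted = gender_gap(data)

# Adjusted comparison: compare genders within each glasses level.
adjusted = {
    level: gender_gap([row for row in data if row[1] == level])
    for level in (0, 1)
}
```

The unadjusted gap is nonzero even though gender has no effect in this data, and it vanishes once the comparison is made within levels of the confounder.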

Conclusions
Most comprehensive study of its kind to date.
– i.e. most covariates considered together.
Interesting discoveries
– Glasses aid recognition (but don’t take them off).
– Non-white subjects easier, despite smaller sample.
– Lack of gender effect, why?
Was there enough data?
– Yes, compares favorably to standards for statistical analysis.
Hubris
– All studies have limitations.
– Linkage between distance and recognition performance.
Future work
– Analogous experiment on a portion of the FRVT 2002 data.
– Image covariate analysis.

A Little Extra: The Best and Worst of FERET
Best Subjects (Closest Pairs)
Worst Subjects (Most Distant Pairs)
