Parametric measures to estimate and predict performance of identification techniques. Amos Y. Johnson & Aaron Bobick. Statistical Methods for Computational Experiments in Visual Processing & Computer Vision, NIPS 2002.
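Setup (example): Given a particular human identification technique that measures one feature (q) from n individuals. The first set of measurements forms the gallery of templates. The feature is then measured again to form the probe set. For a given template, the probe from the same individual is the target and the probes from all other individuals are impostors. [Figure: 1D feature space (x) showing gallery and probe measurements, with the target and impostors marked for one template.]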
Question: For a given human identification technique, how should identification performance be evaluated?
Possible ways to evaluate performance: For a given classification threshold, compute the false accept rate (FAR) of impostors and the correct accept rate (HIT) of genuine targets.
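As a minimal sketch of the threshold-based evaluation, the Python below computes FAR and HIT from two lists of distances to a template. The function name `far_and_hit`, the rule "accept when distance <= threshold", and the sample distances are assumptions made for illustration; they are not taken from the slides.

```python
def far_and_hit(genuine_distances, impostor_distances, threshold):
    """Return (FAR, HIT), accepting a probe when its distance <= threshold."""
    # HIT: fraction of genuine target probes that are accepted.
    hit = sum(d <= threshold for d in genuine_distances) / len(genuine_distances)
    # FAR: fraction of impostor probes that are (wrongly) accepted.
    far = sum(d <= threshold for d in impostor_distances) / len(impostor_distances)
    return far, hit


# Hypothetical distances between probes and their claimed templates.
genuine = [0.1, 0.3, 0.2, 0.6]
impostor = [0.5, 1.2, 0.9, 2.0, 0.4]

far, hit = far_and_hit(genuine, impostor, threshold=0.45)
print(f"FAR = {far:.2f}, HIT = {hit:.2f}")
```

Raising the threshold increases both rates, which is why a single (FAR, HIT) pair describes only one operating point.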
Possible ways to evaluate performance: For various classification thresholds, plot the resulting FAR and HIT rates (ROC curve). The area under the ROC curve (AUROC) is the probability of correct classification; one minus that area (1 - AUROC) is the probability of incorrect classification.
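The following sketch sweeps the threshold over every observed distance, collects the (FAR, HIT) points, and integrates them with the trapezoid rule. The helper names `roc_points` and `area_under_curve` and the sample distances are hypothetical; the sketch assumes smaller distances mean better matches.

```python
def roc_points(genuine, impostor):
    """Return (FAR, HIT) points for thresholds at every observed distance."""
    points = [(0.0, 0.0)]  # threshold below all distances: nothing accepted
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(d <= t for d in impostor) / len(impostor)
        hit = sum(d <= t for d in genuine) / len(genuine)
        points.append((far, hit))
    return points


def area_under_curve(points):
    """Trapezoid-rule area under a curve given as sorted (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical distances between probes and their claimed templates.
genuine = [0.1, 0.3, 0.2, 0.6]
impostor = [0.5, 1.2, 0.9, 2.0, 0.4]

auroc = area_under_curve(roc_points(genuine, impostor))
print(f"AUROC = {auroc:.3f}, 1 - AUROC = {1.0 - auroc:.3f}")
```

With no tied distances, this area equals the fraction of (genuine, impostor) pairs in which the genuine distance is the smaller one, which is the "probability of correct classification" reading of AUROC.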
Problem (database size): If the database is not of sufficient size, then the results may not estimate or predict performance on a larger population of people. [Figure: 1 - AUROC.]
Our Goal: To estimate and predict identification performance with a small number of subjects. [Figure: 1 - AUROC.]
Our Solution: Derive two parametric measures. Expected Confusion (EC): the probability that an impostor's feature vector is within the measurement variation of a target's template. Transformed Expected-Confusion (EC*): the probability that an impostor's feature vector is closer to a target's template than the target's own feature vector is; EC* = 1 - AUROC.
Expected Confusion: The probability that an impostor's feature vector is within the measurement variation of a target's template.
Expected Confusion - Uniform: The templates of the n individuals are drawn from a uniform density, $P_p(x) = 1/n$.
Expected Confusion - Uniform: The measurement variation of a template is also uniform, $P_i(x) = 1/m$. [Figure: P(x) versus x in the 1D feature space, with $P_p(x)$ at height 1/n and $P_i(x)$ at height 1/m.]
Expected Confusion - Uniform: The probability that an impostor's feature vector is within the measurement variation of template $q_3$ is the area of overlap. This holds if $m \ll n$.
Expected Confusion - Uniform: The probability that an impostor's feature vector is within the measurement variation of any template q. This holds if $m \ll n$.
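One way to read the area of overlap: a measurement window of width m cut out of a population density of height 1/n has area m/n. The closed form m/n, the window centered on the template, and the function names below are assumptions made for this sketch; the Monte Carlo check also ignores the impostor's own measurement noise, which is reasonable only when m is much smaller than n.

```python
import random


def expected_confusion_uniform(n, m):
    """Assumed closed form: area of a width-m window under a density of height 1/n."""
    return m / n


def simulate_ec_uniform(n, m, trials=200_000, seed=0):
    """Estimate how often an impostor lands inside a template's measurement window."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(trials):
        template = rng.uniform(0.0, n)   # template drawn from the population density
        impostor = rng.uniform(0.0, n)   # impostor feature drawn from the same density
        if abs(impostor - template) <= m / 2.0:
            inside += 1
    return inside / trials


closed_form = expected_confusion_uniform(n=100.0, m=2.0)
estimate = simulate_ec_uniform(n=100.0, m=2.0)
print(f"closed form = {closed_form:.4f}, simulated = {estimate:.4f}")
```

The small gap between the two numbers comes from templates near the edges of the population range, whose windows are partly cut off; that edge effect shrinks as m/n shrinks.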
Expected Confusion - Gaussian: Following the same analysis for the multidimensional Gaussian case, with one Gaussian density for the population and one for the measurement variation.
Expected Confusion - Gaussian: Following the same analysis for the multidimensional Gaussian case gives the probability that an impostor's feature vector is within the measurement variation of a target's template. This holds if the measurement variation is significantly less than the population variation.
Expected Confusion - Gaussian: Relationship to other metrics (mutual information): the negative natural log of the EC is the mutual information of two Gaussian densities.
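The sketch below illustrates the stated link between EC and mutual information. It assumes, by analogy with the uniform case, that the Gaussian EC is the ratio of the square roots of the covariance determinants, $|\Sigma_i|^{1/2} / |\Sigma_p|^{1/2}$; that closed form is an assumption here and does not appear in the text above. To stay dependency-free, the sketch further assumes diagonal covariances, so each determinant is a product of per-dimension variances. The function name and the numbers are hypothetical.

```python
import math


def expected_confusion_gaussian_diag(sigma_i, sigma_p):
    """Assumed EC for diagonal Gaussians: product of per-dimension std ratios."""
    ratio = 1.0
    for s_i, s_p in zip(sigma_i, sigma_p):
        ratio *= s_i / s_p   # measurement std over population std
    return ratio


# Hypothetical per-dimension standard deviations (2D feature vector).
sigma_measurement = [0.5, 1.0]
sigma_population = [5.0, 8.0]

ec = expected_confusion_gaussian_diag(sigma_measurement, sigma_population)
neg_log_ec = -math.log(ec)

# Under the assumed form, -ln(EC) = 0.5 * ln(|Sigma_p| / |Sigma_i|).
det_p = math.prod(s * s for s in sigma_population)
det_i = math.prod(s * s for s in sigma_measurement)
half_log_det_ratio = 0.5 * math.log(det_p / det_i)

print(f"EC = {ec:.4f}, -ln(EC) = {neg_log_ec:.4f}")
```

A smaller EC (tighter measurements relative to the population spread) therefore corresponds to a larger value of -ln(EC), in nats.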
Transformed Expected-Confusion: The probability that an impostor's feature vector is closer to a target's template than the target's own feature vector is.
Transformed Expected-Confusion: First, we find the probability that a target's feature vector is some distance k away from its template. [Figure: feature axis x showing the target and impostors for one template, with the distance k marked.]
Transformed Expected-Confusion: Second, we find the probability that an impostor's feature vector is at a distance less than or equal to k from that template.
Transformed Expected-Confusion: Therefore, the probability that an impostor's feature is closer to the target's template than the target's feature (for a distance k) is the product of these two probabilities.
Transformed Expected-Confusion: Therefore, the probability that an impostor's feature is closer to the target's template than the target's feature (for any distance k) is that product integrated over all k.
Transformed Expected-Confusion: Therefore, the expected value of this probability is taken over all targets' templates.
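The steps above can be written compactly as follows. The notation is an assumption introduced for illustration: $q$ is a target's template with density $p(q)$, $d_t$ is the distance of the target's feature vector from $q$, and $d_i$ is the distance of an impostor's feature vector from $q$. The sketch also assumes that $d_t$ and $d_i$ are independent given $q$.

```latex
% Assumed notation: q = template, d_t = target distance, d_i = impostor distance.
\begin{align}
  P(d_i \le d_t \mid q)
    &= \int_0^\infty p_{d_t}(k \mid q)\, P(d_i \le k \mid q)\, dk, \\
  \mathrm{EC}^*
    &= \int p(q) \int_0^\infty p_{d_t}(k \mid q)\, P(d_i \le k \mid q)\, dk\, dq .
\end{align}
```

The first line combines the "first" and "second" steps for one template and integrates over every distance k; the second line takes the expectation over all templates.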
Transformed Expected-Confusion: Next, replace the density of the distance between a target's feature vector and its template q.
Transformed Expected-Confusion: Answer: the probability that an impostor's feature vector is closer to a target's template than the target's own feature vector is.
Transformed Expected-Confusion: This probability can be shown to be one minus the area under a ROC curve, following the analysis of Green and Swets (1966).
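The relationship to the ROC curve can be checked numerically. The sketch below is a simulation under assumed 1D Gaussian population and measurement densities, not the analytic derivation: it estimates how often an impostor's probe is closer to a template than the target's probe, and compares that with 1 - AUROC computed from the pooled genuine and impostor distances. All function names and parameter values are hypothetical.

```python
import bisect
import random


def simulate_distances(num_trials, sigma_p, sigma_i, seed=0):
    """Draw paired (genuine, impostor) distances to a target's template."""
    rng = random.Random(seed)
    genuine, impostor = [], []
    for _ in range(num_trials):
        target_true = rng.gauss(0.0, sigma_p)       # target's underlying feature
        impostor_true = rng.gauss(0.0, sigma_p)     # impostor's underlying feature
        template = target_true + rng.gauss(0.0, sigma_i)        # gallery measurement
        target_probe = target_true + rng.gauss(0.0, sigma_i)    # target measured again
        impostor_probe = impostor_true + rng.gauss(0.0, sigma_i)
        genuine.append(abs(target_probe - template))
        impostor.append(abs(impostor_probe - template))
    return genuine, impostor


def closer_impostor_rate(genuine, impostor):
    """Fraction of trials where the impostor is at least as close as the target."""
    closer = sum(d_i <= d_g for d_g, d_i in zip(genuine, impostor))
    return closer / len(genuine)


def one_minus_auroc(genuine, impostor):
    """1 - AUROC, with AUROC = P(genuine distance < impostor distance) over all pairs."""
    sorted_imp = sorted(impostor)
    n_imp = len(sorted_imp)
    # For each genuine distance, count impostor distances strictly greater than it.
    wins = sum(n_imp - bisect.bisect_right(sorted_imp, d_g) for d_g in genuine)
    return 1.0 - wins / (len(genuine) * n_imp)


genuine, impostor = simulate_distances(50_000, sigma_p=1.0, sigma_i=0.1)
ec_star_estimate = closer_impostor_rate(genuine, impostor)
roc_estimate = one_minus_auroc(genuine, impostor)
print(f"impostor-closer rate = {ec_star_estimate:.4f}, 1 - AUROC = {roc_estimate:.4f}")
```

The two estimates agree up to sampling noise in this setting, which is the behavior the EC* = 1 - AUROC relationship predicts when the measurement variation is small compared with the population variation.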
Transformed Expected-Confusion: Integrate, with these assumptions.
Transformed Expected-Confusion: Integrate to obtain the probability that an impostor's feature vector is closer to a target's template than the target's own feature vector is.
Transformed Expected-Confusion: Compare EC* with 1 - AUROC: EC* = 1 - AUROC.
Conclusion: We derived two parametric measures. Expected Confusion (EC): the probability that an impostor's feature vector is within the measurement variation of a target's template. Transformed Expected-Confusion (EC*): the probability that an impostor's feature vector is closer to a target's template than the target's own feature vector is.
Future Work: Develop a mathematical model of the cumulative match characteristic (CMC) curve. Benefit: to predict how the CMC curve changes as more subjects are added.