1
On Evaluating Open Biometric Identification Systems
Spring 2005
Michael Gibbons
School of Computer Science & Information Systems
05/06/2005 CSIS © M. Gibbons
2
Overview
Introduction
– Biometric models, error rate types
– Motivation, hypothesis, approach
Visualizations in Pattern Classification
– Classifier overview
Experiments
Conclusions
3
Introduction
What is a biometric?
– A person’s biological characteristic, e.g., fingerprint, voice, iris or hand geometry
– A person’s behavioral characteristic, e.g., signature
Biometric applications have started drawing a lot of attention, but there is a danger:
– Biometrics is not 100% reliable
4
Biometric Models
Two types of biometric models:
– Verification systems: a user is identified by an ID or smart card and is verified by their biometric
– Identification systems: a user is identified by their biometric
Positive vs. negative models
5
Open vs. Closed Environments
Closed environment
– A system consisting of only the people in this room
Open environment
– A system consisting of the U.S. population
6
Error Rate Types
[Figures: errors in a Nearest Neighbor approach; errors in an SVM or ANN approach. Region labels: FR, FA(1), FA(2)]
7
The frequencies at which false accepts and false rejects occur are known as the False Accept Rate (FAR) and the False Reject Rate (FRR), respectively
These two error rates determine the two key performance measurements of a biometric system: convenience and security
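As a rough illustration of how these rates could be tallied for an identification system, the sketch below counts false accepts and false rejects over a list of attempts. The function name `error_rates`, the attempt encoding and the choice to normalize both rates by the total number of attempts are assumptions made for the example. Security as 1-FAR follows the Conclusions slide; convenience as 1-FRR is assumed by analogy.

```python
def error_rates(attempts):
    """Return (FAR, FRR) for a list of (true_id, decision) pairs.

    true_id is None for a non-member; decision is None when the system rejects.
    Assumption: both rates are normalized by the total number of attempts.
    """
    if not attempts:
        raise ValueError("need at least one attempt")
    false_accepts = 0
    false_rejects = 0
    for true_id, decision in attempts:
        if decision is None:
            if true_id is not None:
                false_rejects += 1   # a member was turned away
        elif decision != true_id:
            false_accepts += 1       # accepted under an identity that is not theirs
    total = len(attempts)
    return false_accepts / total, false_rejects / total


# Hypothetical attempts, for illustration only
attempts = [("a", "a"), ("a", None), ("b", "a"), (None, "b"), (None, None)]
far, frr = error_rates(attempts)
security = 1 - far       # the Conclusions slide gives security as 1-FAR
convenience = 1 - frr    # assumed by analogy, not stated in the slides
```

Here a false accept covers both a non-member accepted as a member and a member accepted as a different member, which is one plausible reading of the two FA labels on the previous slide.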
8
Motivation
Many researchers have claimed high identification accuracies on closed systems consisting of a few hundred or a few thousand members
– One may ask whether any real situations correspond to closed worlds
What happens to the security of these systems as the population becomes larger and open, i.e., as non-members are added?
This study considers the identification problem in an open environment
9
Hypothesis
Our hypothesis is that the accuracies reported for closed systems are relevant only to those systems and may not generalize well to larger, open systems containing non-members
– We claim that the classifier with the lowest error rate is not necessarily the best for security
10
Approach
Since it is impractical to test a true population, we use a reverse approach to support the hypothesis
– We work with a database M of m members, but assume a closed system of m' members, where m' < m, and train the system on the m' subset members
– We then have m - m' members to test how well the system holds up when non-members attempt to enter the system
This approach is used on two biometric databases, one consisting of writer data and the other of iris data
– First, we look at a visualization of pattern classification
11
Visualization of Pattern Classification
In this section we explore pattern classification in two dimensions in order to produce visualizations that help explain the decision boundaries of the following classifiers
– Nearest Neighbor (NN)
– Artificial Neural Network (ANN)
– Support Vector Machines (SVM)
12
Nearest Neighbor
To classify a test subject, the NN computes the distance from the test subject d to each member d_i of the database, and assigns the test subject to the member at the smallest distance
The distances can be computed using various methods, such as city-block distance or Euclidean distance
A threshold is used to provide reject capability
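The nearest-neighbor rule with a reject threshold can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the function names, the `(member_id, feature_vector)` database layout and the convention of returning `None` for a reject are all assumptions.

```python
import math


def city_block(a, b):
    """City-block (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Euclidean (L2) distance between two feature vectors."""
    return math.dist(a, b)


def nn_identify(sample, database, threshold=None, distance=euclidean):
    """Return the id of the closest member, or None if the sample is rejected.

    database is a list of (member_id, feature_vector) pairs.
    With threshold=None nothing is ever rejected (closed-system behaviour).
    """
    best_id, best_dist = None, math.inf
    for member_id, features in database:
        d = distance(sample, features)
        if d < best_dist:
            best_id, best_dist = member_id, d
    if threshold is not None and best_dist > threshold:
        return None  # too far from every member: reject
    return best_id


# Hypothetical two-member database in two dimensions
database = [("a", (0.0, 0.0)), ("b", (10.0, 0.0))]
```

Without a threshold every sample is assigned to some member, so any non-member is a false accept; tightening the threshold trades false accepts for false rejects.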
13
ANN (1-vs-all)
We chose a 1-vs-all implementation of the ANN to provide reject capability
The 1-vs-all approach becomes a series of dichotomy problems: class 1 vs. classes 2 and 3, class 2 vs. classes 1 and 3, and in general class x vs. all classes other than class x
– Any points that fall into more than one of the dichotomy decision regions will be rejected
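The rejection rule for combining the dichotomizers might look like the sketch below. The slide only states that points claimed by more than one dichotomizer are rejected; rejecting points claimed by none of them is an additional assumption here, as are the function name and the dictionary input format.

```python
def one_vs_all_decision(claims):
    """Combine 1-vs-all outputs into a single decision.

    claims maps each class label to True when that class's dichotomizer
    places the point inside its own decision region.
    Returns the label when exactly one class claims the point, else None.
    """
    claimed = [label for label, inside in claims.items() if inside]
    if len(claimed) == 1:
        return claimed[0]
    return None  # claimed by several classes, or (assumed) by none: reject
```

For example, `one_vs_all_decision({1: False, 2: True, 3: False})` accepts the point as class 2, while a point claimed by both class 1 and class 2 is rejected.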
14
Support Vector Machines (1-vs-all)
SVM is a pattern classification technique that has gained a lot of attention in recent years
The basic idea is to maximize the margin between the classes
– Maximizing the margin tends to provide better generalization than other pattern classifiers (for example, neural networks)
– The points that lie on the hyperplanes bounding the margin are called the support vectors
What if the data is non-separable?
– The power of the SVM lies in mapping functions that transform the feature space into a higher dimension. These mappings are computed through kernel functions
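As an illustration of a kernel, the sketch below evaluates the Gaussian (RBF) kernel K(x, y) = exp(-gamma * ||x - y||^2). The slides do not name the kernel that was used; RBF is assumed here only because a gamma parameter appears on a later slide. The function name is hypothetical.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# The same pair of points under the two gamma values from the SVM figure slide
wide = rbf_kernel((0.0, 0.0), (1.0, 1.0), gamma=2 ** -3)   # smoother boundary
narrow = rbf_kernel((0.0, 0.0), (1.0, 1.0), gamma=2 ** 2)  # tighter boundary
```

A larger gamma makes the kernel value fall off faster with distance, which tends to wrap the decision regions more tightly around the training points.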
15
Simple 3-Member Dataset
To produce the separated data, we choose three center points and randomly generate 200 points within radius r of the center points
The test points consist of all points (i, j) where i and j each range over 1, ..., 100
We then look at how the classifiers above classify the test points based on the separated training data sets
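A dataset of this kind could be generated as sketched below. The center coordinates and radius are hypothetical, the points are drawn uniformly over each disc, and 200 points are generated per center; the slide does not say whether 200 is per center or in total, so that is an assumption.

```python
import math
import random


def points_in_disc(center, radius, n, rng):
    """Draw n points uniformly at random from the disc around center."""
    cx, cy = center
    points = []
    for _ in range(n):
        rho = radius * math.sqrt(rng.random())   # sqrt gives uniform area density
        theta = rng.uniform(0.0, 2.0 * math.pi)
        points.append((cx + rho * math.cos(theta), cy + rho * math.sin(theta)))
    return points


def make_dataset(centers, radius, n_per_center=200, seed=0):
    """Return {class_label: list of points}, with labels 1, 2, 3, ..."""
    rng = random.Random(seed)
    return {
        label: points_in_disc(center, radius, n_per_center, rng)
        for label, center in enumerate(centers, start=1)
    }


# Hypothetical centers and radius, chosen so the three classes do not overlap
centers = [(25.0, 25.0), (75.0, 25.0), (50.0, 75.0)]
training = make_dataset(centers, radius=10.0)

# Test points: every (i, j) with i and j in 1..100
test_grid = [(i, j) for i in range(1, 101) for j in range(1, 101)]
```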
16
Nearest Neighbor
[Figure panels: training data; no threshold; threshold = 8; threshold = 2]
17
ANN (1-vs-all)
[Figure panels: training data; 1-vs-all]
18
SVM (1-vs-all)
[Figure panels: training data; gamma = 2^2; gamma = 2^-3]
19
All Classifiers
[Figure panels: training data; NN; ANN 1-vs-all; SVM]
20
Experiments
Our hypothesis is that biometric identification on closed systems does not generalize well to larger, open systems containing non-members
In the previous section we saw visualizations of various classifiers on a simple two-dimensional dataset
– We now investigate this hypothesis further by conducting experiments on subsets of database M from both the writer and iris databases
21
Biometric Databases
Two biometric databases are used to support our claims in this study: the writer and iris biometric databases
22
Experiment Setup
For each of the databases, training sets were created
– Training sets for the writer data consisted of 50, 100, 200 and 400 members
– Training sets for the iris data consisted of 5, 15, 25 and 35 members
– These sets included all instances per member, i.e., 3 per member for writer and 10 per member for iris
For each training set we created a combined evaluation set consisting of the trained members plus an increasing number of non-members
– The evaluation sets for the 50-writer trained SVM consisted of 50, 100, 200, 400, 700 and 841 subjects, where the first 50 subjects are the members and the remaining subjects are non-members
– Similarly, the evaluation sets for the 25-iris trained SVM consisted of 25, 35, 45 and 52 subjects, where the first 25 subjects are the members and the remaining subjects are non-members
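The growing evaluation sets can be sketched as follows, using the 50-writer case from this slide. The function name and the use of integer subject IDs are assumptions for the example; only the set sizes come from the slide.

```python
def evaluation_sets(subject_ids, n_members, set_sizes):
    """Split each evaluation set into (members, non_members).

    The first n_members subjects are the trained members; each evaluation
    set of a given size adds the subjects that follow them as non-members.
    """
    members = subject_ids[:n_members]
    return [(members, subject_ids[n_members:size]) for size in set_sizes]


# Hypothetical integer IDs standing in for the 841 writer subjects
writer_ids = list(range(1, 842))
writer_sets = evaluation_sets(writer_ids, 50, [50, 100, 200, 400, 700, 841])
non_member_counts = [len(non_members) for _, non_members in writer_sets]
```

The first evaluation set contains no non-members and so corresponds to the closed system; each later set exposes the same trained classifier to more non-members.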
23
SVM Results
[Figures: security for SVM on writer; security for SVM on iris]
24
NN Results
[Figure: security for NN on writer]
25
Observations
As hypothesized, for each curve, the security decreases monotonically as the number of non-members increases
– The final rates to which the security curves decrease also appear to converge
– To check that this is not an artifact of the particular handwriting data used, we obtained similar results using multiple classifiers on both the writer and iris data
26
Classifier Comparison
We now present a comparison of the results from the classifiers used in this experiment
– The next figures illustrate the security performance for 100 members of the writer database and 15 members of the iris database
Note that although Nearest Neighbor does not perform as well in the closed environment, it eventually meets and surpasses the performance of the SVM as non-members enter the system
27
Classifier Comparison
[Figures: SVM vs. NN on writer, consisting of 100 members; SVM vs. NN vs. ANN on iris, consisting of 15 members]
28
Security Convergence
Based on the security results in the previous figures, the curves appear to be of exponential form, so we may be able to extrapolate the security of a system to large populations containing non-members
– After some fitting trials, we find the most similar curve to be:
[Equation shown on the slide; not preserved in this transcript]
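Since the fitted equation is not available here, the sketch below assumes a generic decaying exponential with an asymptote, s(n) = a * exp(-b * n) + c, which may differ from the curve on the slide. For each candidate decay rate b, the values of a and c follow from ordinary least squares, and the best b is kept. The function name, the candidate grid and the security values are all hypothetical; the values are synthetic, generated from known parameters to show that the procedure recovers them, and are not results from the study.

```python
import math


def fit_exponential(xs, ys, b_candidates):
    """Fit y = a * exp(-b * x) + c and return (a, b, c).

    For a fixed b the model is linear in a and c, so they are found by
    simple linear regression of y on z = exp(-b * x). The b with the
    smallest squared error over the candidates is kept.
    """
    n = len(xs)
    best = None
    for b in b_candidates:
        zs = [math.exp(-b * x) for x in xs]
        mean_z, mean_y = sum(zs) / n, sum(ys) / n
        szz = sum((z - mean_z) ** 2 for z in zs)
        if szz == 0:
            continue  # degenerate candidate, no slope can be estimated
        a = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys)) / szz
        c = mean_y - a * mean_z
        error = sum((a * z + c - y) ** 2 for z, y in zip(zs, ys))
        if best is None or error < best[0]:
            best = (error, a, b, c)
    if best is None:
        raise ValueError("no usable candidate for b")
    return best[1], best[2], best[3]


# Non-member counts for the 50-writer evaluation sets; synthetic security values
non_members = [0, 50, 150, 350, 650, 791]
synthetic_security = [0.6 * math.exp(-0.01 * n) + 0.3 for n in non_members]
candidates = [k / 1000 for k in range(1, 101)]
a, b, c = fit_exponential(non_members, synthetic_security, candidates)
```

Under this assumed form, c is the asymptote, i.e., the security that the curve approaches as the number of non-members grows without bound.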
29
Security Convergence
[Figure]
30
Conclusions
System security (1-FAR) decreases rapidly for closed systems when they are tested in open-system mode
– Thus, the high accuracy rates often obtained for closed biometric identification problems do not appear to generalize well to the open-system problem
We also found that, although systems can be trained for greater closed-system security using SVM rather than NN classifiers, the NN systems generalize better to open systems
An estimate of the expected error was also projected from the asymptote of an exponential curve fitted to the data
31
Questions?