Robert S. Zack, Charles C. Tappert, and Sung-Hyuk Cha Pace University, New York Performance of a Long-Text-Input Keystroke Biometric Authentication System.


1 Performance of a Long-Text-Input Keystroke Biometric Authentication System Using an Improved k-NN Classification Method
Robert S. Zack, Charles C. Tappert, and Sung-Hyuk Cha
Pace University, New York
September 27, 2010

2 Contributions of this Research
• Evaluates authentication performance on a long-text-input keystroke biometric system
• Achieves high performance of about 1% equal error rate (EER) on a closed system of known users
• Shows how performance degrades as the system is opened to additional users
• Derives Receiver Operating Characteristic (ROC) curves directly from the k-NN classifier

3 Keystroke Biometric
• The keystroke biometric measures typing characteristics believed to be unique and difficult to duplicate
• A behavioral biometric that has not been extensively studied
• The keystroke biometric is appealing because it is:
  • Non-intrusive
  • Ubiquitous
  • Based on typing, which users do frequently for both work and pleasure
  • Inexpensive: requires only a computer and keyboard
• Commercial products exist for hardening passwords

4 Pace University System Uniqueness
• Focuses on long-text input
• Permits powerful statistical feature measurements
• Uses a powerful vector-difference authentication model
• Appropriate for security applications requiring:
  • Collection of raw keystroke data over the Internet
  • Arbitrary (free) text input
• Example application: authenticating online test takers. The federal Higher Education Opportunity Act (HEOA) requires greater student access control.

5 Vector-Difference Dichotomy Model
Transforms feature space into feature-vector-difference space. Two classes: within-class (same person) and between-class (different people).

6 Experimental Vector-Difference Data
• Mimics the process of true users and imposters trying to get authenticated
• Uses all possible vector differences from the model
• Example: the previous slide showed 3 samples from each of 3 users, which provides 9 within-class (same person) and 27 between-class (different people) vector-difference samples for experimentation (either training or testing)
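The 9 and 27 counts in the example can be checked with a minimal sketch. It assumes the difference between two samples is the element-wise absolute difference of their feature vectors (the slides do not state the exact operation), and the 2-feature vectors are made up purely for illustration.

```python
from itertools import combinations

def difference_vectors(samples):
    """samples: list of (user_id, feature_vector) pairs.
    Returns (within, between): difference vectors for same-user
    and different-user sample pairs."""
    within, between = [], []
    for (u1, x1), (u2, x2) in combinations(samples, 2):
        # Assumption: element-wise absolute difference
        diff = [abs(a - b) for a, b in zip(x1, x2)]
        (within if u1 == u2 else between).append(diff)
    return within, between

# 3 users with 3 samples each; the feature values are placeholders
samples = [(u, [float(u + s), float(u * s)]) for u in range(3) for s in range(3)]
within, between = difference_vectors(samples)
print(len(within), len(between))  # 9 27
```

Each user contributes C(3,2) = 3 within-class pairs, and the remaining 36 - 9 = 27 of the C(9,2) = 36 pairs are between-class.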

7 k-NN Classification Method k=5

8 Strong versus Weak Training
• Strong training
  • People used in testing are also used in training
  • But new difference vectors are used to test
  • For example, users provide 10 samples: 5 for training and 5 for testing
• Weak training
  • People used in testing are not used in training
  • Independent sets of users for testing and training
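The two protocols differ only in how the raw samples are partitioned before difference vectors are formed. A minimal sketch of both splits follows; the function names and the dictionary layout are hypothetical, chosen only for illustration.

```python
def strong_split(samples_by_user, n_train=5):
    """Strong training: every user appears in both sets,
    but with disjoint samples (e.g. 5 for training, 5 for testing)."""
    train = {u: s[:n_train] for u, s in samples_by_user.items()}
    test = {u: s[n_train:] for u, s in samples_by_user.items()}
    return train, test

def weak_split(samples_by_user, train_users):
    """Weak training: training users and test users are disjoint sets."""
    train = {u: s for u, s in samples_by_user.items() if u in train_users}
    test = {u: s for u, s in samples_by_user.items() if u not in train_users}
    return train, test

# Placeholder data: two users with 10 sample ids each
data = {"a": list(range(10)), "b": list(range(10, 20))}
strong_train, strong_test = strong_split(data)
weak_train, weak_test = weak_split(data, train_users={"a"})
```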

9 Strong Training Performance

10 Strong Training Performance
• Performance decreases as the population of users increases
  • Expected: the usual problem with biometric systems
• Performance increases with additional training, even when that training is from non-test users
  • The more training the better, since it populates the feature space

11 Weak Training Performance

12 Weak Training Performance
• Weak training performance is worse than strong training performance
  • Expected, since the strong system is trained on the tested users
• Performance increases with additional user training
  • The more training the better, since it populates the feature space

13 ROC Curves
• Receiver Operating Characteristic (ROC) curves are used to determine the appropriate operating point of a system, that is, the tradeoff between False Accept Rate (FAR) and False Reject Rate (FRR)
• Useful to compare biometric matchers
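For reference, the two error rates can be written out using their standard definitions. In the vector-difference setting, an imposter attempt corresponds to a between-class test sample and a genuine attempt to a within-class test sample; the symbols below are introduced here for illustration and do not appear in the slides.

```latex
\mathrm{FAR} = \frac{\text{number of between-class (imposter) samples accepted}}
                    {\text{number of between-class (imposter) samples}},
\qquad
\mathrm{FRR} = \frac{\text{number of within-class (genuine) samples rejected}}
                    {\text{number of within-class (genuine) samples}}
```

The equal error rate (EER) is the operating point at which FAR = FRR.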

14 Three ROC Derivation Methods
1. Unweighted: pure rank method
2. Weighted: rank method weighted by rank order
3. Hybrid method of rank and vector space distances: weighted vote based on distances to the kNN

15 Method 1 – Supreme Court Analogy
• Usual Supreme Court majority decision
• Usual majority NN procedure: nine NNs are evaluated, each with an equal vote. Authenticate on five or more within-class matches to the questioned sample.

16 Method 1 – Supreme Court Analogy
• Very conservative (unanimous) decision
• Nine NNs are evaluated, each with an equal vote. Authenticate if all nine samples match within-class to the questioned sample.

17 Method 1 – Supreme Court Analogy
• Conservative: 8, 7, or 6 votes needed for decision
• Nine NNs are evaluated, each with an equal vote. Authenticate if at least 8, 7, or 6 samples match within-class to the questioned sample.

18 Method 1 – Supreme Court Analogy
• Liberal: 4, 3, or 2 votes needed for decision
• Nine NNs are evaluated, each with an equal vote. Authenticate if at least 4, 3, or 2 samples match within-class to the questioned sample.

19 Method 1 – Supreme Court Analogy
• Very liberal: 1 or 0 votes needed for decision
• Nine NNs are evaluated, each with an equal vote. Authenticate if at least 1 or 0 samples match within-class to the questioned sample.

20 Summary of Method 1: m-kNN
• Pure rank method
• Unweighted
• Two parameters: k, m
• Evaluate the top k NN
• Authenticate if the number of within-class NN >= m

21 Method 1 Example (m-kNN)
k = 9
Provides 10 FRR-FAR pairs: m = 0, 1, …, 9
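The sweep over m can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: it assumes Euclidean distance in difference-vector space, and the 1-D training and test vectors are made up so that within-class differences are small and between-class differences are large.

```python
import math

def within_votes(train, query, k):
    """Count within-class labels among the k nearest training difference vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return sum(1 for _, label in nearest if label == "within")

def far_frr(train, test, k, m):
    """Return (FAR, FRR) when a sample is accepted if at least m of its k NN are within-class."""
    false_accepts = false_rejects = n_between = n_within = 0
    for vec, label in test:
        accepted = within_votes(train, vec, k) >= m
        if label == "within":
            n_within += 1
            false_rejects += not accepted
        else:
            n_between += 1
            false_accepts += accepted
    return false_accepts / n_between, false_rejects / n_within

# Toy 1-D difference vectors (placeholders, not real keystroke data)
train = [((0.1 * i,), "within") for i in range(10)] + \
        [((1.0 + 0.1 * i,), "between") for i in range(10)]
test = [((0.05,), "within"), ((0.45,), "within"), ((0.95,), "within"),
        ((0.60,), "between"), ((1.25,), "between"), ((1.85,), "between")]

pairs = [far_frr(train, test, k=9, m=m) for m in range(10)]  # m = 0, 1, ..., 9
for m, (far, frr) in enumerate(pairs):
    print(f"m={m}: FAR={far:.2f} FRR={frr:.2f}")
```

With m = 0 every sample is accepted, so that end of the sweep always gives FAR = 1 and FRR = 0; raising m can only lower FAR and raise FRR.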

22 Summary of Method 2: wm-kNN
• Linear rank weighting
• Two parameters: k and the weighted threshold wm
• Evaluate the top k NN
• Max score for k = 9 is k(k+1)/2 = 9×10/2 = 45
• 1st choice has weight 9, 2nd has weight 8, ..., 9th has weight 1
• Authenticate if ∑ weights of within-class choices >= wm
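The rank-weighted score is easy to compute once the k nearest neighbors are sorted nearest-first. The sketch below is illustrative; the function name and the label strings are hypothetical.

```python
def weighted_rank_score(ranked_labels):
    """ranked_labels: class labels of the k nearest neighbours, nearest first.
    The nearest neighbour has weight k, the next k-1, ..., the last 1."""
    k = len(ranked_labels)
    return sum(k - i for i, label in enumerate(ranked_labels) if label == "within")

# k = 9: within-class neighbours at ranks 1, 2, and 4
labels = ["within", "within", "between", "within", "between",
          "between", "between", "between", "between"]
score = weighted_rank_score(labels)           # 9 + 8 + 6 = 23
max_score = weighted_rank_score(["within"] * 9)  # 9 * 10 / 2 = 45
authenticated = score >= 20                   # example threshold wm = 20
```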

23 Summary of Method 3: hd-kNN
• Weighted vote based on Dempster-Shafer distances to the questioned sample in feature space
• Two parameters: k, threshold
• For each test sample, the within-class weight (WCW) is calculated based on the vector distances
• The sum of within-class weights is compared to a threshold
• A user is authenticated if this sum for the questioned sample is >= threshold
• Threshold of 0 authenticates all users: FAR=1, FRR=0
• Threshold of 1 authenticates very few: FAR=0, FRR=1
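The slides do not give the weighting formula, so the following sketch substitutes plain inverse-distance weights normalized to sum to 1 over the k neighbors. That choice is an assumption made only so the within-class weight falls in the 0 to 1 range the thresholds imply; it is not the authors' formula. Euclidean distance and the helper names are likewise assumptions.

```python
import math

def within_class_weight(train, query, k, eps=1e-9):
    """Share of normalized inverse-distance weight held by within-class
    neighbours among the k nearest (assumed weighting, see lead-in)."""
    nearest = sorted(((math.dist(vec, query), label) for vec, label in train),
                     key=lambda pair: pair[0])[:k]
    weights = [(1.0 / (d + eps), label) for d, label in nearest]  # eps avoids division by zero
    total = sum(w for w, _ in weights)
    return sum(w for w, label in weights if label == "within") / total

def authenticate(train, query, k, threshold):
    """Accept when the within-class weight reaches the threshold."""
    return within_class_weight(train, query, k) >= threshold

# Toy example: one within-class and one between-class neighbour
train = [((0.0,), "within"), ((1.0,), "between")]
wcw = within_class_weight(train, (0.25,), k=2)  # about 0.75
```

A threshold of 0 accepts every sample under this weighting, which matches the behavior the slide describes for that end of the range.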

24 ROC Curves
[Figure: ROC curves from the kNN classifier with k=21: method m-kNN (left), method wm-kNN (center), and method hd-kNN (right).]

25 FAR and FRR versus threshold
[Figure: Closed 14-14 system, kNN classifier with k=21: FAR and FRR versus threshold for method m-kNN (left), wm-kNN (center), and hd-kNN (right).]
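The EER is read off such plots at the threshold where the FAR and FRR curves cross. With only a finite set of thresholds, one simple approximation is to take the pair where the two rates are closest and average them. The sketch below uses made-up values and is not the procedure used in the study.

```python
def approximate_eer(pairs):
    """pairs: list of (FAR, FRR) tuples, one per threshold.
    Returns the mean of FAR and FRR at the pair where they are closest."""
    far, frr = min(pairs, key=lambda p: abs(p[0] - p[1]))
    return (far + frr) / 2

# Placeholder operating points, ordered from most liberal to most conservative
points = [(1.0, 0.0), (0.4, 0.1), (0.2, 0.2), (0.05, 0.5), (0.0, 1.0)]
eer = approximate_eer(points)
```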

26 Conclusions
• Keystroke password performance: approximately 10% EER
  • See the extensive study by Killourhy & Maxion, 2009
  • Product advertised performance is exaggerated
• Keystroke long-text performance: approximately 1% EER
  • Reasonable considering powerful statistical features
  • Closed system performance is better than open system performance
• Three ROC curve derivation methods developed for the kNN method
  • All are two-parameter methods: k plus a threshold

27 Questions

