Robert S. Zack, Charles C. Tappert, and Sung-Hyuk Cha Pace University, New York Performance of a Long-Text-Input Keystroke Biometric Authentication System.

Performance of a Long-Text-Input Keystroke Biometric Authentication System Using an Improved k-NN Classification Method
Robert S. Zack, Charles C. Tappert, and Sung-Hyuk Cha
Pace University, New York
September 27, 2010

Contributions of this Research
- Evaluates authentication performance on a long-text-input keystroke biometric system
- Achieves high performance of about 1% EER on a closed system of known users
- Shows how performance degrades as the system is opened to additional users
- Derives Receiver Operating Characteristic (ROC) curves directly from the k-NN classifier

Keystroke Biometric
- Keystroke biometric measures typing characteristics believed to be unique and difficult to duplicate
- Behavioral biometric not extensively studied
- Keystroke biometric is appealing because:
  - Non-intrusive
  - Ubiquitous
  - Users type frequently for both work and pleasure
  - Inexpensive – requires only computer and keyboard
- Commercial products for hardening passwords

Pace University System Uniqueness
- Focuses on long-text input
- Permits powerful statistical feature measurements
- Uses powerful vector-difference authentication model
- Appropriate for security applications requiring:
  - Collection of raw keystroke data over the Internet
  - Arbitrary (free) text input
- Example application: authenticating online test takers
  - Federal Higher Education Opportunity Act (HEOA) requires greater student access control

Vector-Difference Dichotomy Model
Transforms feature space into feature-vector-difference space.
Two classes: within-class (same person) and between-class (different people).

Experimental Vector-Difference Data
- Mimics the process of true users and imposters trying to get authenticated
- Uses all possible vector-differences from model
- Example: previous slide showed 3 samples from each of 3 users, which provides 9 within-class (same person) and 27 between-class (different people) vector-difference samples for experimentation (either training or testing)
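The 9 and 27 counts in the example above can be checked with a small sketch. It assumes that a vector-difference is the element-wise absolute difference between two feature vectors, which the slides do not spell out; the function name `difference_vectors` and the toy feature values are made up for illustration.

```python
from itertools import combinations

def difference_vectors(samples):
    """Split all pairwise differences into within-class and between-class.

    samples: list of (user_id, feature_vector) pairs.
    Assumption: a difference vector is the element-wise absolute difference.
    """
    within, between = [], []
    for (user_a, vec_a), (user_b, vec_b) in combinations(samples, 2):
        diff = [abs(a - b) for a, b in zip(vec_a, vec_b)]
        # Same user on both sides -> within-class, otherwise between-class.
        (within if user_a == user_b else between).append(diff)
    return within, between

# Toy data: 3 users, 3 samples each, 2 invented features per sample.
samples = [(user, [user * 10.0 + s, user * 5.0 - s])
           for user in range(3) for s in range(3)]
within, between = difference_vectors(samples)
print(len(within), len(between))  # 9 27
```

Each user contributes 3 choose 2 = 3 within-class pairs, and the remaining 36 - 9 = 27 of the 9 choose 2 pairs are between-class.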

k-NN Classification Method k=5

Strong versus Weak Training
- Strong Training
  - People used in testing also used in training
  - But new difference vectors used to test
  - For example, users provide 10 samples: 5 for training and 5 for testing
- Weak Training
  - People used in testing not used in training
  - Independent sets of users for testing and training

Strong Training Performance
- Performance decreases as the population of users increases
  - Expected: the usual problem with biometric systems
- Performance increases with additional training, even when that training is from non-test users
  - The more training the better; it populates the feature space

Weak Training Performance
- Weak training performance is worse than strong training performance
  - Expected, since the strong system is trained on the tested users
- Performance increases with additional user training
  - The more training the better; it populates the feature space

ROC Curves
- Receiver Operating Characteristic (ROC) curves are used to determine the appropriate operating point of a system, the tradeoff between False Accept Rate (FAR) and False Reject Rate (FRR)
- Useful to compare biometric matchers

Three ROC Derivation Methods
1. Unweighted - Pure rank method
2. Weighted - Rank method weighted by rank order
3. Hybrid method of rank and vector space distances - Weighted vote based on distances to the kNN

Method 1 – Supreme Court Analogy
Nine NNs are evaluated, each with an equal vote. The number of within-class matches to the questioned sample that is required to authenticate sets the operating point:
- Usual Supreme Court majority decision (the usual majority NN procedure): authenticate on five within-class matches
- Very conservative (unanimous) decision: authenticate only if all nine samples match within-class
- Conservative: authenticate if 8, 7, or 6 samples match within-class
- Liberal: authenticate if 4, 3, or 2 samples match within-class
- Very liberal: authenticate if 1 or 0 samples match within-class

Summary of Method 1: m-kNN
- Pure rank method
- Unweighted
- Two parameters: k, m
- Evaluate the top k NN
- Authenticate if the number of within-class matches among the top k is >= m

Method 1 Example (m-kNN)
k = 9
Provides 10 FRR-FAR pairs: m = 0, 1, …, 9
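A minimal sketch of how sweeping m yields the FRR-FAR pairs of an ROC curve. The trial data, the "W"/"B" labels for within-class and between-class neighbours, and the function names are all hypothetical, chosen only to illustrate the counting rule.

```python
def m_knn_accept(neighbor_labels, m):
    """Authenticate if at least m of the nearest neighbours are within-class ("W")."""
    return sum(1 for label in neighbor_labels if label == "W") >= m

def far_frr_pairs(trials, k):
    """trials: list of (is_genuine, neighbor_labels), labels ordered nearest first.

    Returns one (m, FAR, FRR) tuple for each m = 0, 1, ..., k.
    """
    genuine = [labels[:k] for is_genuine, labels in trials if is_genuine]
    impostor = [labels[:k] for is_genuine, labels in trials if not is_genuine]
    pairs = []
    for m in range(k + 1):
        # FRR: fraction of genuine attempts that are rejected.
        frr = sum(not m_knn_accept(lab, m) for lab in genuine) / len(genuine)
        # FAR: fraction of impostor attempts that are accepted.
        far = sum(m_knn_accept(lab, m) for lab in impostor) / len(impostor)
        pairs.append((m, far, frr))
    return pairs

# Invented toy trials with k = 9 neighbours each.
trials = [
    (True, list("WWWWWWWWW")),   # genuine, 9 within-class neighbours
    (True, list("WWWWWWBBB")),   # genuine, 6 within-class neighbours
    (False, list("BBBBBBBBB")),  # impostor, 0 within-class neighbours
    (False, list("WWBBBBBBB")),  # impostor, 2 within-class neighbours
]
pairs = far_frr_pairs(trials, k=9)
for m, far, frr in pairs:
    print(m, far, frr)
```

With m = 0 every attempt is accepted (FAR = 1, FRR = 0); raising m trades false accepts for false rejects, which traces out the curve.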

Summary of Method 2: wm-kNN
- Linear rank weighting
- Two parameters: k and the weighted-match threshold (wm, written t below)
- Evaluate the top k NN
- Max score for k = 9 is k(k+1)/2 = 9 × 10 / 2 = 45
- 1st choice has weight 9, 2nd has weight 8, ..., 9th has weight 1
- Authenticate if ∑ weights of within-class choices >= t
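The linear rank weighting can be sketched as follows. The "W"/"B" labels and the function names are hypothetical; only the weights (k for the nearest neighbour down to 1 for the k-th) and the threshold test come from the slide.

```python
def wm_knn_score(neighbor_labels):
    """Sum of rank weights over within-class ("W") neighbours.

    neighbor_labels is ordered nearest first; the nearest neighbour has
    weight k and the k-th nearest has weight 1.
    """
    k = len(neighbor_labels)
    return sum(k - rank for rank, label in enumerate(neighbor_labels)
               if label == "W")

def wm_knn_accept(neighbor_labels, t):
    """Authenticate if the weighted within-class score reaches threshold t."""
    return wm_knn_score(neighbor_labels) >= t

labels = list("WWBWBBBBB")              # within-class at ranks 1, 2 and 4
print(wm_knn_score(labels))             # 9 + 8 + 6 = 23
print(wm_knn_score(list("W" * 9)))      # maximum score: 45
```

Sweeping t from 0 to 45 gives a finer-grained set of operating points than the 10 available from the unweighted count.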

Summary of Method 3: hd-kNN
- Weighted vote based on Dempster-Shafer distances to the questioned sample in feature space
- Two parameters: k, threshold
- For each test sample, the within-class weight (WCW) is calculated based on the vector distances
- The sum of within-class weights is compared to a threshold
- A user is authenticated if this sum for the questioned sample is >= threshold
- Threshold of 0 authenticates all users: FAR=1, FRR=0
- Threshold of 1 authenticates very few: FAR near 0, FRR near 1
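The slides do not give the weighting formula, so the sketch below substitutes a simple stand-in: each neighbour's weight is the inverse of its distance, normalized so that the weights of all k neighbours sum to 1. That choice, the small `eps` guard, and the function names are assumptions for illustration, not the authors' method.

```python
def within_class_weight(neighbors, eps=1e-9):
    """Normalized inverse-distance weight of the within-class neighbours.

    neighbors: list of (distance, label) with label "W" (within-class)
    or "B" (between-class). The result lies in [0, 1].
    Assumption: inverse-distance weighting stands in for the slide's
    unspecified distance-based weights.
    """
    weights = [1.0 / (distance + eps) for distance, _ in neighbors]
    within = sum(w for w, (_, label) in zip(weights, neighbors) if label == "W")
    return within / sum(weights)

def distance_vote_accept(neighbors, threshold):
    """Authenticate if the within-class weight reaches the threshold."""
    return within_class_weight(neighbors) >= threshold

neighbors = [(1.0, "W"), (2.0, "B"), (4.0, "W")]
score = within_class_weight(neighbors)
print(round(score, 3))                        # 0.714
print(distance_vote_accept(neighbors, 0.0))   # True: threshold 0 accepts everyone
```

Because the score is continuous, the threshold can be swept in arbitrarily small steps between 0 and 1.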

ROC Curves
ROC curves from the kNN classifier with k=21: method m-kNN (left), method wm-kNN (center), and method hd-kNN (right).

FAR and FRR versus threshold
Closed 14-14 system, kNN classifier with k=21: FAR and FRR versus threshold for method m-kNN (left), wm-kNN (center), and hd-kNN (right).

Conclusions
- Keystroke password performance: approximately 10% EER
  - See extensive study by Killourhy & Maxion, 2009
  - Product advertised performance is exaggerated
- Keystroke long-text performance: approximately 1% EER
  - Reasonable considering powerful statistical features
  - Closed system performance is better than open system performance
- Three ROC curve derivation methods developed for the kNN method
  - All are two-parameter methods: k plus a threshold

Questions
