Keystroke Biometrics Studies on a Variety of Short and Long Text and Numeric Input Ned Bakelman, DPS Candidate Charles C. Tappert, PhD, Advisor Seidenberg.

Presentation transcript:

Keystroke Biometrics Studies on a Variety of Short and Long Text and Numeric Input Ned Bakelman, DPS Candidate Charles C. Tappert, PhD, Advisor Seidenberg School of Computer Science and Information Systems Pace University White Plains, NY 10606, USA DPS Defense April 11, 2014

Research Questions
This study focuses on biometric authentication using long bursts of arbitrary input and short bursts of fixed input with an improved classification system.
Long Input: 100 – 1500 characters (paragraph, couple of sentences, etc.)
Short Input: 10 – 15 characters (password, pass code, etc.)
Arbitrary Input: open, unrestricted text (of the user's choosing)

Research Questions (continued)
Purpose of the Study
Long Input – Unauthorized User Detection
  1) Can we accurately detect intruder use of a computer system in an office environment?
  2) How does the use of standard applications such as word processing, spreadsheet, and browser impact intruder detection?
  3) Is an intruder still detectable when using a web browser (a low-text environment)?
Short Keypad Input – Detection Accuracy
  1) What is the detection accuracy of short fixed numeric keypad input?
  2) Does the use of specific keypad features improve detection accuracy?
Classifier Comparison – Multi Match vs. Single Match
  1) How does accuracy compare between the two?
  2) Which performs better on long input?
  3) Which performs better on short input?

Background (Source of Features or Attributes)
Derived from raw timing data
Based on key press duration and transition times, also known as dwell and flight time
Statistical in nature, mainly means and standard deviations
Pre-processing to remove outliers and standardize between 0 – 1
Fallback procedure
T. Olzak, Keystroke Dynamics: Low Impact Biometric Verification, Sep. 2006
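The feature pipeline on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event format, the function names, the 2-sigma outlier rule, and the definition of flight time as release-to-next-press are all hypothetical choices, not details taken from the study.

```python
from statistics import mean, pstdev


def dwell_and_flight(events):
    """events: list of (key, press_ms, release_ms) tuples in typing order."""
    # Dwell (duration): how long each key is held down.
    dwell = [release - press for _, press, release in events]
    # Flight (transition): assumed here to run from one key's release
    # to the next key's press.
    flight = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    return dwell, flight


def remove_outliers(values, n_sigma=2.0):
    """Drop values more than n_sigma standard deviations from the mean (assumed rule)."""
    if len(values) < 2:
        return list(values)
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) <= n_sigma * sigma]


def min_max(value, lo, hi):
    """Standardize a measurement into the range 0 to 1, clamping at the ends."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


# Hypothetical keypad entry "123" followed by Enter, times in milliseconds.
events = [("1", 0, 90), ("2", 150, 230), ("3", 300, 400), ("Enter", 520, 600)]
dwell, flight = dwell_and_flight(events)
features = {
    "dwell_mean": mean(dwell),
    "dwell_std": pstdev(dwell),
    "flight_mean": mean(flight),
    "flight_std": pstdev(flight),
}
```

Each statistic would then be passed through `min_max` with bounds chosen from the training data before it enters the feature vector.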

Background (continued) (Target of Features or Attributes)
[Figures: QWERTY keyboard and Numeric Keypad; Wikipedia.org, last updated March 6, 2012]
Separate features for QWERTY and Keypad
Durations and transitions for individual keys, groups of keys, etc.
QWERTY: each letter, each number, vowels, consonants, all letters, etc.
Keypad: each digit, each operator (+ - * /), all digits, all operators, etc.

Background (continued) (Pace Classifier: Single Match)
Dichotomy Model
Uses vector differences
Transforms a multi-class problem to a two-class problem
K-Nearest Neighbor (k-NN) is used for classification
Feature Vector Space: 3 subjects, 4 samples
Feature Difference Space: 18 within, 48 between
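The counts on this slide follow from pairing every sample with every other sample. The sketch below reproduces them for 3 subjects with 4 samples each; the use of element-wise absolute differences and the toy feature vectors are assumptions made for illustration.

```python
from itertools import combinations


def difference_space(samples):
    """samples: list of (subject_id, feature_vector) pairs.

    Returns (within, between): difference vectors for same-subject pairs
    and for different-subject pairs.
    """
    within, between = [], []
    for (subj_a, vec_a), (subj_b, vec_b) in combinations(samples, 2):
        # Element-wise absolute difference (an assumed distance form).
        diff = [abs(x - y) for x, y in zip(vec_a, vec_b)]
        if subj_a == subj_b:
            within.append(diff)
        else:
            between.append(diff)
    return within, between


# 3 subjects, 4 samples each, with made-up two-dimensional feature vectors.
samples = [(s, [float(s), float(i)]) for s in range(3) for i in range(4)]
within, between = difference_space(samples)
print(len(within), len(between))  # 18 48
```

The multi-class question "which subject typed this?" becomes the two-class question "is this difference vector within-class or between-class?", which is what the k-NN step then decides.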

Background (continued) (Pace Classifier: Multi Match)
Authentication Process
User Focused Reduction Method (reduces the training space)
System performance obtained using the Leave-One-Out method
“Left out” test sample is used to create differences of different vectors
Each test difference is classified (k-NN)
Results are grouped together
Authentication decision based on all
Feature Vector Space: 3 subjects, 4 samples
Feature Difference Space: 18 within, 48 between
Feature Reduction Space: 6 within, 32 between
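One reading of the user-focused reduction that is consistent with the slide's counts is to keep only the pairs that involve the claimed user. The sketch below checks that reading for 3 subjects with 4 samples each; the function name and the labels are hypothetical, and the study's actual reduction procedure may differ in detail.

```python
from itertools import combinations


def user_focused_pairs(labels, user):
    """Count pairs kept when the training space is reduced to one claimed user.

    within:  both samples belong to the user
    between: exactly one sample belongs to the user
    Pairs that involve only other subjects are discarded.
    """
    within = between = 0
    for a, b in combinations(labels, 2):
        if a == user and b == user:
            within += 1
        elif a == user or b == user:
            between += 1
    return within, between


# 3 subjects (A, B, C) with 4 samples each.
labels = [subject for subject in ("A", "B", "C") for _ in range(4)]
print(user_focused_pairs(labels, "A"))  # (6, 32)
```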

Background (continued) (Performance: ROC Curves, Equal Error Rate)
Receiver Operating Characteristic (ROC) Curves
Historically used in signal detection, such as radar, to distinguish an actual signal from noise
Used in biometrics to plot the False Accept Rate (FAR) and False Reject Rate (FRR) at various operating points (thresholds)
Equal Error Rate (EER)
The point on the ROC curve where the FAR and FRR are equal
The operating point on the ROC curve where the FAR and FRR intersect
[Figures: ROC Curve; FAR / FRR Intersection]
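A minimal sketch of how FAR, FRR, and the EER relate, assuming each authentication attempt yields a similarity score and is accepted when the score reaches the threshold. The scores are made up, and taking the threshold with the smallest FAR/FRR gap is one common approximation rather than the method used in the study.

```python
def far_frr(genuine, impostor, threshold):
    """Accept when score >= threshold.

    FAR: fraction of impostor attempts accepted.
    FRR: fraction of genuine attempts rejected.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Scan candidate thresholds and return (eer, threshold) where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Made-up similarity scores for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
print(eer, threshold)  # 0.25 0.6
```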

Data Collection (Numeric Keypad)
Subjects: 30
Sessions: 20 per subject
Only “perfect” samples were used (no mistakes)
Rest period of at least one day between sessions
Data entered into a spreadsheet using the right hand

Features (Feature Attribute Summary)
Columns: Attributes, Mean (µ), Standard Deviation (σ), Total
QWERTY (Non-Numeric): Durations; Transitions (per Type I and II)
QWERTY (Numeric): Durations: 27 54; Transitions (per Type I and II)
Keypad: Durations: 29 58; Transitions (per Type I and II)
Totals:

Features (Keypad Durations)
Numeric Keypad: Digits with Decimal; Arithmetic Operators with Num Lock and Enter (Num Lock, Enter, /, *, -, +)
Print Screen, Sys Rq, Scroll Lock, Pause, Break
Centerpad: Home, Page Up, Page Dn, End, Del, Ins
Four Arrows
All Keys

Features (continued) (Keypad Transitions)
Any Key -> Any Key
keypad -> keypad
any digit -> any digit
Each digit -> each digit: 1->1,2,3…0; 2->1,2,3…0; 3->1,2,3…0; 4->1,2,3…0; 5->1,2,3…0; 6->1,2,3…0; 7->1,2,3…0; 8->1,2,3…0; 9->1,2,3…0; 0->1,2,3…0
Each digit -> digits: 1->digits; 2->digits; 3->digits; 4->digits; 5->digits; 6->digits; 7->digits; 8->digits; 9->digits; 0->digits
Any Digit -> Arithmetic Operators
Each digit -> Arithmetic Operators: 1, 2, 3, 4, 5, 6, 7, 8, 9, 0
Arithmetic Operator -> any digit
Each operator -> digits: div->digits; mult->digits; sub->digits; add->digits

Results – Short Input Experiments (Equal Error Rate for each keypad experiment per Classifier)
[Figure: Multi Match vs. Single Match for the 10 Subject, 20 Subject, and 30 Subject experiments]

Results – Short Input Experiments (continued) (ROC Curve for each keypad experiment per Classifier)
[Figures: Multi Match Classifier; Single Match Classifier]
Legend: 10 Subjects, 20 samples each; 20 Subjects, 20 samples each; 30 Subjects, 20 samples each

Results – Short Input Experiments (continued) (Independent Variables for the short input experiments)
Numeric Keypad
Subjects                        10        20        30
Samples per Subject             20        20        20
Total Samples (All Subjects)    200       400       600
EER % (Multi Match)             5.50%     5.65%     6.14%
EER % (Single Match)            15.56%    15.72%    14.95%
EER Improvement %               64.65%    64.06%    58.93%
Independent Variable 1: Number of Subjects
Independent Variable 2: Classifier
Conclusion 1: EER increases (but not by much) as Number of Subjects increases *
Conclusion 2: New Classifier much better than Old Classifier
* Except for old Classifier
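The slide does not state how "EER Improvement %" is computed. Assuming it is the relative reduction in EER from Single Match to Multi Match, the short check below reproduces the three figures reported for the keypad experiments; the function name is hypothetical.

```python
def eer_improvement(single_match_eer, multi_match_eer):
    """Relative EER reduction, in percent, going from Single Match to Multi Match."""
    return (single_match_eer - multi_match_eer) / single_match_eer * 100


# (Multi Match EER %, Single Match EER %) for the three keypad experiments.
reported = [(5.50, 15.56), (5.65, 15.72), (6.14, 14.95)]
improvements = [round(eer_improvement(single, multi), 2) for multi, single in reported]
print(improvements)  # [64.65, 64.06, 58.93]
```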

CMU Experiment - Keypad (Features Set Comparison – CMU vs. PaceU)
Enter Key = 11 Characters
Carnegie Mellon Features (from their numeric keypad study *)
  10 key-down ---> key-down
  10 key-up ---> key-down
  11 dwell times
  31 Features
Pace University Features (from our numeric keypad study)
  (10 key-down ---> key-down) per µ, per σ = 20
  (10 key-up ---> key-down) per µ, per σ = 20
  (7 dwell) per µ, per σ = 14
  54 Timing Features
* R. Maxion and K. Killourhy, "Keystroke Biometrics with Number-Pad Input," 2010 IEEE/IFIP International Conference on Dependable Systems & Networks (DSN), Chicago, IL, 2010, pp

CMU Experiment – Keypad (continued) (Equal Error Rate and ROC Curves only using Multi Match)
[Figures: Equal Error Rate and ROC Curves; PU Data with CMU Features; PU Features vs. CMU Features]

CMU Experiment – Keypad (continued) (Independent Variable for the CMU Keypad experiment)
Numeric Keypad (30 – 20)
Features Set                    CMU       PU
Subjects                        30        30
Samples per Subject             20        20
Total Samples (All Subjects)    600       600
EER % (Multi Match)             10.47%    6.14%
EER Improvement %               41.36%
Independent Variable: Feature Set
Conclusion: PU Feature Set outperformed CMU Feature Set

Conclusions
Long Input Experiments – Intruder Detection Accuracy
  Keystroke biometrics can be effective at detecting the unauthorized use of a computer system in a closed environment (government office, school, business office, etc.)
  Performance varied with input type:
    Spreadsheet: Good Performance (EER: 8.1%)
    Text: Very Good Performance (EER: 5.8%)
    Browser: Fair Performance (EER: 15.7%)
Short Input Experiments – Detection Accuracy
  Numeric Keypad yields very good performance (EER Range: 5.5% - 6.2%)
  PaceU Features Set is effective: CMU features performed much worse (10.5% vs. 6.2%)
Classifier Comparison – Multi Match vs. Single Match
  1) Multi Match outperformed Single Match significantly (EER Improvement from 50% - 64%)
  2) Multi Match outperformed the detector study from CMU using their data and features (EER: 7.6%)

Conclusions (continued)
Long Input Performance: Weaker performance compared to previous studies at PU… Why?
  Less optimal samples
  No designated entry window for sample collection (less control over quality of entry)
  Large fluctuations in the number of keystrokes
  Input types most likely had substantial mouse activity that “interrupts” keystroke entry
  Possible sparseness of keystrokes (meaning less concentrated and spread out, especially with browser entry)
Future Considerations: Do keystroke counts tell the whole story?
  Propose that correlating performance simply to the number of keystrokes is not sufficient
  Need to factor in the density of the keystrokes as well
  Simply stated: it may take a lot more keystrokes to maintain an effective level of performance if the sparseness is high

Suggestions for Future Work
Further studies on numeric entry from QWERTY
  Compare performance to numeric entry from keypad
  Study free text entry from keypad
Feature Analysis
  Which features contributed to performance from the keypad?
  How do equivalent numeric features from QWERTY perform compared to keypad?
Perform mixed mode experiments
  Collect input that combines spreadsheet, browser, and text
  Collect spreadsheet input which includes all numeric entry from keypad
Incorporate Multi Biometric
  Keystroke + Mouse Movement + Stylometry

Backup Slides

Generate ROC Curves from kNN Data (vary m from 0 to k [m is the controlling or threshold parameter])
R. Zack, C. Tappert, and S. Cha, "Performance of a Long-Text-Input Keystroke Biometric Authentication System Using an Improved k-Nearest-Neighbor Classification Method," IEEE 4th Int Conf Biometrics (BTAS 2010), Washington D.C.
The m-kNN procedure with k = 9 and m = 5
For each Q (questioned) test sample:
  Examine the top k nearest neighbors and count the number of within-class matches
  If the number of within-class matches >= a threshold of matches (m), the user is authenticated; otherwise the user is rejected
Generate the ROC curve as follows:
  Vary m from 0 to k and calculate FAR / FRR in each of the following cases:
  m = 0: authenticate if 0 or more of the k choices are within
  m = 1: authenticate if 1 or more of the k choices are within
  and so on until m = 9 in this case
Linear Rank Weighting Method:
  1st choice weight = k, 2nd choice weight = k-1, … weight = 1
  Authenticate a user if the sum of the weighted within-class choices >= the m threshold
  Threshold varies from 0 to k(k+1)/2 (maximum score)
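The procedure above can be sketched as follows. Each trial is assumed to be a list of k booleans ordered from nearest to farthest neighbor, where True marks a within-class neighbor; this trial format, the function names, and the toy data are illustrative assumptions, not the implementation from the cited paper.

```python
def mknn_accept(neighbor_is_within, m):
    """m-kNN decision: authenticate if at least m of the k neighbors are within-class."""
    return sum(neighbor_is_within) >= m


def weighted_score(neighbor_is_within):
    """Linear rank weighting: 1st choice weighs k, 2nd weighs k-1, ..., last weighs 1."""
    k = len(neighbor_is_within)
    return sum(k - rank for rank, is_within in enumerate(neighbor_is_within) if is_within)


def roc_points(genuine_trials, impostor_trials, k, weighted=False):
    """Return (m, FAR, FRR) for every threshold m.

    Unweighted thresholds run from 0 to k; weighted ones from 0 to k(k+1)/2.
    """
    score = weighted_score if weighted else sum
    max_threshold = k * (k + 1) // 2 if weighted else k
    points = []
    for m in range(max_threshold + 1):
        far = sum(score(t) >= m for t in impostor_trials) / len(impostor_trials)
        frr = sum(score(t) < m for t in genuine_trials) / len(genuine_trials)
        points.append((m, far, frr))
    return points


# Toy example with k = 3 (the slide uses k = 9).
genuine_trials = [[True, True, True], [True, True, False], [True, False, False]]
impostor_trials = [[False, False, False], [True, False, False], [False, False, True]]
points = roc_points(genuine_trials, impostor_trials, k=3)
weighted_points = roc_points(genuine_trials, impostor_trials, k=3, weighted=True)
```

At m = 0 every attempt is accepted, so FAR is 1 and FRR is 0; raising m trades false accepts for false rejects, which traces out the ROC curve.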

Equal Error Rates (From the Literature)
Long Input:
  Ferreira and Santos: 1.4%
  Monaco using data from Villani: 1.7%

Future Work (continued)
Multi Biometrics for Intrusion Detection
Motor Control Level: keystroke + mouse movement
Linguistic Level: stylometry (char, word, syntax)
Semantic Level: target likely intruder commands
[Figure: Intruder; Keystroke + Mouse; Stylometry; Motor Control Level; Linguistic Level; Semantic Level]

Intruder Experiment Design (continued)
Authenticate user on various window sizes, beginning with 300-keystroke windows
Window Type 1: use overlapping windows to:
  Minimize the “wait” period for the next authentication
  Maximize fast intruder detection
[Figure: Overlapping Window Burst Authentication, showing overlapping 300-keystroke (300 KS) windows]
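Overlapping windows can be sketched with a simple sliding generator. The 300-keystroke window size comes from the slide; the step of 100 keystrokes between window starts is an assumption chosen only for illustration, as is the function name.

```python
def overlapping_windows(keystrokes, size=300, step=100):
    """Yield windows of `size` keystrokes, starting a new window every `step` keystrokes.

    With step < size, consecutive windows overlap, so a new authentication
    decision is available after every `step` keystrokes instead of every `size`.
    """
    for start in range(0, len(keystrokes) - size + 1, step):
        yield keystrokes[start:start + size]


# 600 placeholder keystrokes produce windows starting at 0, 100, 200, and 300.
stream = list(range(600))
windows = list(overlapping_windows(stream))
print(len(windows))  # 4
```

A smaller step shortens the wait before the next decision at the cost of more classifier runs per keystroke.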

Continuous vs Continual Authentication with Data Capture Windows
Continuous (ongoing) burst authentication
[Figure: timeline marked 0, 5 min, 10 min with 1 min bursts: Burst 1, Burst 2, Burst 3]
Continual burst authentication with pauses
[Figure: timeline marked 0, 8 min, 30 min with 1 min bursts: Burst 1, Burst 2, Burst 3, separated by a Pause Threshold]
EISIC 2012
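The slide does not say how bursts are delimited. One plausible reading of the pause threshold is that a gap longer than the threshold closes the current burst and opens a new one; the sketch below implements that reading with hypothetical names and made-up timestamps.

```python
def split_into_bursts(timestamps, pause_threshold):
    """Group sorted keystroke timestamps into bursts.

    A new burst starts whenever the gap since the previous keystroke
    exceeds pause_threshold (same time unit as the timestamps).
    """
    bursts = []
    for t in timestamps:
        if bursts and t - bursts[-1][-1] <= pause_threshold:
            bursts[-1].append(t)
        else:
            bursts.append([t])
    return bursts


# Made-up timestamps in seconds with a 5-second pause threshold.
bursts = split_into_bursts([0, 1, 2, 10, 11, 30], pause_threshold=5)
print(bursts)  # [[0, 1, 2], [10, 11], [30]]
```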

Background (continued)
DARPA (Defense Advanced Research Projects Agency), through its Cyber Genome Program, is funding research for the development of new software-based authentication biometric modalities.
These include keystrokes, and the program targets a desktop environment running Microsoft Office applications as the standard computer system platform.
DARPA. Active Authentication Program. accessed
The 2008 United States Higher Education Opportunity Act requires institutions of higher learning to make greater online access control efforts by adopting ubiquitous identification technologies.
HEOA. Higher Education Opportunity Act (HEOA) of accessed

Spreadsheet Template
Assets
  Cash
  Investments: Cash, Equity Securities, Corporate debt securities, US government securities, Private equity, Real estate
  Total Investments: 0 0 0
  Other Assets
  Total Assets: $0
Liabilities and Net Assets
  Liabilities: Penalties, Accounts Payable, Advance from Lender, Federal excise tax
  Total Liabilities: 0 0 0
  Net Assets: Tangible, Non-Tangible
  Total Net Assets: 0 0 0
  Total Net Assets and Liabilities: $0
Special Journal Entries
  Enter Journal Entry name here
  Total Journal Entries: $0.00