Keystroke Biometric Studies with Short Numeric Input on Smartphones
Michael J. Coakley
Advisor: Dr. Charles Tappert
Abstract
- A classification system was developed to evaluate keystroke biometric data from smartphones
  - Based on the Pace University classification system
- Three sets of features were evaluated
  - Mechanical-keyboard-like features (comparable to features available on mechanical keyboards)
  - Keystroke features available only on touchscreens
  - Combined mechanical and touchscreen features
- Subsets of the touchscreen features were evaluated to determine their relative biometric value
Relevance of Study
- Use of mobile devices continues to climb dramatically
  - More mobile phones than people on the planet
- Improved technology and capacity mean more sensitive data is stored and accessed through mobile devices
- Most devices are either secured with a short 4-character PIN or not secured at all
- Government interest and support
  - Defense Advanced Research Projects Agency (DARPA)
  - National Institute of Standards and Technology (NIST)
  - National Science Foundation (NSF)
Related Work Keystroke TouchScreen Bakelman (Dissertation at Pace University) Maxion & Killourhy (CMU) Maiorana Trojahn & Ortmeier TouchScreen Zheng Kambourakis Feng Alariki
Data Collection
Mobile Device Biometric System
- Android BioKeyboard
  - Virtual keypad developed on the Android platform and used as the default keyboard on Android mobile devices
  - Text entry data captured on mobile devices
  - Data stored in SQLite database
  - Data transmitted from devices to a centralized server
- Mechanical-keyboard-like keystroke features
  - Standard timing data of key press and key release events
- Touchscreen keystroke features
  - Data associated exclusively with touchscreens
  - Pressure, location, accelerometer, gyroscope
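As an illustration of the "standard timing data of key press and key release events", the sketch below derives two common timing features from a stream of press/release events. The event tuple layout and the names `dwell_ms` and `flight_ms` are assumptions made for this example; they are not the BioKeyboard's actual schema or the study's exact feature definitions.

```python
from typing import Dict, List, Tuple

# Hypothetical event layout: (soft key name, "press" or "release", time in ms)
Event = Tuple[str, str, int]

def timing_features(events: List[Event]) -> Dict[str, List[int]]:
    """Derive simple timing features from ordered press/release events."""
    presses = [(key, t) for key, action, t in events if action == "press"]
    releases = [(key, t) for key, action, t in events if action == "release"]
    # Assumption: no key rollover, so the i-th press pairs with the i-th release.
    dwell = [r_t - p_t for (_, p_t), (_, r_t) in zip(presses, releases)]
    # Flight time: release of one key to press of the next key.
    flight = [presses[i + 1][1] - releases[i][1] for i in range(len(presses) - 1)]
    return {"dwell_ms": dwell, "flight_ms": flight}

# Toy input: the first three digits of the study's string (9, 1, 4)
sample = [
    ("9", "press", 0), ("9", "release", 80),
    ("1", "press", 200), ("1", "release", 290),
    ("4", "press", 400), ("4", "release", 470),
]
features = timing_features(sample)
print(features)
```

The timestamps here are invented for illustration; real captures would come from the device's stored event records.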
Touchscreen Features
- User, Session ID, Time Session Began
- Screen DPI (Dots Per Inch)
- Action (Press or Release)
- Time of the event in milliseconds
- Soft Key Name (“9”, for example)
- Screen Orientation
  - Holding phone vertically or horizontally
Touchscreen Features (continued)
- Pressure of Key Press
- {x,y} coordinates of center of touch event
- Feature data extracted from other sensors
  - Accelerometer
  - Gyroscope
- Feature types available but not included in this study:
  - GPS
System Process Overview
Data Collection
- Devices
  - 5 identical Android LG-D820 Nexus 5 mobile devices
  - Virtual keypad capturing keystrokes
- 52 participants
  - City of White Plains employees
  - Pace University students (NYC & PLV)
  - Each entered a 10-digit string (914 193 7761) 30 times
- 58,882 data records from the 52 distinct participants
- 614 total keystroke mechanical and touchscreen features
- Data collected in two sessions several weeks
- 44% male, 55% female; average age 23; 86% right-handed
Pace Biometric Classification System (PBCS)
- Classification system created at Pace University
- Vector-difference model transforms a multi-class problem into a two-class problem
- Nearest neighbor method used for decisions in vector-difference space
- Between- and within-person distance matches determine who is authentic and who is not authentic (imposter)
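The vector-difference idea can be sketched as follows: every pair of samples is reduced to a feature-wise absolute difference, labeled "within" (same person) or "between" (different people), and a new difference vector is classified by its nearest neighbor. This is a minimal illustration under assumed toy data, not the PBCS implementation; the function names and the two-feature vectors are hypothetical.

```python
from itertools import combinations

def difference_vectors(samples):
    """samples: list of (user_id, feature_vector) -> list of (label, diff_vector)."""
    labeled = []
    for (u1, x1), (u2, x2) in combinations(samples, 2):
        diff = [abs(a - b) for a, b in zip(x1, x2)]
        labeled.append(("within" if u1 == u2 else "between", diff))
    return labeled

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest_neighbor_label(query_diff, reference):
    """Label of the reference difference vector closest to query_diff."""
    return min(reference, key=lambda item: manhattan(query_diff, item[1]))[0]

# Toy enrollment data (hypothetical values): two samples each from two users
enrolled = [
    ("alice", [100.0, 0.50]), ("alice", [104.0, 0.52]),
    ("bob", [150.0, 0.80]), ("bob", [147.0, 0.79]),
]
reference = difference_vectors(enrolled)

claimed_template = [102.0, 0.51]   # stored sample for the claimed identity
genuine_attempt = [101.0, 0.50]    # close to the template
imposter_attempt = [149.0, 0.81]   # far from the template

genuine_diff = [abs(a - b) for a, b in zip(claimed_template, genuine_attempt)]
imposter_diff = [abs(a - b) for a, b in zip(claimed_template, imposter_attempt)]
genuine_label = nearest_neighbor_label(genuine_diff, reference)
imposter_label = nearest_neighbor_label(imposter_diff, reference)
print(genuine_label, imposter_label)
```

Because every comparison becomes a "within" or "between" difference vector, the number of enrolled users no longer determines the number of classes, which is the sense in which the multi-class problem becomes a two-class one.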
Phase 1 Experiments
- Three feature sets of biometric data were processed by the Pace Biometric Classification System (PBCS)
  - Mechanical-keyboard-like keystroke features
  - Touchscreen-only keystroke features
  - Combined mechanical and touchscreen features
Phase 2 Experiments
- The touchscreen keystroke features were divided into four sub-feature sets to determine their relative biometric value
  - Pressure
  - Location
  - Accelerometer
  - Gyroscope
- Each sub-feature file was processed through the Pace Biometric Classification System (PBCS)
Data Analysis
- Each feature set was run through PBCS four times (two distance metrics and two validation methods)
  - Distance metrics: Euclidean distance, Manhattan distance
  - Validation methods: Repeated Random Subsampling (RRS), Leave-One-Out Cross-Validation (LOOCV)
- Platform
  - Hardware: 16 GB RAM, 8 cores (2 threads/core), 100 GB drive
  - OS: Linux
  - Pace Classifier: Python
Distance Metrics
- Minkowski distance: $d_p(x, y) = \left(\sum_i |x_i - y_i|^p\right)^{1/p}$
  - A family of distance metrics in a normed vector space
- Euclidean distance
  - Minkowski distance with p = 2
- Manhattan distance
  - Minkowski distance with p = 1
  - Sometimes called the city block distance
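The relationship between the two metrics can be checked with a short sketch. It uses the standard Minkowski definition; the function name and the sample points are chosen for illustration only.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between equal-length vectors x and y."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

a = [0.0, 0.0]
b = [3.0, 4.0]

euclidean = minkowski(a, b, 2)  # p = 2: straight-line distance
manhattan = minkowski(a, b, 1)  # p = 1: city block distance
print(euclidean, manhattan)
```

With p = 2 each per-feature difference is squared before summing, so a few large differences dominate the total; with p = 1 every difference contributes in proportion to its size.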
Two Validation Methods
- Repeated Random Subsampling (RRS)
  - Max between size of 10 / max within size of 10 used to select the number of samples prior to the vector-difference calculations
  - 30 iterations
- Leave-One-Out Cross-Validation (LOOCV)
  - Full dataset (no random sampling)
  - n samples => n iterations, one for each sample
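The difference between the two validation schemes can be sketched as index splits. This is a generic illustration: it does not model the study's "max between / max within" sample-size limits, and the function names, the held-out size of 3, and the fixed seed are assumptions for the example. Only the 30 iterations and the "n samples, n iterations" property come from the slide.

```python
import random

def loocv_splits(n):
    """Leave-one-out: n iterations, each holding out exactly one sample."""
    for i in range(n):
        yield i, [j for j in range(n) if j != i]

def rrs_splits(n, test_size, iterations=30, seed=0):
    """Repeated random subsampling: a fresh random held-out set per iteration."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    for _ in range(iterations):
        test = sorted(rng.sample(range(n), test_size))
        held_out = set(test)
        train = [j for j in range(n) if j not in held_out]
        yield train, test

loocv = list(loocv_splits(5))
rrs = list(rrs_splits(10, 3))
print(len(loocv), len(rrs))
```

LOOCV is deterministic and uses every sample once as the test case, while RRS results depend on which random subsets happen to be drawn.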
Performance Evaluation
- Receiver Operating Characteristic (ROC) curves and Equal Error Rate (EER)
  - The ROC curve plots False Acceptance Rate (FAR) against False Rejection Rate (FRR)
  - The EER is the point where FAR and FRR intersect (where FAR = FRR)
- EER is a single, easy-to-understand number often used in evaluating biometric systems
- However, when deploying a biometric system, the ROC curve is more valuable
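A minimal sketch of how FAR, FRR, and an approximate EER can be computed from distance scores is shown below. It assumes a comparison is accepted when its distance is at or below a threshold, and it estimates the EER at the threshold where FAR and FRR are closest. The score lists are invented for illustration and are not results from the study.

```python
def far_frr(genuine, imposter, threshold):
    """FAR: imposters accepted; FRR: genuine users rejected (accept if d <= threshold)."""
    far = sum(d <= threshold for d in imposter) / len(imposter)
    frr = sum(d > threshold for d in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, imposter):
    """Sweep observed distances as thresholds; return (approximate EER, threshold)."""
    best = None
    for t in sorted(set(genuine) | set(imposter)):
        far, frr = far_frr(genuine, imposter, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]

# Hypothetical distance scores: smaller means more similar
genuine = [1.0, 2.0, 3.0, 6.0]
imposter = [4.0, 5.0, 7.0, 8.0]

eer, threshold = equal_error_rate(genuine, imposter)
print(eer, threshold)
```

Each threshold in the sweep corresponds to one point on the ROC curve, which is why the curve carries more deployment information than the single EER value: an operator can pick a threshold that trades FAR against FRR.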
Phase 1 EER Results (preview)
- On these data, the results indicate
  - LOOCV validation method is better than RRS
  - Manhattan distance is better than Euclidean
- Receiver Operating Characteristic (ROC) curves follow
Mechanical vs Touchscreen vs Combined ROC Curves: Euclidean Distance & RRS Validation
Mechanical vs Touchscreen vs Combined ROC Curves: Euclidean Distance & LOOCV Validation
Mechanical Keyboard Features Euclidean Distance: RRS versus LOOCV (figure annotation: EER = 20%)
Touchscreen Features Euclidean Distance: RRS versus LOOCV
Mechanical and Touchscreen Features Euclidean Distance: RRS versus LOOCV (figure annotation: EER = 7.1%)
Unexpected Issue!
- Equal Error Rate (EER) for the Combined feature set (7.1%) was higher than the EER of the Touchscreen set (4.9%)
- This could be explained by the proximity of the Keystroke feature sets inflating the combined EER
- We changed the distance measure (the Minkowski parameter p) and re-ran the data using Manhattan distance
Mechanical vs Touchscreen vs Combined ROC Curves: Manhattan Distance & RRS Validation
Mechanical vs Touchscreen vs Combined ROC Curves: Manhattan Distance & LOOCV Validation (figure annotation: EER = 19.7%)
Mechanical Keyboard Features Manhattan Distance: RRS versus LOOCV Validation
Touchscreen Features Manhattan Distance: RRS versus LOOCV Validation
Mechanical and Touchscreen Features Manhattan Distance: RRS versus LOOCV Validation
Phase 1 EER Results (review)
- On these data, the results indicate
  - LOOCV validation method is better than RRS
  - Manhattan distance is better than Euclidean
    - Resolves the unexpected issue: Combined is now better than Touchscreen
Phase 1 Conclusions
- Study indicated that the Pace Classifier can be extended to authenticate data associated with and extracted from mobile devices
- Manhattan distance performed better than Euclidean distance
- Leave-One-Out Cross-Validation (LOOCV) performed better than Repeated Random Subsampling (RRS)
Phase 1 Conclusions (continued)
- Equal Error Rate (EER) for mechanical keystroke biometrics alone (19.7%) was worse than those of Killourhy & Maxion (8.6%) and Bakelman (6.14%)
  - Possibly explained by smaller device form factor as well as “slickness” of touchscreen
- Touchscreen biometric EER (4.0%) was a significant improvement over pure mechanical keystroke biometrics
- Combined biometric EER (3.9%) was a further improvement
- Note: The aforementioned studies by Killourhy & Maxion and Bakelman could not utilize the touchscreen feature sets, as physical keyboards cannot capture that data
Phase 2 Results (preview)
- Sensor subsets of the touchscreen features were further evaluated to determine their relative biometric value
ROC Curve – (RRS and Manhattan)
ROC Curve – (LOOCV and Manhattan)
Phase 2 Results (review)
- Sensor subsets of the touchscreen features were further evaluated to determine their relative biometric value
- Conclusions
  - Gyroscope sensor has highest biometric value
    - By itself, almost as good as all features
  - Pressure sensor has lowest biometric value
Overall Conclusions
- Touchscreen biometric data outperformed mechanical-keyboard-like keystroke biometric data
- Manhattan distance metric outperformed Euclidean distance metric
- Leave-One-Out Cross-Validation (LOOCV) outperformed Repeated Random Subsampling (RRS)
Overall Conclusions (continued)
- Touchscreen biometric data can return excellent results alone or in concert with keystroke biometric data
- The gyroscope features returned the best results of all the touchscreen sensor feature sets
- The pressure features returned the worst results of all the touchscreen sensor feature sets
Limitations of Study
- Android only
- Brevity and specificity of input string
  - 914 193 7761
- Limitation of classification algorithms
  - K-Nearest Neighbor
Future Work
- Replication of this study on other mobile devices
  - iOS
  - Windows
- Expansion of allowable key characters
- Incorporation of additional sensor data
  - Mobile devices have other sensors that were not utilized in our study
- Expand research to utilize and compare other classification algorithms
  - Support Vector Machines (SVM)
Thank You
Supplemental Slides
ROC Curve for Touchscreen Pressure Features - Euclidean
ROC Curve for Touchscreen Pressure Features - Manhattan
ROC Curve for Touchscreen Location Features - Euclidean
ROC Curve for Touchscreen Location Features - Manhattan (figure annotations: EER = 17.9%, EER = 15.0%)
ROC Curve for Touchscreen Accelerometer Features - Euclidean
ROC Curve for Touchscreen Accelerometer Features - Manhattan
ROC Curve for Touchscreen Gyroscope Features - Euclidean
ROC Curve for Touchscreen Gyroscope Features - Manhattan
Comparison – RRS & Euclidean
Comparison – LOOCV & Euclidean (figure annotations: EER = 26.37%, EER = 18.25%)
Comparison – RRS & Manhattan (figure annotation: EER = 24.85%)
Comparison – LOOCV & Manhattan