ANALYZING IMPAIRED-USER INPUT SCENARIOS FOR KEYSTROKE BIOMETRIC AUTHENTICATION
Gonzalo Perez, J. Vinnie Monaco
Advisor: Dr. Charles Tappert
Friday, May 1, 2017
KEYSTROKE BIOMETRIC RATIONALE
According to Roy Maxion, a research professor of computer science at Carnegie Mellon, “Motions that are performed numerous times, are governed by motor control, not deliberate thought. That is why successfully mimicking keystroke dynamics is physiologically improbable.”
FOCUS OF STUDY
The study examines how to better analyze abnormal typing behavior:
- Impaired
- Distracted
- One-handed
Such typing behavior may not be performed numerous times and may not necessarily be governed by motor control. One-handed typing has been found to be erratic compared with standard two-handed typing, and this study attempts to shed light on how to better authenticate users in distracted or impaired typing scenarios.
OVERALL GOAL
To strengthen current keystroke biometric systems by developing a model that accounts for various impaired or distracted input scenarios.
Other studies to further strengthen keystroke biometrics include:
- Emotion
- Word quality
- Short structured tests
- Arbitrary long-text
- Stylometry
OBJECTIVES
- Better understand how a keystroke biometric system handles users who are impaired or distracted.
- Collect data by simulating a quiz that captures arbitrary free-text input from users entering data under various scenarios.
- Run experiments to determine which model best authenticates users.
- Continuously analyze the experimental results and modify parameters to improve the equal error rates (EERs).
IMPAIRED USER REQUIREMENTS BIOMETRIC RATIONALE
Research will focus on constrained user input.
Biometric systems need to consider user requirements, as users may have physiological and medical factors that affect the usability and efficiency of biometrics:
- Visually impaired subjects
  - May suffer from aniridia (absence of an iris)
  - Person may be blind
  - Person may have eye tremors
- Hearing impairments
  - Subject may not be able to hear instructions that are needed for a biometric system
  - Speech may be affected due to hearing loss
IMPAIRED KEYSTROKE BIOMETRIC RATIONALE
- A traditional keystroke biometric system would require the user to always input keystrokes in a normal state (using both hands).
- In a normal setting, users may type with only one hand if they are on the phone, drinking a cup of coffee, or have an injured hand or arm.
- The aim is to introduce various keystroke input methods in order to strengthen the validity of the keystroke biometric system.
- The study will analyze various keystroke input methods to determine whether the user can still be authenticated, or otherwise to provide a new model.
DATA CAPTURE - PARTICIPANTS
- The experiment featured 81 students who successfully enrolled.
- Participants entered data as part of a simulated quiz.
- The quiz format encouraged users to enter arbitrary long-text responses.
- Various scenarios were introduced:
  - Both hands (normal)
  - Left hand only
  - Right hand only
HARDWARE
Because the quizzes were taken toward the end of the class session, 90% of users used an HP VMware platform with a standard HP keyboard. Some users could not complete the exam in class and had to use their own laptop or desktop at home. The system prompted users to identify which system they were using to enter their keystrokes.
DPS DISSERTATION DATA CAPTURE MOODLE
MOODLE DPS QUIZ
- Students were asked to log into a Moodle learning platform, which is an open-source alternative to Blackboard.
- Students were asked to complete three exams, each asking them to answer five questions related to the content covered in the introductory computer science course.
- Every question in each exam was unique.
- Keystrokes were logged by a JavaScript event-logging framework embedded in the Moodle learning platform.
DATA CAPTURED
81 users, each entering at least 100 characters for every question in each scenario listed below:
- Both hands (B)
- Left hand only (L)
- Right hand only (R)
FIRST ITERATION EXPERIMENTS & RESULTS
After capturing keystrokes from users in the normal and impaired scenarios, we began to run simulations:
- B train, B test
- B train, L test
- B train, R test
Initially, we hoped to obtain decent results with this experiment and then fine-tune the dataset in order to provide a novel method for authenticating one-hand-only typing.
FIRST ITERATION EXPERIMENTS & RESULTS
Train data   Test data   Features   EER (%)
Both         Both        All        3.3
Both         Left        All        38.04
Both         Right       All        38
Table 1 - First Iteration Results
Our results with this method were not encouraging: training on B and testing on L or R gave EERs of roughly 38%. Typing one-handed proved to significantly alter a user's typing behavior and, as a result, gave very poor EERs.
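Since every iteration is judged by its EER, a minimal sketch of how an equal error rate can be computed from match scores may help. This is an illustration only, not the scoring code used in the study: the function name `equal_error_rate`, the assumption that higher scores mean a better match, and the simple threshold sweep are all assumptions made here.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER for similarity scores (higher = more likely genuine).

    Hypothetical helper: sweeps every observed score as a decision threshold
    and returns the error rate at the point where the false accept rate (FAR)
    and false reject rate (FRR) are closest to each other.
    """
    best_gap, eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        # Impostor attempts accepted at this threshold
        far = sum(s >= threshold for s in impostor) / len(impostor)
        # Genuine attempts rejected at this threshold
        frr = sum(s < threshold for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Made-up scores for illustration only
genuine_scores = [0.9, 0.8, 0.7, 0.6]
impostor_scores = [0.1, 0.2, 0.3, 0.65]
print(equal_error_rate(genuine_scores, impostor_scores))
```

An EER near 38%, as in the B train, L test row, means that at the balanced threshold roughly 38% of genuine attempts are rejected and roughly 38% of impostor attempts are accepted.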
SECOND ITERATION EXPERIMENTS & RESULTS
Second experiment:
- L train, L test
- R train, R test
Our initial findings showed that one-handed typing behavior differs significantly from two-handed typing behavior. Next, we ran an experiment to determine whether one-handed typing behavior was so erratic that it would be difficult to authenticate even when training and testing on samples typed with the same hand.
SECOND ITERATION EXPERIMENTS & RESULTS
Train data   Test data   Features   EER (%)
Left         Left        All        13.96
Right        Right       All        15.61
We were pleased with the results of the single-handed train and test data. The EERs were relatively low, in the mid-teens, which indicates that a user's one-handed samples do have a consistent pattern that can be analyzed and authenticated by a keystroke biometric system with relative efficacy. However, we wanted to try another experiment to determine whether we could lower the error rate further.
THIRD ITERATION EXPERIMENTS & RESULTS
Third experiment:
- B train, L test, L features
- B train, R test, R features
- L train, L test, L features
- R train, R test, R features
With the intent of lowering EERs further, we experimented with filtering the features to those that would better authenticate impaired users. We created the feature sets for the left and right sides by filtering the linguistic features to those that contain keys on each side of the keyboard.
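The left/right feature filter described above can be sketched roughly as follows. This is a hypothetical illustration: the key-to-side split (a conventional QWERTY touch-typing division), the feature naming by tuples of key labels, and the rule that every key in a feature must lie on the chosen side are all assumptions, not details taken from the study's feature set.

```python
# Assumed QWERTY split between the left-hand and right-hand letter keys
LEFT_KEYS = set("qwertasdfgzxcvb")
RIGHT_KEYS = set("yuiophjklnm")


def filter_by_side(features, side):
    """Keep only features whose keys all lie on one side of the keyboard.

    `features` maps a tuple of key labels to a timing value, e.g.
    ("a",) for a single-key duration or ("a", "s") for a digraph latency.
    `side` is "L" or "R". Both conventions are hypothetical.
    """
    if side == "L":
        allowed = LEFT_KEYS
    elif side == "R":
        allowed = RIGHT_KEYS
    else:
        raise ValueError("side must be 'L' or 'R'")
    return {
        keys: value
        for keys, value in features.items()
        if all(k in allowed for k in keys)
    }


# Made-up feature values (milliseconds) for illustration only
sample_features = {
    ("a",): 95.0,
    ("a", "s"): 140.0,
    ("a", "l"): 210.0,  # crosses sides, so it is dropped by either filter
    ("k",): 88.0,
}
left_features = filter_by_side(sample_features, "L")
right_features = filter_by_side(sample_features, "R")
```

Note how a filter like this discards every feature that spans both sides of the keyboard, which shrinks the feature set considerably.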
THIRD ITERATION EXPERIMENTS & RESULTS
Train data   Test data   Features   EER (%)
Both         Left        Left
Both         Right       Right
Left         Left        Left
Right        Right       Right
Much to our dismay, the left/right feature filter actually worsened the results. The initial hypothesis was that a user typing with one hand would type more naturally on the side of the keyboard corresponding to that hand. Our results did not align with this hypothesis, possibly because of the omitted keyboard segments: omitting a segment of the keyboard excludes many of the features captured by the keystroke logging system, which degraded rather than improved the results.
FOURTH ITERATION EXPERIMENTS & RESULTS
The goal of the fourth iteration was to combine the datasets so that the system would not need to run a detector function and then engage various fallback procedures in order to authenticate a user. We therefore combined all of the samples into one experiment, which included approximately 1200 data points per user that needed to be split into 5 samples. The experiment was so large that it required two days to complete.
FOURTH ITERATION EXPERIMENTS & RESULTS
Train data        Test data         Features   EER (%)
Both-left-right   Both-left-right   All        12.41
Both-left-right   Both              All        4.86
Both-left-right   Left              All        15.82
Both-left-right   Right             All        15.74
The results were very encouraging, as the EERs were within the standard margin of error of those obtained when training and testing on each condition separately.
- Fewer assumptions are made with this method.
- The method does require that B, L, and R samples be collected during the enrollment phase.
- The system does not need to know whether a sample is one-handed when testing.
- It avoids requiring a detector and fallback procedure for one-handed samples.
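The idea of enrolling with pooled B, L, and R samples so that no one-handed detector is needed at test time can be illustrated with a small sketch. The nearest-neighbor distance rule used here is an assumption chosen for brevity and is not the classifier used in the study; the function names, feature vectors, and threshold are likewise hypothetical.

```python
import math


def enroll(samples_by_condition):
    """Pool a user's enrollment feature vectors from all typing conditions.

    `samples_by_condition` maps "B", "L", "R" to lists of feature vectors.
    The condition labels are discarded after pooling.
    """
    template = []
    for condition in ("B", "L", "R"):
        template.extend(samples_by_condition.get(condition, []))
    return template


def score(template, query):
    """Distance from the query vector to the closest enrolled vector."""
    return min(math.dist(vector, query) for vector in template)


def authenticate(template, query, threshold):
    """Accept the query if it is close enough to any enrolled sample.

    No typing-condition detector is consulted: a one-handed query simply
    matches the one-handed enrollment samples if the user provided them.
    """
    return score(template, query) <= threshold


# Made-up two-dimensional feature vectors (e.g. mean duration, mean latency)
user_template = enroll({
    "B": [[100.0, 120.0]],
    "L": [[180.0, 200.0]],
    "R": [[170.0, 210.0]],
})
accepted = authenticate(user_template, [182.0, 198.0], threshold=10.0)
```

The trade-off shown in the slide still applies to this sketch: it only works if samples from all three conditions are collected at enrollment.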
DISCUSSIONS
- Each iteration provided valuable information that helped us expand and develop the research.
- Initially, we expected to find patterns shared between the both-hand samples and the one-handed samples that could be identified, isolated, and matched accordingly.
- One-handed samples were too erratic to be matched against both-hand samples at decent rates using our tools.
- Keyboard segmentation actually worsened results.
- Combining B+L+R and testing across all scenarios proved to be the best approach for authenticating users while providing a seamless test implementation process.
CONCLUSION
The major contribution of this research study is a novel approach to authenticating impaired users of a keystroke biometric system. The research is an important step toward creating a more robust keystroke biometric system, and it addresses a topic that must be considered when designing any biometric system, including physical ones. Furthermore, our approach of combining datasets from various scenarios and then testing on single scenarios may be worth considering for other behavioral biometric systems.
BIOMETRIC IDENTIFICATION COMPETITION PAPER SUBMITTED TO ICB 2015
A paper on one-handed keystroke biometrics based on our research was submitted to and accepted at the International Conference on Biometrics (ICB 2015) in Phuket, Thailand. We provided our unlabeled dataset, and 9 teams from all over the world competed for the top spots. Competition participants designed classification models trained on the normally-typed samples in an attempt to classify an unlabeled dataset consisting of normally-typed and one-handed samples. Participants competed to obtain the highest classification accuracy and submitted their classification results through an online system.
One-handed Keystroke Biometric Identification Competition
John V. Monaco (1), Gonzalo Perez (1), Charles C. Tappert (1), Patrick Bours (2), Soumik Mondal (2), Sudalai Rajkumar (3), Aythami Morales (4), Julian Fierrez (4), Javier Ortega-Garcia (4)
1. Pace University, 2. Gjøvik University College, 3. Tiger Analytics, 4. Universidad Autónoma de Madrid

INTRODUCTION
Is it possible for a keystroke biometric system to give accurate results when typing behavior is severely impaired? This competition aimed to answer that question. Participants built classifiers using a labeled keystroke biometric dataset containing normal typing behavior only. They then attempted to identify the subjects in an unlabeled dataset in which some samples were typed with only one hand. This scenario simulates a severe user handicap. Baseline results indicate a severe degradation in performance for one-handed keystroke samples. Participants had to construct novel classifiers capable of identifying both normal and handicapped samples, and the competition ranked identification accuracy under several different typing conditions. The winning group was awarded a Futronic FS88 Fingerprint Scanner.

DATA
Three online exams were administered to 64 undergraduate students. Keystrokes were collected using a plugin for Moodle that captures key press and release timestamps on the client and sends this information back to the server. To simulate a typing impairment, students were instructed to:
- Type normally with both hands on the first exam.
- Type with the left hand only on the second exam.
- Type with the right hand only on the third exam.
Samples were created by taking 500-keystroke segments separated by at least 50 keystrokes. The labeled dataset consisted of one normally-typed sample per student. The unlabeled dataset contained 471 samples from all three typing conditions. Not all of the students in the labeled dataset also appeared in the unlabeled dataset. All samples were provided in millisecond precision and normalized to begin at time 0, so that samples could not be linked by the time the test was taken.
Competition participants were allowed to make up to one submission per day, using a plugin for Moodle developed by the authors. Results were automatically scored and remain publicly available:

COMPETITION RESULTS
Figures:
- Accuracy for each typing style
- Accuracy for handedness vs. typing condition
- Accuracy distribution per sample for each typing condition
- Accuracy distribution per student
- Accuracy vs. typing speed

WINNING STRATEGIES
First place
- Duration features only
- Multi-classifier pairwise coupling with 2 regression models and a prediction model
  - Artificial Neural Network (ANN)
  - Counter-Propagation Artificial Neural Network (CPANN)
  - Support Vector Machine (SVM)
- Weighted fusion of classifier scores
- Features corresponding to the typing condition
  - Left-side keyboard features for left-hand typing
  - Right-side keyboard features for right-hand typing
- Figure (bottom): tree structure for pairwise coupling
Second place
- Duration, press-press latency, and release-press latency features
- Grouped features based on keyboard layout (left vs. right, top vs. bottom)
- Random Forest classifier
- Features corresponding to the typing condition, similar to the above
Third place
- Duration, release-press latency, and trigraph features (trigraph features: press to release of alternate keystrokes)
- Fusion of normalized distance between feature vectors and Least Squares Support Vector Machine (LS-SVM)
- Meta-parameters of the LS-SVM determined on an independent dataset
- Weighted fusion of classifier scores based on individual classifier performance

ACKNOWLEDGEMENTS
The authors would like to acknowledge the support from the National Science Foundation under Grant No. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or the US government.
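The poster describes samples as 500-keystroke segments separated by at least 50 keystrokes, and the winning strategies as built on duration, latency, and trigraph timings. A minimal sketch of both steps is given below. The `Keystroke` record, the function names, and the choice of a gap of exactly 50 keystrokes are assumptions for illustration; the competition's own preprocessing code is not shown in the source.

```python
from typing import List, NamedTuple


class Keystroke(NamedTuple):
    key: str         # key label, e.g. "a"
    press_ms: int    # key-press timestamp in milliseconds
    release_ms: int  # key-release timestamp in milliseconds


def segment(events: list, size: int = 500, gap: int = 50) -> List[list]:
    """Cut a keystroke stream into fixed-size samples with a gap between them."""
    samples = []
    start = 0
    while start + size <= len(events):
        samples.append(events[start:start + size])
        start += size + gap  # skip `gap` keystrokes before the next sample
    return samples


def durations(sample: List[Keystroke]) -> List[int]:
    """Hold time of each key: release minus press."""
    return [k.release_ms - k.press_ms for k in sample]


def press_press(sample: List[Keystroke]) -> List[int]:
    """Latency from one key press to the next key press."""
    return [b.press_ms - a.press_ms for a, b in zip(sample, sample[1:])]


def release_press(sample: List[Keystroke]) -> List[int]:
    """Latency from a key release to the next key press.

    Negative when the next key is pressed before the previous one is released.
    """
    return [b.press_ms - a.release_ms for a, b in zip(sample, sample[1:])]


def trigraph(sample: List[Keystroke]) -> List[int]:
    """Press of one key to release of the key two positions later."""
    return [c.release_ms - a.press_ms for a, c in zip(sample, sample[2:])]


# Made-up timestamps for illustration only
typed = [
    Keystroke("a", 0, 80),
    Keystroke("b", 100, 190),
    Keystroke("c", 150, 260),
]
hold_times = durations(typed)
overlaps = release_press(typed)
```

In practice these per-keystroke timings would typically be aggregated per key or key pair before being passed to a classifier; that aggregation step is left out of the sketch.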