Long-Text Keystroke Biometric Applications over the Internet Gary Bartolacci, Mary Curtin, Marc Katzenberg, Ngozi Nwana, Sung-Hyuk Cha, and Charles Tappert.

1 Long-Text Keystroke Biometric Applications over the Internet Gary Bartolacci, Mary Curtin, Marc Katzenberg, Ngozi Nwana, Sung-Hyuk Cha, and Charles Tappert Pace University, New York

2 Keystroke Biometric
  As with other biometrics, the keystroke one is becoming important for security apps
  Advantage - inexpensive and easy to implement, the only hardware needed is a keyboard
  Disadvantage - behavioral rather than physiological biometric, easy to disguise
  One of the least studied biometrics

3 Focus of Study
  Previous studies mostly concerned short character string input
    Password hardening
    Short name strings
  We focus on large text input
    200 or more characters per sample

4 Focus of Study (cont)
  Applications of interest
    Identification
      1-of-n classification problem
      e.g., sender of inappropriate e-mail in a business environment with a limited number of employees
    Verification
      Binary classification problem, yes/no
      e.g., student taking online exam

5 Software Components
  Raw Keystroke Data Capture over the Internet (Java applet)
  Feature Extraction
  Classification
    Training
    Testing

6 Keystroke Data Capture (Java Applet)
  Raw data recorded for each entry
    Key’s character
    Key’s code text equivalent
    Key’s location on keyboard
      1 = standard, 2 = left, 3 = right
    Time key was pressed (msec)
    Time key was released (msec)
    Number of left, right, double mouse clicks
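The per-key fields listed on this slide can be pictured as one record per keystroke. The following is a minimal Python sketch, not the applet's actual data model: the class name `KeyEvent`, its field names, and the `transition_ms` helper are hypothetical, and defining a transition as "release of one key to press of the next" is an assumption the slides do not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyEvent:
    """One raw keystroke entry (hypothetical field names)."""
    char: str        # key's character, e.g. "H"
    code_text: str   # text equivalent of the key code, e.g. "Shift"
    location: int    # 1 = standard, 2 = left, 3 = right
    press_ms: int    # time the key was pressed, in milliseconds
    release_ms: int  # time the key was released, in milliseconds

    @property
    def duration_ms(self) -> int:
        """How long the key was held down."""
        return self.release_ms - self.press_ms


def transition_ms(prev: KeyEvent, nxt: KeyEvent) -> int:
    """Gap between releasing one key and pressing the next.

    Assumption: other definitions (e.g. press-to-press) are equally
    plausible; the slides do not say which one was used.
    """
    return nxt.press_ms - prev.release_ms


h = KeyEvent("H", "H", 1, 1000, 1090)
e = KeyEvent("e", "E", 1, 1150, 1230)
print(h.duration_ms, transition_ms(h, e))
```

Mouse-click counts are recorded per sample rather than per key, so they would live alongside the list of events, not inside each record.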

7 Keystroke Data Capture (Java Applet)

8 Example of Aligned Raw Data File (Hello World!)

9 Feature Measurements
  10 Mean and 10 Std of key press durations
    8 most frequent alphabet letters (e, a, r, i, o, t, n, s)
    Space & shift keys
  10 Mean and 10 Std of key transitions
    8 most common digrams (in, th, ti, on, an, he, al, er)
    Space-to-any-letter & any-letter-to-space
  15 Total number of keypresses for
    Space, backspace, delete, insert, home, end, enter, ctrl, 4 arrow keys combined, shift (left), shift (right), total entry time, left, right, & double mouse clicks
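As a rough illustration of the duration and transition measurements above, the Python sketch below computes the mean and standard deviation per letter and per digram from a list of `(key, press_ms, release_ms)` tuples. The tuple layout and function names are hypothetical, the space/shift features are omitted for brevity, and two choices are assumptions: the population standard deviation (`pstdev`) and the release-to-press definition of a transition.

```python
from statistics import mean, pstdev

LETTERS = set("eariotns")  # 8 most frequent letters from the slide
DIGRAMS = {"in", "th", "ti", "on", "an", "he", "al", "er"}


def _summarize(groups):
    """Map each key to (mean, std) of its collected timings."""
    return {k: (mean(v), pstdev(v)) for k, v in groups.items()}


def duration_features(events):
    """Mean/std of key press durations for the tracked letters."""
    groups = {}
    for key, press, release in events:
        k = key.lower()
        if k in LETTERS:
            groups.setdefault(k, []).append(release - press)
    return _summarize(groups)


def transition_features(events):
    """Mean/std of transition times for the tracked digrams.

    Assumption: transition = press of second key - release of first key.
    """
    groups = {}
    for (k1, _, r1), (k2, p2, _) in zip(events, events[1:]):
        digram = (k1 + k2).lower()
        if digram in DIGRAMS:
            groups.setdefault(digram, []).append(p2 - r1)
    return _summarize(groups)


events = [("t", 0, 100), ("h", 150, 230), ("e", 300, 380), ("e", 500, 600)]
durations = duration_features(events)
transitions = transition_features(events)
```

A key or digram that never occurs in a sample is simply absent from the result; how the original system filled such gaps is not stated in the slides.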

10 Feature Measurement Sample

11 Feature Extraction Preprocessing
  Outlier removal
    Remove samples > 2 std from mean
    Prevents skewing of features caused by pausing of the keystroker
  Standardization
    x’ = (x - xmin) / (xmax - xmin)
    Scales to range 0-1 to give roughly equal weight to each feature
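The two preprocessing steps above can be sketched in Python as follows. The function names are hypothetical, and two details are assumptions: the population standard deviation is used for the 2-std cutoff, and a feature whose values are all equal is mapped to 0.0 to avoid dividing by zero.

```python
from statistics import mean, pstdev


def remove_outliers(values, k=2.0):
    """Drop timings more than k standard deviations from the mean."""
    m = mean(values)
    s = pstdev(values)  # assumption: population standard deviation
    return [v for v in values if abs(v - m) <= k * s]


def min_max_scale(column):
    """Apply x' = (x - xmin) / (xmax - xmin) to one feature column."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information; map to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# One long pause (1000 ms) among otherwise typical key press durations.
cleaned = remove_outliers([100, 110, 90, 105, 95, 1000])
scaled = min_max_scale([90, 100, 110])
```

In this example the 1000 ms pause is discarded, which is the skew the slide describes, and the scaled column spans 0 to 1.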

12 Classification
  Identification
    Nearest neighbor classifier using Euclidean distance
    Input sample compared to every training sample
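A nearest neighbor classifier of the kind named on this slide fits in a few lines of Python. This is a sketch under stated assumptions: the function name, the `(label, feature_vector)` layout, and the sample values are hypothetical, and the vectors are assumed to be already scaled to the 0-1 range.

```python
import math


def nearest_neighbor(sample, training):
    """Return the label of the training vector closest to `sample`.

    `training` is a list of (label, feature_vector) pairs; distance is
    Euclidean, and the input is compared to every training sample.
    """
    best_label, best_dist = None, math.inf
    for label, vector in training:
        dist = math.dist(sample, vector)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


training = [("alice", [0.1, 0.2]), ("bob", [0.9, 0.8])]
predicted = nearest_neighbor([0.2, 0.1], training)
```

Because every training sample is scanned, the cost grows linearly with the number of enrolled samples, which is manageable for the small subject pools described here.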

13 Experimental Design: Identification Experiment
  10 subjects (8 completed) that know the purpose of the input data
    Training: 10 reps of text a (approx. 600 char)
    Testing
      Leave-one-out method on text a, 1 versus 9
      10 reps of text b (same length as text a)
      10 reps of text c (half length of text a)
  28 subjects that don’t know purpose of the input data
    Subset of above training/testing data
    Also, arbitrary text input of reasonable length
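The leave-one-out test on text a can be sketched as follows: each sample is held out in turn and classified against all remaining samples. This Python sketch assumes a nearest neighbor rule with Euclidean distance and a hypothetical `(label, feature_vector)` layout; the toy data is illustrative only and is not taken from the study.

```python
import math


def leave_one_out_accuracy(samples):
    """Fraction of samples whose nearest other sample shares their label."""
    correct = 0
    for i, (label, vector) in enumerate(samples):
        # Hold out sample i; every other sample acts as training data.
        rest = samples[:i] + samples[i + 1:]
        nearest = min(rest, key=lambda s: math.dist(vector, s[1]))
        if nearest[0] == label:
            correct += 1
    return correct / len(samples)


# Illustrative toy data: two subjects, two samples each.
samples = [
    ("a", [0.0, 0.0]),
    ("a", [0.1, 0.0]),
    ("b", [1.0, 1.0]),
    ("b", [0.9, 1.0]),
]
accuracy = leave_one_out_accuracy(samples)
```

With 10 repetitions per subject, each held-out sample has 9 same-subject samples left in the pool, which matches the "1 versus 9" wording on the slide.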

14 Experimental Design: Instructions for Subjects
  Make any necessary corrections to the input data
  Leave at least a day between entering samples
  Input the data using your normal keystroke dynamics (only for subjects that know purpose of the input data)

15 Experimental Design: Text a – about 600 characters
  This is an Aesop fable about the bat and the weasels. A bat who fell upon the ground and was caught by a weasel pleaded to be spared his life. The weasel refused, saying that he was by nature the enemy of all birds. The bat assured him that he was not a bird, but a mouse, and thus was set free. Shortly afterwards the bat again fell to the ground and was caught by another weasel, whom he likewise entreated not to eat him. The weasel said that he had a special hostility to mice. The bat assured him that he was not a mouse, but a bat, and thus a second time escaped. The moral of the story: it is wise to turn circumstances to good account.

16 Results: Different Samples of the Same Text
  Confusion Matrix of Results (leave-one-out method), Predicted vs. Actual
  100% accuracy (76 out of 76)

17 Results: Different Text of Equal Length (text b)
  Confusion Matrix of Results, Predicted vs. Actual
  98.5% accuracy (65 out of 66)

18 Results: Different Text of Shorter Length (text c)
  Confusion Matrix of Results, Predicted vs. Actual
  97% accuracy (74 out of 76)
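The accuracy figures on the three results slides follow directly from the reported counts of correctly identified samples. The short Python check below uses only those counts; the function and variable names are hypothetical.

```python
def accuracy_percent(correct, total):
    """Accuracy as a percentage of correctly identified samples."""
    return 100.0 * correct / total


# (correct, total) counts as reported on the results slides.
reported = {"text a": (76, 76), "text b": (65, 66), "text c": (74, 76)}

for name, (correct, total) in reported.items():
    print(f"{name}: {accuracy_percent(correct, total):.1f}%")
# prints:
# text a: 100.0%
# text b: 98.5%
# text c: 97.4%
```

To one decimal place the text c result is 97.4%, which the slide rounds to 97%.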

19 Analysis of Results
  Accuracy on text a > that on text b
    text a is the training text
  Accuracy on text b > that on text c
    text b is longer than text c

20 Conclusions
  System is a viable means of differentiating between individuals based on typing patterns
  It is likely that the shorter the text used for verification, the lower the accuracy
  Decreasing the number of measurements used also decreases accuracy

21 Experiments in Progress
  Identification experiment with subjects that don’t know the purpose of the input data
  Verification experiments

22 Questions/Comments?

