Research Experiment Design Sprint: Keystroke Biometric Intrusion Detection Ned Bakelman Advisor: Dr. Charles Tappert
Research Problem Statement Using the keystroke biometric, how quickly and how accurately can we detect an intruder’s unauthorized use of another person’s computer?
Background DARPA is funding work to monitor military and government computers to detect intrusions Pace University has developed a sophisticated keystroke biometrics system for text input A 300-keystroke sample gives a good tradeoff between accuracy and response time The Pace Keystroke Biometric System (PKBS) was updated to handle completely free (application-independent) keystroke samples
Methodology Monitor each computer and continuously authenticate the user via keystroke input Assume one authorized user per machine for simplicity During this continuous authentication process we want to detect an intruder as someone other than the authorized user
Intruder Scenario 1 User Bob leaves his office for lunch with his computer running and unlocked Intruder Trudy sits down at Bob’s desk and uses the computer while Bob is at lunch Trudy is not being malicious, but just taking advantage of an available computer – using it to type documents, surf the web, check her Facebook account, etc. However there is sensitive information that Trudy could come across, so detecting that an “innocent” intruder is working on Bob’s computer is important
Intruder Scenario 2 Bob goes on his lunch break and leaves his computer accessible (on and unlocked, or password available) Intruder Trudy starts using Bob’s computer to do various malicious activities: Send emails impersonating Bob Log on to Expense Tracking-Reimbursement to enter fake claims Log on to the CRM (Customer Relationship Management) system to obtain contact information on customers Modify financial statement spreadsheets on Bob’s hard drive This is a more serious intrusion than Scenario 1
Research Experiment Design Sprint Design experiments to investigate the problem statement re the two scenarios Ideas Keyboard-entered keystrokes are a time series Simulate the time series keystroke data of the authentic user with inserted intruder data Use the data to run experiments with PKBS to obtain performance results
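The simulation idea above can be sketched as follows. This is a minimal illustration, not the PKBS data format: the `Keystroke` record, the `make_stream` generator, and the `gap_ms` parameter are all hypothetical names introduced here, and the evenly spaced synthetic timings stand in for real recorded samples.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Keystroke:
    user: str        # ground-truth label, used only for scoring (assumption)
    key: str
    press_ms: int    # key-down timestamp
    release_ms: int  # key-up timestamp


def make_stream(user: str, n: int, interval_ms: int = 200, hold_ms: int = 80) -> List[Keystroke]:
    """Build a synthetic, evenly spaced stream (placeholder for recorded data)."""
    return [Keystroke(user, "x", i * interval_ms, i * interval_ms + hold_ms) for i in range(n)]


def _shift(block: List[Keystroke], new_start_ms: int) -> List[Keystroke]:
    """Return a copy of block whose first key-down occurs at new_start_ms."""
    if not block:
        return []
    offset = new_start_ms - block[0].press_ms
    return [Keystroke(k.user, k.key, k.press_ms + offset, k.release_ms + offset) for k in block]


def insert_intruder_block(authentic: List[Keystroke], intruder: List[Keystroke],
                          position: int, gap_ms: int = 1000) -> List[Keystroke]:
    """Splice intruder keystrokes into the authentic stream after `position`
    keystrokes, re-timing both pieces so timestamps keep increasing."""
    head, tail = authentic[:position], authentic[position:]
    start = head[-1].release_ms + gap_ms if head else 0
    moved = _shift(intruder, start)
    resume = moved[-1].release_ms + gap_ms if moved else start
    return head + moved + _shift(tail, resume)


bob = make_stream("bob", 10)
trudy = make_stream("trudy", 4)
mixed = insert_intruder_block(bob, trudy, position=6, gap_ms=1000)
print([k.user for k in mixed])
```

The `user` label travels with each keystroke so that later experiments can check which windows actually contained intruder data.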
Key Ideas Keyboard-entered keystrokes are a time series Use an authentication window on the time series to authenticate the user on each window Should the window duration be in time or number of keystrokes? Fixed #Keystroke window is better – give rationale If authentication fails, an intruder is detected! Simulate this process by inserting blocks of intruder data into authentic time series Use PKBS to obtain performance results
Authentication Window Design 1 Authenticate the user on windows of 300 keystrokes (possibly overlapping to better detect intruder) [Figure: successive 300-keystroke (KS) authentication windows along a Keystroke Count axis]
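A small sketch of fixed-count windowing, with and without overlap. The function names and the `step` parameter are assumptions made for illustration; the label lists are synthetic and only show why overlap can help, since a 300-keystroke intruder block that straddles two non-overlapping windows never fills either one.

```python
from typing import List, Tuple


def fixed_windows(n_keystrokes: int, window: int = 300, step: int = 300) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for every complete window.
    step == window gives non-overlapping windows; step < window gives overlap."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [(s, s + window) for s in range(0, n_keystrokes - window + 1, step)]


def intruder_fraction(labels: List[str], span: Tuple[int, int], owner: str) -> float:
    """Fraction of keystrokes in the window not typed by the authorized user."""
    start, end = span
    chunk = labels[start:end]
    return sum(1 for u in chunk if u != owner) / len(chunk)


# Synthetic labels: 300 intruder keystrokes inserted at offset 450.
labels = ["bob"] * 450 + ["trudy"] * 300 + ["bob"] * 450

non_overlapping = fixed_windows(len(labels), window=300, step=300)
overlapping = fixed_windows(len(labels), window=300, step=150)

worst_case = max(intruder_fraction(labels, w, "bob") for w in non_overlapping)
with_overlap = max(intruder_fraction(labels, w, "bob") for w in overlapping)
print(worst_case, with_overlap)
```

In this toy layout the purest non-overlapping window is only half intruder data, while the overlapping scheme produces one window made up entirely of intruder keystrokes.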
Authentication Window Design 2 Authenticate the user on windows of 300 keystrokes Insert a block of intruder’s keystrokes Start a new window after a significant pause Assumes a pause for intruder access Negates necessity for overlapping windows [Figure: 300-keystroke (KS) authentication windows along a Keystroke Count axis, with a Pause Threshold marked]
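Pause-triggered windowing might be sketched like this. The 60-second `pause_threshold_ms` default is a placeholder, since the slides do not give a threshold value, and discarding a partial window that is cut short by a pause is an assumption made here, not a stated design decision.

```python
from typing import List, Tuple


def pause_windows(press_times_ms: List[int], window: int = 300,
                  pause_threshold_ms: int = 60_000) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive.

    A new window starts at the beginning of the stream, right after each
    completed window, and right after any inter-keystroke gap longer than
    pause_threshold_ms. Partial windows interrupted by a pause are dropped
    (an assumption of this sketch).
    """
    windows = []
    start = 0
    n = len(press_times_ms)
    for i in range(1, n + 1):
        pause_next = i < n and press_times_ms[i] - press_times_ms[i - 1] > pause_threshold_ms
        if i - start == window:
            windows.append((start, i))   # window complete
            start = i
        elif pause_next:
            start = i                    # restart after the pause
    return windows


# Synthetic timing: 400 keystrokes, a 5-minute pause, then 650 keystrokes.
times = [i * 200 for i in range(400)]
resume_at = times[-1] + 300_000
times += [resume_at + j * 200 for j in range(650)]

print(pause_windows(times))
```

Here the window that begins at keystroke 400 lines up exactly with the first keystroke typed after the pause, which is where an intruder block would start under the scenario's assumption.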
PKBS Experiment Design Number of subjects for normal keystroke entry Number of subjects for intruder keystroke entry Number of training and test samples Etc.
Normal User versus Intruder Data Normal user data is typical user input: word processing, spreadsheet entry, web surfing, etc. Intruder likely has special characteristics What are these characteristics (commands, etc.)? Might be a fast typist Can the special characteristics of intruder data be used to assist intruder detection?
Simulated Intruder Scenarios Scenario 1 Use normal typical-user keystroke input: word processing, spreadsheet entry, web surfing, etc. Scenario 2 Use simulated intruder keystroke input Special types of commands, etc., maybe fast typing
Analysis of Experimental Results Review Receiver Operating Characteristic (ROC) Curves Explore tradeoff between FAR and FRR Etc.
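For reference, the FAR/FRR tradeoff can be computed as in the sketch below. It assumes each authentication window yields one scalar match score where higher means "more like the authorized user", and that a window is accepted when its score meets a threshold; how PKBS actually scores a window is not described here. The score lists are made-up illustrative values, not experimental results.

```python
from typing import List, Tuple


def far_frr(genuine: List[float], intruder: List[float], threshold: float) -> Tuple[float, float]:
    """FAR: fraction of intruder windows accepted (score >= threshold).
    FRR: fraction of genuine windows rejected (score < threshold)."""
    far = sum(s >= threshold for s in intruder) / len(intruder)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def roc_points(genuine: List[float], intruder: List[float]) -> List[Tuple[float, float, float]]:
    """Sweep every observed score as a threshold; return (threshold, FAR, FRR)."""
    thresholds = sorted(set(genuine) | set(intruder))
    return [(t, *far_frr(genuine, intruder, t)) for t in thresholds]


# Illustrative scores only (hypothetical, not measured data).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
intruder_scores = [0.6, 0.3, 0.2, 0.1]

for t, far, frr in roc_points(genuine_scores, intruder_scores):
    print(f"threshold={t:.1f}  FAR={far:.2f}  FRR={frr:.2f}")
```

Raising the threshold lowers FAR and raises FRR; plotting the swept pairs gives the ROC curve from which an operating point can be chosen.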
Newly Discovered Possible Hypotheses Starting authentication windows after pauses is better than periodic overlapping or non-overlapping windows Longer authentication windows yield higher performance but slower detection times (graph trade-off, try to find best trade-off) Detecting malicious intruders is easier than detecting non-malicious ones