
1 Security problems of your keyboard
–Authentication based on keystrokes
–Compromising emanations (electrical, mechanical, or acoustical)
–Supply chain attack (Bluetooth, SD card)
–Power usage?

2 Keystroke biometrics with number-pad input (DSN 2010)
–28 users typed the same 10-digit number
–Use statistical machine learning techniques
–Detection rate 99.97%
–False alarm rate 1.51%
–Can be used for real-life two-factor authentication
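
The two rates on this slide are measured over different populations. A minimal sketch of how they could be computed, assuming "detection" means flagging an impostor attempt and "false alarm" means flagging a genuine attempt; the function name and the toy data are illustrative and not taken from the study:

```python
def detection_and_false_alarm(is_impostor, flagged):
    """is_impostor[i]: the attempt came from an impostor.
    flagged[i]: the detector raised an alarm on that attempt."""
    impostor_total = sum(1 for imp in is_impostor if imp)
    genuine_total = len(is_impostor) - impostor_total
    hits = sum(1 for imp, f in zip(is_impostor, flagged) if imp and f)
    false_alarms = sum(1 for imp, f in zip(is_impostor, flagged) if not imp and f)
    # Detection rate is over impostor attempts, false alarm rate over genuine ones.
    return hits / impostor_total, false_alarms / genuine_total


# Toy data: 4 impostor attempts followed by 5 genuine attempts.
is_impostor = [True, True, True, True, False, False, False, False, False]
flagged = [True, True, True, False, False, True, False, False, False]
detection_rate, false_alarm_rate = detection_and_false_alarm(is_impostor, flagged)
```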

3 Keyboard Acoustic Emanations Revisited
Li Zhuang, Feng Zhou and J. D. Tygar
U. C. Berkeley

4 Motivation
Emanations of electronic devices leak information
How much information is leaked by emanations?
Apply statistical learning methods to security
–What is learned from recordings of typing on a keyboard?

5 Keyboard Acoustic Emanations
Leaking information by acoustic emanations
[Figure labels: Alice, password]

6 Acoustic Information in Typing
Frequency information in sound of each typed key
Why do keystrokes make different sounds?
–Different locations on the supporting plate
–Each key is slightly different
[Asonov and Agrawal 2004]

7 Timing Information in Typing
Time between two keystrokes
Duration of a keystroke
E.g. [Song, Wagner and Tian, 2001]
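
Both timing quantities can be read directly off press and release timestamps. A small sketch, assuming each event is a hypothetical `(key, press_ms, release_ms)` tuple ordered by press time; the event format is an assumption for illustration:

```python
def timing_features(events):
    """events: list of (key, press_ms, release_ms), ordered by press time."""
    # Duration of each keystroke: release time minus press time.
    durations = [release - press for _, press, release in events]
    # Time between two keystrokes: gap between consecutive press times.
    latencies = [events[i + 1][1] - events[i][1] for i in range(len(events) - 1)]
    return durations, latencies


events = [("t", 0, 80), ("h", 150, 220), ("e", 270, 360)]
durations, latencies = timing_features(events)
```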

8 Previous Work vs. Our Approach
                          Asonov and Agrawal                         Ours
Requirement               Text-labeling                              Direct recovery
Analogy in Crypto         Known-plaintext attack                     Known-ciphertext attack
Feature Extraction        FFT                                        Cepstrum
Initial training          Supervised learning with Neural Networks   Clustering (K-means, Gaussian), EM algorithm
Language Model            /                                          HMMs at different levels
Feedback-based Training   /                                          Self-improving feedback

9 Key Observation
Build acoustic model for keyboard & typist
Non-random typed text (English)
–Limited number of words
–Limited letter sequences (spelling)
–Limited word sequences (grammar)
Build language model
–Statistical learning theory
–Natural language processing

10 Overview
Initial training: wave signal -> Feature Extraction -> Unsupervised Learning -> Language Model Correction -> Sample Collector -> Classifier Builder
Outputs of initial training: keystroke classifier, recovered keystrokes
Subsequent recognition: wave signal -> Feature Extraction -> Keystroke Classifier -> Language Model Correction (optional) -> recovered keystrokes

11 Feature Extraction
[Overview diagram from slide 10 repeated]

12 Sound of a Keystroke
How to represent each keystroke?
–Vector of features: FFT, Cepstrum
–Cepstrum features used in speech recognition
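
To make the cepstrum idea concrete, here is a sketch of the plain real cepstrum (inverse DFT of the log-magnitude spectrum) in standard-library Python. The slides do not say which cepstrum variant was used, so this is an assumption for illustration, and the naive O(n^2) DFT is only suitable for short frames:

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(frame, eps=1e-12):
    """Inverse DFT of the log-magnitude spectrum; eps guards against log(0)."""
    n = len(frame)
    log_mag = [math.log(abs(c) + eps) for c in dft(frame)]
    return [
        sum(log_mag[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


frame = [1.0, 0.5, 0.25, 0.125, 0.0, 0.0, 0.0, 0.0]
quiet = real_cepstrum(frame)
loud = real_cepstrum([2.0 * v for v in frame])  # same keystroke, twice the gain
```

Doubling the gain adds log 2 to every log-magnitude bin, so only coefficient 0 changes; the remaining coefficients describe the spectral shape independently of loudness.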

13 Cepstrum vs. FFT
Repeat experiments from [Asonov and Agrawal 2004]
[Figure: accuracy (0 to 1) on the training set, Test 1 and Test 2 for Linear Classification, Neural Networks and Gaussian Mixtures]

14 Unsupervised Learning
[Overview diagram from slide 10 repeated]

15 Unsupervised Learning
Group keystrokes into N clusters
–Assign each keystroke a label 1, …, N
Find best mapping from cluster labels to characters
Some character combinations are more common
–“th” vs. “tj”
–Hidden Markov Models (HMMs)
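
The clustering step can be sketched with plain K-means, which the comparison slide lists among the clustering methods. The 2-D feature vectors, the helper names and the "first k points" initialisation are assumptions made to keep the example small and reproducible:

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, iterations=20):
    # Deterministic initialisation (an assumption): use the first k points.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(points, centroids), centroids


# Toy feature vectors for six keystrokes produced by two different keys.
points = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
labels, centroids = kmeans(points, k=2)
```

The resulting labels are arbitrary cluster numbers; mapping them to characters is a separate step.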

16 Bi-grams of Characters
Colored circles: cluster labels
Empty circles: typed characters
Arrows: dependency
[Figure: HMM with typed characters “t”, “h”, “e” and cluster labels 5, 11, 2; EM]
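
The way character bi-grams resolve an ambiguous cluster label (the “th” vs. “tj” case) can be sketched with Viterbi decoding over a tiny HMM. All probabilities below are made up for illustration; in the pipeline described in these slides the parameters would be estimated (the figure mentions EM), not written by hand:

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely hidden character sequence for a sequence of cluster labels."""
    # best[s] = (log-probability, path) of the best path ending in state s
    best = {
        s: (math.log(start_p[s]) + math.log(emit_p[s][observations[0]]), [s])
        for s in states
    }
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            score, path = max(
                ((best[prev][0] + math.log(trans_p[prev][s]), best[prev][1])
                 for prev in states),
                key=lambda item: item[0],
            )
            new_best[s] = (score + math.log(emit_p[s][obs]), path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


states = ["t", "h", "j"]
start_p = {"t": 0.6, "h": 0.3, "j": 0.1}
# Character bi-grams: "th" is common, "tj" is rare.
trans_p = {
    "t": {"t": 0.1, "h": 0.8, "j": 0.1},
    "h": {"t": 0.4, "h": 0.3, "j": 0.3},
    "j": {"t": 0.4, "h": 0.3, "j": 0.3},
}
# Cluster 11 sounds slightly more like "j" than "h" on its own.
emit_p = {
    "t": {5: 0.9, 11: 0.1},
    "h": {5: 0.2, 11: 0.8},
    "j": {5: 0.1, 11: 0.9},
}
decoded = viterbi([5, 11], states, start_p, trans_p, emit_p)
```

Here the acoustics alone would favour "j" for cluster 11, but the bi-gram statistics make "t" followed by "h" the most likely reading.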

17 Language Model Correction
[Overview diagram from slide 10 repeated]

18 Word Tri-grams
Spelling correction
Simple statistical model of English grammar
Use HMMs again to model
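
A word tri-gram model scores a candidate word by how often it follows the two preceding words. The sketch below only shows that scoring step for a single position, with add-one smoothing and a three-sentence toy corpus; it does not implement the HMM decoding mentioned on the slide, and the function names and corpus are hypothetical:

```python
from collections import Counter


def train_trigrams(sentences):
    """Count word tri-grams and their two-word contexts."""
    trigrams, contexts = Counter(), Counter()
    vocabulary = set()
    for sentence in sentences:
        words = sentence.split()
        vocabulary.update(words)
        for a, b, c in zip(words, words[1:], words[2:]):
            trigrams[(a, b, c)] += 1
            contexts[(a, b)] += 1
    return trigrams, contexts, len(vocabulary)


def pick_word(prev2, prev1, candidates, trigrams, contexts, vocab_size):
    """Choose the candidate with the highest add-one smoothed tri-gram probability."""
    def prob(word):
        return (trigrams[(prev2, prev1, word)] + 1) / (contexts[(prev2, prev1)] + vocab_size)
    return max(candidates, key=prob)


corpus = ["the big dog ran home", "the big dog sat down", "a big dig site"]
trigrams, contexts, vocab_size = train_trigrams(corpus)
# Candidates would come from spelling correction of a noisily recovered word.
best = pick_word("the", "big", ["dig", "dog", "dug"], trigrams, contexts, vocab_size)
```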

19 Two Copies of Recovered Text
Before spelling and grammar correction
After spelling and grammar correction
[Figure legend: _____ = errors in recovery; a second marking = errors corrected by grammar]

20 Sample Collector
[Overview diagram from slide 10 repeated]

21 Feedback-based Training
[Overview diagram from slide 10 repeated]

22 Feedback-based Training
Recovered characters
–Language correction
Feedback for more rounds of training
Output: keystroke classifier
–Language independent
–Can be used to recognize random sequences of keys
  E.g. passwords
–Representation of keystroke classifier
  Neural networks, linear classification, Gaussian mixtures
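
The feedback loop can be sketched as: correct the recovered words, use the corrected text as labels, retrain, re-classify, and repeat. Everything concrete below is a stand-in chosen for brevity: a nearest-centroid classifier over 1-D features replaces the neural network, linear or Gaussian-mixture classifiers, and a Hamming-distance dictionary lookup replaces the language model.

```python
def correct_word(word, dictionary):
    """Stand-in language correction: closest same-length dictionary word."""
    same_length = [w for w in dictionary if len(w) == len(word)]
    if not same_length:
        return word
    return min(same_length, key=lambda w: sum(a != b for a, b in zip(w, word)))


def train_centroids(features, labels):
    """Stand-in classifier: mean feature value per character."""
    sums, counts = {}, {}
    for f, label in zip(features, labels):
        sums[label] = sums.get(label, 0.0) + f
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(feature, centroids):
    return min(centroids, key=lambda label: abs(centroids[label] - feature))


def feedback_training(word_features, initial_words, dictionary, rounds=3):
    words = list(initial_words)
    centroids = {}
    for _ in range(rounds):
        corrected = [correct_word(w, dictionary) for w in words]
        features = [f for word in word_features for f in word]
        labels = [ch for w in corrected for ch in w]
        centroids = train_centroids(features, labels)
        # Re-recognise the recording with the new classifier for the next round.
        words = ["".join(classify(f, centroids) for f in word) for word in word_features]
    return centroids


# One toy feature per keystroke; the text actually typed was "cat act cat".
word_features = [[2.1, 0.9, 3.0], [1.1, 1.9, 3.1], [2.0, 1.0, 2.9]]
initial_words = ["cat", "acc", "cut"]  # noisy first-pass recovery
centroids = feedback_training(word_features, initial_words, ["cat", "act"])
# The trained classifier needs no dictionary, so it also applies to random keys.
password = "".join(classify(f, centroids) for f in [3.05, 2.05, 2.95, 0.95])
```

The last line illustrates the "language independent" point: once trained, the classifier labels individual keystrokes and does not rely on the text being English.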

23 Keystroke Classifier
[Overview diagram from slide 10 repeated]

24 Experiment (1)
Single keyboard
–Logitech Elite Duo wireless keyboard
–4 data sets recorded in two settings
  Quiet & noisy
Keystrokes are clearly separable from consecutive keys
–Automatically extract keystroke positions in the signal with some manual error correction
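
The slide does not say how keystroke positions were extracted. One common approach, shown here purely as an assumption, is to threshold the short-time energy of the signal and report the start of each loud run; the window size and threshold are illustrative values:

```python
def find_keystrokes(signal, window, threshold):
    """Return sample indices where windowed energy first rises above threshold."""
    energies = [
        sum(s * s for s in signal[i:i + window])
        for i in range(0, len(signal) - window + 1, window)
    ]
    onsets = []
    previous_loud = False
    for index, energy in enumerate(energies):
        loud = energy > threshold
        if loud and not previous_loud:  # rising edge marks the start of a keystroke
            onsets.append(index * window)
        previous_loud = loud
    return onsets


# Toy signal: silence, a keystroke, silence, another keystroke, silence.
signal = [0.0] * 8 + [0.9, -0.8, 0.5, -0.3] + [0.0] * 8 + [0.7, -0.9, 0.4, -0.2] + [0.0] * 4
onsets = find_keystrokes(signal, window=4, threshold=0.5)
```

This only works when keystrokes are well separated, which matches the slide's remark that the keystrokes were clearly separable.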

25 Data sets; initial & final recognition rate

        Recording length   Number of words   Number of keys
Set 1   ~12 min            ~400              ~2500
Set 2   ~27 min            ~1000             ~5500
Set 3   ~22 min            ~800              ~4200
Set 4   ~24 min            ~700              ~4300

          Set 1 (%)     Set 2 (%)     Set 3 (%)     Set 4 (%)
          Word  Char    Word  Char    Word  Char    Word  Char
Initial   35    76      39    80      32    73      23    68
Final     90    96      89    96      83    95      80    92

26 Experiment (2)
Multiple Keyboards
–Keyboard 1: DELL QuietKey PS/2, P/N: 2P121
  In use for about 6 months
–Keyboard 2: DELL QuietKey PS/2, P/N: 035KKW
  In use for more than 5 years
–Keyboard 3: DELL Wireless Keyboard, P/N: W0147
  New

27 12-minute recording with ~2300 characters

          Keyboard 1 (%)   Keyboard 2 (%)   Keyboard 3 (%)
          Word  Char       Word  Char       Word  Char
Initial   31    72         20    62         23    64
Final     82    93         82    94         75    90

28 Experiment (3)
Classification methods in feedback-based training
–Neural Networks (NN)
–Linear Classification (LC)
–Gaussian Mixtures (GM)

29 Limitations of Our Experiments
Considered letters, period, comma, space, enter
Did not consider numbers, other punctuation, backspace, shift, etc.
Easily separable keystrokes
Only considered white noise (e.g. fans)

30 Defenses
Physical security
Two-factor authentication
Masking noise
Keyboards with uniform sound (?)

31 Summary
Recover keys from only the sound
Using typing of English text for training
Apply statistical learning theory to security
–Clustering, HMMs, supervised classification, feedback incremental learning
Recover up to 96% of typed characters
