Snooping Keystrokes with mm-level Audio Ranging on a Single Phone


Presenter: Jian Liu
Authors: Jian Liu†, Yan Wang†, Gorkem Kar#, Yingying Chen†, Jie Yang‡, Marco Gruteser#
Affiliations: † Dept. of ECE, Stevens Institute of Technology, USA; # Winlab, Rutgers University, USA; ‡ Dept. of CS, Florida State University, USA
Lab: DAISY (Data Analysis and Information SecuritY Lab)
Venue: MobiCom 2015, Paris, France, Sep. 9-11, 2015

Mobile Device Hardware Advancements
- High-definition audio capabilities targeted at audiophiles
- Microphone arrays (stereo recording and noise canceling)
- Audio chipset: 192 kHz playback and recording
- 4x improvement in audio sampling rates
- Such advancements raise security concerns
[Figure: phone with Mic-1, Mic-2, and Mic-3; stereo recording]

The Results of the Advancements
- Facilitating fine-grained localization-based applications
  - Tracking speakers in multiparty conversations
  - Sensing touch interaction on surfaces around mobile devices
- Eavesdropping on keystrokes without suspicion
  - Adding malware with microphone access to the target user's phone
  - Leaving a phone near the target user's keyboard

Be careful of nearby phones! They can hear your typing!

Related Work
- Multiple recording devices: several phones have to be placed around the keyboard
- Linguistic context: the typed text has to follow English language patterns
- Training with labeled data: requires a priori labeled training data (each key is labeled for training)

Our Approach
- No linguistic model
- No labeled training (e.g., without any cooperation of the target user)
- No involvement of multiple phones

Available Audio Components in a Single Phone
- Stereo recording with two microphones
- High sampling rate
[Figure: phone with Mic1, Mic2, and Mic3; labels: Stereo 1, Stereo 2, Noise Cancellation, Stereo recording]

What can we obtain from the dual-Mic in a phone to snoop keystrokes?

Feature 1: Time Difference of Arrival (TDoA)
- Theoretical TDoA versus measured TDoA
- Most keys can be differentiated by their TDoAs
[Figure: keys S and L with Mic1 and Mic2. One key has distance difference Δd1 and arrival times t1 = t, t2 = t + Δt; the other has distance difference Δd2 and arrival times t1 = t', t2 = t' + Δt'.]
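A minimal sketch of how a theoretical TDoA could be computed from geometry, assuming 2D coordinates in meters for the key and the two microphones. The function name, the coordinates, and the sign convention (Δt = (r1 − r2) / v, matching the relation r1 − r2 = Δt · v quoted in the slides) are illustrative assumptions, not the authors' implementation.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, the value quoted in the slides


def theoretical_tdoa(key_pos, mic1_pos, mic2_pos, v=SPEED_OF_SOUND):
    """Return (r1 - r2) / v in seconds for a sound emitted at key_pos.

    r1 and r2 are the straight-line distances from the key to Mic1 and Mic2.
    All positions are hypothetical (x, y) coordinates in meters.
    """
    r1 = math.dist(key_pos, mic1_pos)
    r2 = math.dist(key_pos, mic2_pos)
    return (r1 - r2) / v


# Hypothetical layout: two microphones 0.15 m apart, one key in front of them.
mic1, mic2 = (0.0, 0.0), (0.15, 0.0)
key = (0.05, 0.10)
print(f"theoretical TDoA: {theoretical_tdoa(key, mic1, mic2) * 1e6:.1f} microseconds")
```

A key that is equidistant from both microphones yields a TDoA of zero, which is why keys lying near the same hyperbola around the microphone pair are hard to separate by TDoA alone.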

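Limits of Measured TDoA
- Dual-microphone TDoA can only identify a group of keystrokes
- TDoA = Δt, with r1 − r2 = Δt · v, where r1 and r2 are the distances from the keystroke to the two microphones and v is the speed of sound (343 m/s)
- The resolution of the measured TDoA is limited by the sampling rate of the ADC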
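To make the sampling-rate limit concrete, the sketch below converts one sample period into a distance difference using r1 − r2 = Δt · v. The function name is an assumption made for illustration; the sampling rates are the ones that appear elsewhere in the slides.

```python
SPEED_OF_SOUND = 343.0  # m/s, the value quoted in the slides


def ranging_resolution_mm(fs_hz, v=SPEED_OF_SOUND):
    """Distance difference (in mm) covered by sound during one sample period."""
    return v / fs_hz * 1000.0


for fs in (48_000, 96_000, 192_000):
    print(f"{fs // 1000} kHz: {ranging_resolution_mm(fs):.2f} mm per sample")
```

This prints 7.15 mm, 3.57 mm, and 1.79 mm per sample for 48, 96, and 192 kHz respectively, so a measured TDoA at these rates resolves distance differences at the millimeter scale.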

Feature 2: Acoustic Signature
- Keystrokes of different keys sound different
- MFCCs (Mel-frequency Cepstral Coefficients) can be used to discriminate the sounds of different keys
[Figure: MFCCs of key 'E', key 'D', and key 'X']

We can combine TDoA and acoustic signatures to identify each keystroke!

System Overview
- Input: a set of keystrokes
- Keystroke detection and segmentation
- TDoA derivation
- Grouping of keystrokes, based on theoretical key groups generated from the theoretical TDoAs (key groups generation)
- Acoustic signature extraction
- MFCC-based clustering within a group
- Cluster-based letter labeling, based on the theoretical TDoAs
- Output: identified keystrokes

Theoretical Key Groups
- A theoretical key group is a set of keys having similar theoretical TDoAs
- Sorting: link any pair of keys whose theoretical TDoAs are too similar
[Figure: keyboard layout (Q W E R T Y U I O P / A S D F G H J K L / Z X C V B N M) with one theoretical key group highlighted]
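One way to read the linking step is as a connected-components problem on sorted TDoAs. The sketch below is an assumption-laden illustration: the function name, the threshold, and the TDoA values (expressed in samples) are hypothetical, and the slides do not state how "too similar" is quantified.

```python
def theoretical_key_groups(tdoas, threshold):
    """Group keys whose theoretical TDoAs are linked by gaps below threshold.

    tdoas maps each key to its theoretical TDoA. After sorting, linking every
    pair closer than threshold gives the same groups as linking only adjacent
    keys, so a single pass over the sorted list is enough.
    """
    ordered = sorted(tdoas.items(), key=lambda item: item[1])
    groups = []
    previous = None
    for key, value in ordered:
        if previous is not None and value - previous < threshold:
            groups[-1].append(key)  # too similar to its neighbor: same group
        else:
            groups.append([key])    # clear gap: start a new group
        previous = value
    return groups


# Hypothetical theoretical TDoAs in samples, not values from the paper.
example = {"q": 0.0, "w": 1.0, "e": 1.5, "r": 5.0}
print(theoretical_key_groups(example, threshold=1.0))
```

Because links are transitive, a group can span a TDoA range wider than the threshold itself when several keys sit close together in a chain.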

Keystroke Grouping
- Input: a set of keystrokes
- Keystroke detection and segmentation: each keystroke is segmented as [sp − 5 ms, sp + 100 ms], where sp is the starting point
- TDoA derivation: cross-correlation approach
- Grouping of keystrokes: each keystroke is assigned to one of the theoretical key groups g1, g2, g3, ..., gn
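The segmentation window and a cross-correlation TDoA estimate can be sketched as follows. This is a plain-Python illustration under stated assumptions: the function names, the brute-force lag search, and the `max_lag` parameter are hypothetical, and the slides do not describe how the starting point sp is detected.

```python
def segment_bounds(start_sample, fs_hz, n_samples):
    """Sample range for [sp - 5 ms, sp + 100 ms], clipped to the recording."""
    lo = max(0, start_sample - fs_hz * 5 // 1000)
    hi = min(n_samples, start_sample + fs_hz * 100 // 1000)
    return lo, hi


def tdoa_by_cross_correlation(ch1, ch2, max_lag):
    """Return the lag in samples that best aligns ch2 with ch1.

    A positive result means the sound reached the second channel later.
    Divide by the sampling rate to convert the lag to seconds.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(ch1):
            j = i + lag
            if 0 <= j < len(ch2):
                score += sample * ch2[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic example: the same short pulse, delayed by 3 samples on channel 2.
ch1 = [0.0] * 20
ch2 = [0.0] * 20
ch1[5], ch1[6] = 1.0, 0.5
ch2[8], ch2[9] = 1.0, 0.5
print(tdoa_by_cross_correlation(ch1, ch2, max_lag=10))
```

In practice an FFT-based correlation would be much faster than this nested loop; the loop is kept here only because it makes the definition explicit.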

Clustering within Each Group & Labeling
- A theoretical key group contains keystrokes of multiple keys with similar TDoAs
- Acoustic signature extraction and MFCC-based clustering within a group: with MFCC features, keystrokes of the same key show higher correlation, while keystrokes of different keys show lower correlation
- Keystroke clusters: each cluster contains keystrokes of the same key
- Cluster-based letter labeling: find the minimum distance between each cluster's mean TDoA and the theoretical TDoAs (example keys: E, D, X)
- Output: identified keystrokes
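The minimum-distance labeling step can be illustrated with a short ranking function. Everything here is a simplified sketch: the function name, the TDoA values, and the choice to rank each cluster independently (which could assign one key to two clusters) are assumptions, not the procedure from the paper.

```python
def rank_keys_for_cluster(cluster_tdoas, theoretical, k=1):
    """Return the k keys whose theoretical TDoA is closest to the cluster mean.

    cluster_tdoas holds the measured TDoAs of the keystrokes in one cluster;
    theoretical maps each key of the group to its theoretical TDoA.
    """
    mean_tdoa = sum(cluster_tdoas) / len(cluster_tdoas)
    ranked = sorted(theoretical, key=lambda key: abs(theoretical[key] - mean_tdoa))
    return ranked[:k]


# Hypothetical TDoAs in samples for one theoretical key group.
group = {"e": 2.0, "d": 2.6, "x": 3.1}
cluster = [2.5, 2.7, 2.6]
print(rank_keys_for_cluster(cluster, group, k=3))
```

Returning a ranked list rather than a single label makes it straightforward to report several candidate keys per keystroke.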

Evaluation
- How robust is the system in recovering keystrokes from different keyboards?
- What is the performance with different sampling rates?
- How does the placement of the phone influence the snooping accuracy?

Experimental Setup
- Phone/recording device
  - Samsung Galaxy Note 3 (48 kHz)
  - External microphones (96/192 kHz)
- Keyboards: three keyboards with different keystroke sound intensity levels
  - Apple MC184LL/A
  - Microsoft Surface
  - Razer Black Widow Ultimate
[Figure: setup photo with a 15.3 cm dimension label]

Experimental Setup
- Data collection
  - Randomly type the 26 keys a-z on the keyboards
  - Typical office environments with ambient noise (e.g., heater, air conditioner)
  - 3,640 keystrokes collected
- Placements
  - Three typical placements
- Evaluation metric: top-k accuracy
  - Identify k candidate keys for each keystroke
  - Check whether the pressed key is among the identified candidates
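The top-k accuracy metric described above can be written in a few lines. The function name and the sample data are hypothetical; only the definition (the pressed key must appear among the k candidates) comes from the slides.

```python
def top_k_accuracy(candidates, truth, k):
    """Fraction of keystrokes whose true key is among the first k candidates.

    candidates[i] is the ranked candidate list for keystroke i and truth[i]
    is the key that was actually pressed.
    """
    hits = sum(1 for ranked, key in zip(candidates, truth) if key in ranked[:k])
    return hits / len(truth)


# Hypothetical ranked candidates for four keystrokes.
candidates = [["e", "d", "x"], ["d", "e", "x"], ["x", "d", "e"], ["d", "x", "e"]]
truth = ["e", "e", "x", "e"]
for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(candidates, truth, k):.2f}")
```

On this toy data the metric is 0.50, 0.75, and 1.00 for k = 1, 2, and 3, showing how accuracy can only stay equal or rise as k grows.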

Overall Performance
- Average top-1 accuracy: 86%
- Average top-2 accuracy: 95%
- Average top-3 accuracy: 98%
- All three keyboards have comparably high accuracies
[Figure: average top-k accuracy]

Impact of Sampling Rates
- Top-1 accuracies: 48 kHz: 85%; 96 kHz: 86%; 192 kHz: 94%
- A higher sampling rate improves the recognition accuracy
[Figure: top-k accuracy versus sampling rate (kHz)]

Conclusion
- Showed that a single phone can recover keystrokes by exploiting mm-level TDoA ranging and fine-grained acoustic features
- Developed a training-free approach on a single phone that does not require a linguistic model to snoop keystrokes
- Extensive experiments with different keyboards and microphone sampling rates demonstrate that the approach can achieve sufficient accuracy for keystroke snooping

Thank you!
Jian Liu, DAISY (Data Analysis and Information SecuritY Lab)
jliu28@stevens.edu
http://personal.stevens.edu/~jliu28/