802.11 User Fingerprinting, by Jeffrey Pang, Ben Greenstein, Ramakrishna Gummadi, Srinivasan Seshan, and David Wetherall

User Fingerprinting MobiCom 2007 (Sept Montreal, Quebec, Canada)

1 802.11 User Fingerprinting
Jeffrey Pang (CMU), Ben Greenstein (Intel Research Seattle), Ramakrishna Gummadi (USC, MIT), Srinivasan Seshan (CMU), David Wetherall (Intel Research Seattle, University of Washington)

2 Motivation: The Mobile Wireless Landscape
They leave “digital fingerprints” that reveal who we are
– And thus where we’ve been
(Figure callout: “Why is Bob over there?”)

3 Motivation: The Mobile Wireless Landscape
→ Location privacy is a growing concern
– Wireless Privacy Protection Act [U.S. House Bill ’05]
– Geographic location/privacy working group [IETF]

4 A well-known technical problem
– Devices have unique and consistent addresses
– e.g., 802.11 devices have MAC addresses → fingerprinting them is trivial!
(Figure: over time, an adversary running tcpdump sees MAC address 00:0E:35:CE:1F:59 now and the same MAC address, 00:0E:35:CE:1F:59, later.)

5 Motivation: The Mobile Wireless Landscape
The widely proposed technical solution
– Pseudonyms: change addresses over time
– 802.11: Gruteser ’05, Hu ’06, Jiang ’07
– Bluetooth: Stajano ’05
– RFID: Juels ’04
– GSM: already employed
(Figure: the MAC address is 00:0E:35:CE:1F:59 now and 00:AA:BB:CC:DD:EE later, so the adversary running tcpdump is left with a “?”.)
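A minimal sketch of what "changing addresses over time" could look like in practice. The function name is hypothetical, and setting the locally administered bit is a common convention for randomized MAC addresses rather than something the slides specify.

```python
import secrets


def random_pseudonym_mac():
    """Return a random MAC address string to use as a short-lived pseudonym.

    Assumption (not from the slides): the address is marked as locally
    administered (bit 0x02 of the first octet set) and unicast (bit 0x01
    of the first octet cleared).
    """
    octets = bytearray(secrets.token_bytes(6))
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{octet:02X}" for octet in octets)
```

A device would call this once per pseudonym period, for example once per hour in the tracking scenario discussed later in the deck.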

6 Motivation: The Mobile Wireless Landscape
Our work shows: pseudonyms are not enough
– Implicit identifiers: identifying characteristics of traffic
– e.g., most users identified with 90% accuracy in hotspots
(Figure: under both 00:0E:35:CE:1F:59 and 00:AA:BB:CC:DD:EE, tcpdump sees the same traffic over time: Search: “Bob’s Home Net”, Packets → Intel Server, …)

7 Contributions
– Four Novel Implicit Identifiers
– Automated Identification Procedure
– Evaluating Implicit Identifier Accuracy

8 Contributions
– Four Novel Implicit Identifiers
– Automated Identification Procedure
– Evaluating Implicit Identifier Accuracy

9 Implicit Identifiers by Example
Consider one user at SIGCOMM 2004
– Transferred 512MB via BitTorrent (poor network etiquette?)
– Seen in an “anonymized” wireless trace (MAC addresses hashed, effectively a pseudonym)
Can we identify the culprit using implicit identifiers?

10 Implicit Identifiers by Example
Implicit identifier: SSIDs in probes
– Set of SSIDs in probe requests
– Many drivers search for preferred networks
– Usually networks you have associated with before
(Figure: tcpdump captures an SSID probe for “roofnet”, which points to a user of “roofnet”.)

11 Implicit Identifiers by Example
Implicit identifier: SSIDs in probes
– Set of SSIDs in probe requests
– Many drivers search for preferred networks
– Usually networks you have associated with before
(Figure labels, in time order: 00:0E:35:CE:1F:59; SSID Probe: “roofnet”; BitTorrent transfer; 00:AA:BB:CC:DD:EE; “?”; each observed with tcpdump.)
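The SSID identifier can be sketched as a set per source address. This is an illustration only: it assumes probe requests have already been parsed into (source address, SSID) tuples, and the function names are hypothetical.

```python
from collections import defaultdict


def ssid_sets(probe_requests):
    """Group probed SSIDs by source address.

    probe_requests: iterable of (source_address, ssid) tuples, assumed to be
    already extracted from a capture. Empty SSIDs (broadcast probes) are skipped.
    """
    sets = defaultdict(set)
    for source, ssid in probe_requests:
        if ssid:
            sets[source].add(ssid)
    return dict(sets)


def link_pseudonyms(sets):
    """Return (address_a, address_b, shared_ssids) for addresses sharing an SSID."""
    addresses = sorted(sets)
    links = []
    for i, a in enumerate(addresses):
        for b in addresses[i + 1:]:
            shared = sets[a] & sets[b]
            if shared:
                links.append((a, b, shared))
    return links
```

Two pseudonyms that both probe for a rare SSID such as “roofnet” end up linked, even though their MAC addresses differ.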

12 Implicit Identifiers by Example
Implicit identifier: network destinations
– IP pairs in network traffic
– At SIGCOMM, each visited by 1.15 users on average
– Some nearly-unique destinations repeatedly visited (e.g., server)
(Figure: tcpdump observes traffic to the same SSH/IMAP server at two different times.)
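To make the "visited by 1.15 users on average" statistic concrete, here is a small sketch of how such a figure could be computed. It assumes a destination is an (IP address, port) pair and that flows are labeled with a known user, which is only possible in a training trace; all names and addresses are illustrative.

```python
from collections import defaultdict


def visitors_by_destination(flows):
    """Map each (ip, port) destination to the set of users that contacted it.

    flows: iterable of (user, ip, port) tuples (a hypothetical pre-parsed form).
    """
    visitors = defaultdict(set)
    for user, ip, port in flows:
        visitors[(ip, port)].add(user)
    return dict(visitors)


def mean_users_per_destination(visitors):
    """Average number of distinct users per destination."""
    return sum(len(users) for users in visitors.values()) / len(visitors)


def unique_destinations(visitors):
    """Destinations contacted by exactly one user, mapped to that user."""
    return {dest: next(iter(users))
            for dest, users in visitors.items() if len(users) == 1}
```

Destinations that appear in `unique_destinations` are the ones that act most strongly as implicit identifiers.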

13 Implicit Identifiers by Example
Implicit identifier: broadcast packet sizes
– Set of broadcast packet sizes in network traffic
– e.g., Windows machines send NetBIOS naming advertisements; FileMaker and Microsoft Office advertise themselves
(Figure: tcpdump observes the same broadcast pkt sizes, 239, 245, 257, at two different times.)
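A sketch of how the broadcast-size identifier might be collected, assuming frames are already reduced to (source, destination, length) tuples. The tuple layout and function name are assumptions made for illustration.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def broadcast_size_sets(frames):
    """Collect the set of broadcast frame sizes seen from each source address.

    frames: iterable of (source, destination, length_in_bytes) tuples.
    Only frames addressed to the link-layer broadcast address are counted.
    """
    sizes = {}
    for source, destination, length in frames:
        if destination.lower() == BROADCAST:
            sizes.setdefault(source, set()).add(length)
    return sizes
```

Only packet sizes are used here, so the feature remains available when payloads are encrypted.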

14 Implicit Identifiers by Example
Implicit identifier: MAC protocol fields
– Header bits (e.g., power management, order)
– Supported rates
– Offered authentication algorithms
(Figure: tcpdump observes the same protocol fields, 11,4,2,1Mbps and WEP, at two different times.)
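As an illustration of the header bits mentioned above, the sketch below reads the power management and order flags from an 802.11 Frame Control value and combines them with the other fields into one comparable tuple. It assumes the two Frame Control bytes have already been decoded as a little-endian 16-bit integer; the function names and the tuple layout are hypothetical.

```python
def frame_control_flags(frame_control):
    """Extract two flag bits from a 16-bit 802.11 Frame Control value.

    Bit 12 is Power Management and bit 15 is Order, counting from the least
    significant bit of the little-endian decoded field.
    """
    return {
        "power_mgmt": bool((frame_control >> 12) & 1),
        "order": bool((frame_control >> 15) & 1),
    }


def fields_fingerprint(frame_control, rates_mbps, auth_algorithms):
    """Combine protocol fields into a hashable tuple for comparison."""
    flags = frame_control_flags(frame_control)
    return (
        flags["power_mgmt"],
        flags["order"],
        tuple(sorted(rates_mbps, reverse=True)),
        tuple(sorted(auth_algorithms)),
    )
```

Two samples with equal tuples agree on every field; as a later slide notes, fields like these tend to distinguish groups of users rather than individuals.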

15 Implicit Identifier Summary
(Table columns: Public, Home, and Enterprise networks. Table rows: network destinations, SSIDs in probes, broadcast pkt sizes, MAC protocol fields.)
– Visible even with WEP/WPA TKIP encryption
– Identifying even if devices have identical drivers
– More implicit identifiers exist → results we present establish a lower bound

16 Fixing Implicit Identifiers is not Simple
Encryption does not prevent traffic analysis
– Cover traffic?
– Challenge: shared medium → large performance hit
Service discovery is done in the clear
– Don’t probe?
– Challenge: beaconing is often undesirable also
Implementation and configuration variation
– Standardize?
– Challenge: ambiguity of specifications

17 Contributions
– Four Novel Implicit Identifiers
– Automated Identification Procedure
– Evaluating Implicit Identifier Accuracy

18 Tracking Users
Many potential tracking applications:
– Was user X here today?
– Where was user X today?
– What traffic is from user X?
– When was user X here?
– Etc.

19 Tracking Users
Tracking scenario:
– Every user changes pseudonyms every hour
– Adversary monitors some locations → one hourly traffic sample from each user in each location
Build a profile from training samples: first collect some traffic known to be from user X and from random others
Then classify observation samples
(Figure: tcpdump collects hourly samples, e.g. traffic at 2-3PM, 3-4PM, and 4-5PM, whose senders are unknown, “?”.)
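The hourly sampling in this scenario can be sketched as bucketing packets by (address, hour). The packet tuple layout and the function name are assumptions for illustration; because pseudonyms change every hour, each bucket corresponds to one sample from one unknown user.

```python
from collections import defaultdict


def hourly_samples(packets, window_seconds=3600):
    """Group packets into traffic samples keyed by (address, window index).

    packets: iterable of (timestamp_seconds, source_address, packet) tuples.
    """
    samples = defaultdict(list)
    for timestamp, address, packet in packets:
        window = int(timestamp // window_seconds)
        samples[(address, window)].append(packet)
    return dict(samples)
```

Each resulting sample is what the classifier on the following slides labels as "from user X" or not.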

20 Sample Classification Algorithm
Core question:
– Did traffic sample s come from user X?
A simple approach: naïve Bayes classifier
– Derive probabilistic model from training samples
– Given s with features F, answer “yes” if Pr[ s from user X | s has features F ] > T, for a selected threshold T
– F = feature set derived from implicit identifiers
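A minimal sketch of the decision rule, assuming the per-feature likelihoods have already been estimated from training samples. Under the naïve Bayes independence assumption the likelihood of F is the product of the per-feature likelihoods; the function names and argument layout are illustrative, not the authors' implementation.

```python
from math import prod


def posterior(prior_x, likelihoods_x, likelihoods_other):
    """Pr[s from user X | features F] under a naive Bayes model.

    prior_x: Pr[s from user X] before looking at features.
    likelihoods_x: Pr[f_i | s from X] for each feature f_i in F.
    likelihoods_other: Pr[f_i | s not from X] for each feature f_i in F.
    Assumes at least one of the two products is non-zero.
    """
    joint_x = prior_x * prod(likelihoods_x)
    joint_other = (1 - prior_x) * prod(likelihoods_other)
    return joint_x / (joint_x + joint_other)


def is_from_user(prior_x, likelihoods_x, likelihoods_other, threshold):
    """Answer "yes" when the posterior exceeds the selected threshold T."""
    return posterior(prior_x, likelihoods_x, likelihoods_other) > threshold
```

Raising the threshold trades true positives for fewer false positives, which is why the evaluation fixes T through a target false positive rate.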

21 Sample Classification Algorithm
Deriving features F from implicit identifiers
– Set similarity (Jaccard index), weighted by frequency
(Figure: a profile from training and a sample from observation, each a set of SSIDs drawn from IR_Guest, linksys, djw, and SIGCOMM_1. Labels: Rare, Common, w(e) = low, w(e) = high.)
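A sketch of a weighted Jaccard index between a training profile and an observed sample. The weight function is left to the caller; the example weights in the usage note below assume that rarer SSIDs carry more weight, which is an assumption made here since the slide text does not spell out the mapping.

```python
def weighted_jaccard(profile, sample, weight):
    """Weighted Jaccard similarity of two sets.

    profile, sample: sets of elements (e.g. SSIDs).
    weight: function mapping an element to a non-negative weight w(e).
    Returns the total weight of the intersection over that of the union.
    """
    union = profile | sample
    total = sum(weight(e) for e in union)
    if total == 0:
        return 0.0
    return sum(weight(e) for e in profile & sample) / total
```

For instance, with hypothetical weights `{"linksys": 0.1, "SIGCOMM_1": 0.2, "IR_Guest": 0.5, "djw": 1.0}`, sharing the rare "djw" raises the score far more than sharing the common "linksys".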

22 Contributions
– Four Novel Implicit Identifiers
– Automated Identification Procedure
– Evaluating Implicit Identifier Accuracy

23 Evaluating Classification Effectiveness
Simulate tracking scenario with wireless traces:
– Split each trace into training and observation phases
– Simulate pseudonym changes for each user X

Trace                          Duration   Profiled Users   Total Users
SIGCOMM conf. (2004)           4 days
UCSD office building (2006)    1 day
Apartment building (2006)      14 days    39               196

24 Evaluating Classification Effectiveness
Question: Is observation sample s from user X?
Evaluation metrics:
– True positive rate (TPR): fraction of user X’s samples classified correctly
– False positive rate (FPR): fraction of other samples classified incorrectly
Pr[ s from user X | s has features F ] > T
– Fix T for FPR = 0.01
– Measure TPR = ???
(See paper)
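The evaluation procedure can be sketched as follows, assuming each sample has already been reduced to a classifier score (the posterior probability). The function names are hypothetical, and picking the threshold from the observed scores is one simple choice among several.

```python
def rates(scores_x, scores_other, threshold):
    """Return (TPR, FPR) for a given threshold.

    scores_x: scores of samples truly from user X.
    scores_other: scores of samples from other users.
    """
    tpr = sum(s > threshold for s in scores_x) / len(scores_x)
    fpr = sum(s > threshold for s in scores_other) / len(scores_other)
    return tpr, fpr


def threshold_for_fpr(scores_other, target_fpr):
    """Smallest observed score usable as T with FPR at or below the target."""
    for candidate in sorted(set(scores_other)):
        fpr = sum(s > candidate for s in scores_other) / len(scores_other)
        if fpr <= target_fpr:
            return candidate
    return max(scores_other)
```

With T fixed this way, the measured TPR shows how often user X is recognized at that false positive rate.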

25 Results: Individual Feature Accuracy
Individual implicit identifiers give evidence of identity
(Plot annotations: TPR → 60%, TPR → 30%)

26 Results: Multiple Feature Accuracy
We can identify many users in all environments
Users with TPR > 50%:
– Public: 63%
– Home: 31%
– Enterprise: 27%
(Plot panels: Public, Home, Enterprise. Features: netdests, ssids, bcast, fields.)

27 Results: Multiple Feature Accuracy
Some users much more distinguishable than others
– Public networks: ~20% of users identified >90% of the time
(Plot panels: Public, Home, Enterprise. Features: netdests, ssids, bcast, fields.)

28 One Application
Question: Was user X here today?
More difficult to answer:
– Suppose N users present each hour
– Over an 8-hour day, 8N opportunities to misclassify → decide user X is here only if multiple samples are classified as his
Revised: Was user X here today for a few hours?
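A rough way to see why a single positive sample is not enough: with 8N samples from other users, the chance of at least one false positive grows quickly with N. The sketch below assumes each sample is misclassified independently with probability equal to the FPR, which is a simplification introduced here and not a claim from the slides.

```python
from math import comb


def expected_false_positives(users_per_hour, fpr, hours=8):
    """Expected number of other users' samples misclassified as user X."""
    return hours * users_per_hour * fpr


def prob_at_least(k, n, p):
    """Probability of at least k successes in n independent trials.

    Computed as the upper tail of a binomial distribution with success
    probability p.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))
```

Calling `prob_at_least(k, 8 * N, fpr)` for increasing k shows how requiring several positive samples shrinks the chance of wrongly deciding that user X was present.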

29 Results: Tracking with 90% Accuracy
Majority of users can be identified if active long enough
Of 268 users (71%):
– 75% identified with ≤4 samples
– 50% identified with ≤3 samples
– 25% identified with ≤2 samples

30 Conclusions
Implicit identifiers can accurately identify users
– Individual implicit identifiers give evidence of identity
– We can identify many users in all environments
– Some users much more distinguishable than others
Understanding implicit identifiers is important
– Pseudonyms are not enough
– We establish a lower bound on their accuracy
Eliminating them poses research challenges
– Current work: confidential service discovery
– Current work: traffic-analysis-resistant MAC

31 Extra Slides

32 Related Work
Other implicit identifiers
– Device driver fingerprints [Franklin ’06]
– Clock-skew fingerprints [Kohno ’05]
– Click-prints [Padmanabhan ’06]
– RF antenna fingerprints [Hall ’04]
Our work:
– 802.11 fingerprints for individual users
– Tracking with only commodity hardware/software
– Better coverage than some previous work
– Procedure to combine implicit identifiers

33 Evaluating Classification Effectiveness

34 Results: Individual Feature Accuracy
Individual implicit identifiers give evidence of identity
Other implicit identifiers distinguish groups of users
(Plot annotation: TPR → 50%)

35 Results: Tracking with 90% Accuracy Many users can be identified in all environments

36 Old Slides

37 Answers to Common Questions
Is tracking accuracy good enough to track an entire city, and how does it vary with population size?
– Our accuracy results aren’t good enough to allow fine-grained tracking of most users in environments with more than hundreds of users, because the adversary will simply obtain too many samples and have too many opportunities for false positives. However, by modifying the question a little, such as “was this person here for x hours, where x is reasonably large, such as a couple of days” or “was this person here in this specific area with a smaller population, in the sub-100s”, we can answer it fairly accurately.
Your traces were only over a couple of days; why would one expect fingerprints to remain identifying over longer periods of time?
– You are correct, and we can’t say anything empirical about longer-term tracking. However, we suspect that the implicit identifiers would remain identifying because they are mostly caused by automatic behavior of drivers and applications, and thus wouldn’t change due to human behavior or location. For example, in the paper we show that SSIDs are stable over weeks at least.
Do you think people really care about location privacy?
– I think that people don’t care until they realize that they have a problem, and then they do (e.g., the article about a guy tracking his ex-girlfriend). One magnifying factor is that, because of the low cost of 802.11 radios and the ubiquity of 802.11 in devices, almost anyone can easily monitor some locations and track users of interest.
Don’t you think people could modify their behavior in order not to be as identifying when they want to be private?
– Sure; if they had the technical knowledge to do so, they could easily modify their behavior to “look different” and leave a different implicit fingerprint. However, I would argue that this is undesirable, because it means that privacy concerns have a chilling effect on the types of activities we are willing to do, thus limiting the wireless applications. I think the protocols themselves should protect us enough so that we don’t have to worry about these things.
What about wireless protocols other than 802.11?
– I believe many of the implicit identifiers still exist, because the fundamental reasons behind their existence remain: you can still do traffic analysis even with encryption, all wireless protocols have some form of service discovery, and there is implementation variation.

38 Outline
Problem
– People are worried about tracking, 802.11 is especially worrisome
– Pseudonyms proposed, not enough
BitTorrent example
– Use to explain each identifier
– Summarize implicit identifiers
How to train as an example
– Points to bring up and jx: 1 hour sample size; how to select the classifier threshold; how adversary could obtain training samples
– Learning process
Q1 results
Q2 results
– An attacker can use multiple features and multiple samples, so ask question…

39 Implicit Identifiers by Example
Implicit identifier: network destinations
– IP pairs in network traffic
– At SIGCOMM, each visited by 1.15 users on average
– Some nearly-unique destinations repeatedly visited (e.g., server)
(Figure: traffic to an SSH/IMAP server under WPA TKIP encryption, where tcpdump sees the payload only as XXXXX.)

40 Results: Individual Feature Accuracy
Some users more distinguishable than others
netdests:
– ~60% of users identified >50% of the time
– ~20% of users identified >90% of the time