1
802.11 User Fingerprinting
Jeff Pang, Ben Greenstein, Ramki Gummadi, Srini Seshan, and David Wetherall
Most slides borrowed from Ben
2
Location Privacy Is at Risk
[Figure: you and “the adversary” (a.k.a. some dude with a laptop), usually < 100 m apart]
Your MAC address: 00:0E:35:CE:1F:59
3
Are pseudonyms enough?
MAC address now: 00:0E:35:CE:1F:59
MAC address later: 00:AA:BB:CC:DD:EE
4
Implicit Identifiers Remain
Consider one user at SIGCOMM 2004
Visible in an “anonymized” trace
MAC addresses scrubbed: effectively a pseudonym
Transferred 512 MB via BitTorrent => crappy performance for everyone else
Let’s call him Bob
Can we figure out who Bob is?
5
Implicit Identifier: SSIDs
SSIDs in probe requests
Windows XP and Mac OS X probe for your preferred networks by default
Set of networks advertised in a traffic sample
Determined by a user’s preferred networks list
Example: Bob sends an SSID probe for “roofnet”
6
What if Bob used pseudonyms?
The “roofnet” probe occurred during a different session than the BitTorrent download
We can no longer explicitly associate “roofnet” with poor network etiquette
Can we do it implicitly?
7
Implicit Identifier: Network Destinations
Set of IP (address, port) pairs in a traffic sample
In SIGCOMM, each is visited by 1.15 users on average
A user is likely to visit a site repeatedly (e.g., an email server)
Example: Bob contacts the SSH/IMAP server 159.16.40.45
8
What if the network is encrypted?
Can’t see IP addresses through link-layer encryption like WPA
Is Bob safe now?
9
Implicit Identifier: Broadcast Packet Sizes
Set of 802.11 broadcast packet sizes in a traffic sample
E.g., Windows machines send NetBIOS naming advertisements; FileMaker and Microsoft Office advertise themselves
In SIGCOMM, there are only 16% more unique (application, size) tuples than unique sizes
Example: Bob’s broadcast packet sizes are 239, 245, 257
10
Implicit Identifier: MAC Protocol Fields
Header bits (e.g., power mgmt., order)
Supported rates
Offered authentication algorithms
Example: Bob’s MAC protocol fields are 11, 4, 2, 1 Mbps, WEP, etc.
11
David J. Wetherall
Anonymized 802.11 traces from SIGCOMM 2004
Search on WiGLE for “djw” in the Seattle area
Google pinpoints David’s home (to within 200 ft)
[Figure label: a pseudonym]
What else do implicit identifiers tell us?
12
Automating Implicit Identifiers
TRAINING: Collect some traffic known to be from Bob
OBSERVATION: Which traffic is from Bob?
13
Methodology
Simulate using the SIGCOMM and UCSD traces
Split each trace into training data and observation data
Sample = 1 hour of traffic to/from a user
Assume “the adversary” sees only pseudonyms
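The sampling step on this slide can be sketched in a few lines. This is only an illustration under assumptions that are not in the slides: packets are represented as hypothetical `(timestamp_seconds, user, feature)` tuples, and the training/observation split is a simple cutoff hour (the function names and the cutoff are made up for the example).

```python
from collections import defaultdict

SAMPLE_SECONDS = 3600  # one-hour samples, as on the slide


def split_into_samples(packets, sample_seconds=SAMPLE_SECONDS):
    """Group (timestamp, user, feature) records into per-user hourly samples.

    Returns a dict mapping (user, hour_index) -> set of features seen.
    The record layout is a hypothetical one chosen for this sketch.
    """
    samples = defaultdict(set)
    for timestamp, user, feature in packets:
        hour_index = int(timestamp // sample_seconds)
        samples[(user, hour_index)].add(feature)
    return dict(samples)


def split_train_observe(samples, cutoff_hour):
    """Samples before cutoff_hour are training data; the rest are observation data."""
    train = {k: v for k, v in samples.items() if k[1] < cutoff_hour}
    observe = {k: v for k, v in samples.items() if k[1] >= cutoff_hour}
    return train, observe


# Toy data: SSID probes seen from two users over three hours.
packets = [
    (10.0, "bob", "roofnet"),
    (20.0, "bob", "linksys"),
    (3700.0, "bob", "roofnet"),
    (7300.0, "alice", "SIGCOMM_1"),
]
samples = split_into_samples(packets)
train, observe = split_train_observe(samples, cutoff_hour=1)
```

Here `train` holds Bob's first hour and `observe` holds everything from hour 1 onward; in the pseudonym setting the adversary would not know the user labels of the observation samples.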
14
Did this traffic sample come from Bob?
How to convert implicit identifiers into features?
Naïve Bayesian classifier: we say sample s (with features f_i) is from Bob if
Pr[s from Bob | s has features f_i] > T
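A minimal sketch of the decision rule above, assuming the usual naive Bayes factorization (features conditionally independent given the user). The prior, the per-feature likelihoods, and the threshold values are hypothetical numbers for illustration; the slides do not say how the likelihoods are estimated.

```python
import math


def posterior_from_bob(prior_bob, likelihoods_bob, likelihoods_other):
    """Naive Bayes posterior Pr[sample from Bob | features].

    likelihoods_bob[i]   = Pr[feature i | sample is from Bob]
    likelihoods_other[i] = Pr[feature i | sample is not from Bob]
    All probabilities must be strictly between 0 and 1.
    """
    log_bob = math.log(prior_bob) + sum(math.log(p) for p in likelihoods_bob)
    log_other = math.log(1.0 - prior_bob) + sum(
        math.log(p) for p in likelihoods_other
    )
    # Normalize in log space to avoid underflow when there are many features.
    m = max(log_bob, log_other)
    num = math.exp(log_bob - m)
    return num / (num + math.exp(log_other - m))


def is_from_bob(prior_bob, likelihoods_bob, likelihoods_other, threshold):
    """Apply the slide's rule: classify as Bob if the posterior exceeds T."""
    return posterior_from_bob(prior_bob, likelihoods_bob, likelihoods_other) > threshold


# Hypothetical numbers: Bob is 1 of 25 users, and two features match his profile.
p = posterior_from_bob(0.04, [0.9, 0.8], [0.05, 0.1])
```

Raising the threshold T trades true positives for fewer false positives, which is the trade-off the accuracy slides report as TPR versus FPR.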
15
Did This Traffic Sample Come from Bob?
Features: set similarity (Jaccard index), weighted by frequency
[Figure: profile from training vs. sample for validation; example SSIDs linksys, IR_Guest, djw, SIGCOMM_1; scale from rare to common]
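One way to read "Jaccard index, weighted by frequency" is sketched below. The weighting is an assumption made for this example: each element counts as 1 / (number of users seen with it), so rare SSIDs dominate the score. The slide does not give the exact weighting, and the user counts are invented.

```python
def weighted_jaccard(profile, sample, user_counts):
    """Jaccard index with each element weighted by 1 / (users seen with it).

    profile, sample: sets of elements (e.g., SSIDs).
    user_counts: hypothetical map from element to number of users emitting it;
                 elements missing from the map get weight 1.
    """
    def weight(element):
        return 1.0 / user_counts.get(element, 1)

    union = profile | sample
    if not union:
        return 0.0
    shared = profile & sample
    return sum(weight(e) for e in shared) / sum(weight(e) for e in union)


# Invented popularity counts: "djw" is rare, "SIGCOMM_1" is common.
user_counts = {"linksys": 50, "SIGCOMM_1": 100, "IR_Guest": 5, "djw": 1}
profile = {"linksys", "IR_Guest", "djw", "SIGCOMM_1"}
sample = {"djw", "SIGCOMM_1"}
score = weighted_jaccard(profile, sample, user_counts)
```

With these counts the sample shares only two of four SSIDs with the profile (an unweighted Jaccard index of 0.5), but the weighted score is much higher because the shared rare SSID "djw" carries most of the weight.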
16
Individual Feature Accuracy
60% TPR with 99% TNR (i.e., 1% FPR)
Higher FPR, likely due to not being user specific
Useful in combination with other features, to rule out identities
17
Multi-feature Accuracy
Samples from 1 in 4 users are identified >50% of the time with 0.001 FPR
[Figure legend: bcast + ssids; bcast + ssids + fields; bcast + ssids + fields + netdests]
18
Was Bob here today? Maybe…
Suppose N users are present
Over an 8-hour day, there are 8*N opportunities to misclassify a user’s traffic
Instead, say Bob is present iff multiple samples are classified as his
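The arithmetic behind this slide can be sketched as follows. The per-sample false positive rate used here (0.01) is an illustrative value, and treating samples as independent trials is a simplifying assumption of this sketch, not a claim from the slides.

```python
from math import comb


def expected_false_positives(num_users, hours, fpr):
    """Expected number of samples wrongly attributed to Bob in one day."""
    return num_users * hours * fpr


def prob_at_least_k(n, k, p):
    """Binomial tail: probability that at least k of n independent trials succeed."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def bob_present(sample_labels, min_hits):
    """Declare Bob present iff at least min_hits samples are classified as his."""
    return sum(1 for is_bob in sample_labels if is_bob) >= min_hits


# 25 users over an 8-hour day gives 8 * 25 = 200 one-hour samples.
n_samples = 8 * 25
one_hit = prob_at_least_k(n_samples, 1, 0.01)
two_hits = prob_at_least_k(n_samples, 2, 0.01)
```

Requiring more matching samples (`two_hits` versus `one_hit`) lowers the chance of falsely declaring Bob present, at the cost of needing more hours of observation before a real Bob is detected.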
19
Was Bob here today?
In a busy coffee shop with 25 concurrent users, more than half (54%) can be identified with 90% accuracy
Median time to detect: 4 hours (4 samples)
27% can be identified with 99% accuracy
20
Conclusion: Pseudonyms Are Insufficient
4 new identifiers: netdests, ssids, fields, bcast
The average user emits highly distinguishing identifiers
An adversary can combine features
Future work:
Uncover more identifiers (timing, etc.)
Validate on longer and more diverse traces (SSIDs stable in home setting for >= 2 weeks)
Build a better link layer