Putting People in their Places: An Anonymous and Privacy-Sensitive Approach to Collecting Sensed Data in Location-Based Applications
Karen P. Tang, Pedram Keyani, James Fogarty, Jason I. Hong
Human-Computer Interaction Institute, Carnegie Mellon University
2 Location-Aware Computing Is Here
- In-car navigation system
- PDAs, phones, laptops: WiFi & GSM
3 Types of Location-Aware Apps
- Person-centric
  - “What restaurants are near me?”
  - “Where are my friends?”
  - “What’s happening around me?”
4 Privacy treated as a tradeoff
- Diagram axes: Anonymity & Privacy; Disclosure Fidelity
- Specific Location Query: “Where are the closest restaurants near me?”
- More Anonymous Location Query: “Where are all the restaurants in Montreal?”
6 Types of Location-Aware Apps
- Person-centric
  - “What restaurants are near me?”
  - “Where are my friends?”
  - “What’s happening around me?”
- Location-centric
  - “What’s happening at the mall?”
  - “How busy is the restaurant?”
  - “What’s happening on highway 5?”
7 Zipdash: a Location-Centric App
- Commercial (acquired by Google), zipdash.com
- How it works:
  - Runs on GPS-enabled phones
  - Continuously disclose GPS
  - Server infers traffic congestion
  - View traffic information on phone
8 Zipdash: How it works
- Each car reports GPS data
- Server collects all GPS reports
9 Zipdash: Privacy Threat
- Each car reports GPS data
- Server collects all GPS reports
- Can you trust the server?
  - Data is leaked …
  - Someone is eavesdropping …
- Observation: consistent routes
  - Start/End is “Work” or “Home”
- Malicious Server Threat:
  - Hijack GPS log for each car
  - Infer start of route as “Home”
  - Lookup via consumer database
- Result: Your “Home” and your identity are revealed
- GPS log for Car A:
  8:00AM  45.587ºN, ºW
  8:05AM  45.527ºN, ºW
  8:10AM  45.594ºN, ºW
  8:15AM  45.594ºN, ºW
13 Zipdash: Use Fidelity Tradeoff?
- Car calculates actual GPS
- Car reports “blurred” GPS
- Application loses usefulness
- Fidelity tradeoff lessens utility
- Actual GPS log for Car A:
  8:00AM  45.587ºN, ºW
  8:05AM  45.527ºN, ºW
  8:10AM  45.594ºN, ºW
  8:15AM  45.594ºN, ºW
- “Blurred” GPS log for Car A:
  8:00AM  in Montreal, QC
  8:05AM  in Montreal, QC
  8:10AM  in Montreal, QC
  8:15AM  in Montreal, QC
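
The loss of utility from blurring can be illustrated with a minimal sketch. It assumes blurring is done by rounding coordinates (the slide only says the car reports “blurred” GPS); the function name `blur` and the sample coordinates are hypothetical.

```python
def blur(lat, lon, decimals):
    """Blur a GPS fix by rounding it (an assumed blurring scheme)."""
    return (round(lat, decimals), round(lon, decimals))


# Hypothetical fixes taken on two different roads a few kilometres apart.
road_a = (45.5871, -73.5612)
road_b = (45.6123, -73.6034)

# Near-full fidelity: the two fixes stay distinct.
precise = {blur(*road_a, 3), blur(*road_b, 3)}

# Coarse fidelity: both fixes collapse into the same blurred value.
coarse = {blur(*road_a, 0), blur(*road_b, 0)}

print(len(precise), len(coarse))
```

With the coarse setting a traffic server can no longer tell which road a report came from, which is the sense in which the fidelity tradeoff lessens utility for a location-centric application.
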
15 Limits of Fidelity Tradeoff
- Fidelity tradeoff doesn’t work for Zipdash
16 A New Approach to Privacy
- Fidelity tradeoff doesn’t work for Zipdash
- Location-centric applications need a better way to protect users’ privacy
- “Hitchhiking”
17 Overview
- Motivation & Limits of Fidelity Tradeoff
- Hitchhiking Example Applications
- Privacy Analysis & Hitchhiking principles
  - Client computation
  - Location of interest approval
  - Sensing physical identifiers
- Conclusion
19 Hitchhiking: Definition
- Client-focused, software-based approach to privacy-sensitive, location-centric apps on commodity devices and networks
- Key: location is the entity of interest
- Ensure complete user anonymity & no new privacy threats, even with malicious server
21 Hitchhiking Approach to Zipdash
- “Bridge” = location of interest
- Only report GPS when on bridge
- Prevent malicious server threat
  - No start/end pattern
  - Every report from the same areas
  - No lookups are possible
- Reports from cars A, B, C:
  Car A  8:05AM  ºN, ºW
  Car B  8:06AM  ºN, ºW
  Car C  8:07AM  ºN, ºW
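
A minimal sketch of “only report when on bridge”, assuming the location of interest is stored on the device as a bounding box. The `Region` class, the `make_report` function, and all coordinates are hypothetical; they are not taken from Zipdash or from the Hitchhiking implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Bounding box for a location of interest (hypothetical representation)."""
    name: str
    south: float
    north: float
    west: float
    east: float

    def contains(self, lat, lon):
        return self.south <= lat <= self.north and self.west <= lon <= self.east


def make_report(lat, lon, timestamp, region):
    """Return an anonymous report if the fix is inside the region, else None."""
    if not region.contains(lat, lon):
        return None  # off the bridge: nothing leaves the device
    # The report carries no user or device identifier.
    return {"location": region.name, "lat": lat, "lon": lon, "time": timestamp}


# Made-up coordinates for the bridge.
bridge = Region("bridge", 45.520, 45.530, -73.540, -73.520)
```

Because the check runs on the device, fixes near home or work are never sent, so the server has no start/end pattern to look up.
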
23 Hitchhiking Example: Bus
- Location of interest: Bus route
- “Is my bus running late?”
- Detection of on/off the bus
- When on the bus:
  - Device senses location
  - Device models on/off bus
  - Device anonymously reports bus location to server
  - Server shares bus info
- Reference: [Patterson, 2003]
24 Hitchhiking Example: Coffee shop
- Location of interest: Coffee shop
- “Is Starbucks busy now?”
- When in the coffee shop:
  - Device senses WiFi location
  - Device senses other devices
  - Device anonymously reports device count & WiFi info
  - Server infers shop’s busyness
25 Hitchhiking Example: Meeting Room
- Location of interest: Meeting Room
- “Can I use that room now?”
- When in the meeting room:
  - Device senses WiFi location
  - Device anonymously reports WiFi data to server
  - Server infers room availability
- [Figure: floor plan with Offices 1-8 and Meeting Rooms A and B]
26 Research Contribution
- Hitchhiking is:
  - … a privacy-sensitive approach
  - … applicable to location-centric apps
  - … able to provide complete user anonymity while maintaining the application’s full utility
- By using Hitchhiking principles, we can build interesting sensor-based location applications without sacrificing the user’s privacy
29 Meeting Room Availability
- “Is that meeting room available right now?”
- [Figure: floor plan with Offices 1-8 and Meeting Rooms A and B]
30 Standard Approach: Always Track
- Most common approach for current systems
- Privacy Threat from Malicious Server:
  - Most people spend the bulk of their time in an office
  - Server can correlate location trails to a specific person
- [Figure: floor plan with Offices 1-8 and Meeting Rooms A and B]
31 Hitchhiking Solution
- Define meeting rooms as locations of interest
- Privacy defense: Client computation
  - Compute location on the device
  - Only report while at this location
- [Figure: floor plan with Offices 1-8 and Meeting Rooms A and B]
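
Client computation can be sketched as a lookup that runs entirely on the device. This is a simplified illustration, not the Place Lab algorithm: it assumes the device holds a local table mapping each location of interest to the WiFi access points seen there. `KNOWN_LOCATIONS`, `locate`, and `should_report` are hypothetical names, and the Meeting Room A addresses are made up.

```python
# Hypothetical on-device table: location of interest -> access point MACs seen there.
KNOWN_LOCATIONS = {
    "Meeting Room A": {"00-0C-F1-5C-04-A1", "00-0C-F1-5C-04-A2"},
    "Meeting Room B": {"00-0C-F1-5C-04-A8"},
}


def locate(sensed_macs, known=KNOWN_LOCATIONS):
    """Return the location of interest that best matches the scan, or None."""
    sensed = set(sensed_macs)
    best, best_overlap = None, 0
    for name, macs in known.items():
        overlap = len(macs & sensed)
        if overlap > best_overlap:
            best, best_overlap = name, overlap
    return best


def should_report(sensed_macs):
    """Report only while the device places itself at a location of interest."""
    return locate(sensed_macs) is not None
```

A scan taken in a private office matches no entry, so `should_report` returns `False` and the server never learns the device was there.
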
33 Client location computation
- Prior work: Place Lab [LaMarca et al, 2005; Schilit, 2003]
- Client-based approach alone is not enough: other privacy threats remain
- Hitchhiking thoroughly investigates these other privacy threats and extends prior work to address them
35 Threat: Location Spoofing
- Privacy Threat from Malicious Server:
  - Add fake locations of interest (e.g. your office)
  - Mislabel a fake location of interest
  - Enables tracking of potential private places
- [Figure: floor plan with Offices 1-8 and Meeting Rooms A, B, and C]
37 Hitchhiking Solution
- Make threat apparent to the user
- Privacy defense: Location of interest approval
  - In Office 4: “You appear to be in a location that another user has indicated is Meeting Room C. Do you want to disclose information from your current location?”
- [Figure: floor plan with Offices 1-8 and Meeting Rooms A, B, and C]
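
The approval step can be sketched as a small on-device store of user decisions. This is an illustration under assumptions: the slides do not say how decisions are stored, so remembering one answer per location is assumed here, and `ApprovalStore` and `console_prompt` are hypothetical names.

```python
class ApprovalStore:
    """Remembers the user's per-location disclosure decisions on the device."""

    def __init__(self, ask_user):
        self._ask_user = ask_user  # callback: location name -> bool
        self._decisions = {}

    def may_disclose(self, location):
        # Prompt only the first time this location of interest is detected.
        if location not in self._decisions:
            self._decisions[location] = bool(self._ask_user(location))
        return self._decisions[location]


def console_prompt(location):
    """Example callback that asks on the console (wording adapted from the slide)."""
    answer = input(
        "You appear to be in a location that another user has indicated is "
        f"{location}. Do you want to disclose information from your current "
        "location? [y/N] "
    )
    return answer.strip().lower() == "y"
```

A user sitting in Office 4 who is told the location is “Meeting Room C” can decline, which makes a spoofed location of interest visible instead of silently trackable.
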
40 Threat: Link identifiers to a person
- Privacy Threat from Malicious Server:
  - Attach unique identifiers to locations of interest
  - Craft identifiers for each individual
  - Results in person-specific reports for each location of interest
- [Figure: malicious server; Meeting Room B labeled “B: John” and “B: Mary”]
41 Hitchhiking Solution
- Privacy defense: Sensed physical identifiers
  - Use device to sense surrounding identifiers
  - Ensures every device sees the same identifiers
  - Anonymizes reports from devices
- [Figure: Hitchhiking server; Meeting Room B with sensed identifier 00-0C-F1-5C-04-A8]
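
A minimal sketch of why sensed identifiers anonymize reports, assuming a report is just the normalized set of access point addresses the device senses. The `build_report` function and the report layout are hypothetical.

```python
def build_report(sensed_macs):
    """Build a report from sensed identifiers only.

    Nothing issued by the server and nothing specific to the user is included.
    """
    return {"identifiers": sorted({mac.upper() for mac in sensed_macs})}


# Two people in Meeting Room B sense the same access point.
john = build_report(["00-0c-f1-5c-04-a8"])
mary = build_report(["00-0C-F1-5C-04-A8"])

print(john == mary)  # True
```

Since both devices derive the identifier from the physical environment, the server cannot hand “B: John” to one user and “B: Mary” to another and then tell their reports apart.
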
42 Hitchhiking: Putting it Together
- Device reports after detecting “Meeting Room B”:
  - If first time, device prompts for disclosure approval
  - Device anonymously reports sensed WiFi to server
- Server only knows someone is in Meeting Room B
- No person-specific location trail for any users
- [Figure: floor plan with Offices 1-8 and Meeting Rooms A and B; sensed identifier 00-0C-F1-5C-04-A8 in Meeting Room B]
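
On the server side, “someone is in Meeting Room B” can be inferred from anonymous reports alone. The sketch below is an assumption about how such a server might work, not the authors' implementation: `occupied_rooms`, the report layout, and the 10-minute freshness window are all hypothetical.

```python
def occupied_rooms(reports, ap_to_room, now, window_s=600):
    """Return rooms with at least one anonymous report in the last window_s seconds."""
    rooms = set()
    for report in reports:
        if now - report["time"] > window_s:
            continue  # stale report: ignore
        for mac in report["identifiers"]:
            room = ap_to_room.get(mac)
            if room is not None:
                rooms.add(room)
    return rooms


# Hypothetical server-side table from sensed identifier to room.
AP_TO_ROOM = {"00-0C-F1-5C-04-A8": "Meeting Room B"}

reports = [{"identifiers": ["00-0C-F1-5C-04-A8"], "time": 1000}]
print(occupied_rooms(reports, AP_TO_ROOM, now=1200))  # {'Meeting Room B'}
```

The reports contain no user field, so the server can answer availability queries without holding a person-specific location trail.
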
43 Related issues
- Other issues surrounding Hitchhiking:
  - Query Anonymity
  - Live Reports vs. Offline Collection
  - Transport Layer Attack
  - Denial-of-Service Attack
  - Timing-Based Attack
- Defenses for these threats exist…
45 Conclusion: Hitchhiking Highlights
- It is a client-focused, software-based approach to privacy-sensitive location-centric apps
- It works on existing devices & networks
- It uses location constraints & anonymity
46 Conclusion: Hitchhiking Highlights
- Hitchhiking is an extreme architecture:
  - Assumes a system with minimum trust
  - Systems with implicit trust can relax principles
- Provides application developers a way to build useful location apps while avoiding well-known privacy risks
47 Thank you! Questions and comments?
Karen P. Tang, Human-Computer Interaction Institute, Carnegie Mellon University
Acknowledgements: This is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. NBCHD030010, by an AT&T Labs fellowship, and by the National Science Foundation under grants IIS and IIS. We also thank contributors to Place Lab, jpcap, libpcap, and JDesktop Integration Components, which were utilized in this work.
48 Potential Questions Slides
- K-anonymity
- Mixed Zones
- Query Anonymity
- Live Reports vs. Offline Collection
- Transport Layer Attack
- Denial-of-Service Attacks
- Timing-based Attacks
49 K-Anonymity
- Server obscures client’s location by including client + k-1 others
- However:
  - Requires a trusted middleware server
  - Not applicable to location-centric applications supported by Hitchhiking
    - k-1 others may not be in the meeting room
50 Mixed Zones
- Client gets new ID when entering location
- However:
  - Requires trusted middleware server
    - Server keeps track of all used IDs
    - Server provides new IDs to clients
51 Query Anonymity
- Hitchhiking:
  - Anonymizes a location’s reports
  - Doesn’t anonymize queries about a location
- Problem: What if you ask about a location?
- If you’ve already been there before:
  - Use sensed identifiers to ask server
- If you haven’t been there before:
  - Mask queries
  - Cached, local model
53 Live Reports vs. Offline Collection
- Live reports not a Hitchhiking requirement
  - Hitchhiking doesn’t assume connectivity
  - Alternative: local cache, upload later
- However, might need to change app
  - Real-time availability
  - Temporal models of availability
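
The “local cache, upload later” alternative can be sketched as a queue on the device. `ReportCache` and the `send` callback are hypothetical; `send` stands in for whatever upload mechanism the application uses and is assumed to return `True` on success.

```python
from collections import deque


class ReportCache:
    """Holds reports on the device until connectivity is available."""

    def __init__(self):
        self._pending = deque()

    def add(self, report):
        self._pending.append(report)

    def flush(self, send):
        """Upload cached reports in order; stop at the first failure.

        Returns the number of reports uploaded.
        """
        sent = 0
        while self._pending:
            if not send(self._pending[0]):
                break  # still offline: keep the rest for later
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self):
        return len(self._pending)
```

Reports uploaded this way arrive late, which is why the application may need to move from real-time availability to temporal models of availability.
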
54 Transport Layer Attacks
- Problem:
  - Phone networks: providers know your location
  - WiFi networks: provider could log MAC address
- Reality: People trust their network providers
- Hitchhiking:
  - Give app developers same level of trust
  - Does not introduce any new privacy threats by allowing apps to collect sensed data
56 Denial-of-Service Attacks
- What if: server flooded with bad reports
- Standard approach:
  - Give everyone a unique ID
  - Ban the ID that sends fraudulent data
  - Doesn’t allow for anonymity
- More anonymous approaches:
  - Note IP address which reports
    - Unlikely to report from many places in short time
  - Seed database with false data
    - Insert non-existent MAC address in identifier list
    - Ban reports that include false identifiers
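
The seeding defense can be sketched as a server-side filter. The idea is that a device physically present can never sense an access point that does not exist, so any report containing a seeded address was fabricated from the published identifier list. `DECOY_MACS`, `filter_reports`, and the decoy address are hypothetical.

```python
# Hypothetical non-existent address seeded into the identifier list.
DECOY_MACS = {"02-00-00-00-00-01"}


def filter_reports(reports, decoys=DECOY_MACS):
    """Split reports into (accepted, rejected) based on seeded false identifiers."""
    accepted, rejected = [], []
    for report in reports:
        if decoys & set(report["identifiers"]):
            rejected.append(report)  # contains an identifier nobody can sense
        else:
            accepted.append(report)
    return accepted, rejected
```

This check needs no per-user ID, so it stays compatible with anonymous reporting.
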
58 Timing-Based Attacks
- Hitchhiking: Content cannot lead to tracking
- Can we infer from consecutive reports?
  - 2 reports received around same time for same location of interest
  - Use reports from 2 close locations of interest
- Solution: Limit frequency of reports
  - Not just for an application but for all reports
  - E.g. report 1x/10 min for any app = sparse
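
A minimal sketch of the frequency limit, assuming one limiter instance is shared by every application on the device. `ReportLimiter` is a hypothetical name; the 10-minute default follows the slide's example.

```python
class ReportLimiter:
    """Device-wide limiter shared by all apps (default: one report per 10 minutes)."""

    def __init__(self, min_interval_s=600):
        self.min_interval_s = min_interval_s
        self._last_sent = None  # time of the last report from any app

    def allow(self, now_s):
        """Return True and record the time if a report may be sent at now_s."""
        if self._last_sent is not None and now_s - self._last_sent < self.min_interval_s:
            return False
        self._last_sent = now_s
        return True


limiter = ReportLimiter()
print(limiter.allow(0), limiter.allow(300), limiter.allow(600))  # True False True
```

Sharing one limiter across applications keeps reports sparse overall, so two nearby locations of interest cannot be linked through closely spaced reports from different apps.
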