Location Privacy. CompSci. Instructor: Ashwin Machanavajjhala. Some slides are from a tutorial by Mohamed Mokbel (ICDM 2008). Lecture 19: Fall 12. (Image source: news.consumerreports.org)
Outline: Location based services. Location Privacy Challenges. Achieving Location Privacy – Concepts – Solutions. Open Questions.
Location Based Services: “Imagine being a victim of cardiac arrest with about ten minutes to live, and first responders more than ten minutes away. A CPR-trained passerby gets a mobile ping from the fire department that someone nearby needs help; the good Samaritan then rushes to your side, administers CPR, and keeps you alive long enough to get professional help.” (Julie Adler, “Mayor of Starbucks Today, Local Hero Tomorrow: The Power and Privacy Pitfalls of Location Sharing”, June 2011)
Location Based Services: Location based traffic reports – How many cars on ? – What is the shortest travel time? Location based search – “showtimes near me” – Is there an ophthalmologist within 3 miles of my current location? – What is the nearest gas station? Location based advertising/recommendation – Starbucks (0.5 miles away) is giving away free lattes. (Slide labels: Analysis of location data; User initiated; System initiated.)
Location Based Services (figure labels: GIS / Spatial Databases; Mobile Devices; Internet; GPS Devices; Yahoo! Maps, Google Maps, …)
Privacy Threats (source URL fragment: /man-accused-of-stalking-hatfield-woman)
Privacy Threats (source URL fragment: gps-enabled-cell-phones-to-track/)
Location Privacy Laws: “The GPS Act (despite its name) focuses on all these forms of tracking and establishes a uniform standard for government access to location data: Under the bill, government agents would have to go before an independent judge and get a warrant based on ‘probable cause’ before using technological means to track an individual. The Act applies the warrant standard regardless of the technology employed and regardless of whether the government is seeking data on a prospective or retrospective basis.” (Joshua Gruenspecht, “Bill Introduced to Protect Location Privacy”, June 2011)
GPS Act (chaffetz-gps-amendment-text)
Privacy-utility tradeoff. Example: What is my nearest gas station? (Figure axes: Utility, 100% to 0%; Privacy, from 0%.)
Why is Location Privacy different? Database Privacy: each individual’s record must be kept secret; queries are not private; data is usually static; privacy is common across all individuals. Location Privacy: an individual’s current and future locations (and other inferences) must be secret; the queries (locations) themselves are private; the system must tolerate updates to locations; privacy is personalized for different individuals.
Location Perturbation: The user's location is reported as a wrong value. Privacy comes from the fact that the reported location is false. The accuracy and the amount of privacy mainly depend on how far the reported location is from the exact location.
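As a rough illustration of perturbation, the sketch below reports a random point within a chosen distance of the true location. The function name `perturb`, the planar (x, y) coordinates, and the uniform-over-a-disc choice are assumptions made for this example, not something the slides prescribe.

```python
import math
import random

def perturb(x, y, max_dist, rng=random):
    """Return a false location at most max_dist away from the true (x, y)."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # The square root spreads points uniformly over the disc
    # instead of clustering them near the true location.
    dist = max_dist * math.sqrt(rng.random())
    return x + dist * math.cos(angle), y + dist * math.sin(angle)
```

A larger `max_dist` gives more privacy and a less accurate answer, which is the trade-off the slide describes.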
Spatial Cloaking: The user's exact location is represented as a region that contains it. An adversary knows that the user is somewhere in the cloaked region, but has no clue where exactly. The area of the cloaked region sets the trade-off between user privacy and quality of service.
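One simple way to picture spatial cloaking is a fixed-size box placed at a random offset around the user. This is only a sketch under assumed planar coordinates; the name `cloak` and the (x_min, y_min, x_max, y_max) tuple layout are hypothetical.

```python
import random

def cloak(x, y, width, height, rng=random):
    """Return a width x height box (x_min, y_min, x_max, y_max) containing (x, y)."""
    # A random offset keeps the user away from a predictable spot
    # such as the center of the box.
    x_min = x - rng.random() * width
    y_min = y - rng.random() * height
    return (x_min, y_min, x_min + width, y_min + height)
```

Centering the box on the user would let an adversary recover the location from the region itself, which is why the offset is randomized here.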
Spatio-temporal cloaking: In addition to spatial cloaking, the user's information can be delayed for a while to cloak the temporal dimension. Temporal cloaking is tolerable for queries about stationary objects (e.g., gas stations). It is challenging to support queries about moving objects, e.g., “where is my nearest friend?” (Figure axes: X, Y, T.)
Data Dependent Cloaking: If you know the locations of other individuals, you can report a single coarse region that represents all of them. (Figure panels: Naïve cloaking; MBR cloaking.)
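MBR cloaking can be sketched as computing the minimum bounding rectangle of a group of user locations. The function name `mbr` and the tuple layout are assumptions for this example.

```python
def mbr(points):
    """Minimum bounding rectangle (x_min, y_min, x_max, y_max) of (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))
```

For example, `mbr([(1, 5), (3, 2), (2, 4)])` gives `(1, 2, 3, 5)`, and every user in the group would report that same rectangle.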
Space Dependent Cloaking (figure panels: Fixed grid cloaking; Adaptive grid cloaking)
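Fixed grid cloaking can be illustrated by snapping a location to the grid cell that contains it, independent of where other users are. The name `grid_cell` and the square cells anchored at the origin are assumptions of this sketch.

```python
import math

def grid_cell(x, y, cell_size):
    """Return the fixed grid cell (x_min, y_min, x_max, y_max) containing (x, y)."""
    col = math.floor(x / cell_size)
    row = math.floor(y / cell_size)
    return (col * cell_size, row * cell_size,
            (col + 1) * cell_size, (row + 1) * cell_size)
```

An adaptive grid would instead vary the cell size by area; that variant is not shown here.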
K-anonymity: The cloaked region contains at least k users. The user is indistinguishable from the other k - 1 users in the region. The cloaked area largely depends on the surrounding environment: a value of k = 100 may result in a very small area if the user is in a stadium, or a very large area if the user is in the desert.
Queries in Location Services: Private Queries over Public Data – What is my nearest gas station? – The user location is private while the objects of interest are public. Public Queries over Private Data – How many cars are in the downtown area? – The query location is public while the objects of interest are private. Private Queries over Private Data – Where is my nearest friend? – Both the query location and the objects of interest are private.
Modes of Privacy: User Location Privacy – Users want to hide their location information and their query information. User Query Privacy – Users do not mind revealing, or are obligated to reveal, their locations; however, they want to hide their queries. Trajectory Privacy – Users do not mind revealing a few locations; however, they want to prevent these locations from being linked together to form a trajectory.
Solution Architectures for Location Privacy: Client-Server architecture – Users communicate directly with the server and handle anonymization themselves, possibly using an offline phase with a trusted entity. Trusted third party architecture – A centralized trusted entity is responsible for gathering information and providing the required privacy for each user. Peer-to-Peer cooperative architecture – Users collaborate with each other, without the involvement of a centralized entity, to provide customized privacy for each user.
Client-Server (figure labels: Query + Perturbed Location; Location Based Service; Answer)
Client-Server: Clients try to cheat the server using either fake locations or fake space. Simple to implement, easy to integrate with existing technologies. Lower quality of service. Examples: landmark objects, false dummies.
Client-Server Solution 1: Landmarks. Instead of reporting the exact location, report the location of the closest landmark. The query answer will be based on the landmark. Voronoi diagrams can be used to efficiently identify the closest landmark.
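The landmark idea can be sketched with a brute-force nearest-landmark search. It returns the same landmark a Voronoi lookup would, only less efficiently; the name `nearest_landmark` and the planar coordinates are assumptions.

```python
import math

def nearest_landmark(location, landmarks):
    """Return the landmark closest to the true location (linear scan)."""
    return min(landmarks, key=lambda lm: math.dist(location, lm))
```

The client would then send the returned landmark, not `location`, with its query.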
Client-Server Solution 2: False Dummies. A user sends m locations; only one of them is true, while the other m - 1 are false dummies. The server replies with a separate answer for each received location. The user is the only one who knows the true location, and hence the true answer. Generating false dummies is hard: they should follow a pattern similar to a real user's pattern, but with different locations.
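A minimal sketch of the dummy protocol on the client side follows. The dummies here are drawn uniformly at random, which is a deliberate simplification: as the slide notes, realistic dummies are harder to generate. `build_request`, `pick_answer`, and the `span` parameter are hypothetical names.

```python
import random

def build_request(true_loc, m, span, rng=random):
    """Return (locations, true_index): m locations, exactly one of them real."""
    x, y = true_loc
    dummies = [(x + rng.uniform(-span, span), y + rng.uniform(-span, span))
               for _ in range(m - 1)]
    # Hide the real location at a random position in the list.
    true_index = rng.randrange(m)
    locations = dummies[:true_index] + [true_loc] + dummies[true_index:]
    return locations, true_index

def pick_answer(answers, true_index):
    """Keep only the server's answer for the real location."""
    return answers[true_index]
```

The server is assumed to return one answer per location, in the same order as the request.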
Trusted Third Party (figure labels: Location Anonymizer; Query + Cloaked Spatial Location; Location Based Service)
Trusted Third Party: A trusted third party receives the exact locations from clients, blurs the locations, and sends the blurred locations to the server. Provides powerful privacy guarantees with high-quality services. Need to trust a third party …
Mix Zones: A strategy for anonymization under continuous location tracking. The server only sees locations and users’ pseudonyms. A mix zone is like a “no track zone” + “change of pseudonyms”. (Figure labels: Mix Zone; User1234; User1235; User5768; User5678.)
Quad-tree Spatial Cloaking: Achieve k-anonymity, i.e., a user is indistinguishable from k - 1 other users. Recursively divide the space into quadrants until a quadrant has fewer than k users. The previous quadrant, which still meets the k-anonymity constraint, is returned. (Figure caption: Achieve 5-anonymity for …)
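The recursive subdivision can be sketched as a loop that keeps descending into the quadrant holding the user while that quadrant still has at least k users. The function name, the half-open cell convention, and the `max_depth` cap are assumptions of this example; `users` is assumed to include the requesting user.

```python
def quadtree_cloak(user, users, k, region, max_depth=16):
    """Smallest quadrant (x_min, y_min, x_max, y_max) holding user and >= k users."""
    def count(r):
        # Cells are half-open: [x_min, x_max) x [y_min, y_max).
        return sum(r[0] <= x < r[2] and r[1] <= y < r[3] for x, y in users)

    if count(region) < k:
        raise ValueError("fewer than k users in the whole space")
    for _ in range(max_depth):
        x_min, y_min, x_max, y_max = region
        mid_x, mid_y = (x_min + x_max) / 2, (y_min + y_max) / 2
        # Child quadrant that contains the user.
        child = (
            x_min if user[0] < mid_x else mid_x,
            y_min if user[1] < mid_y else mid_y,
            mid_x if user[0] < mid_x else x_max,
            mid_y if user[1] < mid_y else y_max,
        )
        if count(child) < k:
            break  # the child is too sparse, so keep the previous quadrant
        region = child
    return region
```

With users at (1, 1), (1.5, 1.5), (3, 3), (6, 6), (7, 1) in an 8 x 8 space, the user at (1, 1) with k = 3 is cloaked to the quadrant (0, 0, 4, 4).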
Nearest Neighbor k-Anonymization: STEP 1: Determine a set S containing u and u’s k - 1 nearest neighbors. STEP 2: Randomly select v from S. STEP 3: Determine a set S’ containing v and v’s k - 1 nearest neighbors. STEP 4: The cloaked spatial region is the MBR of all users in S’ and u. The random node must be picked first; otherwise, an adversary can reconstruct the location (by picking the centroid of the spatial region). (Figure labels: S; S’.)
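The four steps can be sketched as follows, using a brute-force nearest-neighbor search. The function names are hypothetical, `users` is assumed to include u, and ties in distance are broken by list order.

```python
import math
import random

def k_nearest(center, users, k):
    """center plus its k - 1 nearest neighbors (center is assumed to be in users)."""
    return sorted(users, key=lambda p: math.dist(center, p))[:k]

def nn_cloak(u, users, k, rng=random):
    s = k_nearest(u, users, k)          # Step 1: u and its k - 1 nearest neighbors
    v = rng.choice(s)                   # Step 2: random member of S
    s_prime = k_nearest(v, users, k)    # Step 3: v and its k - 1 nearest neighbors
    points = s_prime + [u]              # Step 4: MBR of S' together with u
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))
```

Because the region is built around the randomly chosen v rather than around u, its centroid no longer points back at u.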
Pyramid Anonymization: Divide the region into grids at different resolutions. Each grid cell maintains the number of users in that cell. To anonymize a user request, traverse the pyramid structure from the bottom level to the top level until a cell satisfying the user's privacy profile is found.
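A minimal sketch of the pyramid follows, assuming a square space of side `size` where level h has 2**h x 2**h cells, and assuming the privacy profile is just a minimum user count k (a real profile may carry more, such as a minimum area). The function names are hypothetical.

```python
from collections import Counter

def build_pyramid(users, size, height):
    """pyramid[h][(col, row)] = number of users in that cell at level h."""
    pyramid = []
    for h in range(height + 1):
        cell = size / 2 ** h
        pyramid.append(Counter((int(x // cell), int(y // cell)) for x, y in users))
    return pyramid

def pyramid_cloak(user, k, pyramid, size):
    """Walk from the finest level up; return the first cell with >= k users."""
    for h in range(len(pyramid) - 1, -1, -1):
        cell = size / 2 ** h
        col, row = int(user[0] // cell), int(user[1] // cell)
        if pyramid[h][(col, row)] >= k:
            return (col * cell, row * cell, (col + 1) * cell, (row + 1) * cell)
    return None  # even the whole space has fewer than k users
```

The counts are precomputed once, so each request only costs one lookup per level.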
Outline: Location based services. Location Privacy Challenges. Algorithms Location Privacy – Concepts – Solutions. Answering Queries over Anonymized Data. Open Questions.
Range Queries. Q1: “Find all gas stations within 5 miles from my location” – Query is private, but results are public. But “my location” is a cloaked region and not a point. Extend the cloaked region by 5 miles in each direction. The database returns all gas stations in the larger region. The client filters out the “extra” gas stations.
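The extend-then-filter procedure can be sketched in two halves: what the server computes from the cloaked region, and what the client does with its true location. Planar coordinates and the function names are assumptions of this example.

```python
import math

def server_candidates(cloaked, radius, stations):
    """Server side: all stations inside the cloaked region grown by radius."""
    x_min, y_min, x_max, y_max = cloaked
    return [s for s in stations
            if x_min - radius <= s[0] <= x_max + radius
            and y_min - radius <= s[1] <= y_max + radius]

def client_filter(candidates, true_loc, radius):
    """Client side: drop the extra stations using the true location."""
    return [s for s in candidates if math.dist(s, true_loc) <= radius]
```

The server never sees `true_loc`; it only pays the cost of returning a superset of the real answer.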
Range Queries. Q1: “Find all gas stations within 5 miles from my location”. Three ways to report the answer: 1. All possible answers. 2. Probabilistic answers. 3. Answers per area.
Range Queries. Q2: Find all cars/people within a certain area – Query is public, but results are private. Objects of interest are represented as cloaked spatial regions, and each object can be anywhere within its region. Any cloaked region that overlaps with the query region is a candidate answer. Can also answer with probabilities: (A, 0.1), (B, 0.2), (C, 1.0), (D, 0.25). (Figure labels: A; B; C; D.)
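One way to obtain such probabilities, assuming each object is uniformly distributed over its cloaked rectangle (an assumption of this sketch, not stated on the slide), is the fraction of the cloaked area that falls inside the query region. The function name is hypothetical and the cloaked region is assumed to have positive area.

```python
def overlap_probability(cloaked, query):
    """Fraction of the cloaked rectangle that lies inside the query rectangle."""
    w = min(cloaked[2], query[2]) - max(cloaked[0], query[0])
    h = min(cloaked[3], query[3]) - max(cloaked[1], query[1])
    if w <= 0 or h <= 0:
        return 0.0  # no overlap: not a candidate answer
    area = (cloaked[2] - cloaked[0]) * (cloaked[3] - cloaked[1])
    return (w * h) / area
```

A region fully inside the query area gets probability 1.0, matching an entry such as (C, 1.0).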
Radius Queries. Q3: “How many friends are there within a 5 mile radius?” – Query is private, objects are also private. Use a combination of the previous two techniques.
Nearest Neighbor Queries. Q1: “Find the gas stations nearest to my location” – Query is private, but results are public. Step 1: Identify a set of candidate answers. Step 2: Return all candidate answers, or determine the probability of each answer, or return answers in terms of areas. (Figure labels: v1; v2; v3; v4.)
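Step 1 can be approximated with a conservative distance filter: an object cannot be the nearest neighbor of any point in the cloaked region if its minimum distance to the region exceeds some other object's maximum distance to it. This is a generic bound chosen for illustration and may return more candidates than the method pictured on the slide; all names are hypothetical.

```python
import math

def min_dist(p, r):
    """Distance from point p to the nearest point of rectangle r."""
    dx = max(r[0] - p[0], 0, p[0] - r[2])
    dy = max(r[1] - p[1], 0, p[1] - r[3])
    return math.hypot(dx, dy)

def max_dist(p, r):
    """Distance from point p to the farthest corner of rectangle r."""
    dx = max(abs(p[0] - r[0]), abs(p[0] - r[2]))
    dy = max(abs(p[1] - r[1]), abs(p[1] - r[3]))
    return math.hypot(dx, dy)

def nn_candidates(cloaked, objects):
    """Superset of the objects that can be nearest to some point in cloaked."""
    bound = min(max_dist(o, cloaked) for o in objects)
    return [o for o in objects if min_dist(o, cloaked) <= bound]
```

The client can then pick its true nearest neighbor from the returned candidates using its exact location.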
Privacy Guarantees: Most existing algorithms provide k-anonymity type guarantees. However, this does not provide privacy against: – Homogeneity attacks: hundreds of people may be at a race track, but one can still learn that an individual was at the race track. – Background knowledge attacks, where the adversary knows something about individuals. – Minimality attacks, where the adversary knows how the algorithm anonymizes the data.
Differential Privacy: Differential privacy tolerates the aforementioned attacks. It can work effectively in the trusted third party model (examples: Montreal traffic data, trajectory anonymization). But there are no good solutions for the typical location based services problem, and no known techniques to personalize differential privacy.
Utility: Cloaking techniques can provide good utility. But if you need to cloak trajectories rather than single locations, utility can degrade. Privacy technology has seen little adoption because of this issue.
Summary: Our locations are being tracked in a number of ways – Search queries – Location based services – GPS – … Defining privacy is tricky. – Data is not static; location keeps changing. – Must be personalized … There are a number of solutions, but they have privacy/utility problems and hence have seen little adoption in real systems.
References: M. Mokbel, “Privacy Preserving Location Services”, Tutorial, ICDM 2008 (see references in the tutorial for more pointers). R. Chen, B. C. Fung, B. Desai, N. Sossou, “Differentially Private Transit Data Publication: A Case Study on the Montreal Transportation System”, KDD 2012. V. Rastogi, S. Nath, “Differentially private aggregation of distributed time-series with transformation and encryption”, SIGMOD 2010.