Quantifying Location Privacy. Reza Shokri, George Theodorakopoulos, Jean-Yves Le Boudec, Jean-Pierre Hubaux. May 2011.

Presentation transcript:

Quantifying Location Privacy. Reza Shokri, George Theodorakopoulos, Jean-Yves Le Boudec, Jean-Pierre Hubaux. May 2011

3 A location trace is not only a set of positions on a map. The contextual information attached to a trace tells much about our habits, interests, activities, and relationships.

4 envisioningdevelopment.net/map

6 Location-Privacy Protection: distort location information before exposing it to others.

7 Location-Privacy Protection
– Anonymization (pseudonymization): replacing the actual username with a random identity
– Location obfuscation: hiding location, adding noise, reducing precision
(Pictures from Krumm 2007: original, low accuracy, low precision)
How to evaluate/compare various protection mechanisms? Which metric to use? A common formal framework is MISSING.

Location Privacy: A Probabilistic Framework

9 [Framework overview diagram] Users u_1 … u_N generate actual traces (vectors of actual events) over the timeline 1 … T. The location-privacy preserving mechanism (LPPM) applies obfuscation and anonymization, producing observed traces (vectors of observed events) attached to pseudonyms 1 … N. The attacker uses past traces (vectors of noisy/missing events) for knowledge construction (KC), building users' mobility profiles as Markov-chain (MC) transition matrices with entries P_ij between regions r_i and r_j, and then runs the attack to obtain reconstructed traces.

10 Location-Privacy Preserving Mechanism: location-obfuscation function (hiding, reducing precision, adding noise, location generalization, …), i.e., a probabilistic mapping of a location to a set of locations (illustrated for Alice).
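To make the probabilistic mapping concrete, here is a minimal sketch (not the paper's implementation) of an obfuscation function that either hides a location or generalizes it to a small block of regions; the grid size, hiding probability, and block size are assumptions chosen purely for illustration.

```python
import random

GRID = 8  # assumption: an 8x8 grid of regions, indexed 0..63

def obfuscate(region, hide_prob=0.1):
    """Probabilistically map one region to a *set* of regions.

    With probability hide_prob the location is hidden (empty set);
    otherwise the region is generalized to the 2x2 block containing it.
    """
    if random.random() < hide_prob:
        return set()                       # hiding: expose nothing
    x, y = region % GRID, region // GRID
    bx, by = x - x % 2, y - y % 2          # top-left corner of the 2x2 block
    return {by * GRID + bx, by * GRID + bx + 1,
            (by + 1) * GRID + bx, (by + 1) * GRID + bx + 1}

# Example: Alice's actual region 27 is reported as a set of 4 regions.
print(obfuscate(27))
```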

11 Location-Privacy Preserving Mechanism: anonymization function, replacing real usernames with random pseudonyms (e.g., integers 1…N), i.e., a random permutation of usernames (illustrated for Alice, Bob, and Charlie).
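A minimal sketch of the anonymization step, assuming usernames are simply shuffled onto integer pseudonyms 1…N; the usernames are just examples.

```python
import random

# Anonymization as a random permutation of usernames onto pseudonyms 1..N.
users = ["Alice", "Bob", "Charlie"]
pseudonyms = list(range(1, len(users) + 1))
random.shuffle(pseudonyms)
sigma = dict(zip(users, pseudonyms))   # e.g. {'Alice': 2, 'Bob': 3, 'Charlie': 1}
print(sigma)
```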

12 Location-Privacy Preserving Mechanism: notation. Spatiotemporal event: a (user, region, time) triple. Actual trace of user u: the vector of her actual events. Observed trace of user u, with pseudonym u': the result of applying location obfuscation (for user u) and then anonymization to the actual trace.
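The slide's formulas were rendered as images; the following is a plausible LaTeX rendering of the notation implied by the labels above (the symbols f_u and σ are assumptions for illustration).

```latex
\begin{align*}
\text{Spatiotemporal event: } & \langle u, r, t\rangle,\quad
    u \in \mathcal{U},\; r \in \mathcal{R},\; t \in \{1,\dots,T\}\\
\text{Actual trace of user } u\text{: } & a_u = \big(a_u(1),\dots,a_u(T)\big)\\
\text{Location obfuscation: } & \text{each region } r \text{ is replaced by a set }
    r' \subseteq \mathcal{R} \text{ with probability } f_u(r' \mid r)\\
\text{Anonymization: } & \sigma : \mathcal{U} \to \{1,\dots,N\}
    \text{, a random permutation of usernames}\\
\text{Observed trace (pseudonym } u' = \sigma(u)\text{): } &
    o_{u'} = \big(o_{u'}(1),\dots,o_{u'}(T)\big)
\end{align*}
```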

13 Adversary Model. Observation: the anonymized and obfuscated traces output by the LPPM. Knowledge: the users' mobility profiles, the PDF of the anonymization function, and the PDF of the obfuscation function.

14 Learning Users' Mobility Profiles (adversary knowledge construction, KC)
From prior knowledge (past traces, i.e., vectors of noisy/missing past events), the attacker creates a mobility profile for each user.
Mobility profile: a Markov chain (MC) on the set of locations, with transition matrix P_u (entry P_ij is the probability of moving from region r_i to region r_j).
Task: estimate the MC transition probabilities P_u.
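For complete (noise-free) training traces, the estimation reduces to counting transitions and normalizing each row; a minimal sketch, with the number of regions, the toy traces, and the add-one smoothing chosen purely for illustration:

```python
import numpy as np

R = 4  # number of regions (assumption for the example)

def mobility_profile(training_traces, R):
    counts = np.ones((R, R))             # Laplace smoothing so rows never sum to 0
    for trace in training_traces:
        for r_prev, r_next in zip(trace, trace[1:]):
            counts[r_prev, r_next] += 1
    return counts / counts.sum(axis=1, keepdims=True)   # row-stochastic P_u

traces = [[0, 1, 2, 3], [0, 1, 1, 3], [0, 2, 2, 3]]     # toy "past traces"
P_alice = mobility_profile(traces, R)
print(P_alice[0])   # distribution over the next region when leaving region 0
```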

15 Example – Simple Knowledge Construction
Prior knowledge for Alice (in this example: 100 training traces): one row per day, giving her location at 8am, 9am, 10am, 11am, …
Counting transitions across the training traces yields Alice's mobility profile; in the example, several of the transitions get probability ⅓.
How to consider noisy/partial traces? E.g., knowing only the user's location in the morning (her workplace) and her location in the evening (her home).

16 Learning Users' Mobility Profiles (adversary knowledge construction, KC), continued
From prior knowledge, the attacker creates a mobility profile for each user; the profile is a Markov chain on the set of locations, and the task is to estimate the MC transition probabilities P_u.
Our solution: a Monte-Carlo method, Gibbs sampling, to estimate the probability distribution of the users' mobility profiles from noisy/partial traces.
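A hedged sketch of this idea (not the paper's code): a Gibbs sampler that alternates between filling in the missing locations of a past trace given the current transition matrix, and resampling the transition matrix from its Dirichlet posterior given the completed trace. The uniform prior, the toy trace, and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

R = 4                                     # number of regions
trace = [0, None, None, 3, 0, None, 2]    # one past trace with missing events

def gibbs_profile(trace, R, iters=1000, burn_in=500):
    P = np.full((R, R), 1.0 / R)          # start from a uniform Markov chain
    filled = [r if r is not None else int(rng.integers(R)) for r in trace]
    samples = []
    for it in range(iters):
        # (a) resample each missing location given its neighbours and P
        for t, r in enumerate(trace):
            if r is not None:
                continue
            w = np.ones(R)
            if t > 0:
                w *= P[filled[t - 1], :]
            if t < len(trace) - 1:
                w *= P[:, filled[t + 1]]
            filled[t] = int(rng.choice(R, p=w / w.sum()))
        # (b) resample each row of P from Dirichlet(1 + transition counts)
        counts = np.zeros((R, R))
        for a, b in zip(filled, filled[1:]):
            counts[a, b] += 1
        P = np.array([rng.dirichlet(1 + counts[i]) for i in range(R)])
        if it >= burn_in:
            samples.append(P)
    return np.mean(samples, axis=0)        # posterior-mean mobility profile

print(gibbs_profile(trace, R).round(2))
```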

17 Adversary Model: Inference Attack Examples
Observation: anonymized and obfuscated traces (from the LPPM). Knowledge: users' mobility profiles, the anonymization PDF, and the obfuscation PDF.
– Localization attack: "Where was Alice at 8pm?" What is the probability distribution over the locations of user Alice at time 8pm?
– Tracking attack: "Where did Alice go yesterday?" What is the most probable trace (trajectory) of user Alice for the period yesterday?
– Meeting disclosure attack: "How many times did Alice and Bob meet?"
– Aggregate presence disclosure: "How many users were present at restaurant x at 9pm?"

18 Inference Attacks. Computing the attack exactly is computationally infeasible: σ (the anonymization permutation) can take N! values. Our solution: decoupling de-anonymization from de-obfuscation.

19 De-anonymization
1 – Compute the likelihood of observing trace i from user u, for all i and u, using the hidden Markov process forward-backward algorithm: O(R²N²T).
2 – Compute the most likely assignment of pseudonyms 1…N to users u_1…u_N using a maximum-weight assignment algorithm (e.g., the Hungarian algorithm): O(N⁴).
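Assuming the log-likelihoods from step 1 are available as a matrix, step 2 can be sketched with a standard maximum-weight assignment routine; scipy's linear_sum_assignment is used here purely as an illustration of the matching step, and the toy likelihoods are made up.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# loglik[i, u] = log Pr[observed trace i | mobility profile of user u]
# (computed elsewhere, e.g., with the HMM forward algorithm).
def deanonymize(loglik):
    # linear_sum_assignment minimizes cost, so negate the log-likelihoods
    nyms, users = linear_sum_assignment(-loglik)
    return dict(zip(nyms, users))          # pseudonym index -> user index

loglik = np.log(np.array([[0.7, 0.2, 0.1],   # toy values: 3 pseudonyms x 3 users
                          [0.1, 0.1, 0.8],
                          [0.3, 0.6, 0.1]]))
print(deanonymize(loglik))   # {0: 0, 1: 2, 2: 1}
```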

20 De-obfuscation
Localization attack: given the most likely assignment σ*, the localization probability can be computed using the hidden Markov model forward-backward algorithm: O(R²T).
Tracking attack: given the most likely assignment σ*, the most likely trace for each user can be computed using the Viterbi algorithm: O(R²T).
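A minimal forward-backward sketch for the localization attack, assuming the adversary has the user's transition matrix, an initial distribution, and per-time-step observation likelihoods derived from the obfuscation PDF; the variable names and the toy example are illustrative only.

```python
import numpy as np

def localization_posterior(P, pi, emis):
    """post[t, r] = Pr[actual region r at time t | observed trace].

    P: (R, R) transition matrix, pi: (R,) initial distribution,
    emis[t, r] = Pr[observed event at t | actual region r].
    """
    T, R = emis.shape
    alpha = np.zeros((T, R)); beta = np.zeros((T, R))
    alpha[0] = pi * emis[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):                        # forward pass (normalized)
        alpha[t] = (alpha[t - 1] @ P) * emis[t]
        alpha[t] /= alpha[t].sum()
    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):               # backward pass (normalized)
        beta[t] = P @ (emis[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

# Toy example with 2 regions and 3 time steps:
P    = np.array([[0.9, 0.1], [0.2, 0.8]])
pi   = np.array([0.5, 0.5])
emis = np.array([[1.0, 0.2], [0.5, 0.5], [0.1, 1.0]])
print(localization_posterior(P, pi, emis).round(2))
```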

Location-Privacy Metric

22 Assessment of Inference Attacks
In an inference attack, the adversary estimates the true value of some random variable X (e.g., the location of a user at a given time instant). Let x_c (unknown to the adversary) be the actual value of X.
Three properties of the estimation's performance:
– How focused is the estimate on a single value? The entropy of the estimated random variable.
– How accurate is the estimate? Confidence level and confidence interval.
– How close is the estimate to the true value (the real outcome)?

23 Location-Privacy Metric
The true outcome of a random variable is what users want to hide from the adversary. Hence, the incorrectness of the adversary's inference attack is the metric that defines the privacy of users.
Location privacy of user u at time t with respect to the localization attack = incorrectness of the adversary (the expected estimation error):
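The slide's formula was an image; the following is a plausible rendering of the expected estimation error, where r_u(t) denotes user u's actual region at time t and d(·,·) is a distance (distortion) function chosen by the evaluator.

```latex
\mathrm{LP}_u(t) \;=\; \sum_{\hat{r} \in \mathcal{R}}
    \Pr\big[\hat{r} \mid \text{observed traces}\big]\; d\big(\hat{r},\, r_u(t)\big)
```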

Location-Privacy Meter A Tool to Quantify Location Privacy

25 Location-Privacy Meter (LPM)
You provide the tool with:
– some traces to learn the users' mobility profiles
– the PDF associated with the protection mechanism
– some traces to run the tool on
LPM provides you with:
– the location privacy of users with respect to various attacks: localization, tracking, meeting disclosure, aggregate presence disclosure, …

26 LPM: An Example
CRAWDAD dataset, N = 20 users, R = 40 regions, T = 96 time instants.
Protection mechanism:
– Anonymization
– Location obfuscation: hiding the location, and precision reduction (dropping low-order bits from the x, y coordinates of the location)
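A minimal sketch of the precision-reduction obfuscation, assuming integer x, y coordinates and that "dropping low-order bits" means masking the b least-significant bits of each coordinate (b is an assumption for the example):

```python
def reduce_precision(x, y, b=2):
    """Drop the b low-order bits of each coordinate.

    Every point inside the same 2^b x 2^b block is reported identically.
    """
    mask = ~((1 << b) - 1)
    return x & mask, y & mask

print(reduce_precision(37, 58))   # -> (36, 56)
```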

27 LPM: Results – Localization Attack [results plot; includes a "no obfuscation" baseline]

28 Assessment of Other Metrics: entropy and k-anonymity

29 Conclusion
– A unified formal framework to describe and evaluate a variety of location-privacy preserving mechanisms with respect to various inference attacks
– LPPM evaluation modeled as an estimation problem: throw attacks at the LPPM
– The right metric: expected estimation error
– An object-oriented tool (Location-Privacy Meter) to evaluate/compare location-privacy preserving mechanisms

31 Hidden Markov Model (example for Alice)
Hidden states: Alice's actual regions (11, 6, 14, …), governed by her mobility profile: initial probability P_Alice(11) and transition probabilities P_Alice(11→6), P_Alice(6→14), …
Observations O_i: the obfuscated location sets {11,12,13}, {6,7,8}, {14,15,16}, {18,19,20}, …, emitted by the LPPM with probabilities such as P_LPPM(6→{6,7,8}).
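Reading the example as a standard HMM, the joint probability of Alice's actual regions and the observed sets factorizes into her mobility-profile (transition) terms and the LPPM emission terms listed above; a hedged rendering of what the original slide showed as a diagram:

```latex
\Pr[\text{trace},\text{observations}] =
  P_{\mathrm{Alice}}(11)\, P_{\mathrm{LPPM}}\big(11 \to \{11,12,13\}\big)\,
  P_{\mathrm{Alice}}(11 \to 6)\, P_{\mathrm{LPPM}}\big(6 \to \{6,7,8\}\big)\,
  P_{\mathrm{Alice}}(6 \to 14)\, P_{\mathrm{LPPM}}\big(14 \to \{14,15,16\}\big)\cdots
```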