1
Privacy Vulnerability of Published Anonymous Mobility Traces Chris Y. T. Ma, David K. Y. Yau, Nung Kwan Yip (Purdue University), Nageswara S. V. Rao (Oak Ridge National Laboratory)
2
Motivation: Collecting mobility traces Mobile network applications ◦ traffic monitoring, road surface sensing, radiation and chemical detection Mobility traces are collected and published to assist the design, analysis, and evaluation of mobile networks ◦ e.g., CRAWDAD
3
Motivation: Privacy vulnerability Measures are carried out to protect the privacy of the participants ◦ Traces are identified by a random but consistent and unique identifier that is not correlated with the real ID ◦ Spatial and temporal granularities are reduced
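For concreteness, a minimal sketch of these two sanitization steps, assuming each raw trace is a list of (timestamp, x, y) samples in seconds and metres; the helper name and field layout are illustrative rather than the authors' code, and the granularities are chosen to match the 1 km / 1 minute values used later in the experiments.

```python
import uuid

SPATIAL_CELL_M = 1000    # metres per grid cell (coarsened spatial granularity)
TEMPORAL_BIN_S = 60      # seconds per time bin (coarsened temporal granularity)

def sanitize_trace(samples):
    """samples: list of (unix_time_s, x_m, y_m) for one participant.
    Returns (pseudonym, coarsened samples) ready for publication."""
    pseudonym = uuid.uuid4().hex                          # random, consistent, unique identifier
    coarse = [((t // TEMPORAL_BIN_S) * TEMPORAL_BIN_S,    # snap time to its bin
               int(x // SPATIAL_CELL_M),                  # grid-cell index (x)
               int(y // SPATIAL_CELL_M))                  # grid-cell index (y)
              for t, x, y in samples]
    return pseudonym, coarse
```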
4
Motivation: Privacy vulnerability These measures are not enough! ◦ Participants can be openly observed ◦ Participants may leak their location information (snapshots of time and location pairs, termed side information) via web blogs, status updates in social networks, tweets, casual conversations, etc. An adversary, who tries to identify the complete trace (movement history) of one or more participants, may succeed with high probability
5
Our contributions Comprehensive study of attack strategies ◦ Various ways of collecting side information ◦ Analytically proved the optimality of the attack strategy ◦ Quantitative simulation results Privacy implications of the characteristics of real and synthetic traces ◦ Synthetic nodes are more sparsely placed: more easily identified, but more difficult to meet with
6
Agenda Problem formulation Analytical derivation Experimental analysis Conclusion
7
Problem formulation - trace sampling and publication
8
Problem formulation An adversary tries to identify the complete movement history of the participant(s) ◦ collects side information and compares it with the published traces Possible attack scenarios ◦ Adversary infers the location of a victim indirectly (passive adversary) ◦ Adversary observes the movement of the victims physically (active adversary)
9
Passive Adversary - infers snapshots of victim Special case: reference times are sampling times
10
Passive Adversary - infers snapshots of victim General case: reference times are not sampling times
11
Passive Adversary - infers snapshots of victim General case: reference times are not sampling times Infers the possible location of the node at reference times using a general mobility model - preference of the nodes, physical constraints
12
Passive Adversary - infers snapshots of victim General case: reference times are not sampling times Infers the possible location of the node at reference times using a general mobility model
13
Passive Adversary - infers snapshots of victim General case: reference times are not sampling times
14
Attack approaches of passive adversary Uses a Bayesian approach to determine which published trace gives the best match with the inferred, noisy side information
15
Attack approaches of passive adversary For the special case (reference times = sampling times), the matching likelihood is derived under the assumption that the noise is i.i.d. For the general case, it is derived under the assumptions that the noise is i.i.d. and the movement is Markovian
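A hedged reconstruction of the matching rule described above (the notation is assumed, not taken from the slides): o_i is the noisy side-information snapshot at reference time t_i, x_j(t) is the published location of trace j at time t, and p_n is the i.i.d. noise density.

```latex
% Special case (reference times coincide with sampling times):
\hat{\jmath} \;=\; \arg\max_{j}\; \prod_{i=1}^{m} p_n\!\bigl(o_i - x_j(t_i)\bigr)

% General case (s_k \le t_i \le s_{k+1} are the sampling times surrounding t_i):
% the unpublished location X_j(t_i) is marginalized under the Markovian
% mobility model, conditioned on the adjacent published samples.
P(o_i \mid j) \;=\; \sum_{x} p_n\!\bigl(o_i - x\bigr)\,
  P\!\bigl(X_j(t_i) = x \,\big|\, x_j(s_k),\, x_j(s_{k+1})\bigr),
\qquad
\hat{\jmath} \;=\; \arg\max_{j}\; \prod_{i=1}^{m} P(o_i \mid j)
```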
16
Attack approaches of passive adversary Maximum Likelihood Estimator (MLE) approach Minimum Square (MSQ) approach Basic (BAS) approach Weighted Exponential (EXP) approach When the noise is Gaussian, MLE and MSQ are equivalent
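As an illustration of the MSQ idea (a sketch under an assumed data layout and names, not the paper's implementation), the attacker picks the published trace that minimizes the summed squared distance to the side-information snapshots. With i.i.d. Gaussian noise of fixed variance, this minimizer is also the maximum-likelihood choice, which is the MLE/MSQ equivalence noted above.

```python
import math

def msq_match(side_info, published_traces):
    """side_info: list of (t, x, y) noisy snapshots of the victim.
    published_traces: dict pseudonym -> {t: (x, y)} published samples.
    Returns the pseudonym of the best-matching (minimum squared distance) trace."""
    best_id, best_cost = None, math.inf
    for pid, trace in published_traces.items():
        cost = 0.0
        for t, ox, oy in side_info:
            if t in trace:                        # special case: t is a sampling time
                tx, ty = trace[t]
                cost += (ox - tx) ** 2 + (oy - ty) ** 2
        if cost < best_cost:
            best_id, best_cost = pid, cost
    return best_id
```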
17
Active Adversary - observes victims physically Adversary is one of the participants
18
Active Adversary - observes victims physically Adversary stays at a (popular) position
19
Active Adversary - observes victims physically Adversary travels between popular locations
20
Problem formulation Why the two different cases? ◦ Active: needs to consider how to collect the side information physically as time evolves; the adversary tries to identify as many victims as possible – plot k-anonymity as a function of time ◦ Passive: snapshots of the victim are inferred (not collected) and are less accurate in general; the adversary tries to identify one victim only – plot correctness as a function of the number of pieces of side information
21
Attack strategy of active adversary Algorithm of the attack (in action): for each observed real ID, the adversary keeps the set of published trace IDs consistent with all observations so far – e.g. at time t1, real IDs 1, 2, and 3 each have candidate traces {A, B, C}; after the meeting at t2, the candidate sets shrink to {A, B}, {A, B}, and {A, B, C} respectively
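A minimal sketch of this candidate-elimination step, assuming the adversary records, at each meeting time, the cell where it observed each victim and can look up the published (anonymized) locations at that time; the data layout and names are illustrative. A victim's k-anonymity at any point is the size of its surviving candidate set, and identification succeeds once that set shrinks to a single trace ID.

```python
def prune_candidates(observations, published_at):
    """observations: dict time -> {real_id: cell} where the adversary saw each victim.
    published_at:   dict time -> {trace_id: cell} anonymized locations at that time.
    Returns real_id -> set of trace IDs still consistent with every observation."""
    candidates = {}
    for t in sorted(observations):
        traces_here = published_at.get(t, {})
        for real_id, cell in observations[t].items():
            matching = {tid for tid, c in traces_here.items() if c == cell}
            if real_id in candidates:
                candidates[real_id] &= matching    # e.g. {A, B, C} -> {A, B} at t2
            else:
                candidates[real_id] = matching     # e.g. {A, B, C} at t1
    return candidates  # len(candidates[v]) is victim v's k-anonymity
```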
22
Experimental analysis Basic information ◦ Real traces: 536 San Francisco taxi cabs, 2348 Shanghai Grid buses ◦ Synthetic traces: random waypoint (with different maximum trip lengths) and random walk, using the map size and average speed computed from the cab traces ◦ Spatial granularity = 1 km ◦ Temporal granularity = 1 minute (unless stated otherwise)
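For reference, a minimal random waypoint generator in the spirit of the setup above; the default map size and speed are placeholders (the paper derives them from the cab traces), pause times are omitted, and max_trip_km caps the distance to each new waypoint.

```python
import math
import random

def random_waypoint_trace(duration_min, map_km=50.0, speed_kmph=20.0,
                          max_trip_km=None, step_min=1):
    """One synthetic trace, sampled every step_min minutes (illustrative parameters)."""
    x = random.uniform(0.0, map_km)
    y = random.uniform(0.0, map_km)
    tx, ty = x, y                                   # current waypoint
    trace = []
    for _ in range(0, duration_min, step_min):
        trace.append((int(x), int(y)))              # 1 km spatial granularity
        if (x, y) == (tx, ty):                      # waypoint reached: draw a new one
            while True:
                tx = random.uniform(0.0, map_km)
                ty = random.uniform(0.0, map_km)
                if max_trip_km is None or math.hypot(tx - x, ty - y) <= max_trip_km:
                    break
        step_km = speed_kmph * step_min / 60.0      # distance covered in one step
        d = math.hypot(tx - x, ty - y)
        if d <= step_km:
            x, y = tx, ty
        else:
            x += (tx - x) / d * step_km
            y += (ty - y) / d * step_km
    return trace
```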
23
Characteristics of the traces Distance between traces Real traces are closer to each other on average ◦ Bus traces have a broader range For synthetic traces, the shorter the trip length, the farther apart they are from each other in general
24
Significant observations The lack of preferred locations and the random initial locations of the synthetic traces mean that nodes are more sparsely distributed in the network Implications: – For the adversary in general: can easily identify the trace of a synthetic node, since no other traces share a similar path – For the active adversary: it may take a longer time to meet with each synthetic node
25
Attack performance Passive adversary (special case) Special case – side information inferred at the sampling times of the traces Correct assumption of noise (Gaussian) Cab traces Observations ◦ MLE and MSQ perform equally well ◦ BAS gives the fewest wrong conclusions initially
26
Attack performance Passive adversary (special case) Random waypoint traces Most efficient attack ◦ traces have very different paths
27
Attack performance Passive adversary (special case) Incorrect assumption of noise ◦ Assumed: uniform ◦ Actual: Gaussian Cab traces Observations ◦ MLE performs much worse
28
Attack performance Passive adversary (general case) General case – side information at times different from the trace sampling times Worst-case scenario – all times are different Infer the location of the victim using the mobility model Gaussian noise (the no-noise case serves as a best-performance bound) Cab traces
29
Summary Passive adversary For the passive adversary ◦ MLE and MSQ give the best performance among the four approaches in terms of the fraction of correct conclusions ◦ Since MLE relies on knowledge of the type of noise and its magnitude, MSQ is the preferred, more robust attack approach
30
Attack performance Active adversary as one of the mobile nodes Higher attack efficiency for real traces ◦ Mobile nodes are more likely to visit the same set of locations at the same time ◦ Synthetic nodes are more sparsely distributed in the network 1 time step = 1 minute
31
Attack performance Active adversary who stays at one of the cells (cabs, buses, random waypoint, random walk) Observations ◦ Comparing real traces and synthetic traces: attacks on real traces are more efficient – k-anonymity drops more quickly ◦ Popular cells in real traces and random waypoint traces are more aggregated together ◦ Being at a popular cell does not necessarily result in higher attack efficiency
32
Attack performance Active adversary who moves among popular cells (cabs, buses, random waypoint, random walk) The ability to move among popular cells improves attack efficiency ◦ The improvement is more significant if node movements are more localized ◦ Visiting more cells does not necessarily improve efficiency
33
Conclusion Studied how privacy leaks through trace publication ◦ under different adversary strategies for collecting side information ◦ using different mobility traces with different characteristics Experimentally showed that the adversary is able to identify the trace of a victim from the published set with high probability