PRIVÉ: Anonymous Location-Based Queries in Distributed Mobile Systems
Gabriel Ghinita (1), Panos Kalnis (1), Spiros Skiadopoulos (2)
(1) National University of Singapore (2) University of Peloponnese, Greece
Location-Based Services (LBS) LBS users Mobile devices with GPS capabilities Spatial database queries Queries NN and Range Queries Location server is NOT trusted “Find closest hospital to my present location”
Problem Statement Queries may disclose sensitive information Query through anonymous web surfing service But user location may disclose identity Triangulation of device signal Publicly available databases Physical surveillance How to preserve query source anonymity? Even when exact user locations are known
Solution Overview Anonymizing Spatial Region (ASR) Identification probability ≤ 1/K Minimize overhead Reduce ASR extent Fast ASR assembly time Support user mobility
Central Anonymizer Architecture Intermediate tier between users and LBS Bottleneck and single point of attack/failure
PRIVÉ Architecture
K-Anonymity * (a) Microdata, columns Age, ZipCode, Disease; Disease values: Ulcer, Pneumonia, Flu, Gastritis, Dyspepsia, Bronchitis (b) Voting Registration List (public), columns Name, Age, ZipCode; Name values: Andy, Bill, Ken, Nash, Mike, Sam * L. Sweeney. k-Anonymity: A Model for Protecting Privacy. Int. J. of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5): ,
K-Anonymity * (a) 2-anonymous microdata, columns Age, ZipCode, Disease; Disease values: Ulcer, Pneumonia, Flu, Gastritis, Dyspepsia, Bronchitis (b) Voting Registration List (public), columns Name, Age, ZipCode; Name values: Andy, Bill, Ken, Nash, Mike, Sam * L. Sweeney. k-Anonymity: A Model for Protecting Privacy. Int. J. of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5): ,
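The k-anonymity condition on this slide can be checked mechanically: every combination of quasi-identifier values must be shared by at least k records. The sketch below is only an illustration; the function name `is_k_anonymous` and all Age and ZipCode ranges are hypothetical, since the slide's original table values are not available here.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


# Hypothetical generalized microdata: the Age and ZipCode ranges are invented
# for illustration; only the disease names come from the slide.
microdata = [
    {"Age": "20-29", "ZipCode": "10000-19999", "Disease": "Ulcer"},
    {"Age": "20-29", "ZipCode": "10000-19999", "Disease": "Pneumonia"},
    {"Age": "30-39", "ZipCode": "20000-29999", "Disease": "Flu"},
    {"Age": "30-39", "ZipCode": "20000-29999", "Disease": "Gastritis"},
    {"Age": "40-49", "ZipCode": "30000-39999", "Disease": "Dyspepsia"},
    {"Age": "40-49", "ZipCode": "30000-39999", "Disease": "Bronchitis"},
]

print(is_k_anonymous(microdata, ["Age", "ZipCode"], 2))  # True
print(is_k_anonymous(microdata, ["Age", "ZipCode"], 3))  # False
```

Each (Age, ZipCode) group in this invented table holds exactly two records, so it is 2-anonymous but not 3-anonymous.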
Relational and Spatial Anonymity [Figure: records plotted by Zip (axis ticks up to 55k) and Age]
Existing Cloaking Solutions
Redundant Queries Send K-1 redundant queries Gives away exact location of users Potentially high overhead
CloakP2P [Chow06] Find K-1 NN of query source Source likely to be closest to ASR center Vulnerable to “center-of-ASR” attack NOT SECURE! [Figure: query source u_q inside a 5-ASR] [Chow06] – Chow et al, A Peer-to-Peer Spatial Cloaking Algorithm for Anonymous Location-based Services, ACM GIS ’06
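A small deterministic sketch can make the “center-of-ASR” weakness concrete. It assumes the ASR is the bounding box of the source and its K-1 nearest neighbors, and that the attacker simply guesses the enclosed user closest to the box center. The function names and user coordinates are hypothetical, and this is not the CloakP2P protocol itself.

```python
import math


def knn_asr(users, source, k):
    """Bounding box of the source and its k-1 nearest neighbours."""
    sx, sy = users[source]
    ranked = sorted(
        range(len(users)),
        # Sort by distance to the source; the second key puts the source first on ties.
        key=lambda i: (math.hypot(users[i][0] - sx, users[i][1] - sy), i != source),
    )
    members = ranked[:k]
    xs = [users[i][0] for i in members]
    ys = [users[i][1] for i in members]
    return members, (min(xs), min(ys), max(xs), max(ys))


def center_of_asr_guess(users, members, box):
    """Attacker's guess: the enclosed user closest to the centre of the ASR."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    return min(members, key=lambda i: math.hypot(users[i][0] - cx, users[i][1] - cy))


# Hypothetical user positions; user 2 issues the query with K = 5.
users = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (10, 0), (11, 0)]
members, box = knn_asr(users, source=2, k=5)
print(center_of_asr_guess(users, members, box))  # 2, the true query source
```

In this layout the four nearest neighbors sit symmetrically around the source, so the box center coincides with the source and the attacker identifies it with certainty rather than with probability 1/K.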
QuadASR [Gru03, Mok06] Quad-tree based Fails to preserve anonymity for outliers Unnecessarily large ASR size Example: Let K=3 If any of u_1, u_2, u_3 queries, ASR is A_1 If u_4 queries, ASR is A_2 u_4’s identity is disclosed NOT SECURE! [Figure: users u_1 to u_4 with ASRs A_1 and A_2] [Gru03] – Gruteser et al, Anonymous Usage of Location-Based Services Through Spatial and Temporal Cloaking, MobiSys 2003 [Mok06] – Mokbel et al, The New Casper: Query Processing for Location Services without Compromising Privacy, VLDB 2006
Secure Location Anonymization
Reciprocity Consider querying user u_q and ASR A_q Let AS_q = {set of users enclosed by A_q} A_q has the reciprocity property iff (i) |AS_q| ≥ K (ii) ∀ u_i, u_j ∈ AS_q: u_i ∈ AS_j ⇔ u_j ∈ AS_i
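The two conditions can be turned into a direct check. The sketch below assumes each user's anonymizing set is already known and stored in a dictionary; the name `has_reciprocity` and both example assignments are hypothetical. The first example mirrors the QuadASR outlier case with K=3, under the assumption that A_2 encloses all four users.

```python
def has_reciprocity(anonymizing_sets, q, k):
    """Check conditions (i) and (ii) for the ASR of querying user q."""
    as_q = anonymizing_sets[q]
    if len(as_q) < k:  # condition (i): at least K users enclosed
        return False
    # Condition (ii): membership must be mutual for every pair in AS_q.
    return all(
        (ui in anonymizing_sets[uj]) == (uj in anonymizing_sets[ui])
        for ui in as_q
        for uj in as_q
    )


# QuadASR-style assignment (assumed): u1..u3 share a small ASR, u4 gets a larger one.
quad = {
    "u1": {"u1", "u2", "u3"},
    "u2": {"u1", "u2", "u3"},
    "u3": {"u1", "u2", "u3"},
    "u4": {"u1", "u2", "u3", "u4"},
}
print(has_reciprocity(quad, "u4", 3))  # False: u1 is in AS_4 but u4 is not in AS_1

# Bucket-style assignment: every user in a bucket reports the same set.
buckets = {u: {"a", "b", "c"} for u in "abc"}
buckets.update({u: {"d", "e", "f"} for u in "def"})
print(all(has_reciprocity(buckets, u, 3) for u in buckets))  # True
```

When all members of a group share one and the same set, condition (ii) holds trivially, which is why a fixed partition of the users satisfies reciprocity.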
hilbASR Based on Hilbert space-filling curve Index users by Hilbert value of location Partition Hilbert sequence into “K-buckets” [Figure: Hilbert curve with Start and End marked]
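The two steps on this slide (index by Hilbert value, then cut the sequence into K-buckets) can be sketched as follows. This is a minimal centralized illustration, not the authors' implementation. It assumes integer coordinates on an n-by-n grid with n a power of two, and it assumes that leftover users at the end of the sequence are merged into the last bucket. All function names and user positions are hypothetical.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its Hilbert value."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate/flip the quadrant so the curve stays continuous
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def k_buckets(ordered_ids, k):
    """Split the Hilbert-ordered users into groups of k; leftovers join the last group."""
    if len(ordered_ids) < k:
        raise ValueError("fewer than k users: no K-ASR can be built")
    n_buckets = len(ordered_ids) // k
    buckets = [ordered_ids[i * k:(i + 1) * k] for i in range(n_buckets)]
    buckets[-1].extend(ordered_ids[n_buckets * k:])  # assumption: merge the remainder
    return buckets


def hilb_asr(users, source, k, n):
    """Return the bucket containing `source` and its bounding box (the ASR)."""
    ordered = sorted(users, key=lambda u: (hilbert_index(n, *users[u]), u))
    for bucket in k_buckets(ordered, k):
        if source in bucket:
            xs = [users[u][0] for u in bucket]
            ys = [users[u][1] for u in bucket]
            return bucket, (min(xs), min(ys), max(xs), max(ys))
    raise KeyError(source)


# Hypothetical users on a 4x4 grid.
users = {"a": (0, 0), "b": (1, 1), "c": (0, 3), "d": (3, 3),
         "e": (3, 0), "f": (2, 1), "g": (1, 2)}
print(hilb_asr(users, "b", k=3, n=4))
```

Because the bucket boundaries depend only on the ordered sequence and not on who issues the query, every user in a bucket obtains the same ASR.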
Advantages of hilbASR Guarantees source privacy K-ASRs have the “reciprocity” property Reduced ASR size Hilbert ordering preserves locality well K-ASR includes exactly K users (in most cases) Efficient ASR assembly and user relocation Balanced, annotated index tree User relocation, ASR assembly in O(log #users)
hilbASR with Annotated Index K=6 Example
PRIVÉ
PRIVÉ Characteristics P2P overlay network Resembles annotated B+-tree Hierarchical clustering architecture Bounded cluster size [,3) [Figure: S relocates to 60]
Relocation
PRIVÉ Protocol Users self-organize into clusters Bounded cluster size [,3) Cluster head handles operations State replicated at each cluster peer Operations Join/Departure Similar to B-tree insert/delete Relocation Handled bottom-up, restrict propagation K-request Decentralized implementation of hilbASR
Operation Complexity
Operation        Latency          Communication Cost
Join/Departure   log N            log N +
Relocation       log N            log N +
K-request        log N + log K    log N + K/
Load Balancing Hierarchical architecture Inherent imbalance in peer load Cluster head rotation mechanism Rotation triggered by load Communication cost predominant
Fault Tolerance Soft-state mechanism Cluster membership periodically updated Recovery facilitated by state replication Leader election protocol In case of cluster head failure
Experimental Evaluation
Experimental Setup San Francisco Bay Area road network Network-based Generator of Moving Objects * Up to users Velocities from 18 to 68 km/h Uniform and skewed query distributions Anonymity degree K in the range [10, 160] * T. Brinkhoff. A Framework for Generating Network-Based Moving Objects. Geoinformatica, 6(2):153–180, 2002.
Anonymity Strength (center-of-ASR)
ASR Size
Query Efficiency
Relocation Efficiency
Load Balancing [Figure: x-axis Node Fraction, 0% to 100%]
Conclusions LBS Privacy an important concern Existing solutions have no privacy guarantees Centralized approach has limitations Poor scalability, legal issues Contribution Anonymization with privacy guarantees hilbASR Extension to decentralized systems Improved scalability and availability No single point-of-attack/failure
Ongoing & Future Work Relational DB: Employ space mapping techniques to achieve k-anonymity and l-diversity We outperform existing “state-of-the-art” Space/Data Partitioning and Clustering Spatial anonymity: Address anonymization of trajectories As opposed to point locations
Ongoing & Future Work Address anonymization of trajectories As opposed to point locations Infrastructure-less scenario
Bibliography on LBS Privacy
Bibliography
[Chow06] – Chow et al, A Peer-to-Peer Spatial Cloaking Algorithm for Anonymous Location-based Services, ACM GIS 2006
[Gru03] – Gruteser et al, Anonymous Usage of Location-Based Services Through Spatial and Temporal Cloaking, MobiSys 2003
[Ged05] – Gedik et al, Location Privacy in Mobile Systems: A Personalized Anonymization Model, ICDCS 2005
[Mok06] – Mokbel et al, The New Casper: Query Processing for Location Services without Compromising Privacy, VLDB 2006
MobiHide Randomized ASR assembly technique: Also uses Hilbert ordering ASR chosen as random K-user sequence Advantages No global knowledge required Flat index structure (Chord DHT) Disadvantages No privacy guarantees for skewed query distributions but still strong anonymity in practice
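The randomized assembly can be sketched as picking a uniformly random window of K consecutive users in the Hilbert order that contains the source. The sketch assumes the Hilbert-ordered list of user identifiers is already available and that the sequence wraps around as on a ring, which seems natural for a Chord DHT but is an assumption here. The name `mobihide_group` is hypothetical.

```python
import random


def mobihide_group(ordered_ids, source, k, rng=random):
    """Pick K consecutive users (with wrap-around) at a random offset around the source."""
    n = len(ordered_ids)
    if k > n:
        raise ValueError("fewer than k users in the system")
    pos = ordered_ids.index(source)
    offset = rng.randrange(k)   # the source's rank inside the chosen window
    start = (pos - offset) % n  # assumption: the sequence is treated as circular
    return [ordered_ids[(start + i) % n] for i in range(k)]


# Hypothetical Hilbert-ordered user identifiers.
ordered_ids = ["u0", "u1", "u2", "u3", "u4", "u5", "u6", "u7"]
print(mobihide_group(ordered_ids, "u1", k=4, rng=random.Random(7)))
```

Unlike the fixed K-buckets of hilbASR, the window differs from query to query, so no global partition is needed. The price is that the resulting sets are not guaranteed to satisfy reciprocity.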