Download presentation
Presentation is loading. Please wait.
Published byMarjorie Boyd Modified over 9 years ago
1
Privately Querying Location-based Services with SybilQuery Pravin Shankar, Vinod Ganapathy, and Liviu Iftode Department of Computer Science Rutgers University { spravin, vinodg, iftode } @ cs.rutgers.edu
2
Sep 9, 2010IBM Frontiers of Cloud Computing 20102 Location-based Services (LBSes) Implicit assumption: Users agree to reveal their locations for access to services How is the traffic in the road ahead? Where is my nearest restaurant?
3
Sep 9, 2010IBM Frontiers of Cloud Computing 20103 Privacy concerns while querying an LBS With two weeks of GPS data from a user’s car, we can infer home address (median error < 60 m) [Krumm ‘07] 5% of people are uniquely identified by their home and work locations even if it is known only at the census tract level [Golle and Partridge ‘09]
4
Sep 9, 2010IBM Frontiers of Cloud Computing 20104 Querying an LBS LBS... Work loc 1 loc n Home loc 2 Client
5
Sep 9, 2010IBM Frontiers of Cloud Computing 20105 Work'' Work' Work Basic Idea LBS... Home Client Home'' Home' loc 1, loc 1 ', loc 1 '' loc 2, loc 2 ', loc 2 '' loc n, loc n ', loc n ''
6
Sep 9, 2010IBM Frontiers of Cloud Computing 20106 What the LBS sees Which of these is the real user?
7
Sep 9, 2010IBM Frontiers of Cloud Computing 20107Outline Introduction SybilQuery Overview Design Challenges Implementation Evaluation and Results Conclusions and Future Work
8
Sep 9, 2010IBM Frontiers of Cloud Computing 20108 SybilQuery Overview Basic Idea: Achieves privacy using synthetic (Sybil) queries For each real user trip, the system generates –k-1 Sybil start and end points (termed endpoints) –k-1 Sybil paths For each real query made, the system generates –k-1 Sybil Queries
9
Sep 9, 2010IBM Frontiers of Cloud Computing 20109 SybilQuery Design
10
Sep 9, 2010IBM Frontiers of Cloud Computing 201010Outline Introduction SybilQuery Overview Design Challenges Implementation Evaluation and Results Conclusions and Future Work
11
Sep 9, 2010IBM Frontiers of Cloud Computing 201011 SybilQuery Challenges Endpoint generation: –How to automatically generate synthetic endpoints similar to a pair of real endpoints? Path generation: –How to choose the waypoints of the Sybil path? Query generation: –How to simulate motion along the Sybil path?
12
Sep 9, 2010IBM Frontiers of Cloud Computing 201012 Endpoint Generator Produces synthetic endpoints that resemble the real source and destination High-level idea: –Tag locations with features –Identify clusters of locations that share similar features Feature used in SybilQuery: traffic statistics
13
Sep 9, 2010IBM Frontiers of Cloud Computing 201013 Tagging locations with traffic statistics Naïve approach: Annotate locations with descriptive tags –Eg. “parking lot”, “downtown office building”, “freeway” –Laborious manual task Our approach: Automatically compute features using a database of regional traffic statistics –Dataset: Month-long GPS traces from the San Francisco Cabspotter project - 530 unique cabs; 529,533 trips –Compute traffic density τ l for each location from dataset
14
Sep 9, 2010IBM Frontiers of Cloud Computing 201014 Path Generator Consults an off-the-shelf navigation service –Our implementation uses Microsoft Multimap API to obtain waypoints Users may not always follow the shortest path to destination –Detours, road closures, user intention Computes multiple paths to the destination (with varying lengths) Uses a probability distribution to choose path
15
Sep 9, 2010IBM Frontiers of Cloud Computing 201015 Query Generator Triggered each time the user queries the LBS Simulates the motion of users along the Sybil paths Uses current traffic conditions to more accurately simulate user movement –Eg. Simulate slower movement if traffic is congested
16
Sep 9, 2010IBM Frontiers of Cloud Computing 201016 Endpoint caching Attack 1: If a real path P frequented by the user (e.g., commuter paths) is associated with multiple Sybil paths: –P can be statistically identifed as the real path Attack 2: After arriving at the first destination, when a user travels to a new location shortly : –Since the real paths share an endpoint, they could be identified Solution: Endpoint caching 1.For most common trips, Sybil endpoints are cached 2.If the user makes multiple trips from one common endpoint (e.g., home/office), the corresponding Sybil endpoints are cached 3.When the user embarks on a multi-destination trip, the endpoint of a trip is the same as the startpoint of the following trip
17
Sep 9, 2010IBM Frontiers of Cloud Computing 201017 Providing path continuity Attack: If a real trip ends before some Sybil trips end –The system stops sending queries –The LBS can differentiate the real path from Sybil paths SybilQuery guards against this by being an “always on” tool –continues to simulate movement along Sybil paths even when the user’s real trip is complete
18
Sep 9, 2010IBM Frontiers of Cloud Computing 201018Outline Introduction SybilQuery Overview Design Challenges Implementation Evaluation and Results Conclusions and Future Work
19
Sep 9, 2010IBM Frontiers of Cloud Computing 201019 SybilQuery Implementation An interface akin to navigation systems Input: –The source and destination address for the trip –A security parameter k Number of Sybil users Query interface: –Integrated with Yahoo! Local Search
20
Sep 9, 2010IBM Frontiers of Cloud Computing 201020Outline Introduction SybilQuery Overview Design Challenges Implementation Evaluation and Results Conclusions and Future Work
21
Sep 9, 2010IBM Frontiers of Cloud Computing 201021 Evaluation Goals 1.Privacy How indistinguishable are Sybil queries from real queries? 2.Performance Can Sybil queries be efficiently generated?
22
Sep 9, 2010IBM Frontiers of Cloud Computing 201022 Evaluation: Privacy User Study –Give the working system to adversarial users, who would try to break the system by find real user paths hidden between Sybil paths –15 volunteers Methodology –Pick real paths from the Cabspotter traces –Use SybilQuery to generate Sybil paths with different values of k
23
Sep 9, 2010IBM Frontiers of Cloud Computing 201023 Results from user study k# Questions# CorrectProbability 475200.26 675140.19
24
Sep 9, 2010IBM Frontiers of Cloud Computing 201024 User approaches to distinguish queries Contrasting rationale to guess real users –“Circuitous paths” –“Prominent start/end location” –“Odd man out”
25
Sep 9, 2010IBM Frontiers of Cloud Computing 201025 Evaluation: Performance Setup: –Server: 2.33 GHz Core2 Duo, 3 GB RAM, 250 GB SATA (7200 RPM) –Client: 1.73 GHz Pentium-M laptop, 512 MB RAM, Linux 2.6 –Privacy parameter k = 4 (unless otherwise specified) Micro-benchmarks –One-time and once-per-trip costs –Query-response latency of SybilQuery Comparison with Spatial Cloaking for Yahoo! local search
26
Sep 9, 2010IBM Frontiers of Cloud Computing 201026 One-time and once-per-trip costs One-time cost – preprocessing of traffic database –2 hours 16 mins (processed 529,533 trips) Once-per-trip costs – endpoint generation and path generation TaskAverage Time (sec)St dev (sec) Endpoint generation5.471.02 Path generation *0.360.15 * Includes network latency to query the Microsoft MultiMap API
27
Sep 9, 2010IBM Frontiers of Cloud Computing 201027 Query-response latency of SybilQuery Scales linearly with k (number of Sybil users) Sub-second latency for typical values of k
28
Sep 9, 2010IBM Frontiers of Cloud Computing 201028 Conclusions and Future Work SybilQuery: Efficient decentralized technique to hide user location from LBSes Experimental results demonstrate: –Sybil queries can be generated efficiently –Sybil queries resemble real user queries Future Work –Enhance SybilQuery to achieve stronger privacy guarantees, such as l-diversity, t-closeness and differential privacy
29
Sep 9, 2010IBM Frontiers of Cloud Computing 201029 My research on location in mobile computing Privacy: Users may not want to reveal their private locations for accessing location-based services. SybilQuery – Ubicomp 2009. Querying mobile phones for real-time location-based state. SocialTelescope – Internship at IBM, Summer 2010. Incentives for sharing in social networks – WINE 2009. Rapid change of client location affects network connectivity and performance. Context-Aware Rate Selection (CARS) – a solution for improving network performance by using client location – ICNP 2008.
30
Thank You! Pravin Shankar spravin@cs.rutgers.edu
31
Sep 9, 2010IBM Frontiers of Cloud Computing 201031 Related Work Synthetic Locations for Privacy [Krumm ’09, Kido ‘05] Spacial Cloaking [Gruteser and Grunwald ’03, and others] Peer-to-peer Schemes [Chow ’06, Ghinita ‘07] Private Information Retrieval (PIR) [Ghinita ’08] Detailed list is available in paper
32
Sep 9, 2010IBM Frontiers of Cloud Computing 201032 Spatial Cloaking Spatial Cloaking – k-anonymity solution that uses anonymizers Users send their location to anonymizer Anonymizer computes cloaked region –Region where atleast k users are present client server anonymizer
33
Sep 9, 2010IBM Frontiers of Cloud Computing 201033 Performance Comparison with Spatial Cloaking Cloaked regions grow as users travel Cloaked regions grow as users travel SybilQuery overhead constant SybilQuery overhead constant Response Size as users travel
34
Sep 9, 2010IBM Frontiers of Cloud Computing 201034 Prior techniques (1/2) Spatial Cloaking –Need for Anonymizer - Trusted Third Party –Single point of failure –Scalability and performance bottleneck client server anonymizer
35
Sep 9, 2010IBM Frontiers of Cloud Computing 201035 Prior techniques (2/2) Peer-to-peer schemes –Rely on participating peers Private Information Retrieval (PIR) –Computationally inefficient
36
Sep 9, 2010IBM Frontiers of Cloud Computing 201036 Tagging locations with traffic statistics (2/2) Locations represented as QuadTree –Balances precision with scalability San Francisco Airport. Black blocks have higher densities
37
Sep 9, 2010IBM Frontiers of Cloud Computing 201037 Finding suitable endpoints using reverse geocoding Real endpoints do not start in non-driveable terrain Random point in geographic location Reverse Geocoding Street address closest to the random point
38
Sep 9, 2010IBM Frontiers of Cloud Computing 201038 Our goals Performance Autonomy Ease of deployment
39
Sep 9, 2010IBM Frontiers of Cloud Computing 201039 Basic design of SybilQuery
40
Sep 9, 2010IBM Frontiers of Cloud Computing 201040 Design enhancements Endpoint Generator –Endpoint caching Path Generator –Randomizing path selection Query Generator –Providing path continuity –Adding GPS sensor noise –Handling active adversaries
41
Sep 9, 2010IBM Frontiers of Cloud Computing 201041 Endpoint caching (1/2) Attack 1: If a real path P frequented by the user (e.g., commuter paths) is associated with multiple sets of Sybil paths: –P can be statistically identifed as the real path Attack 2: After arriving at the first destination, when a user travels to a new location shortly : –Since the real paths share an endpoint, they could be distinguished from the Sybil paths
42
Sep 9, 2010IBM Frontiers of Cloud Computing 201042 Endpoint caching (2/2) Solution: SybilQuery employs three types of caching 1.For most common trips, Sybil endpoints are cached 2.If the user makes multiple trips from one common endpoint (e.g., home/office), the corresponding Sybil endpoints are cached 3.When the user embarks on a multi-destination trip, the start points of the Sybil trips are cached i.e. the endpoint of a trip is the same as the startpoint of the following trip
43
Sep 9, 2010IBM Frontiers of Cloud Computing 201043 Randomizing path selection Real users may not always follow the shortest path to destination –Detours, road closures, user intention Path generator computes multiple paths to the destination (each with varying lengths) Uses a probability distribution (of the frequency with which users choose paths other than the shortest path) to choose an appropriate path
44
Sep 9, 2010IBM Frontiers of Cloud Computing 201044 Handling active adversaries An actively adversarial LBS may return doctored query responses to differentiate Sybil paths from a client’s real path –For example, it falsely reports traffic congestion at the query location. SybilQuery handles active adversaries using N-variant queries to multiple LBSes –Unless all the LBSes collude, the adversarial LBS can be detected
45
Sep 9, 2010IBM Frontiers of Cloud Computing 201045Implementation SybilQuery implemented as a Python client Endpoint generator: –Uses a PostgreSQL database with PostGIS spacial extensions to process regional traffic information Path generator: –Queries the Microsoft Multimap API for waypoints Query generator: –Interfaced with Yahoo! Local API to simulate movement under the constraints of current traffic
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.