The New Casper: Query Processing for Location Services without Compromising Privacy
Mohamed F. Mokbel, University of Minnesota
Chi-Yin Chow, University of Minnesota
Walid G. Aref, Purdue University
Major Privacy Threats
YOU ARE TRACKED…!!!!
“New technologies can pinpoint your location at any time and place. They promise safety and convenience but threaten privacy and security”
Cover story, IEEE Spectrum, July 2003
VLDB 2006
WHY location-detection devices?
With all their privacy threats, why do users still use location-detection devices?
Location-based services are widespread:
  Location-based store finders
  Location-based traffic reports
  Location-based advertisements
Location-based services rely on the implicit assumption that users agree to reveal their private locations.
Location-based services trade service for privacy.
Figure: Location-based Database Server
Service-Privacy Trade-off
Example: Where is my nearest bus?
Figure: scale between Service and Privacy, labeled from 100% to 0%.
The Casper Architecture
Location Anonymizer: a trusted third party that is responsible for blurring the exact location information.
Privacy-aware Query Processor: part of the Location-based Database Server.
Message flow:
  1: Query + Location Information
  2: Query + Blurred Spatial Region
  3: Candidate Answer
  4: Candidate/Exact Answer
System Users: Privacy Profile
Each mobile user has her own privacy profile that includes:
  k: the user wants to be k-anonymous.
  Amin: the minimum required area of the blurred region.
  Multiple instances of the above parameters to indicate different privacy profiles at different times.

Time          k       Amin
8:00 AM -     1       ___
5:00 PM -     100     1 mile
10:00 PM -    1000    5 miles
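The time-dependent profile in the table can be sketched as a small lookup structure. This is only an illustration: the names `ProfileEntry` and `active_entry` are hypothetical, a missing Amin is modeled as `None`, and the assumption that the last entry of the day stays active past midnight until the first start time is mine, not something the slide states.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Optional


@dataclass(frozen=True)
class ProfileEntry:
    start: time             # time of day at which this entry becomes active
    k: int                  # k-anonymity requirement
    a_min: Optional[float]  # minimum blurred area in the slide's units; None = not set


def active_entry(profile: List[ProfileEntry], now: time) -> ProfileEntry:
    """Return the entry with the latest start time that is not after `now`.

    Assumption: before the first start time of the day, the last entry
    (carried over from the previous evening) is still active.
    """
    ordered = sorted(profile, key=lambda e: e.start)
    current = ordered[-1]
    for entry in ordered:
        if entry.start <= now:
            current = entry
    return current


# The three rows of the example table.
PROFILE = [
    ProfileEntry(time(8, 0), 1, None),
    ProfileEntry(time(17, 0), 100, 1.0),
    ProfileEntry(time(22, 0), 1000, 5.0),
]
```

For example, `active_entry(PROFILE, time(18, 30))` selects the 5:00 PM row, so the user asks for k = 100 at that time.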
Location Anonymizer: Grid-based Pyramid Structure
The entire system area is divided into grids.
The Location Anonymizer incrementally keeps track of the number of users residing in each grid cell.
To cloak a location, traverse the pyramid structure from the bottom level to the top level until a cell satisfying the user's privacy profile is found.
Disadvantages:
  High location update cost.
  High search cost.
Figure: Grid-based Pyramid Structure
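A minimal sketch of the bottom-up traversal described on this slide, under stated assumptions: the system area is the unit square, level 0 is the single top cell, each level below quadruples the number of cells, and the function names (`cell_of`, `build_counts`, `cloak`) are hypothetical. It follows only what the slide says and is not the authors' implementation.

```python
from collections import Counter


def cell_of(x, y, level):
    """(col, row) of the cell containing (x, y) at a level with 2**level cells per side."""
    n = 1 << level
    return (min(int(x * n), n - 1), min(int(y * n), n - 1))


def build_counts(users, height):
    """counts[level][cell] = number of users inside that cell, for every pyramid level."""
    counts = [Counter() for _ in range(height + 1)]
    for x, y in users:
        for level in range(height + 1):
            counts[level][cell_of(x, y, level)] += 1
    return counts


def cloak(x, y, k, a_min, counts):
    """Walk from the bottom level up; return the first (level, cell) that holds
    at least k users and covers at least a_min of the unit area, or None."""
    height = len(counts) - 1
    for level in range(height, -1, -1):
        cell = cell_of(x, y, level)
        area = 1.0 / (4 ** level)
        if counts[level][cell] >= k and area >= a_min:
            return level, cell
    return None
```

Because every location update touches one counter per level, the sketch also makes the update cost named under "Disadvantages" visible.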
Adaptive Location Anonymizer
Each sub-structure may have a different depth that adapts to environmental changes and user privacy requirements.
Cell Splitting: A cell cid at level i needs to be split into four cells at level i+1 if there is at least one user u in cid with a privacy profile that can be satisfied by some cell at level i+1.
Cell Merging: Four cells at level i are merged into one cell at the higher level i-1 only if all users in the level i cells have strict privacy requirements that cannot be satisfied within level i.
Figure: Adaptive Grid-based Pyramid Structure
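The split and merge conditions can be written as two predicates. This is a sketch under assumptions that go beyond the slide: a profile is "satisfied" by a cell when the cell holds at least k users and its area is at least Amin, and for splitting only the child cell that contains the user is checked. The names `User`, `quadrant`, `should_split`, and `should_merge` are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class User(NamedTuple):
    x: float
    y: float
    k: int        # k-anonymity requirement
    a_min: float  # minimum cloaked area


def quadrant(u, x0, y0, size):
    """Index 0..3 of the child quadrant of the square cell (x0, y0, size) containing u."""
    half = size / 2
    return (1 if u.x >= x0 + half else 0) + (2 if u.y >= y0 + half else 0)


def should_split(users, x0, y0, size):
    """Split if at least one user would be satisfied by the child cell that contains her."""
    child_area = (size / 2) ** 2
    per_child = Counter(quadrant(u, x0, y0, size) for u in users)
    return any(
        per_child[quadrant(u, x0, y0, size)] >= u.k and child_area >= u.a_min
        for u in users
    )


def should_merge(sibling_users, size):
    """sibling_users holds four user lists, one per sibling cell of side `size`.
    Merge only if no user is satisfied by her own cell."""
    area = size ** 2
    return all(
        not (len(cell) >= u.k and area >= u.a_min)
        for cell in sibling_users
        for u in cell
    )
```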
The Privacy-aware Query Processor
Embedded inside the location-based database server.
Processes queries based on cloaked spatial regions rather than exact location information.
Two types of data:
  Public data: gas stations, restaurants, police cars
  Private data: personal data records
Privacy-aware Query Processor: Query Types
Private queries over public data
  What is my nearest gas station?
Public queries over private data
  How many cars are in the downtown area?
Private queries over private data
  Where is my nearest friend?
Private Queries over Public Data: Naive Approaches
Complete privacy
  The Database Server returns all the target objects to the Location Anonymizer.
  High transmission cost.
  Shifts the burden of query processing onto the mobile user.
Nearest target object to the center of the spatial query region
  Simple but NOT accurate.
Figure note: The correct NN object is T13.
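A tiny made-up example shows why the second naive approach is inaccurate: the target nearest to the center of the cloaked region need not be the target nearest to the user. The coordinates and the names `T1` and `T2` are invented for illustration and do not correspond to the slide's figure.

```python
import math


def nearest(p, targets):
    """Name of the target closest to point p."""
    return min(targets, key=lambda name: math.dist(p, targets[name]))


# Cloaked region [0, 4] x [0, 4]; the server only knows this region, not `user`.
center = (2.0, 2.0)
user = (0.2, 0.3)
targets = {"T1": (2.0, 2.5), "T2": (0.5, 0.5)}

answer_from_center = nearest(center, targets)  # what the naive approach returns
true_answer = nearest(user, targets)           # what the user actually needs
```

Here the naive approach returns `T1`, while the user's real nearest target is `T2`.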
Private Queries over Public Data
Step 1: Locate four filters
  The NN target object for each vertex
Step 2: Find the middle points
  The furthest point on the edge to the two filters
Step 3: Extend the query range
Step 4: Candidate answer
Figure labels: middle points m12, m13, m24, m34
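The four steps can be sketched as follows, assuming an axis-aligned rectangular cloaked area, point targets, and Euclidean distance. The function names and the way each edge is pushed outward by its own distance are my reading of the slide, so treat this as an illustration rather than the authors' code. For an edge with filters ti and tj, the middle point is taken where the perpendicular bisector of ti and tj crosses the edge.

```python
import math


def nearest(p, targets):
    """Step 1 helper: the NN target of point p."""
    return min(targets, key=lambda t: math.dist(p, t))


def edge_extension(vi, vj, ti, tj):
    """Steps 2-3: distance by which the edge (vi, vj) is pushed outward."""
    d = max(math.dist(vi, ti), math.dist(vj, tj))
    if ti != tj:
        ex, ey = vj[0] - vi[0], vj[1] - vi[1]
        tx, ty = tj[0] - ti[0], tj[1] - ti[1]
        denom = ex * tx + ey * ty
        if denom != 0:
            # Point on the edge that is equidistant from ti and tj.
            rhs = (tj[0] ** 2 + tj[1] ** 2 - ti[0] ** 2 - ti[1] ** 2) / 2
            s = (rhs - (vi[0] * tx + vi[1] * ty)) / denom
            s = min(1.0, max(0.0, s))  # guard against rounding
            m = (vi[0] + s * ex, vi[1] + s * ey)
            d = max(d, math.dist(m, ti))
    return d


def candidate_list(area, targets):
    """Step 4: all targets inside the extended rectangle."""
    x1, y1, x2, y2 = area
    v = {"sw": (x1, y1), "se": (x2, y1), "nw": (x1, y2), "ne": (x2, y2)}
    f = {name: nearest(p, targets) for name, p in v.items()}  # four filters
    south = edge_extension(v["sw"], v["se"], f["sw"], f["se"])
    north = edge_extension(v["nw"], v["ne"], f["nw"], f["ne"])
    west = edge_extension(v["sw"], v["nw"], f["sw"], f["nw"])
    east = edge_extension(v["se"], v["ne"], f["se"], f["ne"])
    lo_x, lo_y, hi_x, hi_y = x1 - west, y1 - south, x2 + east, y2 + north
    return [t for t in targets if lo_x <= t[0] <= hi_x and lo_y <= t[1] <= hi_y]
```

With `candidate_list((0, 0, 1, 1), [(0.5, -0.2), (5.0, 5.0)])` only the first target is returned, since the far target lies outside the extended range.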
Private Queries over Public Data: Proof of Correctness
Theorem 1: Given a cloaked area A for a user u located anywhere within A, the privacy-aware query processor returns a candidate list that includes the exact nearest target to u.
Theorem 2: Given a cloaked area A for a user u and a set of filter target objects t1 to t4, the privacy-aware query processor issues the minimum possible range query to get the candidate list.
Figure panels: (a) ti = tj, (b) ti ≠ tj
Private Queries over Private Data
Step 1: Locate four filters
  The NN target object for each vertex
Step 2: Find the middle points
  The furthest point on the edge to the two filters
Step 3: Extend the query range
Step 4: Candidate answer
Figure labels: middle points m12, m13, m24, m34
Private Queries over Private Data: Proof of Correctness
Theorem 3: Given a cloaked area A for a user u located anywhere within A and a set of target objects represented by their cloaked regions, the privacy-aware query processor returns a candidate list that includes the exact nearest target to u.
Theorem 4: Given a cloaked area A for a user u and a set of filter target objects t1 to t4 represented by their cloaked areas, the privacy-aware query processor issues the minimum possible range query to get the candidate list.
Figure panels: (a) ti = tj, (b) ti ≠ tj
Experimental Settings
We use the Network-based Generator of Moving Objects to generate a set of moving objects and moving queries.
The input to the generator is the road map of Hennepin County, MN, USA.
We compare the performance of the Basic Location Anonymizer and the Adaptive Location Anonymizer.
We study the performance of Casper in processing:
  Private queries over public data
  Private queries over private data
We also study the Casper end-to-end performance.
Location Anonymizer: Number of Moving Users
Parameter settings:
  k = [10, 50]
  Amin = [0.005, 0.1]% of the system area
  Pyramid height = 9
Both Basic LA and Adaptive LA scale with the number of moving users.
Adaptive LA outperforms Basic LA in terms of cloaking CPU time and maintenance cost.
Location Anonymizer: Effect of k Privacy Requirement
Parameter settings:
  Amin = 0
  Pyramid height = 9
Both Basic LA and Adaptive LA are scalable with respect to the value of k.
Adaptive LA also outperforms Basic LA as the value of k gets larger.
Privacy-aware Query Processor: Number of Public Target Objects
Parameter settings:
  k = [10, 50]
  Amin = [0.005, 0.1]% of the system area
  # of moving users = 50K
The case of 4 filters outperforms the cases of 1 filter and 2 filters in terms of query processing CPU time and candidate answer size.
The Casper End-to-End Performance
Parameter settings:
  Amin = 0
  # of moving users = 10K
  # of target objects = 5K
  Bandwidth = 20 Mbps
Using 4 filters gives much better performance than using 1 filter.
The bottleneck shifts to the transmission time.
Figure panels: Public Data, Private Data
Summary
Casper addresses a major privacy threat to users in location-based service environments.
Casper consists of:
  Location Anonymizer
  Privacy-aware Query Processor
Experimental results show that Casper is:
  Scalable
  Accurate
  Efficient
Related Work (1/2)
Adaptive-Interval Cloaking Algorithm
  Divide the entire system area into quadrants of equal area iteratively, until the quadrant includes the user and other k-1 users.
Drawbacks:
  Not scalable to the number of users
  Does not consider the minimum required resolution of the cloaked region
  Does not support query processing
Compared with Casper: Flexibility, Efficiency, Quality, Accuracy
M. Gruteser and D. Grunwald. Anonymous Usage of Location-Based Services through Spatial and Temporal Cloaking. MobiSys, 2003.
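The quadrant subdivision summarized on this slide can be sketched as a simple descent. The details are assumptions of this sketch, not taken from the cited paper: the area is the unit square, the descent stops at the last quadrant that still holds the user plus at least k-1 others, and a hypothetical `max_depth` bounds the recursion.

```python
def interval_cloak(user, others, k, area=(0.0, 0.0, 1.0, 1.0), max_depth=16):
    """Return the smallest visited quadrant (x1, y1, x2, y2) that contains
    `user` and at least k-1 points of `others`.

    If even the whole area holds fewer than k users, the whole area is returned.
    """
    x1, y1, x2, y2 = area
    inside = [p for p in others if x1 <= p[0] < x2 and y1 <= p[1] < y2]
    for _ in range(max_depth):
        mx, my = (x1 + x2) / 2, (y1 + y2) / 2
        # Pick the quadrant that contains the user.
        qx1, qx2 = (x1, mx) if user[0] < mx else (mx, x2)
        qy1, qy2 = (y1, my) if user[1] < my else (my, y2)
        q = [p for p in inside if qx1 <= p[0] < qx2 and qy1 <= p[1] < qy2]
        if len(q) + 1 < k:  # +1 counts the user herself
            break           # the next quadrant is too sparse; keep the current one
        x1, y1, x2, y2, inside = qx1, qy1, qx2, qy2, q
    return (x1, y1, x2, y2)
```

Each request rescans the other users while descending, which gives one plausible reading of the first drawback listed on the slide.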
Related Work (2/2)
Clique-Cloak Algorithm
  Each user has her own k-anonymity requirement.
  A clique graph is constructed to search for a minimum bounding rectangle that includes the user’s message and other k-1 messages.
Drawbacks:
  Not scalable to k
  Does not consider the minimum required resolution of the cloaked region
  Does not support query processing
  An adversary can guess, with high probability, the location of users lying on the rectangle boundary.
Compared with Casper: Flexibility, Efficiency, Quality, Accuracy
B. Gedik and L. Liu. Location Privacy in Mobile Systems: A Personalized Anonymization Model. ICDCS, 2005.
Location Anonymizer: Pyramid Height
Parameter settings:
  k = [10, 50]
  Amin = [0.005, 0.1]% of the system area
  # of moving users = 50K
Cloaking CPU time and maintenance cost get higher with increasing pyramid height.
Adaptive LA performs better than Basic LA as the pyramid height increases.
Privacy-aware Query Processor: Number of Private Target Objects
Parameter settings:
  k = [10, 50]
  Amin = [0.005, 0.1]% of the system area
  # of moving users = 50K
The case of 4 filters outperforms the cases of 1 filter and 2 filters in terms of candidate answer size.
The case of 4 filters performs better than the cases of 1 filter and 2 filters in terms of query processing CPU time when the number of target objects exceeds 8K.