Privacy Preserving Query Processing in Cloud Computing
Wen Jie

Outline
Background
Privacy Preserving Query Processing
◦ Method Based on Privacy Homomorphism: Processing Private Queries over Untrusted Data Cloud through Privacy Homomorphism (ICDE 2011)
◦ Method Based on Secret Share: Privacy Preserving Query Processing on Secret Share Based Data Storage (DASFAA 2011)
Comparison
Conclusion

Background
Development of cloud computing applications
◦ Amazon: EC2, S3
◦ Google: App Engine
Development of DaaS in cloud computing
Expensive hardware, software and expertise
Background Secret Share Method Encryption Method Comparison Conclusion

Background
Security
◦ Query privacy: the query may be disclosed to the cloud or to the data owner (DO)
◦ Data privacy: the data may be disclosed to the cloud or to the user

Background
Generalization principle
◦ Relational data: quasi-identifier
◦ Spatial data: location cloaking
Encrypt or transform
◦ Hashing
◦ Space-filling curves
Distributed environment
◦ Based on Secure Multiparty Computation

Processing Private Queries over Untrusted Data Cloud through Privacy Homomorphism (ICDE 2011)

Preliminary
Privacy Homomorphism
◦ Encryption transformations that map a set of operations on cleartext to another set of operations on ciphertext
◦ Modified ASM-PH encryption scheme
   E(e_1) + E(e_2) = E(e_1 + e_2)
   E(e_1) - E(e_2) = E(e_1 - e_2)
   E(e_1) * E(e_2) = E(e_1 * e_2)

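The three identities can be tried out on a toy scheme. The sketch below is not the modified ASM-PH scheme from the slides, whose details are not given here. It is a deliberately insecure stand-in that stores a value as its residues modulo two secret primes, so that addition, subtraction and multiplication on ciphertexts carry over to plaintexts. The primes, the function names and the signed decoding are all assumptions made for illustration.

```python
# Toy privacy homomorphism (illustration only, NOT ASM-PH and NOT secure).
# A plaintext x is stored as (x mod P, x mod Q); P and Q are the secret key.
P, Q = 1009, 1013        # hypothetical secret primes
N = P * Q                # public modulus used for ciphertext arithmetic


def encrypt(x):
    return (x % P, x % Q)


def decrypt(c):
    # Chinese Remainder Theorem: rebuild x mod N from its two residues.
    a, b = c[0] % P, c[1] % Q
    q_inv = pow(Q, -1, P)                      # modular inverse, Python 3.8+
    x = (b + Q * (((a - b) * q_inv) % P)) % N
    return x if x <= N // 2 else x - N         # map back to a signed value


# Ciphertext operations only need the public modulus N.
def add(c1, c2):
    return ((c1[0] + c2[0]) % N, (c1[1] + c2[1]) % N)


def sub(c1, c2):
    return ((c1[0] - c2[0]) % N, (c1[1] - c2[1]) % N)


def mul(c1, c2):
    return ((c1[0] * c2[0]) % N, (c1[1] * c2[1]) % N)


def enc_sq_dist(cp, cq):
    # Squared Euclidean distance computed entirely on ciphertexts.
    total = encrypt(0)
    for ci, cj in zip(cp, cq):
        d = sub(ci, cj)
        total = add(total, mul(d, d))
    return total


if __name__ == "__main__":
    e1, e2 = encrypt(37), encrypt(12)
    print(decrypt(add(e1, e2)), decrypt(sub(e1, e2)), decrypt(mul(e1, e2)))
```

Results are only meaningful while the true plaintext stays within roughly ±N/2; a real scheme would also randomize ciphertexts.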
Architecture
Key idea: let the client lead the distance access and keep track of the traversal path
Dist(E(e_1), E(e_2)) = E(dist(e_1, e_2))
◦ Step 0: initialization
◦ Step 1: local distance computation. E(q) is carried in the query; Dist(E(q), E(e_1)) = E(dist(q, e_1)) is computed and then scrambled
◦ Step 2: distance decryption and recoding. The scrambled E(dist(q, e_1)) is decrypted to a distance, which is then recoded
◦ Step 3: find the next node to traverse, using the recoded distance

Local Distance Computation of Minimum Square Distance
Distance between query point q and an index entry [l, u]

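The formula on this slide did not survive extraction. As a reference point, the usual minimum squared distance between a point and an axis-aligned box [l, u] can be sketched in plaintext as below. This is the textbook definition, assumed here for illustration; the paper may use a scaled variant, and in the protocol the arithmetic would run on ciphertexts rather than on plain numbers.

```python
def min_sq_dist(q, l, u):
    """Minimum squared distance from point q to the box with corners l and u.

    Assumption: standard definition, with l[i] <= u[i] in every dimension.
    """
    total = 0
    for qi, li, ui in zip(q, l, u):
        if qi < li:          # q lies below the box in this dimension
            d = li - qi
        elif qi > ui:        # q lies above the box in this dimension
            d = qi - ui
        else:                # q is inside the box's extent: no contribution
            d = 0
        total += d * d
    return total


if __name__ == "__main__":
    print(min_sq_dist((0, 0), (3, 4), (5, 6)))
```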
Scrambling
Note: scrambling must
◦ Hide the real distances
◦ Stay monotonic, so that distances can still be compared
Two scrambling functions
◦ Sign computation: E(s) * E(ξ) = E(s * ξ); receive sign(s * ξ), which depends on sign(s)
◦ Recoding: E(s_1) * E(ξ) + E(s_2) = E(s_1 * ξ + s_2); receive recoded(s_1 * ξ + s_2), which depends on sign(s_1)

Distance Decryption and Recoding
Decryption with E^-1(·)
Recoding properties
◦ Strictly monotonic. Key idea: record all existing (real value, recoded value) pairs at the cloud side
◦ Immune to chosen ciphertext attack. Key idea: recoded values are random

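One way to combine the two properties is to keep the recorded pairs sorted and draw each new recoded value at random from the gap between its neighbours. The sketch below assumes that design; the class name, the use of exact fractions and the random ranges are hypothetical choices, not details from the paper.

```python
import bisect
import random
from fractions import Fraction


class MonotonicRecoder:
    """Random but strictly order-preserving recoding (illustrative sketch)."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._reals = []   # real values seen so far, kept sorted
        self._codes = []   # recoded values, aligned with self._reals

    def recode(self, value):
        i = bisect.bisect_left(self._reals, value)
        if i < len(self._reals) and self._reals[i] == value:
            return self._codes[i]              # same input, same recoded value
        lo = self._codes[i - 1] if i > 0 else None
        hi = self._codes[i] if i < len(self._codes) else None
        if lo is None and hi is None:
            code = Fraction(self._rng.randint(-10**6, 10**6))
        elif lo is None:
            code = hi - self._rng.randint(1, 10**6)
        elif hi is None:
            code = lo + self._rng.randint(1, 10**6)
        else:
            # Strictly inside the open interval (lo, hi).
            code = lo + (hi - lo) * Fraction(self._rng.randint(1, 999), 1000)
        self._reals.insert(i, value)
        self._codes.insert(i, code)
        return code


if __name__ == "__main__":
    r = MonotonicRecoder()
    print([float(r.recode(v)) for v in (50, 10, 30)])
```

Exact fractions avoid the precision loss that repeated interval splitting would cause with floats.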
Processing Distance Range Queries
Query: find all records whose distances are within r from point q
◦ Threshold: s_1 * 4r^2 + s_2 is recoded, giving the recoded 4r^2
◦ Candidate: E(s_1) * Dist(E(e_1), E(q)) + E(s_2) is decrypted and recoded, giving the recoded dist(e_1, q)
◦ Result: compare the recoded 4r^2 with the recoded dist(e_1, q)

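The range test works because applying the same increasing transformation to the threshold and to every distance leaves the outcome of each comparison unchanged. A plaintext sketch, assuming s_1 is drawn positive and leaving out the encryption and recoding layers; the function names and random ranges are hypothetical, and the threshold is treated as an opaque number (the slides use 4r^2).

```python
import random


def scramble(value, s1, s2):
    # Affine scrambling; order-preserving whenever s1 > 0.
    return s1 * value + s2


def range_filter(sq_dists, threshold, rng):
    """Return indexes whose distance passes the test, comparing only
    scrambled values (plaintext stand-in for the encrypted protocol)."""
    s1 = rng.randint(1, 10**6)          # assumed positive
    s2 = rng.randint(-10**6, 10**6)
    t = scramble(threshold, s1, s2)
    return [i for i, d in enumerate(sq_dists) if scramble(d, s1, s2) <= t]


if __name__ == "__main__":
    print(range_filter([4, 25, 9, 100], 16, random.Random()))
```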
Performance Analysis
◦ Distance range query performance (figure; axis label: distance threshold)

Privacy Preserving Query Processing on Secret Share Based Data Storage (DASFAA 2011)

Preliminary
Secret share scheme
◦ Protects sensitive information by dividing the value into n shares
The scheme is called a (k, n) threshold scheme if it satisfies:
◦ k or more shares reconstruct the value
◦ k-1 or fewer shares leave the value completely undetermined

Architecture
Three parties
◦ Data Owner (DO)
◦ Database Service Provider (DSP)
◦ Data Requestor (DR)
How it works
◦ Delegate data (DO)
◦ Build an index (DO): the privacy preserving index
◦ Process a query (DR)

Secret Share Scheme
Shares come from a polynomial of degree k-1 whose constant term v is the real value: y = f(x) = a_(k-1) x^(k-1) + … + a_1 x + v
◦ A share is the result value y
◦ Given known x_1, x_2, …, x_n, the n shares are y_1, y_2, …, y_n
◦ Any k pairs (x_1, y_1), (x_2, y_2), …, (x_k, y_k) can reconstruct the polynomial

Data Division
Data division at DO with a (3, 5) threshold scheme
◦ Randomly choose a polynomial over the finite field F_103
◦ Choose the minimum generator 5, giving X = {5, 25, 22, 7, 35} (the powers 5^1, …, 5^5 mod 103)
◦ Share(20, 1) = 82; Share(20, 2) = 79; Share(20, 3) = 14; Share(20, 4) = 87; Share(20, 5) = 102

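The share values above can be reproduced in a few lines. The polynomial itself is not in the extracted text; solving the listed shares for a degree-2 polynomial with constant term 20 gives f(x) = 3x^2 + 18x + 20 (mod 103), and that recovered polynomial is what the sketch uses. The function names are hypothetical.

```python
P = 103   # field size from the slide
G = 5     # generator from the slide


def x_points(n):
    # Evaluation points are successive powers of the generator mod P.
    return [pow(G, i, P) for i in range(1, n + 1)]


def make_shares(secret, coeffs, n):
    """Evaluate f(x) = secret + coeffs[0]*x + coeffs[1]*x^2 + ... (mod P)
    at the n evaluation points and return the (x, y) pairs."""
    def f(x):
        acc = secret
        for power, a in enumerate(coeffs, start=1):
            acc += a * pow(x, power, P)
        return acc % P
    return [(x, f(x)) for x in x_points(n)]


if __name__ == "__main__":
    # Coefficients 18 and 3 are recovered from the listed shares, not stated
    # on the slide; in practice they are chosen at random.
    print(make_shares(20, [18, 3], 5))
```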
Data Division
(Figure: an employee table with columns empno, name and salary and rows for Mary, John, Kate, Mike and Henry is divided into five share tables, stored at DSP 1, DSP 2, DSP 3, DSP 4 and DSP 5.)

Data Reconstruction
Private data reconstruction at DR
◦ DR needs at least k shares of the value
◦ Lagrange interpolation reconstructs the polynomial

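Reconstruction only needs the polynomial's value at x = 0, so the interpolation can be evaluated there directly. A minimal sketch over F_103, using the share pairs from the (3, 5) example in these slides (evaluation points 5, 25, 22, 7, 35); the function name is hypothetical.

```python
P = 103   # field size used in the example


def reconstruct(shares):
    """Lagrange interpolation at x = 0 over F_P.

    shares: list of (x, y) pairs; at least k of them are required."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret


if __name__ == "__main__":
    print(reconstruct([(5, 82), (25, 79), (22, 14)]))   # any 3 of the 5 shares
```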
Storage Model
All relations of the form R(A_1, A_2, …, A_m) are stored at the n DSPs in a relation that holds source attributes and key attributes

Key Generation Function
Key value = bucket_id || encrypted_sal
◦ bucket_id makes sure that values are in order
◦ Use the symmetric algorithm DES with a random key to encrypt the salary value

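The shape of such a key can be sketched with the standard library alone. Two substitutions are made and should be read as assumptions: a truncated HMAC-SHA256 tag stands in for the DES ciphertext (it supports equality lookups but, unlike DES, cannot be decrypted), and buckets are fixed-width ranges, which is a hypothetical bucketing rule rather than the paper's.

```python
import hashlib
import hmac

BUCKET_WIDTH = 10   # hypothetical bucket size


def bucket_id(value):
    # Coarse, order-preserving prefix: larger values never get a smaller id.
    return value // BUCKET_WIDTH


def key_value(value, secret_key):
    """Build bucket_id || token, where token is a keyed, deterministic tag
    (stand-in for the DES-encrypted salary described on the slide)."""
    tag = hmac.new(secret_key, str(value).encode(), hashlib.sha256).hexdigest()
    return f"{bucket_id(value):03d}{tag[:8]}"


if __name__ == "__main__":
    key = b"demo-key"   # hypothetical secret held by the data owner
    print(key_value(35, key), key_value(47, key))
```

Keys from different buckets sort in value order, while keys inside one bucket reveal only equality.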
Index Creation Function
◦ B+ index (figure)

Query Processing
Employee name and salary are both divided into n shares
SELECT name FROM Employees WHERE salary = 35
◦ Encrypt 35 using the DES scheme into h8jbka8g
◦ Search the metadata for key_sal: 128h8jbka8g
◦ Search the index on attribute key_sal
◦ Issue k sub-queries
◦ Reconstruct name from k shares

Experimental Evaluation
Security analysis
◦ DSPs collude with each other
◦ DR colludes with at least k DSPs
Efficiency evaluation
◦ Time comparison between hash based searching and index based searching
◦ Time comparison between encryption and polynomial computation
◦ Data extension and tuple size

Comparison
Aspect | Encryption Method | Secret Share Method
Data location | Data owner | Cloud
Index location | Client (shadow index) | Cloud
DO involvement | Initialization: send shadow index to client; send key to cloud | Outsourcing: data division; index creation
Client computation | Node traversal; local distance computation; distance comparison | Query transformation; results reconstruction
Cloud computation | Encryption; decryption; recoding | Query processing
Communication costs | High | Low

Conclusion
PH Encryption Method
◦ Low efficiency
◦ Data privacy preservation
◦ Query privacy preservation
Secret Share Method
◦ High efficiency
◦ Data privacy preservation
◦ Query privacy leak when DO colludes with cloud

Q&A? Thank you~