1 Towards an end-to-end architecture for handling sensitive data. Hector Garcia-Molina, Rajeev Motwani, and students
2 DB Perspective Performance Preservation Distribution (P2P) Bad Guys: eavesdrop corrupt Trust
3 DB Perspective Preservation privacy + - preservation + - easy goal
4 Privacy Spectrum Prevention Detection Containment
5 Prevention: Our Work Privacy-Preserving OLAP Distributed Architecture for Secure DBMS (P) Data Preservation in P2P Systems P2P Trust and Reputation Management (P) P2P Privacy Preserving Indexing (P)
6 Distributed Architecture for Secure DBMS Motivation: Outsourcing –Secure Database Provider (SDP) Encrypt Client Service Provider
7 Performance Problem Encrypt Client Client-side Processor Query Q Q’ “Relevant Data” Answer Problem: Q’ “SELECT *” Service Provider
8 The Power of Two Client DSP1 DSP2
9 Basic Idea { CC#, expDate, name } { expDate, name } { CC# }
10 Another Example { salary } { rand } { salary + rand }
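The two splitting ideas on the preceding slides can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the sample record values, and the `MASK_RANGE` modulus are all assumptions made here (the slide writes only `salary + rand`; modular arithmetic is used so the masked share does not grow with the salary).

```python
import secrets

# Assumption: shares live in a fixed modular range; the slides do not specify one.
MASK_RANGE = 2**32


def split_vertically(record, dsp1_attrs):
    """Send the attributes in dsp1_attrs to DSP1 and all others to DSP2."""
    part1 = {k: v for k, v in record.items() if k in dsp1_attrs}
    part2 = {k: v for k, v in record.items() if k not in dsp1_attrs}
    return part1, part2


def split_additively(salary):
    """Return (rand, salary + rand); each share alone looks random."""
    rand = secrets.randbelow(MASK_RANGE)
    return rand, (salary + rand) % MASK_RANGE


def recombine(rand, masked):
    """Client-side step: recover the salary from the two shares."""
    return (masked - rand) % MASK_RANGE


# Hypothetical sample record, for illustration only.
record = {"CC#": "4111", "expDate": "01/30", "name": "alice"}
dsp1_part, dsp2_part = split_vertically(record, {"expDate", "name"})
rand_share, masked_share = split_additively(50000)
```

In both cases neither provider alone holds the sensitive combination; only the client, which talks to both, can put the pieces back together.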
11 The Power of Two DSP1 DSP2 Client-side Processor Query Q Q1 Q2 Key: Ensure Cost (Q1)+Cost (Q2) Cost (Q)
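As one hedged illustration of splitting a query Q into Q1 and Q2, consider a total-salary query over the `{ rand }` / `{ salary + rand }` layout. The helper names and the bound on `rand` are assumptions, and plain integer addition is used for simplicity; each provider computes a cheap local aggregate and the client-side processor combines the two numbers.

```python
import secrets


def outsource(salaries):
    """Store rand at DSP1 and salary + rand at DSP2 (hypothetical layout)."""
    rands = [secrets.randbelow(10**9) for _ in salaries]
    dsp1 = rands
    dsp2 = [s + r for s, r in zip(salaries, rands)]
    return dsp1, dsp2


def q1(dsp1):
    """Q1, run at DSP1: sum of the random masks."""
    return sum(dsp1)


def q2(dsp2):
    """Q2, run at DSP2: sum of the masked salaries."""
    return sum(dsp2)


def total_salary(dsp1, dsp2):
    """Client-side processor: combine the two partial answers."""
    return q2(dsp2) - q1(dsp1)


salaries = [50000, 62000, 48000]  # made-up values for illustration
dsp1, dsp2 = outsource(salaries)
```

Here the client receives two numbers instead of the whole table, which is the kind of saving the slide's cost condition is after.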
12 Challenges: Find a decomposition that –obeys all privacy constraints –minimizes execution cost for a given workload. For a given query, find a good plan.
13 Example: R(id, a, b, c), privacy constraint: { a, b, c }. Candidate decompositions: R1(id, a) R2(id, b, c); R1(id, a, b) R2(id, c); R1(id, a, b) R2(id, b, c); R1(id, a, c) R2(id, b, c); … Most popular queries: select on a, b; select on b, c. Resulting choice: R1(id, a, b) R2(id, b, c)
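The search in this example can be sketched as a brute-force enumeration. The cost model below is an assumption made purely for illustration (a query costs 1 if a single fragment covers all its attributes and 2 if both fragments must be consulted); the slides do not state the actual cost function or search algorithm.

```python
from itertools import product

ATTRS = ("a", "b", "c")
CONSTRAINTS = [{"a", "b", "c"}]          # no provider may see a, b, c together
WORKLOAD = [{"a", "b"}, {"b", "c"}]      # the two most popular selections


def is_private(fragment, constraints):
    """A fragment is allowed if it fully contains no constraint set."""
    return not any(c <= fragment for c in constraints)


def cost(r1, r2, workload):
    """Assumed cost: 1 if one fragment answers the query alone, else 2."""
    return sum(1 if (q <= r1 or q <= r2) else 2 for q in workload)


def best_decomposition(attrs, constraints, workload):
    """Try every placement of each attribute in R1, R2, or both."""
    best = None
    for placement in product(("R1", "R2", "both"), repeat=len(attrs)):
        r1 = {x for x, p in zip(attrs, placement) if p in ("R1", "both")}
        r2 = {x for x, p in zip(attrs, placement) if p in ("R2", "both")}
        if not (is_private(r1, constraints) and is_private(r2, constraints)):
            continue
        c = cost(r1, r2, workload)
        if best is None or c < best[0]:
            best = (c, r1, r2)
    return best


best_cost, r1, r2 = best_decomposition(ATTRS, CONSTRAINTS, WORKLOAD)
```

Under this assumed cost model the only decomposition (up to swapping the providers) that answers both popular queries from a single fragment is R1(id, a, b), R2(id, b, c), which matches the slide. The enumeration is exponential in the number of attributes, so it is only a sketch of the problem, not a practical optimizer.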
14 Detection: Our Work Simulatable Auditing (P) k-Anonymity –algorithms and hardness
15 Containment: Our Work Paranoid Platform for Privacy Preferences (P) Entity Resolution
16 Containment Trusting –privacy policies Paranoid
17 Example: Trusting alice dealsRus (1) browse policy (2) give info (3) cross fingers Example P3P Policies: –Current purpose: completion and support of the recurring subscription activity –Recipients: DealsRUs and/or entities acting as their agents or entities for whom DealsRUs are acting as an agent...
18 Example: alice dealsRus (1) temp alice’s agent (2) (3) (4) To:
19 P4P: Paranoid Platform for Privacy Preferences Framework Data/Control Types: t 1... t n API Strategy/ Reference Implementation
20 Private Information ownership function control individual organization complete privacy limited time use no predicate input no integration accountable sharable identifier service handle input to predicate copy
21 Entity Resolution N: a A: b CC#: c Ph: e e1 N: a Exp: d Ph: e e2 Applications: –mailing lists, customer files, counter-terrorism,...
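The two records on this slide can be resolved with a sketch like the one below. The match rule (agreement on name `N` and phone `Ph`) and the merge policy (union of attributes, keeping the first value seen) are assumptions chosen for illustration; the slides do not specify the match or merge functions.

```python
def matches(r1, r2, keys=("N", "Ph")):
    """Assumed match rule: both records carry the same value for every key."""
    return all(k in r1 and k in r2 and r1[k] == r2[k] for k in keys)


def merge(r1, r2):
    """Assumed merge rule: union of attributes, first value wins on conflict."""
    merged = dict(r1)
    for key, value in r2.items():
        merged.setdefault(key, value)
    return merged


# The records e1 and e2 from the slide, with its placeholder values.
e1 = {"N": "a", "A": "b", "CC#": "c", "Ph": "e"}
e2 = {"N": "a", "Exp": "d", "Ph": "e"}

resolved = merge(e1, e2) if matches(e1, e2) else None
```

The merged record links the credit card number from e1 with the expiration date from e2, which neither record exposed on its own.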
22 Privacy Nm: Alice Ad: 32 Fox Ph: Nm: Alice Ad: 32 Fox Ph: Nm: Alice Ad: 32 Fox 1.0 Nm: Alice Ad: 32 Fox Ph: Nm: Alice Ad: 32 Fox Ph: Ad: 14 Cat 1.0 Bob Alice
23 Leakage Nm: Alice Ad: 32 Fox Ph: Nm: Alice Ad: 32 Fox Ph: Bob Alice L = 0.6 (between 0 and 1)
24 Multi-Record Leakage Nm: Alice Ad: 32 Fox Ph: Bob Alice LL = 0.9 (between 0 and 1, e.g., max L) r1, L = 0.9 r2, L = 0.8 r3, L = 0.7
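The multi-record measure on this slide, taking LL as the maximum per-record leakage, can be written directly. The function name and the choice to return 0.0 when no records are held are assumptions; the slide offers max only as one example of how to aggregate.

```python
def multi_record_leakage(record_leakages):
    """LL as the maximum per-record leakage L (one option named on the slide)."""
    if not record_leakages:
        return 0.0  # assumption: holding no records means nothing has leaked
    return max(record_leakages)


# Per-record leakages for r1, r2, r3 as given on the slide.
ll = multi_record_leakage([0.9, 0.8, 0.7])
```

With max as the aggregate, LL is driven entirely by the single most revealing record, so adding a less revealing record leaves it unchanged.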
25 Q1: Added Vulnerability? Bob Alice ΔLL = ?? r1 r2 r3 r4 p r4 may cause Bob’s records to snap together!
26 Q2: Disinformation? Bob Alice ΔLL = ?? r1 r2 r3 r4 (lies) p What is the most cost-effective disinformation?
27 Q3: Verification? Bob Alice p What is best fact to verify to increase confidence in hypothesis? r1, 0.9 r2, 0.8 r3, hypothesis h (0.6)
28 Privacy Spectrum Prevention Detection Containment