Private Inference Control David Woodruff MIT Joint work with Jessica Staddon (PARC)

Private Inference Control David Woodruff MIT dpwood@mit.edu Joint work with Jessica Staddon (PARC)

Contents
1. Background
   1. Access Control and Inference Control
   2. Our contribution: Private Inference Control (PIC)
   3. Related Work
2. PIC model & definitions
3. Our Results
4. Conclusions

Access Control
- The server holds a DB of n records; the user queries the database.
- Some info in the DB is sensitive.
- Access control prevents the user from learning individual sensitive relations/attributes.
- Does access control prevent the user from learning sensitive info?
- Example: "What's Bob's salary?" "Sensitive: access denied."

Inference Control
  Name              Job                Salary
  Alyssa P. Hacker  Software Engineer  $90,000
  Paul E. Nomial    Mathematician      $31,415
  …                 …                  …
- Combining non-sensitive info may yield something sensitive.
- Inference channel: {(name, job), (job, salary)}
- Inference control: block all inference channels.
- Query 1: How much does Alyssa make? Sensitive.
- Query 2: What is Alyssa's job? Software Engineer.
- Query 3: How much do software engineers make? $90,000.

Inference Control
- Database x ∈ ({0,1}^m)^n: a DB of n records with m attributes 1, …, m per record.
- n tends to infinity; m = O(1).
- Inference engine: generates a collection C of subsets of [m] denoting all the inference channels.
  - We assume we have such an engine [QSKLG93] (exhaustive search).
- F ∈ C means that, for all i, the user shouldn't learn x_{i,j} for all j ∈ F.
- Assume C is monotone.
- Assume C is input to both user and server.
  - The user learns C anyway when his queries are blocked.
  - C is data-independent and reveals info only about attributes.

Our contribution: Private Inference Control
- Existing inference control schemes require the server to learn user queries to check whether they form an inference.
- Our goal: user privacy + inference control = PIC.
- This talk: arbitrary malicious users U*, semi-honest S.
- Privacy: a polytime S learns nothing about an honest user's queries except the number made so far.
  - The number of queries made so far enables S to do inference control.
- Private and symmetrically-private information retrieval:
  - Not sufficient, since they are stateless.
  - The user's permissions change over time.
- Generic secure function evaluation:
  - Not efficient; our communication is exponentially smaller.

Application
- Government analysts inspect repositories for terrorist patterns.
  1. Inference control: prevent analysts from learning sensitive info about non-terrorists.
  2. User privacy: prevent the server from learning what analysts are tracking; if discovered, this info could go to terrorists!

Related Work
- Data perturbation [AS00, B80, TYW84]
  - So much noise is required that the data is not as useful [DN03].
- Adaptive Oblivious Transfer [NP99]
  - One record can be queried adaptively at most k times.
- Priced Oblivious Transfer [AIR01]
  - One record; supports more inference channels than the threshold version considered in [NP99].
- We generalize [NP99] and [AIR01]
  - Arbitrary inference channels and multiple records.
  - More efficient/private than parallelizing [NP99] and [AIR01] on each record.

The Model
- Offline stage: S is given x, C, and 1^k, and can preprocess x.
- Online stage: at time t, honest U generates a query (i_t, j_t).
  - (i_t, j_t) can depend on all prior info/transactions with S.
- Let T denote all queries U makes: (i_1, j_1), …, (i_|T|, j_|T|).
  - T is a random variable: it depends on U's code, x, and randomness.
  - T is permissible if there is no i such that (i, j) ∈ T for all j ∈ F, for some F ∈ C.
  - We require honest U to generate a permissible T.
- U and S interact in a multiround protocol, then U outputs out_t.
- View_U consists of C, n, m, 1^k, all messages from S, and randomness.
- View_S consists of C, n, m, 1^k, x, all messages from U, and randomness.
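The permissibility condition on T can be checked mechanically. The sketch below is only an illustration of the definition, not part of the protocol (in a PIC the server never sees T in the clear). The function name, the attribute numbering, and the sample channel are assumptions made for the example.

```python
def is_permissible(queries, channels):
    """Return True if no record i has every attribute of some channel F queried.

    queries:  iterable of (i, j) pairs, the sequence T
    channels: iterable of frozensets of attribute indices, the collection C
              (listing only the minimal channels is enough when C is monotone)
    """
    seen = {}  # record index i -> set of attributes j queried for that record
    for i, j in queries:
        seen.setdefault(i, set()).add(j)
    # T is not permissible as soon as some channel F is fully covered for one record.
    return not any(F <= attrs for attrs in seen.values() for F in channels)


# Hypothetical numbering: 1 = name, 2 = job, 3 = salary.
C = [frozenset({1, 3})]            # name together with salary is an inference channel
T_ok = [(1, 1), (1, 2), (2, 3)]    # name and job of record 1, salary of record 2
T_bad = [(1, 1), (2, 2), (1, 3)]   # name and salary of record 1

print(is_permissible(T_ok, C), is_permissible(T_bad, C))
```

The check is per record: attributes learned from different records never combine into a channel, matching the "no i such that ..." quantifier in the definition.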

Security Definitions
- Correctness: for all x, C, for all honest users U, and for all t ∈ [|T(U, x)|], out_t = x_{i_t, j_t}.
- User Privacy: for all x, C, for all honest U, for any two sequences T_1, T_2 with |T_1| = |T_2|, and for all semi-honest servers S* and random coin tosses of S*:
  (View_{S*} | T(U, x) = T_1) ≈ (View_{S*} | T(U, x) = T_2)
- Inference Control: comparison with the ideal model. For every U*, every x, any random coins of U*, and every C, there exists a simulator U interacting with a trusted party Ch for which View_{U*} ≈ View_U, where U just asks Ch for tuples (i_t, j_t) that are permissible.

Efficiency
- Efficiency measures are per query.
- Minimize communication & round complexity.
  - Ideally O(polylog(n)) bits and 1 round.
- Minimize the server's time complexity.
  - Ideally O(n) without preprocessing.
  - With preprocessing, potentially better, but O(n) is optimal w.r.t. known single-server PIR schemes.

Our Results
- For any PIR scheme, let C(n) and W(n) denote communication and server work for DB size n.
- PIC scheme #1
  - Communication: O(k log n · C(n^2)), 1 round
  - Work: O(k log n · W(n^2))
- PIC scheme #2
  - Communication: O(k(n + C(n))), O(1) rounds
  - Work: O(k(n + W(n)))
- Plugging in the best PIR parameters:
  - Scheme #1: comm. O(polylog(n)), work O(n^2)
  - Scheme #2: comm. & work O(n polylog(n))

A Generic Reduction
- A protocol is a threshold PIC (TPIC) if it satisfies the definitions of a PIC scheme assuming C = {[m]}.
- Theorem (roughly speaking): if there exists a TPIC with communication C(n), work W(n), and round complexity R(n), then there exists a PIC with communication O(C(n)), work O(W(n)), and round complexity O(R(n)).
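With C = {[m]} the only channel is the full attribute set, so the permissibility test collapses to a per-record counter. A minimal sketch of that special case follows; the function name is hypothetical, and the reduction from general C to this case is not shown here.

```python
from collections import Counter


def tpic_permissible(queries, m):
    """Threshold case C = {[m]}: each record may have at most m - 1 distinct attributes queried."""
    distinct = set(queries)                          # repeated (i, j) pairs reveal nothing new
    per_record = Counter(i for i, _ in distinct)     # distinct attributes seen per record
    return all(count < m for count in per_record.values())


print(tpic_permissible([(1, 1), (1, 2), (2, 1)], m=3))           # two attributes of record 1
print(tpic_permissible([(1, 1), (1, 2), (2, 1), (1, 3)], m=3))   # all three attributes of record 1
```

This is why a threshold scheme only needs to track how many times each record has been touched, rather than which attributes were read.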

PIC ideas
- User and server do SPIR on a table of encryptions.
- Idea: the table holds encryptions of both data and keys that will help the user decrypt encryptions on future queries.
- The user can only decrypt if he has the appropriate keys, which is only possible if he is not in danger of making an inference.

Stateless PIC
- Minimizing communication is a data structures problem.
- What type of keys requires the least communication for the user to:
  1. Update as the user makes new queries?
  2. Prove the user is not in danger of making an inference on current/future queries?
- Keys must prevent replay attacks: the user can't use old keys to pretend to have made fewer queries to records than he actually has.

PIC Scheme #1 – Stage 1
- Let E be a homomorphic, semantically secure encryption scheme (e.g., Paillier).
- Suppose we allow accessing each record at most once.
- The user holds PK, SK; the server holds PK.
- For query (i_3, j_3), the user sends E(i_3), E(j_3), and a ZKPOK.
- From the earlier ciphertexts the server computes E(i_1) -> E(r_1(i_1 – i_3)) and E(i_2) -> E(r_2(i_2 – i_3)).
- The user recovers r_1, r_2 iff he hasn't previously accessed i_3.
- From r_1 and r_2 the user can reconstruct a secret S_3.
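The step E(i_1) -> E(r_1(i_1 – i_3)) can be tried out with a toy Paillier implementation. This is a sketch under loud assumptions: the primes are two small Mersenne primes chosen for readability (far too small for any security), the zero-knowledge proof is omitted, and all function names are made up for the example.

```python
import math
import secrets

# Toy key (assumption): known Mersenne primes, not a secure parameter choice.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                                            # public key
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)    # secret key: lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                                 # valid for generator g = N + 1


def encrypt(m):
    """Paillier encryption with g = N + 1, so g^m = 1 + m*N (mod N^2)."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N


def blind_difference(c_old, c_new, r):
    """Server side: from E(i_old), E(i_new) compute E(r * (i_old - i_new)) without the secret key."""
    diff = c_old * pow(c_new, -1, N2) % N2           # E(i_old - i_new)
    return pow(diff, r, N2)                          # E(r * (i_old - i_new))


def recover_r(i_old, i_new, c):
    """User side: decrypt, then divide out (i_old - i_new); fails when the indices coincide."""
    delta = (i_old - i_new) % N
    if delta == 0:
        return None                                  # plaintext is 0 and hides r completely
    return decrypt(c) * pow(delta, -1, N) % N


r1 = secrets.randbelow(N)
fresh = blind_difference(encrypt(5), encrypt(9), r1)     # i_1 = 5, new query i_3 = 9
repeat = blind_difference(encrypt(9), encrypt(9), r1)    # i_1 = 9 was already accessed
print(recover_r(5, 9, fresh) == r1, decrypt(repeat))
```

When the new record index equals an old one, the ciphertext decrypts to 0 regardless of r_1, which is what blocks the user from collecting the value needed for S_3.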

PIC Scheme #1 – Stage 2
- The user holds PK, SK; the server holds PK.
- For query (i_3, j_3), the user sends E(i_3), E(j_3), and a ZKPOK.
- The user recovers S_3.
- Table of encryptions:
  E(r_{1,1}(j – j_3) + r_{1,1}(i – i_3) + S_3 + x_{1,1})
  E(r_{1,2}(j – j_3) + r_{1,2}(i – i_3) + S_3 + x_{1,2})
  E(r_{2,1}(j – j_3) + r_{2,1}(i – i_3) + S_3 + x_{2,1})
  …
- The user does SPIR on the table of encryptions.

PIC Scheme #1 - Wrapup
- To extend to querying a record < m times: on the t-th query, let r_1, …, r_{t-1} be a (t-m+1)-out-of-(t-1) secret sharing of S_t.
- This scheme can be proven to be a TPIC; use the generic reduction to get a PIC.
- User Privacy: semantic security of E, ZK of the proof, privacy of SPIR.
- Inference Control: the user can recover at most t-m of the r_i if he has already queried the record m-1 times; one can build a simulator using SPIR with a knowledge extractor [NP99].
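The threshold arithmetic can be checked with an ordinary Shamir secret sharing. The slides do not say which sharing scheme is used, so Shamir over a prime field is an assumption here, as are the field modulus and all names in the sketch.

```python
import secrets

PRIME = 2**61 - 1   # field modulus (assumption: any prime larger than the secret would do)


def share(secret, threshold, count):
    """Split `secret` into `count` shares so that any `threshold` of them reconstruct it."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def poly(x):
        y = 0
        for a in reversed(coeffs):       # Horner evaluation modulo PRIME
            y = (y * x + a) % PRIME
        return y

    return [(x, poly(x)) for x in range(1, count + 1)]


def reconstruct(shares):
    """Lagrange interpolation at x = 0."""
    total = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


def shares_available(t, prior_hits):
    """On query t the user gets one share per earlier query that hit a different record."""
    return (t - 1) - prior_hits


t, m = 6, 3
S_t = 424242
r = share(S_t, threshold=t - m + 1, count=t - 1)        # 4-out-of-5 sharing of S_t
print(reconstruct(r[:4]) == S_t)                         # any 4 shares give back S_t
print(shares_available(t, m - 2) >= t - m + 1)           # record hit m-2 times before
print(shares_available(t, m - 1) >= t - m + 1)           # record hit m-1 times before
```

With m-1 earlier hits on the same record only t-m shares are available, one short of the threshold, which is the counting argument behind the inference-control bullet above.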

PIC Scheme #2 - Glimpse
[Figure: binary tree whose nodes carry keys K_{u,a}, K_{v,b}, K_{w,c}, K_{x,d}, K_{y,e}, K_{z,f}, with counts satisfying a + b = t]
- polylog(n)-communication PIC
- Balanced binary tree B
  - Leaves are attributes.
  - Parents of leaves are records.
  - An internal node n is accessed when record r is queried and n is on the path from r to the root.
- Keys encode the number of times nodes in B have been accessed.
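The bookkeeping behind the tree can be sketched without any cryptography. The class below only tracks access counts down to the record level (the attribute leaves and the keys themselves are left out), and it assumes the number of records is a power of two; the class and method names are hypothetical.

```python
class AccessTree:
    """Complete binary tree over the records, stored heap-style (root at index 1)."""

    def __init__(self, n_records):
        # Assumption for the sketch: n_records is a power of two.
        assert n_records > 0 and n_records & (n_records - 1) == 0
        self.n = n_records
        self.count = [0] * (2 * n_records)   # record r sits at index n + r

    def query(self, record):
        """Mark every node on the path from the record up to the root as accessed."""
        node = self.n + record
        while node >= 1:
            self.count[node] += 1
            node //= 2

    def consistent(self):
        """Each internal count equals the sum of its two children's counts."""
        return all(
            self.count[v] == self.count[2 * v] + self.count[2 * v + 1]
            for v in range(1, self.n)
        )


tree = AccessTree(4)
for record in [0, 2, 2, 3]:
    tree.query(record)
print(tree.count[1], tree.consistent())
```

The root count equals the total number of queries t, and the two children of the root split it as a + b = t, which is the relation shown on the slide.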

Conclusions
- Extensions not in this talk:
  - Multiple users (pseudonyms).
  - Collusion resistance: c-resistance => an m-channel becomes a collection of (m-1)/c channels.
- Summary:
  - New primitive: PIC.
  - (Almost) communication-optimal implementations.
