Shiyuan Wang, Divyakant Agrawal, Amr El Abbadi Department of Computer Science UC Santa Barbara DBSec 2010.

1 Shiyuan Wang, Divyakant Agrawal, Amr El Abbadi Department of Computer Science UC Santa Barbara DBSec 2010

2
• The Problem
  ◦ Practical private retrieval of public data
• Main Challenges
  ◦ Strong privacy, practical cost of retrieval
• Our proposal
  ◦ Absolute privacy in a bounding box
• Contributions
  ◦ Private retrieval service charge model
  ◦ Bounding-box PIR: generalizing k-Anonymity and PIR
  ◦ Query by key in one round
6/21/2010 S. Wang, D. Agrawal and A. El Abbadi

3 [Figure. Labels: Client, query, private query method, obfuscated query, untrusted server, public data, private data profile. Client: "I don't want to reveal my personal interest." Server: "I can provide this private retrieval service, if you pay for it."]

4
• Desiderata
  ◦ Practical
    • Minimize computation and communication costs
  ◦ Flexible
    • Allow clients to specify their desired degree of privacy ρ and service charge budget µ. Satisfy ρ without exceeding µ.
• Metrics of interest
  ◦ Performance metrics
    • Computation cost C_comp
    • Communication cost C_comm
  ◦ Quality of service metrics
    • Privacy breach probability P_brh (P_brh ≤ ρ)
    • Server charge C_srv (C_srv ≤ µ)
• Challenge
  ◦ Difficult to achieve both strong privacy and practical retrieval cost at the same time

5
• Principle
  ◦ Blur a data value with a range or partition s.t. each value is indistinguishable among at least k values. [Sama98, Swee02]
• Analysis: use k bits of data to anonymize 1 requested bit
  ◦ E.g. k = 30, query "June 17, 1972" -> obfuscated query "June, 1972"
  ◦ C_comp = k, C_comm = k + 1
  ◦ P_brh = 1/k, C_srv = k
• Pros
  ◦ Flexible
  ◦ Computationally cheap
• Cons
  ◦ Potential proximity breach for numeric data (due to a narrow anonymous range) [Li08]
  ◦ Plain text communication, subject to attack with background knowledge
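The analysis above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `k_anonymous_range` and `retrieve` and the random placement of the target inside the range are assumptions made for this example. The costs returned are the formulas from the slide.

```python
import random

def k_anonymous_range(target_idx, n, k, rng=random):
    """Return a consecutive index range [lo, hi) of width k covering target_idx."""
    if not 0 <= target_idx < n or not 1 <= k <= n:
        raise ValueError("need 0 <= target_idx < n and 1 <= k <= n")
    # Assumption: the range start is drawn at random so the target does not
    # sit at a predictable offset inside the range.
    lo_min = max(0, target_idx - k + 1)
    lo_max = min(target_idx, n - k)
    lo = rng.randint(lo_min, lo_max)
    return lo, lo + k

def retrieve(data, target_idx, k, rng=random):
    """Fetch data[target_idx] by asking for a k-wide range in plain text."""
    lo, hi = k_anonymous_range(target_idx, len(data), k, rng)
    candidates = data[lo:hi]              # the server returns k items
    costs = {"C_comp": k, "C_comm": k + 1, "P_brh": 1 / k, "C_srv": k}
    return candidates[target_idx - lo], costs

# Example in the spirit of the slide: hide one day among the 30 days of June.
days = [f"June {d}, 1972" for d in range(1, 31)]
value, costs = retrieve(days, 16, 30)
```

With k = 30 the only possible range is the whole month, which mirrors the "June, 1972" obfuscated query.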

6
• Principle
  ◦ Achieve computationally complete privacy by applying cryptographic computations over the entire public data [Kush97]
• Pros
  ◦ Complete privacy for clients
  ◦ Secure communication
• Cons
  ◦ Orders of magnitude less efficient than simply transferring the entire data from the server to the client [Sion07]
[Figure: the server holds the public data X = (X1, X2, ..., Xn). The client sends encrypted(q) for q = "give me ith record"; the server returns encrypted-result = f(X, encrypted(q)), from which the client obtains Xi.]

7
• Quadratic Residue (QR)
• x is a quadratic residue (QR) mod N if there exists a y such that y^2 ≡ x (mod N); otherwise x is a quadratic non-residue (QNR)
  ◦ E.g. N = 35: 11 is a QR (9^2 ≡ 11 mod 35), 3 is a QNR (no y exists with y^2 ≡ 3 mod 35)
  ◦ Essential properties:
    • QR × QR = QR
    • QR × QNR = QNR
• Let N = p_1 × p_2, where p_1 and p_2 are large primes of m/2 bits.
• Quadratic Residuosity Assumption (QRA)
  ◦ Determining whether a number is a QR or a QNR is computationally hard if p_1 and p_2 are not given.
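The toy modulus N = 35 is small enough to check these statements by brute force. The sketch below assumes, as the QR and QNR sets listed on the next slide suggest, that "QNR" means a non-residue modulo both prime factors (the non-residues a server cannot cheaply tell apart from QRs). The helper name `is_square_mod` is introduced only for this example.

```python
from math import gcd

P1, P2 = 5, 7            # toy primes; real deployments use large primes
N = P1 * P2

def is_square_mod(x, p):
    """True if x is a nonzero square modulo the prime p (brute force)."""
    return any((y * y - x) % p == 0 for y in range(1, p))

units = [x for x in range(1, N) if gcd(x, N) == 1]

# A unit is a QR mod N exactly when it is a square mod both primes.
QR = [x for x in units if is_square_mod(x, P1) and is_square_mod(x, P2)]
# Assumed meaning of QNR here: a non-square mod both primes.
QNR = [x for x in units if not is_square_mod(x, P1) and not is_square_mod(x, P2)]

# The two essential properties, checked exhaustively for N = 35.
qr_closed = all((a * b) % N in QR for a in QR for b in QR)
mixed_is_qnr = all((a * b) % N in QNR for a in QR for b in QNR)
```

Both sets have six elements, and both property checks evaluate to `True` for this modulus.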

8 [Figure, adapted from Tan's presentation. Public data size: n = 16, organized in an s×t (4×4) binary matrix M with rows 0101, 1101, 0101, 0111. Get M_2,3: e = 2, g = 3, N = 35, m = 6, QNR = {3, 12, 13, 17, 27, 33}, QR = {1, 4, 9, 11, 16, 29}. Numbers shown: 4, 16, 17, 11 (one marked QNR) and 17, 33, 17, 27; result labels z_4, z_3, z_2, z_1. z_2 = QNR => M_2,3 = 1; z_2 = QR => M_2,3 = 0.]
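The retrieval in this example can be sketched end to end. The server-side rule used here (multiply y_j for a 1 bit and y_j^2 for a 0 bit) is the usual Kushilevitz-Ostrovsky-style construction and is an assumption about what the figure showed. The matrix is read row by row from the flattened text with 0-indexed addresses, which is also an assumption, so the sketch does not try to reproduce the figure's numbers.

```python
import random
from math import gcd

P1, P2 = 5, 7                      # secret factors, known only to the client
N = P1 * P2                        # public modulus

def is_qr(x):
    """Client-side test with the secret factors (Euler's criterion per prime)."""
    return all(pow(x, (p - 1) // 2, p) == 1 for p in (P1, P2))

UNITS = [x for x in range(1, N) if gcd(x, N) == 1]
QR = [x for x in UNITS if is_qr(x)]
QNR = [x for x in UNITS
       if all(pow(x, (p - 1) // 2, p) == p - 1 for p in (P1, P2))]

def client_query(g, t, rng):
    """One number per column: a QNR in column g, QRs everywhere else."""
    return [rng.choice(QNR if j == g else QR) for j in range(t)]

def server_answer(M, y):
    """Per row: product of y_j (bit 1) or y_j^2 (bit 0) over all columns."""
    z = []
    for row in M:
        prod = 1
        for bit, yj in zip(row, y):
            prod = prod * (yj if bit else yj * yj) % N
        z.append(prod)
    return z

def client_decode(z, e):
    """Row e's product is a QNR exactly when the requested bit is 1."""
    return 0 if is_qr(z[e]) else 1

M = [[0, 1, 0, 1],
     [1, 1, 0, 1],
     [0, 1, 0, 1],
     [0, 1, 1, 1]]

y = client_query(2, 4, random.Random(0))     # ask for column index 2
bit = client_decode(server_answer(M, y), 3)  # read row index 3
```

Every factor in a row's product is a QR except possibly the one from column g, which is a QNR only when the bit is 1, so the properties QR × QR = QR and QR × QNR = QNR give the decoding rule.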

9
• Principles
  ◦ Rely on cPIR cryptographic operations to achieve strong privacy
  ◦ Trade partial privacy of cPIR for practical performance
  ◦ Adopt the flexible privacy principle of k-Anonymity
• Basic idea
  ◦ Bound expensive cryptographic computations in an r×c bounding box BB, a sub-matrix of M.
  ◦ (1) Satisfy client's privacy requirement: r×c = 1/ρ
  ◦ (2) Minimize C_comm -> minimize (c + b×r)
• Properties
  ◦ The bounding box contains both the data whose values are close to the query value and the data whose values are not close.
  ◦ Unify k-Anonymity and cPIR by varying dimensions of the bounding box
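Conditions (1) and (2) amount to a small search over box shapes. The sketch below is only an illustration: `choose_box` is a hypothetical helper, the area constraint is relaxed to r×c ≥ k (with k = 1/ρ) because r and c must be integers, and b, which the slides do not define, is assumed to be the number of bits per data item. The service charge budget µ is not modeled.

```python
def choose_box(k, b, s, t):
    """Pick (r, c) with r*c >= k, r <= s, c <= t that minimizes c + b*r.

    k: required box area, i.e. 1/rho (assumed integer)
    b: assumed number of bits per data item
    s, t: number of rows and columns of the server matrix M
    Returns (r, c, cost).
    """
    best = None
    for r in range(1, s + 1):
        c = -(-k // r)                 # integer ceiling of k / r
        if c > t:
            continue                   # box would not fit in the matrix
        cost = c + b * r
        if best is None or cost < best[0]:
            best = (cost, r, c)
    if best is None:
        raise ValueError("no box of area >= k fits in the matrix")
    return best[1], best[2], best[0]

r, c, cost = choose_box(k=100, b=1, s=1000, t=1000)
```

For b = 1 the cheapest box is square; a larger b pushes the optimum toward fewer rows and more columns, since each extra row costs b answers.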

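10 [Figure: bbPIR on the same 4×4 matrix (rows 0101, 1101, 0101, 0111), with the query y and the answer z restricted to the bounding box BB. Get M_2,3: e = 2, g = 3, N = 35, m = 6, QNR = {3, 12, 13, 17, 27, 33}, QR = {1, 4, 9, 11, 16, 29}. Numbers shown: 16, 17 (one marked QNR) and 17, 27. z_2 = QNR => M_2,3 = 1.]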
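Restricting the same computation to a bounding box can be sketched as follows. This is a minimal illustration under stated assumptions: the function `bbpir`, the 0-indexed addressing, and the server-side rule (y_j for a 1 bit, y_j^2 for a 0 bit) are choices made for this example and are not taken from the figure.

```python
import random

P1, P2 = 5, 7                      # toy secret primes held by the client
N = P1 * P2

def residuosity(x):
    """+1 for a QR mod N, -1 for a non-residue mod both primes, else 0."""
    signs = [pow(x, (p - 1) // 2, p) for p in (P1, P2)]
    if signs == [1, 1]:
        return 1
    if signs == [P1 - 1, P2 - 1]:
        return -1
    return 0

QR = [x for x in range(1, N) if residuosity(x) == 1]
QNR = [x for x in range(1, N) if residuosity(x) == -1]

def bbpir(M, e, g, rows, cols, rng):
    """Retrieve M[e][g] touching only the box rows x cols (range objects)."""
    if e not in rows or g not in cols:
        raise ValueError("the requested address must lie inside the box")
    # Client: c numbers, a QNR for column g and QRs for the other box columns.
    y = [rng.choice(QNR if j == g else QR) for j in cols]
    # Server: r products, one per box row, over the box columns only.
    z = []
    for i in rows:
        prod = 1
        for j, yj in zip(cols, y):
            prod = prod * (yj if M[i][j] else yj * yj) % N
        z.append(prod)
    # Client: decode the product belonging to row e.
    bit = 0 if residuosity(z[e - rows.start]) == 1 else 1
    return bit, len(y), len(z)

M = [[0, 1, 0, 1],
     [1, 1, 0, 1],
     [0, 1, 0, 1],
     [0, 1, 1, 1]]

bit, sent, received = bbpir(M, 2, 2, range(1, 3), range(1, 3), random.Random(1))
```

With a 2×2 box the client sends two numbers and receives two, instead of four each for the full matrix; the server learns only that the request lies somewhere in the box.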

11 [Figure: public data size n = 16, keys arranged in a 4×4 matrix with axes g and e (rows as shown: 8 33 56 89; 7 26 54 80; 5 23 53 79; 1 16 45 72). Query: retrieve the item with key 53. Three panels on the same matrix: cPIR, k-Anonymity, and bbPIR with its bounding box. Costs shown for k-Anonymity: C_comp = k = 4, C_comm = k + 1 = 5, P_brh = 1/k = 1/4, C_srv = k = 4.]

12
• Limitation of previous formulation: query by matrix address
• Solution for query by key: find address by key
  ◦ Candidate solution I: third party translation, as in Casper [Mokb07]
    • Cons: security subject to a third party
  ◦ Candidate solution II: an index structure on server mapping key to address [Chor97]
    • Cons: needs O(b × log n) times communication
  ◦ Our proposal: server publishes a histogram H on the key field to authorized clients.
    • Client calculates an address range for the queried entry by searching the bin in which the entry falls.
    • Pros: If the bin size w ≤ s, only need to run one round of bbPIR

13
• In the client's view, the server matrix M is a histogram matrix HM, so the address of the requested item x maps to an address range covering the items in the same bin as x.
[Figure: with w = 2, the requested item M_2,3 falls in histogram bin HM_1,3, which covers (M_1,3, M_2,3). Histogram bins shown (high -- low): 5 -- 1, 13 -- 7, 23 -- 16, 40 -- 26, 53 -- 45, 70 -- 54, 79 -- 72, 93 -- 80, 100 -- 94, 138 -- 101. Key values shown: 94 72 45 16 1; 100 79 53 23 5; 101 80 54 26 7; 107 89 60 33 8; 138 93 70 40 13. Axes: g and e.]
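The client-side lookup can be sketched with a binary search over the published bin boundaries. The boundaries below are read off the flattened figure; the layout (two bins stacked per column, w = 2 rows per bin, 1-indexed addresses) is an assumption inferred from "HM 1,3 covers (M 1,3, M 2,3)", and `address_range` is a hypothetical helper.

```python
from bisect import bisect_right

# Bin boundaries as read from the figure, sorted by lower bound.
BIN_LOWS = [1, 7, 16, 26, 45, 54, 72, 80, 94, 101]
BIN_HIGHS = [5, 13, 23, 40, 53, 70, 79, 93, 100, 138]
W = 2                  # rows covered by one bin (w on the slide)
BINS_PER_COLUMN = 2    # assumed layout: bins are stacked column by column

def address_range(key):
    """Return (column, rows) of the bin containing key, 1-indexed."""
    i = bisect_right(BIN_LOWS, key) - 1
    if i < 0 or key > BIN_HIGHS[i]:
        raise KeyError(f"{key} falls in no published bin")
    column = i // BINS_PER_COLUMN + 1
    first_row = (i % BINS_PER_COLUMN) * W + 1
    return column, list(range(first_row, first_row + W))

column, rows = address_range(53)
```

The client would then place its bounding box so that it covers every row in the returned range, which is why a single round of bbPIR suffices when w ≤ s.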

14
• Implementation of three private retrieval methods
  ◦ bbPIR, cPIR
  ◦ k-Anonymity: anonymize the private query item by specifying a consecutive range that covers the item
• Data set
  ◦ Generated n = 10^6 data records with 3 attributes based on an Adult census data set with 32561 records of 15 attributes.
  ◦ For the experiment on proximity privacy of numeric data only, generated 10^6 numeric data values following a Zipf distribution in [0.0, 1.0].
• Settings
  ◦ Test bed: Intel 2.40GHz CPU, 3GB memory, Fedora Core 8 OS
  ◦ Default parameter values: ρ = 0.001, µ = 50, k = 1000, m = 1024

20
• We proposed a practical, flexible and secure approach for private retrieval of public data in single server settings, called Bounding-Box PIR (bbPIR).
• bbPIR generalizes cPIR and k-Anonymity based private retrieval methods.
• We incorporated the realistic assumption of charging clients for the exposed service data.
• We achieved query by key without running additional rounds of bbPIR.

21
• [Sama98] P. Samarati et al. Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. Technical report, 1998.
• [Swee02] L. Sweeney. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5):557-570, 2002.
• [Li08] J. Li et al. Preservation of proximity privacy in publishing numerical sensitive data. In SIGMOD 2008.
• [Mokb07] M. Mokbel et al. The new casper: A privacy-aware location-based database server. In ICDE 2007.
• [Kush97] E. Kushilevitz et al. Replication is not needed: Single database, computationally-private information retrieval. In FOCS 1997.
• [Sion07] R. Sion et al. On the computational practicality of private information retrieval. In NDSS 2007.
• [Chor97] B. Chor et al. Private information retrieval by keywords. Technical Report, TRCS 0917, Technion.

