Published by Brooke Davis. Modified over 9 years ago.
1
Shiyuan Wang, Divyakant Agrawal, Amr El Abbadi
Department of Computer Science, UC Santa Barbara
DBSec 2010
2
The Problem
◦ Practical private retrieval of public data
Main Challenges
◦ Strong privacy at a practical cost of retrieval
Our Proposal
◦ Absolute privacy in a bounding box
Contributions
◦ Private retrieval service charge model
◦ Bounding-box PIR: generalizing k-Anonymity and PIR
◦ Query by key in one round
6/21/2010, S. Wang, D. Agrawal and A. El Abbadi
3
[Figure: a client with a private data profile passes its query through a private query method, which sends an obfuscated query to an untrusted server holding public data.]
◦ Client: "I don't want to reveal my personal interest."
◦ Server: "I can provide this private retrieval service, if you pay for it."
4
Desiderata
◦ Practical: minimize computation and communication costs
◦ Flexible: allow clients to specify their desired degree of privacy ρ and service charge budget µ; satisfy ρ without exceeding µ
Metrics of interest
◦ Performance metrics: computation cost C_comp, communication cost C_comm
◦ Quality of service metrics: privacy breach probability P_brh (P_brh ≤ ρ), server charge C_srv (C_srv ≤ µ)
Challenge
◦ It is difficult to achieve both strong privacy and practical retrieval cost at the same time
5
Principle of k-Anonymity
◦ Blur a data value with a range or partition such that each value is indistinguishable among at least k values [Sama98, Swee02]
Analysis: k bits of data are used to anonymize 1 requested bit
◦ E.g. k = 30: query "June 17, 1972" -> obfuscated query "June, 1972"
◦ C_comp = k, C_comm = k + 1
◦ P_brh = 1/k, C_srv = k
Pros
◦ Flexible
◦ Computationally cheap
Cons
◦ Potential proximity breach for numeric data (due to a narrow anonymous range) [Li08]
◦ Plain-text communication, subject to attack with background knowledge
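The range-based anonymization analysed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `anonymous_range` and `retrieve` and the key list are hypothetical, and the window placement is randomized as one plausible choice.

```python
import random
from bisect import bisect_left

def anonymous_range(sorted_keys, key, k):
    """Client side: pick a window of k consecutive keys that covers `key`.

    The window position is random, so the queried key does not sit at a
    fixed offset inside the range that is sent in plain text.
    """
    if not 1 <= k <= len(sorted_keys):
        raise ValueError("k must be between 1 and the number of keys")
    i = bisect_left(sorted_keys, key)
    if i == len(sorted_keys) or sorted_keys[i] != key:
        raise KeyError(key)
    start = random.randint(max(0, i - k + 1), min(i, len(sorted_keys) - k))
    return sorted_keys[start], sorted_keys[start + k - 1]

def retrieve(sorted_keys, lo, hi):
    """Server side: return every key in the plain-text range [lo, hi]."""
    return [x for x in sorted_keys if lo <= x <= hi]

# Hypothetical public keys (distinct, sorted); not data from the slides.
KEYS = [1, 5, 7, 8, 16, 23, 26, 33, 45, 53, 54, 56, 72, 79, 80, 89]

lo, hi = anonymous_range(KEYS, 53, k=4)
candidates = retrieve(KEYS, lo, hi)
print("range:", (lo, hi), "returned:", candidates)
print("P_brh =", 1 / len(candidates), "C_srv =", len(candidates))
```

The server sees only the range, so it can guess the queried key with probability 1/k, and it hands over k items, which is what the charge C_srv = k reflects. A narrow range over numeric keys still reveals roughly where the key lies, which is the proximity breach listed under Cons.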
6
Principle of computational PIR (cPIR)
◦ Achieve computationally complete privacy by applying cryptographic computations over the entire public data [Kush97]
Pros
◦ Complete privacy for clients
◦ Secure communication
Cons
◦ Orders of magnitude less efficient than simply transferring the entire data from the server to the client [Sion07]
Protocol
◦ The server holds the public data X = (X_1, X_2, ..., X_n)
◦ The client wants the i-th record: q = "give me the i-th record"; it sends encrypted(q)
◦ The server returns encrypted-result = f(X, encrypted(q)), from which the client recovers X_i
7
Quadratic Residue (QR)
◦ x is a quadratic residue (QR) mod N if there exists a y such that y^2 ≡ x (mod N); otherwise x is a quadratic non-residue (QNR)
◦ E.g. N = 35: 11 is a QR (9^2 = 81 ≡ 11 mod 35), 3 is a QNR (no y exists with y^2 ≡ 3 mod 35)
◦ Essential properties: QR × QR = QR, QR × QNR = QNR
◦ Let N = p_1 × p_2, where p_1 and p_2 are large primes of m/2 bits each
Quadratic Residuosity Assumption (QRA)
◦ Determining whether a number is a QR or a QNR is computationally hard if p_1 and p_2 are not given
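The definitions above can be checked on the toy modulus N = 35 with a short sketch. The helper names are hypothetical. The QNR set computed here is restricted to values that are non-residues modulo both prime factors, which is an assumption about which QNRs the scheme uses; it is the subset for which the two product properties hold, and it happens to contain the slide's example value 3.

```python
from math import gcd

P1, P2 = 5, 7           # toy factors of the slide's N = 35; real use needs large primes
N = P1 * P2

def is_square_mod_prime(x: int, p: int) -> bool:
    """Euler's criterion: x is a square mod an odd prime p iff x^((p-1)/2) = 1 mod p."""
    return pow(x, (p - 1) // 2, p) == 1

UNITS = [x for x in range(1, N) if gcd(x, N) == 1]

# With the factors known, residuosity is easy: test each prime separately.
QR = {x for x in UNITS
      if is_square_mod_prime(x, P1) and is_square_mod_prime(x, P2)}

# Assumption: use only QNRs that are non-squares modulo BOTH primes.
QNR = {x for x in UNITS
       if not is_square_mod_prime(x, P1) and not is_square_mod_prime(x, P2)}

print("QR  =", sorted(QR))
print("QNR =", sorted(QNR))
print("QR x QR in QR:  ", all(a * b % N in QR for a in QR for b in QR))
print("QR x QNR in QNR:", all(a * b % N in QNR for a in QR for b in QNR))
```

The classification above is only cheap because P1 and P2 are known; the QRA says that telling the two sets apart is hard for anyone who sees N alone.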
8
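cPIR example (adapted from Tan's presentation)
◦ Public data size: n = 16, organized in an s×t (4×4) binary matrix M with rows 0101, 1101, 0101, 0111
◦ Goal: get M_{2,3}, i.e. row e = 2, column g = 3; N = 35, m = 6
◦ QR = {1, 4, 9, 11, 16, 29}, QNR = {3, 12, 13, 17, 27, 33}
◦ Client query: the values 4, 16, 17, 11, of which only 17 is a QNR
◦ Server response: the values 17, 33, 17, 27 (the slide labels the responses z4, z3, z2, z1)
◦ Decoding: z_2 = QNR => M_{2,3} = 1; z_2 = QR => M_{2,3} = 0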
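The retrieval in this example can be sketched in the general shape of the quadratic-residue scheme of [Kush97]: the client sends one value per column (a QNR at the wanted column, QRs elsewhere), the server multiplies per row, and the client tests the residuosity of the answer for the wanted row. Everything below is a hedged illustration: the function names are hypothetical, rows and columns are assumed to be numbered from 1 starting at the top-left, and the query values are drawn at random, so the numbers will generally differ from those printed on the slide.

```python
import random

P1, P2 = 5, 7            # client's secret factors (toy size, as on the slide)
N = P1 * P2              # public modulus
QR = [1, 4, 9, 11, 16, 29]       # sets listed on the slide for N = 35
QNR = [3, 12, 13, 17, 27, 33]

def is_qr(x: int) -> bool:
    """Client-only test: uses the secret factors (Euler's criterion per prime)."""
    return pow(x, (P1 - 1) // 2, P1) == 1 and pow(x, (P2 - 1) // 2, P2) == 1

def make_query(t: int, g: int) -> list[int]:
    """One value per column: a QNR at column g (1-based), QRs everywhere else."""
    return [random.choice(QNR if j == g else QR) for j in range(1, t + 1)]

def answer(M: list[list[int]], y: list[int]) -> list[int]:
    """Server: for each row, multiply y_j for a 1-bit and y_j^2 for a 0-bit."""
    z = []
    for row in M:
        acc = 1
        for bit, yj in zip(row, y):
            acc = acc * (yj if bit else yj * yj) % N
        z.append(acc)
    return z

def decode(z: list[int], e: int) -> int:
    """Client: row e's answer is a QNR exactly when the wanted bit is 1."""
    return 0 if is_qr(z[e - 1]) else 1

# The slide's 4x4 matrix, read top to bottom (an assumed orientation).
M = [[0, 1, 0, 1],
     [1, 1, 0, 1],
     [0, 1, 0, 1],
     [0, 1, 1, 1]]

y = make_query(t=4, g=3)
z = answer(M, y)
print("query y:", y, "answer z:", z, "bit at (2,3):", decode(z, e=2))
```

Every squared value and every QR factor keeps the product a QR, so only the factor at column g decides the outcome. The cost is visible in the loops: the server touches all n = s×t bits for a single retrieved bit.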
9
Principles
◦ Rely on cPIR cryptographic operations to achieve strong privacy
◦ Trade part of the privacy of cPIR for practical performance
◦ Adopt the flexible privacy principle of k-Anonymity
Basic idea
◦ Bound expensive cryptographic computations in an r×c bounding box BB, a sub-matrix of M
◦ (1) Satisfy the client's privacy requirement: r×c = 1/ρ
◦ (2) Minimize C_comm -> minimize (c + b×r)
Properties
◦ The bounding box contains both data whose values are close to the query value and data whose values are not close
◦ Unify k-Anonymity and cPIR by varying the dimensions of the bounding box
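One way to read conditions (1) and (2) is as a small search over box heights. The sketch below is an assumption-laden illustration, not the authors' procedure: `choose_box` is a hypothetical name, `cells` stands for 1/ρ given as an integer, the equality r×c = 1/ρ is relaxed to r×c ≥ 1/ρ because r and c must be whole numbers, and b is taken to be the number of bits per data item, which the slide does not define.

```python
import math

def choose_box(cells: int, b: int, s: int, t: int) -> tuple[int, int]:
    """Return (r, c) with r*c >= cells that minimizes c + b*r inside an s x t matrix.

    cells : required box area, i.e. 1/rho as an integer (assumption)
    b     : weight of a row in the communication cost (assumed bits per item)
    s, t  : number of rows and columns of the server matrix M
    """
    best = None
    for r in range(1, s + 1):
        c = math.ceil(cells / r)        # narrowest box of height r that is large enough
        if c > t:
            continue                    # box would not fit in the matrix
        cost = c + b * r
        if best is None or cost < best[0]:
            best = (cost, r, c)
    if best is None:
        raise ValueError("no bounding box of the requested area fits in the matrix")
    return best[1], best[2]

r, c = choose_box(cells=1000, b=1, s=1000, t=1000)
print("rho = 0.001, b = 1 ->", (r, c), "cost", c + r)
print("rho = 1/4,   b = 1 ->", choose_box(cells=4, b=1, s=4, t=4))
```

Ignoring rounding, the cost c + b×r under r×c = 1/ρ is smallest at c = sqrt(b/ρ) and r = sqrt(1/(b×ρ)), so a larger b pushes the box towards a single wide row and a smaller b towards a square.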
10
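bbPIR example: the cPIR example restricted to a bounding box BB
◦ Goal: get M_{2,3}, i.e. e = 2, g = 3; N = 35, m = 6
◦ QR = {1, 4, 9, 11, 16, 29}, QNR = {3, 12, 13, 17, 27, 33}
◦ Client query y: the values 16, 17, of which 17 is the QNR
◦ Server response z: the values 17, 27
◦ Decoding: z_2 = QNR => M_{2,3} = 1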
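The bounding-box variant can be sketched by running the same residue-based exchange over a sub-matrix only. This is an illustration under stated assumptions: the function names are hypothetical, indices are 0-based from the top-left for brevity (unlike the slide's e and g), the box is passed as two `range` objects, and the QR and QNR sets are the ones the slide lists for N = 35.

```python
import random

N = 35
QR = [1, 4, 9, 11, 16, 29]       # sets listed on the slide for N = 35
QNR = [3, 12, 13, 17, 27, 33]

def bb_query(cols: range, g: int) -> list[int]:
    """One value per column of the box: a QNR at column g, QRs elsewhere."""
    return [random.choice(QNR if j == g else QR) for j in cols]

def bb_answer(M, rows: range, cols: range, y: list[int]) -> list[int]:
    """Server: same row products as full cPIR, but only over the box."""
    z = []
    for i in rows:
        acc = 1
        for j, yj in zip(cols, y):
            acc = acc * (yj if M[i][j] else yj * yj) % N
        z.append(acc)
    return z

def bb_retrieve(M, e: int, g: int, rows: range, cols: range):
    """Return (bit, values sent, values received) for target (e, g) inside the box."""
    if e not in rows or g not in cols:
        raise ValueError("target must lie inside the bounding box")
    y = bb_query(cols, g)
    z = bb_answer(M, rows, cols, y)
    bit = 0 if z[e - rows.start] in QR else 1   # toy decoding via the listed QR set
    return bit, len(y), len(z)

M = [[0, 1, 0, 1],
     [1, 1, 0, 1],
     [0, 1, 0, 1],
     [0, 1, 1, 1]]

box_rows, box_cols = range(1, 3), range(1, 3)   # a 2x2 box, chosen arbitrarily
bit, sent, received = bb_retrieve(M, e=1, g=2, rows=box_rows, cols=box_cols)
print("bit:", bit, "sent:", sent, "received:", received)
print("breach probability inside the box:", 1 / (len(box_rows) * len(box_cols)))
```

With a 2×2 box the client sends 2 values and receives 2 instead of 4 and 4, and the server multiplies over 4 cells instead of 16. The price is that the server now knows the target is one of those 4 cells.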
11
Comparison example
◦ Public data size: n = 16
◦ Query: retrieve the item with key 53
◦ Data matrix (axes labelled e and g), shown once for each of cPIR, k-Anonymity and bbPIR (with its bounding box marked):
  8  33  56  89
  7  26  54  80
  5  23  53  79
  1  16  45  72
◦ Costs shown for k = 4: C_comp = k = 4, C_comm = k + 1 = 5, P_brh = 1/k = 1/4, C_srv = k = 4
12
Limitation of the previous formulation: query by matrix address
Solution for query by key: find the address by key
◦ Candidate solution I: third-party translation, as in Casper [Mokb07]
  Cons: security depends on a third party
◦ Candidate solution II: an index structure on the server mapping key to address [Chor97]
  Cons: needs O(b × log n) times the communication
◦ Our proposal: the server publishes a histogram H on the key field to authorized clients. The client calculates an address range for the queried entry by finding the bin into which the entry falls.
  Pros: if the bin size w ≤ s, only one round of bbPIR is needed
13
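In the clients' view, the server matrix M is a histogram matrix HM, so the address of the requested item x maps to the address range of the items that fall in the same bin as x.
[Figure: matrix M next to histogram matrix HM, axes labelled e and g, bin size w = 2. The requested item M_{2,3} lies in bin HM_{1,3}, which covers (M_{1,3}, M_{2,3}).]
◦ M, rows e = 1..4 and columns g = 1..5:
  e = 1:  1  16  45  72   94
  e = 2:  5  23  53  79  100
  e = 3:  7  26  54  80  101
  e = 4:  8  33  60  89  107
◦ HM, bins given as lower -- upper:
  row 1:  1 -- 5   16 -- 23   45 -- 53   72 -- 79   94 -- 100
  row 2:  7 -- 13  26 -- 40   54 -- 70   80 -- 93   101 -- 138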
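The client-side lookup described above can be sketched with a binary search over the published bin boundaries. The bin ranges are the ones shown on this slide; everything else is an assumption made for illustration: the function names are hypothetical, each bin is taken to hold exactly w items, and the sorted items are assumed to be laid out column by column in a matrix with s rows.

```python
from bisect import bisect_right

# Bin boundaries from the slide's histogram, in ascending key order.
LOWERS = [1, 7, 16, 26, 45, 54, 72, 80, 94, 101]
UPPERS = [5, 13, 23, 40, 53, 70, 79, 93, 100, 138]

def bin_for_key(lowers, uppers, key):
    """Index of the histogram bin whose [lower, upper] range contains `key`."""
    i = bisect_right(lowers, key) - 1       # last bin whose lower bound is <= key
    if i < 0 or key > uppers[i]:
        raise KeyError(key)                 # key falls outside every published bin
    return i

def address_range(bin_index: int, w: int, s: int):
    """1-based (row, column) addresses covered by a bin.

    Assumes an equal bin size w and a column-major layout with s rows.
    """
    positions = range(bin_index * w, (bin_index + 1) * w)
    return [(p % s + 1, p // s + 1) for p in positions]

i = bin_for_key(LOWERS, UPPERS, 53)
print("bin:", (LOWERS[i], UPPERS[i]))
print("candidate addresses:", address_range(i, w=2, s=4))
```

For key 53 this yields the bin 45 -- 53 and the addresses (1, 3) and (2, 3), matching the slide's HM_{1,3} covering (M_{1,3}, M_{2,3}). The client can then place its bounding box over that whole address range and fetch every item in the bin in a single round.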
14
Implementation of three private retrieval methods
◦ bbPIR, cPIR
◦ k-Anonymity: anonymize the private query item by specifying a consecutive range that covers the item
Data set
◦ Generated n = 10^6 data records with 3 attributes, based on an Adult census data set with 32561 records of 15 attributes
◦ Only for the experiment on proximity privacy of numeric data: generated 10^6 numeric values following a Zipf distribution in [0.0, 1.0]
Settings
◦ Test bed: Intel 2.40GHz CPU, 3GB memory, Fedora Core 8 OS
◦ Default parameter values: ρ = 0.001, µ = 50, k = 1000, m = 1024
20
◦ We proposed a practical, flexible and secure approach for private retrieval of public data in single-server settings, called Bounding-Box PIR (bbPIR).
◦ bbPIR generalizes cPIR and k-Anonymity based private retrieval methods.
◦ We incorporated the realistic assumption of charging clients for the exposed service data.
◦ We achieved query by key without running additional rounds of bbPIR.
21
[Sama98] P. Samarati et al. Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. Technical report, 1998.
[Swee02] L. Sweeney. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5):557--570, 2002.
[Li08] J. Li et al. Preservation of proximity privacy in publishing numerical sensitive data. In SIGMOD 2008.
[Mokb07] M. Mokbel et al. The new casper: A privacy-aware location-based database server. In ICDE 2007.
[Kush97] E. Kushilevitz et al. Replication is not needed: Single database, computationally-private information retrieval. In FOCS 1997.
[Sion07] R. Sion et al. On the computational practicality of private information retrieval. In NDSS 2007.
[Chor97] B. Chor et al. Private information retrieval by keywords. Technical Report, TRCS 0917, Technion.