Download presentation
Presentation is loading. Please wait.
Published byPrimrose Morris Modified over 9 years ago
1
DYNAMIC ENFORCEMENT OF KNOWLEDGE-BASED SECURITY POLICIES Piotr (Peter) Mardziel, Stephen Magill, Michael Hicks, and Mudhakar Srivatsa
2
Your information 2
3
Bad + Good 3
4
Want Protect personal information. Preserve beneficial uses. 4
5
Take back control Photography 5 querier you
6
Useful and non-revealing Photography out = 24 ≤ Age ≤ 30 Æ Female? Æ Engaged? * true 6 * real query used by a facebook advertiser querier you
7
out = (age, zip-code, birth-date) * reject 7 * - zip-code: postal code in USA - age, zip-code, birth-date can be used to uniquely identify 63% of Americans
8
Benefits Can provide exact answer to query. Need not know ahead of time how your information can be useful. Need not preemptively hide information. doesn’t restricts uses 8
9
The problem: how to decide to run a query or not? out := (age, zip-code, birth-date) out := age out := 24 ≤ age ≤ 30 Æ female? Æ engaged? Q1 Q2 Q3 ? 9
10
Setting Past answers, unknown future queries. Cannot view queries in isolation. Cannot use various static QIF measures. Rejection. Cannot let rejection defeat our protection. time Q41 Q43 … … Q42 X reject 10
11
Goal: restrict querier knowledge Define knowledge. Define learning. 11
12
Knowledge model Given belief, a query, determine revised belief given the output of the query (Bayesian revision) evaluate revised belief as acceptable or not Can repeat the process on future queries start with the revised belief from the previous one = belief (probability distribution) over secret data 12
13
Meet Bob = 0 · bday · 364 1956 · byear · 1992 Bob (born September 24, 1980) bday = 267 byear = 1980 Secret bday byear 1956 1992 0364 13 Assumption: this is accurate each equally likely
14
bday-query1 today := 260; if bday ¸ today Æ bday < (today + 7) then out := 1 else out := 0 | = (out = 0) 1956 1992 0364 259 267 14 1956 1992 0364 P
15
bday-query1 today := 260; if bday ¸ today Æ bday < (today + 7) then out := 1 else out := 0 | = (out = 0) 1956 1992 0364 259 267 = 15 Problem Policy: Is this acceptable?
16
Probabilistic interpretation We can use (general purpose) probabilistic languages to model attacker knowledge and how it changes due to query results. prob-scheme, IBAL, etc. Problem: (standard) enumeration-based probabilistic interpretation and revision is too slow. 16
17
Outline Our results Problem 1: Suitable policy definition. Problem 2: Approximate (but sound in terms of policy) probabilistic computation. Implementation. Experimental results. compare to prob-scheme, enumeration-based probabilistic interpretation 17
18
Problem 1: Policy Let us consider a policy Pr[my secret] < t Choice of threshold t might depend on risks involved in revelation of secret. 18
19
Bob’s policy = 0 · bday · 364 1956 · byear · 1992 Bob (born September 24, 1980) bday = 267 byear = 1980 Secret Policy Pr[bday] < 0.2 Pr[bday,byear] < 0.05 bday byear 1956 1992 0364 Currently Pr[bday] = 1/365 Pr[bday,byear] = 1/(365*37) Pr[bday = 267] … 19 each equally likely P
20
Bob’s policies Policy Pr[bday] < 0.2 bday specifically should never be certainly known 20
21
Bob’s policies 21 Policy Pr[bday,byear] < 0.05 the pair of bday,byear should never be certainly known doesn’t commit to protecting bday nor byear alone allows queries that need high precision in one, or low precision in both
22
bday-query1 today := 260; if bday ¸ today Æ bday < (today + 7) then out := 1 else out := 0 | = (out = 0) 1956 1992 0364 259 267 22 1956 1992 0364
23
bday-query1 today := 260; if bday ¸ today Æ bday < (today + 7) then out := 1 else out := 0 | = (out = 0) 1956 1992 0364 259 267 Potentially Pr[bday] = 1/358 < 0.2 Pr[bday,byear] = 1/(358*37) < 0.05 = 23
24
bday-query2 today := 261; if bday ¸ today Æ bday < (today + 7) then out := 1 else out := 0 Next day … 1956 1992 0364 259 267 | = (out = 1) 24 1956 1992 0 267 So reject? Pr[bday] = 1 P
25
1956 1992 0 267 if bday != 267if bday = 267 1956 1992 0364 259 268 Querier’s perspective will get answer will get reject Assume querier knows policy 25
26
Rejection problem | =reject Policy: Pr[bday = 267 | out = o] < t Rejection, intended to protect secret, reveals secret. 26
27
Rejection revised Policy: Pr[bday = 267 | out = o] < t Solution? Decide policy independently of secret Revised policy for every possible output o, for every possible bday b, Pr[bday = b | out = o] < t So the real bday in particular 27
28
bday-query1 today := 260; if bday ¸ today Æ bday < (today + 7) then out := 1 else out := 0 accept 28 initial belief
29
bday-query2 today := 261; if bday ¸ today Æ bday < (today + 7) then out := 1 else out := 0 reject (regardless of what bday actually is) 29 (after bday-query1)
30
bday-query3 today := 266; if bday ¸ today Æ bday < (today + 7) then out := 1 else out := 0 30 (after bday-query1) accept
31
Problem 2: Slowness We can use (general purpose) probabilistic languages to model attacker knowledge and how it changes due to query results. prob-scheme, IBAL, etc. (standard) enumeration-based probabilistic interpretation and revision is too slow. Let us look at why. 31
32
1956 1992 0364 bday-query1 today := 260; if bday ¸ today Æ bday < (today + 7) then out := 1 else out := 0 32 Enumeration 1956 1992 0364 259 267 Æ (out = 0) * * ± | (out = 0) = normalize(± Æ (out = 0)) input state output state
33
33 Enumeration A lot of states = slow Can (over estimate) probabilities after partial enumeration. Assume unseen mass is in worst possible spot
34
34 Problem 2: Slowness Let us do away with enumeration. 15:00
35
Representation P 1 : 0 · bday · 259, 1956 · byear · 1992, out = 0 p = 0.000074 P 2 : 267 · bday · 364, 1956 · byear · 1992, out = 1 p = 0.000074 … 1956 1992 0364 P1P1 P 1 : 0 · bday · 364, 1956 · byear · 1992 p = 0.000074 35 1956 1992 0364 259 267 P1P1 P2P2 P3P3 P
36
36 Region Enumeration of states ! Manipulation of regions of states Problems too many regions non-uniform regions
37
nasty-query1 … disjuncts … … more disjuncts … … disjuncts EVERYWHERE … 1991 1956 0364 259 267 1961 1971 1981 1992 37 Too many regions
38
Approximation P 1 : 0 · bday · 259, 1956 · byear · 1992 p = 0.000067 s · 8580 ( size of P 1 ) P 2 : 267 · bday · 364, 1956 · byear · 1992 p = 0.000067 s · 3234 ( size of P 2 ) 1956 0364 259 267 1961 1971 1981 1992 P1P1 P2P2 P 1 : 0 · bday · 259, 1992 · byear · 1992 p = 0.000067 P 2 : 0 · bday · 259, 1982 · byear · 1990 p = 0.000067 … P 10 p,s refer to possible (non-zero probability) points in region 38 1956 0364 259 267 1961 1971 1981 1992 1991 abstract P1P1 P2P2
39
Abstraction imprecision 1956 0364 259 267 1961 1971 1981 1992 1991 abstract P1P1 P2P2 1956 0364 259 267 1961 1971 1981 1992 39 1956 0364 259 267 1961 1971 1981 1992 1956 0364 259 267 1961 1971 1981 1992
40
nasty-query-2 … disjuncts … … more disjuncts … … probabilistic choice … 40 1956 0364 259 267 1961 1971 1981 1992 P1P1 P2P2 Non-uniform regions
41
Approximation 1956 0364 259 267 1961 1971 1981 1992 1991 approximate P1P1 P2P2 P 1 : 0 · bday · 259, 1956 · byear · 1992 p · 0.000074 s · 9620 P 2 : 267 · bday · 364, 1956 · byear · 1992 p · 0.000074 s · 3626 1956 0364 259 267 1961 1971 1981 1992 P1P1 P2P2 P 1 : 0 · bday · 259, 1992 · byear · 1992 p = 0.0000074 P 2 : 0 · bday · 259, 1991 · byear · 1991 p = 0.000074 … P 18 for policy check 8 … 8 … Pr[bday = b | out = o] < t 41 P
42
Abstraction 1956 0364 259 267 1961 1971 1981 1992 1991 abstract P1P1 P2P2 For each P i, store region (polyhedron) upper bound on probability of each possible point upper bound on the number of (possible) points Also store lower bounds on the above Pr[A | B] = Pr[A Æ B] / Pr[B] Probabilistic Polyhedron 42
43
Abstract Conditioning 43 1956 0364 259 267 1961 1971 1981 1992 1956 0364 259 267 1961 1971 1981 1992
44
Abstract Conditioning 1956 0364 259 267 1961 1971 1981 1992 1991 approximate P1P1 P2P2 44 1956 0364 259 267 1961 1971 1981 1992 1956 0364 259 267 1961 1971 1981 1992 1956 0364 259 267 1961 1971 1981 1992
45
Approximation benefit = 0 · bday · 364 1956 · byear · 1992 Bob (born September 24, 1980) bday = 267 byear = 1980 Secret bday byear 1956 1992 0364 45 Assumption: this is accurate each equally likely 20:00
46
Initial distribution bday byear 1956 1992 0364 46 P1P1 1/(37*365) - ² · probability · 1/(37*365) + ² 37*365 · # of points · 37*365 Need attacker’s actual belief to be represented by this P 1 Much easier task than knowing the exact querier belief.
47
Abstraction Summary Approximate representation of a set of probability distributions. Abstract operations for Clarkson’s probabilistic semantics. Sound in terms of policy, even after belief revision. Restricted arithmetic expressions (linear only) No state enumeration and a way to restrict number of regions (more) computationally feasible Implementation Uses Parma PPL for polyhedra operations Uses Latte to determine size (# of integer points) inside polyhedra 47
48
Results Statespace size resistance See technical report for more queries precision/performance tradeoff issues 48
49
Statespace size resistance = 0 · bday · 364 1956 · byear · 1992 bday byear 1956 1992 0364 49 each equally likely
50
bday-query1 today := 260; if bday ¸ today Æ bday < (today + 7) then out := 1 else out := 0 1.prob-poly-set (our implementation) 2.prob-scheme (sampling/enumeration) provides sound estimation after partial enumeration measure time and probability bound of most probable state (pair bday,byear) 50 = 0 · bday · 364 1956 · byear · 1992 bday byear 1956 1992 0364 each equally likely
51
Statespace size resistance = 0 · bday · 364 1956 · byear · 1992 = 0 · bday · 364 1910 · byear · 2010 1 pp > 1 pp 51 each equally likely 24:00
52
Conclusions Knowledge modeling Probabilistic computation and revision of knowledge due to query outputs. Our results Suitable policy definition. Approximate (but sound in terms of policy) probabilistic interpretation. Experimental results. State size resistance. Drawbacks Restricted language, integer linear expressions Still too slow for practical use The future performance / accuracy simpler domains (octagons) can replace polyhedra (and Latte counting tool) use of Latte accounts for over 95% of most tests we ran various other technicalities less restricted language and/or state space 52
53
Thank you! Go back. 53
54
Creating dependency Dependencies can be created. 54 1956 1992 0364 P1P1 1956 1992 0364 P1P1 strange-query if bday – 1956 ¸ byear - 5 Æ bday – 1956 · byear + 5 then out := 1 else out := 0 * not to scale and/or shape
55
Differential privacy Encourage participation in a database by protecting inference of a record. Add noise to query result. Requires trusted curator. Suitable for aggregate queries where exact results are not required. Not suitable to a single user protecting him(her)self. Using differential privacy on a query with 2 output values results in the incorrect query output almost half the time (for reasonable ²) 55
56
Composability / Collusion If collusion is suspected among queries, one can make them share belief query outputs are assumed to have been learned by all of them assumes deterministic queries only Confusion problem: non-deterministic queries can result in a revised belief which is less sure of the true (or some) secret value than the initial belief If collusion is suspected but not present, this can result in incorrect belief Abstract solution: abstract union of the revised belief (colluded) and the prebelief (not colluded) P 1 [ P 2 if ± i 2 °(P i ) for i = 1,2 then ± 1, ± 2 2 °(P 1 [ P 2 ) 56
57
Setting, Known Secret Measures of badness that weigh their values by the distribution of the secret (or the distribution of the outputs of a query) can be problematic. An unlikely user could theorize a query is safe, but know it is not safe for them Consider conditional min-entropy (see Information-theoretic Bounds for Differentially Private Mechanisms) H 1 (X | Y) = -log 2 y P Y (y) max x P X|Y (x,y) > - log 2 t probability of certain query output Our policy: -log 2 max y max x P X|Y (x,y) > - log 2 t so that -log 2 max x P X|Y (x,yreal) > - log 2 t 57
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.