DYNAMIC ENFORCEMENT OF KNOWLEDGE-BASED SECURITY POLICIES Michael Hicks University of Maryland, College Park Joint work with Piotr Mardziel, Stephen Magill,

Presentation transcript:

DYNAMIC ENFORCEMENT OF KNOWLEDGE-BASED SECURITY POLICIES Michael Hicks University of Maryland, College Park Joint work with Piotr Mardziel, Stephen Magill, and Mudhakar Srivatsa (IBM)

My research
Programming languages-based techniques for more reliable, available, and secure software
Recent projects:
- Dynamic software updating (for C and Java)
- Blending PL and crypto
- Adapton: demand-driven incremental computation
- Druby: static types for dynamic languages

Why am I here?
enthusiast.net/2014/09/08/probabilistic-programming/ (21,858 pageviews, as of yesterday)

Today’s talk
Using probabilistic programming to enforce knowledge-based security policies
1. The setting, the problem, and our solution
2. A technique for probabilistic inference that is sound with respect to security, based on abstract interpretation

“Free” sites
Good: useful service, no direct charge
Bad: free use paid for by use of personal information
How to balance the two?

Status quo: they have your data
[Diagram labels: you, your data, advertisers, query/filter, page + ads, Photography]

Take back control
[Diagram labels: you, your data, advertisers, query/filter, query responses, page + query/filter, Photography]

Simpler model (for exposition)
[Diagram labels: querier, you, query, response, Photography]

Useful and non-revealing
Query (a real query used by a Facebook advertiser): out = 24 ≤ Age ≤ 30 & Female? & Engaged?
Response: true

Reveals too much!
Query: out = (age, zip-code, birth-date)
Response: reject
Note: age, zip-code, and birth-date can be used to uniquely identify 63% of Americans

The problem: how to decide to answer based on the information revealed?
Candidate queries (labeled Q1, Q2, Q3 on the slide):
out := (age, zip-code, birth-date)
out := age
out := 24 ≤ age ≤ 30 & female? & engaged?

Approach
- Maintain a representation of the querier’s belief about the secret’s possible values
- Each query result revises the belief; reject if the actual secret becomes too likely
- Cannot let rejection defeat our protection
Belief ≜ probability distribution; Bayesian reasoning to revise belief
[Timeline: queries Q1, Q2, Q3, … arrive over time; each is either answered (OK) or rejected]

Contributions
1. Knowledge-based security policies
- Dynamic enforcement (CSF’11, JCS’13): a threshold to decide when an answer reveals too much; query rejection does not reveal information
- Risk prediction (IEEE S&P’14, FCS’14): model a series of interactions between an adaptive adversary and a defender able to change the secret; what is the risk of certain change strategies?
All work collected in Piotr Mardziel’s dissertation

Contributions
2. Implementing checking and belief revision efficiently
- Treat queries as probabilistic programs, where inputs are distributions that represent beliefs
- Perform probabilistic inference via abstract interpretation, using a novel domain of probabilistic polyhedra
- The domain associates polyhedra with overlapping descriptions of probability mass, for improved precision
- We track upper and lower bounds for sound conditioning and normalization
All work collected in Piotr Mardziel’s dissertation

Meet Bob
Bob (born September 24, 1980). Secret: bday = 267, byear = 1980
Querier’s belief: 0 ≤ bday ≤ 364, 1956 ≤ byear ≤ 1992, each equally likely
Assumption: this is accurate

bday-query1
today := 260;
if bday ≥ today && bday < (today + 7) then out := 1 else out := 0
Bob’s bday is 267, so out = 0; the querier’s revised belief is the initial belief conditioned on (out = 0)
Problem (policy): is this acceptable?

Idea: policy as knowledge threshold
Answer a query if, for the querier’s revised belief, Pr[my secret] < t
Call t the knowledge threshold
The choice of t depends on the risk of revelation

Bob’s policy
Bob (born September 24, 1980). Secret: bday = 267, byear = 1980
Belief: 0 ≤ bday ≤ 364, 1956 ≤ byear ≤ 1992, each equally likely
Policy: Pr[bday] < 0.2 and Pr[bday,byear] < 0.05
Currently: Pr[bday] = 1/365 and Pr[bday,byear] = 1/(365*37)
Here Pr[bday] is shorthand for Pr[bday = 267], the probability the querier assigns to the true value

Bob’s policies
Policy: Pr[bday] < 0.2
bday specifically should never be certainly known

Bob’s policies
Policy: Pr[bday,byear] < 0.05
- The pair (bday, byear) should never be certainly known
- Doesn’t commit to protecting bday or byear alone
- Allows queries that need high precision in one, or low precision in both

bday-query1
today := 260;
if bday ≥ today && bday < (today + 7) then out := 1 else out := 0
Revised belief: the initial belief conditioned on (out = 0)
Potentially: Pr[bday] = 1/358 < 0.2 and Pr[bday,byear] = 1/(358*37) < 0.05
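
The revision step for bday-query1 can be reproduced by brute-force enumeration. The sketch below is an illustration, not the authors' implementation: it reads the query's test as "bday falls in the week starting today" (today ≤ bday < today + 7), which is the reading consistent with the 1/358 figure on the slide, and the helper names (`condition`, `pr_bday`) are hypothetical.

```python
from fractions import Fraction

# Initial belief: every (bday, byear) pair is equally likely.
BDAYS = range(0, 365)
BYEARS = range(1956, 1993)
prior = {(d, y): Fraction(1, 365 * 37) for d in BDAYS for y in BYEARS}


def bday_query(today, bday):
    """out = 1 if the birthday falls in the week starting today (assumed reading)."""
    return 1 if today <= bday < today + 7 else 0


def condition(belief, today, observed):
    """Keep states consistent with the observed output, then renormalize."""
    kept = {s: p for s, p in belief.items() if bday_query(today, s[0]) == observed}
    total = sum(kept.values())
    return {s: p / total for s, p in kept.items()}


def pr_bday(belief, bday):
    """Marginal probability the belief assigns to one bday value."""
    return sum(p for (d, _), p in belief.items() if d == bday)


posterior = condition(prior, today=260, observed=0)
print(pr_bday(posterior, 267))   # 1/358
print(posterior[(267, 1980)])    # 1/13246, i.e. 1/(358*37)
```

Exact fractions are used so the results can be compared directly with the thresholds 0.2 and 0.05.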

bday-query2 (next day …)
today := 261;
if bday ≥ today && bday < (today + 7) then out := 1 else out := 0
Revised belief: the belief after bday-query1 conditioned on (out = 1)
Pr[bday] = 1. So reject?

Querier’s perspective (assume the querier knows the policy)
- if bday ≠ 267: will get an answer
- if bday = 267: will get reject

Rejection problem
Policy: Pr[bday = 267 | out = o] < t
Revised belief: the belief conditioned on (reject)
Rejection, intended to protect the secret, reveals the secret!

Rejection revised
Policy: Pr[bday = 267 | out = o] < t
Solution? Decide policy independently of secret
Revised procedure: for every possible output o and every possible bday b, require Pr[bday = b | out = o] < t
So the real bday in particular

bday-query1 (initial belief)
today := 260;
if bday ≥ today && bday < (today + 7) then out := 1 else out := 0
Decision: accept

bday-query2 (after bday-query1)
today := 261;
if bday ≥ today && bday < (today + 7) then out := 1 else out := 0
Decision: reject (regardless of what bday actually is)

bday-query3 (after bday-query1)
today := 266;
if bday ≥ today && bday < (today + 7) then out := 1 else out := 0
Decision: accept
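
The accept / reject / accept sequence can be replayed with a small enumeration. This is a sketch under simplifying assumptions: it tracks only the marginal belief over bday, enforces only the Pr[bday] < 0.2 policy, and reads the query's test as today ≤ bday < today + 7. The function names (`safe`, `ask`) are hypothetical. The decision looks at every possible output and every possible bday, never at the actual secret, so a rejection teaches the querier nothing and leaves the belief unchanged.

```python
from fractions import Fraction

T_BDAY = Fraction(1, 5)   # Bob's threshold: Pr[bday] < 0.2


def bday_query(today, bday):
    """out = 1 if the birthday falls in the week starting today (assumed reading)."""
    return 1 if today <= bday < today + 7 else 0


def posteriors(belief, today):
    """Map each possible output o to the normalized belief given out = o."""
    result = {}
    for out in (0, 1):
        kept = {b: p for b, p in belief.items() if bday_query(today, b) == out}
        total = sum(kept.values())
        if total > 0:
            result[out] = {b: p / total for b, p in kept.items()}
    return result


def safe(belief, today, t=T_BDAY):
    """Accept only if no output could push any bday to probability >= t."""
    return all(max(post.values()) < t for post in posteriors(belief, today).values())


def ask(belief, today, secret_bday):
    """Return (decision, new belief); a rejection leaves the belief unchanged."""
    if not safe(belief, today):
        return "reject", belief
    out = bday_query(today, secret_bday)
    return "accept", posteriors(belief, today)[out]


belief = {b: Fraction(1, 365) for b in range(365)}   # marginal belief over bday
log = []
for today in (260, 261, 266):
    decision, belief = ask(belief, today, secret_bday=267)
    log.append(decision)
print(log)   # ['accept', 'reject', 'accept']
```

After the third query the belief is spread over six days (267 through 272), so Pr[bday] = 1/6 stays under the threshold.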

Implementing belief revision
Idea: use probabilistic programming!
- The query is a probabilistic program
- Inputs are distributions representing adversary beliefs
- Observations are actual/possible outputs
- The posterior is conditioned on these

Implementing belief revision
BUT: enumeration/sampling-based probabilistic interpretation is problematic
- In the limit, time taken is proportional to the running time of the query and the size of the state space
- Approximations used may not be sound (may underestimate risk)
Can address the latter problem using abstract interpretation: a sound approximation of the enumeration-based semantics

Three slides on abstract interpretation
Given a programming language e ::= x | n | e * e
The concrete interpretation is its actual semantics
- Concrete domain is natural numbers
- Concrete interpretation Eval(e) is multiplication: 1 * (2 * 3) = 6, etc.
An abstract interpretation approximates the semantics
- Abstract domain may be intervals of natural numbers [n1, n2]
- Abstract interpretation Aeval(e): [n1, n2] * [n3, n4] = [n1*n3, n2*n4]
- [1,4] * ([2,2] * [3,6]) = [1,4] * [6,12] = [6,48]
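
The toy language and both interpretations fit in a few lines. The following is a minimal sketch with hypothetical class and function names (`Var`, `Num`, `Mul`, `eval_concrete`, `eval_abstract`); intervals are represented as (lo, hi) pairs of naturals.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Var, Num, Mul]


def eval_concrete(e, env):
    """Eval(e): ordinary multiplication over natural numbers."""
    if isinstance(e, Var):
        return env[e.name]
    if isinstance(e, Num):
        return e.value
    return eval_concrete(e.left, env) * eval_concrete(e.right, env)


def eval_abstract(e, env):
    """Aeval(e): intervals (lo, hi) of naturals, multiplied endpoint-wise."""
    if isinstance(e, Var):
        return env[e.name]
    if isinstance(e, Num):
        return (e.value, e.value)
    (a, b), (c, d) = eval_abstract(e.left, env), eval_abstract(e.right, env)
    return (a * c, b * d)   # valid because all bounds are non-negative


e = Mul(Var("x"), Mul(Var("y"), Var("z")))
print(eval_concrete(e, {"x": 1, "y": 2, "z": 3}))                 # 6
print(eval_abstract(e, {"x": (1, 4), "y": (2, 2), "z": (3, 6)}))  # (6, 48)
```

The endpoint-wise rule relies on the domain being natural numbers; with negative values all four endpoint products would have to be compared.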

Relating abstract and concrete
Concretization function γ relates the abstract domain A to the concrete one C
- γ : A → C
- γ([n1, n2]) = { n | n1 ≤ n ≤ n2 } (here, C is the power set of the naturals)
Abstraction function α relates the concrete domain C to the abstract one A
- α : C → A
- α(S) = [n1, n2] where n ∈ S implies n1 ≤ n and n ≤ n2
Galois insertion: S ⊆ γ(α(S)) and α(γ(A)) = A

Soundness
Aeval(e) (where x appears in e) is sound iff
- Aeval(e) [x=A] = A’ and, for all n ∈ γ(A), Eval(e) [x=n] = n’ implies n’ ∈ γ(A’)
- (With the obvious generalization to multiple variables)
Suppose e is x * 5
- Abstract interpretation, assuming x=[1,3]: Aeval(e) [x=[1,3]] = [5,15]
- Concrete interpretation, assuming x=2 ∈ γ([1,3]): Eval(e) [x=2] = 10
- Notice that 10 ∈ γ([5,15])
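
For a finite interval the soundness condition can be checked by enumeration. The sketch below hard-codes e = x * 5 from the slide; the function names are hypothetical, and passing the check on one interval is a sanity test rather than a proof of soundness.

```python
def gamma(interval):
    """Concretization: the set of naturals an interval stands for."""
    lo, hi = interval
    return set(range(lo, hi + 1))


def eval_e(x):
    """Concrete semantics of e = x * 5."""
    return x * 5


def aeval_e(interval):
    """Abstract semantics of e = x * 5 on an interval of naturals."""
    lo, hi = interval
    return (lo * 5, hi * 5)


def sound_on(interval):
    """Check that n in gamma(A) implies Eval(e)[x=n] in gamma(Aeval(e)[x=A])."""
    results = gamma(aeval_e(interval))
    return all(eval_e(n) in results for n in gamma(interval))


print(aeval_e((1, 3)))   # (5, 15)
print(sound_on((1, 3)))  # True
```

Note the over-approximation: γ([5,15]) contains eleven numbers, while only 5, 10, and 15 are reachable from x ∈ {1, 2, 3}.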

Back to our setting of probabilistic inference
Concrete domain: probability distributions
- Distributions δ : States → Real numbers
- A state σ is a map from variables to integers
- The semantics of a statement S under distribution δ is written [[S]]δ
Informally, the concrete interpretation is:
- For each state σ in the support of δ, let p be its probability
- Execute S in that state, producing σ’ with probability mass p
- Add to it the probability mass of all other input states that produce σ’
Note that distributions’ probabilities may not sum to 1: frequency distributions are more efficient and simplify development

bday-query1: implementation by enumeration
today := 260;
if bday ≥ today && bday < (today + 7) then out := 1 else out := 0
[Diagram: each input state is mapped to an output state; the resulting distribution is then conditioned on (out = 0)]
Revised belief = normalize(δ | (out = 0))

Probabilistic Interpretation
Semantics (cf. Kozen)
- [[skip]]δ = δ
- [[S1; S2]]δ = [[S2]]([[S1]]δ)
- [[if B then S1 else S2]]δ = [[S1]](δ | B) + [[S2]](δ | ¬B)
- [[pif p then S1 else S2]]δ = [[S1]](p*δ) + [[S2]]((1-p)*δ)
- [[x := E]]δ = δ[x → E]
- [[while B do S]] = lfp (λF. λδ. F([[S]](δ | B)) + (δ | ¬B))
Operations
- p*δ: scale probabilities by p
- δ | B: remove mass inconsistent with B
- δ1 + δ2: combine mass from both
- δ[x → E]: transform mass
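
The four operations and the rules for if and pif translate almost directly into code over finite distributions. The sketch below is an illustration with hypothetical names (`freeze`, `scale`, `cond`, `add`, `assign`, `if_`, `pif`): a distribution is a dictionary from states to mass, statements are functions from distributions to distributions, and the while rule is left out.

```python
from collections import defaultdict
from fractions import Fraction


def freeze(state):
    """Make a state (dict of variable -> int) usable as a dictionary key."""
    return tuple(sorted(state.items()))


def scale(p, dist):
    """p*δ: scale probabilities by p."""
    return {s: p * m for s, m in dist.items()}


def cond(dist, pred):
    """δ | B: remove mass inconsistent with B."""
    return {s: m for s, m in dist.items() if pred(dict(s))}


def add(d1, d2):
    """δ1 + δ2: combine mass from both."""
    out = defaultdict(Fraction)
    for d in (d1, d2):
        for s, m in d.items():
            out[s] += m
    return dict(out)


def assign(dist, var, expr):
    """δ[x → E]: move each state's mass to the updated state."""
    out = defaultdict(Fraction)
    for s, m in dist.items():
        state = dict(s)
        state[var] = expr(state)
        out[freeze(state)] += m
    return dict(out)


def if_(pred, then, other):
    """[[if B then S1 else S2]]δ = [[S1]](δ | B) + [[S2]](δ | ¬B)."""
    return lambda d: add(then(cond(d, pred)), other(cond(d, lambda s: not pred(s))))


def pif(p, then, other):
    """[[pif p then S1 else S2]]δ = [[S1]](p*δ) + [[S2]]((1-p)*δ)."""
    return lambda d: add(then(scale(p, d)), other(scale(1 - p, d)))


# Example: if x ≤ 5 then y := y + 3 else y := y - 3, with x uniform on 4..7 and y = 0.
delta = {freeze({"x": x, "y": 0}): Fraction(1, 4) for x in range(4, 8)}
stmt = if_(lambda s: s["x"] <= 5,
           lambda d: assign(d, "y", lambda s: s["y"] + 3),
           lambda d: assign(d, "y", lambda s: s["y"] - 3))
result = stmt(delta)
print(result[freeze({"x": 4, "y": 3})], result[freeze({"x": 7, "y": -3})])  # 1/4 1/4
```

Because `assign` sums the mass of all input states that land on the same output state, it matches the informal "add to it the probability mass of all other input states" step of the concrete semantics.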

Subdistribution operations
δ | B: remove mass inconsistent with B
- δ | B = λσ. if [[B]]σ = true then δ(σ) else 0
- [Figure: δ, the condition B = x ≥ y, and δ | B]
δ1 + δ2: combine mass from both
- δ1 + δ2 = λσ. δ1(σ) + δ2(σ)
- [Figure: δ1, δ2, and δ1 + δ2]
Example: S = if x ≤ 5 then y := y + 3 else y := y - 3
- [[S]]δ = [[y := y + 3]](δ | x ≤ 5) + [[y := y - 3]](δ | x > 5)
- [Figure: δ is split into δ | x ≤ 5 and δ | x > 5, each part is transformed, and the results are summed]

Abstract domain: probabilistic polyhedra
One region, each point with probability p:
- P1: 0 ≤ bday ≤ 364, 1956 ≤ byear ≤ 1992
Several regions (P1, P2, P3, …), each with its own per-point probability p:
- P1: 0 ≤ bday ≤ 259, 1956 ≤ byear ≤ 1992, out = 0
- P2: 267 ≤ bday ≤ 364, 1956 ≤ byear ≤ 1992, out = 1
- …
Key elements of a probabilistic polyhedron P:
- Constraints Ci describing polygons
- Probability p per point within the polygon

Performance / precision tradeoff
(Sets of) polyhedra are a standard abstract domain, so we can reuse (parts of) existing tools to operate on them directly
- Can end up with many small regions
- May be expensive to manipulate
Approach
- Collapse regions: specify the “boundary” of a region and bounds on the number and probability of the points within it
- Nonuniform regions hurt precision, but fewer regions improve performance

nasty-query1
Let the prior belief be the belief after bday-query1
age := 2011 - byear;
if age = 20 || … || age = 60 then out := true else out := false;
Revised belief: the prior conditioned on (out = false)
Problem: the many disjuncts produce too many regions

Approximation
Exact (ten regions in total), each with a per-point probability p:
- P1: 0 ≤ bday ≤ 259, 1992 ≤ byear ≤ 1992
- P2: 0 ≤ bday ≤ 259, 1982 ≤ byear ≤ 1990
- … through P10
Approximate (two regions), each with a per-point probability p and a point count s:
- P1: 0 ≤ bday ≤ 259, 1956 ≤ byear ≤ 1992, s = 8580 (true size = 9620)
- P2: 267 ≤ bday ≤ 364, 1956 ≤ byear ≤ 1992, s = 3234 (true size = 3626)
p and s refer to possible (non-zero probability) points in the region
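
The region counts and sizes on this slide can be recomputed directly. The sketch below assumes that the disjuncts of nasty-query1 are the ages 20, 30, 40, 50, and 60 (the slide elides the middle ones) and that the prior is the belief after bday-query1 with out = 0; the helper names are hypothetical.

```python
# Birth years ruled out by out = false: ages 20, 30, ..., 60 in 2011 (assumed disjuncts).
BYEARS = range(1956, 1993)
decade_years = {2011 - age for age in range(20, 61, 10)}
possible_years = [y for y in BYEARS if y not in decade_years]


def year_runs(years):
    """Group a sorted list of years into maximal consecutive runs (lo, hi)."""
    runs = []
    for y in years:
        if runs and y == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], y)
        else:
            runs.append((y, y))
    return runs


def box_size(bday_range, year_range):
    """Number of integer points in a two-dimensional interval (box)."""
    (d_lo, d_hi), (y_lo, y_hi) = bday_range, year_range
    return (d_hi - d_lo + 1) * (y_hi - y_lo + 1)


bday_ranges = [(0, 259), (267, 364)]   # belief after bday-query1
runs = year_runs(possible_years)
exact_regions = [(b, r) for b in bday_ranges for r in runs]
print(len(exact_regions))              # 10

for b in bday_ranges:
    s = sum(box_size(b, r) for r in runs)   # possible points
    true_size = box_size(b, (1956, 1992))   # all points inside the boundary
    print(b, s, true_size)
# (0, 259) 8580 9620
# (267, 364) 3234 3626
```

Collapsing the ten exact boxes into two keeps the boundary simple while the count s records how many points inside it can actually occur.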

Abstraction imprecision
[Figure: the abstract regions P1 and P2 compared with the exact belief]

nasty-query-2
age := 2011 - byear;
if age = 20 || … || age = 60 then out := true else out := false;
pif 0.1 then out := true …
Revised belief: the prior conditioned on (out = true)
- The probabilistic choice (pif) takes the true branch one time out of ten
- Thus, out = true implies that age is a decade or that the coin flipped in our favor; the former is more likely
- Result: non-uniform regions

Approximation
Exact (18 regions in total), each with a per-point probability p:
- P1: 0 ≤ bday ≤ 259, 1992 ≤ byear ≤ 1992
- P2: 0 ≤ bday ≤ 259, 1991 ≤ byear ≤ 1991
- … through P18
Approximate (two regions):
- P1: 0 ≤ bday ≤ 259, 1956 ≤ byear ≤ 1992, p ≤ … (an upper bound rather than an exact value), s = 9620
- P2: 267 ≤ bday ≤ 364, 1956 ≤ byear ≤ 1992, p ≤ …, s = …
The upper bound on p is what the policy check Pr[bday = b | out = o] < t needs

Final abstraction: probabilistic polyhedron
For each Pi, store
- the region (polyhedron)
- p: upper bound on the probability of each possible point
- s: upper bound on the number of (possible) points
- m: upper bound on the total probability mass (useful for normalization, since Pr[A | B] = Pr[A ∩ B] / Pr[B])
Also store lower bounds on the above
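
One way to see why both upper and lower bounds are stored: the policy check needs an upper bound on a normalized probability, which divides an upper bound on a point by a lower bound on the mass. The sketch below covers a single region only and is an assumption-laden illustration, not the authors' algorithm; the class `Bounds` and both function names are hypothetical, and the mass lower bound is assumed to be positive.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Bounds:
    """Numeric part of one probabilistic region (the polyhedron itself is omitted)."""
    p_min: Fraction  # lower bound on the probability of each possible point
    p_max: Fraction  # upper bound on the probability of each possible point
    s_min: int       # lower bound on the number of possible points
    s_max: int       # upper bound on the number of possible points
    m_min: Fraction  # lower bound on the total probability mass
    m_max: Fraction  # upper bound on the total probability mass


def max_normalized_point_probability(b):
    """Upper bound on δ(σ) / mass(δ) for any δ described by b.

    The numerator is the upper bound on a point; the denominator is the
    best available lower bound on the mass, so the quotient can only
    overestimate what the querier knows.
    """
    mass_lower = max(b.m_min, b.p_min * b.s_min)
    return b.p_max / mass_lower


def policy_ok(b, t):
    """Knowledge-threshold check on the bound rather than on the exact belief."""
    return max_normalized_point_probability(b) < t


# A uniform region of 9620 points, each with unnormalized probability 1/(365*37).
p = Fraction(1, 365 * 37)
region = Bounds(p, p, 9620, 9620, 9620 * p, 9620 * p)
print(max_normalized_point_probability(region))   # 1/9620
print(policy_ok(region, Fraction(1, 20)))          # True
```

If the check passes on the bound, it also passes for every concrete distribution the region describes, which is the sense in which the approximation is sound for security.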

Relating concrete and abstract
A probabilistic polyhedron P consists of a region C and bounds pmin, pmax, smin, smax, mmin, mmax
δ ∈ γ(P) iff
- support(δ) ⊆ γ(C)
- pmin ≤ δ(σ) ≤ pmax for every σ ∈ support(δ)
- smin ≤ |support(δ)| ≤ smax
- mmin ≤ mass(δ) ≤ mmax
where support(δ) = {σ | δ(σ) > 0}
For sets: δ ∈ γ({P1, P2}) iff δ = δ1 + δ2 with δ1 ∈ γ(P1) and δ2 ∈ γ(P2), and similarly for more than two
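
The four membership conditions can be tested directly on a concrete distribution. The sketch below restricts the region C to a box over (bday, byear) and stores the bounds in a plain dictionary; the function name `in_gamma` and that representation are assumptions made for illustration. The example region uses the counts from the nasty-query1 approximation, taking the excluded birth years to be 1961, 1971, 1981, and 1991.

```python
from fractions import Fraction


def in_gamma(dist, region, bounds):
    """Check δ ∈ γ(P) for a box-shaped region.

    dist   : dict mapping (bday, byear) states to probability mass
    region : ((bday_lo, bday_hi), (byear_lo, byear_hi)), the constraint C
    bounds : dict with keys p_min, p_max, s_min, s_max, m_min, m_max
    """
    (d_lo, d_hi), (y_lo, y_hi) = region
    support = [s for s, m in dist.items() if m > 0]
    mass = sum(dist[s] for s in support)
    return (
        all(d_lo <= d <= d_hi and y_lo <= y <= y_hi for d, y in support)
        and all(bounds["p_min"] <= dist[s] <= bounds["p_max"] for s in support)
        and bounds["s_min"] <= len(support) <= bounds["s_max"]
        and bounds["m_min"] <= mass <= bounds["m_max"]
    )


p = Fraction(1, 365 * 37)
decade_years = {1961, 1971, 1981, 1991}
delta = {(d, y): p for d in range(260) for y in range(1956, 1993) if y not in decade_years}
P1 = dict(p_min=p, p_max=p, s_min=8580, s_max=8580, m_min=8580 * p, m_max=8580 * p)
print(in_gamma(delta, ((0, 259), (1956, 1992)), P1))   # True
```

The distribution has holes at the excluded years, yet it still belongs to γ(P1): the region only bounds where the support may lie, and s records how many points are actually possible.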

Abstract Semantics
Define ⟨⟨S⟩⟩P
Soundness: if δ ∈ γ(P) then [[S]]δ ∈ γ(⟨⟨S⟩⟩P)
Need abstract operations
- P1 + P2: if δi ∈ γ(Pi) for i = 1,2 then δ1 + δ2 ∈ γ(P1 + P2)
- P | B: if δ ∈ γ(P) then δ | B ∈ γ(P | B)
- p*P: if δ ∈ γ(P) then p * δ ∈ γ(p*P)
- … (and likewise for sets of P)

Abstract operation example
δ1 + δ2: combine mass from both; δ1 + δ2 = λσ. δ1(σ) + δ2(σ)
P3 = P1 + P2: what is the maximum number of possible points in the sum?
- Given: s1max = 100, s2max = 30, |C1| = 100, |C2| = 40, and the size of C1 ⋂ C2
- Naive bound: s3max ≤ s1max + s2max = 130
- Refinement: determine the minimum overlap (20)
- Result: s3max = 120
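
The counting argument behind s3max can be sketched as follows. This is a reconstruction from the reasoning on the slide, not the authors' formula, and the function names are hypothetical. The size of C1 ⋂ C2 is not recoverable from the transcript, so the example assumes 20 points purely for illustration; with that assumption the bound comes out to the slide's 120.

```python
def min_overlap(s1, s2, size_c1, size_c2, size_both):
    """Fewest points two supports must share.

    s1, s2       : support sizes of δ1 and δ2
    size_c1, _c2 : number of integer points in C1 and in C2
    size_both    : number of integer points in the intersection of C1 and C2
    """
    # Points that cannot fit outside the intersection are forced into it.
    forced1 = max(0, s1 - (size_c1 - size_both))
    forced2 = max(0, s2 - (size_c2 - size_both))
    # Forced points collide only once they exceed the room in the intersection.
    return max(0, forced1 + forced2 - size_both)


def s3_max(s1_max, s2_max, size_c1, size_c2, size_both):
    """Upper bound on the number of possible points in δ1 + δ2."""
    return s1_max + s2_max - min_overlap(s1_max, s2_max, size_c1, size_c2, size_both)


# Intersection size 20 is an assumed value, not taken from the slide.
print(s3_max(100, 30, 100, 40, 20))   # 120
```

When both supports fill their regions completely, the bound reduces to the size of the union of C1 and C2, as expected.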

Operations
P3 = P1 + P2
- C3: convex hull of C1, C2
- s3max: what is the smallest overlap?
- s3min: what is the largest overlap?
- p3max: is overlap possible?
- p3min: is overlap impossible?
- m3max: simple sum m1max + m2max
- m3min: simple sum m1min + m2min
Other operations are similar; complicated formulas abound
Needed primitives and the tools that supply them:
- Count the number of integer points in a convex polyhedron: LattE
- Maximize a linear function over integer points in a polyhedron: LattE
- Convex hull, intersection, affine transform: Parma Polyhedra Library (PPL)

Implementation and experiments
Interpreter written in Objective Caml for a simple imperative language
- Calls out to LattE and Parma, as mentioned
In addition to full-precision polyhedra, implemented probabilistic versions of two other domains
- Intervals: constraints have the form c1 ≤ x ≤ c2; counting can be implemented without LattE
- Octagons: constraints have the form ax + by ≤ c with a,b ∈ {-1,0,1}
Experiments on several queries
- Results are for a Mac Pro with two 2.26 GHz quad-core Xeon processors and 16 GB of RAM

Queries
- Birthday: query 1; queries 1+2; queries 1+2+special
- Pizza: user in a particular age range, in college, lives within a 2-mile square
- Photo: user in a particular age range, female, engaged
- Travel: lives in a particular country, speaks English, over 21, completed high school

Scales better than enumeration
[Figure: bday1 small (belief 0 ≤ bday ≤ …, … ≤ byear ≤ 1992) versus bday1 large (a larger range of bday and byear); each state equally likely]

Performance/precision tradeoff
[Figure: Birthday query 1+2+special]

Intervals very fast generally
[Table: times in seconds under the Intervals, Octagons, and Polyhedra domains for the queries Bday1 (small), Bday1+2 (small), Bday1+2+spec, Bday1 (large), Bday1+2 (large), Bday1+2+spec, Pizza, Photo, Travel]
All achieve maximum precision when given unlimited polyhedra

LattE is the bottleneck

Limitations: Prototype impoverished
- No real or rational numbers
- No probabilistic choice where the weight is unknown
- No widening/narrowing operator for better handling of loops
- Little support for structured data
These things should all be possible, starting from standard domains for abstract interpretation

Ongoing work, other applications
Knowledge-based security policies in a symmetric setting
- A and B are private inputs of Alice and Bob; Charlie computes f(A,B) and returns the result to both
- What can Alice and/or Bob learn about the other’s input?
- See our paper in PLAS’12
Location privacy
- Apply knowledge policies to locations as secrets
- Must consider the time-varying nature of the current location
- Ongoing work with Neil Toronto and David Van Horn
More …!

Summary of contributions
- Knowledge-based security policies
- Implementation of belief tracking using probabilistic abstract interpretation
- Performance results using intervals are promising

BACKUP

Creating dependency
Dependencies can be created
strange-query
if bday - 1956 ≥ byear - 5 ∧ bday - 1956 ≤ byear + 5 then out := 1 else out := 0
[Figure: region P1 before and after the query; not to scale and/or shape]

Differential privacy
- Encourages participation in a database by protecting inference of a record
- Adds noise to the query result
- Requires a trusted curator
- Suitable for aggregate queries where exact results are not required
- Not suitable for a single user protecting him(her)self
- Using differential privacy on a query with 2 output values results in the incorrect query output almost half the time (for reasonable ε)
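
The "almost half the time" remark can be made concrete with randomized response, a standard ε-differentially-private mechanism for a yes/no answer. The slide does not say which mechanism it has in mind, so that choice is an assumption. Randomized response reports the true answer with probability e^ε / (1 + e^ε) and the flipped answer otherwise.

```python
import math


def randomized_response_error(epsilon):
    """Probability that ε-differentially-private randomized response flips a yes/no answer."""
    return 1.0 / (1.0 + math.exp(epsilon))


for eps in (0.01, 0.1, 0.5, 1.0):
    # Print ε next to the chance of reporting the wrong output.
    print(eps, round(randomized_response_error(eps), 3))
```

For ε = 0.1 the wrong output is reported about 47.5% of the time, and even at ε = 1 the error rate is still above a quarter.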

Composability / Collusion
If collusion is suspected among queriers, one can make them share a belief
- Query outputs are assumed to have been learned by all of them
- Assumes deterministic queries only
Confusion problem: non-deterministic queries can result in a revised belief that is less sure of the true (or some) secret value than the initial belief
- If collusion is suspected but not present, this can result in an incorrect belief
- Abstract solution: abstract union of the revised belief (colluded) and the prebelief (not colluded)
- P1 ∪ P2: if δi ∈ γ(Pi) for i = 1,2 then δ1, δ2 ∈ γ(P1 ∪ P2)

Setting, Known Secret
Measures of badness that weigh their values by the distribution of the secret (or the distribution of the outputs of a query) can be problematic.
An unlikely user could theorize that a query is safe, yet know it is not safe for them.
Consider conditional min-entropy (see "Information-theoretic Bounds for Differentially Private Mechanisms"):
$H_\infty(X \mid Y) = -\log_2 \sum_y P_Y(y) \max_x P_{X|Y}(x, y) > -\log_2 t$
where $P_Y(y)$ is the probability of a certain query output.
Our policy: $-\log_2 \max_y \max_x P_{X|Y}(x, y) > -\log_2 t$
so that $-\log_2 \max_x P_{X|Y}(x, y_{\mathrm{real}}) > -\log_2 t$

Abstract Conditioning
[Figure: approximate regions P1 and P2]

Initial distribution
P1 (over bday and byear):
- 1/(37*365) - ε ≤ probability ≤ 1/(37*365) + ε
- 37*365 ≤ # of points ≤ 37*365
Need the attacker’s actual belief to be represented by this P1
A much easier task than knowing the exact querier belief