PROBABILISTIC PROGRAMMING FOR SECURITY
Piotr (Peter) Mardziel, University of Maryland, College Park
With: Michael Hicks (UMD), Stephen Magill (Galois), Mudhakar Srivatsa (IBM T.J. Watson), Jonathan Katz (UMD), Mário Alvim (UFMG), Michael Clarkson (Cornell), Arman Khouzani (Royal Holloway), Carlos Cid (Royal Holloway)
Outline
Part 1: Machine learning ≈ Adversary learning
Part 2: Probabilistic Abstract Interpretation
Part 3: ~1 minute summary of our other work
“Machine Learning”: the “forward” model
Given Today = not-raining, the weather model yields a distribution over observations:
0.55 : Outlook = sunny
0.45 : Outlook = overcast
The prior over the hidden state:
0.5 : Today = not-raining
0.5 : Today = raining
“Backward” inference conditions the prior on an observation. Observing Outlook = sunny yields the posterior:
0.82 : Today = not-raining
0.18 : Today = raining
In practice, inference* may be approximate, and may return either the posterior distribution itself or samples from it (e.g., Today = not-raining, Today = raining, …).
Classification picks the most probable value from the posterior: Today = not-raining. Comparing classifications against reality yields the accuracy/error of the pipeline.
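To make the pipeline concrete, here is a minimal Python sketch of the weather example. The 0.5 prior, the 0.55/0.45 likelihoods for not-raining, and the 0.82/0.18 posterior come from the slides; the likelihoods for Today = raining are assumed values chosen so the posterior comes out as shown.

```python
# Exact Bayesian "backward" inference for the weather example.
# The likelihood row for Today = raining is an illustrative assumption;
# the slides only give the not-raining row (0.55 sunny / 0.45 overcast).

prior = {"not-raining": 0.5, "raining": 0.5}           # prior over Today
likelihood = {                                         # "forward" model: P(Outlook | Today)
    "not-raining": {"sunny": 0.55, "overcast": 0.45},
    "raining":     {"sunny": 0.12, "overcast": 0.88},  # assumed
}

def posterior(obs):
    """Condition the prior on an observed Outlook via Bayes' rule."""
    joint = {t: prior[t] * likelihood[t][obs] for t in prior}
    total = sum(joint.values())                        # P(Outlook = obs)
    return {t: p / total for t, p in joint.items()}

post = posterior("sunny")
print(post)                         # ~{'not-raining': 0.82, 'raining': 0.18}
print(max(post, key=post.get))      # classification: 'not-raining'
```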
Adversary learning follows the same pattern.
Prior over the secret: … : Pass = “password”, … : Pass = “12345”, …
Observation: Auth(“password”) returns Login = failed.
“Backward” inference yields the posterior, e.g. … : Pass = “12345”.
Exploitation ($$) picks the adversary's best guess, Pass = “12345”; comparing it against reality yields the vulnerability.
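The same machinery, reread as an adversary: below is a sketch with a hypothetical three-password prior (the probabilities are illustrative, not from the slides), conditioned on a failed login attempt. The vulnerability of the posterior is the probability of the adversary's best guess.

```python
# Adversary inference: a failed Auth(guess) rules the guess out, and the
# remaining prior mass is renormalized (exact Bayesian conditioning).

prior = {"password": 0.5, "12345": 0.3, "letmein": 0.2}   # assumed prior

def observe_auth_failed(guess, prior):
    """Posterior after observing Auth(guess) = failed."""
    post = {pw: p for pw, p in prior.items() if pw != guess}
    total = sum(post.values())
    return {pw: p / total for pw, p in post.items()}

post = observe_auth_failed("password", prior)   # Login = failed
vulnerability = max(post.values())              # prob. of best guess: V(δ)
best_guess = max(post, key=post.get)            # exploitation: try this next
print(post, best_guess, vulnerability)          # {'12345': 0.6, ...} '12345' 0.6
```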
Different but Same
Both settings start from a model/program of the prior and a model/program of the observation.
PPL for machine learning:
+ inference can be approximate
+ inference can be a sampler
Goal: classification. Evaluate accuracy/error; compare inference algorithms. Deploy: a classifier.
PPL for security:
− inference cannot be approximate, but + it can be sound
− inference cannot be a sampler
Goal: exploitation. Evaluate vulnerability measures; compare observation functions (with/without obfuscation, …). Deploy: a protection mechanism.
Inference visualized
Distributions δ : S → [0,1] live in the space of all distributions over S. Inference traces a path δ → δ′ → δ″ → δ‴ from the prior. In machine learning we evaluate the endpoint's accuracy; in security, its vulnerability.
Vulnerability scale: the space of distributions can be graded by vulnerability, the probability of the adversary's best guess (V(δ) = max_σ δ(σ)); inference moves the prior δ up this scale as the adversary learns.
Information flow: the information “flow” of an observation is the change in vulnerability from the prior δ to the posterior δ‴.
Issue: approximate inference. An approximate inference result can land anywhere near the exact posterior, so the vulnerability it reports may underestimate the true vulnerability under exact inference.
Sound inference: approximate but sound inference over-approximates the exact posterior, so the vulnerability it reports is an upper bound on the vulnerability under exact inference.
Issue: complexity. Exact inference from the prior δ through δ′, δ″, δ‴ is computationally expensive.
Issue: the prior. The vulnerability measured depends on the prior δ the analysis assumes.
Worst-case prior: analyze a worst-case prior δ_wc alongside the actual prior δ. The information “flow” from δ_wc to its posterior δ′_wc bounds the worst-case flow, covering whatever the actual prior and its posterior δ′ turn out to be.
The prior issue persists: which δ should the analysis assume? Differential privacy can be seen as one response, since its guarantee bounds the adversary's gain regardless of the prior.
Part 2: Probabilistic Abstract Interpretation
Probabilistic Abstract Interpretation, visualized: instead of pushing a concrete prior δ through inference to δ′, δ″, δ‴ in the space of all distributions over S, push an abstract prior P (covering δ) through abstract inference, and read the vulnerability off the abstract result.
Part 2: Probabilistic Abstract Interpretation. In standard PL lingo: concrete semantics vs. abstract semantics, and their probabilistic counterparts, concrete probabilistic semantics vs. abstract probabilistic semantics.
Concrete Interpretation
(Program) states σ : Variables → Integers. Concrete semantics: [[ Stmt ]] : States → States.
Example over variables {x, y}:
[[ y := x + y ]] {x↦1, y↦1} = {x↦1, y↦2}
[[ if y >= 2 then x := x + 1 ]] {x↦1, y↦2} = {x↦2, y↦2}
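A minimal Python sketch of this concrete semantics, with statements as state-to-state functions (the combinator names are ours):

```python
# Concrete interpreter: states are dicts Variables -> Integers, and
# [[Stmt]] is a function from states to states.

def assign(var, expr):
    """[[ var := expr ]], where expr is a Python function of the state."""
    return lambda s: {**s, var: expr(s)}

def ite(cond, then_stmt, else_stmt=lambda s: s):
    """[[ if cond then S1 else S2 ]] (else defaults to skip)."""
    return lambda s: then_stmt(s) if cond(s) else else_stmt(s)

def seq(*stmts):
    """[[ S1 ; S2 ; ... ]]."""
    def run(s):
        for st in stmts:
            s = st(s)
        return s
    return run

prog = seq(assign("y", lambda s: s["x"] + s["y"]),
           ite(lambda s: s["y"] >= 2, assign("x", lambda s: s["x"] + 1)))
print(prog({"x": 1, "y": 1}))   # {'x': 2, 'y': 2}
```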
Abstract Interpretation
Abstract program states AbsStates, with concretization γ(P) := { σ | P(σ) } and abstract semantics [[ Stmt ]]♯ : AbsStates → AbsStates.
Example: intervals. A predicate P is a closed interval on each variable; γ(1≤x≤2, 1≤y≤1) = all states that assign x between 1 and 2, and y = 1.
[[ y := x + 2*y ]]♯ (1≤x≤2, 1≤y≤1) = (1≤x≤2, 3≤y≤4)
[[ if y >= 4 then x := x + 1 ]]♯ (1≤x≤2, 3≤y≤4) = (1≤x≤3, 3≤y≤4)
Soundness: each concrete transition σ → σ′ under [[ S ]] stays within the corresponding abstract states, i.e., if σ ∈ γ(P) then [[ S ]]σ ∈ γ([[ S ]]♯ P).
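A small interval-domain sketch in Python of the example above. For simplicity it handles only linear assignments with nonnegative coefficients, and it joins the two branches of the conditional without refining by the guard, which is why x widens to [1, 3] exactly as on the slide:

```python
# Interval abstract interpretation: an abstract state maps each variable
# to a closed interval (lo, hi).

def a_assign_linear(var, coeffs, const):
    """Abstract [[ var := sum(coeffs[v]*v) + const ]], nonnegative coeffs."""
    def run(a):
        lo = const + sum(c * a[v][0] for v, c in coeffs.items())
        hi = const + sum(c * a[v][1] for v, c in coeffs.items())
        return {**a, var: (lo, hi)}
    return run

def join(a1, a2):
    """Least upper bound: componentwise interval hull."""
    return {v: (min(a1[v][0], a2[v][0]), max(a1[v][1], a2[v][1])) for v in a1}

a = {"x": (1, 2), "y": (1, 1)}                     # (1<=x<=2, 1<=y<=1)
a = a_assign_linear("y", {"x": 1, "y": 2}, 0)(a)   # y := x + 2*y
print(a)                                           # {'x': (1, 2), 'y': (3, 4)}
# if y >= 4 then x := x + 1: y may or may not reach 4, so join both branches.
then_a = a_assign_linear("x", {"x": 1}, 1)(a)
print(join(a, then_a))                             # {'x': (1, 3), 'y': (3, 4)}
```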
Probabilistic Interpretation: the plan is the same three steps, now for distributions: the concrete probabilistic semantics, the abstraction, and the abstract semantics.
Concrete Probabilistic Semantics
(Sub)distributions δ : States → [0,1]. Statements map distributions to distributions:
[[ skip ]]δ = δ
[[ S1 ; S2 ]]δ = [[ S2 ]]([[ S1 ]]δ)
[[ if B then S1 else S2 ]]δ = [[ S1 ]](δ ∧ B) + [[ S2 ]](δ ∧ ¬B)
[[ pif p then S1 else S2 ]]δ = [[ S1 ]](p*δ) + [[ S2 ]]((1−p)*δ)
[[ x := E ]]δ = δ[x → E]
[[ while B do S ]] = lfp (λF. λδ. F([[ S ]](δ ∧ B)) + (δ ∧ ¬B))
Subdistribution operations:
p*δ scales probabilities by p: p*δ := λσ. p · δ(σ)
δ ∧ B removes mass inconsistent with B: δ ∧ B := λσ. if [[ B ]]σ = true then δ(σ) else 0
δ1 + δ2 combines mass from both: δ1 + δ2 := λσ. δ1(σ) + δ2(σ)
δ[x → E] transforms mass per the assignment
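A direct Python transcription of these equations over finite state spaces, representing a subdistribution as a map from (frozen) states to mass (function names are ours):

```python
# Concrete probabilistic semantics: each statement is a distribution
# transformer over dicts {frozenset(state.items()): mass}.

def scale(p, d):                       # p*δ
    return {s: p * m for s, m in d.items()}

def restrict(d, b):                    # δ ∧ B
    return {s: m for s, m in d.items() if b(dict(s))}

def plus(d1, d2):                      # δ1 + δ2
    out = dict(d1)
    for s, m in d2.items():
        out[s] = out.get(s, 0.0) + m
    return out

def assign(var, expr):                 # [[ x := E ]]δ = δ[x -> E]
    def run(d):
        out = {}
        for s, m in d.items():
            s2 = dict(s); s2[var] = expr(s2)
            key = frozenset(s2.items())
            out[key] = out.get(key, 0.0) + m     # merge colliding states
        return out
    return run

def ite(b, s1, s2):                    # [[ if B then S1 else S2 ]]
    return lambda d: plus(s1(restrict(d, b)),
                          s2(restrict(d, lambda s: not b(s))))

def pif(p, s1, s2):                    # [[ pif p then S1 else S2 ]]
    return lambda d: plus(s1(scale(p, d)), s2(scale(1 - p, d)))

# Uniform prior over x in {4, 5, 6}, y = 0; the slide's example program.
d = {frozenset({"x": v, "y": 0}.items()): 1/3 for v in (4, 5, 6)}
prog = ite(lambda s: s["x"] <= 5,
           assign("y", lambda s: s["y"] + 3),
           assign("y", lambda s: s["y"] - 3))
print(prog(d))
```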
Subdistribution operations, visualized:
δ ∧ B removes the mass of δ inconsistent with B (e.g., B = x ≥ y).
δ1 + δ2 combines the mass of both.
Example: [[ if x ≤ 5 then y := y + 3 else y := y − 3 ]]δ = [[ y := y + 3 ]](δ ∧ x ≤ 5) + [[ y := y − 3 ]](δ ∧ x > 5)
Subdistribution Abstraction
Subdistribution Abstraction: Probabilistic Polyhedra
A probabilistic polyhedron P is a region of program states (a polyhedron), together with:
an upper bound on the probability of each possible state in the region,
an upper bound on the number of (possible) states,
an upper bound on the total probability mass,
and (usefully) lower bounds on each of the above. The lower bounds matter because conditioning divides: Pr[A | B] = Pr[A ∩ B] / Pr[B], and an upper bound on the quotient needs a lower bound on the denominator.
Vulnerability is the maximum per-state probability: V(δ) = max_σ δ(σ).
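A sketch of the representation in Python, substituting a box (one interval per variable) for a general polyhedron; the field names are ours, not the paper's implementation. Note that p_max alone already gives a sound upper bound on the vulnerability of any δ ∈ γ(P):

```python
# One probabilistic polyhedron, with a box standing in for the region.
from dataclasses import dataclass

@dataclass
class ProbPoly:
    region: dict          # var -> (lo, hi), integer bounds
    p_min: float          # bounds on the probability of each possible state
    p_max: float
    s_min: int            # bounds on the number of possible (support) states
    s_max: int
    m_min: float          # bounds on the total probability mass
    m_max: float

    def size(self):
        """Number of integer points in the box region."""
        n = 1
        for lo, hi in self.region.values():
            n *= hi - lo + 1
        return n

    def vulnerability_bound(self):
        """Sound upper bound on V(δ) = max_σ δ(σ) for any δ ∈ γ(P)."""
        return self.p_max

P = ProbPoly({"x": (1, 4)}, p_min=0.1, p_max=0.3, s_min=3, s_max=4,
             m_min=0.9, m_max=1.0)
print(P.size(), P.vulnerability_bound())   # 4 0.3
```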
Abstraction imprecision: the abstract elements (e.g., P1, P2) only over-approximate the exact distribution, trading precision for tractability.
Probabilistic Abstract Interpretation
Define the abstract semantics [[ S ]]♯ P with the soundness condition: if δ ∈ γ(P) then [[ S ]]δ ∈ γ([[ S ]]♯ P).
This requires abstract versions of the subdistribution operations: P1 + P2, P ∧ B, and p*P.
Example abstract operation: {P3, P4, P5} = {P1} + {P2}. Adding two probabilistic polyhedra over x splits the result into up to three pieces: the overlap of the two regions, where the per-state bounds [p_min, p_max] add, and the two non-overlapping parts, which keep their original bounds.
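A one-variable sketch of this split, using intervals in place of polyhedra (the real operation works over arbitrary convex regions):

```python
# Abstract addition {P1} + {P2} for one-variable interval regions.

def add_1d(P1, P2):
    (a1, b1), (a2, b2) = P1["region"], P2["region"]
    pieces = []
    lo, hi = max(a1, a2), min(b1, b2)          # overlap: bounds add
    if lo <= hi:
        pieces.append({"region": (lo, hi),
                       "p_min": P1["p_min"] + P2["p_min"],
                       "p_max": P1["p_max"] + P2["p_max"]})
    for (a, b), P in (((a1, b1), P1), ((a2, b2), P2)):
        left = (a, min(b, max(a1, a2) - 1))    # part before the overlap
        right = (max(a, min(b1, b2) + 1), b)   # part after the overlap
        for (l, r) in (left, right):
            if l <= r:                         # keep only nonempty pieces
                pieces.append({"region": (l, r),
                               "p_min": P["p_min"], "p_max": P["p_max"]})
    return pieces

P1 = {"region": (0, 5), "p_min": 0.05, "p_max": 0.1}
P2 = {"region": (3, 8), "p_min": 0.02, "p_max": 0.08}
print(add_1d(P1, P2))   # three pieces: [3,5] with summed bounds, [0,2], [6,8]
```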
Conditioning. Concretely, conditioning on B restricts δ to B and normalizes by the remaining mass. Abstractly, normalization divides by the total mass, so a lower bound on the total mass is needed to get sound upper bounds on the result.
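An interval-arithmetic sketch of why the lower bound matters, assuming we already have two-sided mass bounds on Pr[A ∩ B] and Pr[B]:

```python
# Sound bounds on Pr[A | B] = Pr[A ∩ B] / Pr[B] from mass bounds.
# The upper bound divides by the *lower* bound on Pr[B]; if that lower
# bound is 0, the result degrades to the trivial bound 1.

def condition_bounds(mass_AB, mass_B):
    """mass_AB, mass_B: (lower, upper) bounds on Pr[A ∩ B] and Pr[B]."""
    lo = mass_AB[0] / mass_B[1] if mass_B[1] > 0 else 0.0
    hi = mass_AB[1] / mass_B[0] if mass_B[0] > 0 else 1.0
    return lo, min(hi, 1.0)

print(condition_bounds((0.05, 0.10), (0.40, 0.60)))  # (0.0833..., 0.25)
```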
Simplifying the representation: to limit the number of probabilistic polyhedra, P1 ± P2 merges two probabilistic polyhedra into one, taking the convex hull of the regions and weakening the numeric bounds via various counting arguments.
Add and simplify: {P3} = {P1} ± {P2} adds two pieces and immediately merges the result into a single probabilistic polyhedron.
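A one-variable sketch of the merge, again with intervals standing in for polyhedra; the real operation uses convex hulls and counting arguments to get tighter bounds than the trivial weakenings used here:

```python
# Merge P1 ± P2 into one piece: the region becomes the hull of the two
# regions, and bounds are weakened so the result covers any δ1 + δ2 the
# two pieces could represent.

def merge_1d(P1, P2):
    (a1, b1), (a2, b2) = P1["region"], P2["region"]
    hull = (min(a1, a2), max(b1, b2))
    return {"region": hull,
            "p_min": 0.0,                         # a point may lie in neither piece
            "p_max": P1["p_max"] + P2["p_max"],   # ...or in both
            "m_min": P1["m_min"] + P2["m_min"],   # total mass just adds
            "m_max": P1["m_max"] + P2["m_max"]}

P1 = {"region": (0, 5), "p_max": 0.1, "m_min": 0.3, "m_max": 0.4}
P2 = {"region": (3, 8), "p_max": 0.08, "m_min": 0.5, "m_max": 0.6}
print(merge_1d(P1, P2))   # {'region': (0, 8), ...}
```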
Primitives for the operations. Two are needed:
Linear model counting: count the number of integer points in a convex polyhedron.
Integer linear programming: maximize a linear function over the integer points of a polyhedron.
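Brute-force stand-ins for both primitives over a box-bounded polyhedron, to show what is being computed; real implementations use a dedicated model counter and an ILP solver:

```python
# Polyhedron {x : A·x <= b}, intersected with an enumerable box.
from itertools import product

def integer_points(bounds, constraints):
    """Enumerate integer points in the box satisfying every A·x <= b row."""
    ranges = [range(lo, hi + 1) for lo, hi in bounds]
    return [x for x in product(*ranges)
            if all(sum(a * xi for a, xi in zip(row, x)) <= c
                   for row, c in constraints)]

bounds = [(0, 10), (0, 10)]
constraints = [((1, 1), 10)]                 # x + y <= 10
pts = integer_points(bounds, constraints)
print(len(pts))                              # model counting: 66 points
print(max(sum(p) for p in pts))              # "ILP": max x + y = 10
```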
Probabilistic Abstract Interpretation, summarized: the prior δ is covered by an abstract prior P, abstract inference carries P through P′, P″, P‴ alongside the concrete δ′, δ″, δ‴, and the final abstract element yields conservative (sound) vulnerability bounds.
Part 3: ~1 minute summary of our other work
[CSF11, JCS13] Limiting vulnerability, and computational aspects of the probabilistic semantics
[PLAS12] Limiting vulnerability for symmetric cases
[S&P14, FCS14] Measuring vulnerability when secrets change over time
[CSF15] onwards: active defense and game theory
Abstract Conditioning: the conditioned distribution is approximated by probabilistic polyhedra P1 and P2.