Privacy-preserving Prediction

Presentation on theme: "Privacy-preserving Prediction" — Presentation transcript:

1 Privacy-preserving Prediction
Vitaly Feldman (Google Brain), with Cynthia Dwork

2 Privacy-preserving learning
Input: dataset S = ((x_1, y_1), …, (x_n, y_n)). Goal: given x, predict y.
A differentially private learning algorithm A outputs a model h; the outputs A(S) and A(S′) on neighboring datasets S, S′ must be close in distribution.

3 Trade-offs
Linear regression in ℝ^d: with ε-DP, needs a factor Ω(d/ε) more data [Bassily, Smith, Thakurta 14].
Learning a linear classifier over {0,1}^d: needs a factor Ω(d/ε) more data [Feldman, Xiao 13].
MNIST accuracy ≈ 95% with small ε, δ vs. 99.8% without privacy [AbadiCGMMTZ 16].

4 Prediction
Users need predictions, not models.
This fits many existing systems: through a prediction API, users submit queries p_1, …, p_t ∈ X and receive predictions v_1, …, v_t.
Since many existing applications already work this way, we would like the predictions themselves to be differentially private.

5 Attacks Black-box membership inference with high accuracy
[Shokri,Stronati,Song,Shmatikov 17; LongBWBWTGC 18; SalemZFHB 18]

6 Learning with DP prediction
Accuracy-privacy trade-off for a single prediction query.
Definition: M : (X×Y)^n × X → Y is an ε-DP prediction algorithm if for every x ∈ X, the output M(S, x) is ε-differentially private with respect to S.
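To make the definition concrete, here is a minimal sketch (hypothetical function names, assuming binary labels): any non-private learner yields an ε-DP prediction algorithm if its single predicted bit is passed through randomized response, because changing S can change the predicted bit by at most a flip.

```python
import math
import random

def dp_predict(learn, S, x, eps):
    """eps-DP prediction for binary labels via randomized response.

    `learn` may be an arbitrary (non-private) learning algorithm.
    For neighboring S, S' the predicted bit differs by at most a flip,
    so the two output distributions are within a factor of e^eps:
    the released value M(S, x) is eps-DP with respect to S.
    """
    y = learn(S)(x)  # y in {0, 1}; training itself is not private
    keep = math.exp(eps) / (1.0 + math.exp(eps))  # Pr[report y truthfully]
    return y if random.random() < keep else 1 - y
```

Note that only the released bit is private here; the trained model itself is never published.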

7 Differentially private aggregation
Label aggregation [HCB 16; PAEGT 17; PSMRTE 18; BTT 18]: split S into k disjoint chunks S_1, …, S_k of size m (n = km) and run a (non-DP) learning algorithm A on each chunk, producing models h_1, …, h_k. On query x, aggregate the labels h_1(x), …, h_k(x) with a differentially private mechanism, e.g. the exponential mechanism: output y with probability ∝ e^{ε·|{i : h_i(x) = y}|/2}.
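The scheme above can be sketched directly (hypothetical helper names): train a non-private learner on disjoint chunks, then sample the released label with the exponential mechanism over the vote counts. One element of S affects one chunk, hence at most one vote, so the vote count has sensitivity 1.

```python
import math
import random
from collections import Counter

def train_on_chunks(A, S, k):
    """Split S into k disjoint chunks of size m = n // k and run the
    (non-DP) learning algorithm A on each, giving models h_1..h_k."""
    m = len(S) // k
    return [A(S[i * m:(i + 1) * m]) for i in range(k)]

def dp_aggregate(models, x, labels, eps):
    """Exponential mechanism over label votes: Pr[y] ∝ exp(eps * votes(y) / 2).
    Changing one element of S changes at most one vote (sensitivity 1)."""
    votes = Counter(h(x) for h in models)
    top = max(votes[y] for y in labels)  # shift exponents for numerical stability
    weights = [math.exp(eps * (votes[y] - top) / 2.0) for y in labels]
    return random.choices(labels, weights=weights)[0]
```

Shifting by the maximum vote count rescales all weights by a common factor, so the sampling distribution is unchanged while avoiding overflow for large ε·k.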

8 Classification via aggregation
PAC model: let C be a class of functions over X. For every distribution P over X×{0,1}, output h such that w.h.p. Pr_{(x,y)~P}[h(x) ≠ y] ≤ Opt_P(C) + α.
Excess error α:
              Non-private          ε-DP prediction                               ε-DP model
Realizable:   Θ(VCdim(C)/n)        Θ(VCdim(C)/(εn))                              Θ(RDim(C)/(εn))
Agnostic:     Θ(√(VCdim(C)/n))     O((VCdim(C)/(εn))^{1/3}) + Θ(√(VCdim(C)/n))   Θ(√(VCdim(C)/n) + RDim(C)/(εn))
RDim is the representation dimension [Beimel, Nissim, Stemmer 13]: VCdim(C) ≤ RDim(C) ≤ VCdim(C)·log|X| [KLNRS 08], and for many classes RDim(C) = Ω(VCdim(C)·log|X|) [F., Xiao 13].

9 Prediction stability
À la [Bousquet, Elisseeff 02]: A : (X×Y)^n × X → ℝ is a uniformly γ-stable algorithm if for every pair of neighboring S, S′ and every x ∈ X, |A(S, x) − A(S′, x)| ≤ γ.
Convex regression: given F = {f(w, x) : w ∈ K}, for P over X×Y minimize ℓ_P(w) = E_{(x,y)~P}[ℓ(f(w, x), y)] over the convex set K ⊆ ℝ^d, where ℓ(f(w, x), y) is convex in w for all x, y.
Excess loss for convex 1-Lipschitz regression over the ℓ_2 ball of radius 1:
Non-private: Θ(1/√n)    ε-DP prediction: O(1/√(εn))    ε-DP model: Ω(1/√n + √d/(εn))

10 DP prediction implies generalization
Beyond aggregation: threshold functions on the line {1, …, m}. Excess error for agnostic learning:
Non-private: Θ(√(1/n))    ε-DP prediction: Θ(√(1/n) + 1/(εn))    ε-DP model: Θ(√(1/n) + log(m)/(εn))

11 Conclusions
Prediction is a natural setting for learning with privacy, with a better accuracy-privacy trade-off.
The paper appeared at COLT 2018.
Open problems: general agnostic learning; other general approaches; handling multiple queries [BTT 18].

