Differential Privacy and Statistical Inference: A TCS Perspective

Differential Privacy and Statistical Inference: A TCS Perspective
Salil Vadhan
Center for Research on Computation & Society, School of Engineering & Applied Sciences, Harvard University
Simons Institute Data Privacy Planning Workshop, May 23, 2017

Two Views of Data Analysis
Traditional TCS algorithms (and much of the DP literature): a worst-case input x is fed to a mechanism M, which outputs Y; "utility" U(x, Y) is measured against the dataset itself.
Statistical inference: X_1, X_2, ..., X_n are randomly sampled from a population P and fed to M, which outputs Y; "utility" U(P, Y) is measured against the population.

Statistical Inference with DP
X_1, X_2, ..., X_n are randomly sampled from a population P; M outputs Y, with "utility" U(P, Y).
Desiderata:
- M is differentially private [worst-case].
- Utility is maximized [average-case over X = (X_1, ..., X_n) and Y; worst-case (frequentist) or average-case (Bayesian) over P].
Example: differentially private PAC learning [Kasiviswanathan-Lee-Nissim-Raskhodnikova-Smith `08].

Natural Two-Step Approach
1. Start with the "best" non-private inference procedure: given X_1, ..., X_n sampled from a population P with mean μ, M_np outputs the sample mean X̄, with |X̄ − μ| = Θ(1/√n).
2. Approximate it as well as possible with a DP procedure: on a dataset x_1, ..., x_n, M_dp outputs Y = x̄ + Lap(·), with |Y − x̄| = Θ(1/n).
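Step 2 can be made concrete. Below is a minimal sketch of the Laplace mechanism applied to the sample mean; the clipping range [lower, upper] and the inverse-CDF Laplace sampler are illustrative choices, not part of the talk:

```python
import math
import random

def laplace(scale):
    """Sample from Laplace(0, scale) via the inverse CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_mean(xs, epsilon, lower, upper):
    """epsilon-DP estimate of the mean of xs.  Clipping each value to
    [lower, upper] bounds the mean's sensitivity by (upper - lower)/n,
    so Laplace noise of scale (upper - lower)/(n * epsilon) suffices."""
    n = len(xs)
    clipped = [min(max(x, lower), upper) for x in xs]
    return sum(clipped) / n + laplace((upper - lower) / (n * epsilon))
```

The Θ(1/n) noise term in the slide corresponds to the (upper − lower)/(n·ε) Laplace scale here.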

Privacy for Free?
Apply M_dp directly to the random sample: Y = X̄ + Lap(·). Since the Θ(1/n) Laplace noise is lower-order than the Θ(1/√n) sampling error,
|Y − μ| = (1 + o(1)) · |X̄ − μ|.

Limitations of the Two-Step Approach
- Asymptotics hide important parameters: σ/√n ≫ R/(εn) only when n ≫ (R/(σε))², which is often huge!
- Some parameters (e.g. σ = σ[P]) may be unknown.
- Can draw wildly incorrect inferences at finite n ["DP kills"].
- Requiring M_dp(x) ≈ M_np(x) on worst-case inputs may be overkill (and even impossible), e.g. if the range R is unbounded.
- The optimal non-private procedure may not yield the optimal differentially private procedure.
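The first limitation is easy to check numerically. The values of R, σ, and ε below are assumed for illustration; the crossover n ≈ (R/(σε))² is where the two error terms are equal:

```python
import math

# Illustrative, assumed values: data range R, population sd sigma, privacy eps.
R, sigma, eps = 1.0, 0.1, 0.5

def sampling_error(n):
    return sigma / math.sqrt(n)   # Theta(1/sqrt(n)) statistical error

def laplace_error(n):
    return R / (eps * n)          # Theta(1/n) Laplace noise for a clipped mean

# Noise is lower-order asymptotically, but dominates until n >> (R/(sigma*eps))^2.
crossover = (R / (sigma * eps)) ** 2   # here: 400
for n in [100, 400, 40_000]:
    print(n, sampling_error(n), laplace_error(n))
```

Below the crossover the "lower-order" privacy noise is actually the dominant error term.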

A Different Two-Step Approach
X_1, X_2, ..., X_n are randomly sampled from a population P; a DP mechanism M outputs Y, a "summary" of the dataset; post-processing T maps Y to Z, with "utility" U(P, Z).
A naïve application runs into similar difficulties as before [Fienberg-Rinaldo-Yang `11, Karwa-Slavkovic `12]. Approaches for addressing these problems appear in [Vu-Slavkovic `09, McSherry-Williams `10, Karwa-Slavkovic `12, ...].

Take-Away Messages
1. Study the overall design problem: X_1, ..., X_n randomly sampled from a population P; M outputs Y, with "utility" U(P, Y), where
- M is differentially private [worst-case];
- utility is maximized [average-case over X = (X_1, ..., X_n) and Y; worst-case (frequentist) or average-case (Bayesian) over P].
[Kasiviswanathan et al. `08, Dwork-Lei `09, Smith `10, Wasserman-Zhou `10, Hall-Rinaldo-Wasserman `13, Duchi-Jordan-Wainwright `12 & `13, Barber-Duchi `14, ...]

Take-Away Messages
2. Ensure "soundness": prevent incorrect conclusions even at small n. It is OK to declare "failure".
[Vu-Slavkovic `09, McSherry-Williams `10, Karwa-Slavkovic `12, ...]

Example 1: Confidence Intervals [Karwa-Vadhan `17]
X_1, ..., X_n are sampled from P = N(μ, σ²), with μ ∈ [−R, R] and σ ∈ [σ_min, σ_max]; M outputs an interval I ⊆ ℝ.
Requirements:
- Privacy: M is ε-differentially private.
- Coverage ("soundness"): for all n, μ, σ, ε: Pr[μ ∈ I] ≥ .95.
Goal:
- Length ("utility"): minimize E[|I|].

Example 1: Confidence Intervals [Karwa-Vadhan `17], σ known
X_1, ..., X_n are sampled from P = N(μ, σ²), with μ ∈ [−R, R] and σ known; M outputs an interval I ⊆ ℝ.
Upper bound: there is an ε-DP algorithm M achieving
E[|I|] ≤ 2z_{.975} · σ/√n + (σ/ε) · O(1/n),
provided that n ≳ (c/ε) · log(R/σ). The first term is the non-private length.
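For intuition, here is a conservative sketch of an ε-DP confidence interval; it is NOT the Karwa-Vadhan algorithm. It clips to the known range [−R, R] and widens the interval so coverage survives the Laplace noise, which is why it pays an R/(εn) noise term where the theorem above achieves (σ/ε)·O(1/n):

```python
import math
import random
from statistics import NormalDist

def dp_confidence_interval(xs, epsilon, R, sigma, alpha=0.05):
    """Sketch of a conservative epsilon-DP (1 - alpha) CI for the mean of
    N(mu, sigma^2), mu in [-R, R], sigma known.  Clipping each sample to
    [-R, R] bounds the mean's sensitivity by 2R/n, so Laplace noise of
    scale 2R/(n*epsilon) gives epsilon-DP.  (Ignores the small clipping
    bias from Gaussian tails outside [-R, R].)"""
    n = len(xs)
    clipped = [min(max(x, -R), R) for x in xs]
    scale = 2 * R / (n * epsilon)
    u = random.random() - 0.5
    center = sum(clipped) / n - scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    # Budget alpha/2 for sampling error and alpha/2 for the Laplace tail,
    # so the total miss probability is at most alpha.
    z = NormalDist().inv_cdf(1 - alpha / 4)
    half = z * sigma / math.sqrt(n) + scale * math.log(2 / alpha)
    return (center - half, center + half)
```

The union-bound calibration makes this interval valid at every n (the "soundness" desideratum), at the cost of being somewhat wider than necessary.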

Example 1: Confidence Intervals [Karwa-Vadhan `17], σ known (cont.)
Lower bound: any such M must have either E[|I|] ≥ R/2, or both
E[|I|] ≥ σ/(εn) and n ≳ (c/ε) · log(R/σ),
so the sample-size requirement in the upper bound is necessary.

Example 2: Hypothesis Testing [Vu-Slavkovic `09, Uhler-Slavkovic-Fienberg `13, Yu-Fienberg-Slavkovic-Uhler `14, Gaboardi-Lim-Rogers-Vadhan `16, Wang-Lee-Kifer `16, Kifer-Rogers `16]
X_1, ..., X_n are sampled from a distribution P on X; M outputs 0 or 1.
Requirements:
- Privacy: M is ε-differentially private.
- Significance (Type I error): for all n, ε: if P = H_0 then Pr[M(X) = 0] ≥ .95.
Goal:
- Power (Type II error): if P is "far" from H_0, then Pr[M(X) = 1] is "large".

Example 2: Hypothesis Testing [Cai-Daskalakis-Kamath `17]
Same setup, with an explicit distance parameter γ:
- Privacy: M is ε-differentially private.
- Significance (Type I error): for all n, ε, γ: if P = H_0 then Pr[M(X) = 0] ≥ .95.
- Power (Type II error): if d_TV(P, H_0) ≥ γ, then Pr[M(X) = 1] ≥ .95.
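As a toy instance of such a test (not from the talk), here is a sketch of an ε-DP test of H_0: "the bits are i.i.d. fair coin flips". The rejection threshold union-bounds a Hoeffding tail and the Laplace tail, so significance holds for every n and ε, at some cost in power:

```python
import math
import random

def dp_fair_coin_test(bits, epsilon, alpha=0.05):
    """Hypothetical epsilon-DP significance test of H0: fair coin.
    The count sum(bits) has sensitivity 1, so Laplace(1/epsilon) noise
    gives epsilon-DP.  Threshold = Hoeffding tail at level alpha/2 plus
    Laplace tail at level alpha/2, so Pr[reject | H0] <= alpha for all
    n and epsilon.  Returns 1 (reject H0) or 0 (accept)."""
    n = len(bits)
    u = random.random() - 0.5
    noisy = sum(bits) - (1 / epsilon) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    thresh = math.sqrt(n / 2 * math.log(4 / alpha)) + (1 / epsilon) * math.log(2 / alpha)
    return 1 if abs(noisy - n / 2) > thresh else 0
```

Note the "soundness at every n" calibration: for small n the threshold simply becomes hard to exceed, i.e. the test loses power rather than validity.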

Challenges for Future Research I
- Study more sophisticated inference problems, e.g. confidence intervals for multivariate Gaussians with unknown covariance matrix (related to least-squares regression [Sheffet `16]).
- What asymptotics are acceptable? Much of statistical inference relies on calculating asymptotic distributions (e.g. via the CLT), which is often reliable even when n is small. [Kifer-Rogers `17] assume that ε = Ω(1/√n).

Challenges for Future Research II
- Can we rigorously analyze the effect of privacy even when the non-private algorithms don't have rigorous analyses? E.g. in hypothesis testing, privacy needs at most an O(1/ε) blow-up in sample size... but this is suboptimal [Cai-Daskalakis-Kamath `17].
- Lower bounds: most existing techniques prove lower bounds on some kind of inference problem. We should explicitly state these!
- Does privacy have an inherent cost even when n is large? E.g. must DP confidence intervals have length E[|I|] ≥ 2z_{.975} · σ/√n + Ω(σ/(εn))?