Differential Privacy in Practice
Georgios Kellaris

Privacy-preserving data publishing
A curator (e.g., a company, a hospital, or an institution) wishes to publish data about its users. Third parties (e.g., research labs or advertising agencies) wish to learn statistical facts about the published data. How can the curator release useful data while preserving user privacy? [Diagram: users send their data to the curator, who releases statistics to third parties.]

ϵ-differential privacy
Publishing statistics can reveal potentially private information. Goal: encourage user participation in the statistical analysis. How? Prove that whether they participate in the analysis or not, the revealed information is almost the same.

ϵ-differential privacy
Prove that the participation of any single user in the published data will not increase the adversary's knowledge. Simply publishing statistics does not work. [Example: a hidden database D records which of the users u1 through u9 visited which of the locations l1 through l5; D' is the same database without u9. Suppose that before publishing, the adversary happens to know every record except u9's. Once the exact per-location counts are published, the adversary can subtract what it already knows and recover u9's record.]

ϵ-differential privacy
Main idea: a randomized mechanism M takes the database and produces an output t. Any output (called a transcript) of M is produced with almost the same probability, whether any single user was in the database (D) or not (D').
Formal definition: a mechanism M satisfies ϵ-differential privacy if for any two neighboring databases D, D', and for all possible transcripts t,
    Pr[M(D) = t] / Pr[M(D') = t] = 1 ± ϵ
(the exact definition bounds this ratio by e^ϵ; for small ϵ, e^ϵ ≈ 1 + ϵ).

ϵ-differential privacy
[Plot: two overlapping output distributions. Red line: the probability of receiving a certain t given D. Blue line: the probability of receiving a certain t given D'.] For every t, the ratio of these probabilities must be bounded: Pr[M(D) = t] / Pr[M(D') = t] = 1 ± ϵ.

Laplace Perturbation Algorithm (LPA)
LPA injects noise into every published statistic. The noise is randomly drawn from a Laplace distribution with mean 0. [Example: the true per-location counts are replaced by noisy counts before publication.]

Laplace Perturbation Algorithm (LPA)
The scale of the distribution depends on the sensitivity Δ: the maximum amount of statistical information that can be affected by any single user, i.e., how much the statistics can change if we remove any single user. [Example: the neighboring databases D and D' from before, together with their per-location counts.]

Laplace Perturbation Algorithm (LPA)
Main result: if the scale of the noise is λ = Δ/ϵ, LPA satisfies ϵ-differential privacy. The higher the noise scale, the less accurate the published statistics. Essentially, LPA hides the presence of any user by hiding the effect she has on the published statistics.

Example
We want to count the users at a certain location. Any user can affect the count by at most 1 (i.e., Δ = 1). The true answer is 100 if a given user opts out (D') and 101 if she opts in (D). We add noise Lap(1/ϵ) to the true answer; the ratio of output probabilities is then bounded, so we satisfy ϵ-differential privacy.
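To make this concrete, here is a minimal Python sketch of the counting example (assuming numpy; laplace_mechanism and laplace_pdf are illustrative names, not from any library):

```python
import numpy as np

rng = np.random.default_rng()

def laplace_mechanism(true_answer, sensitivity, epsilon):
    """Publish true_answer plus Laplace noise with mean 0 and scale Delta/epsilon."""
    return true_answer + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Counting users at a location: any single user changes the count by
# at most 1, so Delta = 1.
epsilon = 0.1
print(laplace_mechanism(101, sensitivity=1, epsilon=epsilon))  # user opts in (D)

# Why the ratio is bounded: the densities of the two output
# distributions (centered at 101 for D and 100 for D') differ by a
# factor of at most exp(epsilon * |101 - 100| / Delta) = e^epsilon.
def laplace_pdf(t, mean, scale):
    return np.exp(-abs(t - mean) / scale) / (2 * scale)

t = 97.3  # an arbitrary transcript
ratio = laplace_pdf(t, 101, 1 / epsilon) / laplace_pdf(t, 100, 1 / epsilon)
assert ratio <= np.exp(epsilon) + 1e-12
```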

Privacy Levels
If a mechanism M adds noise Lap(aΔ/ϵ) to its statistics, it satisfies (ϵ/a)-differential privacy. E.g., let Δ = 1 and fix ϵ:
If a mechanism adds noise Lap(1/ϵ), it satisfies ϵ-differential privacy.
If a mechanism adds noise Lap(2/ϵ), it satisfies (ϵ/2)-differential privacy.
If a mechanism adds noise Lap(3/ϵ), it satisfies (ϵ/3)-differential privacy.
Etc.

Composition Theorem [Dwork et al., TCC'06]
If mechanisms M1, ..., Mn satisfy ϵ1-, ..., ϵn-differential privacy respectively, then running all of them on the same database D satisfies (ϵ1 + ... + ϵn)-differential privacy.

ϵ as privacy budget
We want to run multiple mechanisms on the same data while satisfying ϵ-differential privacy overall, so we view ϵ as a privacy budget distributed among the mechanisms:
The database D starts with budget ϵ.
M1 runs with budget ϵ/2, leaving ϵ/2.
M2 runs with budget ϵ/3, leaving ϵ/6.
M3 runs with budget ϵ/6, leaving 0.
The budget is exhausted: we cannot run more mechanisms on the same data!
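This bookkeeping can be made explicit with a small accountant object that refuses to run a mechanism once the budget is exhausted (a minimal sketch; the class and method names are our own):

```python
class PrivacyBudget:
    """Tracks the total epsilon shared by all mechanisms run on the
    same database, per the sequential composition theorem."""

    def __init__(self, epsilon):
        self.remaining = epsilon

    def spend(self, eps_i):
        if eps_i > self.remaining + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.remaining -= eps_i

eps = 1.0
budget = PrivacyBudget(eps)
budget.spend(eps / 2)  # M1 uses eps/2, leaving eps/2
budget.spend(eps / 3)  # M2 uses eps/3, leaving eps/6
budget.spend(eps / 6)  # M3 uses eps/6, leaving 0
# budget.spend(0.01)   # would raise: no budget left for further mechanisms
```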

Sampling
Reduce the sensitivity by using a sample of the original data: compute the statistics on the sample (maybe less accurate) and add smaller noise for the same privacy level. [Example: the 9-user database is reduced to a sample containing only users u1, u3, u4, u7, and u9.]

1st Application: Publishing Counts
Geo-social network application: user check-in data. [Table: a 0/1 matrix of locations l1 through l12 by users u1 through un, where an entry of 1 means the user checked in at that location.]

1st Application: Publishing Counts
Goal: publish the count of check-ins per location with differential privacy. [Example: the counts for l1 through l12 are (14, 13, 2, 51, 40, 10, 60, 33, 8, 3, 7, 18).]

Sensitivity
A user can affect each count by one. [Example: removing a user who checked in at every location decrements every count: (14, 13, 2, 51, 40, 10, 60, 33, 8, 3, 7, 18) becomes (13, 12, 1, 50, 39, 9, 59, 32, 7, 2, 6, 17).] Worst case: a user affects every count, so with 12 locations the sensitivity is 12.

Laplace Perturbation Algorithm (LPA)
With sensitivity 12, the noise scale is 12/ϵ. Problem: too much noise!

Idea
Group the counts and smooth via averaging: use the average value of each group as the count for each of the group's columns. [Example: grouping the 12 locations into three consecutive groups of four gives group averages 20, 36, and 9.]

Sensitivity
Each user can affect each average value by at most one. [Example: before removing u5, the first group's average was (14 + 13 + 2 + 51)/4 = 20; afterwards it is (13 + 12 + 1 + 50)/4 = 19.]

Sensitivity
Each user can affect each average value by at most one, and there are fewer values to publish. With three groups, the sensitivity is 3 and the noise scale is 3/ϵ.
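A minimal sketch of the group-and-smooth step for the fixed consecutive groups of four used in the example (the helper name is our own; assumes numpy):

```python
import numpy as np

rng = np.random.default_rng()

def grouped_counts(counts, group_size, epsilon):
    """Average the counts within each consecutive group, then add
    Laplace noise. A user may contribute one check-in to every
    location of a group, shifting that group's average by 1; with g
    groups the sensitivity is g, hence noise scale g/epsilon.
    Assumes len(counts) is divisible by group_size."""
    groups = np.asarray(counts, dtype=float).reshape(-1, group_size)
    averages = groups.mean(axis=1)
    g = len(averages)
    return averages + rng.laplace(scale=g / epsilon, size=g)

c = [14, 13, 2, 51, 40, 10, 60, 34, 8, 3, 7, 18]
print(grouped_counts(c, group_size=4, epsilon=1.0))  # noisy versions of (20, 36, 9)
```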

Problem: an arbitrary grouping gives a bad smoothing effect. [Example: the first consecutive group mixes counts as different as 2 and 51, so their shared average of 20 represents neither well.]

Idea
Find an optimal grouping by reordering the columns. [Example: after sorting the counts, the groups (2, 3, 7, 8), (10, 13, 14, 18), and (33, 40, 51, 60) have averages 5, 13.75, and 46, each of which represents its members well.]

Problem: the optimal grouping depends on the exact counts, so publishing it reveals information. The result is not differentially private!

Challenge: Find “good” groups while retaining differential privacy

Idea
Create two sequential mechanisms, each using budget ϵ/2. The first mechanism finds the groups on a sample (= lower sensitivity); the second mechanism groups, smooths, adds noise, and publishes. Question: how do we sample?

Row sampling
Sample each row with probability β = (e^(ϵ/12) − 1) / (e^ϵ − 1); the sensitivity becomes 1.

Problem: a bad sample gives a bad grouping. [Example: grouping by the sample's ordering yields groups with averages 8, 26.5, and 30.25 on the true counts, again mixing small and large counts.]

Column sampling
Keep all rows, but sample a single 1 from each row. The sensitivity becomes 1. [Example: the sampled counts, e.g., 7 for l1 and 29 for l7, preserve the relative order of the true counts 14 and 60.]
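A sketch of the column-sampling step (our own helper; the data is assumed to be a 0/1 user-by-location matrix). Because exactly one check-in is kept per user, any single user affects exactly one count by one, so the sensitivity is 1:

```python
import numpy as np

rng = np.random.default_rng()

def sample_one_checkin_per_user(matrix):
    """matrix[i, j] = 1 iff user i checked in at location j.
    Keep a single, randomly chosen 1 in each row and return the
    resulting per-location counts (sensitivity 1)."""
    matrix = np.asarray(matrix)
    sampled_counts = np.zeros(matrix.shape[1], dtype=int)
    for row in matrix:
        ones = np.flatnonzero(row)
        if ones.size > 0:
            sampled_counts[rng.choice(ones)] += 1
    return sampled_counts
```

These sampled counts, processed under the first ϵ/2 of the budget, are sorted to decide the grouping; the second ϵ/2 is then spent publishing the smoothed group averages of the true counts, as in the earlier sketch.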

Good result: sorting and grouping by the sampled counts recovers the good grouping, with averages 5, 13.75, and 46 on the true counts.

Experiments

2nd Application: Traffic Reporting
[Example: a map with the count of users per road segment: 3, 7, 9, 8, 6.]

Privacy Concerns
Suppose one user goes missing, so the true counts change from (3, 7, 9, 8, 6) to (3, 7, 8, 8, 6). Comparing the published data with the adversary's view of the remaining users reveals the missing user's position!

Laplace Perturbation Algorithm
Real data (3, 7, 9, 8, 6) plus noise from a Laplace distribution yields the published data (2, 9, 7, 1, 8). Sensitivity = 1: at a specific point in time, a user can be at only one location.

Streaming Setting
The counts are now published continuously, one release per timestamp. [Example: timestamp 1 has counts (6, 7, 5, 2, 8), timestamp 2 has (3, 7, 9, 8, 6), and so on up to timestamp n.]

ϵ-Differential Privacy for Streams
LPA is applied to the real data independently at every timestamp, and the noisy counts are published at each one.

ϵ-Differential Privacy for Streams
Event level: each timestamp runs LPA with the full budget ϵ; noise scale 1/ϵ; ϵ-differential privacy at any single timestamp.
User level: the budget is split into ϵ/n per timestamp; noise scale n/ϵ; ϵ-differential privacy across all n timestamps.
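In code, the two extremes differ only in how the per-timestamp budget is set (a sketch with our own function name; the stream is assumed to be an iterable of count vectors with sensitivity 1 per timestamp):

```python
import numpy as np

rng = np.random.default_rng()

def publish_stream(stream, epsilon, n, user_level=False):
    """Apply LPA independently at each timestamp.
    Event level: every timestamp gets the full budget epsilon (noise scale 1/epsilon).
    User level:  epsilon is split over all n timestamps (noise scale n/epsilon)."""
    eps_t = epsilon / n if user_level else epsilon
    for counts in stream:
        counts = np.asarray(counts, dtype=float)
        yield counts + rng.laplace(scale=1.0 / eps_t, size=counts.shape)
```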

w-Event Differential Privacy
Event level: ϵ-differential privacy at any single timestamp; does not protect user movement; noise proportional to 1.
w-event level: ϵ-differential privacy over any w consecutive timestamps; protects user movement that lasts at most w timestamps; noise proportional to w << n.
User level: ϵ-differential privacy over all timestamps; protects all user movement; noise proportional to n.

w-Event Differential Privacy
A window of w consecutive timestamps (here w = 3) slides along the timeline, and ϵ-differential privacy must hold within every window. If the mechanism spends budgets ϵ1, ϵ2, ϵ3 at the three timestamps of a window, the guarantee is ϵ1 + ϵ2 + ϵ3 ≤ ϵ. Challenge: set the noise/budget on-the-fly.
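The guarantee can be checked mechanically: the budgets spent at any w consecutive timestamps must sum to at most ϵ (a small sketch; the function name is ours):

```python
from collections import deque

def satisfies_w_event(spent_budgets, w, epsilon):
    """spent_budgets[i] is the budget used at timestamp i. Returns True
    iff every window of w consecutive timestamps spends at most epsilon."""
    window = deque(maxlen=w)  # the last w per-timestamp budgets
    for eps_i in spent_budgets:
        window.append(eps_i)
        if sum(window) > epsilon + 1e-12:
            return False
    return True

eps = 1.0
# The budget sequence from the Budget Distribution example below:
print(satisfies_w_event([eps/2, 0, eps/4, 3*eps/8, 0, 0], w=3, epsilon=eps))  # True
```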

Uniform
Give every timestamp the same budget ϵ/w (here ϵ/3) and run LPA at each one; every window of w timestamps then spends exactly ϵ.

Sample
Spend the full budget ϵ once every w timestamps (here at timestamps 1 and 4), publish with LPA there, and skip the timestamps in between.

Observations
Uniform may lead to large noise; Sample may skip important information. Key idea: if the counts to be published now are similar to the previous counts, skip them, i.e., approximate them with the previous publication. The similarity calculation is done in a special (private) manner; details omitted.
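A rough sketch of the publish-or-approximate loop. For brevity, the similarity test below is a plain distance threshold on the incoming counts, whereas the actual mechanisms decide similarity in a private manner (spending part of the budget on that decision):

```python
import numpy as np

rng = np.random.default_rng()

def adaptive_stream(stream, eps_per_publication, threshold):
    """Publish a fresh noisy release only when the new counts differ
    enough from the last release; otherwise repeat the last release."""
    last_release = None
    for counts in stream:
        counts = np.asarray(counts, dtype=float)
        if last_release is not None and np.abs(counts - last_release).mean() < threshold:
            yield last_release  # skip: approximate with the previous publication
        else:
            last_release = counts + rng.laplace(scale=1.0 / eps_per_publication,
                                                size=counts.shape)
            yield last_release
```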

Budget Distribution
Timestamp 1: published with budget ϵ/2.
Timestamp 2: similar to the previous release, so it is skipped and approximated by timestamp 1's output.
Timestamp 3: published with budget ϵ/4.
Timestamp 4: timestamp 1 has slid out of the window, so its budget becomes available again; published with budget 3ϵ/8.
Timestamps 5 and 6: skipped; the available budget grows back to 5ϵ/8.
Published data: (6, 7, 5, 2, 8) at timestamps 1 and 3, and (3, 7, 9, 8, 6) at timestamp 4. Every window of w = 3 timestamps spends at most ϵ: 3ϵ/4 ≤ ϵ, 5ϵ/8 ≤ ϵ, and 3ϵ/8 ≤ ϵ.
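Read off these frames, the allocation rule appears to be: when publishing, spend half of the budget not yet used within the current window, where budget spent at timestamps that have slid out of the window counts as available again. A hedged sketch of that rule (our own reconstruction, not code from the original work):

```python
from collections import deque

def bd_budgets(publish_flags, w, epsilon):
    """Yield the budget spent at each timestamp under the
    halve-the-remaining-window-budget rule sketched above."""
    window = deque([0.0] * w, maxlen=w)  # budgets spent at the last w timestamps
    for publish in publish_flags:
        # window[0] is about to slide out, so its budget is available again.
        remaining = epsilon - sum(window) + window[0]
        window.append(remaining / 2 if publish else 0.0)
        yield window[-1]

eps = 1.0
print(list(bd_budgets([1, 0, 1, 1, 0, 0], w=3, epsilon=eps)))
# -> [eps/2, 0, eps/4, 3*eps/8, 0, 0], matching the frames above
```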

Budget Absorption
Each timestamp is nominally allocated budget ϵ/w = ϵ/3; a publication absorbs the unused budget of the skipped timestamps before it.
Timestamp 1: published with ϵ/3.
Timestamp 2: skipped; its ϵ/3 remains unused.
Timestamp 3: published with 2ϵ/3 (its own ϵ/3 plus the absorbed ϵ/3).
Timestamps 4 and 5: skipped.
Timestamp 6: published with 2ϵ/3.
Published data: (6, 7, 5, 2, 8) at timestamps 1 and 3, and (3, 7, 9, 8, 6) at timestamp 6. Every window of w = 3 timestamps spends at most ϵ: ϵ/3 + 2ϵ/3 = ϵ and 2ϵ/3 ≤ ϵ.
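A similarly hedged sketch (the pay-back rule for absorbed budget is our inference; the transcript only shows the resulting allocations):

```python
def ba_budgets(publish_flags, w, epsilon):
    """Each timestamp is nominally worth epsilon/w. A publication absorbs
    the nominal budgets accumulated since the last publication; for every
    absorbed slot, one subsequent timestamp must stay silent."""
    unit = epsilon / w
    accumulated = 0.0  # unused nominal budget available to absorb
    silent = 0         # timestamps that must stay silent after absorbing
    for publish in publish_flags:
        if silent > 0:
            silent -= 1
            yield 0.0
            continue
        accumulated += unit
        if publish:
            spend = accumulated
            silent = round(spend / unit) - 1  # one silent slot per absorbed slot
            accumulated = 0.0
            yield spend
        else:
            yield 0.0

eps = 1.0
print(list(ba_budgets([1, 0, 1, 0, 0, 1], w=3, epsilon=eps)))
# -> [eps/3, 0, 2*eps/3, 0, 0, 2*eps/3], matching the frames above
```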

Experiments
The approaches are evaluated on the Rome dataset and the World Cup dataset.