1
Differential Privacy (2)
2
Outline
- Review the basic definition
- Exponential mechanism
- Application in data mining
- Non-interactive DP
3
Definition
Mechanism: K(x) = f(x) + D, where D is random noise.
It is an output perturbation method.
4
Sensitivity function
How should the noise D be designed? It depends on the function f(x).
The sensitivity of f captures how great a difference in the output of f (caused by changing a single record) must be hidden by the additive noise.
5
Adding Laplace (Lap) noise
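The Laplace mechanism is commonly written as K(x) = f(x) + Lap(Δf/ε), where Δf is the sensitivity of f and ε is the privacy parameter. The following is a minimal Python sketch, assuming a counting query whose sensitivity is 1. The helper names (`laplace_noise`, `laplace_mechanism`, `count_over_40`) and the sample data are hypothetical and do not come from the slides.

```python
import random

def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=random):
    """Return f(x) + Lap(sensitivity / epsilon)."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return true_answer + laplace_noise(sensitivity / epsilon, rng)

def count_over_40(ages):
    """Counting query: changing one record changes the count by at most 1."""
    return sum(1 for a in ages if a > 40)

# Hypothetical toy data, for illustration only.
ages = [23, 45, 31, 62, 58, 39, 41]
rng = random.Random(0)
noisy_count = laplace_mechanism(count_over_40(ages), sensitivity=1.0,
                                epsilon=0.5, rng=rng)
```

A smaller ε gives a larger noise scale Δf/ε, so the answer is more private but less accurate.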
6
Exponential mechanism
Think of the Laplace mechanism as random sampling from a Laplace distribution whose mean is the function output.
7
Exponential mechanism
What if we have a multidimensional function that outputs its optimal value at a certain point, and we want to guarantee differential privacy?
8
Exponential mechanism
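In its usual form, the exponential mechanism picks an output r with probability proportional to exp(ε q(x, r) / (2 Δq)), where q is a quality function and Δq is its sensitivity. The following is a minimal sketch for a finite candidate set. The function name `exponential_mechanism`, the toy candidates, and the toy quality function are assumptions made for illustration.

```python
import math
import random

def exponential_mechanism(candidates, quality, sensitivity, epsilon, rng=random):
    """Sample one candidate with probability proportional to
    exp(epsilon * quality(c) / (2 * sensitivity))."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    scores = [quality(c) for c in candidates]
    best = max(scores)
    # Subtracting the best score leaves the probabilities unchanged
    # and avoids overflow in exp().
    weights = [math.exp(epsilon * (s - best) / (2.0 * sensitivity))
               for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Hypothetical example: privately pick the split point with the best score.
split_points = [10, 20, 30, 40]
toy_quality = {10: 1.0, 20: 3.0, 30: 2.5, 40: 0.5}
rng = random.Random(0)
chosen = exponential_mechanism(split_points, toy_quality.get,
                               sensitivity=1.0, epsilon=1.0, rng=rng)
```

High-quality candidates are exponentially more likely to be chosen, but every candidate keeps a nonzero probability, which is what hides the effect of any single record.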
9
Interactive data mining
10
Consider the simple decision tree algorithm
- Search each possible value in each attribute to find the optimal partitioning point, and partition the dataset into two sets
- Recursively partition each subset until a certain condition is met
11
Apply DP to build ID3 decision trees
Quality functions for partitioning
Example: information gain, the amount by which splitting the dataset reduces entropy
IG(D, "x<a") = entropy(D) - (n/N) entropy("x<a" of D) - ((N-n)/N) entropy("x>=a" of D)
where N is the number of records in D and n is the number of records with x<a.
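The information gain formula can be sketched directly in Python. This is a non-private illustration of the quality function only. The names `entropy`, `information_gain`, and `best_split`, the dict-based row layout, and the toy data are assumptions, not part of the slides.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr, a):
    """IG(D, "x<a") for attribute `attr` and threshold `a`."""
    left = [y for r, y in zip(rows, labels) if r[attr] < a]
    right = [y for r, y in zip(rows, labels) if r[attr] >= a]
    N, n = len(labels), len(left)
    if N == 0:
        return 0.0
    return (entropy(labels)
            - (n / N) * entropy(left)
            - ((N - n) / N) * entropy(right))

def best_split(rows, labels, attr):
    """Non-private search over every observed value of `attr`."""
    thresholds = sorted({r[attr] for r in rows})
    return max(thresholds,
               key=lambda a: information_gain(rows, labels, attr, a))

# Hypothetical toy dataset.
rows = [{"x": 1}, {"x": 2}, {"x": 3}, {"x": 4}]
labels = ["a", "a", "b", "b"]
threshold = best_split(rows, labels, "x")
```

In a differentially private tree builder, the `max` in `best_split` would be replaced by a private selection step, for example the exponential mechanism with information gain as the quality function.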
12
Sensitivity of IG
13
Also consider different quality functions
They may give different model quality
- Gini index
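As a rough illustration, the Gini index of a split can be computed as the size-weighted Gini impurity of the two subsets. The names `gini` and `gini_of_split` are hypothetical, and the exact form used in the presentation may differ. Lower values mean purer subsets, so a mechanism that maximizes quality would use the negated value.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_of_split(left_labels, right_labels):
    """Size-weighted Gini impurity of a two-way split (lower is better)."""
    total = len(left_labels) + len(right_labels)
    if total == 0:
        return 0.0
    return (len(left_labels) / total) * gini(left_labels) \
        + (len(right_labels) / total) * gini(right_labels)

# A perfectly separating split versus an uninformative one.
pure = gini_of_split(["a", "a"], ["b", "b"])
mixed = gini_of_split(["a", "b"], ["a", "b"])
```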
14
More quality functions
- Max operator
- Gain ratio: sensitivity is unbounded
15
Privacy budget ε
User-specified total budget ε
Composite operations need a specific ε' for each operation
The sum of the ε' values must not exceed ε
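The budget rule can be sketched as a small accountant that refuses any operation that would push the total spend past the user-specified budget. The class `PrivacyBudget` and the helper `even_split` are hypothetical names, and splitting the budget evenly is only one possible allocation.

```python
class PrivacyBudget:
    """Tracks per-operation spending against a total epsilon."""

    def __init__(self, total_epsilon):
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.total = total_epsilon
        self.spent = 0.0

    @property
    def remaining(self):
        return self.total - self.spent

    def spend(self, epsilon):
        """Reserve `epsilon` for one operation, or fail if it does not fit."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        # Small tolerance for floating-point rounding.
        if self.spent + epsilon > self.total + 1e-12:
            raise RuntimeError("privacy budget exceeded")
        self.spent += epsilon
        return epsilon

def even_split(total_epsilon, num_operations):
    """Assumed allocation: the same share for every operation."""
    return [total_epsilon / num_operations] * num_operations

budget = PrivacyBudget(1.0)
for eps in even_split(1.0, 4):
    budget.spend(eps)  # each operation would run with its own eps
```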
16
Sketch of the algorithm
17
Experimental evaluation
Using synthetic and real datasets
18
J48 is an implementation of C4.5, which is roughly ID3 plus pruning
19
Tradeoff between utility and privacy
20
Non-interactive DP
Noisy histogram release
21
Non-interactive differential privacy
Noisy histogram release
Problems:
- Sparse data, dramatically increased data size
- The level of granularity: a trade-off between privacy and quality
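A noisy histogram release can be sketched as follows, assuming that neighboring datasets differ by adding or removing one record, so each record affects exactly one bin count by 1. The function name `noisy_histogram` and the toy data are hypothetical.

```python
import random
from collections import Counter

def noisy_histogram(values, bins, epsilon, sensitivity=1.0, rng=random):
    """Release every bin count with independent Lap(sensitivity/epsilon) noise."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    counts = Counter(values)
    rate = epsilon / sensitivity  # 1 / Laplace scale

    def lap():
        # Difference of two exponential draws is Laplace-distributed.
        return rng.expovariate(rate) - rng.expovariate(rate)

    # Every bin in the domain is released, including empty ones.
    return {b: counts.get(b, 0) + lap() for b in bins}

# Hypothetical sparse data: only 2 of 6 bins are occupied.
bins = ["A", "B", "C", "D", "E", "F"]
values = ["A", "A", "A", "D"]
released = noisy_histogram(values, bins, epsilon=1.0, rng=random.Random(0))
```

Because empty bins must also receive noise, the released table has one entry per bin of the domain, which illustrates why sparse data inflates the output size. Finer bins also mean smaller true counts relative to the noise.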
22
Sampling and filtering
Problem: privacy leak
23
Partitioning: histogram drill-down, under the constraint of the privacy budget