1
Preservation of Proximity Privacy in Publishing Numerical Sensitive Data
J. Li, Y. Tao, and X. Xiao. SIGMOD '08.
Presented by Hongwei Tian
2
Outline
- What is PPDP
- Existing Privacy Principles
- Proximity Attack
- (ε, m)-anonymity
- Determine ε and m
- Algorithm
- Experiments and Conclusion
3
Privacy-Preserving Data Publishing
A true story from Massachusetts, 1997: the GIC (Group Insurance Commission) released "anonymized" medical records of state employees; by linking them with a voter list bought for 20 dollars, the record of Governor Weld was re-identified.
4
PPDP
- Privacy: sensitive information of individuals should be protected in the published data (more anonymized data)
- Utility: the published data should remain useful (more accurate data)
5
PPDP: Anonymization Techniques
- Generalization: specific value -> general value; maintains the semantic meaning, e.g., 78256 -> 7825*, UTSA -> University, 28 -> [20, 30]
- Perturbation: one value -> another random value; huge information loss -> poor utility
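As a concrete illustration of generalization, a minimal Python sketch follows; the helper names, the 4-digit zip prefix, and the age-bucket width of 10 are illustrative choices, not taken from the paper:

```python
def generalize_zip(zipcode: str, keep: int = 4) -> str:
    # Replace the trailing digits with '*': 78256 -> 7825*
    return zipcode[:keep] + "*" * (len(zipcode) - keep)

def generalize_age(age: int, width: int = 10) -> str:
    # Map a specific age onto a covering interval: 28 -> [20, 30]
    lo = (age // width) * width
    return f"[{lo}, {lo + width}]"

print(generalize_zip("78256"))  # 7825*
print(generalize_age(28))       # [20, 30]
```

Unlike perturbation, the output is never false, only coarser, which is why generalization preserves semantics at the cost of precision.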
6
PPDP Example of Generalization
7
Some Existing Privacy Principles (generalization-based)
- Categorical SA: k-anonymity; l-diversity, (α, k)-anonymity, m-invariance, ...; (c, k)-safety, Skyline-privacy, ...
- Numerical SA: (k, e)-anonymity, Variance Control; t-closeness; δ-presence; ...
8
Next…
- What is PPDP
- Existing Privacy Principles
- Proximity Attack
- (ε, m)-anonymity
- Determine ε and m
- Algorithm
- Experiments and Conclusion
9
Proximity Attack
Even when the exact sensitive value is hidden by generalization, an adversary may still infer that an individual's SA value falls within a short interval with high probability.
10
(ε, m)-anonymity
I(t): the private neighborhood of tuple t
- absolute: I(t) = [t.SA − ε, t.SA + ε]
- relative: I(t) = [t.SA · (1 − ε), t.SA · (1 + ε)]
P(t): the risk of a proximity breach for tuple t
- P(t) = x / |G|, where G is t's equivalence class and x is the number of tuples in G whose SA values fall in I(t)
11
(ε, m)-anonymity: example
With t1.SA = 1000 and ε = 20 (absolute): I(t1) = [980, 1020]; x = 3 and |G| = 4, so P(t1) = 3/4.
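The breach probability is easy to compute directly. In the sketch below, only t1.SA = 1000, ε = 20, and the result P(t1) = 3/4 come from the slide; the other three SA values in G are hypothetical, chosen so that exactly three values fall inside I(t1):

```python
def neighborhood(sa: float, eps: float, relative: bool = False):
    # Private neighborhood I(t): absolute [sa - eps, sa + eps],
    # or relative [sa * (1 - eps), sa * (1 + eps)]
    if relative:
        return sa * (1 - eps), sa * (1 + eps)
    return sa - eps, sa + eps

def breach_prob(t_sa: float, group_sa: list, eps: float,
                relative: bool = False) -> float:
    # P(t) = x / |G|, with x the number of SA values of G inside I(t)
    lo, hi = neighborhood(t_sa, eps, relative)
    x = sum(1 for v in group_sa if lo <= v <= hi)
    return x / len(group_sa)

# Hypothetical group: 985 and 1010 fall inside I(t1) = [980, 1020], 1500 does not
G = [1000, 985, 1010, 1500]
print(breach_prob(1000, G, eps=20))  # 0.75, i.e. P(t1) = 3/4
```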
12
(ε, m)-anonymity: the principle
Given a real value ε and an integer m ≥ 1, a generalized table T* fulfills absolute (relative) (ε, m)-anonymity if P(t) ≤ 1/m for every tuple t ∈ T.
Larger ε and m mean a stricter privacy requirement.
13
(ε, m)-anonymity: what is the meaning of m?
|G| ≥ m. The best situation is that for any two tuples t_i and t_j in G, t_j.SA ∉ I(t_i) and t_i.SA ∉ I(t_j).
This is similar to l-diversity when the equivalence class has l tuples with distinct SA values.
14
(ε, m)-anonymity: how to keep t_j.SA out of I(t_i)?
Sort all tuples in G in ascending order of their SA values. Let left(t, G) be the tuples of G (t included) whose SA lies in [t.SA − ε, t.SA], and right(t, G) those whose SA lies in [t.SA, t.SA + ε]. Then for j > i it suffices that
| j − i | ≥ max{ |left(t_j, G)|, |right(t_i, G)| }
15
(ε, m)-anonymity
Let maxsize(G) = max_{t ∈ G} max{ |left(t, G)|, |right(t, G)| }. Then | j − i | ≥ maxsize(G) is a sufficient condition.
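A sketch of maxsize(G) under absolute neighborhoods. One assumption to flag: |left(t, G)| and |right(t, G)| are counted with t itself included, which is the reading that reproduces g = maxsize(G) = 2 in the example two slides below:

```python
def left_size(sa: float, group_sa: list, eps: float) -> int:
    # Tuples of G whose SA lies in [sa - eps, sa]; t itself is counted
    return sum(1 for v in group_sa if sa - eps <= v <= sa)

def right_size(sa: float, group_sa: list, eps: float) -> int:
    # Tuples of G whose SA lies in [sa, sa + eps]
    return sum(1 for v in group_sa if sa <= v <= sa + eps)

def maxsize(group_sa: list, eps: float) -> int:
    return max(max(left_size(v, group_sa, eps),
                   right_size(v, group_sa, eps)) for v in group_sa)

print(maxsize([10, 20, 25, 30], eps=6))  # 2
```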
16
(ε, m)-anonymity: partitioning
- Sort the tuples of G in ascending order of their SA values
- Hash the i-th tuple into the j-th bucket using j = (i mod maxsize(G)) + 1
Thus all tuples (SA values) in the same bucket fall outside each other's neighborhoods.
17
(ε, m)-anonymity: a (6, 2)-anonymity example

tupleNo  QI  SA
1        q   10
2        q   20
3        q   25
4        q   30

Privacy is breached: P(t3) = 3/4 > 1/m = 1/2, so partitioning is needed. The tuples are already in ascending SA order; g = maxsize(G) = 2 and j = (i mod 2) + 1. After partitioning, the new P(t3) = 1/2.
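The partitioning rule and this example fit into a few lines, reusing the maxsize sketch above:

```python
def partition(group_sa: list, eps: float) -> list:
    # Sort ascending, then send the i-th tuple (1-based) into bucket
    # j = (i mod g) + 1 with g = maxsize(G); buckets are 0-based below
    ordered = sorted(group_sa)
    g = maxsize(ordered, eps)
    buckets = [[] for _ in range(g)]
    for i, v in enumerate(ordered, start=1):
        buckets[i % g].append(v)
    return buckets

print(partition([10, 20, 25, 30], eps=6))  # [[20, 30], [10, 25]]
```

Bucket {20, 30} and bucket {10, 25} each keep their two SA values more than ε = 6 apart, so x = 1 for every tuple and P(t) = 1/2 = 1/m.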
18
Determine ε and m
Given ε and m, check whether an equivalence class G can satisfy (ε, m)-anonymity.
Theorem: G has at least one (ε, m)-anonymous generalization iff m · maxsize(G) ≤ |G|.
Scan the sorted tuples in G to find maxsize(G), then predict whether G can be partitioned or not.
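Assuming the theorem's condition is m · maxsize(G) ≤ |G| (consistent with the four-tuple example: m = 2, maxsize(G) = 2, |G| = 4), the feasibility test is one line:

```python
def can_partition(group_sa: list, eps: float, m: int) -> bool:
    # The smallest bucket holds floor(|G| / g) tuples, so P(t) = 1/|bucket|
    # stays <= 1/m for every tuple exactly when m * maxsize(G) <= |G|
    return m * maxsize(group_sa, eps) <= len(group_sa)
```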
19
Algorithm. Step 1: Splitting (Mondrian, ICDE 2006)
Splitting is based only on the QI attributes: iteratively find the median of the frequency set on one selected QI dimension to cut G into G1 and G2, making sure both G1 and G2 remain legal to partition.
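A sketch of this splitting step. The record shape (a dict with a qi tuple and an sa value) and the plain median cut are assumptions for illustration; the real Mondrian cut works on frequency sets. can_partition is the test above:

```python
def split(group, eps, m, n_qi):
    # Try each QI dimension in turn; cut at the median and recurse only if
    # both halves can still be partitioned into legal (eps, m) buckets
    for dim in range(n_qi):
        ordered = sorted(group, key=lambda t: t["qi"][dim])
        mid = len(ordered) // 2
        g1, g2 = ordered[:mid], ordered[mid:]
        if g1 and g2 and all(
            can_partition([t["sa"] for t in half], eps, m)
            for half in (g1, g2)
        ):
            return split(g1, eps, m, n_qi) + split(g2, eps, m, n_qi)
    return [group]  # no legal cut on any dimension: stop splitting
```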
20
Algorithm: splitting under (6, 2)-anonymity (figure: a group with SA values 10, 20, 25, 30, 40, 50 being split)
21
Algorithm. Step 2: Partitioning
After Step 1 stops, check every group G produced by splitting: release G directly if it satisfies (ε, m)-anonymity; otherwise partition it and release the resulting buckets.
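The two steps compose into a small publishing loop; a sketch built from the helpers above:

```python
def publish(table, eps, m, n_qi):
    released = []
    for G in split(table, eps, m, n_qi):
        sa = [t["sa"] for t in G]
        if all(breach_prob(v, sa, eps) <= 1 / m for v in sa):
            released.append(sa)                  # already (eps, m)-anonymous
        else:
            released.extend(partition(sa, eps))  # release its buckets instead
    return released
```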
22
Algorithm: partitioning under (6, 2)-anonymity (figure: the SA values 10, 20, 25, 30, 40, 50 hashed into buckets)
23
Next…
- What is PPDP
- Existing Privacy Principles
- Proximity Attack
- (ε, m)-anonymity
- Determine ε and m
- Algorithm
- Experiments and Conclusion
24
Experiments
Real database SAL (http://ipums.org). Attributes: Age, Birthplace, Occupation, and Income, with domains [16, 93], [1, 710], [1, 983], and [1k, 100k], respectively; 500K tuples.
Compared against a perturbation method (OLAP, SIGMOD 2005).
25
Experiments - Utility
Utility is measured with count queries, using a workload of 1000 queries.
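A hedged sketch of such a utility measure: random range-count queries answered from the released buckets under a uniformity assumption, with relative error as the metric. The query ranges, the (lo, hi, count) bucket summary, and the median-error statistic are all assumptions, not the paper's exact setup:

```python
import random

def estimated_count(buckets, lo, hi):
    # A released bucket is summarized as (range_lo, range_hi, count); assume
    # SA values spread uniformly over the range and credit the overlap fraction
    est = 0.0
    for b_lo, b_hi, cnt in buckets:
        if b_hi > b_lo:
            overlap = max(0.0, min(hi, b_hi) - max(lo, b_lo))
            est += cnt * overlap / (b_hi - b_lo)
        elif lo <= b_lo <= hi:  # degenerate bucket covering a single value
            est += cnt
    return est

def median_relative_error(true_sa, buckets, workload=1000):
    errors = []
    for _ in range(workload):
        lo = random.uniform(1_000, 90_000)  # Income domain is [1k, 100k]
        hi = lo + 10_000
        actual = sum(1 for v in true_sa if lo <= v <= hi)
        if actual:
            errors.append(abs(estimated_count(buckets, lo, hi) - actual) / actual)
    return sorted(errors)[len(errors) // 2]
```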
26
Experiments - Utility
27
Experiments - Efficiency
28
Conclusion
- Discussed most of the existing privacy principles in PPDP
- Identified the proximity attack and proposed (ε, m)-anonymity to prevent it
- Verified experimentally that the method is effective and efficient
29
Any Questions?