Published by Paula Ford. Modified over 9 years ago.
1
Personalized Privacy Preservation: Beyond k-Anonymity and ℓ-Diversity
SIGMOD 2006
Presented by Hongwei Tian
2
Outline
What is privacy breaching?
Drawbacks of k-anonymity and ℓ-diversity
Personalized anonymity
How does the adversary attack?
How does the data owner defeat the attacks?
Experiments
3
What is Privacy Breaching?
There are two main classes of definitions. The first class compares the adversary's prior belief with the posterior belief.
Prior belief < posterior belief: the release helps the adversary. Example: belief in "Bob has cancer" rises from 50% to 80%.
Prior belief > posterior belief: is it reasonable to treat this as a breach? Example: belief in "Bob has cancer" falls from 80% to 50%. I don't think so, and many others share this view.
4
What is Privacy Breaching?
The second class considers only the posterior belief.
k-anonymity: posterior belief ≤ 1/k. Every QI group contains at least k tuples.
ℓ-diversity: posterior belief = p ≤ threshold, where p is the fraction of a QI group's tuples that fall in the group's largest sub-group.
Personalized: posterior belief = P_breach ≤ threshold, where P_breach is the breach probability.
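The first two bounds can be checked mechanically on a toy table. The sketch below is only an illustration: the function names and the example rows are hypothetical, and it assumes that the "largest sub-group" is the set of tuples sharing the most frequent sensitive value in a QI group.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def group_stats(rows):
    """Map each QI group to (group size, fraction p in its largest sub-group).

    rows is an iterable of (qi_group_key, sensitive_value) pairs.
    """
    groups = defaultdict(list)
    for key, sensitive_value in rows:
        groups[key].append(sensitive_value)
    stats = {}
    for key, values in groups.items():
        # Count of the most frequent sensitive value in this group.
        largest = Counter(values).most_common(1)[0][1]
        stats[key] = (len(values), Fraction(largest, len(values)))
    return stats


def is_k_anonymous(rows, k):
    """Every QI group holds at least k tuples."""
    return all(size >= k for size, _ in group_stats(rows).values())


def satisfies_fraction_bound(rows, threshold):
    """In every QI group, p does not exceed the threshold."""
    return all(p <= threshold for _, p in group_stats(rows).values())


# Hypothetical toy table: (QI group, sensitive value).
rows = [
    ("g1", "flu"), ("g1", "flu"), ("g1", "cold"),
    ("g2", "flu"), ("g2", "cancer"),
]
print(group_stats(rows))
print(is_k_anonymous(rows, 2), satisfies_fraction_bound(rows, Fraction(1, 2)))
```

In this toy table every group has at least two tuples, yet group g1 places two of its three tuples in one sub-group, so a table can pass the size check while failing the fraction check.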
5
Drawbacks of k-Anonymity and ℓ-Diversity
A k-anonymous table only prevents the association between individuals and tuples.
ℓ-diversity and the personalized method both prevent the association between individuals and sensitive values.
6
Drawbacks of k-Anonymity and ℓ-Diversity
A k-anonymous table may lose a considerable amount of information. ℓ-diversity has the same problem.
7
Drawbacks of k-Anonymity and ℓ-Diversity
Consider this situation: all tuples in one QI group come from the same individual v, and the adversary learns from external datasets that v is the only individual in this QI group.
Bob: "I am Bob, and I am unlucky to have so many diseases."
Adversary: "Bob must be here. Aha, I know Bob has four diseases."
8
Drawbacks of k-Anonymity and ℓ-Diversity
Neither takes personal anonymity requirements into account.
9
Personalized Anonymity
Personalized anonymity: a person can specify the degree of privacy protection for her or his sensitive values.
So far, the literature has focused on a universal approach that applies the same amount of privacy preservation to all persons, without catering for their concrete needs.
10
Personalized Anonymity
11
Breach Probability
For a tuple t ∈ T, its breach probability P_breach(t) equals the probability that an adversary can infer from T* that any of the associations {o, v_1}, ..., {o, v_x} exists in T, where o is the individual corresponding to t and v_1, ..., v_x are the leaf values in SUBTR(t.GN), the subtree rooted at the guarding node of t.
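To make the set v_1, ..., v_x concrete, the sketch below collects the leaf values under a guarding node. The taxonomy, its node names, and the function `subtree_leaves` are hypothetical illustrations, not taken from the slides.

```python
# Hypothetical taxonomy of sensitive values: parent -> children.
# Nodes that do not appear as keys are leaves.
TAXONOMY = {
    "any illness": ["respiratory infection", "digestive disorder"],
    "respiratory infection": ["flu", "pneumonia", "bronchitis"],
    "digestive disorder": ["gastric ulcer", "dyspepsia"],
}


def subtree_leaves(node, taxonomy=TAXONOMY):
    """Return the leaf values in the subtree rooted at node (SUBTR(node))."""
    children = taxonomy.get(node, [])
    if not children:
        return [node]  # a leaf is its own subtree
    leaves = []
    for child in children:
        leaves.extend(subtree_leaves(child, taxonomy))
    return leaves


# A guarding node of "respiratory infection" protects all three leaves below it.
print(subtree_leaves("respiratory infection"))
```

Under these assumptions, a person whose guarding node is an inner node asks that none of the leaf values below it be attributable to her or him, while a guarding node that is itself a leaf protects only that single value.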
12
Personalized Anonymity: Breach Probability
Both the data owner and the adversary can compute it.
The data owner wants P_breach(t) < threshold, so that the privacy of the individual corresponding to t is preserved.
The adversary hopes for P_breach(t) > threshold, which breaches the privacy of that individual.
How does the adversary attack?
13
How the Adversary Attacks
The adversary knows that each individual has at most one tuple (primary case).
Possible reconstructions: P(5,4) × 3 × 3 = 1080
Breaching reconstructions: 2 × P(4,3) × 3 × 3 = 432
P_breach(t) = 432/1080 = 2/5
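The primary-case counts can be reproduced by brute-force enumeration. The sketch below rests on an assumed reading of the numbers, which the slide text does not spell out: 5 candidate individuals, a QI group of 4 tuples with distinct owners, 2 tuples whose ownership by the target would be a breach, and two generalized sensitive values with 3 possible leaves each. All names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

CANDIDATES = range(5)      # assumed: 5 candidate individuals; 0 is the target o
NUM_TUPLES = 4             # assumed: the QI group holds 4 tuples
BREACHING_TUPLES = (0, 1)  # assumed: owning tuple 0 or 1 constitutes a breach
LEAF_CHOICES = 3 * 3       # assumed: two generalized values, 3 leaves each


def primary_case_counts():
    """Count all reconstructions and the breaching ones (distinct owners)."""
    total = breaching = 0
    # Each tuple gets a different owner: ordered selections of 4 out of 5.
    for owners in permutations(CANDIDATES, NUM_TUPLES):
        total += LEAF_CHOICES
        if any(owners[i] == 0 for i in BREACHING_TUPLES):
            breaching += LEAF_CHOICES
    return total, breaching


total, breaching = primary_case_counts()
print(total, breaching, Fraction(breaching, total))
```

The factor 3 × 3 appears in both counts, so it cancels in the ratio; the probability is driven by how often the target owns one of the two breaching tuples.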
14
How the Adversary Attacks
The adversary knows that one individual may have multiple tuples (non-primary case).
Possible reconstructions: 5^4 × 3 × 3 = 5625
Breaching reconstructions: 2 × 5^3 × 3 × 3 − 5^2 × 3 × 3 = 2025
P_breach(t) = 2025/5625 = 9/25
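The non-primary counts can be checked the same way, now letting one individual own several tuples. As before, this is a sketch under an assumed reading of the numbers (5 candidate individuals, 4 tuples, 2 breaching tuples, two generalized values with 3 leaves each), and the names are hypothetical.

```python
from fractions import Fraction
from itertools import product

CANDIDATES = range(5)      # assumed: 5 candidate individuals; 0 is the target o
NUM_TUPLES = 4             # assumed: the QI group holds 4 tuples
BREACHING_TUPLES = (0, 1)  # assumed: owning tuple 0 or 1 constitutes a breach
LEAF_CHOICES = 3 * 3       # assumed: two generalized values, 3 leaves each


def non_primary_case_counts():
    """Count all reconstructions and the breaching ones (owners may repeat)."""
    total = breaching = 0
    # Each tuple independently belongs to any of the 5 candidates.
    for owners in product(CANDIDATES, repeat=NUM_TUPLES):
        total += LEAF_CHOICES
        if any(owners[i] == 0 for i in BREACHING_TUPLES):
            breaching += LEAF_CHOICES
    return total, breaching


total, breaching = non_primary_case_counts()
print(total, breaching, Fraction(breaching, total))
```

The subtracted term 5^2 × 3 × 3 in the slide's formula is the inclusion-exclusion correction: reconstructions in which the target owns both breaching tuples would otherwise be counted twice.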
15
How the Data Owner Defeats the Attacks
The formal computation of P_breach(t) for the primary case and the non-primary case.
[Formulas not preserved. Surviving labels: "Overlap", "disjoint"; example parameters: n=5, b=2 and n=2, b=2, c=1/3.]
16
How the Data Owner Defeats the Attacks
Utility measure: information loss
17
How the Data Owner Defeats the Attacks: Algorithm Picture
[Diagram labels: Table; Split; Group 1 … Group N; SA-Generalization; New Table; replace if the new table has more utility.]
18
How the Data Owner Defeats the Attacks: Algorithm
Start with all QI values generalized to the roots of their hierarchies and with the SA values generalized so that every P_breach(t) < threshold.
Split the QI attributes top-down to increase utility.
"Single split" means that each step splits only one attribute into its direct children.
SA-generalization guarantees that every P_breach(t) < threshold in every QI group, so the whole table preserves privacy.
Each iteration looks for a split that, after SA-generalization, increases utility; if no such split exists, the algorithm stops.
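The control flow described above can be sketched as a greedy loop. This is a schematic illustration, not the authors' implementation: `candidate_splits`, `sa_generalize`, and `utility` are hypothetical callbacks, and the toy instantiation below stands in for real tables by tracking only the generalization level of two QI attributes.

```python
def greedy_top_down(initial, candidate_splits, sa_generalize, utility):
    """Apply the best single split repeatedly until none increases utility."""
    current = sa_generalize(initial)  # the starting table must already be private
    while True:
        best, improved = current, False
        for split in candidate_splits(current):
            repaired = sa_generalize(split)  # restore privacy after the split
            if utility(repaired) > utility(best):
                best, improved = repaired, True
        if not improved:
            return current  # no split increases utility: stop
        current = best


# Toy instantiation (hypothetical): a state holds one level per QI attribute,
# where level 0 is the root of that attribute's hierarchy.
MAX_DEPTH = (2, 1)


def toy_splits(levels):
    """Single splits: refine exactly one attribute by one level."""
    for i, level in enumerate(levels):
        if level < MAX_DEPTH[i]:
            yield levels[:i] + (level + 1,) + levels[i + 1:]


def toy_sa_generalize(levels):
    """Placeholder: the toy state has no SA column to repair."""
    return levels


def toy_utility(levels):
    """Toy score: finer QI values help until they force a heavy SA penalty."""
    detail = sum(levels)
    return detail - 3 if detail > 2 else detail


print(greedy_top_down((0, 0), toy_splits, toy_sa_generalize, toy_utility))
```

The toy utility penalizes the finest state to mimic the trade-off on the slide: a further split is rejected when the SA-generalization it would require costs more utility than the split gains.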
19
How the Data Owner Defeats the Attacks: Algorithm
Generalize SA values bottom-up to improve privacy.
If the tuples in S_prob satisfy the privacy requirement, then all tuples in G satisfy it.
The SA values of all tuples that violate the privacy requirement are generalized to the parent of the guarding node that is closest to the root.
Finish when no tuple in S_prob violates the privacy requirement, or when no further generalization is possible.
20
Experiments
Adult dataset (http://www.ipums.org)
5 QI attributes, 1 SA attribute
Pri-leaf, Pri-mixed, Nonpri-leaf, Nonpri-mixed
Breach threshold = 0.25
All attribute weights = 1
21
Experiments: Breach Probability
22
References
Xiaokui Xiao and Yufei Tao. Personalized Privacy Preservation. In SIGMOD, 2006.
A. Machanavajjhala, J. Gehrke, and D. Kifer. l-diversity: Privacy beyond k-anonymity. In ICDE, 2006.
K. LeFevre, D. J. DeWitt, and R. Ramakrishnan. Incognito: Efficient full-domain k-anonymity. In SIGMOD, 2005.
A. Evfimievski, J. Gehrke, and R. Srikant. Limiting privacy breaching in privacy preserving data mining. In ACM Symposium on Principles of Database Systems, 2003.