PRIVACY CRITERIA
Roadmap
- Privacy in data mining
- Mobile privacy
- (k, e)-anonymity
- (c, k)-safety
- Privacy skyline
Privacy in data mining
- Random perturbation (quantitative data)
  - Given a value x, return x + r, where r is a random value drawn from a known distribution
  - Construct a decision-tree classifier on the perturbed data such that its accuracy is comparable to classifiers built on the original data
- Randomized response (categorical data)
  - Basic idea: disguise the data by probabilistically changing the value of the sensitive attribute to another value
  - The distribution of the original data can be reconstructed from the disguised data
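As an illustration of the randomized-response idea above, here is a minimal sketch for a single binary sensitive attribute; the retention probability p, the sample size, and the specific reconstruction formula are assumptions made for the example, not taken from the slides.

```python
import numpy as np

# Randomized response sketch: each respondent reports the true bit with
# probability p and the flipped bit otherwise (p = 0.7 is an assumed value).
p = 0.7
rng = np.random.default_rng(0)

true_values = rng.integers(0, 2, size=10_000)        # original sensitive bits
keep = rng.random(10_000) < p
disguised = np.where(keep, true_values, 1 - true_values)

# Reconstruction: E[fraction of 1s observed] = p*f + (1-p)*(1-f), where f is the
# true fraction of 1s, so f can be estimated from the disguised data alone.
observed = disguised.mean()
estimated_f = (observed - (1 - p)) / (2 * p - 1)
print(f"true fraction: {true_values.mean():.3f}  estimated: {estimated_f:.3f}")
```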
Roadmap
- Privacy in data mining
- Mobile privacy
- (k, e)-anonymity
- (c, k)-safety
- Privacy skyline
Mobile privacy
- Spatial cloaking
  - Cloaked region: contains the query location q and at least k-1 other user locations
  - Circular region around q: contains q plus a number of dummy locations generated by the client
- Transformation-based matching: transform the region through Hilbert curves, using Hilbert keys
- Casper: the user registers with a (k, A_min) profile
  - k: the user is k-anonymous
  - A_min: the minimum acceptable resolution (area) of the cloaked spatial region
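A minimal sketch of the cloaking step, assuming a simple greedy square expansion rather than Casper's actual grid-based algorithm; the user positions, step size, and profile values below are invented for the example.

```python
import random

# Grow a square region around the query location q until it covers at least
# k-1 other users (q itself makes k) and its area reaches the minimum
# acceptable resolution A_min from the user's (k, A_min) profile.
def cloak(q, users, k, a_min, step=1.0):
    half = step
    while True:
        x_lo, y_lo, x_hi, y_hi = q[0] - half, q[1] - half, q[0] + half, q[1] + half
        others = sum(1 for (x, y) in users if x_lo <= x <= x_hi and y_lo <= y <= y_hi)
        if others >= k - 1 and (2 * half) ** 2 >= a_min:
            return (x_lo, y_lo, x_hi, y_hi)
        half += step

random.seed(1)
users = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(200)]
print(cloak((50.0, 50.0), users, k=10, a_min=25.0))   # cloaked rectangle sent to the server
```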
Roadmap
- Privacy in data mining
- Mobile privacy
- (k, e)-anonymity
- (c, k)-safety
- Privacy skyline
(k, e)-anonymity
- Privacy protection for numerical sensitive attributes
- Goal: group the sensitive attribute values such that each group has
  - no fewer than k distinct values
  - a range larger than threshold e
- Uses a permutation-based technique to support aggregate queries
- Constructs a help table
- Reference: "Aggregate Query Answering on Anonymized Tables" @ ICDE 2007
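A minimal sketch of the grouping, permutation, and help-table idea, assuming a simple greedy grouping over sorted values rather than the paper's algorithm; the salary data and parameter values are invented for the example.

```python
import random

# Greedy (k, e) grouping sketch: close a group once it holds at least k distinct
# sensitive values spanning a range larger than e, then permute the sensitive
# values within each group and record each group's range in a help table.
def k_e_anonymize(records, k, e):
    records = sorted(records, key=lambda r: r[1])        # (person, salary) pairs
    groups, current = [], []
    for rec in records:
        current.append(rec)
        values = {v for _, v in current}
        if len(values) >= k and max(values) - min(values) > e:
            groups.append(current)
            current = []
    if current:                                          # fold leftovers into the last group
        if groups:
            groups[-1].extend(current)
        else:
            groups.append(current)
    published, help_table = [], []
    for gid, grp in enumerate(groups):
        vals = [v for _, v in grp]
        random.shuffle(vals)                             # permutation within the group
        published += [(person, gid, v) for (person, _), v in zip(grp, vals)]
        help_table.append((gid, min(vals), max(vals), len(vals)))
    return published, help_table

random.seed(0)
data = [(f"p{i}", random.randrange(20_000, 90_000, 1_000)) for i in range(12)]
published, help_table = k_e_anonymize(data, k=3, e=5_000)
print(help_table)   # (group id, min, max, count) rows used to answer aggregate queries
```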
(k, e)-anonymity example: an original table and the same table after permutation
(k, e)-anonymity example: the table after permutation and the corresponding help table
Roadmap
- Privacy in data mining
- Mobile privacy
- (k, e)-anonymity
- (c, k)-safety
- Privacy skyline
(c, k)-safety
- Goal: quantify the background knowledge k of the attacker
  - the maximum disclosure w.r.t. k must be less than threshold c
- Background knowledge is expressed through a language
- Reference: "Worst-Case Background Knowledge for Privacy-Preserving Data Publishing" @ ICDE 2007
(c, k)-safety
- Create buckets and randomly permute the sensitive attribute values within each bucket
- Example: original table and the bucketized table
(c, k)-safety
- Bound the background knowledge, i.e., the attacker knows at most k basic implications
- Atom: t_p[S] = s, where s ∈ S and p ∈ Person
  - e.g., t_Jack[Disease] = flu
- Basic implication: (A_1 ∧ ... ∧ A_m) → (B_1 ∨ ... ∨ B_n), for some m, n, where the A_i and B_i are atoms
  - e.g., t_Jack[Disease] = flu → t_Charlie[Disease] = flu
- L_k is the language consisting of conjunctions of k basic implications
(c, k)-safety
- Find a bucketization B of the original table such that B is (c, k)-safe
  - i.e., the maximum disclosure of B w.r.t. L_k is less than threshold c
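A brute-force sketch of checking disclosure for a tiny bucketized table; the possible-worlds enumeration, the tuple encoding of implications, and the example data are assumptions made for illustration, not the paper's algorithm.

```python
from itertools import permutations, product

# Each bucket hides which sensitive value belongs to which person. Disclosure is
# the attacker's maximum posterior Pr(t[S] = s) over the within-bucket assignments
# ("possible worlds") that are consistent with the background knowledge, given
# here as implications (p1, s1, p2, s2) meaning "t_p1[S] = s1 -> t_p2[S] = s2".
def max_disclosure(buckets, implications):
    per_bucket = [sorted(set(permutations(vals))) for _, vals in buckets]
    worlds = []
    for combo in product(*per_bucket):
        world = {}
        for (persons, _), assignment in zip(buckets, combo):
            world.update(dict(zip(persons, assignment)))
        if all(world[p1] != s1 or world[p2] == s2 for p1, s1, p2, s2 in implications):
            worlds.append(world)
    disclosure = 0.0                     # assumes at least one consistent world exists
    for person in worlds[0]:
        for value in {w[person] for w in worlds}:
            pr = sum(w[person] == value for w in worlds) / len(worlds)
            disclosure = max(disclosure, pr)
    return disclosure

buckets = [(("Jack", "Charlie", "Ann"), ("flu", "cancer", "AIDS")),
           (("Bob", "Cary", "Frank"), ("flu", "flu", "cancer"))]
knowledge = [("Jack", "flu", "Charlie", "flu")]   # t_Jack[S]=flu -> t_Charlie[S]=flu
print(max_disclosure(buckets, knowledge))         # bucketization is (c, k)-safe if this is below c
```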
Roadmap
- Privacy in data mining
- Mobile privacy
- (k, e)-anonymity
- (c, k)-safety
- Privacy skyline
Privacy skyline
- The original data is transformed into generalized or bucketized data
- External knowledge is quantified through a skyline for each sensitive value
- External knowledge is modeled for each individual
  - having a single sensitive value
  - having multiple sensitive values
- Reference: "Privacy Skyline: Privacy with Multidimensional Adversarial Knowledge" @ VLDB 2007
Privacy skyline
- Three types of knowledge (l, k, m), e.g. (2, 3, 1)
  - l: knowledge about the target individual t
    - e.g., flu ∉ Tom[S] and cancer ∉ Tom[S] (obtained from Tom's friend)
  - k: knowledge about individuals (u_1, ..., u_k) other than t
    - e.g., flu ∈ Bob[S] and flu ∈ Cary[S] and cancer ∈ Frank[S] (obtained from another hospital)
  - m: knowledge about the relationship between t and other individuals (v_1, ..., v_m)
    - e.g., AIDS ∈ Ann[S] → AIDS ∈ Tom[S] (because Ann is Tom's wife)
Privacy skyline
- Example: knowledge threshold (1, 5, 2) and confidence c = 50% for sensitive value AIDS
  - the adversary knows l ≤ 1 sensitive values that t does not have
  - the adversary knows the sensitive values of k ≤ 5 other individuals
  - the adversary knows m ≤ 2 members of t's same-value family
- When the above hold, the adversary cannot predict that individual t has AIDS with confidence 50%
Privacy skyline
- If the transformed data D* is safe for (1, 5, 2), it is also safe for any (l, k, m) with l ≤ 1, k ≤ 5, m ≤ 2, i.e., the shaded region of the slide's figure
Privacy skyline
- A skyline is a set of incomparable points, e.g., {(1, 1, 5), (1, 3, 4), (1, 5, 2)}
Privacy skyline
- Given a skyline {(l_1, k_1, m_1, c_1), ..., (l_r, k_r, m_r, c_r)}, a release candidate D* is safe for a sensitive value s iff, for i = 1 to r,
  max Pr(s ∈ t[S] | L_t, (l_i, k_i, m_i), D*) < c_i
- i.e., the maximum probability that sensitive value s belongs to an individual t, given the external knowledge and the release candidate, stays below the confidence threshold c_i
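A minimal sketch of this check; `worst_case_disclosure` is a hypothetical stand-in for the paper's computation of the maximum probability under knowledge bounded by (l, k, m), and the skyline and toy disclosure function below are invented for the example.

```python
from typing import Callable, Iterable, Tuple

Point = Tuple[int, int, int, float]   # (l, k, m, c)

# D* is safe for a sensitive value if, for every skyline point (l_i, k_i, m_i, c_i),
# the worst-case disclosure under adversarial knowledge bounded by (l_i, k_i, m_i)
# stays below the confidence threshold c_i.
def is_safe(skyline: Iterable[Point],
            worst_case_disclosure: Callable[[int, int, int], float]) -> bool:
    return all(worst_case_disclosure(l, k, m) < c for (l, k, m, c) in skyline)

# Toy usage: a made-up disclosure function that grows with the knowledge bounds.
skyline = [(1, 1, 5, 0.5), (1, 3, 4, 0.5), (1, 5, 2, 0.5)]
print(is_safe(skyline, lambda l, k, m: 0.05 * (l + k + m)))   # True
```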
Example: original table, generalized table, and bucketized table