1
1 Global Privacy Guarantee in Serial Data Publishing
Raymond Chi-Wing Wong (1), Ada Wai-Chee Fu (2), Jia Liu (2), Ke Wang (3), Yabo Xu (4)
(1) The Hong Kong University of Science and Technology
(2) The Chinese University of Hong Kong
(3) Simon Fraser University
(4) Sun Yat-sen University
Prepared by Raymond Chi-Wing Wong
Presented by Raymond Chi-Wing Wong
2
2 Outline
1. Sequential Releases
2. Related Work
3. Our Proposed Privacy Model
   Local Guarantee
4. Conclusion
3
3 1. Sequential Releases
Time = 1: the hospital releases the data set to the public.

Medical Data (Hospital) and Published Data (Public), both shown as:
Name     PID  Disease
Raymond  p1   Flu
Peter    p2   HIV
Mary     p3   Fever
Alice    p4   HIV
Bob      p5   Flu
John     p6   Fever

The published table satisfies some privacy requirements (e.g., m-invariance).
4
4 1. Sequential Releases
Time = 1, Time = 2: at each time the hospital releases the data set to the public. Between releases the medical data undergoes insertions, deletions and updates.

At each time, Medical Data (Hospital) and Published Data (Public) are both shown as:
Name     PID  Disease
Raymond  p1   Flu
Peter    p2   HIV
Mary     p3   Fever
Alice    p4   HIV
Bob      p5   Flu
John     p6   Fever

The published table satisfies some privacy requirements (e.g., m-invariance).
5
5 1. Sequential Releases
Time = 1, Time = 2, Time = 3: at each time the hospital releases the data set to the public. Between releases the medical data undergoes insertions, deletions and updates.

At each time, Medical Data (Hospital) and Published Data (Public) are both shown as:
Name     PID  Disease
Raymond  p1   Flu
Peter    p2   HIV
Mary     p3   Fever
Alice    p4   HIV
Bob      p5   Flu
John     p6   Fever

The published table satisfies some privacy requirements (e.g., m-invariance).
6
6 1. Sequential Releases
Time = 1, Time = 2, Time = 3: at each time, Medical Data (Hospital) and Published Data (Public) are both shown as:
Name     PID  Disease
Raymond  p1   Flu
Peter    p2   HIV
Mary     p3   Fever
Alice    p4   HIV
Bob      p5   Flu
John     p6   Fever

Problem: At the current time t, we want to generate a table which satisfies some privacy requirements (e.g., m-invariance) with respect to all published tables at any time <= t.
7
7 1. Sequential Releases
Time = 1, Time = 2, Time = 3: at each time, Medical Data (Hospital) and Published Data (Public) are both shown as:
Name     PID  Disease
Raymond  p1   Flu
Peter    p2   HIV
Mary     p3   Fever
Alice    p4   HIV
Bob      p5   Flu
John     p6   Fever

Problem: At the current time t, we want to generate a table which satisfies some privacy requirements (e.g., m-invariance) with respect to all published tables at any time <= t.

Medical Data
Name     Sex  Zipcode  Disease
Raymond  M    65001    flu
Peter    M    65002    chlamydia
Mary     F    65014    flu
Alice    F    65015    fever

Chlamydia is a sexually transmitted disease (STD).

Privacy Requirement: Peter would not want anyone to deduce with high confidence from these published data (one or more published datasets) that he has ever contracted chlamydia in the past.
8
8 1. Sequential Releases
Time = 1, Time = 2, Time = 3: at each time, Medical Data (Hospital) and Published Data (Public) are both shown as:
Name     PID  Disease
Raymond  p1   Flu
Peter    p2   HIV
Mary     p3   Fever
Alice    p4   HIV
Bob      p5   Flu
John     p6   Fever

Problem: At the current time t, we want to generate a table which satisfies some privacy requirements (e.g., m-invariance) with respect to all published tables at any time <= t.

Medical Data
Name     Sex  Zipcode  Disease
Raymond  M    65001    flu
Peter    M    65002    chlamydia
Mary     F    65014    flu
Alice    F    65015    fever

Chlamydia is a sexually transmitted disease (STD).

Privacy Requirement: Peter would not want anyone to deduce with high confidence from these published data that he has ever contracted chlamydia in the past.

Global Guarantee
Privacy Requirement: Probability that Peter is linked to chlamydia in one or more published datasets is at most a given threshold (e.g., 1/2).
9
9 1. Sequential Releases
Problem: At the current time t, we want to generate a table which satisfies some privacy requirements (e.g., m-invariance) with respect to all published tables at any time <= t.

Privacy Requirement: Peter would not want anyone to deduce with high confidence from these released data that he has ever contracted chlamydia in the past.

Global Guarantee
Privacy Requirement: Probability that Peter is linked to chlamydia in one or more published datasets is at most a given threshold (e.g., 1/2).

This global guarantee requirement seems to be quite "obvious" and "natural".
No existing work considers this global guarantee requirement.
Instead, existing works consider another requirement called local guarantee.
10
10 1. Sequential Releases
Time = 1, Time = 2, Time = 3: at each time, Medical Data (Hospital) and Published Data (Public) are both shown as:
Name     PID  Disease
Raymond  p1   Flu
Peter    p2   HIV
Mary     p3   Fever
Alice    p4   HIV
Bob      p5   Flu
John     p6   Fever

Medical Data
Name     Sex  Zipcode  Disease
Raymond  M    65001    flu
Peter    M    65002    chlamydia
Mary     F    65014    flu
Alice    F    65015    fever

Chlamydia is a sexually transmitted disease (STD).

Local Guarantee
Privacy Requirement: Probability that Peter is linked to chlamydia in each published dataset is at most a given threshold (e.g., 1/2).
Probability that Peter is linked to chlamydia in the dataset at time = 1 is at most a given threshold (e.g., 1/2).
Probability that Peter is linked to chlamydia in the dataset at time = 2 is at most a given threshold (e.g., 1/2).
Probability that Peter is linked to chlamydia in the dataset at time = 3 is at most a given threshold (e.g., 1/2).
11
11 2. Related Work
Local Guarantee
m-invariance: Xiao et al., "m-invariance: Towards Privacy Preserving Re-publication of Dynamic Datasets", SIGMOD, 2007
l-scarcity: Bu et al., "Privacy Preserving Serial Data Publishing by Role Composition", VLDB, 2008
12
12 Contribution
We are the first to propose the global guarantee requirement.
We prove that global guarantee is a stronger requirement than local guarantee.
13
13 Problem: At the current time t, we want to generate a table which satisfies some privacy requirements (e.g., m-invariance) with respect to all published tables at any time <= t.

Privacy Requirement: Peter would not want anyone to deduce with high confidence from these released data that he has ever contracted chlamydia in the past.

Global Guarantee
Privacy Requirement: Probability that Peter is linked to chlamydia in one or more published datasets is at most a given threshold (e.g., 1/2).

How can we calculate the probability?
According to the published datasets, we derive a formula based on the possible world analysis.
We skip the details.
14
14 Time = 1, Time = 2, Time = 3: at each time, Medical Data (Hospital) and Published Data (Public) are both shown as:
Name     PID  Disease
Raymond  p1   Flu
Peter    p2   HIV
Mary     p3   Fever
Alice    p4   HIV
Bob      p5   Flu
John     p6   Fever
15
15 Property
Theorem: Global guarantee is a stronger privacy requirement than local guarantee.
If the published tables satisfy global guarantee, then they satisfy local guarantee.
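The implication in the theorem can be sketched with one inequality. This is only an informal argument, not the proof from the paper; the symbols are assumptions introduced here: E_j is the event that an individual is linked to the sensitive value in the dataset published at time j, t is the current time, and r is the given threshold (e.g., 1/2).

```latex
% E_j, t and r are notation assumed for this sketch, not taken from the slides.
\[
\Pr[E_j] \;\le\; \Pr\Big[\bigcup_{i=1}^{t} E_i\Big] \;\le\; r
\qquad \text{for every } j \in \{1, \dots, t\}.
\]
```

The first inequality holds because E_j is contained in the union; the second is the global guarantee itself. The converse fails in general: in the 2-invariance example later in this talk, each release gives probability 1/2 while the probability over one or more releases is 3/4.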
16
16 Our Algorithm
How can we generate tables such that they satisfy global guarantee?
Idea: Large group size
17
17 5. Conclusion
We are the first to propose global guarantee.
Global guarantee is a stronger privacy requirement than local guarantee.
18
18 Q&A
19
19 In the following, I will elaborate on two concepts.
Local Guarantee (e.g., m-invariance)
Global Guarantee
20
20 Time = 1: the hospital releases the data set to the public.

Medical Data (Hospital)
Name     Sex  Zipcode  Disease
Raymond  M    65001    flu
Peter    M    65002    chlamydia
Mary     F    65014    flu
Alice    F    65015    fever

Published Data (Public)
Sex  Zipcode  Disease
M    65001    flu
M    65002    chlamydia
F    65014    flu
F    65015    fever

Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010
22
22 Time = 1: the hospital releases the data set to the public.

Medical Data (Hospital)
Name     Sex  Zipcode  Disease
Raymond  M    65001    flu
Peter    M    65002    chlamydia
Mary     F    65014    flu
Alice    F    65015    fever

Published Data (Public), after Generalization
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia
F    6501*    flu
F    6501*    fever

Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

Each individual is linked to "chlamydia" with probability at most 1/2 in THIS PUBLISHED TABLE.
2-diversity only focuses on ONE-TIME publishing.
2-invariance focuses on MULTIPLE-TIME publishing. It also makes use of the idea of 2-diversity.
Idea: Each individual is linked to "chlamydia" with probability at most 1/2 for each of the MULTIPLE PUBLISHED TABLES.
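The generalization step and the "at most 1/2 in this published table" claim can be checked with a few lines of Python. This is a minimal sketch, not the authors' algorithm: the helper names generalize_zip and max_link_probability are hypothetical, and the probability is assumed to be the fraction of records in a group that carry the sensitive value.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def generalize_zip(zipcode, kept_digits=4):
    """Mask trailing digits, e.g. 65002 -> 6500* (illustrative generalization)."""
    return zipcode[:kept_digits] + "*" * (len(zipcode) - kept_digits)


def max_link_probability(records, sensitive_value, kept_digits=4):
    """Largest per-group fraction of records carrying sensitive_value.

    Assumption: within a group of identical (Sex, generalized Zipcode),
    every record is equally likely to belong to any member of the group.
    """
    groups = defaultdict(list)
    for sex, zipcode, disease in records:
        groups[(sex, generalize_zip(zipcode, kept_digits))].append(disease)
    return max(
        Fraction(Counter(diseases)[sensitive_value], len(diseases))
        for diseases in groups.values()
    )


# Medical data at time = 1, as on the slide.
TIME1 = [
    ("M", "65001", "flu"),
    ("M", "65002", "chlamydia"),
    ("F", "65014", "flu"),
    ("F", "65015", "fever"),
]

print(max_link_probability(TIME1, "chlamydia"))
```

For the slide's table the male group holds one chlamydia record out of two, so the bound of 1/2 is met with equality.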
23
23 2-invariance
Time = 1: the hospital releases the data set to the public.

Medical Data (Hospital)
Name     Sex  Zipcode  Disease
Raymond  M    65001    flu
Peter    M    65002    chlamydia
Mary     F    65014    flu
Alice    F    65015    fever

Published Data (Public)
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia
F    6501*    flu
F    6501*    fever

Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

Name     Signature
Raymond  {flu, chlamydia}
Peter    {flu, chlamydia}
Mary     {flu, fever}
Alice    {flu, fever}
27
27 2-invariance
Time = 1: the hospital releases the data set to the public.

Medical Data (Hospital)
Name     Sex  Zipcode  Disease
Raymond  M    65001    flu
Peter    M    65002    chlamydia
Mary     F    65014    flu
Alice    F    65015    fever

Published Data (Public)
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia
F    6501*    flu
F    6501*    fever

Name     Signature
Raymond  {flu, chlamydia}
Peter    {flu, chlamydia}
Mary     {flu, fever}
Alice    {flu, fever}

Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

Time = 2: the hospital releases the data set to the public.

Medical Data (Hospital)
Name     Sex  Zipcode  Disease
Raymond  M    65001    chlamydia
Peter    M    65002    flu
Mary     F    65014    fever
Emily    F    65010    flu

Published Data (Public)
Sex  Zipcode  Disease
M    6500*    chlamydia
M    6500*    flu
F    6501*    fever
F    6501*    flu

Individuals at time = 2: Raymond, Peter, Mary, Emily
28
28 2-invariance
Time = 1: the hospital releases the data set to the public.

Medical Data (Hospital)
Name     Sex  Zipcode  Disease
Raymond  M    65001    flu
Peter    M    65002    chlamydia
Mary     F    65014    flu
Alice    F    65015    fever

Published Data (Public)
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia
F    6501*    flu
F    6501*    fever

Name     Signature
Raymond  {flu, chlamydia}
Peter    {flu, chlamydia}
Mary     {flu, fever}
Alice    {flu, fever}

Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

Time = 2: the hospital releases the data set to the public.

Medical Data (Hospital)
Name     Sex  Zipcode  Disease
Raymond  M    65001    chlamydia
Peter    M    65002    flu
Mary     F    65014    fever
Emily    F    65010    flu

Published Data (Public)
Sex  Zipcode  Disease
M    6500*    chlamydia
M    6500*    flu
F    6501*    fever
F    6501*    flu

Name     Signature
Raymond  {flu, chlamydia}
Peter    {flu, chlamydia}
Mary     {flu, fever}
Emily    {flu, fever}

This table satisfies 2-invariance. This is because each individual is linked to the SAME signature.
Idea of 2-invariance: Each individual is linked to the SAME signature in each published table.
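The signature idea on this slide can be expressed as a small check. The sketch below is only an illustration of that idea, not the full m-invariance definition from Xiao et al.; the function names and the (members, diseases) representation of a published group are assumptions made here.

```python
def signatures(release):
    """Map each individual to the set of diseases published for their group."""
    sig = {}
    for members, diseases in release:
        for name in members:
            sig[name] = frozenset(diseases)
    return sig


def keeps_signatures(release_a, release_b):
    """True if everyone present in both releases has the same signature in each."""
    sig_a, sig_b = signatures(release_a), signatures(release_b)
    return all(sig_a[name] == sig_b[name] for name in sig_a.keys() & sig_b.keys())


def groups_have_m_distinct(release, m):
    """True if every group lists at least m diseases, all of them distinct."""
    return all(len(d) >= m and len(set(d)) == len(d) for _, d in release)


# Groups read off the published tables on the slide.
TIME1 = [(["Raymond", "Peter"], ["flu", "chlamydia"]),
         (["Mary", "Alice"], ["flu", "fever"])]
TIME2 = [(["Raymond", "Peter"], ["chlamydia", "flu"]),
         (["Mary", "Emily"], ["fever", "flu"])]

print(keeps_signatures(TIME1, TIME2), groups_have_m_distinct(TIME2, 2))
```

Alice and Emily each appear in only one release, so only Raymond, Peter and Mary are compared across the two tables.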
29
29 2-invariance
Time = 1: the hospital releases the data set to the public.

Medical Data (Hospital)
Name     Sex  Zipcode  Disease
Raymond  M    65001    flu
Peter    M    65002    chlamydia
Mary     F    65014    flu
Alice    F    65015    fever

Published Data (Public)
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia
F    6501*    flu
F    6501*    fever

Name     Signature
Raymond  {flu, chlamydia}
Peter    {flu, chlamydia}
Mary     {flu, fever}
Alice    {flu, fever}

Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

Time = 2: the hospital releases the data set to the public.

Medical Data (Hospital)
Name     Sex  Zipcode  Disease
Raymond  M    65001    chlamydia
Peter    M    65002    flu
Mary     F    65014    fever
Emily    F    65010    flu

Published Data (Public)
Sex  Zipcode  Disease
M    6500*    chlamydia
M    6500*    flu
F    6501*    fever
F    6501*    flu

Name     Signature
Raymond  {flu, chlamydia}
Peter    {flu, chlamydia}
Mary     {flu, fever}
Emily    {flu, fever}
30
30 2-invariance
Time = 1, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia
F    6501*    flu
F    6501*    fever

Time = 2, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    chlamydia
M    6500*    flu
F    6501*    fever
F    6501*    flu

Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

2-invariance provides the local guarantee: the probability that an individual is linked to chlamydia in each of the published datasets is at most 1/2.
Why? Possible World Analysis
31
31 2-invariance
Time = 1, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia

Time = 2, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    chlamydia
M    6500*    flu

Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

2-invariance provides the local guarantee: the probability that an individual is linked to chlamydia in each of the published datasets is at most 1/2.
Why? Possible World Analysis
32
32 2-invariance
Time = 1, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia

Time = 2, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    chlamydia
M    6500*    flu

Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

2-invariance provides the local guarantee: the probability that an individual is linked to chlamydia in each of the published datasets is at most 1/2.
Why? Possible World Analysis

Possible tables for time = 1:
Sex  Zipcode  Disease        Sex  Zipcode  Disease
M    65001    flu            M    65001    chlamydia
M    65002    chlamydia      M    65002    flu

This is the possible world analysis based on the published table at time = 1 only.
33
33 2-invariance
Time = 1, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia

Time = 2, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    chlamydia
M    6500*    flu

Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

2-invariance provides the local guarantee: the probability that an individual is linked to chlamydia in each of the published datasets is at most 1/2.
Why? Possible World Analysis

Possible tables for time = 1:
Sex  Zipcode  Disease        Sex  Zipcode  Disease
M    65001    flu            M    65001    chlamydia
M    65002    chlamydia      M    65002    flu

Possible tables for time = 2:
Sex  Zipcode  Disease        Sex  Zipcode  Disease
M    65001    flu            M    65001    chlamydia
M    65002    chlamydia      M    65002    flu

This is the possible world analysis based on the published table at time = 2 only.
34
34 2-invariance
Time = 1, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia

Time = 2, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    chlamydia
M    6500*    flu

Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

2-invariance provides the local guarantee: the probability that an individual is linked to chlamydia in each of the published datasets is at most 1/2.
Why? Possible World Analysis

World 1, World 2, World 3, World 4: each world pairs one possible table for time = 1 with one possible table for time = 2. At either time, the possible tables are:
Sex  Zipcode  Disease        Sex  Zipcode  Disease
M    65001    flu            M    65001    chlamydia
M    65002    chlamydia      M    65002    flu
35
35 2-invariance
Time = 1, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia

Time = 2, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    chlamydia
M    6500*    flu

Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

2-invariance provides the local guarantee: the probability that an individual is linked to chlamydia in each of the published datasets is at most 1/2.
Why? Possible World Analysis

World 1, World 2, World 3, World 4: each world pairs one possible table for time = 1 with one possible table for time = 2. At either time, the possible tables are:
Sex  Zipcode  Disease        Sex  Zipcode  Disease
M    65001    flu            M    65001    chlamydia
M    65002    chlamydia      M    65002    flu

Each world is marked Yes or No according to whether Peter is linked to chlamydia at time = 1.
In the published data at time = 1, Prob(the second individual (i.e., Peter) is linked to chlamydia) = 2/4 = 1/2
36
36 2-invariance
Time = 1, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia

Time = 2, Published Data (Public)
Sex  Zipcode  Disease
M    6500*    chlamydia
M    6500*    flu

Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

2-invariance provides the local guarantee: the probability that an individual is linked to chlamydia in each of the published datasets is at most 1/2.
Why? Possible World Analysis

World 1, World 2, World 3, World 4: each world pairs one possible table for time = 1 with one possible table for time = 2. At either time, the possible tables are:
Sex  Zipcode  Disease        Sex  Zipcode  Disease
M    65001    flu            M    65001    chlamydia
M    65002    chlamydia      M    65002    flu

Each world is marked Yes or No according to whether Peter is linked to chlamydia at time = 2.
In the published data at time = 2, Prob(the second individual (i.e., Peter) is linked to chlamydia) = 2/4 = 1/2
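The per-release (local) probability can be reproduced by enumerating the possible tables of a single published group. This is a sketch under the assumption that every distinct assignment of the group's published diseases to its members is equally likely; local_link_probability is a hypothetical helper name, and the talk notes that the general formula is much more complicated.

```python
from fractions import Fraction
from itertools import permutations


def local_link_probability(members, diseases, person, value):
    """Fraction of possible tables of one release in which person has value.

    Assumption: each distinct assignment of the published diseases to the
    group members is an equally likely possible table.
    """
    possible_tables = set(permutations(diseases))
    idx = members.index(person)
    hits = sum(1 for table in possible_tables if table[idx] == value)
    return Fraction(hits, len(possible_tables))


GROUP = ["Raymond", "Peter"]                 # the M / 6500* group
TIME1_DISEASES = ["flu", "chlamydia"]        # published at time = 1
TIME2_DISEASES = ["chlamydia", "flu"]        # published at time = 2

print(local_link_probability(GROUP, TIME1_DISEASES, "Peter", "chlamydia"))
print(local_link_probability(GROUP, TIME2_DISEASES, "Peter", "chlamydia"))
```

Both releases give 1/2, matching the 2/4 counted over the four worlds on the slides.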
37
37 2-invariance
Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

2-invariance provides the local guarantee: the probability that an individual is linked to chlamydia in each of the published datasets is at most 1/2.

Possible World Analysis
World 1, World 2, World 3, World 4: each world pairs one possible table for time = 1 with one possible table for time = 2. At either time, the possible tables are:
Sex  Zipcode  Disease        Sex  Zipcode  Disease
M    65001    flu            M    65001    chlamydia
M    65002    chlamydia      M    65002    flu

Global Guarantee: Probability that an individual is linked to chlamydia in one or more published datasets is at most 1/2.
Prob(the second individual (i.e., Peter) is linked to chlamydia in one or more published datasets) =
38
38 2-invariance
Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

2-invariance provides the local guarantee: the probability that an individual is linked to chlamydia in each of the published datasets is at most 1/2.

Possible World Analysis
World 1, World 2, World 3, World 4: each world pairs one possible table for time = 1 with one possible table for time = 2. At either time, the possible tables are:
Sex  Zipcode  Disease        Sex  Zipcode  Disease
M    65001    flu            M    65001    chlamydia
M    65002    chlamydia      M    65002    flu

Global Guarantee: Probability that an individual is linked to chlamydia in one or more published datasets is at most 1/2.
Worlds in which Peter is linked to chlamydia at some time are marked Yes.
Prob(the second individual (i.e., Peter) is linked to chlamydia in one or more published datasets) =
41
41 2-invariance
Voter Registration List
Name     Sex  Zipcode
Raymond  M    65001
Peter    M    65002
Mary     F    65014
Alice    F    65015
Emily    F    65010

2-invariance provides the local guarantee: the probability that an individual is linked to chlamydia in each of the published datasets is at most 1/2.

Possible World Analysis
World 1, World 2, World 3, World 4: each world pairs one possible table for time = 1 with one possible table for time = 2. At either time, the possible tables are:
Sex  Zipcode  Disease        Sex  Zipcode  Disease
M    65001    flu            M    65001    chlamydia
M    65002    chlamydia      M    65002    flu

Global Guarantee: Probability that an individual is linked to chlamydia in one or more published datasets is at most 1/2.
Each world is marked Yes or No according to whether Peter is linked to chlamydia at some time.
Prob(the second individual (i.e., Peter) is linked to chlamydia in one or more published datasets) = 3/4
This value is larger than 1/2.
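The 3/4 figure can be reproduced by enumerating the worlds explicitly. The sketch below assumes, as the four worlds on the slide suggest, that a world is any pairing of one possible table per release and that all worlds are equally likely; global_link_probability is a hypothetical helper, not the paper's general formula.

```python
from fractions import Fraction
from itertools import permutations, product


def global_link_probability(releases, person, value):
    """Fraction of worlds in which person has value in at least one release.

    releases: list of (members, diseases) for the group containing person.
    Assumption: worlds are the cross product of the possible tables of each
    release, all equally likely.
    """
    per_release = []
    for members, diseases in releases:
        if person not in members:
            continue  # person absent from this release
        idx = members.index(person)
        # Disease assigned to person in each distinct possible table.
        per_release.append([t[idx] for t in set(permutations(diseases))])
    worlds = list(product(*per_release))
    hits = sum(1 for world in worlds if value in world)
    return Fraction(hits, len(worlds))


# The M / 6500* group at time = 1 and time = 2.
RELEASES = [
    (["Raymond", "Peter"], ["flu", "chlamydia"]),
    (["Raymond", "Peter"], ["chlamydia", "flu"]),
]

print(global_link_probability(RELEASES, "Peter", "chlamydia"))
```

Three of the four worlds link Peter to chlamydia at some time, so the tables satisfy the local guarantee at 1/2 but not the global guarantee.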
42
42 We illustrate with an example how we derive the probability that an individual is linked to chlamydia (for both local guarantee and global guarantee).
In fact, the general formula is much more complicated.
43
43 Theorem: Global guarantee is a stronger privacy requirement than local guarantee.
If the published tables satisfy global guarantee, then they satisfy local guarantee.
44
44 How can we generate tables such that they satisfy global guarantee?
Idea: Large group size
45
45 Time = 1: the hospital releases the data set to the public.

Medical Data (Hospital)
Name     Sex  Zipcode  Disease
Raymond  M    65001    flu
Peter    M    65002    chlamydia
Mary     F    65014    flu
Alice    F    65015    fever

Published Data (Public)
Sex  Zipcode  Disease
M/F  650**    flu
M/F  650**    chlamydia
M/F  650**    flu
M/F  650**    fever

Time = 2: the hospital releases the data set to the public.

Medical Data (Hospital)
Name     Sex  Zipcode  Disease
Raymond  M    65001    flu
Peter    M    65002    chlamydia
Mary     F    65014    fever
Emily    F    65010    flu

Published Data (Public)
Sex  Zipcode  Disease
M/F  650**    flu
M/F  650**    chlamydia
M/F  650**    fever
M/F  650**    flu

Global Guarantee
Prob(the second individual (i.e., Peter) is linked to chlamydia in one or more published datasets) = 7/16
This value is smaller than 1/2.
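The effect of the larger group can be checked with a closed form instead of enumeration. This is a sketch under a simplifying assumption not stated on the slide: the possible tables of different releases are independent, so the probability of being linked at least once is one minus the product of the per-release miss probabilities. The name link_at_least_once is hypothetical.

```python
from fractions import Fraction


def link_at_least_once(per_release_probs):
    """1 - prod(1 - p_j), assuming releases are independent (an assumption)."""
    miss = Fraction(1)
    for p in per_release_probs:
        miss *= 1 - p
    return 1 - miss


# Group of four with one chlamydia record per release: p = 1/4 each time.
large_group = link_at_least_once([Fraction(1, 4), Fraction(1, 4)])
# Group of two with one chlamydia record per release: p = 1/2 each time.
small_group = link_at_least_once([Fraction(1, 2), Fraction(1, 2)])

print(large_group, small_group)
```

With groups of four the two releases give 1 - (3/4)^2 = 7/16, while groups of two give 1 - (1/2)^2 = 3/4, which is the contrast the talk draws between global guarantee and 2-invariance.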
48
48 2-invariance (Local Guarantee)
Time = 1: the hospital releases the data set to the public.

Medical Data (Hospital)
Name     Sex  Zipcode  Disease
Raymond  M    65001    flu
Peter    M    65002    chlamydia
Mary     F    65014    flu
Alice    F    65015    fever

Published Data (Public)
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia
F    6501*    flu
F    6501*    fever

Time = 2: the hospital releases the data set to the public.

Medical Data (Hospital)
Name     Sex  Zipcode  Disease
Raymond  M    65001    flu
Peter    M    65002    chlamydia
Mary     F    65014    fever
Emily    F    65010    flu

Published Data (Public)
Sex  Zipcode  Disease
M    6500*    flu
M    6500*    chlamydia
F    6501*    fever
F    6501*    flu

Prob(the second individual (i.e., Peter) is linked to chlamydia in one or more published datasets) = 3/4
This value is larger than 1/2.