1 Dr. Xiao Qin Auburn University Spring, 2011 COMP 7370 Advanced Computer and Network Security The VectorCover Algorithm (2)
2 Minimal Distance Vectors
3 The Outlier Set and the All Set. Outliers: tuples that have fewer than k occurrences. All: the set of distinct tuples in a table.
4 Pair: (strategy, tuples). A new data structure. The strategy component represents a transformation strategy; the tuples component represents the set of tuples after applying that transformation. Strategy = distance vector.
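The two sets defined on this slide can be computed with a simple frequency count. The following is a minimal sketch, assuming the table is a list of tuples; the function name `outliers_and_all` and the sample rows are hypothetical and only serve as illustration.

```python
from collections import Counter

def outliers_and_all(table, k):
    """Return (outliers, all_tuples) for a table given as a list of tuples."""
    counts = Counter(table)
    # All: the set of distinct tuples in the table.
    all_tuples = set(counts)
    # Outliers: tuples that occur fewer than k times.
    outliers = {t for t, c in counts.items() if c < k}
    return outliers, all_tuples

# Hypothetical sample data, not taken from the lecture.
table = [("black", "02138"), ("black", "02138"), ("white", "02139")]
outliers, all_tuples = outliers_and_all(table, k=2)
```

With k = 2, the tuple that appears only once ends up in the outlier set, while both distinct tuples appear in the All set.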
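One possible way to hold such a pair in code is sketched below. The class name `Pair` and its field names follow the slide, but the concrete types (a tuple of integers for the distance vector, a frozen set for the tuples) are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

@dataclass(frozen=True)
class Pair:
    # The transformation strategy, expressed as a distance vector
    # with one entry per attribute (assumed representation).
    strategy: Tuple[int, ...]
    # The set of tuples obtained after applying that strategy.
    tuples: FrozenSet[Tuple[str, ...]]

# Hypothetical example: generalize the first attribute by one step.
pair = Pair(strategy=(1, 0), tuples=frozenset({("person", "02138")}))
```

Making the dataclass frozen keeps each pair hashable, so pairs can themselves be stored in sets or used as dictionary keys.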
5 Distance between Two Tuples
6 The VectorCover Algorithm
7 The MinGen Algorithm
9 Step 1: PT vs. PT[QI] vs.
10 Step 2: history <- [d_1, …, d_n]. Example with n = 2: E_0 -> d_1 = 0 and Z_0 -> d_2 = 0. Question: E_1 -> d_1 = ? and Z_2 -> d_2 = ? Answer: E_1 -> d_1 = 1 and Z_2 -> d_2 = 2. Use subscripts to represent generalization strategies.
11 Step 2: history <- [d_1, …, d_n]. Note: E_i and Z_j must be made specific when you implement the MinGen algorithm; you must specify your generalization strategies. For example:
12 Step 2: E_i, Z_j. With n = 2: E_0 -> d_1 = 0 and Z_0 -> d_2 = 0; E_1 -> d_1 = 1 and Z_2 -> d_2 = 2.
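To make E_i and Z_j concrete, here is a minimal sketch of two generalization strategies. It assumes, purely for illustration, that E is a categorical attribute whose level 1 collapses every value to "person" and that Z is a ZIP code whose level j masks the last j digits; the slides do not fix these choices, and the function names `generalize_e` and `generalize_z` are hypothetical.

```python
def generalize_e(value, level):
    """E_0 keeps the value; E_1 maps every value to 'person' (assumed hierarchy)."""
    return value if level == 0 else "person"

def generalize_z(value, level):
    """Z_j replaces the last j characters of the ZIP code with '*' (assumed hierarchy)."""
    return value if level == 0 else value[:len(value) - level] + "*" * level

# history = [d_1, d_2] = [1, 2] corresponds to the strategies E_1 and Z_2.
history = [1, 2]
row = ("black", "02138")  # hypothetical tuple
generalized = (generalize_e(row[0], history[0]), generalize_z(row[1], history[1]))
```

Each entry of the history vector is simply the subscript of the strategy currently applied to the corresponding attribute.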
13 Step 3: Check single attributes. Each single attribute must satisfy k-anonymity. E -> MGT[E]; value v = a -> freq(a, MGT[E]) = ? (Answer: 4.) If 4 < k, what does this mean? What should we do?
14 Step 3.1: Check single attributes. Each single attribute must satisfy k-anonymity. If 4 < k, then we need data generalization! V_E = [d_E, d_Z] = [1, 0], not [0, 1]. Note: move one step at a time.
15 Step 3.2: The generalize() function. Each single attribute must satisfy k-anonymity. E -> MGT[E]; value v = a -> freq(a, MGT[E]) = 4. If 4 < k, what does this mean? V_E = [d_E, d_Z] = [1, 0]; MGT <- generalize(MGT, V_E, [0, 0]).
16 Step 3.2: The generalize() function. Each single attribute must satisfy k-anonymity. MGT <- generalize(MGT, v, h). generalize() transforms MGT based on the generalization strategy specified by v and h.
17 Step 3.3: Update the history vector. Each single attribute must satisfy k-anonymity. Can you give me an example to illustrate how step 3.3 works? History [d_E, d_Z] = [0, 0]; V_E = [1, 0]; new history = [0, 0] + [1, 0] = [1, 0].
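The single-attribute check can be sketched as a frequency count over one column. The helper names `freq` and `violating_values` and the sample column below are hypothetical; the column is built so that the value a occurs 4 times, matching the slide.

```python
from collections import Counter

def freq(value, column):
    """Number of occurrences of value in one attribute column, e.g. MGT[E]."""
    return Counter(column)[value]

def violating_values(column, k):
    """Values that occur fewer than k times, i.e. that break k-anonymity."""
    counts = Counter(column)
    return sorted(v for v, c in counts.items() if c < k)

# Hypothetical column MGT[E]: 'a' occurs 4 times, 'b' occurs 6 times.
mgt_e = ["a"] * 4 + ["b"] * 6
count_a = freq("a", mgt_e)
needs_generalization = bool(violating_values(mgt_e, k=5))
```

If any value occurs fewer than k times, the attribute on its own already violates k-anonymity, so it has to be generalized.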
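A possible reading of generalize(MGT, v, h) is sketched below: the table is currently at the levels recorded in h, and v says how many further steps to take on each attribute. The slides do not spell out this interface, so the signature, the `HIERARCHIES` list, and the two hierarchy functions are all assumptions for illustration.

```python
def gen_e(value, level):
    # Assumed hierarchy for E: level 0 keeps the value, higher levels give 'person'.
    return value if level == 0 else "person"

def gen_z(value, level):
    # Assumed hierarchy for Z: mask the last `level` characters with '*'.
    return value if level == 0 else value[:len(value) - level] + "*" * level

# One hierarchy function per attribute, in column order (hypothetical).
HIERARCHIES = [gen_e, gen_z]

def generalize(mgt, v, h):
    """Move each attribute i from level h[i] to level h[i] + v[i]."""
    target = [hi + vi for hi, vi in zip(h, v)]
    return [
        tuple(f(x, lvl) for f, x, lvl in zip(HIERARCHIES, row, target))
        for row in mgt
    ]

mgt = [("black", "02138"), ("white", "02139")]  # hypothetical table
mgt = generalize(mgt, [1, 0], [0, 0])           # apply V_E = [1, 0] from history [0, 0]
```

Only the first attribute changes here, because V_E = [1, 0] moves E one step and leaves Z alone.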
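The history update is an element-wise vector addition. A minimal sketch, with the hypothetical helper name `update_history`:

```python
def update_history(history, v):
    """Add the step vector v to the history vector, entry by entry."""
    return [h + d for h, d in zip(history, v)]

history = [0, 0]                           # [d_E, d_Z] before the step
history = update_history(history, [1, 0])  # apply V_E = [1, 0]
```

After the call, the history vector records that E has been generalized one step and Z has not been generalized.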