Download presentation
Presentation is loading. Please wait.
Published byDortha McCormick Modified over 6 years ago
1
Security in Outsourcing of Association Rule Mining
2
Agenda Introduction Item mapping Algorithms and experiments Summary
3
Introduction Data mining in company Types of data mining
know about the past activities of their customers make strategic decisions Types of data mining Association rules mining Clustering Classification
4
Association rules Possible solutions: Complexity of exponential order
Apriori, FPclose, … Complexity of exponential order Required a long time for simple mining algorithm Motivated to outsource mining responsibility
5
Concerns in outsourcing
Correctness Security Privacy of records Information of the company Company DB Data Miner
6
Item mapping
7
Item mapping Substitution cipher has been widely used in encryption in text Each letter is replaced by another letter In encryption of transactions Each item is replaced by another item
8
Example bread -> 54 chocolate -> 165
What is <8, 69, 153, 756>? <cheese, fork, ice-cream, clock>
9
Security Substitution cipher is not secure Frequency analysis
Frequency analysis can be done to recover the original plain text Frequency analysis We know the relative frequencies of each letter in dictionary Find frequencies of letters in the cipher text Perform matching using both frequencies
10
Security in item substitution
Item substitution is more secure than substitution in text The number of possible mappings is much larger (|I|! where I is the entire set of items) Relative frequencies of itemset is not known There is no ways to determine whether a guess on the mappings is correct
11
Architecture Association Rules Association Rules Mappings DB DB
Transformer
12
Simple scheme (one-to-one item mapping)
T: the database I: a set of items in T D: another set of items with |D| = |I| T’: the transformed database m: I -> D : one-to-one item mapping Encryption For each transaction t in T = <x1, x2, …, xn> t’ = <m(x1), m(x2), …, m(xn)> Decryption of association rules For every association rule in T’ <a1, .., ak> => <b1, …bn> <m-1(a1), …, m-1(ak)> => <m-1(b1), …, m-1(bn)> is a rule in T
13
Privacy breach Statistical information is known by the third party miner Counts of some itemsets Number of rules in the database Number of large itemsets
14
Fake items m: I -> D : one-to-one item mapping
F: a set of items not in D For each transaction t, a subset of F is added to t’
15
Example I = {a, b}, D = {1, 2} m(a) = 1, m(b) = 2 F = {3}
T = {<a>, <b>} T’ = {<1, 3>, <2>}
16
Functions of fake items
Increase the difficulty to get a correct guess on a mapping of item (from 1/|I| to 1/(|I| + |F|)) Protect statistical information of the data owner The miner do not know the exact information
17
Better way to utilize “fake” items
A one-to-n item mapping B: a set of items m: I -> 2B Example m(a) = {1, 4, 5} m(b) = {2} m(c) = {3, 5}
18
Advantage In an ideal situation, it is even more difficulty to get a correct guess on a mapping of item One-to-one item mapping (no fake items): 1/|I| With fake items :1/(|I| + |F|) One-to-n item mapping: 1 / (2|B| - 1) It does not introduce additional overheads compared to adding fake items
19
Itemset mapping m: I -> 2B : one-to-n item mapping
M: 2I -> 2B : itemset mapping X: an itemset with (x1, x2,…, xn) M(X) = Ux in X m(x) M-1(Y) = {x | x in I and m(x) is subset of Y}
20
Example itemset mapping
m(b) = {2} m(c) = {3, 5} M(<a, b>) = <1, 2, 4, 5> M(<a, c>) = <1, 3, 4, 5> M(<b, c>) = <2, 3, 5>
21
Example inverse itemset mapping
m(b) = {2} m(c) = {3, 5} M-1(<1, 2, 4, 5>) = <a, b> M-1(<1, 3, 4, 5>) = <a, c> M-1(<1, 2, 3, 5>) is not well-defined
22
Correctness – restrictions on one-to-n mapping
m(b) = {2, 3} m(c) = {1, 3} M(<a, b>) = <1, 2, 3> M(<a, b, c>) = <1, 2, 3> Each mapping of an item must contain unique stuff
23
Admissible item mapping
m: I -> 2B m is admissible iff For all x in I m(x) – U y in I – {x} m(y) is not empty Guarantee M-1(M(X)) = X (correct decryption)
24
Security of one-to-n item mapping
Function cover M1: 2I -> 2D1 M2: 2I -> 2D2 M1 covers M2 iff for all X is a subset of I Y = M2(X) M2-1(Y ) = M1-1(Y intersect D1)
25
Function cover If M1 covers M2 Our results
Any transaction encrypted by M2 can be decrypted by using the inverse of M1 Our results A one-to-n itemset mapping built on an admissible one-to-n item mapping is covered by some one-to-one itemset mapping
26
Example transformation
{b} {c} {a, b} {a, c} {b, c} {a, b, c} T’ = {1, 4, 5} {2} {3, 5} {1, 2, 4, 5} {1, 3, 5} {2, 3, 5} {1, 2, 3, 4, 5} M: a -> {1, 4, 5} b -> {2} c -> {3, 5} M’: a -> {1} b -> {2} c -> {3}
27
Function cover Suppose a data owner encrypted his own database using a one-to-n item mapping m An adversary do not need to find m, but instead he look for a one-to-one item mapping m’ Conclusion One-to-n item mapping is not more secure than a one-to-one item mapping
28
Transaction transformation
M: 2I -> 2B : One-to-n itemset mapping F: a set of items not in B To hide B N: transaction transformation Maps from 2I to 2BUF N(t) = M(t) U E E is a random subset of B U F N-1(t’) = {x | m(x) in t’}
29
Transaction transformation
Valid N-1(N(t)) = t Complete (based on valid) s is a program to generate E outs(t) is the set of all possible outputs of s with transaction t If for all t For any E such that t’ = M(t) U E and N-1(t’) = t E in outs(t)
30
Valid and complete transaction transformation
N: valid and complete transaction transformation There does not exist a one-to-one itemset mappint M’ which covers N Cannot analyze it as a one-to-one mapping Increased difficulty in cryptanalysis
31
Example transformation
{b} {c} {a, b} {a, c} {b, c} {a, b, c} T’ = {1, 4, 5} {2, 1, 3} {3, 5, 4} {1, 2, 4, 5} {1, 3, 5} {2, 3, 5, 1} {1, 2, 3, 4, 5} M: a -> {1, 4, 5} b -> {2} c -> {3, 5} M’: a -> ?
32
Conclusion on transaction transformation
Correct results More difficult to perform cryptanalysis Overhead comparable to one-to-one item mapping with fake items Transaction transformation is a non-deterministic approach
33
Recovering association rules
Given an encrypted rule in T’ X => Y Find A = M-1(X) and B = M-1(X U Y) If both are well-defined The rule in T is A => B - A
34
Example Given 1 => 4 (rejected) 2 => 1, 5 (rejected)
b 2 c 3, 5 Given 1 => 4 (rejected) 2 => 1, 5 (rejected) 2 => 1, 3, 5 (rejected) 2 => 1, 3, 4, 5 (b => ac) 2, 3, 5 => 1, 4 (bc => a) 2, 3, 5 => bc 1, 2, 3, 4, 5 => abc
35
Algorithms
36
Algorithms Developed Generate an admissible one-to-n item mapping
Perform a transaction transformation given a transaction t and an admissible one-to-n item mapping Valid and complete
37
Experiments
38
Frequency analysis One-to-one item mapping vs one-to-n transaction transformation One-to-n mapping Limited to size 2 Each item can be mapped to at most 2 items Adversary: Genetic algorithm Automated frequency analysis tools using a random approach
39
Parameters One-to-one: 5 fake items
One-to-n: |I| items maps to |I|+5 items Two initialization strategies Good Each mapping of item has a chance to be initiated to be a correct one Random Each mapping of item is initiated totally random
41
20 items 35 items
42
Observation One-to-n requires more information to be prepared
Takes a longer time to calculate the goodness of current state Long time spent in each iteration Shows slow progress when near the optimal solution Similar progress but in fact a poorer result
43
Background knowledge Assume the adversary has an approximation from a global result of mining alpha: knows alpha% of large itemsets in original database beta: the support in the knowledge is in the range real support * (1 beta)
44
beta alpha beta alpha
45
Overhead measure Measure the overhead introduced by transaction transformation Parameters Em: expected size of a mapping of an item NE: expected size of B intersect E NF: expected size of F intersect E
46
Em = 1.6, Em = 1.6, Em = 1.4,
47
Summary The idea of substitution cipher is used in the problem of encryption of transaction database One-to-n item mapping cannot be directly applied since it is effectively a one-to-one item mapping Transaction transformation is proposed Experiments show that it is suitable for outsourcing
48
The End
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.