Anonymized Social Networks, Hidden Patterns, and Structural Steganography
Lars Backstrom, Cynthia Dwork, Jon Kleinberg
WWW 2007 – Best Paper

OUTLINE
- Problem
- Some graph theory
- Walk-Based Attack
- Cut-Based Attack
- (Semi)-Passive Attacks

PROBLEM
- Massive social network graphs exist: MySpace, Facebook, phone records, instant messaging, ...
- Social network structure is valuable
- Just removing names isn't enough (we show this)

MOTIVATION
- Privacy concerns: who talks to whom
- Economic concerns: selling to marketers
- AOL Search Data

GENERAL METHOD
- Watermark the graph so that finding the watermark allows us to find individuals
- Reveals the removed names
- Reveals edges between revealed names

WALK-BASED ATTACK
- Create a subgraph S to embed
- Desired properties of the subgraph:
  - Doesn't already exist in the graph
  - Can be easily found
  - No non-trivial automorphisms (the only edge-preserving mapping of the subgraph onto itself is the identity)
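
The "no non-trivial automorphisms" property can be checked by brute force on a small graph. The sketch below is only an illustration: the function names and the two example graphs (a seven-node tree with branches of length 1, 2 and 3, and a three-node path) are assumptions made here, not taken from the talk, and trying every permutation is only feasible for a handful of nodes.

```python
from itertools import permutations

def automorphisms(nodes, edges):
    """Return every relabeling of `nodes` that maps the edge set onto itself."""
    edge_set = {frozenset(e) for e in edges}
    found = []
    for perm in permutations(nodes):
        mapping = dict(zip(nodes, perm))
        mapped = {frozenset((mapping[u], mapping[v])) for u, v in edges}
        if mapped == edge_set:
            found.append(mapping)
    return found

def is_rigid(nodes, edges):
    """True when the identity is the only automorphism."""
    return len(automorphisms(nodes, edges)) == 1

# Hypothetical example 1: a tree whose three branches have different lengths.
SPIDER_NODES = ["c", "a1", "b1", "b2", "d1", "d2", "d3"]
SPIDER_EDGES = [("c", "a1"), ("c", "b1"), ("b1", "b2"),
                ("c", "d1"), ("d1", "d2"), ("d2", "d3")]

# Hypothetical example 2: a three-node path, which can be flipped end to end.
PATH_NODES = ["p1", "p2", "p3"]
PATH_EDGES = [("p1", "p2"), ("p2", "p3")]
```

Here `is_rigid(SPIDER_NODES, SPIDER_EDGES)` holds, while the path has a second automorphism that swaps its endpoints, so a search for it could match in two ways.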

WALK-BASED ATTACK
- Let k = (2+d) log n be the number of nodes in the subgraph
- Pick W = {w_1, ..., w_b} users to target
- Pick a unique set of nodes in the subgraph to connect to each w_i
- Pick an external degree for each x_i and create additional spurious edges
[Figure: new nodes x_1, x_2, x_3, x_4 and targeted users w_1, w_2, w_3]

WALK-BASED ATTACK
- Create the internal edges by including each edge (x_i, x_{i+1})
- Include every other internal edge independently with probability 1/2
- A theoretical result guarantees that, with high probability (w.h.p.), this subgraph doesn't already exist in G and has no non-trivial automorphisms
[Figure: new nodes x_1, x_2, x_3, x_4 and targeted users w_1, w_2, w_3]
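
The construction steps on the walk-based attack slides can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name, the `max_subset_size` cap and the fixed `seed` are assumptions introduced here, and the external-degree and spurious-edge step is left out for brevity.

```python
import random
from itertools import combinations

def build_attack_subgraph(k, targets, max_subset_size=2, seed=0):
    """Sketch: create k new nodes, their internal edges, and target links."""
    rng = random.Random(seed)
    xs = [f"x{i}" for i in range(1, k + 1)]

    # Path edges (x_i, x_{i+1}) are always present; every other pair of new
    # nodes is joined independently with probability 1/2.
    internal = set()
    for i, j in combinations(range(k), 2):
        if j == i + 1 or rng.random() < 0.5:
            internal.add((xs[i], xs[j]))

    # Each target gets its own distinct, non-empty set of new nodes.
    # Assumption: subsets are capped at max_subset_size to keep the list small.
    subsets = [frozenset(c)
               for size in range(1, max_subset_size + 1)
               for c in combinations(xs, size)]
    if len(targets) > len(subsets):
        raise ValueError("not enough distinct subsets for the targets")
    chosen = rng.sample(subsets, len(targets))
    attach = dict(zip(targets, chosen))
    return xs, internal, attach
```

For example, `build_attack_subgraph(7, ["w1", "w2", "w3"])` returns seven new node names, an edge set that always contains the path x1, x2, ..., x7, and three distinct attachment sets, one per target.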

FINDING THE SUBGRAPH
- Find all nodes whose degree equals deg(x_1)
- For each such candidate, find the neighbors whose degree equals deg(x_2)
- Repeat for x_3, ..., x_k, building a tree of partial matches
- With high probability the tree will be pruned to our embedded subgraph
[Figure: search example with nodes x_1 to x_4, w_1 to w_3 and a to e; deg(x_1) = 5, deg(x_2) = 4, deg(x_3) = 6, deg(x_4) = 7]
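
The degree-guided search can be sketched as a depth-first walk over partial matches. This is a hedged illustration rather than the authors' implementation: the function names and the toy graph `EDGES` are made up here, and the extra check that internal edges among already-matched nodes agree with the planted pattern is an assumption added to prune further (the slide itself only mentions degrees).

```python
def adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def find_subgraph(adj, degrees, internal_edges):
    """Return every node sequence matching the planted degree sequence.

    degrees[i] is the total degree chosen for x_{i+1}; internal_edges holds
    index pairs (j, i) with j < i for the edges inside the planted subgraph.
    """
    k = len(degrees)
    results = []

    def extend(path):
        i = len(path)
        if i == k:
            results.append(tuple(path))
            return
        # The first node may be anywhere; later ones must follow the path.
        candidates = adj[path[-1]] if path else list(adj)
        for v in candidates:
            if v in path or len(adj[v]) != degrees[i]:
                continue
            # Assumption: also require internal edges to earlier nodes to match.
            if all(((j, i) in internal_edges) == (path[j] in adj[v])
                   for j in range(i)):
                extend(path + [v])

    extend([])
    return results

# Hypothetical toy graph: x1-x2-x3 is the planted path, the rest is "real".
EDGES = [("x1", "x2"), ("x2", "x3"), ("x1", "w1"), ("x1", "d"), ("x1", "e"),
         ("x2", "w1"), ("x3", "w2"), ("x3", "a"), ("x3", "b"), ("x3", "c"),
         ("w1", "w2"), ("d", "e"), ("a", "b"), ("b", "c")]
```

In this toy graph the planted degrees are 4, 3 and 5, so `find_subgraph(adjacency(EDGES), [4, 3, 5], {(0, 1), (1, 2)})` explores one false branch through w1 (which also has degree 3) before pruning down to the single match x1, x2, x3.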

QUESTION
- What could we do to foil this attack?

EVALUATION
- LiveJournal (LJ) data: 4.4 million people, 77 million edges

EVALUATION
- Using 7 nodes the attack succeeds w.h.p.
- Can attack nodes and ~ edges
- Our subgraph is not 'obvious' in the graph without the degree sequence

CUT-BASED ATTACK
- Requires O(√(log n)) nodes instead of O(log n) (matching the theoretical lower bound)
- Create a subgraph H in a similar manner
- Each x_i connects to one w_i
- Use min-cut methods to find H
- Walk-based attack is better in practice: the cut-based subgraph is highly disconnected from the rest of the graph, so it sticks out
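
To get a feel for the two bounds, the sketch below plugs in the LiveJournal graph size from the evaluation slide. It rests on assumptions made here: the logarithm is taken base 2, d = 0.1 is an arbitrary small constant, and the hidden constant in O(√(log n)) is ignored, so the second figure is a scale, not a node count.

```python
import math

def walk_based_nodes(n, d):
    """Nodes needed by the bound k = (2 + d) log n, assuming log base 2."""
    return math.ceil((2 + d) * math.log2(n))

def cut_based_scale(n):
    """The sqrt(log n) growth term, with the hidden constant left out."""
    return math.sqrt(math.log2(n))

N_LIVEJOURNAL = 4_400_000  # graph size quoted on the evaluation slide
```

With these assumptions `walk_based_nodes(N_LIVEJOURNAL, 0.1)` gives 47, whereas `cut_based_scale(N_LIVEJOURNAL)` is about 4.7. The evaluation slide reports that 7 nodes already suffice in practice for the walk-based attack, so the theoretical bound is conservative.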

(SEMI)-PASSIVE ATTACKS
- The walk-based and cut-based attacks are active: the attacker creates new nodes and edges
- Groups of existing users could also collude to execute an attack on their neighbors
- Experiments show this works for groups as small as 3 or 4 users
- How do you defend against this?
Questions?