Privacy and Trust in Social Networks. Michelle Hong, 2009/03/02
Outline: what is privacy and trust; privacy in social networks; basic privacy requirements; privacy in graphs; trust in social networks; references.
What is Privacy? Privacy is the ability of an individual or group to seclude themselves, or information about themselves, and thereby reveal themselves selectively. Privacy boundaries and content differ. Privacy may be voluntarily sacrificed. It concerns uniquely identifiable data relating to a person or persons.
What is Trust? Trust is a relationship of reliance. It is not related to good character or morals. Trust does not need to include an action that you and the other party are mutually engaged in. Trust is a prediction of reliance on an action. Trust is conditional.
Privacy and Trust Tradeoff. Legal rights are needed. Users reveal more data to trustworthy people. Access rights are provided. Trust is gained by opening up sensitive data.
k-anonymity [1]. Given data from multiple publishers, an adversary tries to link records and get a sensitive value. k-anonymity requires every such lookup to have at least k answers.
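The "at least k answers" requirement can be checked mechanically. The following is a minimal sketch, assuming the table is a list of dicts and that the identifying columns (`zip`, `age`) and the sample rows are hypothetical illustrations, not data from [1].

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every combination of identifying values occurs in at least k rows."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())

# Hypothetical generalized table (illustrative values only).
rows = [
    {"zip": "130**", "age": "<30", "disease": "flu"},
    {"zip": "130**", "age": "<30", "disease": "cancer"},
    {"zip": "148**", "age": ">=40", "disease": "flu"},
    {"zip": "148**", "age": ">=40", "disease": "flu"},
]

print(is_k_anonymous(rows, ["zip", "age"], 2))
print(is_k_anonymous(rows, ["zip", "age"], 3))
```

Each (zip, age) combination appears twice in this sample, so the check passes for k = 2 and fails for k = 3.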
l-diversity [2]. Every group of indistinguishable records has at least l different sensitive answers.
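A minimal sketch of the simplest reading of this requirement (counting distinct sensitive values per group). The column names and rows are hypothetical, and [2] also defines stronger variants that this sketch does not cover.

```python
from collections import defaultdict

def is_l_diverse(rows, quasi_identifiers, sensitive, l):
    """Return True if every group holds at least l distinct sensitive values."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].add(row[sensitive])
    return all(len(values) >= l for values in groups.values())

# Hypothetical generalized table (illustrative values only).
rows = [
    {"zip": "130**", "age": "<30", "disease": "flu"},
    {"zip": "130**", "age": "<30", "disease": "cancer"},
    {"zip": "148**", "age": ">=40", "disease": "flu"},
    {"zip": "148**", "age": ">=40", "disease": "flu"},
]

print(is_l_diverse(rows, ["zip", "age"], "disease", 2))
```

The second group contains only "flu", so the table has k = 2 answers per lookup but still reveals the sensitive value for that group; the check fails for l = 2.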
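t-closeness [3]. The distribution of sensitive values within each group must be within distance t of their distribution in the whole table, so that the answers stay close in semantic meaning.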
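A minimal sketch of a t-closeness style check. Note the swap: [3] measures distance with the Earth Mover's Distance, while this sketch assumes the simpler variational distance (half the sum of absolute differences) as a stand-in. The column names and rows are hypothetical.

```python
from collections import Counter, defaultdict

def distribution(values):
    """Relative frequency of each value."""
    counts = Counter(values)
    return {value: count / len(values) for value, count in counts.items()}

def variational_distance(p, q):
    """Half the L1 distance between two discrete distributions (assumed stand-in for EMD)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(key, 0.0) - q.get(key, 0.0)) for key in keys)

def max_group_distance(rows, quasi_identifiers, sensitive):
    """Largest distance between any group's distribution and the whole table's."""
    overall = distribution([row[sensitive] for row in rows])
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[q] for q in quasi_identifiers)].append(row[sensitive])
    return max(variational_distance(distribution(v), overall) for v in groups.values())

def is_t_close(rows, quasi_identifiers, sensitive, t):
    return max_group_distance(rows, quasi_identifiers, sensitive) <= t

# Hypothetical generalized table (illustrative values only).
rows = [
    {"zip": "130**", "age": "<30", "disease": "flu"},
    {"zip": "130**", "age": "<30", "disease": "cancer"},
    {"zip": "148**", "age": ">=40", "disease": "flu"},
    {"zip": "148**", "age": ">=40", "disease": "flu"},
]

print(max_group_distance(rows, ["zip", "age"], "disease"))
```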
Dynamic Anonymization [4]
Possible Attacks on Anonymized Graphs. Attack methods [5]: identify a target by its neighborhood information. They include vertex refinement queries, sub-graph queries, and hub fingerprint queries. Attack types [6]: Active attacks: create a small number of new user accounts and link them with other users before the anonymized graph is generated. Passive attacks: the attackers identify themselves in the published graph. Semi-passive attacks: create the necessary links with other users.
Vertex Refinement Queries. Computing H* is linear in the number of edges in the graph, so it is very efficient.
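A minimal sketch of iterated vertex refinement, assuming the usual reading from [5]: H0 carries no information on an unlabeled graph, H1 is the node's degree, and each further level is the multiset of the neighbors' previous-level values. The adjacency-dict representation and the path graph are hypothetical.

```python
def vertex_refinement(adjacency, depth):
    """Return a hashable signature H_depth(x) for every node x."""
    # H0: nothing is known about an unlabeled node.
    signatures = {node: () for node in adjacency}
    for _ in range(depth):
        # H_i(x) is the sorted multiset of H_{i-1} over the neighbors of x.
        signatures = {
            node: tuple(sorted(signatures[nbr] for nbr in adjacency[node]))
            for node in adjacency
        }
    return signatures

def candidate_set(adjacency, depth, target):
    """Nodes the adversary cannot tell apart from target using H_depth."""
    signatures = vertex_refinement(adjacency, depth)
    return sorted(n for n, s in signatures.items() if s == signatures[target])

# Hypothetical path graph a - b - c - d - e.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}

print(candidate_set(path, 1, "b"))
print(candidate_set(path, 2, "b"))
```

With H1 (degree only), b hides among b, c and d. With H2, the neighbors' degrees separate c from the others, leaving only b and d as candidates.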
Sub-graph Queries. The query is the subgraph information adjacent to the target node. Answering it is computationally intensive.
Hub Fingerprint Queries. Suppose Dave and Ed are selected as hubs. A fingerprint Fi lists the shortest path length to each hub, with 0 for a hub that is not within i hops: F1(fred) = (1, 0) and F2(fred) = (1, 2). If the adversary knows F1(fred) = (1, 0) in the open world, then nodes with F1 = (1, 0) and F1 = (1, 1) are both candidates, because the adversary may not have complete knowledge.
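A minimal sketch of the fingerprint computation using a depth-limited breadth-first search. The four-node graph is hypothetical, chosen only so that it reproduces the values quoted on the slide; it is not the graph from [5]. The open-world matcher is one possible reading for 1-hop fingerprints only and is an assumption.

```python
from collections import deque

def hub_fingerprint(adjacency, target, hubs, limit):
    """F_limit(target): distance to each hub, or 0 if the hub is not within limit hops."""
    distance = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if distance[node] == limit:
            continue  # do not explore beyond the hop limit
        for nbr in adjacency[node]:
            if nbr not in distance:
                distance[nbr] = distance[node] + 1
                queue.append(nbr)
    return tuple(distance.get(hub, 0) for hub in hubs)

def open_world_match_f1(known, observed):
    """Assumed open-world rule for F1: a known 0 may hide a link the adversary missed."""
    return all(k == 0 or k == o for k, o in zip(known, observed))

# Hypothetical graph: fred - dave - ed - greg.
graph = {
    "fred": ["dave"],
    "dave": ["fred", "ed"],
    "ed": ["dave", "greg"],
    "greg": ["ed"],
}
hubs = ["dave", "ed"]

print(hub_fingerprint(graph, "fred", hubs, 1))
print(hub_fingerprint(graph, "fred", hubs, 2))
```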
Avoiding Attacks. Require linkage confirmation: users confirm a request to add a friend, and the website provides checks on users. Identify and remove attack nodes: find nodes with a strange structure.
k-degree anonymity [7]. Kind of attack: vertex refinement queries (H1), that is, node degree. Objective: in the published graph, for every node v there exist at least k-1 other nodes with the same degree as v, and the minimum number of edges is added so that the graph's shape is preserved as much as possible. Method: add edges to the original anonymized graph. First compute a new degree vector that satisfies k-degree anonymity, then generate the new graph based on this degree vector.
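A minimal sketch of the first step of the method (computing a k-anonymous degree vector). It assumes a simple greedy grouping that only raises degrees, which matches the edge-addition setting; it is not the algorithm from [7], it does not guarantee minimum cost, and it does not check that the resulting vector can be realized as a graph.

```python
from collections import Counter

def is_k_degree_anonymous(degrees, k):
    """Every degree value must be shared by at least k nodes."""
    return all(count >= k for count in Counter(degrees).values())

def greedy_anonymize_degrees(degrees, k):
    """Raise degrees so that each value is shared by at least k nodes (needs len >= k)."""
    n = len(degrees)
    order = sorted(range(n), key=lambda i: -degrees[i])  # highest degree first
    new_degrees = list(degrees)
    start = 0
    while start < n:
        end = start + k
        if n - end < k:
            end = n  # fold a too-small remainder into the last group
        target = degrees[order[start]]  # largest degree in this group
        for i in order[start:end]:
            new_degrees[i] = target
        start = end
    return new_degrees

original = [5, 3, 3, 2, 2, 1]
anonymized = greedy_anonymize_degrees(original, 2)
print(anonymized, is_k_degree_anonymous(anonymized, 2))
```

The second step, generating a graph from the new degree vector, is left out here.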
K-neighbor anonymous [8]
Resisting neighborhood attacks through graph generalization [5]. Step 1: Partition the graph so that each partition contains at least k nodes. Step 2: For each partition, generate a super node. Step 3: Draw the edges between partitions; the weight is the number of edges between them. Step 4: Draw the self-edge for each partition; the weight is the number of edges within it. In this paper, the authors use simulated annealing to find the partitions that maximize the likelihood function.
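A minimal sketch of building the generalized graph once a partition is given. The partition search itself (simulated annealing in [5]) is not shown, and the edge list, the partition, and the function names are hypothetical.

```python
from collections import Counter

def generalize(edges, partition):
    """Return super-node sizes and edge weights between (and within) super nodes.

    partition maps each node to the id of its super node. A weight keyed by
    (a, a) is the self-edge of super node a: the number of edges inside it.
    """
    sizes = Counter(partition.values())
    weights = Counter()
    for u, v in edges:
        a, b = sorted((partition[u], partition[v]))
        weights[(a, b)] += 1
    return sizes, weights

def partition_is_valid(sizes, k):
    """Each partition must contain at least k nodes."""
    return all(size >= k for size in sizes.values())

# Hypothetical graph: two triangles joined by one edge.
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
partition = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}

sizes, weights = generalize(edges, partition)
print(dict(sizes), dict(weights))
```

Only the sizes and weights would be published, so the original edges inside and between super nodes are hidden behind the counts.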
Mining Privacy in Social Networks. What are the problems in Web 2.0? Activity streams: users are not aware of some mini-feeds shown on their profile. Unwelcome linkage: a friend explicitly writes a link to another user's profile. Merged social graphs: links of links.
Privacy in Social Data. Different users have different opinions on what data is sensitive. Websites enable users to set up access permissions. A trust network can be constructed from social data.
References
[1] L. Sweeney. k-Anonymity: A Model for Protecting Privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems, 10(5):557-570, 2002.
[2] Ashwin Machanavajjhala, Daniel Kifer, Johannes Gehrke, Muthuramakrishnan Venkitasubramaniam. l-Diversity: Privacy Beyond k-Anonymity. ACM Transactions on Knowledge Discovery from Data (TKDD), 1(1), article 3, March 2007.
[3] Ninghui Li, Tiancheng Li, Suresh Venkatasubramanian. t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. IEEE International Conference on Data Engineering (ICDE), 2007.
[4] X. Xiao, Y. Tao. Dynamic Anonymization: Accurate Statistical Analysis with Privacy Preservation. ACM Conference on Management of Data (SIGMOD), pages 107-120, 2008.
[5] Michael Hay, Gerome Miklau, David Jensen, Don Towsley, Philipp Weis. Resisting Structural Re-identification in Anonymized Social Networks. PVLDB, 2008.
[6] Lars Backstrom, Cynthia Dwork, Jon Kleinberg. Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden Patterns, and Structural Steganography. WWW, 2007.
[7] Kun Liu, Evimaria Terzi. Towards Identity Anonymization on Graphs. SIGMOD, 2008.
[8] Bin Zhou, Jian Pei. Preserving Privacy in Social Networks Against Neighborhood Attacks. ICDE, 2008.