SocialMix: Supporting Privacy-aware Trusted Social Networking Services Chao Li, Balaji Palanisamy, James Joshi School of Information Sciences University of Pittsburgh
Outline Introduction Motivation and Objective Technical details Experiments Conclusion
The growth of Online Social Networks (OSNs) The number of OSN users was 0.97 billion in 2011 and is expected to rise to 2.04 billion by 2016.
Conventional and anonymous OSNs There are two types of OSNs: Conventional OSNs: Communication messages are typically linked with real identities of users in order to obtain higher trust among the communicating entities. Anonymous OSNs: Communication messages are disassociated from the real identities of the users.
Anonymous OSNs: trust or not? Untrusted anonymous OSNs (e.g. Whisper): They remove user identities, which gives high privacy but untrusted communication. Any comment comes from an unknown user. Trusted anonymous OSNs (e.g. Secret): They anonymize user identities, which gives trusted communication but low privacy. Any comment comes from a friend or a friend of a friend. Our goal: To build an anonymous OSN with both strong privacy and trusted communication.
The current trusted anonymous OSNs are not safe. Though the ID and profile can be anonymized, the accumulated posted messages can be collected to re-identify the sender. The success rate depends on the background knowledge of the adversaries, which is typically strong in trusted anonymous OSNs because the adversaries are friends of the sender.
[Figure: the sender's ID (Alice / A / ***) and profile (age, gender, location, education, …) are anonymized, but each posted message narrows the adversary's guess: "Headache today, can’t go to school." (Who? No idea…); "Love chicken here!" (Oh, Alice or Bob.); "I hate my BOSS!" (Alice, that’s her boss!); "OMG! HIV! What should I do!" (Alice has HIV!)]
Message perturbation The goal: We aim to ensure a high degree of user privacy while keeping communication trusted over an anonymous social network. The idea: Shuffle the messages through message aggregation so that the relationship between the content of a message and the poster’s identity is anonymized and perturbed.
Ideal mix node Mix node: Some nodes in the OSN are selected to work as ‘mix nodes’ that perturb the messages. DEFINITION 1 (IDEAL MIX NODE): A mix node N is said to be an ideal mix node iff: (1) The node N holds at least k messages during perturbation. (2) The perturbation starts when at least k messages are present and ends when the number of stored messages falls below k. (3) The amount of time each message stays in the mix node is completely random.
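The definition above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `MixNode`, its methods, and the choice of releasing one uniformly chosen message per step are assumptions made for illustration. Releasing a random buffered message per step means the time a message spends in the node is random, and nothing is released while fewer than k messages are stored.

```python
import random


class MixNode:
    """Toy mix node (hypothetical): buffers messages, releases them at random."""

    def __init__(self, k, rng=None):
        self.k = k                          # anonymity level: minimum buffer size
        self.buffer = []                    # messages currently held by the node
        self.rng = rng or random.Random()   # injectable RNG for reproducibility

    def receive(self, message):
        """Store an incoming message in the buffer."""
        self.buffer.append(message)

    def step(self):
        """Release one uniformly chosen message if at least k are buffered.

        Returns the released message, or None when fewer than k are stored
        (perturbation is not running in that case).
        """
        if len(self.buffer) < self.k:
            return None
        index = self.rng.randrange(len(self.buffer))
        return self.buffer.pop(index)


node = MixNode(k=3, rng=random.Random(0))
for message in ["m1", "m2", "m3", "m4"]:
    node.receive(message)

released = []
while (out := node.step()) is not None:
    released.append(out)
print(released, node.buffer)
```

With four buffered messages and k = 3, two messages are released before the buffer drops below k and the node stops; which two are released depends on the random generator.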
Basic SocialMix approach What does a SocialMix network look like? We first present the basic approach. Suppose we have a network with 10 nodes. A, B, I and J send messages to D and H. D and H work as event-driven mix nodes, also called pre-mix nodes. Node E works as an intermediate mix node, also called a post-mix node.
[Figure: example network with node labels B, C, I, H, E, J, D, F, A, G, K]
Attacks on basic SocialMix Time-based attack: In most trusted anonymous OSNs, once a message is posted, it is shown to other users in real time (FIFO). Adversaries can therefore link messages to user identities through timing information, even if the poster ID is de-identified. Solution: Each message spends a random amount of time inside the mix node, so that the third requirement of an ideal mix node is satisfied.
[Figure: real-time display shows posts as "11:00am (Anonymous)" and "11:10am (Anonymous)"; the mix node instead shows "11:12am (Anonymous)" and "11:15am (Anonymous)", perturbing both post time and post order.]
Attacks on basic SocialMix Friendship-based attack: A user may be highly likely to share messages from their best friends but very unlikely to share messages from somebody they don’t like. Therefore, with background knowledge about the friendships of a mix node, the adversary can assign different probabilities to different neighbors, so that the probability distribution over possible senders is skewed. Solution: Select as mix nodes only the subset of nodes with higher resilience towards the friendship-based attack. For each node in the network, we assign probabilities based on friendship and calculate the entropy to measure resilience; we then select either the top-n nodes by entropy or all nodes whose entropy exceeds a threshold.
[Figure: expected friendship of mix nodes, e.g. 90% vs. 10%]
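The entropy-based selection can be sketched as follows. This is an illustrative sketch under assumptions, not the authors' code: the function names, the dictionary layout (node to neighbor to friendship weight), and the use of base-2 Shannon entropy over normalized friendship weights are all hypothetical choices.

```python
import math


def friendship_entropy(friendship):
    """Shannon entropy in bits of a node's sharing distribution.

    `friendship` maps each neighbor to a positive weight; the weights are
    normalized into probabilities first (assumed model, base-2 logarithm).
    """
    total = sum(friendship.values())
    probs = [w / total for w in friendship.values() if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def select_mix_nodes(network, top_n=None, threshold=None):
    """Rank nodes by entropy, keeping those above `threshold` and/or the top_n."""
    scores = {node: friendship_entropy(f) for node, f in network.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    if threshold is not None:
        ranked = [node for node in ranked if scores[node] > threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked


# Hypothetical toy network: node -> {neighbor: friendship weight}
network = {
    "a": {"x": 50, "y": 50},   # evenly split: hardest for the adversary
    "b": {"x": 90, "y": 10},   # skewed: the adversary can guess "x"
    "c": {"x": 100},           # single neighbor: no uncertainty at all
}
print(select_mix_nodes(network, top_n=1))
print(select_mix_nodes(network, threshold=0.4))
```

A node with a 90%/10% split scores roughly 0.47 bits against 1 bit for an even split, which is why skewed friendships make a node a poor mix-node candidate.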
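Attack-resilient SocialMix How an attack-resilient mix node works:
[Figure: messages m1, m2, m3 are buffered while m4 arrives; they leave in a perturbed order (m2, m3, m1) through an output table. Example: <15:01, Alice, ‘hahaha’> and <15:03, Bob, ‘Hello’> come out as <15:03, Bob, ‘hahaha’> and <15:01, Alice, ‘Hello’>.]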
Mix node placement Though the pre-mix nodes are event-driven, the post-mix nodes should be pre-determined. A module on the OSN server should regularly select post-mix nodes based on the latest network topology. Naive placement: A naive method for mix node selection is to select randomly among the nodes with higher resilience towards the friendship-based attack. Top-n-based placement: Among the nodes whose friendship-based attack resilience is above the lower bound, we further keep only the n nodes with the highest entropy. Centrality-based placement: Centrality is an important network measure that captures the importance of a user's role in a network. In this scheme, we select post-mix nodes based on their degree centrality, betweenness centrality and eigenvector centrality.
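As a hedged illustration of centrality-based placement, the sketch below ranks nodes by degree centrality only, using the standard definition (degree divided by n - 1). The function names and the toy edge list are hypothetical; betweenness and eigenvector centrality would need a graph library or a longer implementation and are not shown.

```python
def degree_centrality(edges):
    """Degree centrality of every node in an undirected graph.

    `edges` is an iterable of (u, v) pairs; centrality = degree / (n - 1).
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def top_by_centrality(edges, n):
    """Return the n most central nodes (ties broken by node name)."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda node: (-scores[node], node))[:n]


# Hypothetical toy topology, not the network from the slides.
edges = [("A", "D"), ("B", "D"), ("D", "E"), ("E", "H")]
print(degree_centrality(edges))
print(top_by_centrality(edges, 2))
```

In practice the candidate set would first be filtered by friendship-based resilience, as in the other placement schemes, before ranking by centrality.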
Experimental setup Activity: For each node, we set a value called ‘activity’, which represents the frequency with which this node generates (posts) messages. Range 1~100. Friendship: For each node and each of its friends, we set a value called ‘friendship’, which indicates the probability that the node shares a message posted by this friend. Range 1~100. Data set: A small data set, which is an OSN of 34 members in a club.
Experimental evaluation We first measure the operation time of the algorithm with varying anonymity level k. As can be seen, the operation time of each sharing process for a single mix node is stable across k. We do not want a message to be blocked by a mix node, meaning that it is stored in the buffer for a long time without being selected for output. Therefore, we set a time bound of 10 timestamps and measure the probability that a message passes the mix node within this bound for varying k. The results show that the pass rate is lower for higher k.
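The trend of a lower pass rate for higher k can be reproduced with a simple back-of-the-envelope model. The model is an assumption, not taken from the experiments: the buffer is taken to hold exactly k messages and the node releases one of them uniformly at random per timestamp, so a given message stays with probability 1 - 1/k at each step. The function name `pass_probability` is hypothetical.

```python
def pass_probability(k, time_bound=10):
    """P(message leaves the mix node within `time_bound` timestamps).

    Assumed model (not from the slides): the buffer holds exactly k
    messages and one of them is released uniformly at random per timestamp.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    stay = 1.0 - 1.0 / k              # chance of not being picked in one step
    return 1.0 - stay ** time_bound   # complement of staying for every step


for k in (2, 5, 10, 20):
    print(k, round(pass_probability(k), 3))
```

Under this model the pass probability is 1 - (1 - 1/k)^10, which decreases monotonically as k grows; the measured values depend on the actual arrival and selection process and may differ.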
Experimental evaluation The entropy under time-based attack is exactly the same as in the ideal case, which means the adversary cannot gather any additional information through the time-based attack and SocialMix completely defeats it. The variation of entropy across nodes is very large. Some nodes have high resilience, with entropy larger than 3, while node 11 provides no resilience, with entropy 0.
Experimental evaluation The entropy bound is the lowest entropy provided by any selected mix node, namely the lower bound of the resilience. The two extreme top-n conditions, top-1 and top-34, give entropy bounds of 3.46 and 0 respectively. In practice, a threshold can be set to determine the value of n based on demand. Even though we have chosen the better nodes with higher resilience, the entropy under friendship-based attack is lower than in the ideal case, which means that some information is still leaked.
Experimental evaluation The anonymization rate of the naive scheme, which randomly selects post-mix nodes, grows slowly with an increasing number of selected post-mix nodes. However, even for random selection, a subset of 15 of the 34 nodes can already guarantee a very high anonymization rate. The other four improved schemes give much better results. The probability density function (PDF) of all the schemes roughly follows a normal distribution, which provides appropriate protection of privacy.
Conclusion This paper proposes SocialMix, an anonymous communication mechanism to support privacy-aware trusted social networking services. We propose a suite of mix node construction and placement schemes that enhance the attack resilience and anonymization effectiveness of the SocialMix approach. Our experimental evaluation shows that SocialMix provides high attack resilience for trusted communication over social networks with a high anonymization rate.
Thanks. Q&A.