Security and Privacy in Social Networks. Raymond Heatherly, Data Security and Privacy Lab.
Social Network Privacy (Heatherly et al.): Facebook currently has over 400 million users. Each of these users specifies details about themselves. For example: [example user profile]
So what? What about details they don't specify? In our previous example, what political affiliation does she have? What about her job? There are two possible reasons for an omission: the user forgot, or the user doesn't want people to know.
Privacy: But can we figure these details out anyway? For instance, does anything in our previous example hint at her job? One of her activities mentions 'my classroom'.
Learning: Consider a social network as a graph, where the vertices are the users in the network and the edges are friendship links between those users. Each node has a finite subset of detail types (hometown, birthdate, groups, books, etc.). Each detail type has a finite number of detail values (e.g. books = The Bible, Harry Potter, etc.).
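The graph model described on this slide could be represented roughly as follows. This is a minimal sketch: the class name `SocialGraph`, its fields, and the sample users are illustrative assumptions, not the authors' data structures.

```python
from collections import defaultdict


class SocialGraph:
    """Undirected friendship graph whose nodes carry detail sets (hypothetical layout)."""

    def __init__(self):
        # user -> {detail type -> set of detail values}
        self.details = defaultdict(lambda: defaultdict(set))
        # user -> set of friends; friendship links are symmetric
        self.friends = defaultdict(set)

    def add_detail(self, user, detail_type, value):
        self.details[user][detail_type].add(value)

    def add_friendship(self, a, b):
        self.friends[a].add(b)
        self.friends[b].add(a)


# Toy data, invented for illustration only.
graph = SocialGraph()
graph.add_detail("alice", "books", "Harry Potter")
graph.add_detail("alice", "hometown", "Dallas")
graph.add_detail("bob", "books", "The Bible")
graph.add_friendship("alice", "bob")
```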
Model Building: We use these properties to construct three different models: Details Only, Links Only, and Average.
Details Only: Naïve Bayesian classifier (assumes detail independence). Builds a raw model based on training data over all details.
Links Only: Naïve Bayes based, with changes. Friendships are weighted based on similarity.
Average: Calculate the Links Only and Details Only probabilities and average them.
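A Details Only classifier along these lines might look like the sketch below. It is an assumption-laden illustration: the Laplace smoothing, the decision to score only the details a user actually lists, and the toy labels are all choices made here, not taken from the slides.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """examples: list of (set of (detail_type, value) pairs, class label)."""
    class_counts = Counter(label for _, label in examples)
    detail_counts = defaultdict(Counter)  # label -> detail -> count
    vocabulary = set()
    for details, label in examples:
        for d in details:
            detail_counts[label][d] += 1
            vocabulary.add(d)
    return class_counts, detail_counts, vocabulary


def posterior(details, model):
    """Return P(label | details), treating details as independent given the label."""
    class_counts, detail_counts, vocabulary = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)  # class prior
        for d in details:
            if d in vocabulary:
                # Laplace smoothing (an assumption of this sketch)
                score += math.log((detail_counts[label][d] + 1) / (n + 2))
        log_scores[label] = score
    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(weights.values())
    return {k: w / z for k, w in weights.items()}


# Toy training data, invented for illustration only.
training = [
    ({("group", "teachers"), ("books", "Harry Potter")}, "teacher"),
    ({("group", "teachers")}, "teacher"),
    ({("group", "coders")}, "engineer"),
    ({("group", "coders"), ("books", "Harry Potter")}, "engineer"),
]
model = train(training)
probs = posterior({("group", "teachers")}, model)
```

The hidden detail (here a job) is predicted as the label with the highest posterior, `max(probs, key=probs.get)`.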
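The idea of weighting friendships by similarity can be illustrated with a simplified weighted vote. This is not the authors' modified Naïve Bayes: the Jaccard similarity measure, the function names, and the toy data are assumptions chosen only to show how similar friends can count for more.

```python
def jaccard(a, b):
    """Similarity of two detail sets (an assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def link_probabilities(user, friends, details, labels):
    """Estimate the user's label distribution from labeled friends,
    weighting each friend by detail-set similarity to the user."""
    weights = {}
    for f in friends[user]:
        if f in labels:
            w = jaccard(details[user], details[f])
            weights[labels[f]] = weights.get(labels[f], 0.0) + w
    total = sum(weights.values())
    if total == 0:
        return {}  # no informative friends
    return {label: w / total for label, w in weights.items()}


# Toy data, invented for illustration only.
details = {"u": {"a", "b"}, "f1": {"a", "b"}, "f2": {"a", "c"}}
friends = {"u": {"f1", "f2"}}
labels = {"f1": "X", "f2": "Y"}
link_probs = link_probabilities("u", friends, details, labels)
```

Here `f1` shares every detail with `u` while `f2` shares one of three, so `f1`'s label dominates the estimate.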
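The averaging step can be sketched directly. The function name and the convention that a label missing from one model contributes probability 0 are assumptions of this example.

```python
def average_models(details_probs, links_probs):
    """Average two label distributions; a label missing from one model counts as 0."""
    labels = set(details_probs) | set(links_probs)
    return {
        label: (details_probs.get(label, 0.0) + links_probs.get(label, 0.0)) / 2
        for label in labels
    }


# Toy probabilities, invented for illustration only.
combined = average_models({"X": 0.8, "Y": 0.2}, {"X": 0.4, "Y": 0.6})
```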
Collective Inference: When we classify large graphs, the decisions we make at one node transfer through the graph. CI gives us a series of algorithms to assist with handling these transfers. Components: Local Classifier, Relational Classifier, CI Algorithm.
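How the three components fit together can be sketched with a simple relaxation-style loop. This is one possible CI scheme, assumed here for illustration: the local classifier supplies starting estimates, the relational classifier is a plain mean over neighbours, and the mixing weight `alpha` and round count are hypothetical parameters.

```python
def collective_inference(friends, local_probs, known, rounds=10, alpha=0.5):
    """Iteratively blend each node's local estimate with its neighbours' estimates.

    friends:     node -> iterable of neighbouring nodes
    local_probs: node -> {label: probability} from the local classifier
    known:       set of nodes whose labels are fixed and never updated
    """
    labels = sorted({l for p in local_probs.values() for l in p})
    current = {u: dict(p) for u, p in local_probs.items()}
    for _ in range(rounds):
        updated = {}
        for u in current:
            nbrs = [v for v in friends.get(u, ()) if v in current]
            if u in known or not nbrs:
                updated[u] = current[u]
                continue
            # Relational classifier: mean of the neighbours' current estimates.
            rel = {l: sum(current[v].get(l, 0.0) for v in nbrs) / len(nbrs)
                   for l in labels}
            updated[u] = {l: alpha * local_probs[u].get(l, 0.0) + (1 - alpha) * rel[l]
                          for l in labels}
        current = updated
    return current


# Toy chain a - b - c where only a's label is known (invented data).
friends = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
local_probs = {
    "a": {"X": 1.0, "Y": 0.0},
    "b": {"X": 0.5, "Y": 0.5},
    "c": {"X": 0.5, "Y": 0.5},
}
result = collective_inference(friends, local_probs, known={"a"})
```

In this toy chain, the known label at `a` pulls `b` toward `X`, and `b` in turn pulls `c`, which is the kind of transfer the slide describes.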
Preserving Privacy: What happens when data is released? In what ways can we decrease the accuracy of classifiers? We can add or remove links or details. Consider what additions mean. What about deletions?
Our experiments: Performed on data gathered from the DFW network on Facebook in the spring of 2008. We perform only link or detail deletions. For details, remove the best identifiers of any classification globally. For links, remove links to those individuals most like a person.
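Global detail removal could be sketched as below. The slides do not say how "best identifier" is measured, so the scoring rule here (how concentrated a detail's holders are in one class, with ties broken by how many users list it) is an assumption, as are the function names and toy data.

```python
from collections import Counter, defaultdict


def rank_details(user_details, labels):
    """Order details from most to least class-indicative (assumed scoring rule)."""
    holders = defaultdict(Counter)  # detail -> label -> number of users
    for user, details in user_details.items():
        for d in details:
            holders[d][labels[user]] += 1

    def score(d):
        counts = holders[d]
        support = sum(counts.values())
        purity = max(counts.values()) / support
        return (purity, support)

    return sorted(holders, key=score, reverse=True)


def remove_top_details(user_details, labels, k):
    """Delete the k most indicative details from every user's profile."""
    to_remove = set(rank_details(user_details, labels)[:k])
    return {u: ds - to_remove for u, ds in user_details.items()}


# Toy data, invented for illustration only.
user_details = {
    "u1": {"g1", "b1"},
    "u2": {"g1"},
    "u3": {"b1", "g2"},
}
labels = {"u1": "X", "u2": "X", "u3": "Y"}
sanitized = remove_top_details(user_details, labels, k=1)
```

In the toy data, `g1` is listed only by class `X` users and by more of them than any other detail, so it is removed first.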
Results
Access Control in Social Networks (Carminati et al., 2009): What about access to resources? For example, photos: who should control the viewers of a photo on Facebook? Currently, on Facebook, the photo uploader controls who can view the photo. A person in the picture can only untag themselves.
Parental rights over a minor: What if a photo is of a minor child? How would a parent be able to (reliably) have photos of their children removed or restricted? What about limiting children's access to inappropriate videos over a social network?
Friendship Hierarchy: Propose several generic classes of friends: Friend, Co-Worker, Family. Some classes can have (user-defined) specific sub-classes, such as a Best Friend, a Boss, a Parent, a Child, etc.
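The class/sub-class relationship can be illustrated with a small subsumption check. This is a plain-Python stand-in, not a semantic reasoner; the dictionary layout and the placement of Boss under Co-Worker are assumptions of this sketch.

```python
# Sub-class -> parent class. BestFriend, Parent and Child follow the slides;
# Boss under CoWorker is an assumption.
PARENT_CLASS = {
    "BestFriend": "Friend",
    "Boss": "CoWorker",
    "Parent": "Family",
    "Child": "Family",
}


def is_a(relationship, required):
    """True if `relationship` equals `required` or is one of its sub-classes."""
    current = relationship
    while current is not None:
        if current == required:
            return True
        current = PARENT_CLASS.get(current)
    return False
```

With this check, a rule written for `Friend` also admits a `BestFriend`, but a rule written for `BestFriend` does not admit an ordinary `Friend`.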
Project motivation: What if we give all people tagged in a photo some say in who can see photos of them? Additionally, parents of minor children can also have a say in the permissions of photos of their children. Instead of a static access list, what about inferring the authorizations using semantic reasoners?
Data Generation: Facebook doesn't give the full set of its data to researchers. We needed to test the efficacy of a semantic solution using a comparable size of data. Generated 350 million 'users' with their own security policies. Simulated a scale-free network. Generated between 750,000 and 350 million resources.
Implementation challenges: Initially, we attempted to do the reasoning on the entire data set. SweetRules did not update its in-memory model of security policies, so it gave incorrect responses. Pellet then crashed due to the amount of memory required to perform inference on the data set.
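The slides do not say which generator was used for the scale-free network. One common choice is preferential attachment (Barabási–Albert style), sketched here in plain Python at toy scale; the function name, parameters, and seed are assumptions of this example.

```python
import random


def preferential_attachment(n, m, seed=0):
    """Return undirected edges for n nodes; each new node links to m existing
    nodes chosen with probability proportional to their current degree."""
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = set()
    endpoints = []  # each node appears once per incident edge
    # Seed graph: a clique on the first m + 1 nodes.
    for a in range(m + 1):
        for b in range(a + 1, m + 1):
            edges.add((a, b))
            endpoints += [a, b]
    for v in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            # Sampling from `endpoints` favours high-degree nodes.
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.add((t, v))
            endpoints += [t, v]
    return edges


edges = preferential_attachment(n=100, m=2)
```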
Partitioning: We then decided to partition the data. But any single partition would be a cut that would have edges to (at least) one other partition. These would decrease our accuracy. Dynamic partitioning: Owner, Tagged individuals, Requestor.
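One reading of dynamic partitioning is that each access request gets its own small partition containing only the resource owner, the tagged individuals, and the requestor. The sketch below follows that reading; the link representation and function name are assumptions, and a fuller version would presumably also pull in the parents of tagged minors.

```python
def request_partition(links, owner, tagged, requestor):
    """Keep only the links whose endpoints are both relevant to this request.

    links: dict mapping (user_a, user_b) -> link type
    """
    relevant = {owner, requestor, *tagged}
    return {
        (a, b): link_type
        for (a, b), link_type in links.items()
        if a in relevant and b in relevant
    }


# Toy link data, invented for illustration only.
links = {
    ("owner", "req"): "Friend",
    ("owner", "tag1"): "Family",
    ("tag1", "other"): "Coworker",
    ("x", "y"): "Friend",
}
partition = request_partition(links, "owner", ["tag1"], "req")
```

The reasoner then only needs to load `partition` rather than the whole graph.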
Experiment 1: Friendship types: Coworker; Friend, with BestFriend sub-type; Family, with Parent/Child sub-type. Security policies: 1. Strict: only BestFriends and Family can view photos of self and any child; a child may not view any videos. 2. Casual: anyone can see photos; no restriction on the child. 3. ParentStrict: anyone can see photos of the parent; only family can see photos of the child.
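The photo rules of the three policies can be written out as a small check. This is a sketch, not the semantic-reasoner implementation: it covers photos only (not the video restriction), and the rule that every stakeholder's policy must allow the request is an assumption about how the individual "says" are combined.

```python
FAMILY = {"Family", "Parent", "Child"}


def policy_allows(policy, relation, photo_of_child):
    """relation: the requestor's link type to the policy holder, or None if unlinked."""
    if policy == "Casual":
        return True
    if policy == "Strict":
        return relation == "BestFriend" or relation in FAMILY
    if policy == "ParentStrict":
        return relation in FAMILY if photo_of_child else True
    raise ValueError(f"unknown policy: {policy}")


def can_view(stakeholders):
    """stakeholders: (policy, relation, photo_of_child) tuples for the owner, each
    tagged person, and any parent of a tagged minor. Assumed rule: all must allow."""
    return all(policy_allows(*s) for s in stakeholders)
```

For example, a requestor who is only a `Friend` of a tagged person with the Strict policy is denied even if the uploader's policy is Casual.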
Experiment 2: Discard almost all link types. Keep ParentOf/ChildOf. Replace the rest with a Trust Value between 1 and 10.
Experiment 3: Used a hybrid approach. Maintained all general and specific link types. Each friendship is also assigned a Trust Value (TV), e.g. a Best Friend with a TV of 6.
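A hybrid rule might combine a link-type test with a trust threshold, as in the sketch below. The slides do not show how policies were expressed over Trust Values, so the `Link` type, the `min_trust` parameter, and the "type and threshold" rule are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Link:
    link_type: str
    trust: int  # Trust Value, expected in 1..10


def hybrid_allows(link: Optional[Link], allowed_types, min_trust):
    """Allow access only if the link type is permitted and trust is high enough."""
    if link is None:
        return False  # no relationship between requestor and policy holder
    if not 1 <= link.trust <= 10:
        raise ValueError("trust value must be between 1 and 10")
    return link.link_type in allowed_types and link.trust >= min_trust


best_friend = Link("BestFriend", 6)
```

Dropping the `allowed_types` test gives the trust-only variant of Experiment 2, and dropping the threshold gives the link-type-only variant of Experiment 1.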
Results: Time (in seconds) for each inference

                     Average   Low     High
Link-Type only       0.585     0.562   0.611
Trust Value Only     0.612     0.534   0.598
Value/Trust Hybrid   0.731     0.643   0.811
Questions?