To Join or Not to Join: The Illusion of Privacy in Social Networks with Mixed Public and Private User Profiles
Paper by Elena Zheleva and Lise Getoor
Introduction
What do we want to find out about a “private” profile?
–Sensitive information
What is sensitive information?
–What advertising agencies and companies want to know
–What you do not want others to find out
How can we find out private information?
If a profile is really private, how can you find out anything?
–What if it was not Facebook? A completely anonymous profile?
–Use whatever public information you have.
Use tactics that exploit friendship links
Exploit group affiliations
–Neither Facebook nor Flickr hides group members.
BASIC model
Guess the sensitive attribute from the overall distribution of its known values.
[Figure: friendship graph of Ana, Gia, Fabio, Emma, Chris, Bob, and Don; hidden values marked "?". Sensitive info = favorite color (Orange, Blue, Green).]
Sensitive-Attribute Inference Models
We assume the overall distribution of the sensitive attribute is either known or can be estimated from public profiles.
We take BASIC as the baseline attack.
A successful attack is one that, given extra knowledge, achieves significantly higher accuracy than BASIC.
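The BASIC baseline can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the distribution is estimated from the four public profiles in the slide example, and the function name `basic_attack` is hypothetical.

```python
from collections import Counter

# Public profiles from the slide example (sensitive attribute = favorite color).
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}
private_users = ["Ana", "Bob", "Gia"]

def basic_attack(public_values, targets):
    """Estimate the overall distribution and guess its most common value for everyone."""
    counts = Counter(public_values.values())
    total = sum(counts.values())
    distribution = {value: n / total for value, n in counts.items()}
    guess = counts.most_common(1)[0][0]
    return distribution, {user: guess for user in targets}

distribution, guesses = basic_attack(public_colors, private_users)
print(distribution)
print(guesses)
```

Every private profile receives the same guess, which is why any attack that uses links or groups has to beat this accuracy to count as successful.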
Our Model
[Figure: the same seven users and friendship graph, now with two groups. True Blue Lovers: Bob, Gia, Fabio. Espresso Lovers: Bob, Emma, Chris, Don. Sensitive info = favorite color (Orange, Blue, Green).]
“Tell me who your friends are, and I’ll tell you who you are”
Link-based attacks
–Friend-aggregate model (AGG)
–Collective classification model (CC)
–Flat-link model (LINK)
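Friend-aggregate model (AGG)
Given my friends, what am I most likely to be?
Score for each value = friends whose public profile shows that value / total links
[Figure: friendship graph of Ana, Gia, Fabio, Emma, Chris, Bob, and Don; hidden values marked "?".]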
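A minimal sketch of the friend-aggregate idea follows. The friendship lists are hypothetical, because the edges of the slide's graph are not recoverable from the text, and the sketch assumes the score is normalized by the number of friends with public profiles. The name `agg_attack` is illustrative only.

```python
from collections import Counter

# Hypothetical friendship lists; the slide's actual edges are unknown.
friends = {
    "Ana": ["Chris", "Emma", "Don"],
    "Bob": ["Fabio", "Gia"],
    "Gia": ["Fabio", "Bob"],
}
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}

def agg_attack(user, friends, public_values):
    """Return (guess, distribution) computed from the user's friends with public values."""
    votes = Counter(public_values[f] for f in friends.get(user, []) if f in public_values)
    total = sum(votes.values())
    if total == 0:
        return None, {}  # no public friends: an attacker would fall back to BASIC
    distribution = {value: n / total for value, n in votes.items()}
    return votes.most_common(1)[0][0], distribution

for user in friends:
    print(user, agg_attack(user, friends, public_colors))
```

Friends with private profiles contribute nothing here, which is the gap the collective classification model tries to close.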
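Collective classification model (CC)
AGG with re-evaluation: predictions for private friends are fed back in and the estimates are recomputed
[Figure: friendship graph of Ana, Gia, Fabio, Emma, Chris, Bob, and Don; hidden values marked "?".]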
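One way to picture the re-evaluation step is an iterative loop in which labels predicted for private friends are reused in the next round. The sketch below is an assumption-laden simplification (majority vote per round, hypothetical friendship lists, a hypothetical function name), not the procedure from the paper.

```python
from collections import Counter

# Hypothetical friendship lists for the private users; Ana has no public friends.
friends = {
    "Ana": ["Bob", "Gia"],
    "Bob": ["Ana", "Fabio"],
    "Gia": ["Ana", "Bob", "Fabio"],
}
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}
private_users = ["Ana", "Bob", "Gia"]

def collective_classification(friends, public_values, targets, max_rounds=10):
    """Repeat a majority vote over friends, reusing predictions from the previous round."""
    labels = dict(public_values)  # known labels; predictions are added as they appear
    for _ in range(max_rounds):
        updated = dict(labels)
        for user in targets:
            votes = Counter(labels[f] for f in friends.get(user, []) if f in labels)
            if votes:
                updated[user] = votes.most_common(1)[0][0]
        if updated == labels:  # nothing changed, so the labels are stable
            break
        labels = updated
    return {user: labels.get(user) for user in targets}

print(collective_classification(friends, public_colors, private_users))
```

In this toy graph Ana has only private friends, so AGG alone cannot label her; she receives a label in the second round, once Bob and Gia have been predicted.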
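Flat-link model (LINK)
Flatten the data by treating each user's row of the graph's adjacency matrix as a feature vector.
Then it is a classification problem.
[Figure: friendship graph of Ana, Gia, Fabio, Emma, Chris, Bob, and Don; hidden values marked "?".]
Table: one row per user; columns Ana, Bob, Chris, Don, Emma, Fabio, Gia hold the adjacency entries (values not shown), followed by the class column:
User    Color
Ana     ?
Bob     ?
Chris   Orange
Don     Green
Emma    Orange
Fabio   Blue
Gia     ?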
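The flattening step can be illustrated as follows. The edge list is hypothetical, since the slide's adjacency values are not available, and the nearest-neighbour rule on Hamming distance is only a stand-in for whatever classifier is actually used; the slide says only that the problem becomes classification.

```python
users = ["Ana", "Bob", "Chris", "Don", "Emma", "Fabio", "Gia"]
# Hypothetical undirected friendship edges.
edges = [("Ana", "Chris"), ("Ana", "Emma"), ("Bob", "Fabio"), ("Bob", "Gia"),
         ("Chris", "Emma"), ("Don", "Emma"), ("Fabio", "Gia")]
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}

def adjacency_rows(users, edges):
    """Return each user's 0/1 adjacency row, in the column order given by `users`."""
    index = {u: i for i, u in enumerate(users)}
    rows = {u: [0] * len(users) for u in users}
    for a, b in edges:
        rows[a][index[b]] = 1
        rows[b][index[a]] = 1
    return rows

def hamming(x, y):
    """Number of positions where two rows differ."""
    return sum(a != b for a, b in zip(x, y))

def link_attack(target, rows, public_values):
    """Stand-in classifier: copy the label of the public user with the closest row."""
    nearest = min(public_values, key=lambda u: hamming(rows[target], rows[u]))
    return public_values[nearest]

rows = adjacency_rows(users, edges)
for user in ["Ana", "Bob", "Gia"]:
    print(user, rows[user], link_attack(user, rows, public_colors))
```

The public users' rows act as the training set and the private users' rows as the test set, so any off-the-shelf classifier could replace `link_attack`.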
Group-based attacks
Groupmate-link model (CLIQUE)
–Considers all people in a group as friends
Group-based classification model (GROUP)
–Considers each group as a feature in a classifier
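Groupmate-link model (CLIQUE)
[Figure: two groups. True Blue Lovers: Bob, Gia, Fabio. Espresso Lovers: Bob, Emma, Chris, Don. Hidden values marked "?".]
–Consider everyone in a group a friend
–Then flatten to an adjacency matrix
–Then use the previous LINK methods
Table: one row per user; columns Ana, Bob, Chris, Don, Emma, Fabio, Gia hold the groupmate-link entries (values not shown), followed by the class column:
User    Color
Ana     ?
Bob     ?
Chris   Orange
Don     Green
Emma    Orange
Fabio   Blue
Gia     ?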
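Turning groups into groupmate links amounts to connecting every pair of members in each group. The sketch below uses the two groups from the slide; the function name `groupmate_links` is hypothetical, and the resulting edge list would then be flattened and classified exactly as in LINK.

```python
from itertools import combinations

# Group memberships as shown on the slide.
groups = {
    "True Blue Lovers": ["Bob", "Gia", "Fabio"],
    "Espresso Lovers": ["Bob", "Emma", "Chris", "Don"],
}

def groupmate_links(groups):
    """Link every pair of users who share at least one group."""
    links = set()
    for members in groups.values():
        # Sorting makes each pair canonical, so (a, b) and (b, a) are not both stored.
        for a, b in combinations(sorted(members), 2):
            links.add((a, b))
    return links

links = groupmate_links(groups)
print(sorted(links))
```

Ana belongs to neither group, so she gets no groupmate links at all, and a large group contributes a number of links that grows quadratically with its size.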
Group-based classification model (GROUP)
[Figure: two groups. True Blue Lovers: Bob, Gia, Fabio. Espresso Lovers: Bob, Emma, Chris, Don. Hidden values marked "?".]
Use groups as a feature set
–Prune away less useful groups
Homogeneity, measured by entropy (h)
Size: smaller groups might be better.
User    True Blue Lovers   Espresso Lovers   Color
Ana     0                  0                 ?
Bob     1                  1                 ?
Chris   0                  1                 Orange
Don     0                  1                 Green
Emma    0                  1                 Orange
Fabio   1                  0                 Blue
Gia     1                  0                 ?
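The pruning and feature-building steps can be sketched as below. Memberships and colors come from the slide's table; the entropy is computed over each group's public members only, and the cutoff of 0.5 bits is an arbitrary value chosen for illustration. All function names are hypothetical.

```python
import math
from collections import Counter

# Group memberships and public colors as listed in the slide's table.
groups = {
    "True Blue Lovers": ["Bob", "Fabio", "Gia"],
    "Espresso Lovers": ["Bob", "Chris", "Don", "Emma"],
}
public_colors = {"Chris": "Orange", "Don": "Green", "Emma": "Orange", "Fabio": "Blue"}
users = ["Ana", "Bob", "Chris", "Don", "Emma", "Fabio", "Gia"]

def group_entropy(members, public_values):
    """Entropy (in bits) of the sensitive attribute over a group's public members."""
    counts = Counter(public_values[m] for m in members if m in public_values)
    total = sum(counts.values())
    if total == 0:
        return None  # no public members, so homogeneity cannot be measured
    return sum((n / total) * math.log2(total / n) for n in counts.values())

def prune_groups(groups, public_values, max_entropy):
    """Keep only groups whose entropy is at or below the assumed threshold."""
    kept = []
    for name, members in groups.items():
        h = group_entropy(members, public_values)
        if h is not None and h <= max_entropy:
            kept.append(name)
    return kept

def membership_features(users, groups, kept):
    """One 0/1 feature per kept group, matching the layout of the slide's table."""
    return {u: [int(u in groups[g]) for g in kept] for u in users}

entropies = {g: group_entropy(m, public_colors) for g, m in groups.items()}
kept = prune_groups(groups, public_colors, max_entropy=0.5)  # 0.5 is an arbitrary cutoff
features = membership_features(users, groups, list(groups))  # unpruned feature table
print(entropies, kept, features)
```

Lower entropy means a more homogeneous group, so membership in it says more about the hidden value; a classifier would then be trained on the feature rows of the public users.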
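LINK-GROUP
Use friends and groups as features, then apply a traditional classifier.
Table: one row per user; columns Ana, Bob, Chris, Don, Emma, Fabio, Gia (friendship links) and True Blue, Espresso (group memberships) hold the feature values (not shown), followed by the class column:
User    Color
Ana     ?
Bob     ?
Chris   Orange
Don     Green
Emma    Orange
Fabio   Blue
Gia     ?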
Using Both –Groups and Links LINK-GROUP –Uses the links and groups as features in a classifier model
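Combining the two feature sets is a matter of concatenating each user's link row with their group-membership row. The sketch below assumes a hypothetical edge list (the slide's links are not recoverable) together with the slide's group memberships; `link_group_features` is an illustrative name.

```python
users = ["Ana", "Bob", "Chris", "Don", "Emma", "Fabio", "Gia"]
# Hypothetical undirected friendship edges.
edges = [("Ana", "Chris"), ("Ana", "Emma"), ("Bob", "Fabio"), ("Bob", "Gia"),
         ("Chris", "Emma"), ("Don", "Emma"), ("Fabio", "Gia")]
# Group memberships as shown on the slides.
groups = {
    "True Blue Lovers": ["Bob", "Fabio", "Gia"],
    "Espresso Lovers": ["Bob", "Chris", "Don", "Emma"],
}

def link_group_features(users, edges, groups):
    """Build one feature row per user: adjacency entries followed by group memberships."""
    neighbours = {u: set() for u in users}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    columns = users + list(groups)
    table = {}
    for u in users:
        link_part = [int(v in neighbours[u]) for v in users]
        group_part = [int(u in members) for members in groups.values()]
        table[u] = link_part + group_part
    return columns, table

columns, table = link_group_features(users, edges, groups)
print(columns)
print(table["Bob"])
```

The combined rows can be passed to the same kind of classifier used for LINK or GROUP alone.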
Facebook Data
Link-based attacks
–AGG, CC, BLOCK similar to baseline
–LINK’s accuracy varied between 65.3% and 73.5%
Group-based attacks
–73.4% success in determining gender
Mixed model
–72.5%, no improvement
–57.8%, or 1% better than BASIC, on political views
How good is this paper? How good are their attack methods?
We can attack in more ways
–Using image recognition
–Using the names of people and “googling” them
Also applies to doing the same to their friends
–Searching for keywords in wall posts