Assessing the Veracity of Identity Assertions via OSNs Michael Sirivianos Telefonica Research with: Kyunbaek Kim (UC Irvine), Jian W. Gan (TellApart), Xiaowei Yang (Duke University)

Leveraging Social Trust to address a tough problem Assessing the credibility of identity statements made by web users

Why Social Trust?  It requires effort to build up social relationships:  The social graph can be used to defeat Sybils  Online Social Networks (OSNs) help users organize and manage their social contacts  It is easy to augment the OSN UI with features that allow users to declare whom they trust and by how much

An online world without identity credentials makes determining who and what to believe difficult


How can ``Merkin'' convince us that he is a chef?

More Motivating Scenarios  Trustworthy online communication:  Dating websites, Craigslist, eBay transactions, first contact in OSNs  ``I work in...'', ``I am an honest seller'', ``My name is ''  Access control:  Age-restricted sites: ``I am over 18 years old''  We need a new online identity verification primitive that:  enables users to post assertions about their identity  informs online users and services on whether they should trust a user's assertions  preserves the anonymity of the user that posts the assertion  does not require strong identity or infrastructure changes

Our Approach  Crowd-vetting  Employ friend feedback (tags) to determine whether an online user's assertion is credible  A new application of OSNs and user feedback  OSNs have so far been used to block spam (Re:, Ostra) and for Sybil-resilient online voting (Sumup)  User feedback has so far been used for recommendations (Yelp, Amazon, YouTube, eBay, etc.)  Our main contribution lies in combining OSNs and user feedback to provide credible assertions

Our Solution: FaceTrust  Online social network users tag their friends' identity assertions  OSN providers issue web-based credentials on a user's assertions using his friends' feedback  The credentials bind the assertion to a measure of its credibility  They are intended for non-critical applications, where they can help users or services make informed decisions  Uses trust inference to prevent manipulation

Social Tagging  A Facebook app ``with a purpose''  Users post assertions on their OSN profiles:  e.g., "Am I really over 18 years old?"  Friends tag those assertions as TRUE or FALSE



An Amazon review example  I want to write a scathing review for Sergei's book  I want to prove that I am indeed a CS researcher, thus my review is authoritative and readers can take it seriously  I don't want Sergei to know I wrote the review  Amazon's ``Real Name'' is not an option



We aim at determining the ground truth on identity assertions  We assume that user beliefs reflect the ground truth

Because user feedback is not fully reliable, we provide a credibility measure in 0-100% that should correlate strongly with the truth  We refer to this measure as Assertion Veracity  Verifiers can use thresholds suggested by the OSN provider, e.g., accept as true if > 50%

Outline  How to defend against colluders and Sybils?  Manipulation-resistant assertion veracity  Trust inference  OSN-issued web-based credentials  Unforgeable credentials  Anonymous and unlinkable credentials  Evaluation

Our Non-manipulability Objective  It should be difficult for dishonest users to post assertions that appear true  Ensure that honest users can post assertions that appear true

Veracity values should be informative  The veracity values should correlate strongly with the truth.  If an assertion has higher veracity than another → the assertion is more likely to be true  Useful for devices/users that have prior experience with the system and know the veracity values of known true assertions

Manipulation-resistant Assertion Veracity  User j posts an assertion. Only his friends i can tag the assertion  Use the weighted average of the tags $d_{ij}$ by friends i on j's assertion, with each tag weighted by $w_i$  If TRUE, $d_{ij} = +1$. If FALSE, $d_{ij} = -1$  j's assertion veracity $= \max\left(\sum_i w_i d_{ij} / \sum_i w_i,\ 0\right)$  FALSE tags matter more  To defend against colluders that have low weight: if $\sum_i w_i < M$, assertion veracity = 0
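The veracity formula on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary-based inputs, and the example weights are hypothetical and do not come from the FaceTrust implementation. Only the formula and the minimum-weight rule (threshold M) come from the slide.

```python
from typing import Mapping


def assertion_veracity(
    tags: Mapping[str, int],        # friend id -> d_ij, +1 for TRUE, -1 for FALSE
    weights: Mapping[str, float],   # friend id -> w_i (tagger trustworthiness)
    min_total_weight: float,        # the threshold M from the slide
) -> float:
    """Return max(sum_i w_i * d_ij / sum_i w_i, 0), or 0 if sum_i w_i < M."""
    total_weight = sum(weights[i] for i in tags)
    # Low-weight colluders: too little total trust behind the tags.
    if total_weight < min_total_weight or total_weight == 0:
        return 0.0
    weighted_sum = sum(weights[i] * d for i, d in tags.items())
    return max(weighted_sum / total_weight, 0.0)


# Hypothetical example: two trusted friends say TRUE, one less trusted says FALSE.
example_tags = {"a": +1, "b": +1, "c": -1}
example_weights = {"a": 3, "b": 2, "c": 1}
print(round(assertion_veracity(example_tags, example_weights, 4), 3))  # 0.667
print(assertion_veracity(example_tags, example_weights, 10))           # 0.0
```

A verifier would then compare the returned value against a threshold suggested by the OSN provider, for example accepting the assertion if the veracity exceeds 0.5.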

Tagger Trustworthiness  Each tagger i is assigned a tagger trustworthiness score $w_i$, the weight used in: assertion veracity $= \max\left(\sum_i w_i d_{ij} / \sum_i w_i,\ 0\right)$  Trust inference analyzes the social graph of taggers and their tagging history  Tagger credibility is derived via trust inference

Trust Inference via the Social Graph  Honest users tend to befriend honest users  Each edge in the social graph implies trust  Annotate trust edges by tagging similarity:  History-defined similarity = #same-tags / #common-tags, e.g., if 2 friends have tagged the same 2 assertions of a common friend and agree on only 1 tag, they have 50% similarity  Linearly combine history- and user-defined similarity (``Do I honestly tag my friends?'')  It is difficult for Sybils to establish similarity-annotated trust edges with honest users
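The similarity annotation can be illustrated with a short Python sketch. The slide states only that history-defined similarity is #same-tags / #common-tags and that it is linearly combined with user-defined similarity. The mixing coefficient `alpha`, the function names, and the fallback used when two friends share no tagged assertions are assumptions made here for illustration.

```python
from typing import Mapping, Optional


def history_similarity(
    tags_a: Mapping[str, int], tags_b: Mapping[str, int]
) -> Optional[float]:
    """#same-tags / #common-tags over assertions both users tagged.

    Each mapping goes from an assertion id to +1 (TRUE) or -1 (FALSE).
    Returns None when the two users have no tagged assertion in common.
    """
    common = tags_a.keys() & tags_b.keys()
    if not common:
        return None
    same = sum(1 for assertion in common if tags_a[assertion] == tags_b[assertion])
    return same / len(common)


def combined_similarity(
    history: Optional[float], user_defined: float, alpha: float = 0.5
) -> float:
    """Linear combination; alpha is a hypothetical mixing weight."""
    if history is None:
        # Assumption: with no shared history, fall back to the declared value.
        return user_defined
    return alpha * history + (1 - alpha) * user_defined


# The slide's example: same 2 assertions tagged, agreement on only 1 of them.
h = history_similarity({"x": +1, "y": +1}, {"x": +1, "y": -1})
print(h)                            # 0.5
print(combined_similarity(h, 1.0))  # 0.75
```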

Our Trust Inference Problem  Dishonest users may employ Sybils  Dishonest users can try to build up high similarity with their honest friends  The input is the trust graph G(V, E)  V are the users in the social network  E are directed friend connections annotated by tagging similarity  The output is the tagger trustworthiness of users

Similarity-annotated Trust Graph  How to translate this trust graph into tagger trustworthiness values for all users?  Plethora of prior work on trust inference  The trust inference method determines how trust propagates from the trusted seeds to users  The closer and the better connected the dishonest user is to the trust seed, the greater the trust it can obtain  Having multiple trust seeds reduces the trust a dishonest user can obtain by focusing on a seed  We need a Sybil-resilient trust inference method  The trust that dishonest users and Sybils can obtain should be limited by the edges connecting them to the honest region

MaxTrust  We transform the similarity-annotated trust graph into a single-source/single-sink flow graph  The cost of computation should not increase with the number of trusted seeds  Unlike Advogato, the cost of MaxTrust's max-flow heuristic is independent of the number of seeds: $O(T_{max} |E| \log |V|)$  The tagger trustworthiness of a user is $0 \le w_i \le T_{max}$, in increments of 1
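The sketch below is not the MaxTrust heuristic, whose details are not given on these slides. It only illustrates two ideas the slides rely on: attaching all trusted seeds to one super-source so that a single flow computation covers any number of seeds, and the fact that the flow reaching a dishonest region is capped by the capacity of the edges connecting it to the honest region. The textbook Edmonds-Karp algorithm is swapped in for the max-flow step. The integer edge capacities, the node names, and the `SUPER_SOURCE` label are all hypothetical.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp max flow on a dict-of-dicts capacity map."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, neighbours in capacity.items():
        for v, c in neighbours.items():
            residual[u][v] += c
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in list(residual[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push flow through it.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck


def with_super_source(trust_edges, seeds, seed_capacity):
    """Copy the graph and connect one super-source to every trusted seed."""
    graph = {u: dict(neighbours) for u, neighbours in trust_edges.items()}
    graph["SUPER_SOURCE"] = {seed: seed_capacity for seed in seeds}
    return graph


# Hypothetical graph: mallory has one weak edge from the honest region
# and two strongly connected Sybils behind it.
edges = {
    "seed1": {"alice": 3},
    "seed2": {"alice": 3},
    "alice": {"mallory": 1},
    "mallory": {"sybil1": 100, "sybil2": 100},
}
g = with_super_source(edges, ["seed1", "seed2"], seed_capacity=5)
print(max_flow(g, "SUPER_SOURCE", "alice"))    # 6
print(max_flow(g, "SUPER_SOURCE", "mallory"))  # 1
print(max_flow(g, "SUPER_SOURCE", "sybil1"))   # 1
```

However many Sybils mallory creates, the flow they can receive stays bounded by the single capacity-1 edge from alice, which is the property a Sybil-resilient trust inference method needs.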

Outline  How to defend against colluders and Sybils?  Manipulation-resistant assertion veracity  Sybil-resilient trust inference  OSN-issued web-based credentials  Unforgeable credentials  Anonymous and unlinkable credentials  Evaluation  Introducing social trust to collaborative spam mitigation

OSN-issued credentials  Issued by the OSN provider:  {assertion type, assertion, assertion veracity}  Simple and web-based  Easy to parse by human users with no need to understand cryptographic tools [``Why Johnny can't Encrypt'', Security 99]  XML web API to enable online services to read credentials


Unforgeable credentials  the user binds the credential to the context he is issuing it for. Thus, no user can reuse it in another context
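One way to picture context binding is a keyed MAC computed by the OSN provider over the credential fields plus the context. The slides do not describe the actual FaceTrust mechanism, so everything below is an assumption made for illustration: the field names, the use of HMAC-SHA256, the example key and context string, and the idea that the verifier asks the OSN provider (the key holder) to check the credential.

```python
import hashlib
import hmac
import json


def issue_credential(osn_key, assertion_type, assertion, veracity, context):
    """OSN side: bind {assertion type, assertion, veracity} to a context."""
    payload = {
        "type": assertion_type,
        "assertion": assertion,
        "veracity": veracity,
        "context": context,  # e.g. the page the credential is issued for
    }
    message = json.dumps(payload, sort_keys=True).encode("utf-8")
    mac = hmac.new(osn_key, message, hashlib.sha256).hexdigest()
    return {**payload, "mac": mac}


def verify_credential(osn_key, credential, expected_context):
    """OSN side: check integrity and that the context matches the verifier's."""
    payload = {k: v for k, v in credential.items() if k != "mac"}
    message = json.dumps(payload, sort_keys=True).encode("utf-8")
    expected_mac = hmac.new(osn_key, message, hashlib.sha256).hexdigest()
    mac_ok = hmac.compare_digest(expected_mac, str(credential.get("mac", "")))
    return mac_ok and payload.get("context") == expected_context


# Hypothetical usage: a credential issued for one review page.
key = b"hypothetical-osn-secret"
cred = issue_credential(key, "profession", "I am a CS researcher", 0.82,
                        "review:book-123")
print(verify_credential(key, cred, "review:book-123"))  # True
print(verify_credential(key, cred, "review:book-999"))  # False
```

Because the context is covered by the MAC, copying the credential to another page or raising the veracity value causes verification to fail.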



Anonymous credentials  as long as the OSN provider is trusted and the assertion does not contain personally identifiable info  Unlinkable credentials  as long as the user does not list a credential along with his other credentials and he creates a new credential for each distinct verifier

Effectiveness Simulation  How well do credibility scores correlate with the truth?  Can the design withstand dishonest user tagging and Sybil attacks?  Evaluating the effectiveness of trust inference in our setting:  user feedback weighted by trust derived from the social graph and tagging history

Veracity is reliable under Sybils  50% of users are honest. The veracity of true assertions is substantially higher than the veracity of dishonest assertions  The number of Sybils does not affect it

Facebook Deployment  Do users tag?  Is the UI sufficiently intuitive and attractive?  Do users tag honestly?  Does trust inference work in real life?  Measure the average veracity of a priori known true and false assertions  Data set  395-user Facebook AIR social graph  14575 tags, 5016 assertions, 2410 social connections

Users tag honestly  Veracity correlates strongly with the truth  Real users tag mostly honestly

FaceTrust contributions  A solution to the problem of identity verification in non-critical online settings  crowd-vetting through social tagging  trust inference for attack resistance  simple, lightweight web-based credentials with optional anonymity and unlinkability.  Deployment of our social tagging and credential issuing front-ends  collection of real user tagging data  proof of feasibility

Thank You! Facebook application "Am I Really?" at: Questions?

The bar for the non-manipulability of user-feedback-based systems is low  Plain majority voting  Weighted majority voting  reduces the weight of votes submitted by friends, but is easily manipulable with Sybils  Yet users still rely on these systems to make informed decisions

Threat Model  Dishonest users  tag as TRUE the dishonest assertions posted by colluding dishonest users  create Sybils that tag their dishonest assertions as TRUE  tag as TRUE the honest assertions posted by honest users to build trust with honest users  Sybils, which are friends only with dishonest users, post assertions, which cannot be voted FALSE (Sybil assertion poster attack)

Assumptions  Honest users  tag correctly to the best of their knowledge  when they tag the same assertions, their tags mostly match  most of their friends will not tag their honest assertions as FALSE  do not indiscriminately add friends

