Presentation is loading. Please wait.

Presentation is loading. Please wait.

Jure Leskovec (Stanford), Daniel Huttenlocher and Jon Kleinberg (Cornell)

Similar presentations


Presentation on theme: "Jure Leskovec (Stanford), Daniel Huttenlocher and Jon Kleinberg (Cornell)"— Presentation transcript:

1 Jure Leskovec (Stanford), Daniel Huttenlocher and Jon Kleinberg (Cornell)

2  Rich social structure in online computing applications  Such structures are modeled by networks  Most social network analyses view links as positive  Friends  Fans  Followers  But generally links can convey either friendship or antagonism 2

3  Our plan  Study social interactions on the Web that have positive and negative relationships  Questions  How do edge signs and network structure interact?  Approach  Edge sign prediction problem  Given a network and signs on all but one edge, predict the missing sign  Applications  Friend recommendation for social media  Easier to predict whether you know someone vs. to predict what you think of them 3 + + ? + + + + + – – – – – – –

4  Each link A  B is explicitly tagged with a sign:  Epinions: Trust/Distrust  Does A trust B’s product reviews? (only positive links are visible)  Wikipedia: Support/Oppose  Does A support B to become Wikipedia administrator?  Slashdot: Friend/Foe  Does A like B’s comments?  Other examples:  World of Warcraft [Szell et al. 2010] + + + + + + + + – – – – – – – 4

5  Edge signs can be predicted with ~90% accuracy using only the local network structure  No need for global trust-propagation mechanisms  Data oriented justification of classical theories from social psychology  Our models align with theories of Balance and Status  Near perfect generalization:  Same underlying mechanism of signed edge creation  Can train on “how people vote” and predict trust as well as the model trained on trust itself 5

6 Machine Learning formulation:  Predict sign of edge (u,v)  Class label:  +1: positive edge  -1: negative edge  Learning method:  Logistic regression  Evaluation:  Accuracy and ROC curves  Dataset:  Original: 80% +edges  Balanced: 50% +edges  Features for learning:  Next slide 6 u u v v + + ? + + + + + – – – – – – –

7 For each edge (u,v) create features:  Triad counts (16):  Counts of signed triads edge u  v takes part in  Node degree (7 features):  Signed degree:  d + out (u), d - out (u), d + in (v), d - in (v)  Total degree:  d out (u), d in (v)  Embeddedness of edge (u,v) 7 u v - + ++ -- + -

8  Error rates:  Epinions: 6.5%  Slashdot: 6.6%  Wikipedia: 19%  Signs can be modeled from local network structure alone  Trust propagation model of [Guha et al. ‘04] has 14% error on Epinions  Triad features perform less well for less embedded edges  Wikipedia is harder to model:  Votes are publicly visible 8 Epin Slash Wiki

9  Our goal is not just to predict signs but also to derive insights into usage of signed edges  Logistic regression learns a weight b i for each feature x i :  Connection to theories from social psychology:  Structural balance  Theory of status which both give predictions on the sign of the edge (u,v) based on the triad it is embedded into 9 uv +-

10 Consider edges as undirected  Start with intuition [Heider ’46]:  Friend of my friend is my friend  Enemy of enemy is my friend  Enemy of friend is my enemy  Look at connected triples of nodes that are consistent with this logic: 10 ++ + - - + - - +

11  Status theory [Davis-Leinhardt ‘68, Guha et al. ’04, Leskovec et al. ‘10]  Link u  v means: v has higher status than u  Link u  v means: v has lower status than u  Based on signs/directions of links from/to node x make a prediction  Status and balance can make different predictions: 11 + – uv x - - uv x + + Balance: + Status: – LogReg: – Balance: + Status: – LogReg: – Balance: + Status: – LogReg: – Balance: + Status: – LogReg: – uv x - + Balance: – Status: – LogReg: – Balance: – Status: – LogReg: –

12 12 + + + - - + - - + + + - - + - - + + + - - + - - + + + - - + - -

13  Both theories agree well with learned models  Further observations:  Backward-backward triads have smaller weights than forward and mixed direction triads  Balance is in better agreement with Epinions and Slashdot while Status is with Wikipedia  Balance consistently disagrees with “enemy of my enemy is my friend” 13 v v x x u u

14  Balance based and learned coefficients: 14 Feature Balance theory EpinSlashdotWiki const00.431.490.04 10.050.040.05 -0.11-0.24-0.16 -0.21-0.35-0.14 1-0.01-0.03-0.05 + - - - + - - - + + + + Model if signs would be created purely based on Balance theory

15  Status based and learned coefficients: 15 Feature Status theory EpinSlashdotWiki const0-0.68-1.39-0.30 u < x < v10.110.050.03 u > x > v-0.10-0.11-0.19 u v00.060.160.03 u > x < v0-0.010.040.05 v v x x u u + + Triads where u > x > v Triads where u > x > v v v x x u u v v x x u u v v x x u u + + – – – – Model if signs would be created purely based on Status theory

16  Deterministic models compare well to Learned models  Epinions and Slashdot:  More embedded edges are easier to predict  Wikipedia:  Status outperforms balance  Learned balance performs nearly as well as the full model 16 Epin Slash Wiki

17  Do people use these very different linking systems by obeying the same principles?  How generalizable are the results across the datasets?  Train on row “dataset”, predict on “column”  Almost perfect generalization of the models even though networks come from very different applications 17

18  Suppose we are only interested in predicting whether there is a trust edge or no edge  Does knowing negative edges help? 18 + + ? + + + + + – – – – – – – + + ? + + + + + Vs. YES!

19  Both theories make predictions about the global structure of the network  Structural balance – Factions  Put nodes into groups such that the number of in group “+” and between group “-” edges is maximized  Status theory – Global Status  Flip direction and sign of negative edges  Assign each node a unique status value so that most edges point from low to high 19 5 1 3 + + -

20  Fraction of edges of the network that satisfy Balance and Status?  Observations:  No evidence for global balance beyond the random baselines  Real data is 80% consistent vs. 80% consistency under random baseline  Evidence for global status beyond the random baselines  Real data is 80% consistent, but 50% consistency under random baseline 20

21  Signed networks provide insight into how social computing systems are used:  Status vs. Balance  Sign of relationship can be reliably predicted from the local network context  ~90% accuracy sign of the edge  More evidence that networks are globally organized based on status  People use signed edges consistently regardless of particular application  Near perfect generalization of models across datasets  Negative information helps in predicting positive edges 21

22 22Jure Leskovec

23  Heuristic predictors for (u,v):  Balance: Chose sign that makes majority of the triads balanced  Status: Predict in based on status  (v)=d + in (u)+d - out (u)-d + out (v)-d - in (v)  Out-sign of v: majority sign  In-sign of u: majority sign  Observations:  Triadic models do better with increasing embeddedness 23


Download ppt "Jure Leskovec (Stanford), Daniel Huttenlocher and Jon Kleinberg (Cornell)"

Similar presentations


Ads by Google