
1 Effects of User Similarity in Social Media. Ashton Anderson, Jure Leskovec (Stanford University), Daniel Huttenlocher, Jon Kleinberg (Cornell University). Presented by Avia (Ilouz) Cohen.

2

3 Abstract Many on-line social applications include mechanisms for users to express evaluations of one another, or of the content they create. What affects the types of such evaluations?

4 Dataset description
Wikipedia: free encyclopedia
Stack Overflow: question-answering site
Epinions: reviewing site

5 Dataset: Ways of evaluation
Direct evaluation: the evaluator E evaluates the target T directly.
Indirect evaluation: the evaluator E evaluates content created by the target T.

6 Which factors influence evaluations?
Similarity: similarity of interests, similarity of social ties
Relative status

7 Which factors influence evaluations?
Similarity: similarity of interests, similarity of social ties
Relative status

8 Similarity of interests: Definition
User u's binary action vector: a vector of length M with 1 in the i-th position if u took action i at least once, and 0 otherwise.
An alternative definition, relevant on Stack Overflow, uses u's binary tag vector instead.
Similarity of users u and v: s(u,v) = (u · v) / (‖u‖ ‖v‖)

9 Similarity of interests: Example
u's binary action vector: u = (1,0,1,0,0)
v's binary action vector: v = (0,0,1,1,1)
u and v's similarity: s(u,v) = (u · v) / (‖u‖ ‖v‖) = 1 / (√2 · √3) = 1/√6
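A minimal Python sketch of this cosine similarity on binary action vectors (the function and variable names here are illustrative, not taken from the paper):

import math

def cosine_similarity(u, v):
    # The dot product counts the actions that u and v share.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: treat a user with no actions as having similarity 0
    return dot / (norm_u * norm_v)

# Worked example from this slide: s(u, v) = 1 / sqrt(6), about 0.41
print(cosine_similarity([1, 0, 1, 0, 0], [0, 0, 1, 1, 1]))

The same function applies unchanged to the binary evaluation vectors used for similarity of social ties (slide 12), where it gives 2 / (√2 · √2) = 1.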

10 Which factors influence evaluations?
Similarity: similarity of interests, similarity of social ties
Relative status

11 Similarity of social ties: Definition
A user can be characterized by the set of users they have evaluated.
User u's binary evaluation vector: a vector of length M with 1 in the i-th position if u evaluated user i, and 0 otherwise.
Social similarity: s(u,v) = (u · v) / (‖u‖ ‖v‖)

12 Similarity of social ties: Example
u's binary evaluation vector: u = (0,0,1,1,0)
v's binary evaluation vector: v = (0,0,1,1,0)
u and v's similarity: s(u,v) = (u · v) / (‖u‖ ‖v‖) = 2 / (√2 · √2) = 1

13 How does similarity influence evaluations?
What do you think?

14 And the answer is…

15 Similarity vs. P(+): Wikipedia
The relationship between similarity and P(+) is monotonically increasing.
P(+) is the fraction of positive evaluations in a given set of evaluations.

16 Similarity vs. P(+): Stack Overflow
The strength of the relationship depends on the notion of similarity used: tag similarity vs. evaluation similarity.
P(+) is the fraction of positive evaluations in a given set of evaluations.

17 Similarity vs. P(+): Epinions
Similarity of interests: produces effects similar to Stack Overflow.
Similarity of social ties: turns out to be too sparse to be meaningful.

18 So far: similarity is correlated with P(+).
Next: how both similarity and status affect evaluations.

19 Which factors influence evaluations?
Similarity: similarity of interests, similarity of social ties
Relative status

20 Status: Definition
User u's (absolute) status σ_u is the total number of actions u has taken.

21 Differential status: Definition
Evaluation behavior does not vary with the target's status alone, but with the evaluator's and target's statuses together.
Differential status: Δ = σ_E − σ_T

22 Status: Example
u's absolute status: σ_u = ‖(1,0,1,0,0)‖² = 2
v's absolute status: σ_v = ‖(0,0,1,1,1)‖² = 3
u and v's differential status: Δ = σ_u − σ_v = −1
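A small sketch of absolute and differential status under the same binary-vector setup (illustrative names; following the slide, absolute status is measured as the squared norm of the binary action vector, i.e. the number of distinct actions taken):

def absolute_status(action_vector):
    # For a 0/1 vector, the squared L2 norm equals the number of actions taken.
    return sum(a * a for a in action_vector)

def differential_status(evaluator_vector, target_vector):
    # Delta = sigma_E - sigma_T
    return absolute_status(evaluator_vector) - absolute_status(target_vector)

# Worked example from this slide: sigma_u = 2, sigma_v = 3, Delta = -1
print(differential_status([1, 0, 1, 0, 0], [0, 0, 1, 1, 1]))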

23 Δ vs. P(+): Wikipedia
For low-similarity pairs:
Δ < 0: votes are about 80% positive
Δ > 0: votes are about 55% positive
As similarity increases, votes get more positive across all values of Δ.
Δ = σ_E − σ_T. P(+) is the fraction of positive evaluations in a given set of evaluations.

24 Δ vs. P(+): Stack Overflow
When Δ > 0: the higher the similarity, the higher P(+) (same as Wikipedia).
When Δ < 0: the lower the similarity, the higher P(+) (the similarity curves appear in the opposite order).
Why does this happen?
Δ = σ_E − σ_T

25 Δ vs. P(+): Stack Overflow
Why does this happen? Stack Overflow's reputation system charges a small amount of reputation to cast a down-vote. This disincentive to down-vote is felt most strongly by users with lower status.
Δ = σ_E − σ_T

26 Δ vs. P(+): Stack Overflow
The effect disappears when low-status evaluators (σ_E < 100) are removed.

27 So far: similarity controls the extent to which differential status influences evaluations.
Next: the relationship between similarity and status.

28 Similarity vs. Δ: Wikipedia elections
Selection effect: users with higher status tend to vote on targets who are active in the same areas as they are.

29 Similarity vs. Δ: Stack Overflow
Evaluator-target pairs become more similar as Δ moves away from 0.
No selection effect.

30 Similarity vs. Δ: Wikipedia and Stack Overflow
Hypothesis: this happens because evaluations occur in different contexts on Stack Overflow and Wikipedia.
The similarity ranges in both plots are very small, so the differences between the plots are not as large as they appear.

31 Is Δ an appropriate way to compare statuses?

32 Absolute status vs. P(+): Stack Overflow
The P(+) curves are approximately flat and do not depend on σ_T.
P(+) depends on Δ: the higher the Δ range, the lower P(+).

33 Summary: Main insights
P(+) is positively correlated with similarity.
Δ plays a major role for low-similarity pairs.
There is a selection effect on Wikipedia, but not on Stack Overflow.
Δ is an appropriate measure of relative status.

34 How are similarity and status useful?

35

36 Ballot-blind prediction
Knowledge of the similarity and status of the first few voters alone provides enough information to successfully predict the final result.

37 Ballot-blind prediction: Definition
The task: predicting administrator promotion on Wikipedia from the early voters.
The method: looking at properties of the first few voters and their similarity with the candidate, without looking at the signs of their votes.

38 Ballot-blind prediction: Experimental setup
An election is a triple (T, Ω, R): the target T, the set of votes Ω, and the result R ∈ {+1, −1}.
Each vote v_i ∈ Ω is a triple (E_i, s_i, t_i): the evaluator E_i, the vote sign s_i ∈ {+1, −1}, and the time of the vote t_i.
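One way to hold this setup in code, sketched with Python dataclasses (field names mirror the slide's notation; they are not from any released code):

from dataclasses import dataclass
from typing import List

@dataclass
class Vote:
    evaluator: str   # E_i
    sign: int        # s_i in {+1, -1}
    time: float      # t_i, time of the vote

@dataclass
class Election:
    target: str        # T, the candidate
    votes: List[Vote]  # Omega, assumed ordered by time
    result: int        # R in {+1, -1}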

39 Ballot-blind prediction: Experimental setup
English Wikipedia: 7,600 voters, 120,000 votes, 2,953 candidates, 3,422 elections.
k = 5: only the first 5 voters in each election are examined.

40 Ballot-blind prediction: Classes of features
Simple summary statistics (S):
log(σ_T)
s̄ = (1/k) Σ_{i=1}^{k} s(E_i, T)
Δ̄ = (1/k) Σ_{i=1}^{k} (σ_{E_i} − σ_T)
Δ-s quadrants: partition the Δ-s space into four quadrants (s = 0.025 on Wikipedia).
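A hedged sketch of the S features for the first k voters of an election, reusing cosine_similarity and absolute_status from the earlier sketches and the Election dataclass from slide 38; the action_vectors lookup and all names here are illustrative assumptions:

import math

def summary_features(election, action_vectors, k=5):
    # action_vectors: dict mapping each user to their binary action vector
    first_k = election.votes[:k]
    target_vec = action_vectors[election.target]
    sigma_t = absolute_status(target_vec)  # assumed >= 1 so the log is defined

    sims = [cosine_similarity(action_vectors[v.evaluator], target_vec) for v in first_k]
    deltas = [absolute_status(action_vectors[v.evaluator]) - sigma_t for v in first_k]

    return {
        "log_sigma_T": math.log(sigma_t),            # log(sigma_T)
        "mean_similarity": sum(sims) / len(sims),    # s-bar
        "mean_delta": sum(deltas) / len(deltas),     # Delta-bar
    }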

41 Ballot-blind prediction: Baselines
Baseline B1: a logistic regression classifier over the S statistics and the 4 Δ-s quadrant features.
Baseline B2: threshold the average voter positivity (1/k) Σ_{i=1}^{k} P_i, where a voter's positivity P_i is their historical fraction of positive votes.
Baseline GS ("gold standard", the best possible performance): threshold the average of the actual vote signs, (1/k) Σ_{i=1}^{k} s_i.
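Baselines B2 and GS reduce to averaging and thresholding; a minimal sketch, where the 0.5 threshold for B2 and the 0 threshold for GS are assumptions rather than values given in the slides:

def predict_b2(positivities, threshold=0.5):
    # positivities: the historical positivity P_i of each of the first k voters
    avg = sum(positivities) / len(positivities)
    return 1 if avg >= threshold else -1

def predict_gold_standard(signs):
    # signs: the actual first-k vote signs in {+1, -1}; an upper bound on performance
    return 1 if sum(signs) >= 0 else -1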

42 Ballot-blind prediction: Main assumptions
A user's positivity P_i indicates their default voting behavior.
Deviations from this default behavior depend on (Δ_i, s_i).
d(Δ_i, s_i) is the average deviation of the fraction of positive votes in the (Δ_i, s_i) bucket from the overall fraction of positive votes across the entire dataset.
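A sketch of how d(Δ_i, s_i) could be estimated from the full dataset; the quadrant bucketing (with the s = 0.025 split from slide 40) is a simplifying assumption here, and the actual bucket granularity may differ:

def bucket_of(delta, s, s_split=0.025):
    # Quadrant of the Delta-s space: (is Delta >= 0, is similarity above the split)
    return (delta >= 0, s >= s_split)

def build_deviation_table(all_votes):
    # all_votes: (sign, Delta, s) triples over the whole dataset, sign in {+1, -1}
    overall_pos = sum(1 for sign, _, _ in all_votes if sign == 1) / len(all_votes)
    buckets = {}
    for sign, delta, s in all_votes:
        buckets.setdefault(bucket_of(delta, s), []).append(1 if sign == 1 else 0)
    # d(Delta, s): the bucket's positive fraction minus the overall positive fraction
    return {b: sum(v) / len(v) - overall_pos for b, v in buckets.items()}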

43 Ballot-blind prediction: First method (M1)
Adjust P_i by the average deviation d(Δ_i, s_i) observed in the (Δ_i, s_i) bucket across the entire population.
Model: P(E_i votes +1) = P_i + d(Δ_i, s_i)
Prediction: threshold (1/k) Σ_{i=1}^{k} P(E_i votes +1).
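A minimal sketch of M1, reusing bucket_of and the deviation table from the previous sketch; the 0.5 decision threshold is an assumption:

def predict_m1(voters, deviation_table, threshold=0.5):
    # voters: (P_i, Delta_i, s_i) triples for the first k voters of one election
    probs = [p + deviation_table.get(bucket_of(delta, s), 0.0) for p, delta, s in voters]
    return 1 if sum(probs) / len(probs) >= threshold else -1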

44 Ballot-blind prediction: Second method (M2)
Adjust P_i(Δ_i, s_i) based on the effects of s and Δ on positivity at both the individual and the global level.
Model: P(E_i votes +1) = α · P_i(Δ_i, s_i) + (1 − α) · d(Δ_i, s_i)
Prediction: threshold (1/k) Σ_{i=1}^{k} P(E_i votes +1).
P_i(Δ_i, s_i) is E_i's positivity within the (Δ_i, s_i) bucket.
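A corresponding sketch of M2, again reusing bucket_of and the deviation table; the mixing weight alpha and the 0.5 threshold are free parameters not specified in the slides, and d is taken literally as the deviation defined on slide 42:

def predict_m2(voters, deviation_table, alpha=0.5, threshold=0.5):
    # voters: (P_i_bucket, Delta_i, s_i) triples, where P_i_bucket is E_i's positivity
    # restricted to the (Delta_i, s_i) bucket
    probs = [alpha * p_bucket + (1 - alpha) * deviation_table.get(bucket_of(delta, s), 0.0)
             for p_bucket, delta, s in voters]
    return 1 if sum(probs) / len(probs) >= threshold else -1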

45 Ballot-blind prediction: Results on English Wikipedia
Baseline B2 without the S features does as well as B1 with them.
Models M1 and M2 outperform both B1 and B2.

46 Ballot-blind prediction: Results on German Wikipedia
Baseline B2 performs about as well as B1.
Models M1 and M2 outperform B1 and B2.
Model M2 with the S features closes about half of the remaining gap to the gold standard.

47 Ballot-blind prediction: Conclusions
How do evaluations depend on where they fall in the Δ-s space? The four regions are:
Low evaluator-target similarity with the evaluator having higher status
Low evaluator-target similarity with the target having higher status
High evaluator-target similarity with the evaluator having higher status
High evaluator-target similarity with the target having higher status
The low-similarity regions are more predictive of the final outcome; the high-similarity regions are less predictive.

48 Conclusions
Similarity is positively correlated with positive evaluations (P(+)) among on-line users.
Similarity controls the effect of Δ on user evaluations.
Similarity and status have predictive power (ballot-blind prediction).

49 Questions?

50 Thank you for your attention!

51

