The Structure of Information Networks


The Structure of Information Networks
Community Structure in Networks and Structural Balance, with emphasis on information and social networks
Ymir Vigfusson

Florentine Families: Power 12/2/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Network Communities
Networks consist of tightly connected groups. Network communities are sets of nodes with many connections inside and few to the outside (the rest of the network). Such sets are also called communities, clusters, groups, or modules.

Finding Network Communities
How can we automatically find such densely connected groups of nodes? Ideally, the automatically detected clusters would then correspond to real groups.

Micro-Markets in Sponsored Search
Find micro-markets by partitioning the “query x advertiser” graph.
(Figure: query-advertiser graph.)

Social Network Data
Zachary’s Karate club network: Zachary observed social ties and rivalries in a university karate club. During his observation, conflicts led the group to split. The split could be explained by a minimum cut in the network.
Why would we expect such clusters to arise?

Group Formation in Networks [Backstrom et al. KDD ‘06]
In a social network, nodes explicitly declare group membership, for example Facebook groups or publication venues. Groups can be thought of as node colors.
This gives insight into social dynamics: if a group recruits friends, memberships spread along edges; if it does not recruit, memberships spread randomly.
What factors influence a person’s decision to join a group?

Group Growth as Diffusion
Group growth is analogous to diffusion: group memberships spread over the network. (Figure: red circles represent existing group members; yellow squares may join.)
Question: How does the probability of joining a group depend on the number of friends already in the group?

P(join) vs. # friends in the group [Backstrom et al. KDD ‘06]
LiveJournal: 1 million users, 250,000 groups.
DBLP: 400,000 papers, 100,000 authors, 2,000 conferences.
The data sets are very different, yet both show the same qualitative behavior, diminishing returns: the probability of joining increases with the number of friends in the group, but the increases get smaller and smaller.

Groups: More Subtle Features
Connectedness of friends: x and y each have three friends in the group. x’s friends are independent; y’s friends are all connected. Who is more likely to join?

Connectedness of Friends [Backstrom et al. KDD ‘06]
Competing sociological theories:
Information argument [Granovetter ‘73]: unconnected friends give independent support.
Social capital argument [Coleman ’88]: there is a safety/trust advantage in having friends who know each other.

Connectedness of Friends
LiveJournal: 1 million users, 250,000 groups.
The social capital argument wins: the probability of joining increases with the number of adjacent members.
Examples: going to a party or protest, believing a rumor.

So, This Means That [Backstrom et al. KDD ‘06]
A person is more likely to join a group if:
she has more friends who are already in the group, and
those friends have more connections between themselves.
Thus groups form clusters of tightly connected nodes.

Community Detection
How do we find communities? We will work with undirected (unweighted) networks.

Method 1: Strength of Weak Ties
Intuition (figures): edge strengths (call volume) in a real network; edge betweenness in a real network.

Method 1: Girvan-Newman
Divisive hierarchical clustering based on the notion of edge betweenness: the number of shortest paths passing through the edge.
Girvan-Newman algorithm (undirected, unweighted networks). Repeat until no edges are left:
1. Calculate the betweenness of the edges.
2. Remove the edges with the highest betweenness.
The connected components are the communities. This gives a hierarchical decomposition of the network.
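
The loop described on this slide can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the lecture's own code: the graph representation (a dict mapping each node to a set of neighbors) and the function names `edge_betweenness`, `components` and `girvan_newman` are assumptions made for the example, and ties for the highest betweenness are all removed in the same step.

```python
from collections import defaultdict, deque

def edge_betweenness(adj):
    """Number of shortest paths through each edge (undirected, unweighted)."""
    bet = defaultdict(float)
    for s in adj:
        # BFS from s: distances, shortest-path counts, and visit order.
        dist, sigma, order = {s: 0}, defaultdict(float), []
        sigma[s] = 1.0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
        # Work back up from the deepest nodes, splitting flow by path counts.
        flow = defaultdict(float)
        for w in reversed(order):
            for v in adj[w]:
                if dist.get(v) == dist[w] - 1:
                    share = (1.0 + flow[w]) * sigma[v] / sigma[w]
                    bet[frozenset((v, w))] += share
                    flow[v] += share
    # Every unordered pair of endpoints was counted from both sides.
    return {e: b / 2.0 for e, b in bet.items()}

def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def girvan_newman(adj):
    """Return the successive partitions produced while removing edges."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    levels = [components(adj)]
    while any(adj.values()):
        bet = edge_betweenness(adj)  # re-computed at every step
        top = max(bet.values())
        for edge, b in bet.items():
            if abs(b - top) < 1e-9:
                u, v = tuple(edge)
                adj[u].discard(v)
                adj[v].discard(u)
        comps = components(adj)
        if len(comps) > len(levels[-1]):
            levels.append(comps)
    return levels

# Two triangles joined by a single bridge edge (2, 3).
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
           3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
levels = girvan_newman(example)
```

On the example graph the bridge carries the most shortest paths, so it is removed first and the two triangles appear as the first split.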

Girvan-Newman: Example
(Figure: example network labeled with the values 12, 1, 33, 49.)
Betweenness needs to be re-computed at every step.

Girvan-Newman: Example
(Figures: Step 1, Step 2, Step 3, and the resulting hierarchical network decomposition.)

Girvan-Newman: Results
Communities in physics collaborations.

Girvan-Newman: Results
Zachary’s Karate club: hierarchical decomposition.

We need to resolve 2 questions:
1. How to compute betweenness?
2. How to select the number of clusters?

How to Compute Betweenness?
We want to compute the betweenness of paths starting at node A. Run a breadth-first search starting from A.
(Figure: BFS from A, labeled 1 2 3 4.)

How to Compute Betweenness?
Count the number of shortest paths from A to all other nodes of the network.

How to Compute Betweenness?
Compute betweenness by working up the tree. If there are multiple paths, count them fractionally.
The algorithm:
Add edge flows: node flow = 1 + ∑ (flows on child edges); split the node flow among the edges to its parents in proportion to the parents' shortest-path counts.
Repeat the BFS procedure for each starting node.
(Figure annotations: 1+1 paths to H, split evenly; 1+0.5 paths to J, split 1:2; 1 path to K, split evenly.)
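
To make the fractional splitting concrete, here is a small sketch of the single-source step on a four-node "diamond" graph, where node D can be reached from A by two shortest paths. The function name `edge_flows_from` and the example graph are assumptions for illustration; the slide's own example (nodes H, J, K) is in a figure that is not part of this transcript.

```python
from collections import deque

def edge_flows_from(adj, source):
    """One BFS from `source`: shortest-path counts per node and the
    flow credited to each (parent, child) edge."""
    dist = {source: 0}
    paths = {source: 1}
    order = []
    queue = deque([source])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                paths[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:
                paths[w] += paths[v]  # every path to v extends to w

    # Node flow starts at 1 and collects the flow on child edges.
    node_flow = {v: 1.0 for v in order}
    flow = {}
    for w in reversed(order):  # deepest nodes first
        for v in adj[w]:
            if dist[v] == dist[w] - 1:  # v is a parent of w
                share = node_flow[w] * paths[v] / paths[w]
                flow[(v, w)] = share
                node_flow[v] += share
    return paths, flow

diamond = {"A": ["B", "C"], "B": ["A", "D"],
           "C": ["A", "D"], "D": ["B", "C"]}
paths, flow = edge_flows_from(diamond, "A")
```

Here D has two shortest paths, so its unit of flow is split evenly between edges (B, D) and (C, D); B then passes 1 + 0.5 up to A. Summing these flows over all starting nodes and halving gives the edge betweenness.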

How to select number of clusters?
Define modularity to be Q = (number of edges within groups) – (expected number within groups).
The actual number of edges between i and j is $A_{ij}$, where $A_{ij}$ is the adjacency matrix.
The expected number of edges between i and j is $\frac{k_i k_j}{2m}$, where $k_i$ and $k_j$ are the degrees of the vertices and m is the total number of edges in the network.
Q can be positive or negative; positive values indicate the presence of community structure.

Modularity: Definition
Q = (number of edges within groups) – (expected number within groups). Then:
$Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$
where m is the number of edges, $A_{ij}$ is 1 if (i,j) is an edge and 0 otherwise, $k_i$ is the degree of node i, $c_i$ is the group id of node i, and $\delta(a, b)$ is 1 if a = b and 0 otherwise.
Modularity lies in the range [−1,1]. It is positive if the number of edges within groups exceeds the expected number. 0.3<Q<0.7 means significant community structure.
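
As a rough illustration of this definition, the sketch below evaluates modularity for a given assignment of nodes to groups. It assumes the usual normalized form, Q = (1/2m) Σ_ij [A_ij − k_i k_j / 2m] δ(c_i, c_j), an undirected graph given as a list of edges without duplicates or self-loops, and a hypothetical function name `modularity`.

```python
def modularity(edges, community):
    """Modularity Q of a partition.

    edges     -- list of undirected edges (u, v), each listed once
    community -- dict mapping node -> group id
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Sum of A_ij * delta(c_i, c_j) over ordered pairs:
    # every within-group edge is counted twice.
    internal = 2 * sum(1 for u, v in edges if community[u] == community[v])

    # Sum of k_i * k_j * delta(c_i, c_j) over ordered pairs equals the
    # sum over groups of (total degree of the group) squared.
    group_degree = {}
    for node, k in degree.items():
        c = community[node]
        group_degree[c] = group_degree.get(c, 0) + k
    expected = sum(d * d for d in group_degree.values()) / (2 * m)

    return (internal - expected) / (2 * m)

# Two triangles joined by a bridge, split into the two triangles.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {n: "a" for n in range(6)}
q_split = modularity(edges, two_groups)
q_whole = modularity(edges, one_group)
```

For the split into the two triangles the value is 5/14 (about 0.36), while putting every node into one group gives 0, which matches the idea of picking the number of clusters where Q peaks.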

Modularity: Number of clusters
Modularity is useful for selecting the number of clusters. (Figure: plot of modularity Q.)
Homework question: come up with a graph where Girvan-Newman modularity is not unimodal.
Why not optimize modularity directly?

Signed networks

Today: Signed Networks
Networks with positive and negative relationships. Our basic unit of investigation will be signed triangles. First we will talk about undirected networks, then directed ones.
Plan for today:
Model: consider two social theories of signed networks.
Data: reason about them in large online networks.
Application: predict whether A and B are linked with + or -.

Signed Networks
Networks with positive and negative relationships. Consider an undirected complete graph and label each edge as either:
Positive: friendship, trust, positive sentiment, …
Negative: enemy, distrust, negative sentiment, …
Examine triples of connected nodes A, B, C.

Theory of Structural Balance
Start with the intuition [Heider ’46]:
The friend of my friend is my friend.
The enemy of my enemy is my friend.
The enemy of my friend is my enemy.
Look at connected triples of nodes:
Balanced: consistent with the “friend of a friend” or “enemy of the enemy” intuition.
Unbalanced: inconsistent with the “friend of a friend” or “enemy of the enemy” intuition.

Balanced/Unbalanced Networks
A graph is balanced if every connected triple of nodes has either all 3 edges labeled +, or exactly 1 edge labeled +.
(Figure: an unbalanced and a balanced example.)
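
The triangle rule can be checked mechanically: with signs encoded as +1 and -1, a triangle has three plus edges or exactly one plus edge precisely when the product of its three signs is positive. The sketch below uses that observation; the encoding and the names `triangle_is_balanced` and `complete_graph_is_balanced` are assumptions made for the example.

```python
from itertools import combinations

def triangle_is_balanced(s_ab, s_bc, s_ac):
    """Signs are +1 or -1. The product is positive exactly when the number
    of negative edges is 0 or 2, i.e. three + edges or exactly one + edge."""
    return s_ab * s_bc * s_ac > 0

def complete_graph_is_balanced(nodes, sign):
    """`sign` maps frozenset({u, v}) -> +1 or -1 for every pair of nodes."""
    return all(
        triangle_is_balanced(sign[frozenset((a, b))],
                             sign[frozenset((b, c))],
                             sign[frozenset((a, c))])
        for a, b, c in combinations(nodes, 3)
    )

# Two friends (A, B) who share a common enemy C: balanced.
signs = {frozenset(("A", "B")): +1,
         frozenset(("B", "C")): -1,
         frozenset(("A", "C")): -1}
balanced = complete_graph_is_balanced(["A", "B", "C"], signs)
```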

Local Balance → Global Factions
Balance implies global coalitions [Cartwright-Harary].
Claim: If all triangles are balanced, then either the network contains only positive edges, or the nodes can be split into 2 sets, L and R, where negative edges only point between the sets.

Analysis of Balance
Pick any node A. Let L consist of A and the friends of A, and let R consist of the enemies of A. Because every triangle is balanced:
Any 2 nodes in L are friends.
Any 2 nodes in R are friends.
Every node in L is an enemy of every node in R.
(Figure: nodes A, B, C, D, E, with L = friends of A and R = enemies of A.)

Example: International Relations
Positive edge: alliance. Negative edge: animosity.
Separation of Bangladesh from Pakistan in 1971: the US supports Pakistan. Why?
The USSR was an enemy of China.
China was an enemy of India.
India was an enemy of Pakistan.
The US was friendly with China.
China vetoed Bangladesh's admission to the U.N.
(Figure: signed graph on the nodes B, P, I, C, U, R.)


Balance in General Networks
Def 1 (local view): fill in the missing edges to achieve balance.
Def 2 (global view): divide the graph into two coalitions.
The 2 definitions are equivalent.
(Figure: a signed graph with missing edges. Balanced?)

Is a Signed Network Balanced?
A graph is balanced if and only if it contains no cycle with an odd number of negative edges. How to compute this?
1. Find connected components on + edges.
2. For each component create a super-node.
3. Connect components A and B if there is a negative edge between their members.
4. Assign super-nodes to sides using BFS.
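
The four steps above can be sketched as follows. This is a minimal sketch under stated assumptions: edges are given as (u, v, sign) triples with sign +1 or -1, positive components are found with a small union-find instead of a separate BFS, and the function name `is_balanced` is made up for the example. A negative edge inside a positive component is treated as an immediate violation, since it closes a cycle with exactly one negative edge.

```python
from collections import defaultdict, deque

def is_balanced(nodes, signed_edges):
    """signed_edges: iterable of (u, v, sign) with sign in {+1, -1}."""
    signed_edges = list(signed_edges)

    # Steps 1-2: components on + edges become super-nodes (union-find roots).
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v, s in signed_edges:
        if s > 0:
            parent[find(u)] = find(v)

    # Step 3: connect super-nodes that have a negative edge between them.
    reduced = defaultdict(set)
    for u, v, s in signed_edges:
        if s < 0:
            a, b = find(u), find(v)
            if a == b:
                return False  # negative edge inside a positive component
            reduced[a].add(b)
            reduced[b].add(a)

    # Step 4: BFS two-coloring of the reduced graph.
    side = {}
    for start in list(reduced):
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            a = queue.popleft()
            for b in reduced[a]:
                if b not in side:
                    side[b] = 1 - side[a]
                    queue.append(b)
                elif side[b] == side[a]:
                    return False  # two connected super-nodes on the same side
    return True

nodes = ["A", "B", "C", "D"]
two_camps = [("A", "B", +1), ("C", "D", +1), ("A", "C", -1), ("B", "D", -1)]
three_enemies = [("A", "B", -1), ("B", "C", -1), ("A", "C", -1)]
```

In `two_camps` the positive components {A, B} and {C, D} end up on opposite sides, so the network is balanced; in `three_enemies` the reduced graph is an odd cycle, so it is not.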

Signed Graph: Is it Balanced?

Positive Connected Components

Reduced Graph on Super-Nodes

BFS on Reduced Graph
Using BFS, assign each super-node a side (L or R). The graph is unbalanced if any two connected super-nodes are assigned the same side.
(Figure: super-nodes labeled L R R L L L R. Unbalanced!)

Real Large Signed Networks [Leskovec et al. CHI ‘10]
Each link A→B is explicitly tagged with a sign:
Epinions (trust/distrust): Does A trust B’s product reviews? (Only positive links are visible.)
Wikipedia (support/oppose): Does A support B to become a Wikipedia administrator?
Slashdot (friend/foe): Does A like B’s comments?
Other examples: online multiplayer games.

Balance in Real Network Data [Leskovec et al. CHI ‘10]
Does structural balance hold?
P(T): probability of a triad T in the real data. P0(T): triad probability if the signs were random (shuffled data).
(Table: one row per triad type, with columns Epinions P(T), Epinions P0(T), Wikipedia P(T), Wikipedia P0(T), and whether the triad is consistent with balance. The triad pictures and the balance marks are not part of this transcript. Values in their original order: 0.87 0.62 0.70 0.49; 0.07 0.05 0.21 0.10; 0.32 0.08; 0.007 0.003 0.011 0.010.)

Global Structure of Signed Nets
The intuitive picture of a social network is in terms of densely linked clusters. How does structure interact with links?
Embeddedness of link (A,B): the number of neighbors that A and B share.
The results suggest a “coalitions” structure of networks.
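
Embeddedness is simple to compute once the neighbor sets are available. The sketch below, with hypothetical names `embeddedness` and `mean_embeddedness_by_sign`, counts shared neighbors for one link and averages the count separately over positive and negative links, which is one way to compare how embedded the two kinds of ties are. It assumes an undirected neighbor-set representation and signs encoded as +1 and -1.

```python
def embeddedness(adj, a, b):
    """Number of neighbors shared by a and b (a and b themselves excluded)."""
    return len((adj[a] & adj[b]) - {a, b})

def mean_embeddedness_by_sign(adj, signed_edges):
    """Average embeddedness of positive and of negative links."""
    totals = {+1: [0, 0], -1: [0, 0]}  # sign -> [sum, count]
    for a, b, s in signed_edges:
        totals[s][0] += embeddedness(adj, a, b)
        totals[s][1] += 1
    return {s: (total / n if n else 0.0) for s, (total, n) in totals.items()}

adj = {"a": {"b", "c", "d"}, "b": {"a", "c", "d"},
       "c": {"a", "b"}, "d": {"a", "b", "e"}, "e": {"d"}}
means = mean_embeddedness_by_sign(adj, [("a", "b", +1), ("d", "e", -1)])
```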

Global Factions: Embeddedness [CHI ‘10]
Embeddedness of ties (figures: Epinions, Wikipedia):
Positive ties tend to be more embedded.
Positive ties tend to be more clumped together.
Public display of signs (votes) in Wikipedia further attenuates this.

Global Structure of Signed Nets [CHI ‘10]
Clustering: the + network shows more clustering than the baseline; the – network shows less clustering than the baseline.
Size of connected component: both the + network and the – network have smaller components than the baseline.
This suggests a “coalitions” structure of networks.

Evolving Directed Networks [CHI ‘10]
New setting: links are directed and created over time.
How many triads are now explained by balance? Only half (8 out of 16).
Is there a better explanation? Yes: status.
(Figure: 16 *2 signed directed triads on the nodes A, B, X.)

Alternate Theory: Status [CHI ‘10]
Links are directed and created over time.
Status theory [Davis-Leinhardt ‘68, Guha et al. ’04, Leskovec et al. ‘10]:
A positive link A→B means: B has higher status than A.
A negative link A→B means: B has lower status than A.
Status and balance give different predictions. (Figure: two triads on A, B, X; in both, balance predicts + and status predicts –.)

Theory of Status [CHI ‘10]
Edges are directed and created over time. X has links to A and B. Now A links to B (triad A-B-X). How does the sign of A-B depend on the signs of the links from X?
We need to formalize two things:
Links are embedded in triads, which provide context for signs.
Users are heterogeneous in their linking behavior.

16 Types of Context [CHI ‘10]
Link (A,B) appears in the context (A,B; X). There are 16 different contextualized links.

Generative (Receptive) Surprise [CHI ‘10]
Surprise: how much the behavior of a user deviates from the baseline in context t.
(A1, B1; X1), …, (An, Bn; Xn) are the instances of contextualized link t; k of them closed with a plus.
pg(Ai) is the generative baseline of Ai: the empirical probability of Ai giving a plus.
Then the generative surprise of triad type t is the standardized random variable
$s_g(t) = \frac{k - \sum_{i=1}^{n} p_g(A_i)}{\sqrt{\sum_{i=1}^{n} p_g(A_i)\,(1 - p_g(A_i))}}$
In practice: for every node compute the baseline, identify all the edges that close the same type of triad, and compute the surprise.
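
As a hedged illustration of the surprise computation, the sketch below treats each instance of the triad type as an independent coin flip with the link creator's baseline probability of giving a plus, and reports how many standard deviations the observed number of pluses k lies from its expectation. That standardized form, (k − Σ p) / sqrt(Σ p (1 − p)), is an assumption here based on the slide's "Std. rnd. var." note; the function name `generative_surprise` is made up for the example.

```python
import math

def generative_surprise(k, baselines):
    """k         -- number of instances that closed with a plus
    baselines -- generative baseline p_g(A_i) for each instance

    Assumes at least one baseline lies strictly between 0 and 1,
    otherwise the variance is zero and the ratio is undefined.
    """
    expected = sum(baselines)
    variance = sum(p * (1.0 - p) for p in baselines)
    return (k - expected) / math.sqrt(variance)

# Four instances whose creators each give a plus half of the time;
# all four links turned out positive.
s = generative_surprise(4, [0.5, 0.5, 0.5, 0.5])
```

A positive value means more pluses than the baselines alone would suggest, a negative value means fewer.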

Status: Two Examples
Two basic examples (figure: two triads on A, B, X with signed edges):
Example 1: Gen. surprise of A: —; Rec. surprise of B: —
Example 2: Gen. surprise of A: —; Rec. surprise of B: —


Consistency with Status [CHI ‘10]
Determine node status: assign X status 0, then set the status of A and B based on the signs and directions of the edges.
Surprise is status-consistent if:
the generative surprise has the same sign as the status of B;
the receptive surprise has the opposite sign from the status of A.
Surprise is balance-consistent if it completes a balanced triad.
Example (figure): X has + links to A and B, so A and B both have status +1. Status-consistent if Gen. surprise > 0 and Rec. surprise < 0.

Status vs. Balance (Epinions) [CHI ‘10]
(Table: predictions for triad types t3, t15, t2, t14, t16, with columns Sg(ti), Sr(ti), Bg, Br, Sg, Sr.)

From Local to Global Structure [WWW ‘10]
Both theories make predictions about the global structure of the network.
Structural balance predicts factions: find coalitions.
Status theory predicts a global status ordering: flip the direction and sign of minus edges, then assign each node a unique status so that edges point from low to high.
(Figure: three nodes with statuses 3, 2, 1.)
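
One way to read the status recipe is as a topological ordering problem: after every minus edge is flipped, all edges should point from lower to higher status, which is possible exactly when the flipped graph has no directed cycle. The sketch below follows that reading using Kahn's algorithm; it is an assumption-laden illustration (the name `global_status` and the (source, target, sign) edge format are invented here), not the procedure used in the cited paper.

```python
from collections import defaultdict, deque

def global_status(nodes, signed_edges):
    """Return {node: status} with every (flipped) edge pointing from low
    to high status, or None if no such assignment exists."""
    succ = defaultdict(set)
    indegree = {v: 0 for v in nodes}
    for u, v, s in signed_edges:
        # A minus edge u->v is flipped into a plus edge v->u.
        low, high = (u, v) if s > 0 else (v, u)
        if high not in succ[low]:
            succ[low].add(high)
            indegree[high] += 1

    # Kahn's algorithm: repeatedly take a node with no lower-status node left.
    queue = deque(v for v in nodes if indegree[v] == 0)
    status = {}
    while queue:
        v = queue.popleft()
        status[v] = len(status) + 1  # unique status 1, 2, 3, ...
        for w in succ[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    return status if len(status) == len(nodes) else None

nodes = ["A", "B", "C"]
consistent = global_status(nodes, [("A", "B", +1), ("B", "C", +1), ("C", "A", -1)])
cyclic = global_status(nodes, [("A", "B", +1), ("B", "C", +1), ("C", "A", +1)])
```

In the first example C's minus link to A agrees with A < B < C, so a status ordering exists; in the second, three plus links form a cycle and no ordering can satisfy all of them.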

From Local to Global Structure [WWW ‘10]
What fraction of the edges of the network satisfy balance and status?
Observations:
No evidence for global balance beyond the random baselines: real data is 80% consistent vs. 80% consistency under the random baseline.
Evidence for global status beyond the random baselines: real data is 80% consistent, but only 50% consistent under the random baseline.

Predicting Edge Signs [WWW ‘10]
Edge sign prediction problem: given a network and signs on all but one edge, predict the missing sign.
Machine learning formulation: predict the sign of edge (u,v).
Class label: +1 for a positive edge, -1 for a negative edge.
Learning method: logistic regression.
Dataset: original (80% + edges) and balanced (50% + edges).
Evaluation: accuracy and ROC curves.
Features for learning: see the next slide.

Features for Learning [WWW ‘10]
For each edge (u,v) create features:
Triad counts (16): counts of the signed triads that edge uv takes part in.
Node degree (7 features):
Signed degree: d+out(u), d-out(u), d+in(v), d-in(v).
Total degree: dout(u), din(v).
Embeddedness of edge (u,v).
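
The seven degree features are easy to extract from a list of directed signed edges. The sketch below is illustrative only: the function name `degree_features`, the (source, target, sign) edge format, and the choice to leave the edge (u, v) itself out of the counts (so the label is not leaked into the features) are all assumptions. The 16 triad counts are omitted.

```python
from collections import defaultdict

def degree_features(edges, u, v):
    """Seven degree-based features for the directed edge (u, v).

    edges -- list of (source, target, sign) with sign in {+1, -1}
    """
    # Assumption: the edge being predicted is excluded from all counts.
    others = [(a, b, s) for a, b, s in edges if (a, b) != (u, v)]

    neighbors = defaultdict(set)  # undirected neighborhoods
    for a, b, _ in others:
        neighbors[a].add(b)
        neighbors[b].add(a)

    pos_out_u = sum(1 for a, _, s in others if a == u and s > 0)
    neg_out_u = sum(1 for a, _, s in others if a == u and s < 0)
    pos_in_v = sum(1 for _, b, s in others if b == v and s > 0)
    neg_in_v = sum(1 for _, b, s in others if b == v and s < 0)

    return {
        "d+out(u)": pos_out_u,
        "d-out(u)": neg_out_u,
        "d+in(v)": pos_in_v,
        "d-in(v)": neg_in_v,
        "dout(u)": pos_out_u + neg_out_u,
        "din(v)": pos_in_v + neg_in_v,
        "embeddedness": len((neighbors[u] & neighbors[v]) - {u, v}),
    }

edges = [("u", "x", +1), ("u", "y", -1), ("x", "v", +1),
         ("y", "v", -1), ("u", "v", +1)]
features = degree_features(edges, "u", "v")
```

A feature vector like this, extended with the triad counts, is what a logistic regression model would be trained on.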

Edge Sign Prediction [WWW ‘10]
Classification accuracy: Epinions 93.5%, Slashdot 94.4%, Wikipedia 81%.
Signs can be modeled from local network structure alone. The trust propagation model of [Guha et al. ‘04] has 14% error on Epinions.
Triad features perform less well for less embedded edges.
Wikipedia is harder to model: votes are publicly visible.

Balance and Status: Complete Model
(Figure: table of signed triads.)

Generalization
Do people use these very different linking systems by obeying the same principles? How generalizable are the results across the datasets?
(Table: train on the row dataset, predict on the column dataset.)
Nearly perfect generalization of the models, even though the networks come from very different applications.

Concluding Remarks
Signed networks provide insight into how social computing systems are used:
Status vs. balance.
Different role of reciprocated links.
Role of embeddedness and public display.
The sign of a relationship can be reliably predicted from the local network context, with ~90% accuracy for the sign of the edge.

Concluding Remarks
More evidence that networks are globally organized based on status.
People use signed edges consistently regardless of the particular application: near perfect generalization of models across datasets.
Many further directions, for example the status difference of nodes A and B [ICWSM ‘10]. (Figure: cases A<B, A=B, A>B against the status difference A-B.)