COLOR TEST COLOR TEST
Social Networks: Structure and Impact N ICOLE I MMORLICA, N ORTHWESTERN U.
Graph Representation Nodes:= Edges:=
New Testament
Visualization from ManyEyes New Testament
Scientific Collaboration
World Wide Web
Seattle Honolulu
gridstarcycle“tree”
1 7 Degrees Definition: The degree of a node v i is the number of nodes v j such that there’s an edge e i,j between them. Degree v 8 = 4.
Degree Distributions /4 1/2 3/ Frequency # of friend (degree) A cycle? Cycle deg. dist. A star? Star deg. dist.
Power-law Degree Dist /3 1/ Frequency Degree Power-law: P(∂) = c ∂ -α
Log-Log Plots Log (Frequency) Log (Degree) Power-law: P(∂) = c ∂ -α log (P(∂))= log ( c ∂ -α ) = log (c) – α∙log (∂) Straight line on a log-log plot!
Example: Web Graph In-Degree Power law exponent: α = 2.09
Example: New York Facebook Lognormal is better fit.
Paths Definition: A path is a sequence of nodes (v 1, …, v k ) such that for any adjacent pair v i and v i+1, there’s an edge e i,i+1 between them. Path (v 1,v 2,v 8,v 3,v 7 )
Paths “I know someone who knows someone who knows you.”
Path length Definition: The length of a path is the number of edges it contains. Path (v 1,v 2,v 8,v 3,v 7 ) has length
Distance Definition: The distance between nodes v i and v j is the length of the shortest path connecting them. The distance between v 1 and v 7 is
Famous distances nodes = {mathematicians} edges = if 2 mathematicians co-author a paper Erdos number = distance between mathematican and Erdos Paul Erdos number
Famous distances Erdos number of …
Diameter Definition: The diameter of a graph is the maximum shortest-path distance between any two nodes. The diameter is “longest shortest path”
The trace of a disease 1.Initially just one node is infected 2.All nodes with an infected friend get infected Day 0Day 1Day 2
The trace of a disease # days ≤ diameter ≤ twice # days Day 0Day 1Day 2 Because the trace defines the distance from the initially infected person to the last infected person. Because there’s a path in the trace between any two people going through the initially infected person.
Six degrees of separation The diameter of a social network is typically small.
Small world phenomenon Milgram’s experiment (1960s). Ask someone to pass a letter to another person via friends knowing only the name, address, and occupation of the target. Short paths exist (and people can find them!).
Diameter “longest shortest path” gridstartree √n2log n (for const. deg.)
Diameter For the population of the US, gridstartree 2,00026
Clustering Coefficient “fraction of triangles bt. all connected triples” ZERO ….> ZERO gridstarcycle“tree”
Why do we see these realities? 1.High clustering coefficient … triadic closure – tend to know your friend’s friends 2.Power-law degree distribution … popular people attract proportionally more friends 3.Low diameter … there is an element of chance to whom we meet
Preferential Attachment 1.People join network in order 1, 2, …, N 2.When join, person t chooses friend by a)With probability p, pick person t’ uniformly at random from 1, …, t-1 b)With probability (1-p), pick person t’ uniformly at random and link to person that t’ links too Imitation
The rich get richer 2 b) With prob. (1-p), pick person t’ uniformly at random and link to person that t’ links too 1/4 3/4
The rich get richer 2 b) With prob. (1-p), pick person t’ uniformly at random and link to person that t’ links too Equivalently, 2 b)With probability (1-p), pick a person proportional to in-degree and link to him
Conclusion Social networks have predictable structure – Power-law degree distribution (preferential attachment model) – Low (logarithmic) diameter – High clustering coefficients Social networks impact many social processes – Spread disease/technology – Generate revenue – Sustain cooperation