Small World Social Networks With slides from Jon Kleinberg, David Liben-Nowell, and Daniel Bilar
What is a social network? A set of relationships between entities
Social Network Analysis [Social network analysis] is grounded in the observation that social actors are interdependent and that the links among them have important consequences for every individual [and for all of the individuals together].... [Relationships] provide individuals with opportunities and, at the same time, potential constraints on their behavior.... Social network analysis involves theorizing, model building and empirical research focused on uncovering the patterning of links among actors. It is concerned also with uncovering the antecedents and consequences of recurrent patterns. [Linton C. Freeman, UC-Irvine]
Representing a Social Network A set V of n nodes or vertices, usually denoted {v_1, …, v_n}, and a set E of m edges between nodes, usually denoted {e_{i,j}}. [Figure: example graph with node v_2 and edge e_{8,3} labeled.]
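As a rough illustration of the node-set and edge-set representation, here is a minimal Python sketch. The 8-node edge list is a made-up example (an assumption, not the graph from the slide's figure), and the helper name `adjacency` is hypothetical.

```python
# Hypothetical example graph: the edge list below is an assumption,
# not the graph shown in the slide's figure.
nodes = set(range(1, 9))  # V = {v_1, ..., v_8}, stored as integer labels
edges = {(1, 2), (2, 8), (8, 3), (3, 7), (2, 3), (4, 5), (5, 6), (6, 7)}  # E

def adjacency(nodes, edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = {v: set() for v in nodes}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    return adj

adj = adjacency(nodes, edges)
n, m = len(nodes), len(edges)  # n nodes, m edges
```

Storing neighbors in sets keeps the "is there an edge e_{i,j}?" question a constant-time lookup, which is convenient for the path and distance definitions that follow.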
Examples of Social Networks Nodes are high-school students. Boys are red, Girls are blue… What is the meaning of a bidirectional edge?
Meaning of Bidirected Edges “I date(d) you.”
Paths Definition: A path is a sequence of nodes (v_1, …, v_k) such that for every adjacent pair v_i and v_{i+1} there is a directed edge e_{i,i+1} between them. Example: path (v_1, v_2, v_8, v_3, v_7)
Paths “I date(d) someone who date(d) someone who date(d) you.”
Path length Definition: The length of a path is the number of edges it contains. Path (v_1, v_2, v_8, v_3, v_7) has length 4.
Distance Definition: The distance between nodes v_i and v_j is the length of the shortest path connecting them. The distance between v_1 and v_7 is 3.
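The path-length and distance definitions can be sketched with a breadth-first search. The small undirected graph below is a hypothetical stand-in for the slide's figure, chosen only so that the path (v_1, v_2, v_8, v_3, v_7) has length 4 while the distance from v_1 to v_7 is 3; the function name `bfs_distances` is an assumption.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return shortest-path distances (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:          # first visit is along a shortest path
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

# Hypothetical undirected graph (an assumption, not the slide's figure).
edges = [(1, 2), (2, 8), (8, 3), (3, 7), (2, 3)]
adj = {}
for i, j in edges:
    adj.setdefault(i, set()).add(j)
    adj.setdefault(j, set()).add(i)

path = [1, 2, 8, 3, 7]
path_length = len(path) - 1            # number of edges on the path
distance_1_7 = bfs_distances(adj, 1)[7]  # length of the shortest path v_1 -> v_7
```

In this sketch the shortcut edge between v_2 and v_3 is what makes the distance shorter than the example path.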
Famous distances nodes = {actors} edges = if two actors star in the same film Kevin Bacon number = distance between actor and Bacon
The Kevin Bacon Game Invented by Albright College students in 1994: Craig Fass, Brian Turtle, Mike Ginelly Goal: Connect any actor to Kevin Bacon, by linking actors who have acted in the same movie. Oracle of Bacon website uses Internet Movie Database (IMDB.com) to find shortest link between any two actors:
Famous distances Math PhD genealogies
Famous distances nodes = {mathematicians} edges = if 2 mathematicians co-author a paper Erdős number = distance between mathematician and Paul Erdős
Erdős Numbers Erdős wrote papers with 507 co-authors. Erdős number: the number of links required to connect a scholar to Erdős via co-authorship of papers. What type of graph do you expect? Jerry Grossman's (Oakland Univ.) website allows mathematicians to compute their Erdős numbers: Connecting path lengths, among mathematicians only: avg = 4.65, max = 13
Famous distances Erdős number of …
Famous distances Erdős number of … Fan Chung, F.T. Leighton, P.T. Metaxas, Erdős
Famous distances Erdős number of … if you publish with me!
Diameter Definition: The diameter of a graph is the maximum shortest-path distance between any two nodes. The diameter is 3.
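One straightforward way to compute the diameter, assuming a connected undirected graph, is to run a breadth-first search from every node and take the largest distance found. The graph below is a hypothetical example (not the slide's figure) whose diameter happens to be 3; `bfs_distances` and `diameter` are illustrative names.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def diameter(adj):
    """Maximum shortest-path distance over all node pairs (assumes a connected graph)."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)

# Hypothetical undirected graph (an assumption, not the slide's figure).
edges = [(1, 2), (2, 8), (8, 3), (3, 7), (2, 3)]
adj = {}
for i, j in edges:
    adj.setdefault(i, set()).add(j)
    adj.setdefault(j, set()).add(i)

d = diameter(adj)
```

This all-pairs approach costs one search per node, which is fine for small examples but slow for graphs the size of a real social network.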
Six degrees of separation The diameter of a social network is typically small.
Milgram: Six Degrees of Separation 296 people in Omaha, NE, were given a letter and asked to try to reach a stockbroker in Sharon, MA, via personal acquaintances. 20% reached the target; the average number of “hops” in the completed chains was 6.5. Why are chains so short? “Random graphs have small diameter.” Do they?
Why are Chains so Short? Maybe exponential growth of Most people know at least 100 Through their friends: Through their friends’ friends: Through their friends’ friends’ friends: 10^k
Not so fast… Your friends mostly know each other… [Figure: self-reported high-school friendships, clustered by race (left-right) and age (top-bottom).] Homophily: your friends are similar to you! Pr[two of your friends are friends] is high. Vertices in social networks have a high clustering coefficient (a measure of how closely a vertex's neighborhood resembles a clique). So exponential growth alone does not explain short chains: many of your friends' friends are people you already know. We want a model with small diameter and large clustering coefficient.
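The clustering coefficient can be made concrete with a small sketch. It assumes the common local definition (the fraction of pairs of a vertex's neighbors that are themselves connected), which the slide does not spell out; the function name and the toy graph are hypothetical.

```python
from itertools import combinations

def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbors that are connected to each other.

    Assumes the usual local definition; returns 0.0 for vertices with
    fewer than two neighbors, where the ratio is undefined.
    """
    neighbors = adj[v]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)

# Hypothetical toy graph: a triangle (1, 2, 3) plus one extra friend of node 1.
edges = [(1, 2), (1, 3), (2, 3), (1, 4)]
adj = {}
for i, j in edges:
    adj.setdefault(i, set()).add(j)
    adj.setdefault(j, set()).add(i)

c1 = clustering_coefficient(adj, 1)  # neighbors {2, 3, 4}: 1 of 3 pairs linked
c2 = clustering_coefficient(adj, 2)  # neighbors {1, 3}: the single pair is linked
c4 = clustering_coefficient(adj, 4)  # only one neighbor
```

A value of 1 means the neighborhood is a clique; a value near 0 means a vertex's friends are mostly strangers to one another.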
Watts/Strogatz: Rewire Ring Lattice Proposed a model (rewired ring lattice) with small diameter and large clustering coefficient: put people on a circle and connect each to the x closest neighbors; then, with prob. p, rewire each connection randomly. Result: Yes, short chains exist for p > 0.1! [Figure: ring lattices for p = 0.0, p = 0.1, p = 1.0.]
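The rewiring procedure can be sketched as follows. This is a simplified reading of the model, not necessarily the exact construction in the original paper: the parameter `k` plays the role of the slide's x (assumed even and smaller than `n`), each lattice edge keeps one endpoint and gets a new random other endpoint with probability `p`, and the function names are hypothetical.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Ring lattice on n nodes, each linked to its k nearest neighbors
    (k even, k < n), with each lattice edge rewired with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Step 1: ring lattice, k/2 neighbors on each side.
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    # Step 2: visit each lattice edge once and rewire it with probability p.
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            if w in adj[v] and rng.random() < p:
                # New endpoint: any node that is not v and not already a neighbor.
                candidates = [u for u in range(n) if u != v and u not in adj[v]]
                if candidates:
                    new_w = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new_w)
                    adj[new_w].add(v)
    return adj

def edge_count(adj):
    """Number of undirected edges in an adjacency map."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2

lattice = watts_strogatz(20, 4, 0.0)           # p = 0: pure ring lattice
rewired = watts_strogatz(20, 4, 0.5, seed=1)   # p = 0.5: many random shortcuts
```

Because every rewiring removes one edge and adds one, the number of edges stays fixed while a few long-range shortcuts shrink typical distances.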
Ok, short chains exist, but… Will people be able to find the short chains? Milgram showed that people were able to find them. Kleinberg [2000]: No search strategy in a Watts/Strogatz network, based only on local information, can find short chains…
Kleinberg’s Rewire Grid
Now you can find short paths!
The effect of distance Searching with local information gets more efficient as the distance exponent increases up to 2, then gets worse again! In fact, at exponent 2 it finds short paths in polylogarithmic time (an expected O(log² n) steps)! Theory and practice agree!
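A minimal sketch of searching with only local information, under several assumptions: nodes sit on an n × n grid, each node knows its grid neighbors plus one long-range contact drawn with probability proportional to d^(-r) (d is Manhattan distance, r is the distance exponent), and each step forwards the message to the known contact closest to the target. All function names are hypothetical.

```python
import random

def manhattan(a, b):
    """Grid (Manhattan) distance between two (x, y) nodes."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def build_long_range_contacts(n, r, seed=None):
    """Give every node of an n x n grid one long-range contact,
    chosen with probability proportional to distance ** (-r)."""
    rng = random.Random(seed)
    nodes = [(x, y) for x in range(n) for y in range(n)]
    contacts = {}
    for u in nodes:
        others = [v for v in nodes if v != u]
        weights = [manhattan(u, v) ** (-r) for v in others]
        contacts[u] = rng.choices(others, weights=weights, k=1)[0]
    return contacts

def known_contacts(u, n, contacts):
    """Grid neighbors of u plus its single long-range contact."""
    x, y = u
    local = [(x + dx, y + dy)
             for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
             if 0 <= x + dx < n and 0 <= y + dy < n]
    return local + [contacts[u]]

def greedy_route(source, target, n, contacts):
    """Forward to the known contact closest to the target; return the hop count."""
    current, hops = source, 0
    while current != target:
        current = min(known_contacts(current, n, contacts),
                      key=lambda v: manhattan(v, target))
        hops += 1
    return hops

n = 10
contacts = build_long_range_contacts(n, r=2, seed=0)
hops = greedy_route((0, 0), (n - 1, n - 1), n, contacts)
```

Some grid neighbor is always one step closer to the target, so the greedy route always terminates and never takes more hops than the grid distance; the long-range contacts are what can make it much shorter. Rerunning with different values of `r` on larger grids is one way to explore the efficiency claim empirically.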
Translated into English? “Distance scales” Count friends within log distances: … When the distance exponent equals 2, nodes have the same volume of links to each distance scale
The Power of Long Distance Relations Probability of friendship falls off like the inverse square of the distance! Geographic location is a primary reason for selecting the next person in the chain. We have finally understood Milgram's experiment. But does this explain what happens on the Internet? (It depends on how you define distance: see Liben-Nowell's paper.)