1
CSE 522 – Algorithmic and Economic Aspects of the Internet. Instructors: Nicole Immorlica, Mohammad Mahdian
2
Topics covered in the course
Structure and modeling of social networks: Power law graphs; Small world phenomenon; High clustering coefficient; Probabilistic and game theoretic models
Algorithms for link analysis: Crawling the web; HITS; PageRank; Webspam; Rank aggregation; Spectral clustering
Economic aspects of the Internet: Peering relations; Alternative mechanisms for routing; P2P networks
Topics motivated by e-commerce: Reputation mechanisms; Recommendation systems; Ad auctions
3
Logistics
Course web page: http://www.cs.washington.edu/education/courses/cse522/05au/
Course work: reading papers (one per week on average), possibly a few problem sets
How to contact us: {nickle,mahdian}@microsoft.com
4
Social Networks: A social network is a graph that represents relationships between independent entities. Examples: graph of friendships (or, in the virtual world, networks like orkut); web of sexual contact; graph of scientific collaborations; cross-posts in newsgroups; web graph (links between webpages); Internet inter-domain and intra-domain graphs
5
Scientific Collaboration Network: 400,000 nodes, the authors in the Mathematical Reviews database. An edge joins two authors if they have a joint paper. Just 676,000 edges. (Picture from orgnet.com)
6
Scientific Collaboration Network: Average degree 3.36. A few high-degree nodes: Paul Erdős, 509; Frank Harary, 268; Yuri Alekseevich Mitropolskii, 244. Many low-degree nodes (100,000 of degree 1). (Picture from orgnet.com)
7
Scientific Collaboration Network: Short paths. The maximum Erdős number is 13. Any two authors are connected by a path of length at most 23. The average distance between two authors is 7.64, e.g.: John Nash → Shapley → Fulkerson → Hoffman → Paul Erdős. Many triangles … (Picture from orgnet.com)
8
9/11 Terrorist Network Picture from orgnet.com
9
Newsgroup Cross-Post Graph: Nodes are newsgroups, essentially archived email lists. Edges are cross-posts, i.e. there is an edge between two newsgroups to which an identical email is posted. Example: alt.microsoft.sucks and alt.linux.sucks
10
Internet Graphs: Inter-domain graphs. Nodes are autonomous systems or domains. Edges are inter-domain connections. Example: SPRINT and AOL
11
Inter-domain graph Picture from caida.org
12
Internet Graphs: Intra-domain graphs. Nodes are routers. Edges are links between routers. Example: 199.45.130.13 and 199.45.143.14
13
Intra-domain graph
14
Colored by AS number Picture from lumeta.com
15
World Wide Web: Nodes are webpages. Arcs (i.e., directed edges) are hyperlinks. Example: http://research.microsoft.com/~mahdian and http://theory.csail.mit.edu
16
Web graph, Chicago Tribune Page Picture generated by Nicheworks
17
Social Networks
18
Why Study These Networks: Understand the creation of these networks; understand viral epidemics; help design crawling strategies for the web; analyze the behavior of algorithms (web/internet); predict the evolution of the network and the emergence of new phenomena
19
In this lecture: Common properties of social networks (power law degree distribution; small world phenomenon; high clustering coefficient); structure of the web graph
20
Power Laws: Two quantities x and y are related by a power law if y is proportional to $x^{-c}$ for a constant c: $y = \alpha x^{-c}$. If x and y are related by a power law, then the graph of log(y) versus log(x) is a straight line: $\log y = -c \log x + \log \alpha$. The slope of the log-log plot is $-c$, the negative of the power law exponent c
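The straight-line relation on the log-log plot suggests a simple way to estimate c: fit a least-squares line to the points (log x, log y) and negate its slope. The sketch below is only an illustration under that assumption; the function name `fit_power_law`, the name `alpha` for the constant of proportionality, and the synthetic data are hypothetical and not part of the lecture.

```python
import math

def fit_power_law(xs, ys):
    """Fit y ~ alpha * x**(-c) by least squares on (log x, log y).

    Returns (c, alpha). Assumes all xs and ys are positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx                      # slope of the log-log line, equals -c
    intercept = mean_y - slope * mean_x    # equals log(alpha)
    return -slope, math.exp(intercept)

# Synthetic data generated from y = 3 * x**(-2), purely for illustration.
xs = [1, 2, 4, 8, 16]
ys = [3.0 * x ** -2.0 for x in xs]
c, alpha = fit_power_law(xs, ys)
print(f"c = {c:.3f}, alpha = {alpha:.3f}")
```

On noisy real data a plain least-squares fit on the log-log plot is only a rough estimate, so the recovered exponent should be read as indicative.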
21
Power Law Distributions: A random variable X has a power law distribution if Pr[X=k] is proportional to $k^{-c}$ for a constant c. The cumulative distribution, Pr[X>k], of a power law distribution is proportional to $k^{-c+1}$, and is called the Pareto law. Similar to a power law, the Zipf law relates the rank r of X to its size: the r’th largest instance of X is proportional to $r^{-c'}$
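A short derivation may help explain why the exponent drops by one in the cumulative form and how the Zipf exponent relates to c. This is a sketch under the assumptions that c > 1 and that the sum can be approximated by an integral; N (the number of samples) and $x_r$ (the r-th largest value) are names introduced here for illustration.

```latex
\Pr[X > k] \;=\; \sum_{j > k} \Pr[X = j] \;\propto\; \sum_{j > k} j^{-c}
\;\approx\; \int_{k}^{\infty} t^{-c}\, dt \;=\; \frac{k^{-(c-1)}}{c-1}
\;\propto\; k^{-c+1}, \qquad c > 1 .

% Zipf: the r-th largest of N samples, x_r, has about r samples at or above it.
r \;\approx\; N \,\Pr[X \ge x_r] \;\propto\; x_r^{-(c-1)}
\quad\Longrightarrow\quad
x_r \;\propto\; r^{-1/(c-1)}, \qquad\text{so } c' = \frac{1}{c-1} .
```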
22
Example: City Populations. 1. New York 7,322,564; 2. Los Angeles 3,485,398; 3. Chicago 2,783,726; 4. Houston 1,630,553; 5. Philadelphia 1,585,577; 6. San Diego 1,110,549; 7. Detroit 1,027,974; 8. Dallas 1,006,877; 9. Phoenix 983,403; 10. San Antonio 935,933
23
Example: City Populations. 1. New York 7,322,564; 2. Los Angeles 3,485,398; 3. Chicago 2,783,726; 21. Seattle 516,259; 94. Spokane, WA 177,196; 95. Tacoma, WA 176,664; 96. Little Rock, AR 175,795; 97. Bakersfield, CA 174,820; 98. Fremont, CA 173,339; 99. Fort Wayne, IN 173,072; 100. Arlington, VA 170,936
24
Example: City Populations Power law exponent: c = 0.74
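To make the fit concrete, the sketch below estimates a Zipf-style exponent from only the (rank, population) pairs listed on the two preceding slides, using a least-squares line on the log-log scale. It is an illustration, not the method used to obtain the value on the slide: the slide's exponent was presumably computed from the full ranking, so the number printed here need not match it exactly. The function name `zipf_exponent` is hypothetical.

```python
import math

# (rank, population) pairs copied from the two city-population slides.
cities = [
    (1, 7322564), (2, 3485398), (3, 2783726), (4, 1630553), (5, 1585577),
    (6, 1110549), (7, 1027974), (8, 1006877), (9, 983403), (10, 935933),
    (21, 516259), (94, 177196), (95, 176664), (96, 175795), (97, 174820),
    (98, 173339), (99, 173072), (100, 170936),
]

def zipf_exponent(ranked):
    """Negated least-squares slope of log(population) against log(rank)."""
    lx = [math.log(rank) for rank, _ in ranked]
    ly = [math.log(pop) for _, pop in ranked]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    return -(sxy / sxx)

exponent = zipf_exponent(cities)
print(f"fitted Zipf exponent from the listed cities: {exponent:.2f}")
```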
25
Power Laws in Networks: Degree distribution often satisfies a power law: the fraction $f_d$ of nodes of degree d is proportional to $d^{-c}$
Degree d    Fraction f_d = 1/(2d)
1           1/2
2           1/4
3           1/6
4           ~1/8
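The degree fractions $f_d$ can be computed directly from an edge list. The following is a minimal sketch assuming an undirected graph given as a list of node pairs; the function name `degree_fractions` and the toy star graph are hypothetical, chosen only to keep the example small.

```python
from collections import Counter

def degree_fractions(edges):
    """Map each degree d to the fraction f_d of nodes with that degree.

    Only nodes that appear in at least one edge are counted, so isolated
    nodes (degree 0) are not represented.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())   # degree -> number of nodes with it
    n = len(degree)
    return {d: counts[d] / n for d in sorted(counts)}

# Hypothetical toy graph: a star with centre 0 and leaves 1..4.
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
fractions = degree_fractions(edges)
print(fractions)
```

Plotting log(f_d) against log(d) for a real network is then the usual way to check whether the distribution looks like a power law.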
26
Example: Collaboration Graph. Power law exponent: c = 2.97; with an exponential decay factor, c = 2.46
27
Example: Cross-Post Graph Power law exponent: c = 1.3
28
Example: Inter-Domain Internet Power law exponent: 2.15 < c < 2.2
29
Example: Intra-Domain Internet Power law exponent: c = 2.48
30
Example: Web Graph In-Degree Power law exponent: c = 2.09
31
Example: Web Graph Out-Degree Power law exponent: c = 2.72
32
Small World Phenomenon Six degrees of separation: “Everybody on this planet is separated by only six other people. Six degrees of separation between us and everyone else on this planet. The President of the United States, a gondolier in Venice, just fill in the names.”
33
Small World Phenomenon: Milgram’s famous experiment (1960s). Choose a random person in Nebraska, Bob. Ask Bob to deliver a letter to a random person in Massachusetts, Lashawn. Tell Bob the target’s name, address, and occupation. Instruct Bob to send the letter only to people he knows on a first-name basis
34
Small World Phenomenon: Bob, a farmer in Nebraska → David, mayor of Bob’s town → Bernard, David’s cousin, who went to college with → Maya, who grew up in Boston with → Lashawn
35
Small World Phenomenon in Graphs: The diameter of a graph is the maximum distance (number of edges on a shortest path) between any pair of nodes. The average distance of a graph is the average distance over all pairs of nodes. The average connected distance of a graph is the average distance over all pairs of nodes that are joined by a path
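These distance measures can be computed with one breadth-first search per node. The sketch below assumes an unweighted, undirected graph stored as an adjacency dictionary; the function names and the four-node path graph are hypothetical. Note that it takes the maximum over pairs joined by a path, so on a disconnected graph it reports the largest finite distance rather than an infinite diameter.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def distance_stats(adj):
    """Return (largest finite distance, average connected distance)."""
    longest, total, pairs = 0, 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:                 # skip the zero distance to itself
                longest = max(longest, d)
                total += d
                pairs += 1             # counts ordered pairs; the average is unaffected
    return longest, (total / pairs if pairs else 0.0)

# Hypothetical toy graph: the path 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
diameter, avg = distance_stats(path)
print(diameter, round(avg, 3))
```

Running a search from every node costs on the order of n times the number of edges, so for graphs the size of the web one would sample start nodes instead.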
36
Small World Phenomenon in Graphs: A graph exhibits the small world phenomenon if it has low diameter or low average (connected) distance. Typically, the average distance of a small world graph is on the order of log n, where n is the number of nodes
37
Examples
Collaboration graph: 401,000 nodes, 676,000 edges (average degree 3.37); diameter 23; average distance 7.64
Cross-post graph, giant component: 30,000 nodes, 800,000 edges (average degree 53.3); diameter 13; average distance 3.8
Web graph: 200 million nodes, 1.5 billion edges (average degree 15); average connected distance 16
Inter-domain Internet: 3500 nodes, 6500 edges (average degree 3.71); 95% of pairs of nodes within distance 5
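The average degrees quoted above are consistent with counting each edge at both of its endpoints, i.e. average degree = 2 × edges / nodes. The short check below reproduces that arithmetic from the node and edge counts on the slide; the dictionary keys and the helper name `average_degree` are hypothetical labels.

```python
# (nodes, edges) as listed on the slide.
graphs = {
    "collaboration": (401_000, 676_000),
    "cross-post (giant component)": (30_000, 800_000),
    "web": (200_000_000, 1_500_000_000),
    "inter-domain Internet": (3_500, 6_500),
}

def average_degree(nodes, edges):
    """Each edge contributes to the degree of two nodes."""
    return 2 * edges / nodes

avg = {name: average_degree(n, m) for name, (n, m) in graphs.items()}
for name, value in avg.items():
    print(f"{name}: {value:.2f}")
```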
38
High Clustering Coefficient: The clustering coefficient of a graph is the fraction of connected triples of nodes that form triangles. Intuitively, the clustering coefficient reflects the probability that your friends are themselves friends. We expect social networks to have a high clustering coefficient
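One common way to make this definition precise is the ratio 3 × (number of triangles) / (number of connected triples), where a connected triple is a node together with two of its neighbours. The sketch below assumes that reading of the definition, which may differ in detail from the variant used for the figures in the lecture; the function name and the toy graph are hypothetical.

```python
from itertools import combinations

def clustering_coefficient(adj):
    """Fraction of connected triples that are closed into triangles.

    adj maps each node to the set of its neighbours (undirected graph).
    Each triangle is counted once per centre node, i.e. three times,
    which matches the 3 * triangles numerator.
    """
    triples = 0
    closed = 0
    for node, nbrs in adj.items():
        for u, v in combinations(nbrs, 2):   # a triple centred at node
            triples += 1
            if v in adj[u]:                  # the two neighbours are adjacent
                closed += 1
    return closed / triples if triples else 0.0

# Hypothetical toy graph: triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cc = clustering_coefficient(adj)
print(cc)
```

In the toy graph there are five connected triples, three of which lie on the single triangle, so the ratio is 3/5.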
39
Examples
Collaboration graph: clustering coefficient is 0.14; density of edges is 0.000008
Cross-post graph: clustering coefficient is 0.4492; density of edges is 0.0016
40
Assignment READ: A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, and J. Wiener, Graph structure in the web, WWW, 2000.
41
Graph Structure of the Web: Breadth-first search from randomly chosen start nodes. Follow both forward and backward links. This reveals the directed and undirected graph structure. Over 90% of nodes are reachable if links are treated as undirected. The directed graph reveals a complex bow-tie structure
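The forward and backward searches can be sketched as follows. Starting from one node, a forward search finds everything it can reach, a backward search finds everything that can reach it, and their intersection is the strongly connected set containing the start node. The labels "core", "in", "out", and "other" loosely follow the terminology of the assigned paper; the function names and the toy edge list are hypothetical, and this is an illustration rather than the procedure the authors ran at web scale.

```python
from collections import deque

def reachable(adj, start):
    """Set of nodes reachable from start by breadth-first search over adj."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def bow_tie(edges, start):
    """Classify nodes relative to start using forward, backward, and undirected BFS."""
    forward, backward, undirected = {}, {}, {}
    for u, v in edges:
        forward.setdefault(u, []).append(v)      # follow links
        backward.setdefault(v, []).append(u)     # follow links in reverse
        undirected.setdefault(u, []).append(v)   # ignore direction
        undirected.setdefault(v, []).append(u)
    out_reach = reachable(forward, start)
    in_reach = reachable(backward, start)
    weak = reachable(undirected, start)
    core = out_reach & in_reach                  # mutually reachable with start
    return {
        "core": core,
        "in": in_reach - core,                   # can reach the core, not reached from it
        "out": out_reach - core,                 # reached from the core, cannot reach it
        "other": weak - out_reach - in_reach,    # connected only when direction is ignored
    }

# Hypothetical toy web: a -> b <-> c -> d, plus e -> d.
edges = [("a", "b"), ("b", "c"), ("c", "b"), ("c", "d"), ("e", "d")]
parts = bow_tie(edges, "b")
print(parts)
```

The classification is only meaningful when the start node lies in the large strongly connected component, which is why the slide mentions repeating the search from randomly chosen start nodes.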
42
Bow-Tie Structure of Web Graph Picture from the Nature journal
43
Next Time Probabilistic models for social networks