SDSC, skitter (July 1998) A random graph model for massive graphs William Aiello Fan Chung Graham Lincoln Lu.


1 SDSC, skitter (July 1998) A random graph model for massive graphs William Aiello Fan Chung Graham Lincoln Lu

2 What are the properties of the WWW Graph? Is the World Wide Web connected? If not, how large is the largest component, the second largest component, etc.? Can these questions be answered exactly? Probably not! The WWW is changing constantly. Even a “snapshot” of the Web is too large to handle.

3 An important observation: the WWW graph has a power law degree distribution. Discovered by several groups independently: Broder, Kleinberg, Kumar, Raghavan, Rajagopalan and Tomkins, 1999; Barabási, Albert and Jeong, 1999.

4 Power Law Graphs Power law decay of the degree distribution: the number of vertices of degree d is proportional to $1/d^b$, where b is some constant > 0. Let y(d) be the number of nodes of degree d. Then $y \sim 1/d^b$, that is, $\log y = a - b \log d$.
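Because log y = a - b log d is a straight line in log-log coordinates, the exponent b can be estimated with an ordinary least-squares line fit. The sketch below is only illustrative: the function name `fit_power_law` and the synthetic values a = 10, b = 2.1 are assumptions, not taken from the talk.

```python
import math

def fit_power_law(degree_counts):
    """Least-squares fit of log y = a - b log d; returns (a, b)."""
    points = [(math.log(d), math.log(y)) for d, y in degree_counts.items() if y > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, -slope   # the slope of the line is -b

# Synthetic, noise-free data y(d) = e^a / d^b (hypothetical a and b).
counts = {d: math.exp(10) / d ** 2.1 for d in range(1, 200)}
a, b = fit_power_law(counts)
```

On real degree data the counts are noisy, especially at high degrees, so a plain line fit like this is a rough first estimate at best.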

5

6 Power Law Graphs Robust and Ubiquitous Internet Router Graph Power Grid Graph Phone Call Graph Scientific Citation Graph Co-Stars Graph (e.g. the six degrees of Kevin Bacon) The power in the power law stays constant even as the graphs grow and change.

7 What does a massive graph look like? sparse clustered small diameter Hard to describe ! Harder to analyze !! prohibitively large dynamically changing incomplete information

8 Don’t worry about exact answers— Use Models Instead Data sets too large and dynamic for exact analysis occur in many other areas: the physical, biological, and social sciences and engineering. Progress in understanding often made by iterative interplay between modeling and experimental data, where both often have a random or statistical nature.

9 Modeling Power Law Graphs Develop model of Power Law Graphs Analyze properties of model, e.g., connected component structure Compare results to experimental data Our model will be a variant of an important model in graph theory called Random Graphs

10 Random Graphs G(n,e) – n nodes – all graphs with e edges have uniform probability. Examples: H(3,1), each graph has prob 1/3; H(3,2), each graph has prob 1/3
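The uniform model can be checked by brute force for tiny cases: list every graph on n labeled nodes with exactly e edges and give each the same probability. This is a minimal sketch; the helper name `all_graphs` is hypothetical.

```python
from itertools import combinations
from math import comb

def all_graphs(n, e):
    """Every graph on nodes 0..n-1 with exactly e edges, as tuples of edges."""
    possible_edges = list(combinations(range(n), 2))
    return list(combinations(possible_edges, e))

graphs = all_graphs(3, 1)
probability = 1 / len(graphs)   # uniform over all graphs with e edges
```

For 3 nodes there are 3 graphs with one edge and 3 with two edges, which matches the probability 1/3 on the slide.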

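11 Random Graphs G(n,p) – n nodes – each edge is included with probability p – expected degree = p(n-1). Example with n = 3: a given graph with 0, 1, 2 or 3 edges has probability $(1-p)^3$, $p(1-p)^2$, $p^2(1-p)$ or $p^3$ respectively.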
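A short simulation can illustrate the expected degree p(n-1). This is a sketch under assumed values: n = 400, p = 0.05 and the fixed seed are arbitrary, and `sample_gnp` is a hypothetical helper name.

```python
import random

def sample_gnp(n, p, rng):
    """Include each of the n(n-1)/2 possible edges independently with probability p."""
    return [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < p]

rng = random.Random(0)            # fixed seed, only for reproducibility
n, p = 400, 0.05
edges = sample_gnp(n, p, rng)
average_degree = 2 * len(edges) / n
expected_degree = p * (n - 1)     # about 19.95
```

The observed average degree should land close to the expected value, since the edge count is a sum of many independent coin flips.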

12 Paul Erdős and A. Rényi, On the evolution of random graphs, Magyar Tud. Akad. Mat. Kut. Int. Közl. 5 (1960) 17-61.

13 The evolution of random graphs G(n,p), as p grows from 0: p = o(1/n): disjoint union of trees. p = c/n, 0 < c < 1: cycles of any size. p = 1/n: the double jumps. p = c'/n, c' > 1: one giant component, i.e., of size $\Theta(n)$; other components are o(n)-sized trees. p = log n/n: G(n,p) is connected. p = w log n/n, w → ∞: connected and almost regular, expected degree ~ w log n.

14 Random Graphs and Degree Distributions H(n,s) –n nodes –s = (y(1), y(2), …, y(n-1)), where y(i) is the number of nodes with degree i. –all graphs with degree distribution s have uniform probability

15 Random Graphs and Degree Distributions H(4,s), s = (1,2,1). All have prob. 1/12
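The count behind the probability 1/12 can be verified by enumeration. The sketch below lists every labeled simple graph on n nodes whose degree distribution matches s; nodes not accounted for in s are assumed to have degree 0, and the function name is hypothetical.

```python
from itertools import combinations
from collections import Counter

def graphs_with_degree_distribution(n, s):
    """All labeled simple graphs on n nodes where s[i-1] nodes have degree i."""
    wanted = {0: n - sum(s)}      # assumption: unlisted nodes have degree 0
    wanted.update({i + 1: count for i, count in enumerate(s)})
    possible_edges = list(combinations(range(n), 2))
    num_edges = sum(d * c for d, c in wanted.items()) // 2
    matches = []
    for edge_set in combinations(possible_edges, num_edges):
        degree = Counter(node for edge in edge_set for node in edge)
        histogram = Counter(degree[node] for node in range(n))
        if all(histogram[d] == wanted.get(d, 0) for d in range(n)):
            matches.append(edge_set)
    return matches

graphs = graphs_with_degree_distribution(4, (1, 2, 1))
```

For s = (1,2,1) the matching graphs are the 12 labelings of a triangle with one pendant edge, so each has probability 1/12. Brute force like this is only feasible for very small n.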

16 Random Power Law Graphs A power law degree distribution can be described by two parameters, α and β: $y = e^{\alpha}/x^{\beta}$, i.e. $\log y = \alpha - \beta \log x$, where y is the number of nodes of degree x. A new random graph model: $P(\alpha,\beta)$. $P(\alpha,\beta)$ assigns uniform probability to all graphs with degree distribution $y = e^{\alpha}/x^{\beta}$.

17 A few facts about $P(\alpha,\beta)$: The maximum degree is $e^{\alpha/\beta}$. The number of vertices is $n = \sum_{x=1}^{e^{\alpha/\beta}} e^{\alpha}/x^{\beta} \sim \zeta(\beta)\, e^{\alpha}$, where $\zeta$ is the Riemann zeta function. The number of edges is $E = \frac{1}{2} \sum_{x=1}^{e^{\alpha/\beta}} e^{\alpha}/x^{\beta-1} \sim \zeta(\beta-1)\, e^{\alpha}/2$. The density E/n is controlled by β.
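The zeta-function approximations can be compared against the finite sums numerically. In this sketch the values α = 12 and β = 2.5 are hypothetical, the counts y are treated as real numbers rather than rounded, and `zeta` is a simple approximation written for this example (valid only for s > 1), not a library routine. Note that the edge approximation needs β > 2 so that ζ(β-1) is finite.

```python
import math

def zeta(s, terms=10000):
    """Riemann zeta for s > 1: partial sum plus an integral estimate of the tail."""
    partial = sum(k ** -s for k in range(1, terms + 1))
    return partial + terms ** (1 - s) / (s - 1) - 0.5 * terms ** -s

def exact_counts(alpha, beta):
    """Sum y(x) = e^alpha / x^beta over degrees 1..floor(e^(alpha/beta))."""
    max_degree = int(math.exp(alpha / beta))
    scale = math.exp(alpha)
    n = sum(scale / x ** beta for x in range(1, max_degree + 1))
    E = 0.5 * sum(x * scale / x ** beta for x in range(1, max_degree + 1))
    return n, E

alpha, beta = 12.0, 2.5           # hypothetical parameters with beta > 2
n, E = exact_counts(alpha, beta)
n_approx = zeta(beta) * math.exp(alpha)
E_approx = zeta(beta - 1) * math.exp(alpha) / 2
```

At this small scale the vertex count is already very close to its approximation, while the edge count falls several percent short because the sum for E converges more slowly.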

18 Facts on $P(\alpha,\beta)$, by range of β, with threshold β0 = 3.478..., a root of $\zeta(\beta-2) = 2\zeta(\beta-1)$: 0 < β < 1: connected. β > 1: not connected. β < β0: unique giant component of size $\Theta(n)$. β > β0: no giant component. 1 < β < 2: smaller components are of size O(1). β = 2: smaller components are of size O(log n/log log n); for any x, 2 ≤ x < O(log n/log log n), there is a component of size x. 2 < β < β0: the second largest components are of size O(log n); for any x, 2 < x < O(log n), there is a component of size x.
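The threshold 3.478... can be reproduced by solving ζ(β-2) = 2ζ(β-1) numerically. The sketch below uses bisection on an assumed bracket [3.1, 4.0], where the difference changes sign; `zeta` is again a simple hand-written approximation for s > 1, and the function names are hypothetical.

```python
def zeta(s, terms=2000):
    """Riemann zeta for s > 1: partial sum plus an integral estimate of the tail."""
    partial = sum(k ** -s for k in range(1, terms + 1))
    return partial + terms ** (1 - s) / (s - 1) - 0.5 * terms ** -s

def giant_component_threshold(lo=3.1, hi=4.0, iterations=50):
    """Bisection on f(beta) = zeta(beta - 2) - 2 * zeta(beta - 1)."""
    def f(beta):
        return zeta(beta - 2) - 2 * zeta(beta - 1)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:            # f is positive to the left of the root
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

beta_0 = giant_component_threshold()
```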

19

20 How do Power Law Graphs Arise? The previous model takes the power law degree distribution as a given. It does not explain how such graphs arise. Results which hold in the model with high probability (e.g., our connected component results) will apply to the vast majority of power law graphs regardless of the particulars of the evolution process.

21 Yet Another Random Graph Model A random graph evolution on n nodes: Let $K_n$ be the set of all possible edges. Let $E_t$ be the edges chosen in steps 1 through t. At time step t+1, choose uniformly one of the edges in $K_n - E_t$. Add this edge to $E_t$ to get $E_{t+1}$. Study what structures appear with high probability as a function of t.
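This evolution is easy to simulate: drawing edges uniformly without replacement is the same as shuffling all possible edges and taking a prefix. The sketch below also tracks one structure of interest, the largest connected component, using a union-find; the function names are hypothetical.

```python
import random
from itertools import combinations

def uniform_edge_evolution(n, steps, rng):
    """Return the edges after `steps` uniform draws without replacement from K_n."""
    remaining = list(combinations(range(n), 2))
    rng.shuffle(remaining)        # a uniform random order of all possible edges
    return remaining[:steps]

def largest_component(n, edges):
    """Size of the largest connected component, via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    sizes = {}
    for node in range(n):
        root = find(node)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())

rng = random.Random(1)            # fixed seed, only for reproducibility
edges = uniform_edge_evolution(50, 100, rng)
giant = largest_component(50, edges)
```

Calling `largest_component` for increasing values of `steps` shows how the component structure changes as t grows.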

22 Need a new idea The previous evolution fixes the set of nodes and then adds edges. Can show that to get a power law, need to add both nodes and edges. The previous evolution chooses uniformly among all eligible edges. Can show that selecting edges uniformly will not yield a power law.

23 A Graph Evolution Process At each time step t, toss a biased coin having heads with probability p. “tails” -> add a new vertex with a self-loop. “heads” -> add a new edge between the existing set of nodes: – Select a vertex u with probability proportional to the degree of u, i.e., Pr[ u chosen ] = deg(u)/(2|E|). – Independently select vertex v with probability proportional to deg(v). – Add the edge {u,v}.

24 A Graph Evolution Process [Figure: the graph G_t; with probability p an edge {u,v} is added, with probability 1-p a new vertex.] The number of nodes grows with time. Edges are not added uniformly. Nodes which are added early have an “advantage” over nodes added late. Gives a power law degree distribution $y \sim 1/d^{1+1/p}$.
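The coin-toss process can be sketched in a few lines. Keeping a list in which each vertex appears once per unit of degree makes a uniform draw from that list equal to a degree-proportional draw. The starting graph (a single vertex with a self-loop) is an assumption, since the slides do not state the initial state, and the function names are hypothetical.

```python
import random
from collections import Counter

def evolve(steps, p, rng):
    """Simulate the coin-toss process; returns the list of edges as (u, v) pairs."""
    edges = [(0, 0)]              # assumed start: one vertex with a self-loop
    endpoints = [0, 0]            # vertex v appears deg(v) times
    num_vertices = 1
    for _ in range(steps):
        if rng.random() < p:      # heads: new edge between existing vertices
            u = rng.choice(endpoints)   # Pr[u] = deg(u) / (2|E|)
            v = rng.choice(endpoints)   # chosen independently of u
        else:                     # tails: new vertex with a self-loop
            u = v = num_vertices
            num_vertices += 1
        edges.append((u, v))
        endpoints.extend((u, v))
    return edges

def degree_histogram(edges):
    """Map degree d to the number of vertices with that degree."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1            # a self-loop contributes 2
    return Counter(degree.values())

edges = evolve(5000, 0.5, random.Random(0))
histogram = degree_histogram(edges)
```

Every vertex starts with a self-loop, so the smallest degree in the histogram is 2. Plotting the histogram on log-log axes for longer runs is one way to compare against the predicted exponent 1 + 1/p.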

25 Comparisons From simulation using Model B From real data

26 Evolution Process for Directed Graphs Select a vertex u with probability proportional to the out-degree of u, i.e., Pr[ u chosen ] = out-deg(u)/|E|. Select a vertex v with probability proportional to the in-degree of v. Flip two coins; heads with prob p1 and p2. – Heads, heads -> add an edge from u to v. – Heads, tails -> add an edge from u to a new node. – Tails, heads -> add an edge from a new node to v. – Tails, tails -> add a directed self-loop to a new node. # nodes w/ outdegree d ~ $1/d^{1+1/p_1}$; # nodes w/ indegree d ~ $1/d^{1+1/p_2}$
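A sketch of the directed process follows. Picking a uniformly random existing edge and taking its source gives a vertex with probability out-deg(u)/|E|; taking the target of an independently chosen edge does the same for in-degree. The starting graph (one node with a directed self-loop) and the function name are assumptions.

```python
import random

def evolve_directed(steps, p1, p2, rng):
    """Simulate the two-coin process; returns (edges, number of nodes)."""
    edges = [(0, 0)]              # assumed start: one node with a directed self-loop
    num_nodes = 1
    for _ in range(steps):
        u = rng.choice(edges)[0]  # source of a random edge: Pr[u] = out-deg(u)/|E|
        v = rng.choice(edges)[1]  # target of a random edge: Pr[v] = in-deg(v)/|E|
        first_heads = rng.random() < p1
        second_heads = rng.random() < p2
        if first_heads and second_heads:
            new_edge = (u, v)                   # edge from u to v
        elif first_heads:
            new_edge = (u, num_nodes)           # edge from u to a new node
            num_nodes += 1
        elif second_heads:
            new_edge = (num_nodes, v)           # edge from a new node to v
            num_nodes += 1
        else:
            new_edge = (num_nodes, num_nodes)   # self-loop on a new node
            num_nodes += 1
        edges.append(new_edge)
    return edges, num_nodes

edges, num_nodes = evolve_directed(2000, 0.6, 0.7, random.Random(0))
```

The values p1 = 0.6 and p2 = 0.7 are arbitrary; out-degree and in-degree histograms from longer runs can be compared with the two exponents given on the slide.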

27 Massive Graphs Random graphs Similarities: Adding one (random) edge at a time. Differences: Random graphs <-- almost regular. Massive graphs <-- uneven degrees. Correlations.

28 The advantages of power law models Approximating real data graphs. Possible to analyze rigorously—discover implicit structure of massive graphs Models for generating network topologies

29 Erdős and Rényi’s seminal papers. Methods: Martingales. Concentration bounds. Molloy and Reed’s results on random graphs with given degree sequences.

30 A Java generation/simulation of power law graphs can be found at http://math.ucsd.edu/~llu Future directions: the evolution of power law graphs concerning: - diameters of connected components (Lu’s thesis) - frequency of occurrences of certain subgraphs - power law of eigenvalues - scaling behavior of power law graphs - “signatures” in graphs to distinguish models

