1 A Random-Surfer Web-Graph Model Avrim Blum, Hubert Chan, Mugizi Rwebangira Carnegie Mellon University.

Presentation transcript:

1 A Random-Surfer Web-Graph Model Avrim Blum, Hubert Chan, Mugizi Rwebangira Carnegie Mellon University

2 The Web as a Graph Consider the World Wide Web as a graph, with web pages as nodes and links between pages as edges. Experiments suggest that the degree distribution of the Web-Graph follows a power law [FFF99]. links.html resume.html index.html

3 Power Law The distribution of a quantity X follows a power law if $\Pr(X=k) = C k^{-\alpha}$. Taking the logarithm of both sides: $\log \Pr(X=k) = \log C - \alpha \log k$. Thus a log-log plot of a power-law distribution is a straight line with slope $-\alpha$.
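The straight-line property can be checked numerically. The sketch below is illustrative only: the function name `loglog_slope` and the values C = 0.5 and α = 2.5 are assumptions made for the example, not taken from the slides. It fits a least-squares line to log Pr(X=k) against log k and recovers the slope and intercept.

```python
import math

def loglog_slope(ks, probs):
    """Least-squares slope and intercept of log(probs) against log(ks)."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(q) for q in probs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical parameters chosen for illustration (values are not normalized).
C, alpha = 0.5, 2.5
ks = list(range(1, 51))
probs = [C * k ** (-alpha) for k in ks]

slope, intercept = loglog_slope(ks, probs)
print(slope, intercept)  # slope is close to -alpha, intercept close to log(C)
```

On empirical degree counts the fit would be noisy, especially in the tail, so a fitted slope should be read as a rough indication rather than a precise estimate of α.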

4 Previous Work Barabási and Albert proposed the Preferential Attachment model [BA99]: each new node connects to the existing nodes with probability proportional to their degree. Preferential Attachment is known to give a power-law degree distribution [Mitzenmacher, Cooper & Frieze 03, KRRSTU00]. Other proposed models include the “copying model” [KRRSTU00].

5 Motivating Questions Why would a new node connect to nodes of high degree? - Are high-degree nodes more attractive? - Or are there other explanations? How does a new node find out which nodes have high degree? Motivating Observation: Suppose each page has a small probability p of being interesting, and a user does an (undirected) random walk until they find an interesting page. If p is small, then this is the same as preferential attachment (a long random walk on an undirected graph visits nodes in proportion to their degree). What about other processes and directed graphs?

6 Directed 1-Step Random Surfer At time 1, we start with a single node with a self-loop. At each time t > 1, a new node arrives and an existing node is chosen uniformly at random. With probability p the new node connects to the chosen node; with probability 1-p it connects to a random out-neighbor of the chosen node. (Extension: repeat the process k times for each new node to get out-degree k.) Note: This model is just another way of stating the directed preferential attachment model.
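A minimal simulation sketch of this process, assuming out-degree k = 1 so that "a random out-neighbor" is simply the chosen node's single out-neighbor. The names `one_step_random_surfer` and `in_degrees`, the 0-based node numbering, and the `seed` parameter are illustrative choices, not part of the slides.

```python
import random
from collections import Counter

def one_step_random_surfer(n, p, seed=None):
    """Return out[v], the single out-neighbor of each node v (assumes k = 1)."""
    rng = random.Random(seed)
    out = [0]  # time 1: node 0 with a self-loop
    for new in range(1, n):
        chosen = rng.randrange(new)   # existing node chosen uniformly at random
        if rng.random() < p:
            target = chosen           # with probability p: link to the chosen node
        else:
            target = out[chosen]      # with probability 1-p: link to its out-neighbor
        out.append(target)
    return out

def in_degrees(out):
    """Count incoming edges per node (the root's self-loop is counted too)."""
    return Counter(out)

graph = one_step_random_surfer(10_000, 0.5, seed=0)
degree_counts = Counter(in_degrees(graph).values())  # in-degree -> number of nodes
```

Plotting `degree_counts` on log-log axes is one way to eyeball whether the simulated in-degrees look power-law distributed.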

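7 Directed 1-step Random Surfer, p=.5 [Figure: worked example of the graph at T=1 through T=4, with each possible new edge annotated by its probability. At T=3 the new node links to the first node with probability ¾ and to the second with probability ¼; at T=4 each probability is a sum of terms of the form (½)(⅓).]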

8 Directed Coin Flipping model
1. At time 1, we start with a single node with a self-loop.
2. At time t, we choose a node uniformly at random.
3. We then flip a coin of bias p.
4. If the coin comes up heads, we connect to the current node.
5. Else we walk to a random neighbor and go to step 3.
“Each page has equal probability p of being interesting to us.”
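The steps above can be sketched in a few lines, again assuming every node has out-degree 1, so that "a random neighbor" is the current node's single out-neighbor. The function name `coin_flipping_graph`, the 0-based numbering, and the `seed` parameter are hypothetical.

```python
import random

def coin_flipping_graph(n, p, seed=None):
    """Return out[v], the node that v links to, under the directed coin-flipping model."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1] so that the walk terminates")
    rng = random.Random(seed)
    out = [0]  # step 1: a single node with a self-loop
    for new in range(1, n):
        current = rng.randrange(new)   # step 2: uniform random starting node
        while rng.random() >= p:       # step 3: tails occurs with probability 1 - p
            current = out[current]     # step 5: follow the out-edge, then flip again
        out.append(current)            # step 4: heads, so connect to the current node
    return out

links = coin_flipping_graph(5_000, 0.5, seed=0)
```

Because the root has a self-loop, a walk that reaches it simply stays there until the coin comes up heads, which is why the loop needs p > 0 to terminate.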

9 [Figure: a new node begins its walk at a random starting node. Coin toss 1: tails. Coin toss 2: tails. Coin toss 3: heads.]

10 Is Directed Coin-Flipping Power-lawed? We don’t know … but we do have some partial results... Note: unlike for undirected graphs, the case p → 0 is not so interesting, since then you just get a star.

11 Virtual Degree Definition: Let $\ell_i(u)$ be the number of level-$i$ descendants of $u$. Let $\beta_i$ ($i \ge 1$) be a sequence of real numbers with $\beta_1 = 1$. Then $v(u) = 1 + \sum_{i \ge 1} \beta_i \ell_i(u)$.

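12 Virtual Degree $v(u) = 1 + \beta_1 \ell_1(u) + \beta_2 \ell_2(u) + \beta_3 \ell_3(u) + \beta_4 \ell_4(u) + \dots$ [Figure: example node u with 2 level-1 descendants and 4 level-2 descendants], for which $v(u) = 1 + \beta_1(2) + \beta_2(4) + \beta_3(0) + \beta_4(0) + \dots$ Easy observation: If we set $\beta_i = (1-p)^i$ then the expected increase in degree(u) is proportional to v(u).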
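One way to see the easy observation, sketched under assumptions the slide does not spell out: we read it in the directed coin-flipping model with out-degree 1, with $t$ nodes present, for a node $u$ other than the root, and we write $\ell_0(u) = 1$ for $u$ itself. A walk that starts at a level-$i$ descendant of $u$ ends at $u$ exactly when the coin shows $i$ tails followed by one head.

```latex
\Pr[\text{new node links to } u]
  = \frac{1}{t} \sum_{i \ge 0} \ell_i(u)\, p\,(1-p)^i
  = \frac{p}{t} \Bigl( 1 + \sum_{i \ge 1} (1-p)^i\, \ell_i(u) \Bigr)
  = \frac{p}{t}\, v(u)
  \quad \text{with } \beta_i = (1-p)^i .
```

The root is excluded here because its self-loop lets a walk remain at it across several coin flips, which changes the count for that one node.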

13 Virtual Degree Theorem: There always exist $\beta_i$ such that
1. For $i \ge 1$, $|\beta_i| \le 1$.
2. As $i \to \infty$, $\beta_i \to 0$ exponentially.
3. The expected increase in $v(u)$ is proportional to $v(u)$.
Recurrence: $\beta_1 = 1$, $\beta_2 = p$, $\beta_{i+1} = \beta_i - (1-p)\beta_{i-1}$.
E.g., for p=¾, $\beta_i$ = 1, 3/4, 1/2, 5/16, 3/16, 7/64, ...; for p=½, $\beta_i$ = 1, 1/2, 0, -1/4, -1/4, -1/8, 0, 1/16, …
Let $v_t(u)$ be the virtual degree of node u at time t and $t_u$ be the time when node u first appears. Theorem: For any node u and time $t \ge t_u$, $E[v_t(u)] = \Theta((t/t_u)^p)$.
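The example sequences on this slide can be reproduced directly from the recurrence. The sketch below uses exact rational arithmetic; the function name `beta_sequence` is an illustrative choice.

```python
from fractions import Fraction

def beta_sequence(p, count):
    """Return [beta_1, ..., beta_count] for
    beta_1 = 1, beta_2 = p, beta_{i+1} = beta_i - (1 - p) * beta_{i-1}."""
    p = Fraction(p)
    betas = [Fraction(1), p]
    while len(betas) < count:
        betas.append(betas[-1] - (1 - p) * betas[-2])
    return betas[:count]

print([str(b) for b in beta_sequence(Fraction(3, 4), 6)])
print([str(b) for b in beta_sequence(Fraction(1, 2), 8)])
```

Using `Fraction` rather than floats keeps the terms exact, so they can be compared term by term with the values listed on the slide.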

14 Virtual Degree, contd. Let $v_t(u)$ be the virtual degree of node u at time t and $t_u$ be the time when node u first appears. Theorem: For any node u and time $t \ge t_u$, $E[v_t(u)] = \Theta((t/t_u)^p)$. We also have some weak concentration bounds. Unfortunately they are not strong enough: if they could be strengthened, we would have a proof that virtual degrees (not just their expectations) follow a power law.

15 Actual Degree We can also obtain lower bounds on the actual degrees. Theorem: For any node u and time $t \ge t_u$, $E[\ell_1(u)] \ge \Omega((t/t_u)^{p(1-p)})$.

16 Experiments Random graphs of n=100,000 nodes. Statistics are averaged over 100 runs. K=1 (every node has out-degree 1).

17 Uniform random connections

18 Directed 1-Step Random Surfer, p=3/4

19 Directed 1-Step Random Surfer, p=1/2

20 Directed 1-Step Random Surfer, p=1/4

21 Directed Coin Flipping, p=1/2

22 Directed Coin Flipping, p=1/4

23 Undirected coin flipping, p=1/2

24 Undirected Coin Flipping p=0.05

25 Conclusions Directed random walk models appear to generate power laws (supported by partial theoretical results). Power laws can emerge naturally even if all nodes have the same intrinsic “attractiveness” (and even in the absence of a “role model” as in the copying model).

26 Open questions Can we prove that the degrees in the directed coin-flipping model indeed follow a power law? Can we analyze the degree distribution for the undirected coin-flipping model with p=1/2? Suppose page i has “interestingness” $p_i$. Can we analyze the degree as a function of t, i and $p_i$?

27 Questions?