1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira.

Presentation transcript:

1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira

2 The Web as a Graph Consider the World Wide Web as a graph, with web pages as nodes and hyperlinks between pages as edges. [Figure: example pages index.html, links.html and resume.html as nodes.]

3 Studying the Web Since the Web emerged there has been a lot of interest in: 1. Empirically studying properties of the Web Graph. 2. Modeling the Web Graph mathematically. Benefits of Generative Models: 1. Simulation – When real data is scarce. 2. Extrapolation – How will the graph change? 3. Understanding – Inspire further research on real data.

4 Power Law The distribution of a random variable X follows a power law if $\Pr[X=k] \sim C k^{-\alpha}$, where $f(x) \sim g(x)$ means $\lim_{x\to\infty} f(x)/g(x) = 1$, e.g. $(x+1) \sim (x+2)$. Example: $\Pr[X=k] = k^{-2}$.

5 Power Law: $\Pr[X=k] = k^{-2}$

6 Power Law Taking logarithms, $\Pr[X=k] \sim C k^{-\alpha}$ becomes $\log \Pr[X=k] \sim \log C - \alpha \log k$. For the example, $\Pr[X=k] = k^{-2}$ becomes $\log \Pr[X=k] = -2 \log k$.

7 Power Law: Log-Log plot
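The straight line on a log-log plot can be checked numerically. The following is a minimal sketch, not part of the original slides: it takes the example Prob[X=k] = k^-2 from the earlier slides and fits a least-squares line to (log k, log Prob[X=k]). The helper name `loglog_slope` and the range of k are assumptions chosen for illustration.

```python
import math

def loglog_slope(ks, probs):
    """Least-squares slope of log(prob) plotted against log(k)."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(p) for p in probs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

ks = list(range(1, 1001))
probs = [k ** -2 for k in ks]      # the slides' example: Prob[X=k] = k^-2
slope = loglog_slope(ks, probs)    # estimates -alpha, i.e. about -2 here
```

The fitted slope estimates -α. On real degree data the fit is usually restricted to the tail, where the power law is expected to hold.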

8 Power Law contd. More general definition: $\Pr[X \ge k] \sim C k^{-\alpha}$. Particularly useful if X takes on real values. Power laws are sometimes referred to as “heavy tailed” or “scale free.”

9 Power Laws in Degree distribution Let G be a graph. Let $X_k$ be the proportion of nodes with degree k in G. Then if $X_k \sim C k^{-\alpha}$ we say that G has a power-law degree distribution.

10 Properties of the Web Graph A Power-law degree distribution has been observed in a wide variety of graphs including citation networks, social networks, protein-protein interaction networks and so on. It has also been observed in the Web Graph. [Barabási & Albert]

11 Outline Background/Previous Work Motivation Models Theoretical results Experimental results Conclusions

12 Classic Random Graph Models In the G(n,p) random graph model: 1. There are n nodes. 2. There is an edge between any two nodes with probability p. It was proposed by Erdős and Rényi in the 1960s.

13 Online G(n,p) In this model each new node makes k connections to existing nodes uniformly at random. For this talk we will focus on k = 1, hence the graph will be a tree.

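14 Online G(n,p) [Figure: the tree at T=1 through T=4. The node arriving at T=3 attaches to each of the 2 existing nodes with probability ½; the node arriving at T=4 attaches to each of the 3 existing nodes with probability ⅓.]

15 Properties of Online G(n,p) $X_k$ = proportion of nodes with degree k. $E[X_k] = \Theta(2^{-k})$. $E[\text{degree of first node}] = 1 + 1/2 + 1/3 + 1/4 + \dots + 1/n = \Theta(\log n)$. $E[\text{max degree}] = \Theta(\log n)$. NOT POWER LAWED!!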
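These properties can be observed in a small simulation. This is a rough sketch under stated assumptions, not the authors' experiment code: k = 1, nodes numbered from 0, a fixed seed, and only in-degrees tracked (every node except the first also has out-degree 1). The function name `online_uniform_tree` is hypothetical.

```python
import random
from collections import Counter

def online_uniform_tree(n, seed=0):
    """In-degrees of a tree in which node t links to a uniformly chosen earlier node."""
    rng = random.Random(seed)
    indeg = [0] * n
    for t in range(1, n):
        target = rng.randrange(t)   # uniform over existing nodes 0..t-1
        indeg[target] += 1
    return indeg

indeg = online_uniform_tree(100_000)
counts = Counter(indeg)
# Fraction of nodes with each in-degree; it should roughly halve as the in-degree grows by 1.
fractions = {k: counts[k] / len(indeg) for k in sorted(counts)}
max_indeg = max(indeg)              # grows like log n, not like a power of n
```

About half of the nodes end up with no incoming edge and about a quarter with exactly one, which is the geometric decay stated on the slide rather than a power law.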

16 Online G(n,p) (n=100,000, average of 100 runs)

17 Preferential Attachment In the Preferential Attachment model, each new node connects to the existing nodes with a probability proportional to their degree. [Barabási & Albert]

18 Preferential Attachment [Figure: graph growth at T=1 through T=4, with degree = in-degree + out-degree. Labels: node degrees 3 and 1 with attachment probabilities ¾ and ¼, then degrees 4 and 1.]

19 Preferential Attachment Preferential Attachment gives a power-law degree distribution. [Mitzenmacher, Cooper & Frieze 03, KRRSTU00] E[degree of 1st node] = √n
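A common way to simulate preferential attachment is to keep a list in which every node appears once per unit of degree, so a uniform draw from that list is a degree-proportional draw. The sketch below assumes the setup suggested by the slides (a single starting node with a self-loop, one out-edge per new node, degree = in-degree + out-degree). The function name and the fixed seed are illustrative choices, not taken from the talk.

```python
import random

def preferential_attachment_tree(n, seed=0):
    """Total degrees after n steps of degree-proportional attachment."""
    rng = random.Random(seed)
    degree = [2]          # node 0 starts with a self-loop: in-degree 1 + out-degree 1
    endpoints = [0, 0]    # node u appears in this list degree[u] times
    for t in range(1, n):
        target = rng.choice(endpoints)   # chosen with probability degree / total degree
        degree.append(1)                 # the new node's single out-edge
        degree[target] += 1
        endpoints.extend([t, target])
    return degree

degree = preferential_attachment_tree(100_000)
first_node_degree = degree[0]
max_degree = max(degree)
```

In a single run the degree of the first node fluctuates considerably, so a comparison with the √n growth stated on the slide is best made on an average over many runs.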

20 Preferential Attachment

21 Other Models Kumar et al. proposed the “copying model.” [KRRSTU00] Leskovec et al. proposed a “forest fire” model which has some similarities to this work. [LKF05]

22 Outline Background/Previous Work Motivation Models Theoretical results Experimental results Conclusions

23 Motivating Questions Why would a new node connect to nodes of high degree? -Are high degree nodes more attractive? -Or are there other explanations? How does a new node find out what the high degree nodes are?

24 Motivating Questions Suppose each page has a small probability p of being interesting. Suppose a user does an (undirected) random walk until they find an interesting page. Motivating Observation: If p is small then this is the same as preferential attachment, since a long undirected random walk visits nodes roughly in proportion to their degree. What about other processes and directed graphs?

25 Outline Background/Previous Work Motivation Models Theoretical results Experimental results Conclusions

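26 Directed 1-step Random Surfer, p=.5 Start with a single node with a self-loop. For each new node: 1. Choose a node uniformly at random. 2. With probability p connect to it. 3. With probability (1-p) connect to its neighbor. [Figure: T=1 through T=3. At T=3 the new node connects to the first node with probability (½)(½) + (½)(½) + (½)(½) = ¾ and to the second node with probability ¼.]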
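The ¾ and ¼ on this slide can be reproduced exactly. The sketch below assumes every node has out-degree 1 and stores the graph as a list `out_neighbor`, where `out_neighbor[u]` is the node that u points to; that representation and the function name are choices made for this illustration.

```python
from fractions import Fraction

def one_step_attach_probs(out_neighbor, p):
    """Exact probability that the next node links to each existing node."""
    n = len(out_neighbor)
    probs = [Fraction(0)] * n
    for u in range(n):
        probs[u] += Fraction(1, n) * p                      # chose u and connected to it
        probs[out_neighbor[u]] += Fraction(1, n) * (1 - p)  # chose u, connected to its neighbor
    return probs

# Graph at T=2: node 0 has a self-loop, node 1 points to node 0.
probs = one_step_attach_probs([0, 0], Fraction(1, 2))
```

Each node receives p/n from being chosen directly plus (1-p)/n for every edge pointing at it, which is why the model behaves like a mixture of uniform and preferential attachment.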

27 Directed 1-step Random Surfer It turns out this model is a mixture of connecting to nodes uniformly at random and preferential attachment. It has a power-law degree distribution. But taking one step is not very natural. What about doing a real random walk?

28 Directed Coin Flipping model 1. Pick a node uniformly at random. 2. Flip a coin of bias p. If HEADS connect to the current node, else walk to a neighbor. [Figure: nodes A, B, C, D and a new node. The random starting node is A. Coin toss 1: TAIL (at node A). Coin toss 2: TAIL (at node B). Coin toss 3: HEAD (at node C), so the new node connects to C.]

29 Directed Coin Flipping model 1. At time 1, we start with a single node with a self-loop. 2. At time t, we choose a node u uniformly at random. 3. We then flip a coin of bias p. 4. If the coin comes up heads, we connect the new node to the current node. 5. Else we walk to a random neighbor and go to step 3. In other words, “each page has equal probability p of being interesting to us.”
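The five steps translate almost line by line into a simulation. This is a minimal sketch, assuming out-degree 1 for every node (so "a random neighbor" is the single node that the current node points to), nodes numbered from 0, 0 < p ≤ 1, and a fixed seed. The names `directed_coin_flipping`, `parent` and `indeg` are illustrative, not from the talk.

```python
import random

def directed_coin_flipping(n, p, seed=0):
    """Grow an n-node graph with the directed coin-flipping rule (requires 0 < p <= 1)."""
    rng = random.Random(seed)
    parent = [0]    # parent[u] is the node u links to; node 0 has a self-loop
    indeg = [1]     # the self-loop counts as one incoming edge for node 0
    for t in range(1, n):
        u = rng.randrange(t)          # step 2: uniform random starting node
        while rng.random() >= p:      # steps 3 and 5: tails, walk along the out-edge
            u = parent[u]
        parent.append(u)              # step 4: heads, the new node t links to u
        indeg.append(0)
        indeg[u] += 1
    return parent, indeg

parent, indeg = directed_coin_flipping(10_000, 0.5)
```

A walk that reaches node 0 follows the self-loop until the coin comes up heads, so every walk ends after finitely many steps with probability 1.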

30 Outline Background/Previous Work Motivation Models Theoretical results Experimental results Conclusions

31 Is Directed Coin-Flipping Power-lawed? We don’t know … but we do have some partial results...

32 Virtual Degree Definitions: Let $l_i(u)$ be the number of level-$i$ descendants of node u: $l_1(u)$ = # of children, $l_2(u)$ = # of grandchildren, etc. Let $\beta = (\beta_1, \beta_2, \dots)$ be a sequence of real numbers with $\beta_1 = 1$. Then $v_\beta(u) = 1 + \beta_1 l_1(u) + \beta_2 l_2(u) + \beta_3 l_3(u) + \dots$ We’ll call $v_\beta(u)$ the “virtual degree of u with respect to $\beta$.”

33 Virtual Degree [Figure: node u with 2 children and 4 grandchildren.] $v(u) = 1 + \beta_1 (2) + \beta_2 (4) + \beta_3 (0) + \beta_4 (0) + \dots$, where 2 is the # of children and 4 is the # of grandchildren.

34 Virtual Degree Easy observation: If we set $\beta_i = (1-p)^i$ then the expected increase in deg(u) is proportional to v(u). Expected increase in deg(u) $= p/t + (1-p)\,p\,l_1(u)/t + (1-p)^2\,p\,l_2(u)/t + \dots = (p/t)\,v(u)$.
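The observation can be verified exactly on a small tree. The sketch below assumes the directed coin-flipping model with out-degree 1, a `parent` list in which node 0 is the root with a self-loop, and children numbered after their parents. The example tree places a node with 2 children and 4 grandchildren (as in the figure on the previous slide) below the root. Function names are hypothetical, and the identity is checked only for non-root nodes, since walks that reach the root stay there because of the self-loop.

```python
from fractions import Fraction

def virtual_degrees(parent, p):
    """v(u) = 1 + sum over i >= 1 of (1-p)^i * l_i(u), for a tree rooted at node 0."""
    v = [Fraction(1)] * len(parent)
    # Children have larger indices than their parents, so sweep from the newest node down.
    for w in range(len(parent) - 1, 0, -1):
        v[parent[w]] += (1 - p) * v[w]
    return v

def attach_probs(parent, p):
    """Exact probability that the next node links to each existing node."""
    t = len(parent)
    probs = [Fraction(0)] * t
    for start in range(t):
        mass, u = Fraction(1, t), start
        while u != 0:
            probs[u] += mass * p      # heads at u
            mass *= (1 - p)           # tails: follow the out-edge
            u = parent[u]
        probs[0] += mass              # at the root the self-loop keeps the walk in place
    return probs

p = Fraction(1, 2)
parent = [0, 0, 1, 1, 2, 2, 3, 3]     # node 1 has 2 children and 4 grandchildren
v = virtual_degrees(parent, p)
probs = attach_probs(parent, p)
```

For node 1 this gives v = 1 + 2(½) + 4(¼) = 3 and an attachment probability of (p/t)·v = 3/16 with t = 8.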

35 Virtual Degree Theorem: There always exist $\beta_i$ such that 1. For $i \ge 1$, $|\beta_i| \le 1$. 2. As $i \to \infty$, $\beta_i \to 0$ exponentially. 3. The expected increase in v(u) is proportional to v(u). Recurrence: $\beta_1 = 1$, $\beta_2 = p$, $\beta_{i+1} = \beta_i - (1-p)\,\beta_{i-1}$. E.g., for p=¾, $\beta_i$ = 1, 3/4, 1/2, 5/16, 3/16, 7/64, ...; for p=½, $\beta_i$ = 1, 1/2, 0, -1/4, -1/4, -1/8, 0, 1/16, …
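The recurrence is easy to iterate with exact rational arithmetic, which reproduces the two example sequences on this slide. A small sketch follows; the function name `beta_sequence` is an assumption, and the indexing follows the slide (the first two entries are 1 and p).

```python
from fractions import Fraction

def beta_sequence(p, length):
    """First `length` terms of beta_1 = 1, beta_2 = p, beta_{i+1} = beta_i - (1-p) * beta_{i-1}."""
    betas = [Fraction(1), Fraction(p)]
    while len(betas) < length:
        betas.append(betas[-1] - (1 - p) * betas[-2])
    return betas[:length]

half = beta_sequence(Fraction(1, 2), 8)            # the p = 1/2 example
three_quarters = beta_sequence(Fraction(3, 4), 6)  # the p = 3/4 example
```

Printing a longer prefix, for example `beta_sequence(Fraction(1, 2), 40)`, lets one inspect the decay toward 0 claimed in the theorem.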

36 Virtual Degree, continued Let $v_t(u)$ be the virtual degree of node u at time t and $t_u$ be the time when node u first appears. Theorem: For any node u and time $t \ge t_u$, $E[v_t(u)] = \Theta((t/t_u)^p)$. So, the expected virtual degrees follow a power law.
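One way to see where the exponent p could come from is the following sketch, which is not the authors' proof. It assumes that the proportionality constant in the previous theorem is p/t, that is, the expected increase of the virtual degree at time t is (p/t)·v_t(u), and it uses the fact that v(u) = 1 when u first appears because u has no descendants yet.

```latex
\begin{align*}
E[v_{t+1}(u)] &= \Bigl(1 + \frac{p}{t}\Bigr)\, E[v_t(u)], \qquad v_{t_u}(u) = 1, \\
E[v_t(u)] &= \prod_{s=t_u}^{t-1} \Bigl(1 + \frac{p}{s}\Bigr)
           = \exp\Bigl(\sum_{s=t_u}^{t-1} \ln\bigl(1 + \tfrac{p}{s}\bigr)\Bigr) \\
          &= \exp\Bigl(p \ln\frac{t}{t_u} + O(1)\Bigr)
           = \Theta\bigl((t/t_u)^{p}\bigr).
\end{align*}
```

The third line uses ln(1 + p/s) = p/s + O(1/s²) together with the harmonic-sum estimate for the sum of 1/s from t_u to t-1, which equals ln(t/t_u) + O(1).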

37 Actual Degree We can also obtain lower bounds on the expected values of the actual degrees. Theorem: For any node u and time $t \ge t_u$, $E[\text{degree}(u)] \ge \Omega((t/t_u)^{p(1-p)})$.

38 Outline Background/Previous Work Motivation Models Theoretical results Experimental results Conclusions

39 Experiments Random graphs of n=100,000 nodes. Compute statistics averaged over 100 runs. K=1 (every node has out-degree 1).

40 Online Erdős-Rényi

41 Directed 1-Step Random Surfer, p=3/4

42 Directed 1-Step Random Surfer, p=1/2

43 Directed 1-Step Random Surfer, p=1/4

44 Directed Coin Flipping, p=1/2

45 Directed Coin Flipping, p=1/4

46 Undirected Coin Flipping, p=1/2

47 Undirected Coin Flipping, p=0.05

48 Outline Background/Previous Work Motivation Models Theoretical results Experimental results Conclusions

49 Conclusions Directed random walk models appear to generate power laws, and we have partial theoretical results to support this. Power laws can naturally emerge, even if all nodes have the same intrinsic “attractiveness”.

50 Open questions Can we prove that the degrees in the directed coin-flipping model do indeed follow a power law? Can we analyze the degree distribution for the undirected coin-flipping model with p=1/2? Suppose page i has “interestingness” $p_i$. Can we analyze the degree as a function of t, i and $p_i$?

51 Questions?