CS224W: Social and Information Network Analysis

Slides:

Advertisements

Similar presentations

Peer-to-Peer and Social Networks Power law graphs Small world graphs.

Advertisements

Scale Free Networks.

Algorithmic and Economic Aspects of Networks Nicole Immorlica.

Emergence of Scaling in Random Networks Albert-Laszlo Barabsi & Reka Albert.

Analysis and Modeling of Social Networks Foudalis Ilias.

Week 5 - Models of Complex Networks I Dr. Anthony Bonato Ryerson University AM8002 Fall 2014.

Lecture 21 Network evolution Slides are modified from Jurij Leskovec, Jon Kleinberg and Christos Faloutsos.

VL Netzwerke, WS 2007/08 Edda Klipp 1 Max Planck Institute Molecular Genetics Humboldt University Berlin Theoretical Biophysics Networks in Metabolism.

Information Networks Generative processes for Power Laws and Scale-Free networks Lecture 4.

SILVIO LATTANZI, D. SIVAKUMAR Affiliation Networks Presented By: Aditi Bhatnagar Under the guidance of: Augustin Chaintreau.

Jure Leskovec, CMU Kevin Lang, Anirban Dasgupta, Michael Mahoney Yahoo! Research.

Advanced Topics in Data Mining Special focus: Social Networks.

4. PREFERENTIAL ATTACHMENT The rich gets richer. Empirical evidences Many large networks are scale free The degree distribution has a power-law behavior.

Hierarchy in networks Peter Náther, Mária Markošová, Boris Rudolf Vyjde : Physica A, dec

1 Evolution of Networks Notes from Lectures of J.Mendes CNR, Pisa, Italy, December 2007 Eva Jaho Advanced Networking Research Group National and Kapodistrian.

1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira.

CS728 Lecture 5 Generative Graph Models and the Web.

Scale Free Networks Robin Coope April Abert-László Barabási, Linked (Perseus, Cambridge, 2002). Réka Albert and AL Barabási,Statistical Mechanics.

Peer-to-Peer and Grid Computing Exercise Session 3 (TUD Student Use Only) ‏

CS Lecture 6 Generative Graph Models Part II.

Advanced Topics in Data Mining Special focus: Social Networks.

Analysis of Social Information Networks Thursday January 27 th, Lecture 3: Popularity-Power law 1.

CSE 522 – Algorithmic and Economic Aspects of the Internet Instructors: Nicole Immorlica Mohammad Mahdian.

1 Algorithms for Large Data Sets Ziv Bar-Yossef Lecture 7 May 14, 2006

Network analysis and applications Sushmita Roy BMI/CS 576 Dec 2 nd, 2014.

Summary from Previous Lecture Real networks: –AS-level N= 12709, M=27384 (Jan 02 data) route-views.oregon-ix.net, hhtp://abroude.ripe.net/ris/rawdata –

Information Networks Power Laws and Network Models Lecture 3.

(Social) Networks Analysis III Prof. Dr. Daning Hu Department of Informatics University of Zurich Oct 16th, 2012.

Topic 13 Network Models Credits: C. Faloutsos and J. Leskovec Tutorial

Author: M.E.J. Newman Presenter: Guoliang Liu Date:5/4/2012.

Models and Algorithms for Complex Networks Power laws and generative processes.

COLOR TEST COLOR TEST. Social Networks: Structure and Impact N ICOLE I MMORLICA, N ORTHWESTERN U.

School of Information Sciences University of Pittsburgh TELCOM2125: Network Science and Analysis Konstantinos Pelechrinis Spring 2013 Figures are taken.

Class 9: Barabasi-Albert Model-Part I

Lecture 10: Network models CS 765: Complex Networks Slides are modified from Networks: Theory and Application by Lada Adamic.

How Do “Real” Networks Look?

Performance Evaluation Lecture 1: Complex Networks Giovanni Neglia INRIA – EPI Maestro 10 December 2012.

Chapter 20 Statistical Considerations Lecture Slides The McGraw-Hill Companies © 2012.

Cmpe 588- Modeling of Internet Emergence of Scale-Free Network with Chaotic Units Pulin Gong, Cees van Leeuwen by Oya Ünlü Instructor: Haluk Bingöl.

Network (graph) Models

Lecture 23: Structure of Networks

The simple linear regression model and parameter estimation

Random Walk for Similarity Testing in Complex Networks

Oliver Schulte Machine Learning 726

Lecture 1: Complex Networks

Topics In Social Computing (67810)

How Do “Real” Networks Look?

Lecture 23: Structure of Networks

Random Graph Models of large networks

How Do “Real” Networks Look?

How Do “Real” Networks Look?

Social Network Analysis

Models of Network Formation

Lecture 13 Network evolution

Models of Network Formation

Peer-to-Peer and Social Networks Fall 2017

Models of Network Formation

How Do “Real” Networks Look?

Models of Network Formation

Peer-to-Peer and Social Networks

Lecture 23: Structure of Networks

Lecture 21 Network evolution

Modelling and Searching Networks Lecture 6 – PA models

Network Science: A Short Introduction i3 Workshop

Shan Lu, Jieqi Kang, Weibo Gong, Don Towsley UMASS Amherst

Network Models Michael Goodrich Some slides adapted from:

Advanced Topics in Data Mining Special focus: Social Networks

Discrete Mathematics and its Applications Lecture 6 – PA models

Advanced Topics in Data Mining Special focus: Social Networks

What did we see in the last lecture?

Presentation transcript:

Network Formation Processes: Power-law degree distributions and Preferential Attachment CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu

Network Formation Processes What do we observe that needs explaining Small-world model? Diameter Clustering coefficient Preferential Attachment: Node degree distribution What fraction of all nodes have degree k (as a function of k)? Prediction from simple random graph models: 𝑃(𝑘)= exponential function of –k Observation: Power-law: 𝑃(𝑘)= 𝑘 −𝛼 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Degree Distributions 𝑷 𝒌 ∝ 𝒌 −𝜶 Found in data Expected based on Gnp 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Node Degrees in Networks [Leskovec et al. KDD ‘08] Node Degrees in Networks Take a network, plot a histogram of P(k) vs. k Plot: fraction of nodes with degree k: 𝑝(𝑘)= | 𝑢| 𝑑 𝑢 =𝑘 | 𝑁 Probability: P(k) = P(X=k) Flickr social network n= 584,207, m=3,555,115 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Node Degrees in Networks [Leskovec et al. KDD ‘08] Node Degrees in Networks Plot the same data on log-log axis: 𝑃 𝑘 ∝ 𝑘 −1.75 How to distinguish: 𝑃(𝑘)∝exp⁡(𝑘) vs. 𝑃(𝑘)∝ 𝑘 −𝛼 ? Take logarithms: if 𝑦=𝑓(𝑥)= 𝑒 −𝑥 then log 𝑦 =−𝑥 If 𝑦= 𝑥 −𝛼 then log 𝑦 =−𝛼 log⁡(𝑥) So, on log-log axis power-law looks like a straight line of slope −𝛼 Slope = −𝛼=1.75 Probability: P(k) = P(X=k) Flickr social network n= 584,207, m=3,555,115 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Node Degrees: Faloutsos3 Internet Autonomous Systems [Faloutsos, Faloutsos and Faloutsos, 1999] Internet domain topology 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Node Degrees: Web The World Wide Web [Broder et al., 2000] 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Node Degrees: Barabasi&Albert Other Networks [Barabasi-Albert, 1999] Actor collaborations Web graph Power-grid 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Exponential vs. Power-Law 20 40 60 80 100 0.2 0.6 1 f(x) x Above a certain x value, the power law is always higher than the exponential. 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Exponential vs. Power-Law [Clauset-Shalizi-Newman 2007] Exponential vs. Power-Law Power-law vs. exponential on log-log and log-lin scales 10 1 2 3 -4 -3 -2 -1 log-log semi-log x … logarithmic y … logarithmic x … linear y … logarithmic 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Exponential vs. Power-Law 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Power-Law Degree Exponents Power-law degree exponent is typically 2 <  < 3 Web graph: in = 2.1, out = 2.4 [Broder et al. 00] Autonomous systems:  = 2.4 [Faloutsos3, 99] Actor-collaborations:  = 2.3 [Barabasi-Albert 00] Citations to papers:   3 [Redner 98] Online social networks:   2 [Leskovec et al. 07] 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Scale-Free Networks Definition: Networks with a power law tail in their degree distribution are called “scale-free networks” Where does the name come from? Scale invariance: there is no characteristic scale Scale-free function: 𝒇 𝒂𝒙 = 𝒂 𝝀 𝒇(𝒙) Power-law function: 𝑓 𝑎𝑥 = 𝑎 𝜆 𝑥 𝜆 = 𝑎 𝜆 𝑓(𝑥) 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Power-Laws are Everywhere [Clauset-Shalizi-Newman 2007] Power-Laws are Everywhere Many other quantities follow heavy-tailed distributions 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Anatomy of the Long Tail [Chris Anderson, Wired, 2004] Anatomy of the Long Tail Skip! 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Not Everyone Likes Power-Laws  CMU grad-students at the G20 meeting in Pittsburgh in Sept 2009 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Mathematics of Power-Laws

Heavy Tailed Distributions Skip! Degrees are heavily skewed: Distribution P(X>x) is heavy tailed if: 𝐥𝐢𝐦 𝒙→∞ 𝑷 𝑿>𝒙 𝒆 −𝝀𝒙 =∞ Note: Normal PDF: 𝑓 𝑥 = 1 2𝜋𝜎 𝑒 𝑥−𝜇 2 2 𝜎 2 Exponential PDF: 𝑓 𝑥 =𝜆 𝑒 −𝜆𝑥 then 𝑃 𝑋>𝑥 =1−𝑃(𝑋≤𝑥)= 𝑒 −𝜆𝑥 are not heavy tailed! 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Heavy Tails Various names, kinds and forms: [Clauset-Shalizi-Newman 2007] Heavy Tails Skip! Various names, kinds and forms: Long tail, Heavy tail, Zipf’s law, Pareto’s law Heavy tailed distributions: P(x) is proportional to: 𝑃 𝑥 ∝ 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Mathematics of Power-laws [Clauset-Shalizi-Newman 2007] Mathematics of Power-laws What is the normalizing constant? P(x) = z x- z=? 𝑃(𝑥) is a distribution: 𝑃 𝑥 𝑑𝑥 =1 Continuous approximation 1= 𝑥 𝑚𝑖𝑛 ∞ 𝑃 𝑥 𝑑𝑥 =𝑧 𝑥 𝑚 ∞ 𝑥 −𝛼 𝑑𝑥 =− 𝑧 𝛼−1 𝑥 −𝛼+1 𝑥 𝑚 ∞ =− 𝑧 𝛼−1 ∞ 1−𝛼 − 𝑥 𝑚 1−𝛼 𝑧= 𝛼−1 𝑥 𝑚 𝛼−1 xm P(x) diverges as x0 so xm is the minimum value of the power-law distribution x  [xm, ∞] 𝑝 𝑥 = 𝛼−1 𝑥 𝑚 𝑥 𝑥 𝑚 −𝛼 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Mathematics of Power-laws [Clauset-Shalizi-Newman 2007] Mathematics of Power-laws What’s the expectation of a power-law random variable x? 𝐸 𝑥 = 𝑥 𝑚 ∞ 𝑥 𝑃 𝑥 𝑑𝑥 =𝑧 𝑥 𝑚 ∞ 𝑥 −𝛼+1 𝑑𝑥 =− 𝑧 2−𝛼 𝑥 2−𝛼 𝑥 𝑚 ∞ =− 𝛼−1 𝑥 𝑚 𝛼−1 2−𝛼 [ ∞ 2−𝛼 − 𝑥 𝑚 2−𝛼 ] 𝐸 𝑥 = 𝛼−1 𝛼−2 𝑥 𝑚 Need:  > 2 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Mathematics of Power-Laws ADD: How to sample from a power law? 𝐸 𝑥 = 𝛼−1 𝛼−2 𝑥 𝑚 Power-laws: Infinite moments! If α ≤ 2 : E[x]= ∞ If α ≤ 3 : Var[x]=∞ Average is meaningless, as the variance is too high! Sample average of n samples from a power-law with exponent α: In real networks 2 <  < 3 so: E[x] = const Var[x]=∞ 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Estimating Power-Law Exponent  Estimating  from data: Fit a line on log-log axis using least squares method: Solve min 𝛼 log 𝑦 −𝛼 log 𝑥 2 BAD! 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Estimating Power-Law Exponent  Estimating  from data: Plot Complementary CDF 𝑃 𝑋>𝑥 . Then 𝛼=1+𝛼′ where 𝛼 is the slope of 𝑃(𝑋>𝑥). If 𝐏 𝐗=𝐱 ∝ 𝒙 −𝜶 then 𝐏 𝐗>𝒙 ∝ 𝒙 −(𝜶−𝟏) 𝑃 𝑋>𝑥 = 𝑗=𝑥 ∞ 𝑃(𝑗) ≈ = 𝑥 ∞ 𝑧 𝑗 −𝛼 𝑑𝑗 = = 𝑧 𝛼 𝑗 1−𝛼 𝑥 ∞ = 𝑧 𝛼 𝑥 − 𝛼−1 OK 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Estimating Power-Law Exponent  Estimating  from data: Use MLE: 𝛼 =1+𝑛 𝑖 𝑛 ln 𝑑 𝑖 𝑥 𝑚 −1 𝐿 𝛼 = ln 𝑖 𝑛 𝑝 𝑑 𝑖 = 𝑖 𝑛 ln⁡𝑝( 𝑑 𝑖 ) = 𝑖 𝑛 ln⁡(𝛼−1) − ln 𝑥 𝑚 −𝛼 ln 𝑑 𝑖 𝑥 𝑚 Want to find 𝜶 that max 𝐿(𝜶): Set dL 𝛼 d𝛼 =0 dL 𝛼 d𝛼 =0 𝑛 𝛼−1 −∑ ln 𝑑 𝑖 𝑥 𝑚 =0 𝛼 =1+𝑛 𝑖 𝑛 ln 𝑑 𝑖 𝑥 𝑚 −1 OK Power-law density: 𝑝 𝑥 = 𝛼−1 𝑥 𝑚 𝑥 𝑥 𝑚 −𝛼 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Flickr: Fitting Degree Exponent Show examples of the fitting – not the nice line but the fitted line Linear scale Log scale, α=1.75 CCDF, Log scale, α=1.75, exp. cutoff CCDF, Log scale, α=1.75 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Maximum Degree Skip! What is the expected maximum degree K in a scale-free network? The expected number of nodes with degree > K should be less than 1: 𝐾 ∞ 𝑃 𝑥 𝑑𝑥 ≈ 1 𝑛 =𝑧 𝐾 ∞ 𝑥 −𝛼 𝑑𝑥= 𝑧 1−𝛼 𝑥 1−𝛼 𝐾 ∞ = = 𝛼−1 𝑥 𝑚 𝛼−1 −𝛼+1 0− 𝐾 1−𝛼 = 𝑥 𝑚 𝛼−1 𝐾 𝛼−1 ≈ 1 𝑛 𝐾= 𝑥 𝑚 𝑁 1 𝛼−1 Power-law density: 𝑝 𝑥 = 𝛼−1 𝑥 𝑚 𝑥 𝑥 𝑚 −𝛼 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Maximum Degree: Consequence Skip! Why don’t we see networks with exponents in the range of 𝜶=𝟒,𝟓,𝟔 ? In order to reliably estimate 𝛼, we need 2-3 orders of magnitude of K. That is, 𝐾≈ 10 3 E.g., to measure an degree exponent 𝛼=5,we need to maximum degree of the order of: 𝐾= 𝑥 𝑚 𝑁 1 𝛼−1 𝑁= 𝐾 𝑥 𝑚 𝛼−1 ≈ 10 12 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Why are Power-Laws Surprising Can not arise from sums of independent events Recall: in 𝐺 𝑛𝑝 each pair of nodes in connected independently with prob. 𝑝 𝑋… degree of node 𝑣, 𝑋 𝑤 … event that w links to v 𝑋= 𝑤 𝑋 𝑤 𝐸 𝑋 = 𝑤 𝐸 𝑋 𝑤 = 𝑛−1 𝑝 Now, what is 𝑷 𝑿=𝒌 ? Central limit theorem! 𝑋,𝑋,…, 𝑋 𝑛 : rnd. vars with mean , variance 2 𝑆 𝑛 =∑ 𝑋 𝑖 : 𝐸 𝑆 𝑛 =𝑛𝜇 , var 𝑆 𝑛 =𝑛 𝜎 2 , SD 𝑆 𝑛 =𝜎 𝑛 𝑃 𝑆 𝑛 =𝐸 𝑆 𝑛 +𝑥∙SD 𝑆 𝑛 ~ 1 2𝜋 e − x 2 2 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Random vs. Scale-free network Skip! Random network Scale-free (power-law) network (Erdos-Renyi random graph) Degree distribution is Power-law Degree distribution is Binomial 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Model: Preferential Attachment

Model: Preferential attachment Preferential attachment [Price ‘65, Albert-Barabasi ’99, Mitzenmacher ‘03] Nodes arrive in order 1,2,…,n At step j, let di be the degree of node i < j A new node j arrives and creates m out-links Prob. of j linking to a previous node i is proportional to degree di of node i 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Rich Get Richer New nodes are more likely to link to nodes that already have high degree Herbert Simon’s result: Power-laws arise from “Rich get richer” (cumulative advantage) Examples [Price 65]: Citations: New citations to a paper are proportional to the number it already has 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

The Exact Model We will analyze the following model: [Mitzenmacher, ‘03] The Exact Model Make everything as a directed graph. We only care about in-degree as every node has out-degre 1. Node j We will analyze the following model: Nodes arrive in order 1,2,3,…,n When node j is created it makes a single link to an earlier node i chosen: 1)With prob. p, j links to i chosen uniformly at random (from among all earlier nodes) 2) With prob. 1-p, node j chooses node i uniformly at random and links to a node i points to. Note this is same as saying: With prob. 1-p, node j links to node u with prob. proportional to du (the degree of u) 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

The Model Givens Power-Laws Claim: The described model generates networks where the fraction of nodes with degree k scales as: where q=1-p 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Continuous Approximation Consider deterministic and continuous approximation to the degree of node i as a function of time t t is the number of nodes that have arrived so far Degree di(t) of node i (i=1,2,…,n) is a continuous quantity and it grows deterministically as a function of time t Plan: Analyze di(t) – continuous degree of node i at time t  i 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Continuous Degree: What We Know Initial condition: di(t)=0, when t=i (node i just arrived) Expected change of di(t) over time: Node i gains an in-link at step t+1 only if a link from a newly created node t+1 points to it. What’s the probability of this event? With prob. p node t+1 links randomly: Links to our node i with prob. 1/t With prob. 1-p node t+1 links preferentially: Links to our node i with prob. di(t)/t So: Prob. node t+1 links to i is: 𝐩 𝟏 𝒕 + 𝟏−𝒑 𝒅 𝒊 (𝒕) 𝒕 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Continuous Degree At t=5 node i=5 comes and has total degree of 1 to deterministically share with other nodes: How does di(t) change as t∞? i di(t-1)+1 di(t)+1 1 =1+𝑝 1 4 + 1−𝑝 1 6 2 3 =3+𝑝 1 4 + 1−𝑝 3 6 4 i=5 Node i=5 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

END!!!!!!!!!!!!!!!!!!!!! Too long, cut at the beginning where discussing importance of power-laws and why they are surprising. Cover also how to sample form a power-law distribution – derive why is that the case This lecture should finish with the proof of PA giving power-laws. 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

What is the rate of growth of di? d 𝑑 𝑖 (𝑡) d𝑡 =𝑝 1 𝑡 + 1−𝑝 𝑑 𝑖 (𝑡) 𝑡 = 𝑝+𝑞 𝑑 𝑖 (𝑡) 𝑡 1 𝑝+𝑞 𝑑 𝑖 (𝑡) d 𝑑 𝑖 (𝑡)= 1 𝑡 d𝑡 1 𝑝+𝑞 𝑑 𝑖 (𝑡) d 𝑑 𝑖 (𝑡) = 1 𝑡 d𝑡 1 𝑞 ln 𝑝+𝑞 𝑑 𝑖 𝑡 = ln 𝑡 +𝑐 𝑞 𝑑 𝑖 𝑡 +𝑝=𝐴 𝑡 𝑞 𝑑 𝑖 𝑡 = 1 𝑞 𝐴 𝑡 𝑞 −𝑝 Divide by p+q di(t) integrate Let A=ec and exponentiate 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

What is the constant A? What is the constant A? We know: 𝑑 𝑖 𝑖 =0 𝑑 𝑖 𝑡 = 1 𝑞 𝐴 𝑡 𝑞 −𝑝 What is the constant A? We know: 𝑑 𝑖 𝑖 =0 So: 𝑑 𝑖 𝑖 = 1 𝑞 𝐴 𝑖 𝑞 −𝑝 =0 𝐴= 𝑝 𝑖 𝑞 𝑑 𝑖 𝑡 = 𝑝 𝑞 𝑡 𝑖 𝑞 −1 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Degree Distribution What is F(d) the fraction of nodes that has degree at least d at time t? How many nodes i have degree > t? 𝑑 𝑖 𝑡 = 𝑝 𝑞 𝑡 𝑖 𝑞 −1 >𝑑 then: i<t 𝑞 𝑝 𝑑−1 − 1 𝑞 There are t nodes total at time t so F(d): 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Degree Distribution What is the fraction of nodes with degree exactly d? Take derivative of F(d): 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Preferential attachment: Reflections Two changes from the Gnp The network grows Preferential attachment Do we need both? Yes! Add growth to Gnp (assume p=1): xj = degree of node j at the end Xj(u)= 1 if u links to j, else 0 xj = xj(j+1)+xj(j+2)+…+xj(n) E[xj(u)] = P[u links to j]= 1/(u-1) E[xj] =  1/(u-1) = 1/j + 1/(j+1)+…+1/(n-1) = Hn-1 – Hj E[xj] = log(n-1) – log(j) = log((n-1)/j) NOT (n/j) Hn…n-th harmonic number: 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Preferential attachment: Good news Preferential attachment gives power-law degrees Intuitively reasonable process Can tune p to get the observed exponent On the web, P[node has degree d] ~ d-2.1 2.1 = 1+1/(1-p)  p ~ 0.1 There are also other network formation mechanisms that generate scale-free networks: Random surfer model Forest Fire model 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

PA-like Link Formation Skip! Copying mechanism (directed network) select a node and an edge of this node attach to the endpoint of this edge Walking on a network (directed network) the new node connects to a node, then to every first, second, … neighbor of this node Attaching to edges select an edge attach to both endpoints of this edge Node duplication duplicate a node with all its edges randomly prune edges of new node 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Preferential attachment: Bad news Give a picture of age correlation! Preferential attachment is not so good at predicting network structure Age-degree correlation Links among high degree nodes On the web nodes sometime avoid linking to each other Further questions: What is a reasonable probabilistic model for how people sample through web-pages and link to them? Short+Random walks Effect of search engines – reaching pages based on number of links to them 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

PA: Many Extensions & Variations Skip! Preferential attachment is a key ingredient Extensions: Early nodes have advantage: node fitness Geometric preferential attachment Copying model: Picking a node proportional to the degree is same as picking an edge at random (pick node and then it’s neighbor) 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Network resilience (1) We observe how the connectivity (length of the paths) of the network changes as the vertices get removed [Albert et al. 00; Palmer et al. 01] Vertices can be removed: Uniformly at random In order of decreasing degree It is important for epidemiology Removal of vertices corresponds to vaccination 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Internet (Autonomous systems) Network Resilience (2) Real-world networks are resilient to random attacks One has to remove all web-pages of degree > 5 to disconnect the web But this is a very small percentage of web pages Random network has better resilience to targeted attacks Preferential removal Internet (Autonomous systems) Random network Mean path length Random removal Fraction of removed nodes Fraction of removed nodes 11/11/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu