Carnegie Mellon Finding patterns in large, real networks Christos Faloutsos CMU www.cs.cmu.edu/~christos/TALKS/UCLA-05.

Slides:



Advertisements
Similar presentations
1 Dynamics of Real-world Networks Jure Leskovec Machine Learning Department Carnegie Mellon University
Advertisements

Jurij Leskovec, CMU Jon Kleinberg, Cornell Christos Faloutsos, CMU
1 Realistic Graph Generation and Evolution Using Kronecker Multiplication Jurij Leskovec, CMU Deepay Chakrabarti, CMU/Yahoo Jon Kleinberg, Cornell Christos.
Analysis and Modeling of Social Networks Foudalis Ilias.
The Connectivity and Fault-Tolerance of the Internet Topology
School of Computer Science Carnegie Mellon Sensor and Graph Mining Christos Faloutsos Carnegie Mellon University & IBM
Lecture 21 Network evolution Slides are modified from Jurij Leskovec, Jon Kleinberg and Christos Faloutsos.
Kronecker Graphs: An Approach to Modeling Networks Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg, Christos Faloutsos, Zoubin Ghahramani Presented.
On Power-Law Relationships of the Internet Topology Michalis Faloutsos Petros Faloutsos Christos Faloutsos.
CS728 Lecture 5 Generative Graph Models and the Web.
CMU SCS Copyright: C. Faloutsos (2014)# : Multimedia Databases and Data Mining Lecture #29: Graph mining - Generators & tools Christos Faloutsos.
Modeling Real Graphs using Kronecker Multiplication
The structure of the Internet. How are routers connected? Why should we care? –While communication protocols will work correctly on ANY topology –….they.
NetMine: Mining Tools for Large Graphs Deepayan Chakrabarti Yiping Zhan Daniel Blandford Christos Faloutsos Guy Blelloch.
Weighted Graphs and Disconnected Components Patterns and a Generator Mary McGlohon, Leman Akoglu, Christos Faloutsos Carnegie Mellon University School.
Social Networks and Graph Mining Christos Faloutsos CMU - MLD.
1 Epidemic Spreading in Real Networks: an Eigenvalue Viewpoint Yang Wang Deepayan Chakrabarti Chenxi Wang Christos Faloutsos.
School of Computer Science Carnegie Mellon Data Mining using Fractals and Power laws Christos Faloutsos Carnegie Mellon University.
1 Complex systems Made of many non-identical elements connected by diverse interactions. NETWORK New York Times Slides: thanks to A-L Barabasi.
CMU SCS Large Graph Mining Christos Faloutsos CMU.
Network Design IS250 Spring 2010 John Chuang. 2 Questions  What does the Internet look like? -Why do we care?  Are there any structural invariants?
CS 728 Lecture 4 It’s a Small World on the Web. Small World Networks It is a ‘small world’ after all –Billions of people on Earth, yet every pair separated.
Web as Graph – Empirical Studies The Structure and Dynamics of Networks.
CS Lecture 6 Generative Graph Models Part II.
Sampling from Large Graphs. Motivation Our purpose is to analyze and model social networks –An online social network graph is composed of millions of.
On Power-Law Relationships of the Internet Topology CSCI 780, Fall 2005.
Analysis of the Internet Topology Michalis Faloutsos, U.C. Riverside (PI) Christos Faloutsos, CMU (sub- contract, co-PI) DARPA NMS, no
CMU SCS Bio-informatics, Graph and Stream mining Christos Faloutsos CMU.
CMU SCS Graph and stream mining Christos Faloutsos CMU.
The structure of the Internet. How are routers connected? Why should we care? –While communication protocols will work correctly on ANY topology –….they.
1 Algorithms for Large Data Sets Ziv Bar-Yossef Lecture 7 May 14, 2006
CMU SCS Multimedia and Graph mining Christos Faloutsos CMU.
Summary from Previous Lecture Real networks: –AS-level N= 12709, M=27384 (Jan 02 data) route-views.oregon-ix.net, hhtp://abroude.ripe.net/ris/rawdata –
Data Mining using Fractals and Power laws
CMU SCS Data Mining in Streams and Graphs Christos Faloutsos CMU.
CMU SCS : Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.
On Power-Law Relationships of the Internet Topology.
School of Computer Science Carnegie Mellon UIUC 04C. Faloutsos1 Advanced Data Mining Tools: Fractals and Power Laws for Graphs, Streams and Traditional.
Topic 13 Network Models Credits: C. Faloutsos and J. Leskovec Tutorial
Weighted Graphs and Disconnected Components Patterns and a Generator IDB Lab 현근수 In KDD 08. Mary McGlohon, Leman Akoglu, Christos Faloutsos.
Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University.
CMU SCS Large Graph Mining Christos Faloutsos CMU.
Jure Leskovec Computer Science Department Cornell University / Stanford University Joint work with: Jon Kleinberg (Cornell), Christos.
School of Computer Science Carnegie Mellon Data Mining using Fractals (fractals for fun and profit) Christos Faloutsos Carnegie Mellon University.
School of Computer Science Carnegie Mellon Data Mining using Fractals and Power laws Christos Faloutsos Carnegie Mellon University.
On-line Social Networks - Anthony Bonato 1 Dynamic Models of On-Line Social Networks Anthony Bonato Ryerson University WAW’2009 February 13, 2009 nt.
Lecture 10: Network models CS 765: Complex Networks Slides are modified from Networks: Theory and Application by Lada Adamic.
CMU SCS Finding patterns in large, real networks Christos Faloutsos CMU.
R-MAT: A Recursive Model for Graph Mining Deepayan Chakrabarti Yiping Zhan Christos Faloutsos.
RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Leman Akoglu, Mary McGlohon, Christos Faloutsos Carnegie Mellon University School.
School of Computer Science Carnegie Mellon Data Mining using Fractals and Power laws Christos Faloutsos Carnegie Mellon University.
Models of Web-Like Graphs: Integrated Approach
CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-1 Large Graph Mining: Power Tools and a Practitioner’s guide Christos Faloutsos Gary Miller Charalampos.
Carnegie Mellon Data Mining – Research Directions C. Faloutsos CMU
1 HEINZ NIXDORF INSTITUTE University of Paderborn Algorithms and Complexity Christian Schindelhauer Search Algorithms Winter Semester 2004/ Dec.
Carnegie Mellon IIT Bombay KDD04© 2004, S. Chakrabarti and C. Faloutsos1 Graph structures in data mining Soumen Chakrabarti (IIT-Bombay) Christos Faloutsos.
Graph Models Class Algorithmic Methods of Data Mining
Modeling networks using Kronecker multiplication
NetMine: Mining Tools for Large Graphs
Finding patterns in large, real networks
15-826: Multimedia Databases and Data Mining
Part 1: Graph Mining – patterns
Lecture 13 Network evolution
15-826: Multimedia Databases and Data Mining
Graph and Tensor Mining for fun and profit
Large Graph Mining: Power Tools and a Practitioner’s guide
Data Mining using Fractals and Power laws
Lecture 21 Network evolution
Modelling and Searching Networks Lecture 2 – Complex Networks
Presentation transcript:

Carnegie Mellon Finding patterns in large, real networks Christos Faloutsos CMU

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos2 Thanks to Deepayan Chakrabarti (CMU) Michalis Faloutsos (UCR) George Siganos (UCR)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos3 Introduction Internet Map [lumeta.com] Food Web [Martinez ’91] Protein Interactions [genomebiology.com] Friendship Network [Moody ’01] Graphs are everywhere!

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos4 Physical graphs Physical networks Physical Internet Telephone lines Commodity distribution networks

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos5 Networks derived from "behavior" Telephone call patterns , Blogs, Web, Databases, XML Language processing Web of trust, epinions.com

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos6 Outline Topology, ‘ laws ’ and generators ‘ Laws ’ and patterns Generators Tools

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos7 Motivating questions What do real graphs look like? –What properties of nodes, edges are important to model? –What local and global properties are important to measure? How to generate realistic graphs?

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos8 Why should we care? A1: extrapolations: how will the Internet/Web look like next year? A2: algorithm design: what is a realistic network topology, –to try a new routing protocol? –to study virus/rumor propagation, and immunization?

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos9 Why should we care? (cont’d) A3: Sampling: How to get a ‘good’ sample of a network? A4: Abnormalities: is this sub-graph / sub- community / sub-network ‘normal’? (what is normal?)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos10 Virus propagation Who is the best person/computer to immunize against a virus?

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos11 Outline Topology, ‘ laws ’ and generators ‘ Laws ’ and patterns Generators Tools

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos12 Topology How does the Internet look like? Any rules? (Looks random – right?)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos13 Laws and patterns Real graphs are NOT random!! Diameter in- and out- degree distributions other (surprising) patterns

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos14 Laws – degree distributions Q: avg degree is ~2 - what is the most probable degree? degree count ?? 2

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos15 Laws – degree distributions Q: avg degree is ~3 - what is the most probable degree? degree count ?? 2 count 2

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos16 I.Power-law: outdegree O The plot is linear in log-log scale [FFF’99] freq = degree (-2.15) O = Exponent = slope Outdegree Frequency Nov’

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos17 II.Power-law: rank R The plot is a line in log-log scale Exponent = slope R = R outdegree Rank: nodes in decreasing outdegree order Dec’98

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos18 III. Eigenvalues Let A be the adjacency matrix of graph and v is an eigenvalue/eigenvector pair if: A v = v Eigenvalues are strongly related to graph topology A BC D

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos19 III.Power-law: eigen E Eigenvalues in decreasing order (first 20) [Mihail+, 02]: R = 2 * E E = Exponent = slope Eigenvalue Rank of decreasing eigenvalue Dec’98

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos20 IV. The Node Neighborhood N(h) = # of pairs of nodes within h hops

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos21 IV. The Node Neighborhood Q: average degree = 3 - how many neighbors should I expect within 1,2,… h hops? Potential answer: 1 hop -> 3 neighbors 2 hops -> 3 * 3 … h hops -> 3 h

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos22 IV. The Node Neighborhood Q: average degree = 3 - how many neighbors should I expect within 1,2,… h hops? Potential answer: 1 hop -> 3 neighbors 2 hops -> 3 * 3 … h hops -> 3 h WE HAVE DUPLICATES!

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos23 IV. The Node Neighborhood Q: average degree = 3 - how many neighbors should I expect within 1,2,… h hops? Potential answer: 1 hop -> 3 neighbors 2 hops -> 3 * 3 … h hops -> 3 h ‘avg’ degree: meaningless!

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos24 IV. Power-law: hopplot H Pairs of nodes as a function of hops N(h)= h H H = 4.86 Dec 98 Hops # of Pairs Hops Router level ’95 H = 2.83

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos25 Observation Q: Intuition behind ‘hop exponent’? A: ‘intrinsic=fractal dimensionality’ of the network N(h) ~ h 1 N(h) ~ h 2...

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos26 Hop plots More on fractal/intrinsic dimensionalities: very soon

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos27 But: Q1: How about graphs from other domains? Q2: How about temporal evolution?

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos28 The Peer-to-Peer Topology Frequency versus degree Number of adjacent peers follows a power-law [Jovanovic+]

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos29 More Power laws Also hold for other web graphs [Barabasi+, ‘99], [Kumar+, ‘99] citation graphs (see later) and many more

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos30 Time Evolution: rank R The rank exponent has not changed! [Siganos+, ‘03] Domain level #days since Nov. ‘97

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos31 Outline Part 1: Topology, ‘ laws ’ and generators ‘ Laws ’ and patterns Power laws for degree, eigenvalues, hop-plot ??? Generators Tools Part 2: PageRank, HITS and eigenvalues

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos32 Any other ‘laws’? Yes!

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos33 Any other ‘laws’? Yes! Small diameter –six degrees of separation / ‘Kevin Bacon’ –small worlds [Watts and Strogatz]

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos34 Any other ‘laws’? Bow-tie, for the web [Kumar+ ‘99] IN, SCC, OUT, ‘tendrils’ disconnected components

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos35 Any other ‘laws’? power-laws in communities (bi-partite cores) [Kumar+, ‘99] 2:3 core (m:n core) Log(m) Log(count) n:1 n:2 n:3

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos36 Any other ‘laws’? “Jellyfish” for Internet [Tauro+ ’01] core: ~clique ~5 concentric layers many 1-degree nodes

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos37 How do graphs evolve? degree-exponent seems constant - anything else?

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos38 Evolution of diameter? Prior analysis, on power-law-like graphs, hints that diameter ~ O(log(N)) or diameter ~ O( log(log(N))) i.e.., slowly increasing with network size Q: What is happening, in reality?

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos39 Evolution of diameter? Prior analysis, on power-law-like graphs, hints that diameter ~ O(log(N)) or diameter ~ O( log(log(N))) i.e.., slowly increasing with network size Q: What is happening, in reality? A: It shrinks(!!), towards a constant value

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos40 Shrinking diameter ArXiv physics papers and their citations [Leskovec+05a]

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos41 Shrinking diameter ArXiv: who co-authored with whom

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos42 Shrinking diameter U.S. patents citing each other

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos43 Shrinking diameter Autonomous systems

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos44 Temporal evolution of graphs N(t) nodes; E(t) edges at time t suppose that N(t+1) = 2 * N(t) Q: what is your guess for E(t+1) =?...* E(t)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos45 Temporal evolution of graphs N(t) nodes; E(t) edges at time t suppose that N(t+1) = 2 * N(t) Q: what is your guess for E(t+1) =?...* E(t) A: over-doubled!

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos46 Temporal evolution of graphs A: over-doubled - but obeying: E(t) ~ N(t) a for all t where 1<a<2 a=1: constant avg degree a=2: ~full clique Real graphs densify over time [Leskovec+05a]

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos47 Temporal evolution of graphs A: over-doubled - but obeying: E(t) ~ N(t) a for all t Identically: log(E(t)) / log(N(t)) = constant for all t

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos48 Densification Power Law ArXiv: Physics papers and their citations 1.69 N(t) E(t)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos49 Densification Power Law U.S. Patents, citing each other 1.66 N(t) E(t)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos50 Densification Power Law Autonomous Systems 1.18 N(t) E(t)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos51 Densification Power Law ArXiv: who co-authored with whom 1.15 N(t) E(t)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos52 Summary of ‘laws’ Power laws for degree distributions …………... for eigenvalues, bi-partite cores Small & shrinking diameter (‘6 degrees’) ‘Bow-tie’ for web; ‘jelly-fish’ for internet ``Densification Power Law’’, over time

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos53 Outline Part 1: Topology, ‘ laws ’ and generators ‘ Laws ’ and patterns Generators Tools

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos54 Generators How to generate random, realistic graphs? –Erdos-Renyi model: beautiful, but unrealistic –process-based generators –recursive generators

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos55 Erdos-Renyi random graph – 100 nodes, avg degree = 2 Fascinating properties (phase transition) But: unrealistic (Poisson degree distribution != power law)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos56 Process-based Barabasi; Barabasi-Albert: Preferential attachment -> power-law tails! –‘rich get richer’ [Kumar+]: preferential attachment + mimic –Create ‘communities’

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos57 Process-based (cont’d) [Fabrikant+, ‘02]: H.O.T.: connect to closest, high connectivity neighbor [Pennock+, ‘02]: Winner does NOT take all... and many more

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos58 Recursive generators - intuition recursion self-similarity power laws (see details later) Recursion -> communities within communities within communities

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos59 Wish list for a generator: Power-law-tail in- and out-degrees Power-law-tail scree plots shrinking/constant diameter Densification Power Law communities-within-communities Q: how to achieve all of them?

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos60 Wish list for a generator: Power-law-tail in- and out-degrees Power-law-tail scree plots shrinking/constant diameter Densification Power Law communities-within-communities Q: how to achieve all of them? A: Kronecker matrix product [Leskovec+05b]

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos61 Kronecker product

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos62 Kronecker product

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos63 Kronecker product N N*N N**4

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos64 Properties of Kronecker graphs: Power-law-tail in- and out-degrees Power-law-tail scree plots constant diameter perfect Densification Power Law communities-within-communities

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos65 Properties of Kronecker graphs: Power-law-tail in- and out-degrees Power-law-tail scree plots constant diameter perfect Densification Power Law communities-within-communities and we can prove all of the above (first and only generator that does that)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos66 Properties of Kronecker graphs: ‘stochastic’ version gives even better results and –Includes Erdos-Renyi as special case –Includes ‘RMAT’ as special case [Chakrabarti+,’04] (stochastic version: generate Kronecker matrix; decimate edges with some probability)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos67 Kronecker - ArXiv real (stochastic) Kronecker DegreeScreeDiameterD.P.L. (det. Kronecker)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos68 Kronecker - patents Degree Scree Diameter D.P.L.

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos69 Kronecker - A.S.

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos70 Conclusions ‘ Laws ’ and patterns: Power laws for degrees, eigenvalues, ‘ communities ’ /cores Small / Shrinking diameter Bow-tie; jelly-fish

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos71 Conclusions, cont ’ d Generators Preferential attachment (Barabasi) Variations Recursion – Kronecker product & RMAT

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos72 Outline Topology, ‘ laws ’ and generators ‘ Laws ’ and patterns Generators Tools

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos73 Outline Part 1: Topology, ‘ laws ’ and generators ‘ Laws ’ and patterns Generators Tools: power laws and fractals Why so many power laws? Self-similarity, power laws, fractal dimension

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos74 Power laws Q1: Are they only in graph-related settings? A1: Q2: Why so many? A2:

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos75 Power laws Q1: Are they only in graph-related settings? A1: NO! Q2: Why so many? A2: self-similarity; ‘rich-get-richer’

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos76 A famous power law: Zipf’s law Bible - rank vs frequency (log-log) log(rank) log(freq) “a” “the”

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos77 Power laws, cont’ed length of file transfers [Bestavros+] web hit counts [Huberman] magnitude of earthquakes (Guttenberg- Richter law) sizes of lakes/islands (Korcak’s law) Income distribution (Pareto’s law)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos78 Web Site Traffic log(freq) log(count) Zipf Click-stream data u-id’s url’s log(freq) log(count) ‘yahoo’ ‘super-surfer’

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos79 Lotka’s law (Lotka’s law of publication count); and citation counts: (citeseer.nj.nec.com 6/2001) log(#citations) log(count) J. Ullman

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos80 Power laws Q1: Are they only in graph-related settings? A1: NO! Q2: Why so many? A2: self-similarity; ‘rich-get-richer’

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos81 Fractals and power laws Power laws and fractals are closely related And fractals appear in MANY cases –coast-lines: –brain-surface: 2.6 –rain-patches: 1.3 –tree-bark: ~2.1 –stock prices / random walks: 1.5 –... [see Mandelbrot; or Schroeder]

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos82 Digression: intro to fractals Fractals: sets of points that are self similar

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos83 A famous fractal e.g., Sierpinski triangle:... zero area; infinite length! dimensionality = ??

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos84 A famous fractal e.g., Sierpinski triangle:... zero area; infinite length! dimensionality = log(3)/log(2) = 1.58

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos85 A famous fractal equivalent graph:

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos86 Intrinsic (‘fractal’) dimension How to estimate it?

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos87 Intrinsic (‘fractal’) dimension Q: fractal dimension of a line? A: nn ( <= r ) ~ r^1 (‘power law’: y=x^a) Q: fd of a plane? A: nn ( <= r ) ~ r^2 fd== slope of (log(nn) vs log(r) )

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos88 Sierpinsky triangle log( r ) log(#pairs within <=r ) 1.58 == ‘correlation integral’ = CDF of pairwise distances

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos89 Sierpinsky triangle log( r ) log(#pairs within <=r ) 1.58 == ‘correlation integral’ = CDF of pairwise distances

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos90 Line log( r ) log(#pairs within <=r ) 1.58 == ‘correlation integral’ = CDF of pairwise distances 1...

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos91 2-d (Plane) log( r ) log(#pairs within <=r ) 1.58 == ‘correlation integral’ = CDF of pairwise distances 2

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos92 Recall: Hop Plot Internet routers: how many neighbors within h hops? (= correlation integral!) Reachability function: number of neighbors within r hops, vs r (log- log). Mbone routers, 1995 log(hops) log(#pairs) 2.8

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos93 Fractals and power laws They are related concepts: fractals self-similarity scale-free power laws ( y= x a ) –F = C r (-2)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos94 Conclusions Real settings/graphs: skewed distributions –‘mean’ is meaningless degree count ?? 2 count 2

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos95 Conclusions Real settings/graphs: skewed distributions –‘mean’ is meaningless –slope of power law, instead degree count ?? 2 count 2 log(degree) log(count)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos96 Conclusions: Tools: rank-frequency plot (a’la Zipf) Correlation integral (= neighborhood function)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos97 Conclusions (cont’d) Recursion/self-similarity –May reveal non-obvious patterns (e.g., bow- ties within bow-ties within bow-ties) [Dill+, ‘01] “To iterate is human, to recurse is divine”

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos98 Resources Generators: RMAT (deepay AT cs.cmu.edu) Kronecker ({deepay,jure} AT cs.cmu.edu) BRITE INET:

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos99 Other resources Visualization - graph algo’s: Graphviz: pajek: lj.si/pub/networks/pajek/ Kevin Bacon web site:

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos100 References [Aiello+, '00] William Aiello, Fan R. K. Chung, Linyuan Lu: A random graph model for massive graphs. STOC 2000: [Albert+] Reka Albert, Hawoong Jeong, and Albert-Laszlo Barabasi: Diameter of the World Wide Web, Nature (1999) [Barabasi, '03] Albert-Laszlo Barabasi Linked: How Everything Is Connected to Everything Else and What It Means (Plume, 2003)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos101 References, cont’d [Barabasi+, '99] Albert-Laszlo Barabasi and Reka Albert. Emergence of scaling in random networks. Science, 286: , 1999 [Broder+, '00] Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar Rajagopalan, Raymie Stata, Andrew Tomkins, and Janet Wiener. Graph structure in the web, WWW, 2000

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos102 References, cont’d [Chakrabarti+, ‘04] RMAT: A recursive graph generator, D. Chakrabarti, Y. Zhan, C. Faloutsos, SIAM-DM 2004 [Dill+, '01] Stephen Dill, Ravi Kumar, Kevin S. McCurley, Sridhar Rajagopalan, D. Sivakumar, Andrew Tomkins: Self- similarity in the Web. VLDB 2001: 69-78

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos103 References, cont’d [Fabrikant+, '02] A. Fabrikant, E. Koutsoupias, and C.H. Papadimitriou. Heuristically Optimized Trade-offs: A New Paradigm for Power Laws in the Internet. ICALP, Malaga, Spain, July 2002 [FFF, 99] M. Faloutsos, P. Faloutsos, and C. Faloutsos, "On power-law relationships of the Internet topology," in SIGCOMM, 1999.

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos104 References, cont’d [Leskovec+05a] Jure Leskovec, Jon Kleinberg and Christos Faloutsos Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations KDD 2005, Chicago, IL. (Best research paper award) [Leskovec+05b] Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg and Christos Faloutsos, Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication, ECML/PKDD 2005, Porto, Portugal.

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos105 References, cont’d [Jovanovic+, '01] M. Jovanovic, F.S. Annexstein, and K.A. Berman. Modeling Peer-to-Peer Network Topologies through "Small-World" Models and Power Laws. In TELFOR, Belgrade, Yugoslavia, November, 2001 [Kumar+ '99] Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins: Extracting Large-Scale Knowledge Bases from the Web. VLDB 1999:

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos106 References, cont’d [Leland+, '94] W. E. Leland, M.S. Taqqu, W. Willinger, D.V. Wilson, On the Self-Similar Nature of Ethernet Traffic, IEEE Transactions on Networking, 2, 1, pp 1-15, Feb [Mihail+, '02] Milena Mihail, Christos H. Papadimitriou: On the Eigenvalue Power Law. RANDOM 2002:

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos107 References, cont’d [Milgram '67] Stanley Milgram: The Small World Problem, Psychology Today 1(1), (1967) [Montgomery+, ‘01] Alan L. Montgomery, Christos Faloutsos: Identifying Web Browsing Trends and Patterns. IEEE Computer 34(7): (2001)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos108 References, cont’d [Palmer+, ‘01] Chris Palmer, Georgos Siganos, Michalis Faloutsos, Christos Faloutsos and Phil Gibbons The connectivity and fault-tolerance of the Internet topology (NRDM 2001), Santa Barbara, CA, May 25, 2001 [Pennock+, '02] David M. Pennock, Gary William Flake, Steve Lawrence, Eric J. Glover, C. Lee Giles: Winners don't take all: Characterizing the competition for links on the web Proc. Natl. Acad. Sci. USA 99(8): (2002)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos109 References, cont’d [Schroeder, ‘91] Manfred Schroeder Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise W H Freeman & Co., 1991 (excellent book on fractals)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos110 References, cont’d [Siganos+, '03] G. Siganos, M. Faloutsos, P. Faloutsos, C. Faloutsos Power-Laws and the AS-level Internet Topology, Transactions on Networking, August [Watts+ Strogatz, '98] D. J. Watts and S. H. Strogatz Collective dynamics of 'small-world' networks, Nature, 393: (1998) [Watts, '03] Duncan J. Watts Six Degrees: The Science of a Connected Age W.W. Norton & Company; (February 2003)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos111 Thank you!

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos112 EXTRA Virus propagation

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos113 Outline Topology, ‘ laws ’ and generators EXTRA: Virus Propagation

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos114 Problem definition Q1: How does a virus spread across an arbitrary network? Q2: will it create an epidemic?

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos115 Framework Susceptible-Infected-Susceptible (SIS) model –Cured nodes immediately become susceptible Susceptible & healthy Infected & infectious Infected by neighbor Cured internally

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos116 The model (virus) Birth rate  probability than an infected neighbor attacks (virus) Death rate  : probability that an infected node heals Infected Healthy NN1 N3 N2 Prob. β Prob. 

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos117 The model Virus ‘strength’ s=  /  Infected Healthy NN1 N3 N2 Prob. β Prob. δ

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos118 Epidemic threshold  of a graph, defined as the value of , such that if strength s =  /  <  an epidemic can not happen Thus, given a graph compute its epidemic threshold

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos119 Epidemic threshold  What should  depend on? avg. degree? and/or highest degree? and/or variance of degree? and/or third moment of degree?

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos120 Epidemic threshold [Theorem] We have no epidemic, if β/δ <τ = 1/ λ 1,A

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos121 Epidemic threshold [Theorem] We have no epidemic, if β/δ <τ = 1/ λ 1,A largest eigenvalue of adj. matrix A attack prob. recovery prob. epidemic threshold Proof: [Wang+03]

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos122 Experiments (Oregon)  /  > τ (above threshold)  /  = τ (at the threshold)  /  < τ (below threshold)

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos123 Our result: Holds for any graph includes older results as special cases

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos124 Reference [Wang+03] Yang Wang, Deepayan Chakrabarti, Chenxi Wang and Christos Faloutsos: Epidemic Spreading in Real Networks: an Eigenvalue Viewpoint, SRDS 2003, Florence, Italy.

Carnegie Mellon UCLA-IPAM 05(c) 2005 C. Faloutsos125 Thank you! (really done this time )