Spectral Analysis of Power-Law Graphs and its Application to Internet Topologies. Milena Mihail, Georgia Tech.

Presentation transcript:

1 Spectral Analysis of Power-Law Graphs and its Application to Internet Topologies Milena Mihail Georgia Tech

2 The Internet Phenomenon. Routers, WWW, P2P. Open, decentralized, dynamic. Market competition; security, privacy. Paradigm shift: from networks as artifacts that we construct to networks as phenomena that we study!

3 Internet Performance. Congestion (TCP/IP, Van Jacobson 88). Stability (game theory, Kelly 99). Scalability (TCP? Moore's Law?). WWW, P2P: index, search (Kleinberg 97, Google 98).

4 Required Data & Models. Routers, WWW, P2P: connectivity, capacity, traffic / demand. Internet models, such as GT-ITM, Brite, Inet, for analytic and simulation-based studies: how do elements organize?

5 The Internet Phenomenon. Routers, WWW, P2P. Open, decentralized, dynamic. Market competition; security, privacy. Paradigm shift: from networks as artifacts that we construct to networks as phenomena that we study!

6 Level of Autonomous Systems. Example ASes: Sprint, AT&T, Georgia Tech, CNN. Topology data from BGP routing tables, collected by NLANR and the looking glass at U. Oregon. Decentralized routing!

7 The AS Graph. ~14K nodes in 2002 (~2K nodes in 1997); ~30K links in 2002. Example ASes: Georgia Tech, CNN, AT&T, Sprint.

8 The Directed AS Graph. Example ASes: Georgia Tech, CNN, AT&T, Sprint. Peering relationships: customer-provider, peers (Gao 00, Subramanian et al 01). Five-tier hierarchy (Subramanian et al 01).

9 The Real AS Graph CAIDA

10 Degree-Frequency Power Law (Faloutsos et al). [Plot: frequency vs. degree.]

11 Rank-Degree Power Law (Faloutsos et al 99). [Plot: degree vs. rank; labeled ASes: UUNET, Sprint, C&W USA, AT&T, BBN.]

12 Eigenvalue Power Law (Faloutsos et al 99). [Plot: eigenvalue vs. rank.]

13 Eigenvalue Power Law (Faloutsos et al 99). [Plot: eigenvalue vs. rank; labeled ASes: UUNET, Sprint, C&W USA, AT&T, BBN.]

14 Heavy Tailed Degree Distribution. Degrees are not concentrated around the mean; the graphs are highly irregular. Departure from standard Internet models such as Waxman and Transit-Stub (Zegura et al 95): models and techniques must be revisited. Departure from Erdos-Renyi graphs, whose degrees are sharply concentrated around the mean with exponential tails.

15 Power Law Graphs. Which primitives drive their evolution? Preferential attachment (Barabasi 99, Bollobas et al 00); multiobjective optimization (Carlson & Doyle 00, Papadimitriou 02). What are their structural properties? Hierarchy (Subramanian et al 01, Govindan et al 02); clustering.

16 Spectral Analysis of Matrices. Examines eigenvalues and eigenvectors. Useful analogy to signal processing. All eigenvectors form a basis (complete representation). Focus on large eigenvalues and the corresponding eigenvectors. Pervasive in: algebra (representation theory); algorithms (Markov chain sampling); complexity (expanders, pseudorandomness); data mining and information retrieval. Highly technical, application-specific adaptations.

17 Outline of Results in this Talk. 0. Spectral primitives: eigenvalues and eigenvectors. 1. Eigenvectors ~ Significance, hence HIERARCHY (capacity / load). 2. Eigenvectors ~ Clustering; CLUSTERING impacts CONGESTION. (1) and (2) use normalization preprocessing of the data. 3. On Eigenvectors of Eigenvalue Power Law. 4. Further Directions.

18 Eigenvectors & Eigenvalues: $A x = \lambda x$

19 Matrix as a Linear Transformation [Figure: matrix A.]

[Figure sequence: Step 0, Step 1, Step 2, Step 3.]

22 Stochastic Normalization. [Matrix A: the example adjacency matrix rescaled to be stochastic; its visible nonzero entries are 1/3.] Eigenvectors and eigenvalues: $A x = \lambda x$

23 The Random Walk. Eigenvalues lie between 1 and -1 (1 and 0 also easy). [Figure: example graph with node weights such as 1/9 and 2/9.]

24 1. Principal Eigenvector ~ Significance. The principal eigenvector is the stationary distribution of the random walk, corresponding to $\lambda = 1$. In undirected graphs the weights of the principal eigenvector are proportional to degrees. [Figure: example graph with node weights such as 3/16 and 2/16.]
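The degree-proportionality claim on this slide can be checked exactly on a small case. The sketch below is an illustration only: the 6-node graph (`EDGES`) is a hypothetical example, not the graph drawn on the slide, and the helper names are invented here. It verifies that the distribution deg(v) / 2m is left unchanged by one step of the random walk.

```python
from fractions import Fraction

# Hypothetical undirected example graph: a 6-cycle plus two chords.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3), (1, 4)]
N = 6


def neighbors(edges, n):
    """Adjacency lists of an undirected graph."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def walk_step(dist, adj):
    """One random-walk step: each node splits its mass evenly among its neighbors."""
    nxt = [Fraction(0)] * len(adj)
    for u, nbrs in enumerate(adj):
        share = dist[u] / len(nbrs)
        for v in nbrs:
            nxt[v] += share
    return nxt


adj = neighbors(EDGES, N)
# Candidate stationary distribution: degree divided by twice the edge count.
degree_dist = [Fraction(len(nbrs), 2 * len(EDGES)) for nbrs in adj]
# Stationary means one walk step maps the distribution to itself.
is_stationary = walk_step(degree_dist, adj) == degree_dist
print(degree_dist, is_stationary)
```

Exact rational arithmetic is used so that the equality test is not affected by floating-point rounding.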

25 1. Principal Eigenvector ~ Significance. In directed graphs the weights of the principal eigenvector can vary far beyond degrees. [Figure: example node weights on the order of 10^-4.]

26 1. Hierarchy from Principal Eigenvector of Directed AS Graph. Significance by high degree; significance by significant peers and customers. Add a 5% probability uniform jump to avoid sinks. In WWW: Google's PageRank.
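A minimal sketch of this kind of ranking, under stated assumptions: links are taken to point from customer to provider (the slide does not fix the direction), sink nodes spread their mass uniformly, and the 4-node graph `OUT_LINKS` and the function name `eigenvector_rank` are hypothetical. It is a PageRank-style power iteration with the 5% uniform jump mentioned on the slide, not the talk's actual procedure.

```python
def eigenvector_rank(out_links, jump=0.05, iterations=500):
    """Power iteration for a random walk with a uniform jump.

    out_links[u] lists the nodes that u points to. With probability `jump`
    the walk moves to a uniformly random node; a node without out-links
    (a sink) also moves uniformly at random.
    """
    n = len(out_links)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [jump / n] * n  # uniform-jump mass (total rank is 1)
        for u, targets in enumerate(out_links):
            if targets:
                share = (1.0 - jump) * rank[u] / len(targets)
                for v in targets:
                    nxt[v] += share
            else:
                share = (1.0 - jump) * rank[u] / n
                for v in range(n):
                    nxt[v] += share
        rank = nxt
    return rank


# Hypothetical example: ASes 1 and 2 are customers of 0; AS 3 is a customer of 1.
OUT_LINKS = [[], [0], [0], [1]]
rank = eigenvector_rank(OUT_LINKS)
order = sorted(range(len(rank)), key=lambda v: -rank[v])
print(order, rank)
```

In this example node 1 outranks node 2 although both have a single provider, because node 1 also has a customer: significance is inherited from neighbors, not read off the degree alone.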

27 1. Principal Eigenvector Ranking

28 Eigenvector vs Five Tiers

29 1. Principal Eigenvector Ranking

30 Outline of Results in this Talk. 0. Spectral primitives: eigenvalues and eigenvectors. 1. Eigenvectors ~ Significance, hence HIERARCHY (capacity / load). 2. Eigenvectors ~ Clustering; CLUSTERING impacts CONGESTION. (1) and (2) use normalization preprocessing of the data. 3. On Eigenvectors of Eigenvalue Power Law. 4. Further Directions.

31 2. Eigenvectors ~ Clustering [Figure sequence across slides 31 to 36: example graph with node weights 1/6.]

36 2. Eigenvectors ~ Clustering. Matrix Perturbation Theory.

37 Spectral Filtering. Eigenvalues in decreasing order: $\lambda_1 > \lambda_2 > \dots > \lambda_k > \lambda_{k+1} > \dots > \lambda_n$. Heuristic (Kleinberg 97, Fiat et al 01): find clusters in the most positive and most negative ends of eigenvectors associated with large eigenvalues.
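A simplified sketch of the heuristic, assuming a single eigenvector and a plain sign split rather than the "most positive / most negative ends" of several eigenvectors. The example graph (two 4-cliques joined by one link) and all function names are hypothetical. The code works with the symmetric normalization D^{-1/2} A D^{-1/2}, which has the same eigenvalues as the random-walk matrix, and finds the eigenvector of the second-largest eigenvalue by power iteration with deflation.

```python
import math


def two_cliques_with_bridge(size=4):
    """Two cliques of `size` nodes joined by a single link (hypothetical example)."""
    n = 2 * size
    adj = [[] for _ in range(n)]
    for block in (range(0, size), range(size, n)):
        for u in block:
            adj[u] = [v for v in block if v != u]
    adj[size - 1].append(size)
    adj[size].append(size - 1)
    return adj


def second_eigenvector(adj, iterations=300):
    """Eigenvector of the 2nd largest eigenvalue of S = D^{-1/2} A D^{-1/2}."""
    n = len(adj)
    deg = [len(nbrs) for nbrs in adj]
    # The top eigenvector of S (eigenvalue 1) is proportional to sqrt(degree).
    norm = math.sqrt(sum(deg))
    top = [math.sqrt(d) / norm for d in deg]

    def deflate_and_normalize(x):
        dot = sum(a * b for a, b in zip(x, top))
        x = [a - dot * b for a, b in zip(x, top)]
        length = math.sqrt(sum(a * a for a in x))
        return [a / length for a in x]

    x = deflate_and_normalize([float(i + 1) for i in range(n)])
    for _ in range(iterations):
        # Iterate with (S + I) / 2: its eigenvalues lie in [0, 1], so the
        # iteration converges to the largest remaining eigenvalue of S.
        sx = [sum(x[v] / math.sqrt(deg[u] * deg[v]) for v in adj[u])
              for u in range(n)]
        x = deflate_and_normalize([(a + b) / 2.0 for a, b in zip(sx, x)])
    return x


vec = second_eigenvector(two_cliques_with_bridge())
positive = sorted(u for u, w in enumerate(vec) if w > 0)
negative = sorted(u for u, w in enumerate(vec) if w < 0)
print(positive, negative)
```

On this graph the two sign classes coincide with the two cliques; which clique ends up on the positive side depends on the starting vector.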

38 2. Eigenvectors ~ Clustering. [Plot: weight of eigenvector k vs. rank, for eigenvalues ordered $\lambda_1 > \dots > \lambda_k > \lambda_{k+1} > \dots > \lambda_n$.]

An Example of a Cluster

44 Additional Matrices. Similarity matrix $A A^T$, where A is the directed AS graph. Complete and pruned AS topology. In all cases, prune leaves of very big degree nodes. Frequency normalization is necessary. Clusters are consistent and evolving over time. Synthetic Internet topologies have much weaker clustering properties.
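As a small illustration of the similarity matrix: for a 0/1 adjacency matrix A of a directed graph, entry (i, j) of A A^T counts the nodes that both i and j point to. The 4-node matrix below is a hypothetical example, and the pruning and frequency normalization mentioned on the slide are not modeled here.

```python
def similarity(a):
    """Return A * A^T for a square matrix given as a list of rows."""
    n = len(a)
    return [[sum(a[i][k] * a[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical directed graph: row u has a 1 in column v if u links to v.
A = [
    [0, 0, 1, 1],  # AS 0 links to ASes 2 and 3
    [0, 0, 1, 1],  # AS 1 links to ASes 2 and 3
    [0, 0, 0, 1],  # AS 2 links to AS 3
    [0, 0, 0, 0],  # AS 3 has no outgoing links
]
S = similarity(A)
for row in S:
    print(row)
```

ASes 0 and 1 share both of their out-neighbors, so S[0][1] is 2, while AS 3 has no out-links and is similar to nothing.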

45 Clustering and Congestion. Assume 1 unit of traffic between each pair of ASes in each direction. Route traffic in the graph (like BGP). Compute the number of connections using each link. This is a measure of congestion.
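The congestion measure can be sketched as follows, with one loud assumption: shortest-hop routing (breadth-first search, one path per ordered pair) stands in for BGP, whose real path selection is policy-driven. The example graph (two triangles joined by one link) and the function name `link_loads` are hypothetical.

```python
from collections import deque


def link_loads(adj):
    """Count, for each undirected link, how many ordered (src, dst) pairs
    route across it when every pair sends 1 unit along a BFS shortest path."""
    loads = {}
    for src in range(len(adj)):
        # BFS tree rooted at src; parent[v] is the previous hop on the path to v.
        parent = {src: None}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent:
                    parent[v] = u
                    queue.append(v)
        # Walk each destination back to src, charging every link on the way.
        for dst in parent:
            node = dst
            while parent[node] is not None:
                link = (min(node, parent[node]), max(node, parent[node]))
                loads[link] = loads.get(link, 0) + 1
                node = parent[node]
    return loads


# Hypothetical example: triangles {0, 1, 2} and {3, 4, 5} joined by link 2-3.
ADJ = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
loads = link_loads(ADJ)
most_congested = max(loads, key=loads.get)
print(most_congested, loads[most_congested])
```

The single inter-cluster link carries all 3 x 3 x 2 = 18 cross-cluster connections, which is why it is the most congested link in this example.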

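46 Effect of intra-cluster and inter-cluster traffic on the most congested link (two panels, each comparing Internet and Inet)

Traffic   Internet   Inet     |   Traffic   Internet   Inet
0%        100% (both)         |   0%        100% (both)
20%       91.5%      97.7%    |   20%       126%       109%
40%       83.1%      95.3%    |   40%       153%       117%
60%       74.2%      92.9%    |   60%       172%       128%
80%       65.7%      90.8%    |   80%       191%       136%
100%      57.3%      88.5%    |   100%      207%       142%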

47 Outline of Results in this Talk. 1. Eigenvectors ~ Significance, hence HIERARCHY (capacity / load). 2. Eigenvectors ~ Clustering; CLUSTERING impacts CONGESTION. (1) and (2) used normalization preprocessing of the data; normalization preprocessing of data is necessary. 3. Eigenvectors of Eigenvalue Power Law are LOCALIZED. 4. Further Directions.

48 Which Eigenvectors correspond to Eigenvalue Power Law? (Faloutsos et al 99.) [Plot: eigenvalue vs. rank; labeled ASes: UUNET, Sprint, C&W USA, AT&T, BBN.]

49 Large Degrees, or “Stars”, of the AS Graph Dominate the Spectrum of the Adjacency Matrix, Prior to Normalization

50 Principal Eigenvector of a Star [Figure: star whose center has degree d.]
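For reference, a standard fact about stars can be checked numerically: the adjacency matrix of a star with d leaves has principal eigenvalue sqrt(d), with an eigenvector that puts weight sqrt(d) on the center and 1 on each leaf. The sketch below verifies this for one value of d; the choice d = 9 and the helper names are illustrative assumptions.

```python
import math


def star_adjacency(d):
    """Adjacency matrix of a star: node 0 is the center, nodes 1..d are leaves."""
    n = d + 1
    a = [[0] * n for _ in range(n)]
    for leaf in range(1, n):
        a[0][leaf] = a[leaf][0] = 1
    return a


def matvec(a, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(r * c for r, c in zip(row, x)) for row in a]


d = 9
x = [math.sqrt(d)] + [1.0] * d  # candidate principal eigenvector
ax = matvec(star_adjacency(d), x)
# Largest deviation of A x from sqrt(d) * x; zero means x is an eigenvector.
residual = max(abs(p - math.sqrt(d) * q) for p, q in zip(ax, x))
# Fraction of the squared eigenvector weight that sits on the center.
center_share = x[0] ** 2 / sum(v * v for v in x)
print(residual, center_share)
```

Half of the squared weight sits on the center regardless of d, which is one way to see why such eigenvectors are localized on high-degree nodes.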

51 Disjoint Stars

52 “Mostly” Disjoint Stars. Proof by matrix perturbation theory and spectral graph theory.

53 3. Explanation of Eigenvalue Power Law. Theorem: Random graphs whose largest degrees are, in expectation, $d_1 > d_2 > \dots > d_k$ with $d_j \sim j^{-a}$ have largest eigenvalues sharply concentrated around $\lambda_j \sim j^{-b}$ for $j = 1,\dots,k$, and corresponding eigenvectors localized on the corresponding largest degrees, with very high probability.
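The idealized case of exactly disjoint stars gives some intuition for the theorem. There the largest eigenvalues are sqrt(d_j), so a degree power law with exponent a yields an eigenvalue power law with exponent a / 2. The sketch below only illustrates that idealized case; the constant 1000, the exponent a = 1, and the helper `loglog_slope` are assumptions, and it does not reproduce the random-graph argument of the talk.

```python
import math


def loglog_slope(values):
    """Least-squares slope of log(value_j) against log(j), for j = 1..k."""
    xs = [math.log(j) for j in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


a = 1.0
# Degrees of k = 10 disjoint stars following d_j ~ j^(-a), rounded down.
degrees = [int(1000 * j ** -a) for j in range(1, 11)]
# Each disjoint star with d leaves contributes the eigenvalue sqrt(d).
eigenvalues = [math.sqrt(d) for d in degrees]
degree_slope = loglog_slope(degrees)
eigen_slope = loglog_slope(eigenvalues)
print(degree_slope, eigen_slope)
```

The fitted eigenvalue slope is half the fitted degree slope, close to -a / 2 up to the rounding of the degrees.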

54 Summary. 0. First spectral analysis of Internet topologies. 1. PRINCIPAL EIGENVECTOR implies HIERARCHY. 2. EIGENVECTORS of LARGE EIGENVALUES imply CLUSTERING. 3. CLUSTERING impacts CONGESTION. 4. Defined intra-cluster and inter-cluster traffic. 5. Introduced normalization preprocessing specific to heavy-tailed data. 6. Explained Eigenvalue Power Law. (1) through (5) with C. Gkantsidis and E. Zegura; (6) with C. Papadimitriou.

55 Further Directions. How does congestion scale in power law graphs? Other properties, such as resilience. What are the growth primitives of power law graphs? Optimization tradeoff primitives translate to cost vs. service. Towards efficient and accurate synthetic data. Level of Autonomous Systems: routing protocol (BGP) stability by game theory.