1 On the Eigenvalue Power Law. Milena Mihail (Georgia Tech), Christos Papadimitriou (U.C. Berkeley)
2 Internet Measurement and Models. Network and application studies need properties and models of Internet graphs and Internet traffic. Shift of networking paradigm: open, decentralized, dynamic. Intense measurement efforts. Intense modeling efforts. (Figure labels: Routers, WWW, P2P.)
3 Internet & WWW Graphs. Routers exchanging traffic. Web pages and hyperlinks. 10K – 300K nodes. Average degree ~ 3.
4 Real Internet Graphs (CAIDA). Average degree = constant. A few degrees VERY LARGE. Degrees not sharply concentrated around their mean.
5 Degree-Frequency Power Law. (Plot: frequency vs. degree.) WWW measurement: Kumar et al 99. Internet measurement: Faloutsos et al 99. E[d] = const., but no sharp concentration.
6 Degree-Frequency Power Law. (Plot: frequency vs. degree.) E[d] = const., but no sharp concentration. Erdos-Renyi: sharp concentration. Models by Kumar et al 00, Bollobas et al 01, Fabrikant et al 02.
7 Rank-Degree Power Law. (Plot: degree vs. rank; labeled points: UUNET, Sprint, C&WUSA, AT&T, BBN.) Internet measurement: Faloutsos et al 99.
8 Eigenvalue Power Law. (Plot: eigenvalue vs. rank.) Internet measurement: Faloutsos et al 99.
9 This Paper: Large Degrees & Eigenvalues. (Plot: degrees and eigenvalues vs. rank; labeled points: UUNET, Sprint, C&WUSA, AT&T, BBN.)
10 This Paper: Large Degrees & Eigenvalues
11 Principal Eigenvector of a Star d
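The star fact behind this slide can be checked directly. The sketch below assumes a star whose center has degree d, and verifies that the vector with sqrt(d) on the center and 1 on every leaf is an eigenvector of the adjacency matrix with eigenvalue sqrt(d). Because that vector is strictly positive, it is the principal eigenvector. The helper names `star_adjacency` and `matvec` are illustrative, not from the slides.

```python
import math

def star_adjacency(d):
    """Adjacency matrix of a star: vertex 0 is the center, vertices 1..d are leaves."""
    n = d + 1
    A = [[0] * n for _ in range(n)]
    for leaf in range(1, n):
        A[0][leaf] = 1
        A[leaf][0] = 1
    return A

def matvec(A, x):
    """Plain matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

d = 9
A = star_adjacency(d)
lam = math.sqrt(d)                # claimed principal eigenvalue
x = [math.sqrt(d)] + [1.0] * d    # claimed principal eigenvector
Ax = matvec(A, x)

# Largest entrywise deviation between A x and lam * x; zero up to rounding.
residual = max(abs(ax - lam * xi) for ax, xi in zip(Ax, x))
print("residual:", residual)
```

A large-degree vertex therefore contributes an eigenvalue of roughly the square root of its degree, at least while its star is isolated from the rest of the graph.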
12 Large Degrees 2 3 4
13 Large Eigenvalues 2 3 4
14 Main Result of the Paper. The largest eigenvalues of the adjacency matrix of a graph whose large degrees are power law distributed (Zipf) are also power law distributed. Explains Internet measurements. Negative implications for the spectral filtering method in information retrieval.
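A back-of-the-envelope version of the main result, assuming (as the star slides suggest) that the i-th largest eigenvalue tracks the square root of the i-th largest degree. The exponent \beta and the constant c are notation introduced here, not taken from the slides:

```latex
d_i \approx c \, i^{-\beta}
\quad\Longrightarrow\quad
\lambda_i \approx \sqrt{d_i} \approx \sqrt{c}\; i^{-\beta/2}
```

Under that assumption, a rank-degree power law with exponent \beta shows up as a rank-eigenvalue power law with exponent \beta/2.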
15 Random Graph Model let Connectivity analyzed by Chung & Lu ‘01
16 Random Graph Model
17 Random Graph Model
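The model's formulas did not survive on these slides. The sketch below therefore assumes the expected-degree random graph studied by Chung & Lu, in which edge {i, j} appears independently with probability d_i d_j / sum_k d_k. The Zipf-like degree sequence and the parameters `n`, `d_max` and `d_min` are hypothetical choices for illustration only.

```python
import random

def zipf_expected_degrees(n, d_max, d_min):
    """Hypothetical sequence: d_i = d_max / i for the top ranks, floored at d_min."""
    return [max(d_max / rank, d_min) for rank in range(1, n + 1)]

def sample_graph(expected_degrees, rng):
    """Include edge {i, j} independently with probability d_i * d_j / sum(d)."""
    total = sum(expected_degrees)
    n = len(expected_degrees)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # Assumed edge probability; capped at 1 for safety.
            p = min(1.0, expected_degrees[i] * expected_degrees[j] / total)
            if rng.random() < p:
                edges.append((i, j))
    return edges

def realized_degrees(n, edges):
    """Count how many sampled edges touch each vertex."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

rng = random.Random(0)
expected = zipf_expected_degrees(n=400, d_max=40.0, d_min=2.0)
edges = sample_graph(expected, rng)
degrees = realized_degrees(len(expected), edges)

# Compare expected and realized degrees for the five highest-ranked vertices.
for rank in range(5):
    print(rank + 1, round(expected[rank], 1), degrees[rank])
```

In this kind of model the realized degree of a high-rank vertex is a sum of many independent indicators, which is why it stays close to its expectation.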
18 Theorem: For large enough [...], with probability at least [...]
19 Proof: Step 1: Decomposition. Parts: LL, RR, LR. LR-extra = LR - Vertex Disjoint Stars.
20 Proof: Step 2: Vertex Disjoint Stars. The degree of each vertex disjoint star is sharply concentrated around its mean d_i. Hence its principal eigenvalue is sharply concentrated around sqrt(d_i).
21 Proof: Step 3: LL, RR, LR-extra. LR-extra has max degree [...]. LL has [...] edges. RR has max degree [...].
22 Proof: Step 3: LL, RR, LR-extra. LR-extra has max degree [...]. RR has max degree [...]. LL has [...] edges.
23 Proof: Step 4: Matrix Perturbation Theory. Vertex Disjoint Stars have principal eigenvalues [...]. All other parts have max eigenvalue [...]. QED
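The slide does not say which perturbation result is used. A standard candidate is Weyl's inequality for symmetric matrices, sketched here with A the adjacency matrix of the vertex disjoint stars and E the adjacency matrix of everything else (both symbols are notation introduced here). Two elementary bounds on the largest adjacency eigenvalue of a graph H, in terms of its maximum degree \Delta(H) and its number of edges m(H), are the natural way to control the norm of E and match the degree and edge counts quoted in Step 3:

```latex
\left| \lambda_i(A + E) - \lambda_i(A) \right| \le \| E \|_2 ,
\qquad
\lambda_{\max}(H) \le \Delta(H),
\qquad
\lambda_{\max}(H) \le \sqrt{2\, m(H)} .
```

If every remaining part has a small maximum degree or few edges, each top eigenvalue of the whole graph stays within a small additive error of the corresponding star eigenvalue.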
24 Implication for Info Retrieval. Term-Norm Distribution Problem: spectral filtering, without preprocessing, reveals only the large degrees.
25 Implication for Info Retrieval. Term-Norm Distribution Problem: spectral filtering, without preprocessing, reveals only the large degrees. Local information. No “latent semantics”.
26 Implication for Information Retrieval. Term-Norm Distribution Problem: application-specific preprocessing (normalization of degrees) reveals clusters. WWW: related to searching, Kleinberg 97. IR, collaborative filtering, … Internet: related to congestion, Gkantsidis et al 02. Open: formalize “preprocessing”.
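The slides leave "preprocessing" unformalized, so the following is only one common reading of "normalization of degrees": the symmetric scaling N = D^{-1/2} A D^{-1/2}, an assumption made here for illustration. The sketch checks on a star that the vector of square-root degrees is an eigenvector of N with eigenvalue 1, so after this scaling the top eigenvalue no longer grows with the degree. The function name `normalize_by_degree` is hypothetical.

```python
import math

def normalize_by_degree(A):
    """Return N with N[i][j] = A[i][j] / sqrt(deg_i * deg_j); assumes no isolated vertices."""
    deg = [sum(row) for row in A]
    n = len(A)
    return [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matvec(M, x):
    """Plain matrix-vector product."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# Star with d leaves: vertex 0 is the center.
d = 100
n = d + 1
A = [[0] * n for _ in range(n)]
for leaf in range(1, n):
    A[0][leaf] = 1
    A[leaf][0] = 1

N = normalize_by_degree(A)
x = [math.sqrt(sum(row)) for row in A]   # square roots of the degrees
Nx = matvec(N, x)

# N x should equal x, i.e. eigenvalue 1, up to rounding.
residual = max(abs(a - b) for a, b in zip(Nx, x))
print("residual:", residual)
```

Without the scaling, the same star has top eigenvalue sqrt(d) = 10. With it the value is 1 for any d, which is one way to see why normalization can keep large degrees from dominating the spectrum.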