Download presentation
Presentation is loading. Please wait.
Published byAmbrose Flynn Modified over 9 years ago
1
Fan Chung University of California, San Diego The PageRank of a graph
3
What is PageRank? Ranking the vertices of a graph Partially ordered sets + graph theory -Applications --web search,ITdata sets, xxxxxxxxxxxxxpartitioning algorithms,… -Theory --- correlations among vertices A new graph invariant for dealing with:
4
fan
5
Outline of the talk Motivation Local cuts and the Cheeger constant using: eigenvectors random walks PageRank heat kernel Define PageRank Four versions of the Cheeger inequality Four partitioning algorithms Green’s functions and hitting time
6
What is PageRank? What is Rank?
9
What is PageRank? PageRank is defined on any graph.
10
An induced subgraph of the collaboration graph with authors of Erdös number ≤ 2.
11
A subgraph of the Hollywood graph.
12
A subgraph of a BGP graph
13
Yahoo IM graph Reid Andersen 2005 The Octopus graph
14
Graph Theory has 250 years of history. Leonhard Euler 1707-1783 The Bridges of Königsburg Is it possible to walk over every bridge once and only once?
15
Geometric graphs Algebraic graphs General graphs Topological graphs
16
Massive data Massive graphs WWW-graphs Call graphs Acquaintance graphs Graphs from any data a.base protein interaction network Jawoong Jeong
17
Big and bigger graphsNew directions in graph theory
18
Many basic questions: Correlation among vertices? The `geometry’ of a network ? Quantitative analysis? distance, flow, cut, … eigenvalues, rapid mixing, … Local versus global?
19
Google’s answer: The definition for PageRank?
20
A measure for the “importance” of a website The “importance” of a website is proportional to the sum of the importance of all the sites that link to it.
21
A solution for the “importance” of a website Solvefor x
22
A solution for the “importance” of a website Solvefor x x = ρ A x Adjacency matrix
23
A solution for the “importance” of a website Solvefor x x = ρ A x
24
Graph models directed graphs weighted graphs (undirected) graphs
25
Graph models directed graphs weighted graphs (undirected) graphs
26
Graph models directed graphs weighted graphs (undirected) graphs 1.5 1.2 3.3 \ 1.5 2 2.3 1 2.81.1
27
In a directed graph, authority there are two types of “importance”: hub Jon Kleinberg 1998
28
Two types of the “importance” of a website Solveand x x = r A y Importance as Authorities : y Importance as Hubs : y = s A x T x = rs A A x T y = rs A A y T
29
Eigenvalue problem for n x n matrix:. n ≈ 30 billion websites Hard to compute eigenvalues Even harder to compute eigenvectors
30
In the old days, compute for a given (whole) graph. In reality, can only afford to compute “locally”.
31
A traditional algorithm Input: a given graph on n vertices. New algorithmic paradigm Input: access to a (huge) graph Efficient algorithm means polynomial algorithms n 3, n 2, n log n, n (e.g., for a vertex v, find its neighbors) Bounded number of access.
32
A traditional algorithm Input: a given graph on n vertices. New algorithmic paradigm Input: access to a (huge) graph Efficient algorithm means polynomial algorithms n 3, n 2, n log n, n (e.g., for a vertex v, find its neighbors) Bounded number of access. Exponential polynomial Infinity finite
33
The definition of PageRank given by Brin and Page is based on random walks.
34
Random walks in a graph. G : a graph P : transition probability matrix A lazy walk: := the degree of u.
35
A (bored) surfer either surf a random webpage with probability 1- α Original definition of PageRank α : the jumping constant or surf a linked webpage with probability α
36
Two equivalent ways to define PageRank pr(α,s) (1) s: the seed as a row vector Definition of personalized PageRank α : the jumping constant s
37
Two equivalent ways to define PageRank p=pr(α,s) (1) (2) s = Definition of PageRank the (original) PageRank some “seed”, e.g., s = personalized PageRank
38
Depends on the applications? How good is PageRank as a measure of correlationship? Isoperimetric properties How “good” is the cut?
39
Isoperimetric properties “What is the shortest curve enclosing a unit area?” In a graph G and an integer m, what is the minimum cut disconnecting a subgraph of ≥ m vertices? In a graph G, what is the minimum cut e(S,V-S) so that e(S,V-S) is the smallest? _____ Vol S
40
Two types of cuts: Vertex cut edge cut How “good” is the cut? S E(S,V-S)
41
e(S,V-S) _____ Vol S e(S,V-S) _____ |S| Vol S = Σ deg(v) v ε S |S| = Σ 1 v ε S S V-S
42
The Cheeger constant The Cheeger constant for graphs h G and its variations are sometimes called “conductance”, “isoperimetric number”, … The volume of S is
43
The Cheeger constant The Cheeger inequality : the first nontrivial eigenvalue of the xx(normalized) Laplacian of a connected graph.
44
The spectrum of a graph Many ways to define the spectrum of a graph. How are the eigenvalues related to properties of graphs? properties of graphs? Adjacency matrix
45
Combinatorial Laplacian diagonal degree matrix adjacency matrix Gustav Robert Kirchhoff 1824-1887 The spectrum of a graph Adjacency matrix
46
Combinatorial Laplacian diagonal degree matrix adjacency matrix Gustav Robert Kirchhoff 1824-1887 The spectrum of a graph Adjacency matrix Matrix tree theorem #spanning. trees
47
Combinatorial Laplacian diagonal degree matrix adjacency matrix Gustav Robert Kirchhoff 1824-1887 The spectrum of a graph Adjacency matrix Normalized Laplacian Random walks Rate of convergence
48
The spectrum of a graph not symmetric in general Normalized Laplacian symmetric normalized { Discrete Laplace operator with eigenvalues { loopless, simple
49
The spectrum of a graph Normalized Laplacian symmetric normalized Discrete Laplace operator with eigenvalues = { { not symmetric in general
50
dictates many properties of a graph. expander diameter discrepancy subgraph containment …. Spectral implications for finding good cuts?
51
Finding a cut by a sweep For order the vertices Consider sets and the Cheeger constant of Define
52
Using a sweep by the eigenvector, can reduce the exponential number of choices of subsets to a linear number. Finding a cut by a sweep
53
Using a sweep by the eigenvector, can reduce the exponential number of choices of subsets to a linear number. Finding a cut by a sweep Still, there is a lower bound guarantee by using the Cheeger inequality.
54
Four types of Cheeger inequalities. eigenvectors random walks PageRank heat kernel Four proofs using: Leading to four different one-sweep partitioning algorithms.
55
graph spectral method random walks PageRank heat kernel spectral partition algorithm local partition algorithms Four proofs of Cheeger inequalities
56
Graph partitioning Local graph partitioning
57
What is a local graph partitioning algorithm? A local graph partitioning algorithm finds a small cut near the given seed(s) with running time depending only on the size of the output.
58
Examples of local partitioning
64
graph spectral method random walks PageRank heat kernel spectral partition algorithm local partition algorithms Four proofs of Cheeger inequalities
65
graph spectral method random walks PageRank heat kernel Lovasz, Simonovits, 90, 93 Spielman, Teng, 04 Andersen, Chung, Lang, 06 Chung, PNAS, 08. Four proofs of Cheeger inequalities Cheeger 60’s, Fiedler ’73 Alon 86’, Jerrum+Sinclair 89’
66
where is the first non-trivial eigenvalue of the Laplacian and is the minimum Cheeger ratio in a sweep using the eigenvector. Using eigenvector the Cheeger inequality can be stated as The Cheeger inequalityPartition algorithm,
67
Proof of the Cheeger inequality: from definition by Cauchy-Schwarz ineq. summation by parts. from the definition.
68
where is the minimum Cheeger ratio over sweeps by using a lazy walk of k steps from every vertex for an appropriate range of k. A Cheeger inequality using random walks Lovász, Simonovits, 90, 93 Leads to a Cheeger inequality:
69
Using the PageRank vector. A Cheeger inequality using PageRank Recall the definition of PageRank p=pr(α,s): (1) (2) Organize the random walks by a scalar α.
70
Random walks versus PageRank How fast is the convergence to the stationary distribution? Choose α to satisfy the required property. For what k, can one have ?
71
with seed as a subset S Using the PageRank vector and a Cheeger inequality can be obtained : where S is the Dirichlet eigenvalue of the Laplacian, and is the minimum Cheeger ratio over sweeps by using the appropriate personalized PageRank with seeds S. A Cheeger inequality using PageRank
72
over all f satisfying the Dirichlet boundary condition: Dirichlet eigenvalues for a subset for all
73
Local Cheeger constant for a subset
74
with seed as a subset S Using the PageRank vector and a Cheeger inequality can be obtained : where S is the Dirichlet eigenvalue of the Laplacian, and is the minimum Cheeger ratio over sweeps by using personalized PageRank with seed S. A Cheeger inequality using PageRank
75
Algorithmic aspects of PageRank Fast approximation algorithm for x personalized PageRank Can use the jumping constant to approximate PageRank with a support of the desired size. greedy type algorithm, almost linear complexity Errors can be effectively bounded.
76
A graph partition algorithm using PageRank Given a set S with randomly choose a vertex v in S. With probability at least the one-sweep algorithm using has an initial segment with the Cheeger constant at most
77
Graph partitioning using PageRank vector. 198,430 nodes and 1,133,512 edges
80
Kevin Lang 2007
81
graph spectral method random walks PageRank heat kernel Lovasz, Simonovits, 90, 93 Spielman, Teng, 04 Andersen, Chung, Lang, 06 Chung, PNAS, 08. Four proofs of Cheeger inequalities Fiedler ’73, Cheeger, 60’s Alon 86’
82
PageRank versus heat kernel Geometric sumExponential sum
83
PageRank versus heat kernel Geometric sumExponential sum recurrenceHeat equation
84
A Cheeger inequality using the heat kernel Theorem: where is the minimum Cheeger ratio over sweeps by using heat kernel pagerank over all u in S. Theorem: For
85
Definition of heat kernel
86
Using the upper and lower bounds, a Cheeger inequality can be obtained : where S is the Dirichlet eigenvalue of the Laplacian, and is the minimum Cheeger ratio over sweeps by using heat kernel with seeds S for appropriate t. A Cheeger inequality using the heat kernel
90
Covering and packing using PageRank Many applications of PageRank for problems in Graph Theory Pebbing and routing using PageRank Graph embedding using PageRank Relating graph invariants of subgraphs to the host graph using PageRank Your favorite old problem using PageRank? Graph drawing using PageRank
91
Random graphs with general degrees pageranks Algorithmic game theory, graphical games New Directions in Graph Theory for information networks Topics: Using: Spectral methods Probabilistic methods Quasirandom …
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.