Published by Darren Bishop. Modified over 9 years ago.
1
Experiments with MATLAB: Google PageRank
Roger Jang (張智星)
CSIE Dept, National Taiwan University, Taiwan
jang@mirlab.org
http://mirlab.org/jang
2
PageRank Algorithm
Facts about the PageRank algorithm
– Developed by Google’s founders, Larry Page and Sergey Brin, when they were graduate students at Stanford University
– Determined entirely by the link structure of the WWW
– Recomputed about once a month
– The world’s largest matrix computation
Ideas
– A random walk problem, modeled as a Markov chain (Markov process)
– Page rank: the limiting probability that a random surfer visits a page
– A page has a high rank if other pages with high ranks link to it.
3
Connectivity Matrix G
Notations
– U: the set of all n web pages in the world (n > 4 billion by June 2004)
– G: the n-by-n connectivity matrix, with g_ij = 1 if there is a hyperlink to page i from page j, and g_ij = 0 otherwise
Facts
– G is huge, but very sparse
– The number of nonzeros in G is the total number of hyperlinks in U.
[Figure: example web graph with six pages, numbered 1 to 6]
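The definition of G can be tried out on a toy web. The following is a minimal Python sketch, not the author's code: the six-page link list is a hypothetical example (the transcript does not preserve the links drawn in the figure), and the names `links`, `G` and `g` are made up for illustration. Only the nonzero positions are stored, which is the idea behind sparse storage.

```python
# Hypothetical six-page web (an assumption, not taken from the slides):
# each pair (j, i) means "page j has a hyperlink to page i".
links = [(1, 2), (1, 6), (2, 3), (2, 4), (3, 4), (3, 5), (3, 6), (4, 1), (6, 1)]
n = 6

# Sparse storage: keep only the positions of the nonzero entries of G.
# The entry g_ij = 1 is stored as the key (i, j): row i is the target page,
# column j is the source page.
G = {(i, j) for (j, i) in links}


def g(i, j):
    """Return g_ij: 1 if page j links to page i, otherwise 0."""
    return 1 if (i, j) in G else 0


nnz = len(G)              # number of nonzeros = number of hyperlinks
density = nnz / (n * n)   # fraction of entries that are nonzero
print(nnz, density)
```

For a web with billions of pages the density is tiny, so storing only the nonzero positions is what makes G manageable at all.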
4
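Degrees of a Page
– Define the row and column sums of G: r_i = \sum_j g_ij and c_j = \sum_i g_ij
– r_i: in-degree of page i (number of links pointing to page i)
– c_j: out-degree of page j (number of links leaving page j)
[Figure: example web graph with six pages, numbered 1 to 6]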
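As a quick illustration of the two degrees, here is a small Python sketch that computes row sums (in-degrees) and column sums (out-degrees) directly from a link list. The link list is a hypothetical six-page example, and the names `in_degree` and `out_degree` are illustrative assumptions.

```python
# Hypothetical six-page web (an assumption): (j, i) means "page j links to page i".
links = [(1, 2), (1, 6), (2, 3), (2, 4), (3, 4), (3, 5), (3, 6), (4, 1), (6, 1)]
n = 6

in_degree = [0] * (n + 1)    # r_i, row sums of G; index 0 is unused
out_degree = [0] * (n + 1)   # c_j, column sums of G; index 0 is unused

for j, i in links:
    in_degree[i] += 1    # one more link arriving at page i
    out_degree[j] += 1   # one more link leaving page j

for page in range(1, n + 1):
    print(page, in_degree[page], out_degree[page])
```

In this toy web page 5 has out-degree 0: it receives a link but has none of its own, a case the transition matrix has to handle separately.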
5
From Connectivity Matrix to Transition Probability Matrix
Connectivity matrix
– G: g_ij = 1 if there is a hyperlink from page j to page i
Transition probability matrix
– A: a_ij is the probability of jumping from page j to page i
– Column j of A holds the probabilities of jumping from page j to the other pages.
[Figure: example web graph with six pages, numbered 1 to 6]
6
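Two Types of Transitions
– Type 1: Follow one of the links on the current page (with probability p)
– Type 2: Jump to a random page (with probability 1-p)
Overall: a_ij = p g_ij / c_j + (1-p)/n, for a page j with out-degree c_j > 0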
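The two transition types can be combined into one matrix A with a_ij = p g_ij / c_j + (1-p)/n. The Python sketch below builds A as a dense list of lists for a hypothetical six-page web and checks that each column sums to 1. Two points are assumptions rather than statements from the slides: a page with no outgoing links is given the uniform column 1/n (a common convention), and the link list has no duplicate entries. The function name `transition_matrix` is illustrative.

```python
def transition_matrix(links, n, p=0.85):
    """Build the n-by-n transition matrix A, with A[i-1][j-1] = a_ij.

    links: list of (j, i) pairs meaning "page j links to page i" (1-based).
    Assumption: a page without outgoing links jumps uniformly (column = 1/n).
    """
    out_degree = [0] * n
    for j, _ in links:
        out_degree[j - 1] += 1

    A = [[0.0] * n for _ in range(n)]
    for col in range(n):
        # Type 2 transition: random jump, or uniform column for a dangling page.
        fill = (1 - p) / n if out_degree[col] > 0 else 1.0 / n
        for row in range(n):
            A[row][col] = fill
    for j, i in links:
        # Type 1 transition: follow one of the c_j links on page j.
        A[i - 1][j - 1] += p / out_degree[j - 1]
    return A


# Hypothetical six-page web (an assumption, not taken from the slides).
links = [(1, 2), (1, 6), (2, 3), (2, 4), (3, 4), (3, 5), (3, 6), (4, 1), (6, 1)]
A = transition_matrix(links, 6)
column_sums = [sum(A[row][col] for row in range(6)) for col in range(6)]
print(column_sums)
```

A dense A is only workable for toy sizes; for a real web the matrix is never formed explicitly.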
7
Transition Probability Matrix
Facts
– A is the transition probability matrix of the Markov chain. Its elements are all strictly between 0 and 1, and its column sums are all equal to 1.
– If z is the vector of initial probabilities of being on each page, then Az is the probability vector after 1 transition, A^2 z is the probability vector after 2 transitions, and so on.
– A^k z converges to the page rank if k is big enough.
– A^(k+1) z ≈ A^k z when k is big, so Ax = x holds for x = A^k z when k is big.
– Perron-Frobenius theorem: a nonzero solution of x = Ax exists and is unique to within a scaling factor.
– If the scaling factor is chosen so that the elements of x sum to 1, then x is Google’s PageRank.
– Most of the elements of A are equal to (1-p)/n. If n = 4*10^9 and p = 0.85, then (1-p)/n = 3.75*10^-11.
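The claim that A^k z settles down to the same vector regardless of the starting probabilities can be checked numerically. This is a minimal Python sketch on a hypothetical three-page web (links 1→2, 1→3, 2→3, 3→1, chosen only for illustration); the helper names `matvec` and `iterate` are assumptions.

```python
def matvec(A, x):
    """Multiply the dense matrix A (list of rows) by the vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


p, n = 0.85, 3
d = (1 - p) / n   # probability of landing on a given page via a random jump

# Hypothetical 3-page web: 1->2, 1->3, 2->3, 3->1.
# Column j holds the probabilities of jumping from page j to each page.
A = [
    [d,         d,     p + d],
    [p / 2 + d, d,     d],
    [p / 2 + d, p + d, d],
]


def iterate(z, k):
    """Return A^k z by applying A to z k times."""
    for _ in range(k):
        z = matvec(A, z)
    return z


x_uniform = iterate([1 / 3, 1 / 3, 1 / 3], 200)   # start spread over all pages
x_single = iterate([1.0, 0.0, 0.0], 200)          # start on page 1 only
print(x_uniform)
print(x_single)
```

Both starting vectors sum to 1, and because every column of A sums to 1, every iterate sums to 1 as well, so the limit is already scaled as a probability vector.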
8
How to Compute PageRank
Eigenvector method
– x = A*x, so x is the eigenvector of A corresponding to eigenvalue 1
– Fact 1: A always has an eigenvalue of 1
Power method
– Repeat x = A*x until x converges
– The only practical approach for a large n
– Fact 2: 1 is the eigenvalue of A with maximum magnitude, so A^k z is not affected by the choice of z as k increases
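The power method can be sketched without ever forming A, by working from the link list alone. The Python code below is an illustrative sketch, not the `pagerank.m` referred to in the slides: the function name `pagerank_power`, the tolerance, the stopping rule, and the uniform treatment of pages without outgoing links are all assumptions, and the six-page link list is a hypothetical example.

```python
def pagerank_power(links, n, p=0.85, tol=1e-10, max_iter=1000):
    """Approximate the page rank by repeating x = A*x until x stops changing.

    links: list of (j, i) pairs meaning "page j links to page i" (1-based).
    Assumption: a page with no outgoing links jumps uniformly to any page.
    """
    out_degree = [0] * n
    for j, _ in links:
        out_degree[j - 1] += 1

    x = [1.0 / n] * n   # start with equal probability on every page
    for _ in range(max_iter):
        dangling = sum(x[j] for j in range(n) if out_degree[j] == 0)
        linked = sum(x[j] for j in range(n) if out_degree[j] > 0)
        # Share every page receives from random jumps and from dangling pages.
        base = (1 - p) / n * linked + dangling / n
        new_x = [base] * n
        for j, i in links:
            # Share passed along each hyperlink from page j to page i.
            new_x[i - 1] += p * x[j - 1] / out_degree[j - 1]
        change = sum(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return x


# Hypothetical six-page web (an assumption, not taken from the slides).
links = [(1, 2), (1, 6), (2, 3), (2, 4), (3, 4), (3, 5), (3, 6), (4, 1), (6, 1)]
rank = pagerank_power(links, 6)
for page, value in enumerate(rank, start=1):
    print(page, round(value, 4))
```

Each iteration costs one pass over the links, so the work grows with the number of hyperlinks rather than with n^2. In this toy web page 1 ends up with the highest rank.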
9
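Fact 1: Proof
A always has an eigenvalue of 1
– Since every column sum of A is 1, A^T has 1 as an eigenvalue, with the all-ones vector e as its eigenvector: A^T e = e
– So 1 is also an eigenvalue of A, since A and A^T have the same characteristic polynomial: det(A - λI) = det((A - λI)^T) = det(A^T - λI)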
10
Fact 1: Another Proof
Given a square matrix A with each column sum equal to K, prove that K is an eigenvalue of A.
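One possible argument for this exercise is sketched below. It is an assumption about how such a proof could go, not a record of the proof given in the lecture; the symbol e for the all-ones vector is introduced here for convenience.

```latex
% Let e = (1, 1, \dots, 1)^T. "Every column of A sums to K" reads
\[
  e^{T} A = K\, e^{T}
  \quad\Longrightarrow\quad
  e^{T} (A - K I) = 0 .
\]
% So the rows of A - KI add up to the zero vector, which means they are
% linearly dependent and the matrix is singular:
\[
  \det(A - K I) = 0 .
\]
% Hence K is a root of the characteristic polynomial of A, i.e. an
% eigenvalue of A. Taking K = 1 recovers Fact 1.
```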
11
Eigenvalue Decomposition
12
Fact 2
– A has 1 as its eigenvalue of maximum magnitude.
– A^k z approaches the page rank as long as k is big enough and the elements of z sum to 1.
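Why Fact 2 implies convergence can be sketched with an eigenvalue decomposition. The derivation below is a sketch under two assumptions that are not stated on the slide: A is diagonalizable, and 1 is the only eigenvalue of magnitude 1 (which holds when all elements of A are strictly positive). The symbols V, Λ, v_i, α_i and e are introduced here for illustration.

```latex
% Write A = V \Lambda V^{-1} with eigenpairs (\lambda_i, v_i), where
% \lambda_1 = 1 and |\lambda_i| < 1 for i \ge 2. Expand z in eigenvectors:
\[
  z = \sum_{i=1}^{n} \alpha_i v_i
  \quad\Longrightarrow\quad
  A^{k} z = \sum_{i=1}^{n} \alpha_i \lambda_i^{k} v_i
          = \alpha_1 v_1 + \sum_{i=2}^{n} \alpha_i \lambda_i^{k} v_i
  \;\xrightarrow[k \to \infty]{}\; \alpha_1 v_1 .
\]
% Every column of A sums to 1, so with e = (1, \dots, 1)^T we have
% e^T A = e^T, and the sum of the elements is preserved at every step:
\[
  e^{T} A^{k} z = e^{T} z = 1 .
\]
% The limit \alpha_1 v_1 therefore satisfies x = Ax and sums to 1, so it is
% the page rank, independent of the particular z that was chosen.
```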
13
Example
– A tiny web [Figure: web graph with six pages, numbered 1 to 6]
– Transition matrix A
– When p = 0.85, we have the page rank (via pagerank.m):
14
Application Scenario
Team ranking in a sport
– Eigenvalue decomposition for soccer games
Mutual-promotion systems (互推系統)