Lecture #11 PageRank (II)


1 Lecture #11 PageRank (II)
CS492 Special Topics in Computer Science: Distributed Algorithms and Systems Lecture #11 PageRank (II)

2 Reminder: PageRank Algorithm
$PR(A) = (1-d) + d \left( PR(T_1)/C(T_1) + \dots + PR(T_n)/C(T_n) \right) = (1-d) + d \sum_{i=1}^{n} PR(T_i)/C(T_i)$
PR(A) : PageRank of page A
PR(Ti) : PageRank of a page Ti that links to page A
C(Ti) : number of outbound links on page Ti
d : damping factor (between 0 and 1)
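As a minimal sketch of the formula on this slide, the function below evaluates PR(A) once from given PageRank values of the linking pages. The function name `page_rank_of`, the dictionary layout, and the small graph are assumptions made for illustration; they are not from the lecture.

```python
def page_rank_of(page, ranks, out_links, d=0.85):
    """Evaluate PR(page) = (1-d) + d * sum(PR(T)/C(T)) over pages T that link to page."""
    incoming = [t for t, targets in out_links.items() if page in targets]
    return (1 - d) + d * sum(ranks[t] / len(out_links[t]) for t in incoming)

# Hypothetical graph: T1 links to A and X (C(T1) = 2), T2 links only to A (C(T2) = 1).
out_links = {"T1": ["A", "X"], "T2": ["A"], "A": [], "X": []}
ranks = {p: 1.0 for p in out_links}  # assumed current PageRank values

# PR(A) = 0.15 + 0.85 * (1/2 + 1/1)
pr_a = page_rank_of("A", ranks, out_links)
```

T1 splits its rank across two outbound links, so it contributes only half as much to A as T2 does.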

3 Simple Example
$PR(A) = (1-d) + d \sum_{i=1}^{n} PR(T_i)/C(T_i)$, let d = 0.85
[Figure: link graph over pages A, B, C]

4 How to calculate PageRank
$PR(A) = (1-d) + d \cdot PR(C)$
$PR(B) = (1-d) + d \cdot PR(A)/2$
$PR(C) = (1-d) + d \cdot (PR(A)/2 + PR(B))$
Method 1: Solving the equations
  Do the math
Method 2: Iterative computation of PageRank
  The Web is huge, so solving the equations directly is hard
  Compute the PageRank values iteratively instead

5 Solve the equations
Solve these equations (d = 0.85)
$PR(A) = (1-d) + d \cdot PR(C)$
$PR(B) = (1-d) + d \cdot PR(A)/2$
$PR(C) = (1-d) + d \cdot (PR(A)/2 + PR(B))$
Answers
PR(A) ≈ 1.163
PR(B) ≈ 0.644
PR(C) ≈ 1.192
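Method 1 can be sketched as a small linear solve. This assumes the three equations are the general formula applied to a graph where A links to B and C, B links to C, and C links to A, with d = 0.85 (so each equation reads PR = (1-d) + d * contributions). The helper `solve_linear` is a hypothetical hand-written Gauss-Jordan routine, used only to keep the example free of dependencies.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

d = 0.85
# Unknowns ordered as (PR(A), PR(B), PR(C)); each row is one equation
# rearranged so that the right-hand side is 1 - d.
coeffs = [
    [1.0, 0.0, -d],      # PR(A) - d*PR(C)                = 1 - d
    [-d / 2, 1.0, 0.0],  # PR(B) - d*PR(A)/2              = 1 - d
    [-d / 2, -d, 1.0],   # PR(C) - d*(PR(A)/2 + PR(B))    = 1 - d
]
pr_a, pr_b, pr_c = solve_linear(coeffs, [1 - d] * 3)
print(round(pr_a, 3), round(pr_b, 3), round(pr_c, 3))  # 1.163 0.644 1.192
```

With this form of the formula the three values sum to 3, the number of pages.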

6 Iterative Computation of PageRank
Assign an initial PageRank value to every page
Recalculate the PageRank of every page over several iterations
Stop iterating when the PageRank values converge
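The three steps above can be sketched as follows. The initial value of 1.0, the tolerance, and the example graph (A links to B and C, B links to C, C links to A) are assumptions for illustration; the slide does not fix them.

```python
def iterate_pagerank(out_links, d=0.85, tol=1e-8, max_iter=1000):
    """Repeat the PageRank update on all pages until the values stop changing."""
    ranks = {p: 1.0 for p in out_links}  # step 1: initial values (assumed 1.0)
    for iteration in range(1, max_iter + 1):
        new_ranks = {}
        for page in out_links:  # step 2: recompute every page from the old values
            incoming = sum(
                ranks[t] / len(out_links[t]) for t in out_links if page in out_links[t]
            )
            new_ranks[page] = (1 - d) + d * incoming
        change = max(abs(new_ranks[p] - ranks[p]) for p in out_links)
        ranks = new_ranks
        if change < tol:  # step 3: stop once the largest change is below tol
            return ranks, iteration
    return ranks, max_iter

graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks, iterations = iterate_pagerank(graph)
```

Each pass only needs the previous pass's values, which is what makes this approach practical for graphs far too large to solve directly.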

7 What does PageRank mean?
Random surfer
  is given a web page at random and keeps clicking on links (never hits the back button)
  eventually gets bored and starts on another random page
PageRank
  the probability that the random surfer visits a page
  the proportion of time that the random surfer spends on each page
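A rough simulation of the random surfer makes the "proportion of time" reading concrete. This is a sketch under assumptions: the surfer follows a link with probability d and otherwise jumps to a uniformly random page, and the graph (A links to B and C, B links to C, C links to A), step count, and seed are chosen only for illustration.

```python
import random

def simulate_surfer(out_links, d=0.85, steps=200_000, seed=0):
    """Return the fraction of steps the random surfer spends on each page."""
    rng = random.Random(seed)
    pages = sorted(out_links)
    visits = {p: 0 for p in pages}
    current = rng.choice(pages)  # the surfer starts on a random page
    for _ in range(steps):
        visits[current] += 1
        if rng.random() < d and out_links[current]:
            current = rng.choice(out_links[current])  # keep clicking links
        else:
            current = rng.choice(pages)  # bored (or no links): jump to a random page
    return {p: visits[p] / steps for p in pages}

graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
fractions = simulate_surfer(graph)
# Multiplying each fraction by the number of pages gives values on the same
# scale as PR = (1-d) + d * sum(...), whose values sum to the number of pages.
scaled = {p: f * len(graph) for p, f in fractions.items()}
```

The simulated fractions only approximate the exact values, and the approximation tightens as `steps` grows.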

8 What is the damping factor?
$PR(A) = (1-d) + d \sum_{i=1}^{n} PR(T_i)/C(T_i)$
Damping factor d : the probability that, at each page, the random surfer keeps clicking links
(1-d) : the probability that, at each page, the random surfer gets bored and requests another random page
The higher d is, the more likely the random surfer is to keep clicking links
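As a short worked example of what d means for the surfer, take d = 0.85 and assume the decision to continue is made independently at each page (an assumption of this sketch, in line with the random surfer model):

```latex
\begin{align*}
P(\text{follow a link}) &= d = 0.85, \qquad P(\text{jump to a random page}) = 1 - d = 0.15 \\
P(\text{exactly } k \text{ clicks, then a jump}) &= d^{k}\,(1-d) \\
E[\text{clicks before a jump}] &= \sum_{k \ge 0} k\, d^{k}(1-d) = \frac{d}{1-d} = \frac{0.85}{0.15} \approx 5.67
\end{align*}
```

Under these assumptions the surfer follows between five and six links on average before starting over, and raising d lengthens that run.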

9 Loop which acts as a Rank Sink
Rank Sink Problem
What if we don’t have the damping factor?
There is no way to escape the loop (A-B-C), so the loop acts as a rank sink.
[Figure: pages A, B, C forming a loop]
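The effect can be sketched on a small hypothetical graph: a page X links into the loop A, B, C, and nothing links back out. The graph, the starting value of 1.0, and the iteration count are assumptions for illustration; setting d = 1.0 stands in for "no damping factor".

```python
def step(ranks, out_links, d):
    """One synchronous update of PR = (1-d) + d * sum(PR(T)/C(T)) for every page."""
    return {
        page: (1 - d)
        + d * sum(ranks[t] / len(out_links[t]) for t in out_links if page in out_links[t])
        for page in out_links
    }

# Hypothetical graph: X -> A, and the loop A -> B -> C -> A has no link leaving it.
graph = {"X": ["A"], "A": ["B"], "B": ["C"], "C": ["A"]}

def run(d, iterations=60):
    ranks = {p: 1.0 for p in graph}
    for _ in range(iterations):
        ranks = step(ranks, graph, d)
    return ranks

no_damping = run(d=1.0)     # the loop absorbs all rank; X is left with nothing
with_damping = run(d=0.85)  # the (1-d) term keeps some rank on X

loop_total = no_damping["A"] + no_damping["B"] + no_damping["C"]
```

Without damping, the rank X started with flows into the loop and never comes back. The values inside the loop also keep rotating from page to page instead of settling, so the iteration does not converge on this graph.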

10 Dangling Link (Dead End)
A dangling link points to a page with no outgoing links
C→A and B→A are dangling links
A cannot distribute its weight to the network.
How to fix
Method 1: Remove dangling links until all the PageRanks are calculated.
Method 2: Make a random jump to any other page
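Method 2 can be sketched by treating a page with no outgoing links as if it linked to every other page, so its weight is spread evenly instead of being lost. This reading of "random jump to any other page", the function name, and the three-page graph (B and C link to A, A links nowhere) are assumptions for illustration.

```python
def pagerank_with_random_jump(out_links, d=0.85, iterations=100):
    """Iterative PageRank where a dead-end page jumps uniformly to the other pages."""
    pages = list(out_links)
    # Assumed reading of Method 2: a dead end behaves as if it linked to all other pages.
    patched = {
        p: (targets if targets else [q for q in pages if q != p])
        for p, targets in out_links.items()
    }
    ranks = {p: 1.0 for p in pages}
    for _ in range(iterations):
        ranks = {
            page: (1 - d)
            + d * sum(ranks[t] / len(patched[t]) for t in pages if page in patched[t])
            for page in pages
        }
    return ranks

# B -> A and C -> A are dangling links because A has no outgoing links.
graph = {"A": [], "B": ["A"], "C": ["A"]}
ranks = pagerank_with_random_jump(graph)
total = sum(ranks.values())
```

With the patch in place the total rank stays at the number of pages, whereas on the unpatched graph the weight that reaches A is dropped on every iteration.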

11 References
[PBMW] L. Page, S. Brin, R. Motwani, T. Winograd, “The PageRank citation ranking: bringing order to the web,” WWW 1998
[BP98] Sergey Brin, Lawrence Page, “The anatomy of a large-scale hypertextual Web search engine,” Computer Networks and ISDN Systems, Vol. 30,
[BGS05] Monica Bianchini, Marco Gori, Franco Scarselli, “Inside PageRank,” ACM Transactions on Internet Technology, Vol. 5, No. 1, Feb
[LM04] Amy N. Langville, Carl Meyer, “Deeper inside PageRank,” Internet Mathematics, Vol. 1, No. 3,
[K99] Jon Kleinberg, “Authoritative sources in a hyperlinked environment,” Journal of the ACM 46:5 (1999).

