PageRank & Random Walk “The importance of a Web page depends on the reader's interest, knowledge and attitudes…” –By Larry Page, Co-Founder of Google.


1 PageRank & Random Walk “The importance of a Web page depends on the reader's interest, knowledge and attitudes…” –By Larry Page, Co-Founder of Google

2 Presentation Outline
Introduction on PageRank
Calculation of PageRank on Webpage
Original algorithm on PageRank
Modifications of the original algorithm
Result on the modifications
Applications of PageRank

3 Introduction on PageRank
PageRank is a link analysis algorithm … with the purpose of "measuring" its (Webpage) relative importance within the set. (From Wikipedia, the free encyclopedia)
Developed by Larry Page as his PhD research topic.
3 years later, he left Stanford and founded Google with Brin.
He gave up his PhD. In return, his net worth now is …

4 Introduction on PageRank
PageRank = importance of the Webpage
Concept is simple:
[Figure: a page labelled Bloomberg with PageRank = 60; link values of 20 and 10 flow into a page whose PageRank = 20 + 10 = 30]

5 Introduction on PageRank
An example of a Webpage system

6 Calculation of PageRank on Webpage
[Figure: link graph of the example Webpages A, B, C and D]

7 Calculation of PageRank on Webpage
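R(.) = PageRank of a Webpage
$$R(A) = 100\%\,R(B) + 50\%\,R(C)$$
$$R(B) = 50\%\,R(C) + 100\%\,R(D)$$
$$R(C) = 100\%\,R(A)$$
In matrix form:
$$\begin{pmatrix} 1 & -1 & -0.5 & 0 \\ 0 & 1 & -0.5 & -1 \\ -1 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} R(A) \\ R(B) \\ R(C) \\ R(D) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$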

8 Calculation of PageRank on Webpage
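Let A = 50; then B = 25, C = 50 and D = 0.
Normalize the PageRank by dividing each number by 100: A = 0.5, B = 0.25, C = 0.5, D = 0.
Note that A + B + C + D = 125, so these values sum to 1.25; dividing by 125 instead gives A = 0.4, B = 0.2, C = 0.4, D = 0, which sum to 1.
In general: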
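The formula that followed "In general:" is not preserved in this transcript. A plausible reconstruction, assuming the slide used the simplified PageRank definition from Page and Brin's paper, is given here. The symbol $B_u$ (the set of pages that link to page $u$) is notation assumed for illustration; $c$ and $N_v$ are the normalization factor and the number of links on page $v$, as defined on the later slides.

```latex
R(u) = c \sum_{v \in B_u} \frac{R(v)}{N_v}
```

For the four-page example, $B_A = \{B, C\}$ with $N_B = 1$ and $N_C = 2$, which reproduces $R(A) = R(B) + 0.5\,R(C)$ up to the factor $c$.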

9 Calculation of PageRank on Webpage
There are 2 PROBLEMS!
Problem 1: What if there are over 280,000 Webpages, over 3 million hyperlinks and the …?
Problem 2: The PageRank of D = 0. This introduces a bias, and a rank sink may appear.

10 Original algorithm on PageRank
In order to tackle the 2 problems, a calculation algorithm was introduced:
Where:
c - Normalization factor
N_v - No. of links on the page v
E - A factor to tackle rank sink
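The formula image for this slide is missing from the transcript. Assuming it matches the definition in Page and Brin's PageRank paper, it would read as follows, where $B_u$ (the pages linking to $u$) is assumed notation and $E(u)$ is the rank-source term that keeps a rank sink from absorbing all the rank.

```latex
R'(u) = c \sum_{v \in B_u} \frac{R'(v)}{N_v} + c\,E(u)
```

In the paper, $c$ is chosen as large as possible subject to the ranks summing to 1, i.e. $\lVert R' \rVert_1 = 1$.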

11 Original algorithm on PageRank
Multiply $R_k$ by the matrix $A$ to form $R_{k+1}$ (i.e. $A R_k = R_{k+1}$).
$A$ is a square matrix.
$A_{u,v} = 1/N_u$ if there is an edge from u to v.
$A_{u,v} = 0$ if there is no edge from u to v.
$R = cAR$, where $c$ is the eigenvalue and $R$ is the eigenvector.
We can treat c = 1/normalization factor, and R is the PageRank vector.
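The repeated multiplication can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the link structure in `LINKS` is inferred from the equations on slide 7 (the figure itself is not preserved), and the names `LINKS`, `step` and `power_iteration` are made up for this example. Each page u passes R(u)/N_u along every out-link, which is the multiplication by A described above (with A oriented so that rank flows from u to v).

```python
# Out-links of the four example pages, inferred from the slide's equations
# (an assumption: the original figure is not preserved in the transcript).
LINKS = {"A": ["C"], "B": ["A"], "C": ["A", "B"], "D": ["B"]}
PAGES = sorted(LINKS)


def step(rank):
    """One multiplication R_{k+1} = A R_k: page u gives R(u)/N_u to each page it links to."""
    new_rank = {page: 0.0 for page in PAGES}
    for u, targets in LINKS.items():
        share = rank[u] / len(targets)
        for v in targets:
            new_rank[v] += share
    return new_rank


def power_iteration(iterations=100):
    """Start from a uniform vector and apply `step` a fixed number of times."""
    rank = {page: 1.0 / len(PAGES) for page in PAGES}
    for _ in range(iterations):
        rank = step(rank)
    return rank


ranks = power_iteration()
for page in PAGES:
    print(page, round(ranks[page], 4))
```

Because every page in this small graph has at least one out-link, the total rank stays at 1, and the vector settles in the proportion 2 : 1 : 2 : 0, the same proportion as A = 50, B = 25, C = 50, D = 0 on slide 8.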

12 Original algorithm on PageRank
The algorithm is:
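The algorithm listing on this slide is not preserved. The sketch below assumes it follows the iteration given in Page and Brin's paper: start from a vector S, multiply by A, add back the rank that was lost (d) in proportion to E, and stop when the change δ drops below ε. The function name `pagerank`, the three-page graph and the uniform choice of E are illustrative assumptions, not taken from the slides.

```python
def pagerank(links, source, epsilon=1e-10, max_iterations=1000):
    """Iterate R_{i+1} = A R_i + d E until the L1 change is at most epsilon.

    links  : dict mapping each page to the list of pages it links to
    source : dict E giving each page's share of the re-injected rank (sums to 1)
    """
    pages = sorted(links)
    rank = dict(source)  # R_0 <- S (here S = E, an assumption)
    for _ in range(max_iterations):
        new_rank = {page: 0.0 for page in pages}
        for u in pages:
            for v in links[u]:
                new_rank[v] += rank[u] / len(links[u])
        # d = ||R_i||_1 - ||R_{i+1}||_1: rank lost through pages without out-links
        d = sum(rank.values()) - sum(new_rank.values())
        for page in pages:
            new_rank[page] += d * source[page]
        # delta = ||R_{i+1} - R_i||_1
        delta = sum(abs(new_rank[page] - rank[page]) for page in pages)
        rank = new_rank
        if delta <= epsilon:
            break
    return rank


# Hypothetical graph: C has no out-links, so its rank would otherwise be lost.
example_links = {"A": ["B"], "B": ["A", "C"], "C": []}
uniform_source = {page: 1.0 / 3 for page in example_links}
result = pagerank(example_links, uniform_source)
print({page: round(value, 4) for page, value in result.items()})
```

In this sketch the rank that reaches C is redistributed through E on every pass, so the total stays at 1 and no page is stuck at zero merely because of the link structure.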

13 Modifications of the original algorithm
The original algorithm is not efficient, because Webpages with a low PageRank converge faster, while those with a high PageRank take more iterations to converge.

14 Modifications of the original algorithm

15 Modifications of the original algorithm
Main concept: Webpages whose PageRank has already converged can be ignored in later iterations.
Therefore we separate the matrix and the vector into 2 parts:
N = not yet converged; C = converged
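The partitioned formula from the neighbouring slides is not preserved. One way to write the idea, assuming the rows of A and the entries of R are grouped into the not-yet-converged part (N) and the converged part (C), is sketched below; the superscript notation is an assumption made for this illustration.

```latex
R_{k+1} =
\begin{pmatrix} R_{k+1}^{N} \\ R_{k+1}^{C} \end{pmatrix}
=
\begin{pmatrix} A_N \\ A_C \end{pmatrix} R_k
\quad\Longrightarrow\quad
R_{k+1}^{N} = A_N R_k, \qquad R_{k+1}^{C} \approx R_k^{C}
```

Only the rows in $A_N$ have to be multiplied in each iteration; the converged entries are simply carried over.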

16 Modifications of the original algorithm

17 Modifications of the original algorithm
Disadvantage of modification 1: reordering the matrix A is expensive.
Instead, set A_C to 0.
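A rough Python sketch of the "set A_C to 0" idea follows. It is an illustration under stated assumptions rather than the presenter's implementation: rows of converged pages are zeroed in a working copy of the matrix, and their frozen values are added back each iteration, so no reordering is needed. The function name, the `check_every` parameter (convergence is only tested every few iterations, to reduce the risk of freezing a page too early) and the example matrix, inferred from the equations on slide 7, are all assumptions.

```python
def filter_adaptive_pagerank(matrix, tolerance=1e-8, check_every=4, max_iterations=1000):
    """Power iteration that zeroes the rows of pages whose rank has converged.

    matrix[v][u] is the share of page u's rank passed to page v (1/N_u or 0).
    """
    n = len(matrix)
    rank = [1.0 / n] * n
    work = [row[:] for row in matrix]  # working copy; rows of converged pages become 0
    frozen = [0.0] * n                 # converged values, 0 for pages still iterating
    converged = [False] * n
    for k in range(1, max_iterations + 1):
        new_rank = [
            sum(work[v][u] * rank[u] for u in range(n)) + frozen[v]
            for v in range(n)
        ]
        if k % check_every == 0:
            for v in range(n):
                if not converged[v] and abs(new_rank[v] - rank[v]) < tolerance:
                    converged[v] = True
                    frozen[v] = new_rank[v]
                    work[v] = [0.0] * n  # "set A_C to 0" for this page
        rank = new_rank
        if all(converged):
            break
    return rank


# Rows and columns in the order A, B, C, D (inferred from the slide's equations).
EXAMPLE_MATRIX = [
    [0.0, 1.0, 0.5, 0.0],  # A receives all of B and half of C
    [0.0, 0.0, 0.5, 1.0],  # B receives half of C and all of D
    [1.0, 0.0, 0.0, 0.0],  # C receives all of A
    [0.0, 0.0, 0.0, 0.0],  # D receives nothing
]
adaptive_ranks = filter_adaptive_pagerank(EXAMPLE_MATRIX)
print([round(value, 4) for value in adaptive_ranks])
```

A zeroed row costs nothing in a sparse matrix representation, which is where the saving would come from in practice; the dense lists here are only for readability.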

18 Modifications of the original algorithm

19 Result on the modifications

20 Applications of PageRank
Search engines
Type 1: Title search. Finds all the Webpages whose titles contain all of the query words, then sorts the results by PageRank.
Type 2: Google. A full-text search engine using PageRank.
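The title search (Type 1) is simple enough to sketch directly. The page records, their PageRank values and the function name `title_search` below are hypothetical, chosen only to illustrate "match all query words in the title, then sort by PageRank".

```python
def title_search(query, pages):
    """Return the pages whose title contains every query word, highest PageRank first."""
    words = query.lower().split()
    matches = [
        page for page in pages
        if all(word in page["title"].lower().split() for word in words)
    ]
    return sorted(matches, key=lambda page: page["pagerank"], reverse=True)


# Hypothetical index entries; the PageRank values are made up for the example.
sample_pages = [
    {"title": "Stanford University Home Page", "pagerank": 0.30},
    {"title": "University of Stanford Alumni", "pagerank": 0.10},
    {"title": "Google Search", "pagerank": 0.50},
    {"title": "Stanford University Library", "pagerank": 0.20},
]
results = title_search("stanford university", sample_pages)
for page in results:
    print(page["pagerank"], page["title"])
```

Matching decides only which pages are returned; the ordering comes entirely from PageRank, which is why a query-independent rank is enough for this type of search.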

