Published by Shawn Allen. Modified over 6 years ago.
1
PageRank & Random Walk
“The importance of a Web page depends on the reader's interests, knowledge and attitudes…” – By Larry Page, Co-Founder of Google
2
Presentation Outline
Introduction on PageRank
Calculation of PageRank on Webpage
Original algorithm on PageRank
Modifications of the original algorithm
Result on the modifications
Applications of PageRank
3
Introduction on PageRank
"PageRank is a link analysis algorithm … with the purpose of "measuring" its (Webpage) relative importance within the set." – From Wikipedia, the free encyclopedia
Developed by Larry Page as his PhD research topic.
Three years later, he left Stanford and founded Google with Brin.
He gave up his PhD. In return, his net worth now is …
4
Introduction on PageRank
PageRank = Importance of the Webpage
The concept is simple:
[Figure: a page labelled Bloomberg with PageRank = 60; link values 20 and 10; a page with PageRank = 20 + 10 = 30]
5
Introduction on PageRank
An example of a Webpage system
6
Calculation of PageRank on Webpage
[Figure: example link graph; visible node labels: D, B]
7
Calculation of PageRank on Webpage
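R(.) = PageRank of a Webpage
R(A) = 100% R(B) + 50% R(C)
R(B) = 50% R(C) + 100% R(D)
R(C) = 100% R(A)
In matrix form:
\[
\begin{pmatrix} 1 & -1 & -0.5 & 0 \\ 0 & 1 & -0.5 & -1 \\ -1 & 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} A \\ B \\ C \\ D \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}
\]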
8
Calculation of PageRank on Webpage
Let A = 50; then B = 25, C = 50 and D = 0.
Normalize the PageRank by dividing each number by 100. (A+B+C+D = )
Therefore, A = 0.5, B = 0.25, C = 0.5, D = 0
In general:
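The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: it assumes the first equation reads R(A) = 100% R(B) + 50% R(C) and that no page links to D, and the function names `example_ranks` and `normalize` are made up for illustration. With A = 50 the four values add up to 125, so dividing by the sum (rather than by 100) is what yields a vector that adds up to 1, namely 0.4, 0.2, 0.4 and 0.

```python
def example_ranks(rank_a=50.0):
    """Back-substitute through the slide's three equations for a chosen R(A)."""
    rank_d = 0.0                      # assumption: no page links to D
    rank_c = 1.0 * rank_a             # R(C) = 100% R(A)
    rank_b = 0.5 * rank_c + rank_d    # R(B) = 50% R(C) + 100% R(D)
    # R(A) = 100% R(B) + 50% R(C) is the remaining equation; use it as a check.
    assert abs(rank_a - (rank_b + 0.5 * rank_c)) < 1e-12
    return {"A": rank_a, "B": rank_b, "C": rank_c, "D": rank_d}


def normalize(ranks):
    """Scale the ranks so that they add up to 1."""
    total = sum(ranks.values())
    return {page: value / total for page, value in ranks.items()}


ranks = example_ranks()
print(ranks)                 # the un-normalized values for A, B, C, D
print(sum(ranks.values()))   # the sum A + B + C + D
print(normalize(ranks))      # the same ranks rescaled to sum to 1
```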
9
Calculation of PageRank on Webpage
There are 2 PROBLEMS !!!
Problem 1: What if there are over 280,000 Web pages, over 3 million hyperlinks and the …?
Problem 2: The PageRank of D = 0
This biases the ranking
A rank sink may appear
10
Original algorithm on PageRank
In order to tackle the 2 problems, a calculation algorithm was introduced:
Where:
c - Normalization factor
N_v - No. of links on page v
E - A factor to tackle rank sink
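For reference, a formula that fits the three symbols listed here is the PageRank definition from the paper by Page, Brin, Motwani and Winograd. It is reproduced below as an assumption about what the slide showed. The set $B_u$ (the pages that link to $u$) is that paper's notation and is not part of the list above.

```latex
R'(u) = c \sum_{v \in B_u} \frac{R'(v)}{N_v} + c\,E(u)
```

Read term by term: every page $v$ that links to $u$ passes on its rank divided by its number of links $N_v$, $E(u)$ adds a source of rank for page $u$, and $c$ rescales the result.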
11
Original algorithm on PageRank
Multiply R_k by the matrix A to form R_{k+1} (i.e. A R_k = R_{k+1})
A is a square matrix.
A_{u,v} = 1/N_u if there is an edge from u to v.
A_{u,v} = 0 if there is no edge from u to v.
R = cAR, so R is an eigenvector of A (with eigenvalue 1/c)
We can treat c as 1/(normalization factor) and R as the PageRank vector
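A small sketch of how such a matrix could be built and applied, in plain Python. Several things here are assumptions: the edge list is read off the example equations (A→C, B→A, C→A, C→B, D→B), the helper names `build_link_matrix` and `propagate` are invented, and because the matrix is indexed as (source u, target v), the product is taken as R_{k+1}(v) = Σ_u A[u][v] · R_k(u), i.e. with A used in transposed form.

```python
def build_link_matrix(pages, edges):
    """Return a[u][v] = 1/N_u if page u links to page v, else 0."""
    index = {page: i for i, page in enumerate(pages)}
    out_links = {page: 0 for page in pages}
    for source, _ in edges:
        out_links[source] += 1          # N_u: number of links on page u
    n = len(pages)
    a = [[0.0] * n for _ in range(n)]
    for source, target in edges:
        a[index[source]][index[target]] = 1.0 / out_links[source]
    return a


def propagate(a, r):
    """One multiplication step: each page collects rank from its in-links."""
    n = len(r)
    return [sum(a[u][v] * r[u] for u in range(n)) for v in range(n)]


PAGES = ["A", "B", "C", "D"]
# Hypothetical edge list, inferred from the example equations.
EDGES = [("A", "C"), ("B", "A"), ("C", "A"), ("C", "B"), ("D", "B")]
A = build_link_matrix(PAGES, EDGES)

print(propagate(A, [50.0, 25.0, 50.0, 0.0]))  # → [50.0, 25.0, 50.0, 0.0]
```

The vector (50, 25, 50, 0) comes back unchanged, which is what being an eigenvector means here.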
12
Original algorithm on PageRank
The algorithm is:
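One way to sketch such an iterative procedure in Python is shown below. It assumes the loop described in the PageRank paper: multiply, hand any lost rank back through E, and stop once the change drops below a threshold. The function name `pagerank`, the uniform start vector, the threshold `epsilon` and the example matrix are illustrative assumptions, not taken from the slide.

```python
def pagerank(a, e, epsilon=1e-10, max_iter=1000):
    """Iterate R_{k+1} = A R_k (+ d*E) until the L1 change is at most epsilon.

    a[u][v] is 1/N_u if page u links to page v, else 0.
    e is the rank-source vector E.
    """
    n = len(a)
    r = [1.0 / n] * n                       # assumed start: uniform vector
    for _ in range(max_iter):
        nxt = [sum(a[u][v] * r[u] for u in range(n)) for v in range(n)]
        d = sum(r) - sum(nxt)               # rank lost in this step (entries are >= 0)
        nxt = [x + d * e_v for x, e_v in zip(nxt, e)]
        delta = sum(abs(x - y) for x, y in zip(nxt, r))
        r = nxt
        if delta <= epsilon:
            break
    return r


# Example graph (assumed): A→C, B→A, C→A, C→B, D→B, pages ordered A, B, C, D.
A_EXAMPLE = [
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
E_UNIFORM = [0.25, 0.25, 0.25, 0.25]

result = pagerank(A_EXAMPLE, E_UNIFORM)
print([round(x, 6) for x in result])
```

In this small graph every page has at least one outgoing link, so no rank is lost and the E term has practically no effect. The result approaches 0.4, 0.2, 0.4, 0.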
13
Modifications of the original algorithm
The original algorithm is not efficient: Web pages with a low PageRank converge quickly, while those with a high PageRank take longer to converge, yet every page is recomputed in every iteration
15
Modifications of the original algorithm
Main concept: Webpages whose PageRank has already converged can be ignored
Therefore we separate the matrix and the vector into 2 parts
N = not yet converged; C = converged
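The idea can be sketched in Python as follows: only pages in the set N are recomputed, and a page moves to C once its value has stopped changing. The function name, the tolerance `page_tol` and the `patience` rule (a page must be stable for a few consecutive steps before it is frozen) are assumptions added for this sketch. The patience rule guards against freezing a page that merely happens not to change in one step.

```python
def adaptive_pagerank(a, page_tol=1e-9, patience=3, max_iter=1000):
    """Power iteration that stops updating pages once they have converged.

    a[u][v] is 1/N_u if page u links to page v, else 0.
    """
    n = len(a)
    r = [1.0 / n] * n
    active = set(range(n))        # N: pages not yet converged
    calm = [0] * n                # consecutive steps with a negligible change
    for _ in range(max_iter):
        if not active:            # every page is in C: done
            break
        # Recompute only the active pages, all from the previous vector.
        updates = {v: sum(a[u][v] * r[u] for u in range(n)) for v in active}
        for v, new_value in updates.items():
            calm[v] = calm[v] + 1 if abs(new_value - r[v]) <= page_tol else 0
            r[v] = new_value
            if calm[v] >= patience:
                active.discard(v)  # move page v from N to C
    return r


# Example graph (assumed): A→C, B→A, C→A, C→B, D→B, pages ordered A, B, C, D.
A_EXAMPLE = [
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
adaptive_result = adaptive_pagerank(A_EXAMPLE)
print([round(x, 6) for x in adaptive_result])
```

Page D settles at 0 after a few steps and is skipped from then on, while A, B and C keep being updated until they settle near 0.4, 0.2 and 0.4.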
17
Modifications of the original algorithm
Disadvantage of modification 1: reordering matrix A is expensive
Set A_C to 0
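A hedged sketch of what zeroing the converged part could look like, without reordering anything. Here the matrix is indexed as m[v][u] (target row, source column) so that a step is literally "matrix times vector"; the rows of converged pages are set to 0 and their stored values are added back afterwards. The function name `filtered_step` and the example numbers are illustrative assumptions.

```python
def filtered_step(m, r, converged):
    """One step R_{k+1} = A'' R_k + R_C, where A'' has the rows of C zeroed.

    m[v][u] is the share of page u's rank that flows to page v.
    converged is the set of page indices in C.
    """
    n = len(r)
    # A'': same layout as m, but rows of converged pages are all zero.
    m_filtered = [[0.0] * n if v in converged else list(m[v]) for v in range(n)]
    # R_C: the frozen values of converged pages, zero elsewhere.
    kept = [r[v] if v in converged else 0.0 for v in range(n)]
    return [sum(m_filtered[v][u] * r[u] for u in range(n)) + kept[v]
            for v in range(n)]


# Example graph (assumed): A→C, B→A, C→A, C→B, D→B, pages ordered A, B, C, D.
M_EXAMPLE = [
    [0.0, 1.0, 0.5, 0.0],   # A receives from B and C
    [0.0, 0.0, 0.5, 1.0],   # B receives from C and D
    [1.0, 0.0, 0.0, 0.0],   # C receives from A
    [0.0, 0.0, 0.0, 0.0],   # D receives nothing
]
step = filtered_step(M_EXAMPLE, [0.375, 0.375, 0.25, 0.0], converged={2, 3})
print(step)  # → [0.5, 0.125, 0.25, 0.0]
```

With dense lists this saves nothing. The benefit would come from a sparse representation, where zeroed rows no longer contribute stored entries, and from zeroing rows once when a page converges instead of rebuilding the matrix at every step as this sketch does.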
19
Result on the modifications
20
Applications of PageRank
Search engines
Type 1: Title search
Finds all the Webpages whose titles contain all of the query words, then sorts the results by PageRank
Type 2: Google
Full-text search engine using PageRank
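The title search can be illustrated with a few lines of Python. The page records, URLs and PageRank values below are made up for the example, and `title_search` is a hypothetical helper, not part of any real search engine.

```python
def title_search(pages, query):
    """Return pages whose title contains every query word, best PageRank first."""
    words = query.lower().split()
    hits = [
        page for page in pages
        if all(word in page["title"].lower().split() for word in words)
    ]
    return sorted(hits, key=lambda page: page["pagerank"], reverse=True)


# Hypothetical index: titles with precomputed PageRank values.
PAGES = [
    {"url": "a.example", "title": "Random walk on graphs", "pagerank": 0.2},
    {"url": "b.example", "title": "PageRank and the random walk", "pagerank": 0.5},
    {"url": "c.example", "title": "Walking tours", "pagerank": 0.3},
]

print([page["url"] for page in title_search(PAGES, "random walk")])
# → ['b.example', 'a.example']
```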