Published by Amberly Lawson. Modified over 9 years ago.
1
Seminar on Page Ranking Techniques in Search Engines
Phapale Gaurav S. [05 IT 6010]
Guide: Prof. A. Gupta
2
Introduction
Need: There is an increasing need for search engines. Search results should be ordered by relevancy and importance.
What is page ranking?
3
Algorithms
HITS (Hyperlink-Induced Topic Search), e.g. AltaVista
PageRank, e.g. Google
4
Definition: PageRank.
We assume page A has pages T1...Tn which point to it (i.e., are citations). The parameter d is a damping factor, which can be set between 0 and 1; we usually set d to 0.85. C(A) is defined as the number of links going out of page A. The PageRank of page A is given as follows:
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
Ref: Sergey Brin and Lawrence Page, "The Anatomy of a Large-Scale Hypertextual Web Search Engine", http://www-db.stanford.edu/~backrub/google.html
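The formula can be turned into a short program. The following is a minimal sketch, not the implementation used by any search engine: the function name `pagerank`, the dictionary layout of `links`, and the `tol` and `max_iter` stopping parameters are assumptions made for illustration. Every page starts at the same value, and each pass recomputes all values from the previous pass.

```python
def pagerank(links, d=0.85, initial=1.0, tol=1e-9, max_iter=1000):
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) over all pages T linking to A.

    links maps each page name to the list of pages it links to
    (a hypothetical input format chosen for this sketch).
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)

    out_count = {p: len(links.get(p, [])) for p in pages}  # C(p)
    inbound = {p: [] for p in pages}                       # pages pointing to p
    for src, targets in links.items():
        for t in targets:
            inbound[t].append(src)

    pr = {p: initial for p in pages}
    for _ in range(max_iter):
        new = {
            p: (1 - d) + d * sum(pr[t] / out_count[t] for t in inbound[p])
            for p in pages
        }
        delta = max(abs(new[p] - pr[p]) for p in pages)
        pr = new
        if delta < tol:  # values no longer change appreciably
            break
    return pr


# Two pages pointing to each other.
ranks = pagerank({"A": ["B"], "B": ["A"]})
print(ranks)
```

For two pages that link only to each other, both values stay at 1.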
5
How to use the formula.
E.g., two pages A and B, pointing to each other.
[Figure: A and B linking to each other]
6
Start with PR(A) = PR(B) = 1
PR(A) = (1-d) + d * (PR(B)/C(B)) = (1-0.85) + 0.85 * (1/1) = 1
PR(B) = (1-d) + d * (PR(A)/C(A)) = (1-0.85) + 0.85 * (1/1) = 1
7
Let's start with PR(A) = PR(B) = 10.
After the 1st iteration:
PR(A) = (1-d) + d*(PR(B)/C(B)) = 0.15 + 0.85 * (10/1) = 8.65
PR(B) = (1-d) + d*(PR(A)/C(A)) = 0.15 + 0.85 * (8.65/1) = 7.5025 ≈ 7.50
8
After the 2nd iteration:
PR(A) = (1-d) + d*(PR(B)/C(B)) = 0.15 + 0.85 * (7.5025/1) = 6.527
PR(B) = (1-d) + d*(PR(A)/C(A)) = 0.15 + 0.85 * (6.527/1) = 5.698
And so on... until when?
9
Ans: Iterations should be repeated until the PR values converge. In this example, that is until PR(A) = PR(B) = 1.
Thus we can start with any PR values, and should repeat the iterations until the values converge, i.e. no longer change appreciably.
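The hand calculation for the two-page example can be replayed in a few lines. This is a sketch under stated assumptions: the stopping threshold `1e-6` and the cap of 200 passes are illustrative choices, not values from the slides. As in the slides, PR(B) is computed from the freshly updated PR(A) within the same pass.

```python
d = 0.85
pr_a = pr_b = 10.0  # deliberately "wrong" starting values
history = []        # (PR(A), PR(B)) after each pass

for i in range(1, 201):
    old_a, old_b = pr_a, pr_b
    pr_a = (1 - d) + d * (pr_b / 1)  # C(B) = 1
    pr_b = (1 - d) + d * (pr_a / 1)  # C(A) = 1, uses the new PR(A)
    history.append((pr_a, pr_b))
    # Stop once neither value changed by more than the threshold.
    if max(abs(pr_a - old_a), abs(pr_b - old_b)) < 1e-6:
        break

print(f"stopped after {i} passes: PR(A) = {pr_a:.4f}, PR(B) = {pr_b:.4f}")
```

The first two entries of `history` reproduce the 1st and 2nd iterations worked out by hand, and the final values settle close to 1 regardless of the starting point.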
10
Difference between the result of the PR calculation and the Google Toolbar values.
11
Examples
Assumption: We'll take the initial PR value of each page as 1.0.
12
Example 1
[Figure: pages A and B, with no links between them]
PR(A) = (1-d) + d (0) = 0.15
PR(B) = (1-d) + d (0) = 0.15
For practicing examples on PageRank, use the calculator: www.webworkshop.net/pagerank_calculator.php?lnks=2,10,15&iblprs=0.15,0.15,0.15,0.15&pgnms=&pgs=2&initpr=1&its=100&type=simple
13
Example 2
[Figure: page B linking to page A]
First pass, starting from the initial value 1.0:
PR(A) = (1-d) + d (PR(B)/C(B)) = 0.15 + 0.85 (1/1) = 1
PR(B) = (1-d) + d (0) = 0.15
Dangling links are links that go to pages that don't have any outbound links.
Orphan pages are pages that don't have any inbound links.
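The values shown for Example 2 correspond to a single pass from the initial value 1.0. A small sketch of what happens if the iteration is continued, assuming the only link is from B to A (as the formula for PR(A) implies) and an illustrative count of 50 passes:

```python
d = 0.85
pr = {"A": 1.0, "B": 1.0}  # initial PR value of each page

for _ in range(50):
    pr = {
        "A": (1 - d) + d * (pr["B"] / 1),  # B is the only page linking to A, C(B) = 1
        "B": (1 - d),                      # nothing links to B (orphan page)
    }

print(pr)
```

Once PR(B) has dropped to 0.15, PR(A) follows it down to 0.15 + 0.85 * 0.15 = 0.2775 on the next pass and stays there.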
14
Example 3
From here onwards, the final PR values after a sufficient number of iterations are shown inside each page.
[Two figures, each showing: A 1.0, B 1.0, C 1.0]
15
Example 4
Observation: We can channel a large proportion of a site's PR to a particular page.
[Figure: A 1.85, B 0.575, C 0.575]
16
Example 5
Observation: We can reduce PR leak by increasing the internal link structure.
[Figures, values as extracted: C 1.255, A 2.6, B 1.255, External Site 1 1.0, External Site 2 1.215; External Site 1 1.0, A 1.0, B 0.575, C 0.575, External Site 2 0.638]
17
Example 5 (cont.)
[Figure: External Site 1 1.0, A 2.146, B 1.549, C 1.720, External Site 2 1.215]
18
How to increase PR?
By adding spam pages.
By joining forums.
By submitting to search engine directories.
By reciprocating links.
By adding content.
19
Adding spam pages.
[Figure: A 331.0, B 281.6, Spam 1 0.39, Spam 2 0.39, ..., Spam 1000 0.39]
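The spam-page figures can be approximately reproduced with a short script. The link structure is not recoverable from the slide text, so the one used here is an assumption inferred from the numbers: A links to B, B links to each of the 1000 spam pages, and every spam page links back to A. The helper `pagerank` and its `tol` and `max_iter` parameters are likewise illustrative.

```python
def pagerank(links, d=0.85, initial=1.0, tol=1e-9, max_iter=2000):
    """Repeat the PageRank formula until values stop changing.

    links maps every page to the list of pages it links to;
    every page must appear as a key (hypothetical input format).
    """
    pr = {p: initial for p in links}
    for _ in range(max_iter):
        new = {p: 1 - d for p in links}
        for src, targets in links.items():
            for t in targets:
                # src passes an equal share of its PR to each page it links to.
                new[t] += d * pr[src] / len(targets)
        delta = max(abs(new[p] - pr[p]) for p in links)
        pr = new
        if delta < tol:
            break
    return pr


# Assumed structure: A -> B, B -> every spam page, every spam page -> A.
n_spam = 1000
spam = [f"Spam {i}" for i in range(1, n_spam + 1)]
links = {"A": ["B"], "B": spam}
links.update({s: ["A"] for s in spam})

ranks = pagerank(links)
print(round(ranks["A"], 1), round(ranks["B"], 1), round(ranks["Spam 1"], 2))
```

Under this assumed structure the result is close to the values in the figure: A is about 331, B about 281.6, and each spam page about 0.39.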
20
Conclusion.
Even though the formula for calculating PageRank seems difficult, it is easy to understand. But when a simple calculation is applied hundreds of times, the results can seem complicated, and we cannot easily predict the outcome of these iterations. More practice yields more observations.
PageRank is an important factor in Google ranking, but it is only one of several. E.g., Google nowadays pays a lot of attention to a link's anchor text when deciding the relevancy of the target page. Still, because PageRank is one of the important factors, one should be well aware of it while designing a website.
21
References.
http://www.webworkshop.net/pagerank.html
http://www.iprcom.com/papers/pagerank/
http://www-db.stanford.edu/~backrub/google.html
http://www.google.com/intl/en/technology/
http://www.google-watch.org/pagerank.html
22
?
23
Thanks