Roshnika Fernando: PageRank

Presentation transcript:

1 Roshnika Fernando: PageRank

2 Why PageRank?
• The internet is a global system of networks linking to smaller networks.
• This system keeps growing, so there must be a way to sort through all the information available.
• PageRank is the algorithm used by the search engine Google to sort through internet webpages.
• A webpage's rank helps determine the order in which it appears when a keyword search is performed on Google.
• Fun fact: PageRank is named after Larry Page, one of the founders of Google, not after webpages.

3 Popularity Contest
• Rank, at its simplest, is the probability that a webpage will be visited.
• The ranks of all pages sum to 1.
• The rank of a page depends on the ranks of the pages that link to it.
• Initially, every page has rank 1/(total number of pages available), which is approximately 0 for the whole internet.

4 Determining Rank
• Let $P$ be an $n \times n$ stochastic matrix, where $n$ is the number of webpages and $p_{i,j}$ is the probability of going to webpage $j$ from webpage $i$.
• $p_{i,j} = \dfrac{\text{\# of links to page } j \text{ from page } i}{\text{\# of links on page } i}$
• Note: $i$ and $j$ are positive integers from 1 to $n$.
• Note: With around 25 billion webpages on the internet, $P$ has around 25 billion rows and columns.
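The definition of $p_{i,j}$ on slide 4 can be sketched in a few lines of Python. This is only an illustration: the function name `transition_matrix` and the 3-page web are hypothetical, and every page is assumed to have at least one outgoing link.

```python
from fractions import Fraction

def transition_matrix(links):
    """Build P from link lists: links[i] holds the pages that page i links to.

    P[i][j] = (# of links to page j from page i) / (# of links on page i).
    Assumes every page has at least one outgoing link.
    """
    n = len(links)
    P = [[Fraction(0)] * n for _ in range(n)]
    for i, targets in enumerate(links):
        for j in targets:
            P[i][j] += Fraction(1, len(targets))
    return P

# Hypothetical 3-page web: page 0 links to pages 1 and 2,
# page 1 links to page 2, and page 2 links back to page 0.
links = [[1, 2], [2], [0]]
P = transition_matrix(links)
for row in P:
    print([str(p) for p in row])
```

With this convention, row $i$ of $P$ holds the outgoing probabilities of page $i$, so each row sums to 1.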

5 Long Term Probability
• After a very long time, what is the probability that web surfers will be at a certain website?
• Let $x$ be the stationary distribution vector, where $x_k$ is the probability of being at state $k$.
• Since stochastic matrices have eigenvalue $\lambda = 1$, there is a vector $x$ satisfying $x^T P = x^T$.
• Solve for $x$ to determine the long-term probability of being at each webpage (also known as the rank).
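One common way to approximate the long-term probabilities is power iteration: start every page at rank 1/n (slide 3) and repeatedly take one step of the random surfer. The slides do not say which method was used, so treat this as a sketch under that assumption. The function name `power_iteration`, the step count, and the 3-page matrix are hypothetical, and $p_{i,j}$ follows the slide 4 convention (from page $i$ to page $j$).

```python
def power_iteration(P, steps=200):
    """Approximate the stationary distribution of a row-stochastic matrix P."""
    n = len(P)
    x = [1.0 / n] * n  # initial rank: 1 / (total number of pages)
    for _ in range(steps):
        # One step of the surfer: new x[j] = sum over i of x[i] * P[i][j]
        x = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
    return x

# Hypothetical 3-page web (each row sums to 1).
P = [[0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0]]
x = power_iteration(P)
print([round(v, 3) for v in x])  # approximately [0.4, 0.2, 0.4]
```

The iteration settles here because every page in this small web can reach every other page and the walk is not periodic. Those conditions are assumptions of the example, not something the slides state.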

6 Small Scale Example: 7 pages linked to one another

7 Linear Program
• Solve for the vector $x$ using $(P^T - I)x = 0$, together with $\sum_k x_k = 1$, to obtain the PageRank.
• $x$ is the eigenvector of $P^T$ for eigenvalue $\lambda = 1$.
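For a small web, the linear system can be solved exactly. The sketch below assumes the slide 4 convention ($p_{i,j}$ is the probability of going from page $i$ to page $j$), under which the stationary vector satisfies $(P^T - I)x = 0$. Because those equations are redundant, the last one is replaced by the requirement that the ranks sum to 1 (slide 3). The function name `stationary_by_elimination` and the 3-page matrix are hypothetical, and the code assumes the stationary distribution is unique.

```python
from fractions import Fraction

def stationary_by_elimination(P):
    """Solve (P^T - I) x = 0 with sum(x) = 1 by Gauss-Jordan elimination."""
    n = len(P)
    # Augmented matrix: row i is equation i of (P^T - I) x = 0.
    A = [[Fraction(P[j][i]) - (1 if i == j else 0) for j in range(n)]
         + [Fraction(0)] for i in range(n)]
    # Replace the last (redundant) equation with x_1 + ... + x_n = 1.
    A[n - 1] = [Fraction(1)] * n + [Fraction(1)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        # Clear this column from every other row.
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

# Hypothetical 3-page web (each row sums to 1).
P = [[Fraction(0), Fraction(1, 2), Fraction(1, 2)],
     [Fraction(0), Fraction(0), Fraction(1)],
     [Fraction(1), Fraction(0), Fraction(0)]]
x = stationary_by_elimination(P)
print([str(v) for v in x])  # ['2/5', '1/5', '2/5']
```

Exact elimination is only practical for tiny examples like this one. At internet scale, iterative methods are the usual choice.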

8 Small Scale Solution
As $t \to \infty$, the probability of being at each page converges to its PageRank:
$x_1 = 0.304$, $x_2 = 0.166$, $x_3 = 0.141$, $x_4 = 0.105$, $x_5 = 0.179$, $x_6 = 0.045$, $x_7 = 0.061$

9 Sensitivity Analysis
• What if a page has no links? What happens to the probability matrix $P$?
• $P$ is stochastic, meaning each row must sum to 1.
• If page $i$ has no links leading out, row $i$ would be all zeros, so its probability is distributed evenly over all $n$ pages: $p_{i,j} = 1/n$ for every $j$.
• This assumes that when someone reaches a dead end, the page he or she goes to next is chosen entirely at random.
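The dead-end fix can be sketched as follows, assuming the slide 4 convention in which row $i$ of $P$ holds the outgoing probabilities of page $i$. The function name `fix_dangling` and the 3-page matrix are hypothetical.

```python
from fractions import Fraction

def fix_dangling(P):
    """Replace each all-zero row (a page with no outgoing links) by a uniform row."""
    n = len(P)
    return [list(row) if any(row) else [Fraction(1, n)] * n for row in P]

# Hypothetical 3-page web in which page 1 has no outgoing links.
P = [[Fraction(0), Fraction(1), Fraction(0)],
     [Fraction(0), Fraction(0), Fraction(0)],
     [Fraction(1, 2), Fraction(1, 2), Fraction(0)]]
P_fixed = fix_dangling(P)
for row in P_fixed:
    print([str(p) for p in row])
```

After the fix every row sums to 1 again, so the matrix is stochastic and the long-term probabilities are well defined.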

10 Probability and Rank
• The stationary distribution vector contains the rank of each webpage, which helps determine the order in which it appears when a keyword search is performed.
• Each rank is the probability that a person will be at that page, out of the billions of pages available online.
• This takes several powerful computers to compute.

11 Questions?

12 Citations
• Austin, David. "How Google Finds Your Needle in the Web's Haystack." AMS.org. American Mathematical Society. Web. 09 Nov. 2009.
• "PageRank." Wikipedia, the free encyclopedia. Web. 09 Nov. 2009.
• Photograph. PageRanks-Example. Wikipedia, 8 July 2009. Web. 09 Nov. 2009.
• "Stochastic matrix." Wikipedia, the free encyclopedia. Web. 09 Nov. 2009.

