Published by Brice Kennedy. Modified over 7 years ago.
1
Motivation
Modern search engines for the World Wide Web use methods that require solving huge problems. Our aim: to develop multiscale techniques that will work much faster than existing methods.
2
Web Search
3
Web Search in a Nutshell
[Diagram with the stages: Crawlers, Keyword Search, Link Matrix, PageRank, Results, Ranked Results]
4
Interpretation - Random Walk
A monkey clicks randomly on links in its browser. What is the probability that it is on each page after a long time?
5
Problem Definition
The rank of a page is its importance relative to other pages (its probability). Each page “distributes” its own pagerank equally to the pages to which it points.
[Figure: example graph; labels 1/2, 1/3, 1]
6
Problem Definition
[Figure: Pagerank vector and Link Matrix B for the example graph; labels 1/2, 1/3, 1]
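The rule "each page distributes its own pagerank equally to the pages to which it points" can be sketched in code. This is a minimal illustration, assuming pages are numbered from 0 and that column j of B describes the out-links of page j; the function name `link_matrix` and the 4-page web below are hypothetical and are not the graph drawn on the slide.

```python
from fractions import Fraction

def link_matrix(out_links, n):
    """Return the n-by-n link matrix B as a list of rows.

    B[i][j] = 1/outdeg(j) if page j links to page i, else 0.
    Each target list is assumed to contain distinct pages.
    """
    B = [[Fraction(0)] * n for _ in range(n)]
    for j, targets in out_links.items():
        for i in targets:
            # Page j splits its rank equally among its targets.
            B[i][j] = Fraction(1, len(targets))
    return B

# Hypothetical 4-page web: page -> pages it links to.
out_links = {0: [1, 2, 3], 1: [2, 3], 2: [0], 3: [0, 2]}
B = link_matrix(out_links, 4)
for row in B:
    print([str(v) for v in row])
```

Exact fractions are used here only so that the entries print as 1/2, 1/3 and 1; floating-point numbers would work equally well.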
7
Problem Definition (Cont.)
The matrix B may have zero-columns that correspond to pages with no out-links. We call these troublesome pages “dangling pages”.
[Figure: example graph with a dangling page; labels 1/2, 1/3, 1]
8
Problem Definition (Cont.)
Interpretation: If the monkey finds no links on the page, it leaps to some random page on the web.
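One way to express the leap from a dangling page is to replace each zero column of B with a uniform column. The sketch below assumes that "some random page" means every page is equally likely; the function name `fix_dangling` and the 3-page matrix are hypothetical.

```python
def fix_dangling(B):
    """Return a copy of B in which every all-zero column becomes 1/n."""
    n = len(B)
    fixed = [row[:] for row in B]
    for j in range(n):
        if all(B[i][j] == 0 for i in range(n)):
            # Dangling page j: jump to any page with equal probability
            # (uniform jump is an assumption of this sketch).
            for i in range(n):
                fixed[i][j] = 1.0 / n
    return fixed

# Hypothetical 3-page example: page 2 has no out-links, so column 2 is zero.
B = [[0.0, 0.5, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.5, 0.0]]
B_fixed = fix_dangling(B)
for row in B_fixed:
    print(row)
```

After the fix every column sums to 1, so no probability is lost when the monkey lands on a dangling page.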
9
Problem Definition (Cont.)
Still, there might be a group of pages with no out-links leaving the group! We therefore introduce a “fudge factor” 0 < α < 1. Interpretation: With probability 1 − α, the monkey leaps to some random page on the web.
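A common way to write this modification is to mix B with a uniform jump: each entry becomes α·B[i][j] + (1 − α)/n. The sketch below assumes a uniform jump and uses α = 0.85, a commonly quoted value that the slides do not state; `damped_matrix` and the example matrix are hypothetical.

```python
def damped_matrix(B, alpha):
    """Mix B with a uniform random jump taken with probability 1 - alpha."""
    n = len(B)
    jump = (1.0 - alpha) / n
    return [[alpha * B[i][j] + jump for j in range(n)] for i in range(n)]

# Hypothetical example: pages 0 and 1 link only to each other,
# so the group {0, 1} has no links leaving it; page 2 links to both.
B = [[0.0, 1.0, 0.5],
     [1.0, 0.0, 0.5],
     [0.0, 0.0, 0.0]]
G = damped_matrix(B, alpha=0.85)  # alpha value is an assumption
for row in G:
    print(row)
```

Every entry of the resulting matrix is positive, so the monkey can always escape from a closed group of pages.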
10
Problem Definition (Cont.)
B is a stochastic matrix: its entries are nonnegative and each column sums to 1. We seek its eigenvector whose eigenvalue is 1, the largest eigenvalue of B. It is called the principal eigenvector.
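In symbols, the problem can be written as follows. The notation is an assumption: x denotes the pagerank vector, n the number of pages, and the normalization reflects the reading of ranks as probabilities.

```latex
B x = x, \qquad x_i \ge 0, \qquad \sum_{i=1}^{n} x_i = 1
```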
11
Computing the principal eigenvector
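The Power Method (equivalent to Jacobi’s): Starting with a random vector, $x_{initial}$, multiply it repeatedly by B. That is, iterate: $x^{(k+1)} = B x^{(k)}$, with $x^{(0)} = x_{initial}$. This process converges to the principal eigenvector. Iterations are cheap and simple. However, the error shrinks by a factor of roughly $|\lambda_2|/|\lambda_1|$ per iteration, where $\lambda_1$ and $\lambda_2$ are the two largest eigenvalues in magnitude, so convergence may be very slow!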
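The iteration described above can be sketched in a few lines. This is an illustration under stated assumptions, not the presenters' implementation: it starts from the uniform vector rather than a random one (for reproducibility), renormalizes after each step, and stops when successive iterates differ by less than a tolerance `tol`. The 3-page matrix is hypothetical.

```python
def power_method(B, tol=1e-10, max_iter=1000):
    """Repeatedly multiply by B; return (vector, iterations used)."""
    n = len(B)
    x = [1.0 / n] * n  # uniform start (assumption; the slides use a random vector)
    for k in range(1, max_iter + 1):
        # One matrix-vector product: x_new = B x
        x_new = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(x_new)
        x_new = [v / total for v in x_new]  # keep the entries summing to 1
        change = sum(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if change < tol:
            return x, k
    return x, max_iter

# Hypothetical column-stochastic matrix for a 3-page web.
B = [[0.0, 0.5, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 0.5, 0.0]]
x, iterations = power_method(B)
print(x, iterations)
```

For this small matrix the exact principal eigenvector is (4/9, 2/9, 3/9), which gives a simple check on the result.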
12
Power Method (Jacobi’s Method)
7 iterations for a 4-variable problem, and only 3 accurate digits! What will happen with 1M variables?
~1.2 million pages, ~3 million links
Starting value: 0.2500

Iteration   x4       x3       x2       x1
1           0.3333   0.2917   0.1667   0.2083
2           0.3611   0.2639   0.1896   0.1944
3           0.3287   0.2755   0.1852   0.2106
4           0.3457   0.2724   0.1798   0.2022
5           0.3398   0.2725   0.1826   0.2051
6           0.3409   0.2729   0.1816   0.2046
7           0.3409   0.2727   0.1818   0.2045
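If the error is multiplied by roughly |λ2|/|λ1| at each iteration, the number of iterations needed for a given number of accurate digits follows from a logarithm. The sketch below makes that estimate; the function name `iterations_needed` and the sample ratios are hypothetical and are not measured from the problems on the slide.

```python
import math

def iterations_needed(ratio, digits):
    """Estimate iterations for the error to shrink by a factor 10**(-digits)
    when each iteration multiplies the error by `ratio`."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    return math.ceil(digits * math.log(10.0) / -math.log(ratio))

# Hypothetical convergence ratios |lambda_2| / |lambda_1|.
for ratio in (0.5, 0.85, 0.99):
    print(ratio, iterations_needed(ratio, 3))
```

The closer the ratio is to 1, the more iterations each additional digit costs, which is the slowdown the slide warns about.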