
ACCELERATING GOOGLE’S PAGERANK Liz & Steve

Background  When a search query is entered in Google, the relevant results are returned to the user in an order that Google predetermines.  This order is determined by each web page’s PageRank value.  Google’s system of ranking web pages has made it the most widely used search engine available.  The PageRank vector is a stochastic vector that gives a numerical value (0<val<1) to each web page.  To compute this vector, Google uses a matrix denoting links between web pages.

Background Main ideas:  Web pages with the highest number of inlinks should receive the highest rank.  The rank of a page P is to be determined by adding the (weighted) ranks of all the pages linking to P.
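The weighted-inlink rule above can be sketched in a few lines. This is a toy example, not the project’s code; the four-page web and its links are invented for illustration:

```python
# One PageRank-style update on an invented 4-page web:
# each page P receives rank(Q)/outdegree(Q) from every page Q linking to P.
links = {            # page -> list of pages it links to
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}   # start from the uniform vector

new_rank = {p: 0.0 for p in pages}
for q, outs in links.items():
    share = rank[q] / len(outs)   # Q splits its rank evenly among its outlinks
    for p in outs:
        new_rank[p] += share

print(new_rank)   # "C" has the most (weighted) inlinks, so it ranks highest
```

Note that the total rank is conserved: each page redistributes all of its current rank to its outlinks.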

Background  Problem: Compute a PageRank vector that contains an meaningful rank of every web page

Power Method  The PageRank vector is the dominant eigenvector of the matrix H…after modification  Google currently uses the Power Method to compute this eigenvector. However, H is often not suitable for convergence.  Power Method:

Creating a usable matrix  e is a vector of ones and u (for the moment) is an arbitrary probability vector.
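The slide’s formula did not survive the transcript; the standard construction from the PageRank literature, using the e and u above, a damping factor α, and the dangling-node indicator vector a (with a_i = 1 when row i of H is all zeros), is:

```latex
% S repairs the dangling nodes (all-zero rows of H);
% G then mixes in teleportation so the matrix is stochastic, irreducible, and aperiodic.
S = H + a\,u^{T}, \qquad
G = \alpha S + (1-\alpha)\, e\,u^{T}, \qquad 0 < \alpha < 1 .
```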

Using the Power Method  The rate of convergence is |λ2| / |λ1|, where λ1 is the dominant eigenvalue and λ2 is the aptly named subdominant eigenvalue.  For the Google matrix, λ1 = 1 and |λ2| ≤ α, so convergence is geometric with ratio at most the damping factor.

Alternative Methods: Linear Systems

Langville & Meyer’s reordering

Alternative Methods: Iterative Aggregation/Disaggregation (IAD)

IAD Algorithm  Form the aggregated matrix A  Find the stationary vector of A  If the new iterate agrees with the previous one to within tolerance, then stop. Otherwise, disaggregate and repeat.
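The aggregation step can be made concrete. In this sketch the 4-state chain, the partition, and the weight vector are all invented; only the weighted (censored) aggregation formula is shown — the full IAD loop would also solve the small aggregated chain and disaggregate back:

```python
# Aggregation step of IAD: collapse a Markov chain onto groups of states,
# weighting each state inside a group by the current estimate w.
P = [  # invented 4x4 row-stochastic transition matrix
    [0.6, 0.2, 0.1, 0.1],
    [0.3, 0.5, 0.1, 0.1],
    [0.1, 0.1, 0.4, 0.4],
    [0.2, 0.2, 0.3, 0.3],
]
groups = [[0, 1], [2, 3]]        # partition of the states
w = [0.25, 0.25, 0.25, 0.25]     # current estimate of the stationary vector

def aggregate(P, groups, w):
    # A[I][J] = sum_{i in I} (w_i / w_I) * sum_{j in J} P[i][j],
    # where w_I is the total weight of group I.
    A = []
    for I in groups:
        wI = sum(w[i] for i in I)
        A.append([sum((w[i] / wI) * sum(P[i][j] for j in J) for i in I)
                  for J in groups])
    return A

A = aggregate(P, groups, w)   # 2x2 and row-stochastic, because P is
```

Since each row of P sums to 1 and the weights w_i / w_I sum to 1 within a group, each row of the aggregated matrix also sums to 1.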

New Ideas: The Linear System In IAD

New Ideas: Finding and

Functional Codes  Power Method  We duplicated Google’s formulation of the power method in order to have a base time with which to compare our results  A basic linear solver  We used Gauss-Seidel method to solve the very basic linear system:  We also experimented with reordering by row degree before solving the aforementioned system.  Langville & Meyer’s Linear System Algorithm  Used as another time benchmark against our algorithms

Functional Codes (cont’d)  IAD – using the power method to find w1  We used the power method to find the dominant eigenvector of the aggregated matrix A. The rescaling constant, c, is merely the last entry of the dominant eigenvector.  IAD – using a linear system to find w1  We found the dominant eigenvector as discussed earlier, using some new reorderings.

And now… The Winner!  Power Method with preconditioning  Applying a row and column reordering by decreasing degree almost always reduces the number of iterations required to converge.
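The reordering itself is inexpensive; a sketch of the idea, with an invented 4-node link matrix (a symmetric permutation of rows and columns, sorted by decreasing row degree):

```python
# Permute rows and columns of a link matrix by decreasing degree
# (number of nonzeros per row) before running the power method.
H = [               # invented 0/1 adjacency matrix of a 4-page web
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [1, 1, 1, 0],
]
deg = [sum(1 for v in row if v != 0) for row in H]
order = sorted(range(len(H)), key=lambda i: deg[i], reverse=True)

# Apply the SAME permutation to rows and columns so the link structure
# is preserved (only the page labels move).
H_perm = [[H[i][j] for j in order] for i in order]
```

Applying one permutation to both rows and columns means the reordered matrix represents the same web graph, so the computed ranks can simply be permuted back at the end.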

Why this works… 1. The power method converges faster as the magnitude of the subdominant eigenvalue decreases. 2. Tugrul Dayar found that partitioning a matrix so that its off-diagonal blocks are close to 0 forces the dominant eigenvalue of the iteration matrix closer to 0. This is somehow related to the subdominant eigenvalue of the coefficient matrix in the power method.

Decreased Iterations

Decreased Time

Some Comparisons  Datasets: Calif, Stan, CNR, Stan Berk, EU  Measurements per dataset: sample size; interval size; mean time Pwr/Reorder; STD time Pwr/Reorder; mean iterations Pwr/Reorder; STD iterations Pwr/Reorder  Favorable: 100.00%, 98.25%, 100.00%

Future Research  Test with more advanced numerical algorithms for linear systems (Krylov subspace methods and preconditioners, e.g. GMRES, BiCG, ILU)  Test with other reorderings for all methods  Test with larger matrices (find a supercomputer that works)  Attempt a theoretical proof of the decrease in the magnitude of the subdominant eigenvalue as a result of reorderings  Convert codes to low-level languages (C++, etc.)  Decode MATLAB’s spy

Langville & Meyer’s Algorithm

Theorem: Perron–Frobenius  If A is a non-negative irreducible matrix, then  the spectral radius ρ(A) is a positive eigenvalue of A  there is a positive eigenvector associated with ρ(A)  ρ(A) has algebraic and geometric multiplicity 1

The Power Method: Two Assumptions  The complete set of eigenvectors is linearly independent  The eigenvalues can be ordered so that |λ1| > |λ2| ≥ … ≥ |λn|