Iterative Aggregation/Disaggregation (IAD)
By: Jesse Ehlert, Dustin Wells, Li Zhang

Introduction What are we trying to do? We are trying to find a more efficient way than the power method to compute the PageRank vector. How are we going to do this? We are going to use an IAD method from the theory of Markov chains to compute the PageRank vector, applying a power method step within each IAD iteration to improve the approximation.

Markov Chains We will represent the web by a Markov chain. A Markov chain is a stochastic process describing a chain of events. It consists of a set of states S = {s1, ..., sn}; the web pages will be the states. The probability of moving from state si to state sj in one step is pij, so we can represent the chain by a stochastic matrix G with entries pij. A probabilistic vector v is a stationary distribution if v^T = v^T G. This means that the PageRank vector is also a stationary distribution vector of the Markov chain represented by the Google matrix G.
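To make the stationary-distribution condition concrete, here is a minimal Python sketch (not from the original slides; the 3×3 matrix is an invented example) that computes v^T = v^T G with the power method, the baseline that the IAD approach tries to beat:

```python
import numpy as np

# Invented 3x3 row-stochastic example: entry (i, j) is the probability of
# moving from state s_i to state s_j in one step.
G = np.array([[0.0, 0.5, 0.5],
              [0.3, 0.0, 0.7],
              [0.6, 0.4, 0.0]])

v = np.full(3, 1.0 / 3.0)       # arbitrary probabilistic starting vector
for _ in range(200):
    v = v @ G                   # power-method step: v^T <- v^T G

print(v)                        # the stationary distribution: v^T = v^T G
assert np.allclose(v, v @ G)
```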

Aggregation/Disaggregation Approach The main idea for computing the PageRank vector v is to block the matrix G,

G = [ G11  G12 ]
    [ G21  G22 ],   v^T = [ v1^T  v2^T ],

so that the size of the problem is reduced to about the size of one of the diagonal blocks. In fact, (I − G11) is nonsingular. Then we define

L = [ I                    0 ]    D = [ I − G11    0   ]    U = [ I   −(I − G11)^{-1} G12 ]
    [ −G21 (I − G11)^{-1}  I ],       [ 0        I − S ],       [ 0             I        ],

and S to be the stochastic complement

S = G22 + G21 (I − G11)^{-1} G12.

Aggregation/Disaggregation Approach Cont. From the previous slide we can show that I − G = LDU. Thus, since v^T (I − G) = 0 and U is nonsingular, we have v^T L D = 0. From the last equation we can get v2^T = v2^T S, which implies that v2 is a stationary distribution of S. If u2 is the unique stationary distribution of S with u2^T e = 1, then we have v2^T = (v2^T e) u2^T.
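As a numerical sanity check, a hedged sketch (same invented 3×3 example as above; the block size n1 is my choice) that forms the stochastic complement S and verifies v2^T = v2^T S:

```python
import numpy as np

G = np.array([[0.0, 0.5, 0.5],       # same invented example as above
              [0.3, 0.0, 0.7],
              [0.6, 0.4, 0.0]])
n1 = 1                               # size of the leading block G11
G11, G12 = G[:n1, :n1], G[:n1, n1:]
G21, G22 = G[n1:, :n1], G[n1:, n1:]

# Stochastic complement: S = G22 + G21 (I - G11)^{-1} G12
S = G22 + G21 @ np.linalg.solve(np.eye(n1) - G11, G12)
assert np.allclose(S.sum(axis=1), 1.0)   # S is again row-stochastic

v = np.full(3, 1.0 / 3.0)                # stationary distribution of G ...
for _ in range(200):
    v = v @ G                            # ... via the power method
v2 = v[n1:]
assert np.allclose(v2, v2 @ S)           # v2^T = v2^T S, as claimed
```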

Aggregation/Disaggregation Approach Cont. We need to find an expression for v1. Let A be the aggregated matrix associated to G, defined as

A = [ G11         G12 e      ]
    [ u2^T G21    u2^T G22 e ],

where e is the vector of all ones. What we want to do now is find the stationary distribution of A. From v^T L D = 0, we can get v1^T (I − G11) − v2^T G21 = 0. If we rearrange things, we get v1^T = v2^T G21 (I − G11)^{-1}.

Aggregation/Disaggregation Approach Cont. From v2^T = v2^T S, we also have v2^T = (v2^T e) u2^T. Combining the previous three statements, we can get an expression for v1:

v1^T = (v2^T e) u2^T G21 (I − G11)^{-1}.

Theorem 3.20 (Exact Aggregation/Disaggregation) Let u2 be the unique stationary distribution of the stochastic complement S, and let w^T = [ w1^T  w2 ] be the unique stationary distribution of the aggregated matrix A. Then the stationary distribution of G is v^T = [ w1^T  w2 u2^T ].
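The theorem can be checked numerically. Below is a hedged sketch, continuing the invented example, that builds S and A, computes u2 and w, and confirms that [ w1^T  w2 u2^T ] is the stationary distribution of G:

```python
import numpy as np

def stationary(P, steps=500):
    """Stationary distribution of a row-stochastic matrix via the power method."""
    v = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(steps):
        v = v @ P
    return v

G = np.array([[0.0, 0.5, 0.5],
              [0.3, 0.0, 0.7],
              [0.6, 0.4, 0.0]])
n1 = 1
G11, G12 = G[:n1, :n1], G[:n1, n1:]
G21, G22 = G[n1:, :n1], G[n1:, n1:]
e = np.ones(G.shape[0] - n1)

S = G22 + G21 @ np.linalg.solve(np.eye(n1) - G11, G12)
u2 = stationary(S)                       # unique stationary distribution of S

# Aggregated matrix A: keep block 1, collapse block 2 into a single state.
A = np.block([[G11,                 (G12 @ e)[:, None]],
              [(u2 @ G21)[None, :], np.array([[u2 @ G22 @ e]])]])
w = stationary(A)

# Theorem 3.20: v^T = [w1^T, w2 * u2^T] is the stationary distribution of G.
v = np.concatenate([w[:n1], w[n1] * u2])
assert np.allclose(v, v @ G)
```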

Theorem 3.20 Cont. Instead of finding the stationary distribution of G, we have broken the problem down into finding the stationary distributions of two smaller matrices. Problem: forming the matrix S and computing its stationary distribution u2 is very expensive and not very efficient. Solution: use an approximation instead. This leads us to the approximate aggregation matrix.

Approximate Aggregation Matrix We now define the approximate aggregation matrix as

Ã = [ G11         G12 e      ]
    [ ũ2^T G21    ũ2^T G22 e ].

The only difference between this matrix and the exact aggregation matrix A is the last row, where an arbitrary probabilistic vector ũ2 plays the role of the exact stationary distribution u2. In general this approach does not give a very good approximation to the stationary distribution of the original matrix G. To improve the accuracy, we add a power method step.

Approximate Aggregation Matrix Typically, one aggregation/disaggregation step produces only an approximation ṽ ≈ v, so the actual algorithm to be implemented consists of repeated applications of the step above. This gives us an iterative aggregation/disaggregation (IAD) algorithm.

Iterative Aggregation/Disaggregation Algorithm (IAD) using Power Method As you can see from above, we still need to compute w̃, the stationary distribution of the approximate aggregation matrix Ã.
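A hedged sketch of the full iteration on the invented example: build Ã from the current guess ũ2, find its stationary distribution, disaggregate, apply one power-method step to G, and repeat. The tolerance and iteration counts are illustrative choices, not the slides':

```python
import numpy as np

def stationary(P, steps=500):
    v = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(steps):
        v = v @ P
    return v

G = np.array([[0.0, 0.5, 0.5],
              [0.3, 0.0, 0.7],
              [0.6, 0.4, 0.0]])
n1 = 1
n2 = G.shape[0] - n1
G11, G12 = G[:n1, :n1], G[:n1, n1:]
G21, G22 = G[n1:, :n1], G[n1:, n1:]
e = np.ones(n2)

u2t = np.full(n2, 1.0 / n2)          # arbitrary probabilistic guess for u2
v = np.full(G.shape[0], 1.0 / G.shape[0])
for _ in range(100):
    # Aggregation: approximate aggregation matrix built from the current u2t.
    A_approx = np.block([[G11,                  (G12 @ e)[:, None]],
                         [(u2t @ G21)[None, :], np.array([[u2t @ G22 @ e]])]])
    w = stationary(A_approx)
    # Disaggregation, then one power-method step to improve the approximation.
    v_new = np.concatenate([w[:n1], w[n1] * u2t]) @ G
    if np.linalg.norm(v_new - v, 1) < 1e-12:   # stop once the iterates settle
        v = v_new
        break
    v = v_new
    u2t = v[n1:] / v[n1:].sum()      # new guess for u2 from the current iterate

assert np.allclose(v, v @ G)
```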

IAD Cont. First, we use the fact that the rows of G sum to one, G21 e + G22 e = e, to write ũ2^T G22 e = 1 − ũ2^T G21 e, so that we get rid of G22. We then let w̃^T = [ w̃1^T  w̃2 ]. From w̃^T = w̃^T Ã we have:

w̃1^T = w̃1^T G11 + w̃2 ũ2^T G21,   with the normalization w̃1^T e + w̃2 = 1.

IAD Cont. Now we will try to get some sparsity out of G. We will write G like we did before: G = αH + α a u^T + (1 − α) e u^T. From the blocking of G, we block the matrices H, a u^T and e u^T conformably, with

H = [ H11  H12 ]     a = [ a1 ]     u^T = [ u1^T  u2^T ].
    [ H21  H22 ],        [ a2 ],

From here you can see that, for example, G11 = α H11 + (α a1 + (1 − α) e) u1^T and G21 = α H21 + (α a2 + (1 − α) e) u1^T.
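This decomposition is what makes a matrix-free power step possible: v^T G = α v^T H + (α v^T a + (1 − α) v^T e) u^T, so only H, a and u ever need to be stored. A hedged sketch with invented example data:

```python
import numpy as np

def power_step(v, H, a, u, alpha=0.85):
    """One step v^T <- v^T G with G = alpha*H + alpha*a*u^T + (1-alpha)*e*u^T,
    computed without ever forming the dense matrix G."""
    return alpha * (v @ H) + (alpha * (v @ a) + (1.0 - alpha) * v.sum()) * u

# Invented example: page 1 is dangling (zero row in H).
H = np.array([[0.0, 0.5, 0.5],
              [0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
a = np.array([0.0, 1.0, 0.0])        # indicator of dangling pages
u = np.full(3, 1.0 / 3.0)            # teleportation vector

v = np.full(3, 1.0 / 3.0)
for _ in range(100):
    v = power_step(v, H, a, u)
print(v)                             # PageRank vector for the tiny example
```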

IAD Cont. We now take these expressions for G11, G12 and G21 and plug them into the equations for w̃1 and w̃2 above, so that each iteration works with the sparse blocks of H instead of the full blocks of G. For the iterative process of the power method within IAD, we give an arbitrary initial guess and iterate according to the formulas above for the next approximation until our tolerance is reached.

Combine Linear Systems and IAD Before, we had w̃1^T = w̃1^T G11 + w̃2 ũ2^T G21. This can be written as the linear system

w̃1^T (I − G11) = w̃2 ũ2^T G21.
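In code, that is one linear solve per iteration instead of an inner power method. A hedged sketch (the helper name and the normalization bookkeeping are mine): solve y^T (I − G11) = ũ2^T G21, then scale [ y^T  1 ] so it sums to one:

```python
import numpy as np

def aggregation_step_linear(G11, G21, u2t):
    """Hedged sketch: first block of the stationary distribution of A~ via a
    linear system. Solves y^T (I - G11) = u2t^T G21 (transposed below), then
    normalizes [y^T, 1] to a probability vector, giving (w1, w2)."""
    n1 = G11.shape[0]
    y = np.linalg.solve((np.eye(n1) - G11).T, G21.T @ u2t)
    w2 = 1.0 / (y.sum() + 1.0)        # normalization: w1^T e + w2 = 1
    return w2 * y, w2
```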

Combine Linear Systems and IAD Cont. The problem with this is that the matrices G11, G12 and G21 are full, which means the computations at each step are generally very expensive.

Combine Linear Systems and IAD Cont. We will return to the original matrix H to find some sparsity. From the blocking of G, we can look at G11 in more depth to get G11 = α H11 + (α a1 + (1 − α) e) u1^T. We will use the fact that w̃1^T (α a1 + (1 − α) e) is just a scalar to simplify the equation and get

w̃1^T (I − α H11) = γ u1^T + w̃2 ũ2^T G21,   where γ = w̃1^T (α a1 + (1 − α) e).

Note: only the sparse block H11 now appears on the left-hand side.
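One way to exploit this structure, sketched under the blocking assumption above: (I − G11) is sparse-plus-rank-one, so the system can be solved with two solves against the sparse matrix (I − α H11) via the Sherman–Morrison formula. The helper below is my illustration, not a formula from the slides:

```python
import numpy as np

def solve_block1(H11, a1, u1, b, alpha=0.85):
    """Hedged sketch: solve x^T (I - G11) = b^T where
    G11 = alpha*H11 + (alpha*a1 + (1-alpha)*e) u1^T, using Sherman-Morrison so
    only (I - alpha*H11) is solved against (dense solves here for brevity;
    a sparse factorization would be used at web scale)."""
    n1 = H11.shape[0]
    c = alpha * a1 + (1.0 - alpha) * np.ones(n1)   # rank-one column factor
    M = (np.eye(n1) - alpha * H11).T               # transposed sparse part
    y = np.linalg.solve(M, b)                      # first solve
    z = np.linalg.solve(M, u1)                     # second solve
    # Sherman-Morrison: (M - u1 c^T)^{-1} b = y + z (c^T y) / (1 - c^T z)
    return y + z * (c @ y) / (1.0 - c @ z)
```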

Using Dangling Nodes We can reorder H by dangling nodes so that H21 is a matrix of zeros (dangling pages have zero rows in H, so placing them last makes H21 = 0 and H22 = 0, with a1 = 0 and a2 = e). Then G21 = e u1^T, so ũ2^T G21 = u1^T, and our equation from before reduces to:

w̃1^T (I − α H11) = c u1^T   for a scalar c.

We approximate w̃1^T as a suitably normalized solution x^T of x^T (I − α H11) = u1^T. We can show that the remaining entries of the PageRank vector are then recovered from w̃1 and the normalization condition.
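The reordering itself is straightforward: dangling pages are exactly the zero rows of H, so moving them to the end zeroes out the whole second row block. A hedged sketch (function name is mine):

```python
import numpy as np

def reorder_by_dangling(H):
    """Permute H so non-dangling pages come first; dangling pages (zero rows)
    go last, which forces H21 = 0 and H22 = 0 in the blocked matrix."""
    dangling = np.isclose(H.sum(axis=1), 0.0)
    perm = np.concatenate([np.flatnonzero(~dangling), np.flatnonzero(dangling)])
    return H[np.ix_(perm, perm)], perm, int((~dangling).sum())

H = np.array([[0.0, 0.0, 0.0],       # page 0 is dangling
              [0.5, 0.0, 0.5],
              [1.0, 0.0, 0.0]])
H_perm, perm, n1 = reorder_by_dangling(H)
assert not H_perm[n1:, :].any()      # rows of the dangling block are all zero
```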

Linear Systems and IAD Process Combined Now, we combine ideas from IAD and linear systems, with H arranged by dangling nodes, to get the process below:
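Since the slide's process diagram did not survive transcription, here is a hedged end-to-end sketch of how the pieces could fit together under the assumptions above: H ordered with dangling pages last, a linear-system aggregation step for the first block, and a matrix-free power step on the full chain. All names, data and tolerances are invented:

```python
import numpy as np

# Invented 4-page example, already ordered with the dangling pages last.
alpha = 0.85
H = np.array([[0.0, 0.5, 0.5, 0.0],
              [0.0, 0.0, 0.5, 0.5],
              [0.0, 0.0, 0.0, 0.0],   # dangling
              [0.0, 0.0, 0.0, 0.0]])  # dangling
n, n1 = 4, 2
a = np.array([0.0, 0.0, 1.0, 1.0])    # dangling indicator
u = np.full(n, 1.0 / n)               # teleportation vector
G = alpha * H + np.outer(alpha * a + (1 - alpha) * np.ones(n), u)  # checks only

v = np.full(n, 1.0 / n)
for _ in range(200):
    u2t = v[n1:] / v[n1:].sum()
    # Aggregation step as a linear system for the first block (dense solve
    # here; a sparse solver would be used at web scale).
    y = np.linalg.solve((np.eye(n1) - G[:n1, :n1]).T, G[n1:, :n1].T @ u2t)
    w2 = 1.0 / (y.sum() + 1.0)
    v = np.concatenate([w2 * y, w2 * u2t])
    # Matrix-free power step: v^T <- v^T G using only H, a and u.
    v = alpha * (v @ H) + (alpha * (v @ a) + (1 - alpha) * v.sum()) * u
    if np.allclose(v, v @ G, atol=1e-12):
        break

assert np.allclose(v, v @ G)          # v is the PageRank vector of G
```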

Conclusion Instead of finding the stationary distribution of G directly, we have broken the problem down into finding the stationary distributions of smaller matrices, S and A, which together give us the stationary distribution of G. The problem with the exact approach is that forming S is very inefficient, so we used an approximation of its stationary distribution and power method techniques to improve the accuracy. Then we used linear systems along with our iterative aggregation/disaggregation algorithm to find another way of computing the PageRank vector.