CS 584: Systems of Linear Equations

CS 584
• Assignment

Systems of Linear Equations
• A linear equation in n variables has the form

    a_0 x_0 + a_1 x_1 + … + a_{n-1} x_{n-1} = b

• A set of linear equations is called a system.
• A solution to a system is an assignment of values to the variables that satisfies every equation in the system.
• Many scientific and engineering problems take this form.

Solving Systems of Equations
• Many such systems are large.
  – Thousands of equations and unknowns

    a_{0,0} x_0 + a_{0,1} x_1 + … + a_{0,n-1} x_{n-1} = b_0
    a_{1,0} x_0 + a_{1,1} x_1 + … + a_{1,n-1} x_{n-1} = b_1
      ⋮
    a_{n-1,0} x_0 + a_{n-1,1} x_1 + … + a_{n-1,n-1} x_{n-1} = b_{n-1}

Solving Systems of Equations
• A linear system of equations can be represented in matrix form, Ax = b:

    | a_{0,0}    a_{0,1}    …  a_{0,n-1}   |  | x_0     |     | b_0     |
    | a_{1,0}    a_{1,1}    …  a_{1,n-1}   |  | x_1     |  =  | b_1     |
    |    ⋮          ⋮              ⋮       |  |   ⋮     |     |   ⋮     |
    | a_{n-1,0}  a_{n-1,1}  …  a_{n-1,n-1} |  | x_{n-1} |     | b_{n-1} |

Solving Systems of Equations
• Solving a system of linear equations is done in two steps:
  – Reduce the system to upper-triangular form
  – Use back-substitution to find the solution
• These steps are performed on the system in matrix form.
  – Gaussian elimination, etc.

Solving Systems of Equations
• Reduce the system to upper-triangular form
• Use back-substitution

    | a_{0,0}  a_{0,1}  …  a_{0,n-1}   |  | x_0     |     | b_0     |
    | 0        a_{1,1}  …  a_{1,n-1}   |  | x_1     |  =  | b_1     |
    |   ⋮         ⋮            ⋮       |  |   ⋮     |     |   ⋮     |
    | 0        0        …  a_{n-1,n-1} |  | x_{n-1} |     | b_{n-1} |
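Back-substitution solves the triangular system from the last row upward. A minimal C sketch under assumptions not on the slides (a fixed size N, dense row-major storage, nonzero diagonal, and the illustrative name back_substitute); with the unit diagonal produced by the elimination procedure on a later slide, the division by U[i][i] drops out:

    #include <stdio.h>

    #define N 3   /* problem size; fixed here only for the sketch */

    /* Solve Ux = y for x, where U is upper triangular. */
    void back_substitute(double U[N][N], double y[N], double x[N]) {
        for (int i = N - 1; i >= 0; i--) {
            double sum = y[i];
            for (int j = i + 1; j < N; j++)
                sum -= U[i][j] * x[j];   /* subtract the already-solved terms */
            x[i] = sum / U[i][i];        /* solve row i for x[i] */
        }
    }

    int main(void) {
        double U[N][N] = {{2, 1, 1}, {0, 3, 1}, {0, 0, 2}};
        double y[N] = {7, 9, 6}, x[N];
        back_substitute(U, y, x);        /* expected solution: 1, 2, 3 */
        printf("%g %g %g\n", x[0], x[1], x[2]);
        return 0;
    }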

Reducing the System
• Gaussian elimination systematically eliminates variable x_k from equations k+1 to n-1.
  – Reduces the coefficients to zero
• This is done by subtracting an appropriate multiple of the k-th equation from each of equations k+1 to n-1.

Procedure GaussianElimination(A, b, y)
  for k = 0 to n-1
    /* Division Step */
    for j = k+1 to n-1
      A[k,j] = A[k,j] / A[k,k]
    y[k] = b[k] / A[k,k]
    A[k,k] = 1
    /* Elimination Step */
    for i = k+1 to n-1
      for j = k+1 to n-1
        A[i,j] = A[i,j] - A[i,k] * A[k,j]
      b[i] = b[i] - A[i,k] * y[k]
      A[i,k] = 0
    endfor
  endfor
end
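For concreteness, a direct C transcription of the procedure above; a minimal sketch that assumes a fixed size N and, like the pseudocode, does no pivoting (every A[k][k] encountered must be nonzero):

    #define N 3   /* problem size; fixed here only for the sketch */

    /* Reduce A to unit upper-triangular form; y receives the
     * scaled right-hand side. Mirrors the pseudocode: no pivoting. */
    void gaussian_elimination(double A[N][N], double b[N], double y[N]) {
        for (int k = 0; k < N; k++) {
            /* Division step: normalize row k by its pivot */
            for (int j = k + 1; j < N; j++)
                A[k][j] /= A[k][k];
            y[k] = b[k] / A[k][k];
            A[k][k] = 1.0;

            /* Elimination step: zero out column k below the pivot */
            for (int i = k + 1; i < N; i++) {
                for (int j = k + 1; j < N; j++)
                    A[i][j] -= A[i][k] * A[k][j];
                b[i] -= A[i][k] * y[k];
                A[i][k] = 0.0;
            }
        }
    }

Feeding the resulting A and y into the back_substitute sketch above then yields x.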

Parallelizing Gaussian Elim.
• Use domain decomposition
  – Rowwise striping
• The division step requires no communication.
• The elimination step requires a one-to-all broadcast for each equation.
• No agglomeration
• Initially map one row to each processor.
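Under the one-row-per-processor mapping, step k amounts to rank k normalizing its row and broadcasting it to the others. A rough MPI sketch (function and variable names are illustrative; it broadcasts to the whole communicator rather than only the active ranks, and does no pivoting):

    #include <mpi.h>
    #include <string.h>

    /* Rank r owns row r of A (length n) and the entry b_r (*rhs).
     * After the loop, the system is in unit upper-triangular form. */
    void parallel_ge(double *row, double *rhs, int n, int rank) {
        double piv[n], piv_rhs;              /* pivot-row buffer (C99 VLA) */
        for (int k = 0; k < n; k++) {
            if (rank == k) {                 /* division step: local only */
                for (int j = k + 1; j < n; j++)
                    row[j] /= row[k];
                *rhs /= row[k];
                row[k] = 1.0;
                memcpy(piv, row, n * sizeof(double));
                piv_rhs = *rhs;
            }
            /* one-to-all broadcast of the normalized pivot row */
            MPI_Bcast(piv, n, MPI_DOUBLE, k, MPI_COMM_WORLD);
            MPI_Bcast(&piv_rhs, 1, MPI_DOUBLE, k, MPI_COMM_WORLD);
            if (rank > k) {                  /* elimination step */
                for (int j = k + 1; j < n; j++)
                    row[j] -= row[k] * piv[j];
                *rhs -= row[k] * piv_rhs;
                row[k] = 0.0;
            }
        }
    }

As the next slides note, only the active ranks k+1 … n-1 actually need the row; restricting the broadcast to that set is an optimization left out here for clarity.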

Communication Analysis
• Consider the algorithm step by step.
• The division step requires no communication.
• The elimination step requires a one-to-all broadcast.
  – Only broadcast to the other active processors
  – Only broadcast the active elements
• The final computation requires no communication.

Communication Analysis
• One-to-all broadcast
  – log₂ q communication steps
  – q = n − k − 1 active processors
• Message size
  – q active processors
  – q elements required

    T = (t_s + t_w q) log₂ q
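Summing this per-step cost over all n − 1 elimination steps gives the total communication time. A sketch of the resulting bound (the slides give only the per-step term; the summation is not from the slides):

    T_comm = Σ_{k=0}^{n-2} (t_s + t_w (n−k−1)) log₂(n−k−1)
           = Σ_{q=1}^{n-1} (t_s + t_w q) log₂ q
           ≤ (n−1)(t_s + t_w n) log₂ n

so the total communication time is O(n log n) t_s + O(n² log n) t_w.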

Computation Analysis
• Division step
  – q divisions
• Elimination step
  – q multiplications and q subtractions
• Assuming each operation takes equal time → 3q operations per step

Computation Analysis
• In each step, the active processor set is reduced by one, resulting in:

    CompTime = Σ_{k=1}^{n} 3(n − k) = 3n(n − 1)/2

Can we do better?
• The previous version is synchronous, and parallelism is reduced at each step.
• Pipeline the algorithm.
• Run the resulting algorithm on a linear array of processors.
• Communication is nearest-neighbor.
• Results in O(n) steps of O(n) operations.

Pipelined Gaussian Elim.
• Basic assumption: a processor does not need to wait until all processors have received a value before it proceeds.
• Algorithm (see the sketch below):
  – If processor p has data for other processors, send the data to processor p+1.
  – If processor p can do some computation using the data it has, do it.
  – Otherwise, wait to receive data from processor p−1.
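A rough MPI sketch of this pipeline on a linear array, under the same illustrative assumptions as before (one row per rank, no pivoting; names are invented for the sketch). Each pivot row is forwarded to the right neighbor before it is used locally, so no stage waits for a full broadcast to complete:

    #include <mpi.h>
    #include <string.h>

    /* Rank r owns row r (length n) and its right-hand side (*rhs).
     * Pivot rows travel rank 0 -> 1 -> ... -> nprocs-1. */
    void pipelined_ge(double *row, double *rhs, int n, int rank, int nprocs) {
        double piv[n + 1];                   /* pivot row plus its rhs */
        for (int k = 0; k < rank; k++) {
            /* receive pivot row k from the left neighbor */
            MPI_Recv(piv, n + 1, MPI_DOUBLE, rank - 1, k,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            if (rank + 1 < nprocs)           /* forward it before computing */
                MPI_Send(piv, n + 1, MPI_DOUBLE, rank + 1, k, MPI_COMM_WORLD);
            for (int j = k + 1; j < n; j++)  /* eliminate x_k locally */
                row[j] -= row[k] * piv[j];
            *rhs -= row[k] * piv[n];
            row[k] = 0.0;
        }
        /* this row is now the pivot row of step `rank`: normalize it */
        for (int j = rank + 1; j < n; j++)
            row[j] /= row[rank];
        *rhs /= row[rank];
        row[rank] = 1.0;
        if (rank + 1 < nprocs) {             /* inject it into the pipeline */
            memcpy(piv, row, n * sizeof(double));
            piv[n] = *rhs;
            MPI_Send(piv, n + 1, MPI_DOUBLE, rank + 1, rank, MPI_COMM_WORLD);
        }
    }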

Conclusion
• Using a striped partitioning method, it is natural to pipeline the Gaussian elimination algorithm to achieve the best performance.
• Pipelined algorithms work best on a linear array of processors.
  – Or something that can be linearly mapped
• Would it be better to block partition?
  – How would it affect the algorithm?