Conjugate gradient iteration One matrix-vector multiplication per iteration Two vector dot products per iteration Four n-vectors of working storage x 0.

Slides:



Advertisements
Similar presentations
05/11/2005 Carnegie Mellon School of Computer Science Aladdin Lamps 05 Combinatorial and algebraic tools for multigrid Yiannis Koutis Computer Science.
Advertisements

CSE 245: Computer Aided Circuit Simulation and Verification Matrix Computations: Iterative Methods (II) Chung-Kuan Cheng.
CSE 245: Computer Aided Circuit Simulation and Verification
Siddharth Choudhary.  Refines a visual reconstruction to produce jointly optimal 3D structure and viewing parameters  ‘bundle’ refers to the bundle.
CS 240A: Solving Ax = b in parallel Dense A: Gaussian elimination with partial pivoting (LU) Same flavor as matrix * matrix, but more complicated Sparse.
Applied Linear Algebra - in honor of Hans SchneiderMay 25, 2010 A Look-Back Technique of Restart for the GMRES(m) Method Akira IMAKURA † Tomohiro SOGABE.
Least Squares example There are 3 mountains u,y,z that from one site have been measured as 2474 ft., 3882 ft., and 4834 ft.. But from u, y looks 1422 ft.
MATH 685/ CSI 700/ OR 682 Lecture Notes
Solving Linear Systems (Numerical Recipes, Chap 2)
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Parallel Programming in C with MPI and OpenMP Michael J. Quinn.
Systems of Linear Equations
Iterative methods TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AAA A A A A A A A.
Iterative Methods and QR Factorization Lecture 5 Alessandra Nardi Thanks to Prof. Jacob White, Suvranu De, Deepak Ramaswamy, Michal Rewienski, and Karen.
Steepest Decent and Conjugate Gradients (CG). Solving of the linear equation system.
Modern iterative methods For basic iterative methods, converge linearly Modern iterative methods, converge faster –Krylov subspace method Steepest descent.
Solution of linear system of equations
1cs542g-term Notes  Make-up lecture tomorrow 1-2, room 204.
1cs542g-term Notes  Assignment 1 is out (questions?)
1cs542g-term Notes  Assignment 1 is out (due October 5)  Matrix storage: usually column-major.
CSCI 317 Mike Heroux1 Sparse Matrix Computations CSCI 317 Mike Heroux.
Sparse Matrix Algorithms CS 524 – High-Performance Computing.
Symmetric Weighted Matching for Indefinite Systems Iain Duff, RAL and CERFACS John Gilbert, MIT and UC Santa Barbara June 21, 2002.
Avoiding Communication in Sparse Iterative Solvers Erin Carson Nick Knight CS294, Fall 2011.
Sparse Matrix Methods Day 1: Overview Day 2: Direct methods
The Landscape of Ax=b Solvers Direct A = LU Iterative y’ = Ay Non- symmetric Symmetric positive definite More RobustLess Storage (if sparse) More Robust.
Sparse Matrix Methods Day 1: Overview Day 2: Direct methods Nonsymmetric systems Graph theoretic tools Sparse LU with partial pivoting Supernodal factorization.
Sparse Matrix Methods Day 1: Overview Matlab and examples Data structures Ax=b Sparse matrices and graphs Fill-reducing matrix permutations Matching and.
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Parallel Programming in C with MPI and OpenMP Michael J. Quinn.
CS240A: Conjugate Gradients and the Model Problem.
Monica Garika Chandana Guduru. METHODS TO SOLVE LINEAR SYSTEMS Direct methods Gaussian elimination method LU method for factorization Simplex method of.
MATH 685/ CSI 700/ OR 682 Lecture Notes Lecture 6. Eigenvalue problems.
ECE 530 – Analysis Techniques for Large-Scale Electrical Systems Prof. Hao Zhu Dept. of Electrical and Computer Engineering University of Illinois at Urbana-Champaign.
Domain decomposition in parallel computing Ashok Srinivasan Florida State University COT 5410 – Spring 2004.
Conjugate gradients, sparse matrix-vector multiplication, graphs, and meshes Thanks to Aydin Buluc, Umit Catalyurek, Alan Edelman, and Kathy Yelick for.
CS 290H Lecture 12 Column intersection graphs, Ordering for sparsity in LU with partial pivoting Read “Computing the block triangular form of a sparse.
Support-Graph Preconditioning John R. Gilbert MIT and U. C. Santa Barbara coauthors: Marshall Bern, Bruce Hendrickson, Nhat Nguyen, Sivan Toledo authors.
Advanced Computer Graphics Spring 2014 K. H. Ko School of Mechatronics Gwangju Institute of Science and Technology.
Complexity of direct methods n 1/2 n 1/3 2D3D Space (fill): O(n log n)O(n 4/3 ) Time (flops): O(n 3/2 )O(n 2 ) Time and space to solve any problem on any.
The Landscape of Sparse Ax=b Solvers Direct A = LU Iterative y’ = Ay Non- symmetric Symmetric positive definite More RobustLess Storage More Robust More.
Scientific Computing Partial Differential Equations Poisson Equation.
CS 219: Sparse Matrix Algorithms
Administrivia: May 20, 2013 Course project progress reports due Wednesday. Reading in Multigrid Tutorial: Chapters 3-4: Multigrid cycles and implementation.
1 Incorporating Iterative Refinement with Sparse Cholesky April 2007 Doron Pearl.
Elliptic PDEs and the Finite Difference Method
1 Marijn Bartel Schreuders Supervisor: Dr. Ir. M.B. Van Gijzen Date:Monday, 24 February 2014.
Forward modelling The key to waveform tomography is the calculation of Green’s functions (the point source responses) Wide range of modelling methods available.
Parallel Solution of the Poisson Problem Using MPI
CS240A: Conjugate Gradients and the Model Problem.
Case Study in Computational Science & Engineering - Lecture 5 1 Iterative Solution of Linear Systems Jacobi Method while not converged do { }
Direct Methods for Sparse Linear Systems Lecture 4 Alessandra Nardi Thanks to Prof. Jacob White, Suvranu De, Deepak Ramaswamy, Michal Rewienski, and Karen.
CS 290H Administrivia: May 14, 2008 Course project progress reports due next Wed 21 May. Reading in Saad (second edition): Sections
CS 290H 31 October and 2 November Support graph preconditioners Final projects: Read and present two related papers on a topic not covered in class Or,
CS 290H Lecture 15 GESP concluded Final presentations for survey projects next Tue and Thu 20-minute talk with at least 5 min for questions and discussion.
Consider Preconditioning – Basic Principles Basic Idea: is to use Krylov subspace method (CG, GMRES, MINRES …) on a modified system such as The matrix.
Krylov-Subspace Methods - I Lecture 6 Alessandra Nardi Thanks to Prof. Jacob White, Deepak Ramaswamy, Michal Rewienski, and Karen Veroy.
Symmetric-pattern multifrontal factorization T(A) G(A)
The Landscape of Sparse Ax=b Solvers Direct A = LU Iterative y’ = Ay Non- symmetric Symmetric positive definite More RobustLess Storage More Robust More.
Model Problem: Solving Poisson’s equation for temperature
CS 290N / 219: Sparse Matrix Algorithms
A computational loop k k Integration Newton Iteration
Solving Linear Systems Ax=b
CS 290H Administrivia: April 16, 2008
CSE 245: Computer Aided Circuit Simulation and Verification
A robust preconditioner for the conjugate gradient method
Conjugate Gradient Method
~ Least Squares example
Numerical Linear Algebra
Administrivia: November 9, 2009
A computational loop k k Integration Newton Iteration
Ax = b Methods for Solution of the System of Equations (ReCap):
Presentation transcript:

Conjugate gradient iteration One matrix-vector multiplication per iteration Two vector dot products per iteration Four n-vectors of working storage x 0 = 0, r 0 = b, d 0 = r 0 for k = 1, 2, 3,... α k = (r T k-1 r k-1 ) / (d T k-1 Ad k-1 ) step length x k = x k-1 + α k d k-1 approx solution r k = r k-1 – α k Ad k-1 residual β k = (r T k r k ) / (r T k-1 r k-1 ) improvement d k = r k + β k d k-1 search direction

Conjugate gradient: Convergence In exact arithmetic, CG converges in n steps (completely unrealistic!!) Accuracy after k steps of CG is related to: consider polynomials of degree k that are equal to 1 at 0. how small can such a polynomial be at all the eigenvalues of A? Thus, eigenvalues close together are good. Condition number: κ (A) = ||A|| 2 ||A -1 || 2 = λ max (A) / λ min (A) Residual is reduced by a constant factor by O(κ 1/2 (A)) iterations of CG.

The Landscape of Sparse Ax=b Solvers Direct A = LU Iterative y’ = Ay Non- symmetric Symmetric positive definite More RobustLess Storage More Robust More General D

Other Krylov subspace methods Nonsymmetric linear systems: GMRES: for i = 1, 2, 3,... find x i  K i (A, b) that minimizes ||b – Ax i || 2 But, no short recurrence => save old vectors => lots more space (Usually “restarted” every k iterations to use less space.) BiCGStab, QMR, etc.: Two spaces K i (A, b) and K i (A T, b) w/ mutually orthogonal bases Short recurrences => O(n) space, but less robust Convergence and preconditioning more delicate than CG Active area of current research Eigenvalues: Lanczos (symmetric), Arnoldi (nonsymmetric)

Preconditioners Suppose you had a matrix B such that: 1.condition number κ (B -1 A) is small 2.By = z is easy to solve Then you could solve (B -1 A)x = B -1 b instead of Ax = b Actually (B -1/2 AB -1/2 ) (B 1/2 x) = B -1/2 b, but never mind… B = A is great for (1), not for (2) B = I is great for (2), not for (1) Domain-specific approximations sometimes work B = diagonal of A sometimes works Better: blend in some direct-methods ideas...

Preconditioned conjugate gradient iteration x 0 = 0, r 0 = b, d 0 = B -1 r 0, y 0 = B -1 r 0 for k = 1, 2, 3,... α k = (y T k-1 r k-1 ) / (d T k-1 Ad k-1 ) step length x k = x k-1 + α k d k-1 approx solution r k = r k-1 – α k Ad k-1 residual y k = B -1 r k preconditioning solve β k = (y T k r k ) / (y T k-1 r k-1 ) improvement d k = y k + β k d k-1 search direction One matrix-vector multiplication per iteration One solve with preconditioner per iteration

Incomplete Cholesky factorization (IC, ILU) Compute factors of A by Gaussian elimination, but ignore fill Preconditioner B = R T R  A, not formed explicitly Compute B -1 z by triangular solves (in time nnz(A)) Total storage is O(nnz(A)), static data structure Either symmetric (IC) or nonsymmetric (ILU) x ARTRT R

Incomplete Cholesky and ILU: Variants Allow one or more “levels of fill” unpredictable storage requirements Allow fill whose magnitude exceeds a “drop tolerance” may get better approximate factors than levels of fill unpredictable storage requirements choice of tolerance is ad hoc Partial pivoting (for nonsymmetric A) “Modified ILU” (MIC): Add dropped fill to diagonal of U or R A and R T R have same row sums good in some PDE contexts

Incomplete Cholesky and ILU: Issues Choice of parameters good: smooth transition from iterative to direct methods bad: very ad hoc, problem-dependent tradeoff: time per iteration (more fill => more time) vs # of iterations (more fill => fewer iters) Effectiveness condition number usually improves (only) by constant factor (except MIC for some problems from PDEs) still, often good when tuned for a particular class of problems Parallelism Triangular solves are not very parallel Reordering for parallel triangular solve by graph coloring

Matrix permutations for iterative methods Symmetric matrix permutations don’t change the convergence of unpreconditioned CG Symmetric matrix permutations affect the quality of an incomplete factorization – poorly understood, controversial Often banded (RCM) is best for IC(0) / ILU(0) Often min degree & nested dissection are bad for no-fill incomplete factorization but good if some fill is allowed Some experiments with orderings that use matrix values e.g. “minimum discarded fill” order sometimes effective but expensive to compute

Sparse approximate inverses Compute B -1  A explicitly Minimize || A B -1 – I || F (in parallel, by columns) Variants: factored form of B -1, more fill,.. Good: very parallel, seldom breaks down Bad: effectiveness varies widely AB -1

Nonsymmetric matrix permutations for iterative methods Dynamic permutation: ILU with row or column pivoting E.g. ILUTP (Saad), Matlab “luinc” More robust but more expensive than ILUT Static permutation: Try to increase diagonal dominance E.g. bipartite weighted matching (Duff & Koster) Often effective; no theory for quality of preconditioner Field is not well understood, to say the least

Row permutation for heavy diagonal Row permutation for heavy diagonal [Duff, Koster] Represent A as a weighted, undirected bipartite graph (one node for each row and one node for each column) Find matching (set of independent edges) with maximum product of weights Permute rows to place matching on diagonal Matching algorithm also gives a row and column scaling to make all diag elts =1 and all off-diag elts <= A PA

Complexity of direct methods n 1/2 n 1/3 2D3D Space (fill): O(n log n)O(n 4/3 ) Time (flops): O(n 3/2 )O(n 2 ) Time and space to solve any problem on any well- shaped finite element mesh

Complexity of linear solvers 2D3D Dense Cholesky:O(n 3 ) Sparse Cholesky:O(n 1.5 )O(n 2 ) CG, exact arithmetic: O(n 2 ) CG, no precond: O(n 1.5 )O(n 1.33 ) CG, modified IC0: O(n 1.25 )O(n 1.17 ) CG, support trees: O(n 1.20 ) -> O(n 1+ )O(n 1.75 ) -> O(n 1+ ) Multigrid:O(n) n 1/2 n 1/3 Time to solve model problem (Poisson’s equation) on regular mesh