CSE 245: Computer Aided Circuit Simulation and Verification
Matrix Computations: Iterative Methods (II)
Chung-Kuan Cheng
Outline
- Introduction
- Direct Methods
- Iterative Methods
  - Formulations
  - Projection Methods
  - Krylov Space Methods
  - Preconditioned Iterations
- Multigrid Methods
- Domain Decomposition Methods
Introduction
- Direct method: LU decomposition. General and robust, but can become complicated when N >= 1M.
- Iterative methods: Jacobi, Gauss-Seidel, Conjugate Gradient, GMRES, Multigrid, Domain Decomposition, Preconditioning. Conjugate Gradient is an excellent choice for SPD matrices; for arbitrary matrices the choice of iterative method remains an art.
Formulation
- Error in the A-norm: matrix A is SPD (symmetric and positive definite).
- Minimal residual: matrix A can be arbitrary.
Formulation: Error in the A-norm
Minimize E(x) = (1/2) x^T A x - b^T x, where A is SPD.
If we knew the solution A x* = b, then E(x) = (1/2) (x - x*)^T A (x - x*) up to an additive constant, so minimizing E(x) minimizes the error in the A-norm.
For the Krylov-space approach, the search space is x = x_0 + V y, where x_0 is an initial solution, the n-by-m matrix V is given and has full rank, and the vector y contains the m variables.
Solution and Error
Minimize E(x) = (1/2) x^T A x - b^T x over the search space x = x_0 + V y, with residual r = b - A x.
Derivation:
1. V^T A V is nonsingular (A is SPD and V has full rank).
2. The variables: y = (V^T A V)^{-1} V^T r_0 (derived on the next slide).
3. Thus the solution is x = x_0 + V (V^T A V)^{-1} V^T r_0.
Properties of the solution:
1. Residual: V^T r = 0 (proved two slides below).
2. Error: E(x) = E(x_0) - (1/2) r_0^T V (V^T A V)^{-1} V^T r_0.
Solution and Error (derivation of y)
Minimize E(x) = (1/2) x^T A x - b^T x over x = x_0 + V y, with r = b - A x.
Use the condition dE/dy = 0, which gives V^T A V y + V^T A x_0 - V^T b = 0.
Thus y = (V^T A V)^{-1} V^T (b - A x_0) = (V^T A V)^{-1} V^T r_0.
Solution and Error (proof that V^T r = 0)
Proof: V^T r = V^T (b - A x) = V^T (b - A x_0 - A V (V^T A V)^{-1} V^T r_0)
= V^T r_0 - (V^T A V)(V^T A V)^{-1} V^T r_0 = 0.
The residual of the solution is orthogonal to the bases of the search space.
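To make the projection formulas concrete, here is a small NumPy sketch (the matrix, right-hand side, and basis V are randomly generated illustrations, not data from the lecture) that forms y = (V^T A V)^{-1} V^T r_0 and checks that the resulting residual is orthogonal to V.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3

# Build a small SPD matrix A and a right-hand side b (illustrative data).
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
b = rng.standard_normal(n)

x0 = np.zeros(n)                   # initial guess
r0 = b - A @ x0                    # initial residual
V = rng.standard_normal((n, m))    # any full-rank n-by-m search basis

# y = (V^T A V)^{-1} V^T r0 and x = x0 + V y, as derived on the slides.
y = np.linalg.solve(V.T @ A @ V, V.T @ r0)
x = x0 + V @ y

r = b - A @ x
print(np.linalg.norm(V.T @ r))     # ~0: the residual is orthogonal to the basis V
```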
Transformation of the bases
For any V' = V W, where V (n-by-m) is given and W (m-by-m) is nonsingular, the solution x is the same over the search space x = x_0 + V' y.
Proof:
1. V'^T A V' is nonsingular (A is SPD and V' has full rank).
2. x = x_0 + V' (V'^T A V')^{-1} V'^T r_0 = x_0 + V (V^T A V)^{-1} V^T r_0.
3. V'^T r = W^T V^T r = 0.
4. E(x) = E(x_0) - (1/2) r_0^T V (V^T A V)^{-1} V^T r_0.
Steepest Descent: Error in the A-norm
Minimize E(x) = (1/2) x^T A x - b^T x. The gradient is -r = A x - b, so search along the residual: set x = x_0 + y r_0 with scalar y.
1. r_0^T A r_0 > 0 (A is SPD).
2. y = (r_0^T A r_0)^{-1} r_0^T r_0.
3. x = x_0 + r_0 (r_0^T A r_0)^{-1} r_0^T r_0.
4. r_0^T r = 0.
5. E(x) = E(x_0) - (1/2) (r_0^T r_0)^2 / (r_0^T A r_0).
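A minimal NumPy sketch of this steepest-descent iteration, repeated until the residual is small; the function name, tolerance, and small test system are illustrative assumptions.

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=1000):
    """Steepest descent for SPD A: each step exactly minimizes E along r."""
    x = x0.copy()
    for _ in range(max_iter):
        r = b - A @ x                      # current residual (negative gradient)
        rr = r @ r
        if np.sqrt(rr) < tol:
            break
        y = rr / (r @ (A @ r))             # y = (r^T r) / (r^T A r)
        x = x + y * r                      # move along the residual direction
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(steepest_descent(A, b, np.zeros(2)))
```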
Lanczos: Error in the A-norm
Minimize E(x) = (1/2) x^T A x - b^T x and set x = x_0 + V y, where
1. v_1 = r_0 (normalized to unit length),
2. v_i lies in the Krylov space K_i(A, r_0),
3. V = [v_1, v_2, ..., v_m] is orthonormal,
4. A V = V H_m + h_{m+1,m} v_{m+1} e_m^T,
5. V^T A V = H_m.
Note: since A is SPD, H_m = T_m is tridiagonal.
Lanczos: Derived from the Arnoldi Process
Arnoldi process: A V = V H + h_{m+1,m} v_{m+1} e_m^T.
Input: A and r_0. Output: V = [v_1, v_2, ..., v_m], H, and h_{m+1,m}, v_{m+1}.
  k = 0, h_{1,0} = |r_0|_2
  while h_{k+1,k} != 0 and k <= m:
      v_{k+1} = r_k / h_{k+1,k}
      k = k + 1
      r_k = A v_k
      for i = 1, ..., k:
          h_{i,k} = v_i^T r_k
          r_k = r_k - h_{i,k} v_i
      end
      h_{k+1,k} = |r_k|_2
  end
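Below is a runnable NumPy version of the Arnoldi process sketched above; the function name and the use of a small tolerance (instead of an exact zero test) for detecting breakdown are assumptions.

```python
import numpy as np

def arnoldi(A, r0, m, tol=1e-12):
    """Build orthonormal V and upper Hessenberg H with A V[:, :k] = V[:, :k+1] @ H[:k+1, :k]."""
    n = len(r0)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = r0 / np.linalg.norm(r0)
    for k in range(m):
        w = A @ V[:, k]                       # expand the Krylov space
        for i in range(k + 1):                # orthogonalize against previous v_i
            H[i, k] = V[:, i] @ w
            w = w - H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] < tol:                 # breakdown: Krylov space is invariant
            return V[:, :k + 1], H[:k + 1, :k]
        V[:, k + 1] = w / H[k + 1, k]
    return V, H
```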
Lanczos: Create an orthonormal V = [v_1, v_2, ..., v_m]
Input: A and r_0. Output: V and the tridiagonal T = V^T A V.
  Initialize: k = 0, beta_0 = |r_0|_2, v_0 = 0
  while k < m and beta_k != 0:
      v_{k+1} = r_k / beta_k
      k = k + 1
      a_k = v_k^T A v_k
      r_k = A v_k - a_k v_k - beta_{k-1} v_{k-1}
      beta_k = |r_k|_2
  end
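A NumPy sketch of the three-term Lanczos recurrence above for symmetric A; storing the full V and stopping on a small tolerance are illustrative choices (the sketch assumes r_0 is nonzero).

```python
import numpy as np

def lanczos(A, r0, m, tol=1e-12):
    """Return orthonormal V and tridiagonal T = V^T A V for symmetric A (r0 != 0 assumed)."""
    n = len(r0)
    V = np.zeros((n, m))
    a = np.zeros(m)                  # diagonal entries a_k
    beta = np.zeros(m)               # off-diagonal entries beta_k
    r = r0.copy()
    b_prev = np.linalg.norm(r)       # beta_0 = |r_0|_2
    v_prev = np.zeros(n)
    k = 0
    while k < m and b_prev > tol:
        V[:, k] = r / b_prev                         # v_{k+1} = r_k / beta_k
        w = A @ V[:, k]
        a[k] = V[:, k] @ w                           # a_k = v_k^T A v_k
        r = w - a[k] * V[:, k] - b_prev * v_prev     # three-term recurrence
        v_prev = V[:, k]
        b_prev = np.linalg.norm(r)                   # beta_k = |r_k|_2
        beta[k] = b_prev
        k += 1
    T = np.diag(a[:k]) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
    return V[:, :k], T
```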
Lanczos: T = V^T A V is tridiagonal
T_{ii} = a_i, T_{i,i+1} = T_{i+1,i} = beta_i, and T_{ij} = 0 otherwise.
Proof sketch: by induction, v_j^T A v_i = 0 if |i - j| > 1, beta_i if j = i + 1, and a_i if j = i.
Since the v_i are orthonormal (v_i^T v_j = 0 for i != j) and the recurrence gives A v_i = beta_{i-1} v_{i-1} + a_i v_i + beta_i v_{i+1}, we have
v_j^T A v_i = beta_{i-1} v_j^T v_{i-1} + a_i v_j^T v_i + beta_i v_j^T v_{i+1},
which is nonzero only when j is one of i - 1, i, i + 1.
Conjugate Gradient: formulation
Minimize E(x) = (1/2) x^T A x - b^T x and set x = x_0 + V y, where
1. v_1 = r_0,
2. v_i lies in the Krylov space K_i(A, r_0),
3. V = [v_1, v_2, ..., v_m] is chosen orthogonal in the A-norm, i.e. V^T A V = diag(v_i^T A v_i) = D.
Then
4. x = x_0 + V D^{-1} V^T r_0,
5. x = x_0 + sum_{i=1..m} d_i v_i v_i^T r_0, where d_i = (v_i^T A v_i)^{-1}.
Conjugate Gradient Method: motivation
Steepest descent may repeat search directions, zig-zagging toward the solution. Why take exactly one step along each direction? (The slide's figure shows the search directions of the steepest descent method.)
Orthogonal vs. A-orthogonal
Instead of making the search directions orthogonal, we make them A-orthogonal (conjugate).
Conjugate Search Directions
How do we construct A-orthogonal search directions from a set of n linearly independent vectors? Since the residual vectors in the steepest descent method are orthogonal, they are a good candidate set to start with.
Conjugate Gradient via Lanczos
Minimize E(x) = (1/2) x^T A x - b^T x and set x = x_0 + V y.
Use the Lanczos method to derive V and T; then x = x_0 + V T^{-1} V^T r_0.
Decompose T = L D L^T (with U = L^T). Thus
x = x_0 + V (L D L^T)^{-1} V^T r_0 = x_0 + V L^{-T} D^{-1} L^{-1} V^T r_0.
Conjugate Gradient Algorithm
Given x_0, iterate until the residual norm is smaller than the error tolerance.
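The algorithm itself appears only as a figure in the original slide, so the following is a standard unpreconditioned conjugate gradient sketch consistent with the derivation above; the tolerance and iteration cap are assumptions.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Conjugate gradient for SPD A: A-orthogonal search directions built from residuals."""
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.copy()
    max_iter = n if max_iter is None else max_iter
    r = b - A @ x
    d = r.copy()                         # first search direction is the residual
    rr = r @ r
    for _ in range(max_iter):
        if np.sqrt(rr) < tol:
            break
        Ad = A @ d
        alpha = rr / (d @ Ad)            # exact minimization along d
        x = x + alpha * d
        r = r - alpha * Ad
        rr_new = r @ r
        d = r + (rr_new / rr) * d        # keep directions A-orthogonal
        rr = rr_new
    return x
```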
Conjugate Gradient: Convergence
In exact arithmetic, CG converges in n steps (completely unrealistic in floating point!).
The accuracy after k steps of CG is related to a polynomial question: among polynomials of degree k that equal 1 at the origin, how small can such a polynomial be at all the eigenvalues of A? Eigenvalues clustered together are good.
Condition number: kappa(A) = ||A||_2 ||A^{-1}||_2 = lambda_max(A) / lambda_min(A) for SPD A.
The residual is reduced by a constant factor in O(kappa(A)^{1/2}) iterations of CG.
Preconditioners
Suppose you had a matrix B such that
1. the condition number kappa(B^{-1} A) is small, and
2. By = z is easy to solve.
Then you could solve (B^{-1} A) x = B^{-1} b instead of A x = b.
B = A is great for (1) but not for (2); B = I is great for (2) but not for (1).
Domain-specific approximations sometimes work, and B = diagonal of A sometimes works.
Better: blend in some direct-method ideas...
Preconditioned Conjugate Gradient Iteration
  x_0 = 0, r_0 = b, d_0 = B^{-1} r_0, y_0 = B^{-1} r_0
  for k = 1, 2, 3, ...
      alpha_k = (y_{k-1}^T r_{k-1}) / (d_{k-1}^T A d_{k-1})   (step length)
      x_k = x_{k-1} + alpha_k d_{k-1}                          (approximate solution)
      r_k = r_{k-1} - alpha_k A d_{k-1}                        (residual)
      y_k = B^{-1} r_k                                         (preconditioner solve)
      beta_k = (y_k^T r_k) / (y_{k-1}^T r_{k-1})               (improvement)
      d_k = y_k + beta_k d_{k-1}                               (search direction)
One matrix-vector multiplication and one preconditioner solve per iteration.
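A direct NumPy transcription of the iteration above, using the diagonal of A as the preconditioner B (one of the simple choices mentioned on the previous slide); the stopping test and iteration cap are assumptions.

```python
import numpy as np

def pcg(A, b, tol=1e-10, max_iter=200):
    """Preconditioned CG with B = diag(A) (Jacobi preconditioner)."""
    n = len(b)
    Binv = 1.0 / np.diag(A)              # solving By = r is elementwise here
    x = np.zeros(n)
    r = b.copy()                         # r0 = b since x0 = 0
    y = Binv * r                         # y0 = B^{-1} r0
    d = y.copy()                         # d0 = y0
    yr = y @ r
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        Ad = A @ d
        alpha = yr / (d @ Ad)            # step length
        x = x + alpha * d                # approximate solution
        r = r - alpha * Ad               # residual
        y = Binv * r                     # preconditioner solve
        yr_new = y @ r
        d = y + (yr_new / yr) * d        # new search direction
        yr = yr_new
    return x
```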
Other Krylov Subspace Methods
Nonsymmetric linear systems:
- GMRES: for i = 1, 2, 3, ..., find x_i in K_i(A, b) such that r_i = A x_i - b is orthogonal to K_i(A, b). There is no short recurrence, so all old vectors must be saved, which costs much more space (GMRES is usually restarted every k iterations to limit storage).
- BiCGStab, QMR, etc.: use the two spaces K_i(A, b) and K_i(A^T, b) with mutually orthogonal bases. Short recurrences give O(n) space, but the methods are less robust; convergence and preconditioning are more delicate than for CG. This is an active area of current research.
Eigenvalues: Lanczos (symmetric), Arnoldi (nonsymmetric).
Formulation: Residual
Minimize |r|_2 = |b - A x|_2 for an arbitrary square matrix A, i.e. minimize R(x) = (b - A x)^T (b - A x).
Search space: x = x_0 + V y, where x_0 is an initial solution, the n-by-m matrix V holds m basis vectors of the subspace K, and the vector y contains the m variables.
Solution: Residual
Minimize R(x) = (b - A x)^T (b - A x) over the search space x = x_0 + V y.
1. V^T A^T A V is nonsingular if A is nonsingular and V has full rank.
2. y = (V^T A^T A V)^{-1} V^T A^T r_0.
3. x = x_0 + V (V^T A^T A V)^{-1} V^T A^T r_0.
4. V^T A^T r = 0.
5. R(x) = R(x_0) - r_0^T A V (V^T A^T A V)^{-1} V^T A^T r_0.
Steepest Descent: Residual
Minimize R(x) = (b - A x)^T (b - A x). The gradient is -2 A^T (b - A x) = -2 A^T r, so search along A^T r_0: let x = x_0 + y A^T r_0, i.e. V = A^T r_0.
1. V^T A^T A V is nonsingular if A is nonsingular.
2. y = (V^T A^T A V)^{-1} V^T A^T r_0.
3. x = x_0 + V (V^T A^T A V)^{-1} V^T A^T r_0.
4. V^T A^T r = 0.
5. R(x) = R(x_0) - r_0^T A V (V^T A^T A V)^{-1} V^T A^T r_0.
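A sketch of this minimal-residual steepest-descent step, with the single search direction v = A^T r_0; the example matrix and the fixed number of sweeps are illustrative.

```python
import numpy as np

def residual_steepest_step(A, b, x0):
    """One minimal-residual step along v = A^T r0 for a general square, nonsingular A."""
    r0 = b - A @ x0
    v = A.T @ r0                 # search direction (gradient direction of R)
    Av = A @ v
    y = (Av @ r0) / (Av @ Av)    # y = (v^T A^T r0) / (v^T A^T A v)
    return x0 + y * v

A = np.array([[2.0, 1.0], [-1.0, 3.0]])   # nonsymmetric example
b = np.array([1.0, 0.0])
x = np.zeros(2)
for _ in range(50):
    x = residual_steepest_step(A, b, x)
print(x, np.linalg.norm(b - A @ x))       # residual shrinks toward zero
```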
GMRES: Residual
Minimize R(x) = (b - A x)^T (b - A x) over the Krylov search space x = x_0 + V y, where
1. v_1 = r_0 / |r_0|_2,
2. v_i lies in the Krylov space K_i(A, r_0),
3. V = [v_1, v_2, ..., v_m] is orthonormal (built by the Arnoldi process),
4. A V = V H_m + h_{m+1,m} v_{m+1} e_m^T = V_{m+1} Hbar_m, where Hbar_m is the (m+1)-by-m Hessenberg matrix.
Then
5. x = x_0 + V (V^T A^T A V)^{-1} V^T A^T r_0
6.   = x_0 + V (Hbar_m^T Hbar_m)^{-1} Hbar_m^T e_1 |r_0|_2.
Conjugate Residual
Minimize R(x) = (b - A x)^T (b - A x) over x = x_0 + V y, where
1. v_1 = r_0,
2. v_i lies in the Krylov space K_i(A, r_0),
3. V is chosen so that (A V)^T (A V) = D, a diagonal matrix (the directions are A^T A-orthogonal).
Then
4. x = x_0 + V (V^T A^T A V)^{-1} V^T A^T r_0 = x_0 + V D^{-1} V^T A^T r_0.
Outline
- Iterative methods
  - Stationary iterative methods (SOR, Gauss-Seidel, Jacobi)
  - Krylov methods (CG, GMRES)
- Multigrid method
What is multigrid?
A multilevel iterative method to solve A x = b. It originated in PDEs on geometric grids. Extending the multigrid idea to unstructured problems gives algebraic multigrid (AMG). We use geometric multigrid to present the basic ideas of the method.
The model problem
(Figure: a grid of nodes v_1, ..., v_8 and v_s, giving the linear system A x = b.)
Simple iterative methods
An iteration produces the sequence x^(0) -> x^(1) -> ... -> x^(k).
Jacobi iteration in matrix form: x^(k) = R_J x^(k-1) + C_J.
General stationary form: x^(k) = R x^(k-1) + C.   (1)
Stationary point: x* = R x* + C.   (2)
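A NumPy sketch of the Jacobi iteration in this matrix form, with R_J = I - D^{-1} A and C_J = D^{-1} b built explicitly from the diagonal D of A; the model matrix, tolerance, and iteration cap are assumptions.

```python
import numpy as np

def jacobi(A, b, x0=None, tol=1e-10, max_iter=5000):
    """Stationary Jacobi iteration x^(k) = R_J x^(k-1) + C_J with R_J = I - D^{-1} A."""
    D = np.diag(A)
    R = np.eye(len(b)) - A / D[:, None]    # R_J = I - D^{-1} A
    C = b / D                              # C_J = D^{-1} b
    x = np.zeros_like(b, dtype=float) if x0 is None else x0.copy()
    for _ in range(max_iter):
        x_new = R @ x + C
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# 1D Poisson-like model problem (diagonally dominant, so Jacobi converges).
A = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
b = np.array([1.0, 0.0, 1.0])
print(jacobi(A, b))
```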
Error and Convergence
Definitions: error e = x* - x (3) and residual r = b - A x (4).
Relation between e and r: A e = r (5), from (3) and (4).
From (1), (2), and (3): e^(1) = x* - x^(1) = R x* + C - R x^(0) - C = R e^(0).
Error equation: e^(k) = R^k e^(0).   (6)
Convergence: e^(k) -> 0 for every e^(0) exactly when the spectral radius of R is less than 1.
Errors of different frequencies
Wavenumber k corresponds to frequency k/n. High-frequency error is more oscillatory between grid points. (Figure: error modes for k = 1, 2, 4.)
Smoothing iterations and error frequency
Smoothing iterations reduce high-frequency error efficiently, but not low-frequency error. (Figure: error versus iterations for the modes k = 1, 2, 4.)
Multigrid: a first glance
Two levels: a coarse grid and a fine grid. On the fine grid (spacing h, nodes 1-8) we solve A_h x_h = b_h; on the coarse grid (spacing 2h, nodes 1-4) we solve A_2h x_2h = b_2h. (Figure: the two grids.)
Idea 1: the V-cycle (nested) iteration
Start with the coarse-grid problem A_2h x_2h = b_2h and iterate on it; prolongation maps the result to the fine grid as an initial guess; then iterate on A_h x_h = b_h to get x_h. Restriction maps fine-grid quantities back to the coarse grid.
Question 1: why do we need the coarse grid?
Prolongation
The prolongation (interpolation) operator I_2h^h maps a coarse-grid vector to the fine grid: x_h = I_2h^h x_2h. (Figure: interpolation from coarse nodes 1-4 to fine nodes 1-8.)
Restriction
The restriction operator I_h^2h maps a fine-grid vector to the coarse grid: x_2h = I_h^2h x_h. (Figure: restriction from fine nodes 1-8 to coarse nodes 1-4.)
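To illustrate the two operators on a 1D grid, the sketch below builds the standard linear-interpolation prolongation and full-weighting restriction as explicit matrices; the grid sizes and the factor 1/2 relating R to P^T are common textbook choices rather than values taken from the slides.

```python
import numpy as np

def prolongation_1d(nc):
    """Linear interpolation from nc coarse interior points to 2*nc+1 fine points."""
    nf = 2 * nc + 1
    P = np.zeros((nf, nc))
    for j in range(nc):
        P[2 * j + 1, j] = 1.0          # fine point sitting on top of coarse point j
        P[2 * j, j] = 0.5              # left fine neighbor gets half the value
        P[2 * j + 2, j] = 0.5          # right fine neighbor gets half the value
    return P

nc = 3
P = prolongation_1d(nc)                # x_h = P x_2h   (prolongation I_2h^h)
R = 0.5 * P.T                          # x_2h = R x_h   (full-weighting restriction I_h^2h)

x2h = np.array([1.0, 2.0, 1.0])
print(P @ x2h)                         # interpolated fine-grid vector
print(R @ (P @ x2h))                   # restricted back to the coarse grid
```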
Smoothing
The basic iterations at each level: on grid ph the iteration maps x_ph^old to x_ph^new. The iteration reduces the error and makes the remaining error geometrically smooth, so it is called smoothing.
Why multilevel?
Coarse-level iteration is cheap. More than this: coarse-level smoothing reduces the error more efficiently than fine-level smoothing, in a certain sense. Why? (Question 2)
Error restriction
Mapping the error to the coarse grid makes it more oscillatory relative to the grid spacing. (Figure: a k = 4 mode at frequency pi/2 on the fine grid appears at frequency pi on the coarse grid.)
Idea 2: residual correction
Given the current solution x, solving A x = b is equivalent to solving the residual equation A e = r. Multigrid does NOT map x directly between levels; it maps the residual equation to the coarse level:
1. Calculate the fine-grid residual r_h.
2. b_2h = I_h^2h r_h (restriction).
3. Solve (approximately) for x_2h on the coarse grid.
4. e_h = I_2h^h x_2h (prolongation).
5. Correct: x_h = x_h + e_h.
Why residual correction?
The error is smooth at the fine level after smoothing, but the actual solution may not be. Prolongation then produces a smooth fine-level vector, which is supposed to be a good approximation of the (smooth) fine-level error. If instead we prolongated the solution itself, which need not be smooth on the fine level, prolongation would introduce more high-frequency error.
Revised V-cycle with idea 2
1. Smooth on x_h (fine level).
2. Calculate r_h.
3. b_2h = I_h^2h r_h (restriction).
4. Smooth on x_2h (coarse level).
5. e_h = I_2h^h x_2h (prolongation).
6. Correct: x_h = x_h + e_h.
What is A_2h?
The Galerkin condition: A_2h = I_h^2h A_h I_2h^h, i.e. prolongate a coarse vector to the fine grid, apply A_h, and restrict back to the coarse grid.
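Putting the pieces together, here is a two-grid correction sketch that combines weighted-Jacobi smoothing, 1D linear interpolation and full weighting, and the Galerkin coarse operator A_2h = R A_h P; the 1D Poisson model problem, the damping factor 2/3, and three smoothing sweeps are illustrative assumptions.

```python
import numpy as np

def weighted_jacobi(A, b, x, sweeps=3, w=2.0 / 3.0):
    """Smoother: damped Jacobi sweeps reduce high-frequency error."""
    D = np.diag(A)
    for _ in range(sweeps):
        x = x + w * (b - A @ x) / D
    return x

def prolongation_1d(nc):
    """Linear interpolation from nc coarse points to 2*nc+1 fine points."""
    nf = 2 * nc + 1
    P = np.zeros((nf, nc))
    for j in range(nc):
        P[2 * j + 1, j] = 1.0
        P[2 * j, j] = 0.5
        P[2 * j + 2, j] = 0.5
    return P

def two_grid(A, b, x, P, R):
    """One two-grid cycle: smooth, restrict residual, coarse solve, correct, smooth."""
    A2h = R @ A @ P                      # Galerkin coarse operator A_2h = R A_h P
    x = weighted_jacobi(A, b, x)         # pre-smoothing on the fine grid
    r = b - A @ x                        # fine-grid residual
    e2h = np.linalg.solve(A2h, R @ r)    # coarse residual equation (solved exactly here)
    x = x + P @ e2h                      # prolongate and correct
    return weighted_jacobi(A, b, x)      # post-smoothing

# 1D Poisson model problem with 2*nc+1 fine interior points.
nc = 7
nf = 2 * nc + 1
A = 2.0 * np.eye(nf) - np.eye(nf, k=1) - np.eye(nf, k=-1)
b = np.ones(nf)
P = prolongation_1d(nc)
R = 0.5 * P.T

x = np.zeros(nf)
for _ in range(10):
    x = two_grid(A, b, x, P, R)
print(np.linalg.norm(b - A @ x))         # residual shrinks rapidly with each cycle
```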
Going to multiple levels
V-cycle and W-cycle; Full Multigrid (FMG) V-cycle. (Figure: cycle diagrams over the levels h, 2h, 4h, 8h.)
Performance of Multigrid
Complexity comparison:
  Gaussian elimination     O(N^2)
  Jacobi iteration         O(N^2 log)
  Gauss-Seidel             O(N^2 log)
  SOR                      O(N^(3/2) log)
  Conjugate gradient       O(N^(3/2) log)
  Multigrid (iterative)    O(N log)
  Multigrid (FMG)          O(N)
Summary of MG ideas
Important ideas of multigrid:
1. Hierarchical iteration.
2. Residual correction.
3. Galerkin condition.
4. Smoothing the error: high frequency on the fine grid, low frequency on the coarse grid.
AMG: for unstructured grids
A x = b with no regular grid structure; the fine grid is defined from the graph of A. (Figure: a graph with nodes 1-6.)
Three questions for AMG
1. How to choose the coarse grid?
2. How to define the smoothness of errors?
3. How are interpolation (prolongation) and restriction done?
How to choose the coarse grid
Idea: C/F splitting.
- Use as few coarse-grid points as possible.
- For each F-node, at least one of its neighbors is a C-node.
- Choose nodes with strong coupling to other nodes as C-nodes.
(Figure: a graph with nodes 1-6.)
How to define the smoothness of error
AMG's fundamental concept: smooth error means small residuals, ||r|| << ||e||.
How are prolongation and restriction done?
Prolongation is based on smooth error and strong connections. Common practice: an interpolation operator I (the formula on the original slide is not reproduced in this transcript).
AMG Prolongation (2)
(The formulas on this slide are not reproduced in the transcript.)
AMG Prolongation (3)
Restriction: (formula not reproduced in the transcript).
Summary
Multigrid is a multilevel iterative method. Its main advantage is scalability. If no geometric grid is available, try the algebraic multigrid (AMG) method.
The landscape of solvers
(Figure: a chart with direct methods, A = LU, versus iterative methods, y' = Ay, on one axis and symmetric positive definite versus nonsymmetric matrices on the other; direct methods are more robust, iterative methods need less storage if the matrix is sparse, and nonsymmetric solvers are more general.)
References
G.H. Golub and C.F. Van Loan, Matrix Computations, 4th Edition, Johns Hopkins University Press, 2013.
Y. Saad, Iterative Methods for Sparse Linear Systems, Second Edition, SIAM, 2003.