1 Incorporating Iterative Refinement with Sparse Cholesky April 2007 Doron Pearl
2 Robustness of Cholesky Recall the Cholesky algorithm: it performs many additions and subtractions, so cancellation and round-off errors accumulate. Sparse Cholesky with symbolic factorization provides high performance, but what about accuracy and robustness?
3 Test case: IPM All IPM (interior-point method) implementations solve a system of linear equations, A D A^T x = b, in each step. As the iterates approach the optimum, the matrix A D A^T usually becomes ill-conditioned.
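The growth in condition number can be illustrated with a toy model. The sketch below is an assumption for illustration only, not taken from the slides: the sizes `m` and `n`, the random matrix `A`, and the scaling model (a few diagonal entries of D growing like 1/mu while the rest shrink like mu as the barrier parameter mu goes to zero) are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 12                       # hypothetical sizes: m constraints, n variables
A = rng.standard_normal((m, n))    # random constraint matrix (full row rank almost surely)

def normal_matrix_cond(mu):
    """Condition number of A D A^T for a toy diagonal scaling D(mu)."""
    d = np.full(n, mu)             # most entries of D shrink like mu
    d[:3] = 1.0 / mu               # a few entries grow like 1/mu
    M = A @ np.diag(d) @ A.T       # the matrix A D A^T from the slide
    return np.linalg.cond(M)

# Condition numbers for a decreasing sequence of mu values.
conds = [normal_matrix_cond(mu) for mu in (1.0, 1e-2, 1e-4)]
for mu, c in zip((1.0, 1e-2, 1e-4), conds):
    print(f"mu = {mu:g}: cond(A D A^T) = {c:.2e}")
```

In this model the spread between the largest and smallest entries of D widens as mu decreases, and the condition number of A D A^T grows with it.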
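4 Sparse Ax=b solvers
  Methods: Direct (A = LU) vs. Iterative (y' = Ay). Direct is more robust; iterative needs less storage.
  Matrices: Nonsymmetric vs. Symmetric positive definite. Symmetric positive definite is more robust; nonsymmetric is more general.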
5 Iterative Refinement A technique for improving a computed solution to a linear system Ax = b. The residual r is computed in higher precision. x_2 should be more accurate than x_1 (why?)
Algorithm
  0. Solve A x_1 = b by some method (LU/Cholesky)
  1. Compute the residual r = b - A x_1
  2. Solve A d = r for the correction d
  3. Update the solution x_2 = x_1 + d
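One way to approach the "why?" is to look at the idealized case where the correction equation is solved exactly. The derivation below is a sketch under that assumption:

```latex
\begin{align*}
r &= b - A x_1, \qquad A d = r, \\
A x_2 &= A (x_1 + d) = A x_1 + (b - A x_1) = b .
\end{align*}
```

In floating point, d is only computed approximately, so x_2 is not exact. As long as the residual is computed accurately and the solver delivers at least some correct digits in d, each pass is expected to reduce the error, which is why the residual is formed in higher precision.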
6 Iterative Refinement
  1. L L^T = chol(A)        % Cholesky factorization (SINGLE) O(n^3)
  2. x = L^T \ (L \ b)      % forward and back solve (SINGLE) O(n^2)
  3. r = b - A x            % residual (DOUBLE) O(n^2)
  4. while ( ||r|| not small enough )   % stopping criterion
     4.1 d = L^T \ (L \ r)  % reuse the Cholesky factor on the residual (SINGLE) O(n^2)
     4.2 x = x + d          % new solution (DOUBLE) O(n)
     4.3 r = b - A x        % new residual (DOUBLE) O(n^2)
COST: (SINGLE) O(n^3) + #ITER * (DOUBLE) O(n^2)
My implementation is available here:
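The loop above can be sketched in Python with NumPy and SciPy. This is a minimal dense illustration, not the implementation referred to on the slide: the function name `refine_cholesky`, the relative-residual stopping test, and the `tol` and `max_iter` parameters are assumptions introduced here.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def refine_cholesky(A, b, tol=1e-12, max_iter=50):
    """Mixed-precision iterative refinement (sketch; names are hypothetical)."""
    A64 = np.asarray(A, dtype=np.float64)
    b64 = np.asarray(b, dtype=np.float64)

    # Factor once in single precision: the O(n^3) step.
    factor = cho_factor(A64.astype(np.float32))

    # Initial solve in single precision, then store the solution in double.
    x = cho_solve(factor, b64.astype(np.float32)).astype(np.float64)

    # Residual in double precision.
    r = b64 - A64 @ x
    iters = 0
    while np.linalg.norm(r) > tol * np.linalg.norm(b64) and iters < max_iter:
        # Correction from the existing single-precision factor: O(n^2).
        d = cho_solve(factor, r.astype(np.float32))
        x = x + d.astype(np.float64)   # update in double
        r = b64 - A64 @ x              # new residual in double
        iters += 1
    return x, iters
```

The `max_iter` cap matters: for sufficiently ill-conditioned matrices the single-precision factor may be too inaccurate for the loop to converge, so the iteration count should be checked by the caller.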
7 Convergence rate of IR [Figure: four convergence plots.] n=40, Cond. #: 3.2*; n=60, Cond. #: 1.6*; n=80, Cond. #: 5*; n=100, Cond. #: 1.4*
8 Convergence rate of IR [Figure: ||Err||_2 versus iteration.] N=250, condition number: 1.9* … For N>350, cond. # = 1.6*10^11: no convergence.
9 More Accurate [Figure, two panels: conventional Gaussian elimination; with extra-precise iterative refinement.]
10 Conjugate Gradient in a nutshell Iterative method for solving Ax=b with A symmetric positive definite. Minimizes the quadratic function f(x) = (1/2) x^T A x - b^T x + c. Chooses search directions that are conjugate (A-orthogonal) to each other. In exact arithmetic, converges after at most n iterations. To be efficient, however, CG needs a good preconditioner, and none is available for the general case.
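The link between minimizing f and solving the linear system can be made explicit. The short derivation below assumes A is symmetric positive definite; the symbols p_i and p_j for search directions are introduced here for illustration.

```latex
\begin{align*}
f(x) &= \tfrac{1}{2}\, x^T A x - b^T x + c, \\
\nabla f(x) &= A x - b \quad \text{(using } A = A^T\text{)}, \\
\nabla f(x^\ast) = 0 &\iff A x^\ast = b, \\
p_i^T A\, p_j &= 0 \quad (i \neq j) \quad \text{(conjugacy of search directions)}.
\end{align*}
```

Because A is positive definite, f is strictly convex, so the stationary point is the unique minimizer.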
11 Conjugate Gradient
  One matrix-vector multiplication per iteration
  Two vector dot products per iteration
  Four n-vectors of working storage

  x_0 = 0, r_0 = b, p_0 = r_0
  for k = 1, 2, 3, ...
    α_k = (r_{k-1}^T r_{k-1}) / (p_{k-1}^T A p_{k-1})   step length
    x_k = x_{k-1} + α_k p_{k-1}                          approx solution
    r_k = r_{k-1} - α_k A p_{k-1}                        residual
    β_k = (r_k^T r_k) / (r_{k-1}^T r_{k-1})              improvement
    p_k = r_k + β_k p_{k-1}                              search direction
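The recurrence above translates almost line for line into NumPy. The sketch below is unpreconditioned and assumes A is symmetric positive definite; the function name `conjugate_gradient`, the relative-residual stopping test, and the `tol` and `max_iter` parameters are assumptions added for illustration.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Unpreconditioned CG for symmetric positive definite A (sketch)."""
    b = np.asarray(b, dtype=float)
    n = b.shape[0]
    max_iter = n if max_iter is None else max_iter

    x = np.zeros(n)          # x_0 = 0
    r = b.copy()             # r_0 = b
    p = r.copy()             # p_0 = r_0
    rs_old = r @ r           # r_{k-1}^T r_{k-1}
    b_norm = np.linalg.norm(b)

    for k in range(max_iter):
        if np.sqrt(rs_old) <= tol * b_norm:
            return x, k                   # converged after k iterations
        Ap = A @ p                        # the one matrix-vector product
        alpha = rs_old / (p @ Ap)         # step length
        x = x + alpha * p                 # approximate solution
        r = r - alpha * Ap                # residual
        rs_new = r @ r
        beta = rs_new / rs_old            # improvement
        p = r + beta * p                  # next search direction
        rs_old = rs_new
    return x, max_iter

# Small usage example on a 2x2 SPD system.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x, iters = conjugate_gradient(A, b)
```

The four working vectors are `x`, `r`, `p`, and `Ap`, matching the storage count on the slide.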
