ML: Multilevel Preconditioning Package
Trilinos User’s Group Meeting
Wednesday, October 15, 2003
Jonathan Hu

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.
Outline
  What is ML?
  Basic multigrid concepts
  Configuring and building ML
  Interoperability with other packages
  User options available within ML
  Example: Using ML as a preconditioner
  What to do if something goes wrong
  Sandia apps that use ML
  Future Plans
  Documentation, mailing lists, and getting help
ML Package
  C package that provides multigrid preconditioning for linear solver methods
  Developers: Ray Tuminaro, Jonathan Hu, Charles Tong (LLNL)
  Main methods
    –Geometric
      +Grid refinement hierarchy
      +2-level FE basis function domain decomposition
    –AMG (algebraic)
      +Smoothed aggregation
      +Edge-element AMG for Maxwell’s equations
      +Classical AMG
Geometric Multigrid
  Solve $A_4 u_4 = f_4$
  Approximate PDE on (user supplied) grid hierarchy
    –$A_3 u_3 = f_3$
    –$A_2 u_2 = f_2$
    –$A_1 u_1 = f_1$
  Develop grid transfer operators: restriction $R_k$ and interpolation $I_k$ ($k = 1, 2, 3$)
  Develop smoothers $S_k$ (approximate solve on a level), $k = 1, \dots, 4$
    –Jacobi, Gauss-Seidel, CG, etc.
  Use coarse solves ($k < 4$) to accelerate convergence for $A_4$
Algebraic Multigrid (AMG)
  Same goal: solve $A_L u_L = f_L$
  Build MG operators ($A_k$, $I_k$, $R_k$, $S_k$’s) automatically to define a hierarchy:
    $A_k u_k = f_k$, $k = 1, \dots, L$
  Main difference: in AMG, the interpolation operators $I_k$ are built automatically
    –also the main difficulty
  Once the $I_k$’s are defined, the rest follows “easily”:
    –$R_k = I_k^T$ (usually)
    –$A_{k-1} = R_{k-1} A_k I_{k-1}$ (triple matrix product)
    –Smoother (iterative method) $S_k$
      +Gauss-Seidel, polynomial, conjugate gradient, etc.
Recursive MG Algorithm (V Cycle)
    MG(f, u, k) {
      if (k == 1)
        u_1 = (A_1)^{-1} f_1
      else {
        S_k(A_k, f_k, u_k)                    //pre-smooth
        r_k = f_k - A_k u_k
        f_{k-1} = R_{k-1} r_k; u_{k-1} = 0    //restrict
        MG(f_{k-1}, u_{k-1}, k-1)
        u_k = u_k + I_{k-1} u_{k-1}           //interpolate & correct
        S_k(A_k, f_k, u_k)                    //post-smooth
      }
    }
  $A_k$, $R_k$, $I_k$, $S_k$ required on all levels
    –$S_k$: smoothers
    –$R_k$: restriction operators
    –$I_k$: interpolation operators
  [Figure: V cycle to solve $A_4 u_4 = f_4$ (levels k=4 to k=1); W cycle; FMV cycle]
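The recursion above can be tried out on a small model problem. The following is a minimal sketch, not ML code: it assumes a 1D Poisson matrix $(1/h^2)\,\mathrm{tridiag}(-1, 2, -1)$ on $n = 2^k - 1$ interior points, one Gauss-Seidel sweep as $S_k$, full-weighting restriction, and linear interpolation. The names `apply_A`, `smooth`, `residual_norm`, and `v_cycle` are hypothetical helpers chosen for illustration. Note that full weighting is a scaled transpose of linear interpolation ($R = \tfrac{1}{2} I^T$), a common geometric choice that differs slightly from $R_k = I_k^T$.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// y = A x for the 1D Poisson matrix (1/h^2) * tridiag(-1, 2, -1), h = 1/(n+1).
Vec apply_A(const Vec& x) {
    const std::size_t n = x.size();
    const double inv_h2 = double(n + 1) * double(n + 1);
    Vec y(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double left = (i > 0) ? x[i - 1] : 0.0;
        const double right = (i + 1 < n) ? x[i + 1] : 0.0;
        y[i] = inv_h2 * (2.0 * x[i] - left - right);
    }
    return y;
}

// S_k: `sweeps` Gauss-Seidel sweeps on A u = f, updating u in place.
void smooth(const Vec& f, Vec& u, int sweeps) {
    const std::size_t n = u.size();
    const double h2 = 1.0 / (double(n + 1) * double(n + 1));
    for (int s = 0; s < sweeps; ++s)
        for (std::size_t i = 0; i < n; ++i) {
            const double left = (i > 0) ? u[i - 1] : 0.0;
            const double right = (i + 1 < n) ? u[i + 1] : 0.0;
            u[i] = 0.5 * (h2 * f[i] + left + right);
        }
}

// Euclidean norm of the residual f - A u.
double residual_norm(const Vec& f, const Vec& u) {
    const Vec Au = apply_A(u);
    double sum = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i)
        sum += (f[i] - Au[i]) * (f[i] - Au[i]);
    return std::sqrt(sum);
}

// One V(1,1) cycle; the level is implied by the vector length n = 2^k - 1.
void v_cycle(const Vec& f, Vec& u) {
    const std::size_t n = u.size();
    assert(n >= 1 && f.size() == n && ((n + 1) & n) == 0);
    if (n == 1) {              // coarsest level: A_1 = 2/h^2 = 8 for h = 1/2
        u[0] = f[0] / 8.0;
        return;
    }
    smooth(f, u, 1);                                   // pre-smooth
    const Vec Au = apply_A(u);
    const std::size_t nc = (n - 1) / 2;
    Vec fc(nc), uc(nc, 0.0);
    for (std::size_t j = 0; j < nc; ++j) {             // restrict residual
        const std::size_t i = 2 * j + 1;               // fine index of coarse point j
        fc[j] = 0.25 * (f[i - 1] - Au[i - 1]) + 0.5 * (f[i] - Au[i])
              + 0.25 * (f[i + 1] - Au[i + 1]);
    }
    v_cycle(fc, uc);                                   // coarse-grid correction
    for (std::size_t j = 0; j < nc; ++j)               // interpolate & correct
        u[2 * j + 1] += uc[j];
    for (std::size_t j = 0; j <= nc; ++j) {
        const double left = (j > 0) ? uc[j - 1] : 0.0;
        const double right = (j < nc) ? uc[j] : 0.0;
        u[2 * j] += 0.5 * (left + right);
    }
    smooth(f, u, 1);                                   // post-smooth
}
```

To use the sketch, fill `f` with the right-hand side on 63 points, start from `u = 0`, and call `v_cycle(f, u)` repeatedly while monitoring `residual_norm(f, u)`.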
Algebraic construction of $I_k$: Aggregation
  Greedy algorithm
    –parallel can be complicated
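A greedy aggregation pass can be sketched in serial as follows. This is an illustrative two-phase variant (root nodes absorb their free neighbors, then leftovers join a neighboring aggregate) and is not claimed to be ML's implementation. The function name `greedy_aggregate` and the adjacency-list input are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// nbrs[i] lists the graph neighbors of node i (off-diagonal nonzeros of row i).
// Returns agg[i] = aggregate index of node i and sets num_aggregates.
std::vector<int> greedy_aggregate(const std::vector<std::vector<int>>& nbrs,
                                  int& num_aggregates) {
    const std::size_t n = nbrs.size();
    std::vector<int> agg(n, -1);
    num_aggregates = 0;

    // Phase 1: a node whose whole neighborhood is still free becomes a root
    // and forms a new aggregate together with its neighbors.
    for (std::size_t i = 0; i < n; ++i) {
        if (agg[i] != -1) continue;
        bool all_free = true;
        for (int j : nbrs[i]) {
            assert(j >= 0 && static_cast<std::size_t>(j) < n);
            if (agg[j] != -1) { all_free = false; break; }
        }
        if (!all_free) continue;
        agg[i] = num_aggregates;
        for (int j : nbrs[i]) agg[j] = num_aggregates;
        ++num_aggregates;
    }

    // Phase 2: each leftover node joins the phase-1 aggregate of a neighbor.
    const std::vector<int> phase1 = agg;
    for (std::size_t i = 0; i < n; ++i) {
        if (agg[i] != -1) continue;
        for (int j : nbrs[i])
            if (phase1[j] != -1) { agg[i] = phase1[j]; break; }
        if (agg[i] == -1) agg[i] = num_aggregates++;  // defensive fallback
    }
    return agg;
}
```

The result depends on the visiting order, which is one reason a parallel version is harder: processors must agree on aggregates that straddle their boundaries.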
Algebraic Construction of $I_k$: Coefficients
  Finding $I_k$
  Build tentative ${}^{t}I_k$ to interpolate null space, where
    ${}^{t}I_k(i,j) = \begin{cases} 1 & \text{if the } i\text{th point is within the } j\text{th aggregate} \\ 0 & \text{otherwise} \end{cases}$
  Smoothed aggregation
    +Improves ${}^{t}I_k$ with Jacobi’s method: $I_k = (I - \mathrm{diag}(A_k)^{-1} A_k)\, {}^{t}I_k$
    +$I_k$ emphasizes what is not smoothed by Jacobi
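The two construction steps, followed by the triple matrix product for the coarse operator, can be written out with small dense matrices. This is a sketch for illustration only: the helper names are hypothetical, dense storage is used purely for brevity, and the damping factor `omega` is an added assumption (setting `omega = 1` reproduces the formula above).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Tentative prolongator: entry (i, j) is 1 if point i lies in aggregate j.
Matrix tentative_prolongator(const std::vector<int>& agg, int num_aggregates) {
    Matrix P(agg.size(),
             std::vector<double>(static_cast<std::size_t>(num_aggregates), 0.0));
    for (std::size_t i = 0; i < agg.size(); ++i) {
        assert(agg[i] >= 0 && agg[i] < num_aggregates);
        P[i][static_cast<std::size_t>(agg[i])] = 1.0;
    }
    return P;
}

// Dense product Z = X * Y.
Matrix multiply(const Matrix& X, const Matrix& Y) {
    const std::size_t rows = X.size();
    const std::size_t inner = Y.size();
    const std::size_t cols = Y.empty() ? 0 : Y[0].size();
    Matrix Z(rows, std::vector<double>(cols, 0.0));
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t k = 0; k < inner; ++k)
            for (std::size_t j = 0; j < cols; ++j)
                Z[i][j] += X[i][k] * Y[k][j];
    return Z;
}

// Dense transpose.
Matrix transpose(const Matrix& X) {
    const std::size_t rows = X.size();
    const std::size_t cols = X.empty() ? 0 : X[0].size();
    Matrix T(cols, std::vector<double>(rows, 0.0));
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            T[j][i] = X[i][j];
    return T;
}

// One damped Jacobi step applied to the prolongator:
// returns (I - omega * diag(A)^{-1} A) * P.
Matrix smooth_prolongator(const Matrix& A, const Matrix& P, double omega) {
    const Matrix AP = multiply(A, P);
    Matrix S = P;
    for (std::size_t i = 0; i < S.size(); ++i)
        for (std::size_t j = 0; j < S[i].size(); ++j)
            S[i][j] -= omega * AP[i][j] / A[i][i];
    return S;
}

// Galerkin coarse operator with R = P^T: returns P^T * A * P.
Matrix coarse_operator(const Matrix& A, const Matrix& P) {
    return multiply(transpose(P), multiply(A, P));
}
```

For the 4-point matrix tridiag(-1, 2, -1) with aggregates {0, 1} and {2, 3}, smoothing spreads each column of the tentative prolongator into the neighboring aggregate, so the two interior rows become (0.5, 0.5).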
ML Capabilities
  MG cycling: V, W, full V, full W
  Grid Transfers
    –Several automatic coarse grid generators
    –Several automatic grid transfer operators
  Smoothers
    –Jacobi, Gauss-Seidel, Hiptmair, LU, Aztec methods, sparse approximate inverses, polynomial
  Kernels: matrix/matrix multiply, etc.
Configuring and Building ML
  ML builds by default when Trilinos is configured/built
  Configure help
    –See ml/README file, or ml/configure --help
  Enabling direct solver support (default is off)
      configure --with-ml_superlu \
                --with-incdirs="-I/usr/local/superlu/include" \
                --with-ldflags="/usr/local/superlu/libsuperlu.a"
  Performance monitoring
      --enable-ml_timing --enable-ml_flops
  Example suite builds by default
    –Your-build-location/packages/ml/examples
ML Interoperability with Trilinos Packages
  Epetra
    –Accepts user data as Epetra objects
    –Can be wrapped as Epetra_Operator
  TSF
    –TSF interface exists
  Other solvers, other matvecs
    –Accepts other solvers and MatVecs
  Amesos
    –Amesos interface coming very soon
  AztecOO, Meros
    –Via Epetra & TSF
Common Decisions for ML Users
  What smoother to use
    –# of smoothing steps
  Coarsening strategy
  Cycling strategy
ML Smoother Choices
  Jacobi
    –Simplest, cheapest, usually least effective.
    –Damping parameter needed
  Point Gauss-Seidel
    –Equation satisfied one unknown at a time
    –Better than Jacobi
    –May need damping
    –Can be problematic in parallel: processor-based (stale off-proc values)
  Block Gauss-Seidel
    –Satisfy several equations simultaneously by modifying several DOFs (inverting subblock associated with DOFs).
    –Blocks can correspond to aggregates
  Aztec smoothers
    –Any Aztec preconditioner can be used as a smoother.
    –Probably most interesting are the ILU & ILUT methods: may be more robust than Gauss-Seidel.
  Sparse LU
  Hiptmair
    –Specialized 2-stage smoother for Maxwell’s Eqns.
  MLS
    –Approximation to inverse based on Chebyshev polynomials of smoothed operator.
    –Competitive with true Gauss-Seidel in serial.
    –Doesn’t degrade with # processors (unlike processor-based GS)
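The difference between the two simplest smoothers is easy to see in code. The sketch below is not ML's implementation; `jacobi_sweep` and `gauss_seidel_sweep` are hypothetical names, and dense storage is assumed for brevity. Jacobi updates every unknown from the old iterate, scaled by a damping factor `omega`, while Gauss-Seidel uses each new value as soon as it is available.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vec = std::vector<double>;

// One damped Jacobi sweep: u <- u + omega * D^{-1} (f - A u).
// All updates use the old iterate, so the sweep is order independent.
void jacobi_sweep(const Matrix& A, const Vec& f, Vec& u, double omega) {
    const std::size_t n = u.size();
    assert(A.size() == n && f.size() == n);
    Vec u_new(n);
    for (std::size_t i = 0; i < n; ++i) {
        double r = f[i];
        for (std::size_t j = 0; j < n; ++j) r -= A[i][j] * u[j];
        u_new[i] = u[i] + omega * r / A[i][i];
    }
    u = u_new;
}

// One point Gauss-Seidel sweep: equation i is satisfied exactly for u[i],
// using the most recent values of all other unknowns.
void gauss_seidel_sweep(const Matrix& A, const Vec& f, Vec& u) {
    const std::size_t n = u.size();
    assert(A.size() == n && f.size() == n);
    for (std::size_t i = 0; i < n; ++i) {
        double s = f[i];
        for (std::size_t j = 0; j < n; ++j)
            if (j != i) s -= A[i][j] * u[j];
        u[i] = s / A[i][i];
    }
}
```

The in-place update in `gauss_seidel_sweep` is what makes the method awkward in parallel: values owned by another processor are only as fresh as the last exchange, which is the "stale off-proc values" issue noted above.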
ML Cycling Choices
  V is default (usually works best).
  W more expensive, may be more robust.
  Full MG (V cycle) more expensive
    –Less conventional within preconditioners
  These choices decide how frequently coarse grids are visited compared to fine grid.
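How often each level is visited can be counted with a small recursion. This sketch assumes the usual cycle-index formulation, in which each level recurses `gamma` times into the next coarser one (`gamma = 1` for a V cycle, `gamma = 2` for a W cycle). The function `count_visits` is a hypothetical helper, and full MG is not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times one cycle started on level k descends to each level.
// Level 1 is the coarsest; visits must have at least k + 1 entries
// (entry 0 is unused so that indices match level numbers).
void count_visits(int k, int gamma, std::vector<int>& visits) {
    assert(k >= 1 && static_cast<std::size_t>(k) < visits.size());
    ++visits[static_cast<std::size_t>(k)];
    if (k == 1) return;                       // coarsest level: direct solve
    for (int g = 0; g < gamma; ++g)
        count_visits(k - 1, gamma, visits);   // gamma recursive coarse solves
}
```

With four levels, a V cycle reaches every level once, while a W cycle reaches levels 3, 2, and 1 two, four, and eight times, which is where its extra cost comes from.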
ML Aggregation Choices
  MIS
    –Expensive, usually best with many processors
  Uncoupled (UC)
    –Cheap, usually works well.
  Hybrid = UC + MIS
    –Uncoupled on fine grids, MIS on coarser grids
A Simple Example: Trilinos/Packages/aztecoo/example/MLAztecOO
  Solve Ax=b:
    –A: linear elasticity problem
    –Linear solver: Conjugate Gradient
    –Precond.: AMG smoothed aggregation

  Solution component        Example methods   Packages used
  Linear solver             CG                AztecOO (Epetra)
  ML: multigrid precond.    AMG               ML
Simple Example
    #include "ml_include.h"
    #include "ml_epetra_operator.h"
    #include "ml_epetra_utils.h"
    #include "AztecOO.h"
    #include "Epetra_CrsMatrix.h"
    #include "Epetra_Vector.h"
    #include "Epetra_LinearProblem.h"

    Epetra_CrsMatrix A;
    Epetra_Vector x, bb;
    ...   // matrix, vectors loaded here

    // Construct Epetra Linear Problem
    Epetra_LinearProblem problem(&A, &x, &bb);

    // Construct a solver object
    AztecOO solver(problem);
    solver.SetAztecOption(AZ_solver, AZ_cg);

    // Create and set an ML multilevel preconditioner
    int N_levels = 10;        // max # of multigrid levels possible
    ML_Set_PrintLevel(3);     // how much ML info is output to screen
    ML *ml_handle;
    ML_Create(&ml_handle, N_levels);

    // Make linear operator A accessible to ML
    EpetraMatrix2MLMatrix(ml_handle, 0, &A);
Simple Example (contd.)
    // Create multigrid hierarchy
    ML_Aggregate *agg_object;
    ML_Aggregate_Create(&agg_object);
    ML_Aggregate_Set_CoarsenScheme_Uncoupled(agg_object);
    N_levels = ML_Gen_MGHierarchy_UsingAggregation(ml_handle, 0,
                                                   ML_INCREASING, agg_object);

    // Set symmetric Gauss-Seidel smoother for MG method
    int nits = 1;
    double dampingfactor = ML_DDEFAULT;
    ML_Gen_Smoother_SymGaussSeidel(ml_handle, ML_ALL_LEVELS, ML_BOTH,
                                   nits, dampingfactor);
    ML_Gen_Solver(ml_handle, ML_MGV, 0, N_levels-1);

    // Set preconditioner within Epetra
    Epetra_ML_Operator MLop(ml_handle, comm, map, map);
    solver.SetPrecOperator(&MLop);

    // Set some Aztec solver options and iterate
    solver.SetAztecParam(AZ_rthresh, 1.4);
    solver.SetAztecParam(AZ_athresh, 10.0);
    int Niters = 500;
    solver.Iterate(Niters, 1e-12);
Multigrid: Issues to be aware of
  Severely stretched grids or anisotropies
  Loss of diagonal dominance
  Atypical stencils
  Jumps in material properties
  Non-symmetric matrices
  Boundary conditions
  Systems of PDEs
  Non-trivial null space (Maxwell’s equations, elasticity)
What can go wrong?
  Small aggregates → high complexity
  Large aggregates → poor convergence
  Different size aggregates → both
    –Try different aggregation methods, drop tolerances
  Stretched grids → poor convergence
  Variable regions → poor convergence
    –Try different smoothers, drop tolerances
  Ineffective smoothing → poor convergence (perhaps due to non-diagonal dominance or non-symmetry in operator)
    –Try different smoothers
Things to Try if Multigrid Isn’t Working
  Smoothers
    –Vary number of smoothing steps.
    –More ‘robust’ smoothers: block Gauss-Seidel, ILU, MLS polynomial (especially if degradation in parallel).
    –Vary damping parameters (smaller is more conservative).
  Try different aggregation schemes
  Try fewer levels
  Try drop tolerances, if
    –High complexity (printed out by ML)
    –Severely stretched grids
    –Anisotropic problems
    –Variable regions
  Reduce prolongator damping parameter
    –If concerned about operator properties (e.g. highly non-symmetric).
ALEGRA/NEVADA Zpinch Results
  procs | # Elements | its
  16155, , ,259,97647
  CG preconditioned with V(1,1) AMG cycle
  $\|r\|_2 / \|r_0\|_2 <$ …
  2-stage Hiptmair smoother (4th order polys in each stage)
  Edge element interpolation
Flow and Transport Solution Algorithms for Chemical Attack in an Airport Terminal: 3D prototype
  Preconditioner | Problem Size (# Unknowns) | # Newton Steps | # GMRES Its. | Total Time (sec.)
  16 1-GHz P3 Procs. | 64 procs Cplant | 512 procs Cplant
  1 – Level Additive Schwarz: DD - ILU | 118, ,
  Level Additive Schwarz: Fine mesh - DD ILU, Coarse problem 17,360 unknowns – SuperLU | 118, , ,679, ~50,000,000
  Algorithm; Steady State: Newton – Krylov (GMRES)
Future Improvements
  Amesos interface
    –More direct solvers
  Self-correcting capabilities
    –Better analysis tools
    –Improved coarse grids
    –“On-the-fly” adjustments
  “Re-partitioning”
Getting Help
  See Trilinos/packages/ml/examples
  See guide:
    –ML User’s Guide, ver. 2.0 (in ml/doc)
    –ML User’s Guide, ver. 3.0 (under construction)
  Mailing lists
  Bug reporting, enhancement requests via bugzilla: …
  … us directly