New Features in ML 2004
Trilinos Users Group Meeting, November 2-4, 2004
Jonathan Hu, Ray Tuminaro, Marzio Sala, Michael Gee, Haim Waisman
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.

Overview
Multigrid Options
–ParMETIS
–Zoltan
–Repartitioning
Analysis Tools
–GGB method
–Memory usage
–Visualization
Documentation

Traditional Coarsening
Coarsening rate fixed: h/H ≈ 1/3 per direction, so about $3^n$ fine nodes per aggregate in an n-d problem
What can go wrong?
AMG complexity goes up: $\sum_j \mathrm{nnz}(A^{(j)}) / \mathrm{nnz}(A^{(1)})$
Result: more time per iteration
In parallel, each coarse grid incurs a latency penalty
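
The complexity measure on this slide can be evaluated directly once the nonzero count of each level's operator is known. The following is a minimal sketch, not ML code: the function name `operatorComplexity` and the convention that entry 0 holds the finest level are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Operator complexity of a multigrid hierarchy:
//   (sum over levels j of nnz(A^(j))) / nnz(A^(1)),
// where level 1 is the finest operator.
// Assumption: nnzPerLevel[0] holds the nonzero count of the fine-level matrix.
double operatorComplexity(const std::vector<std::size_t>& nnzPerLevel) {
  assert(!nnzPerLevel.empty() && nnzPerLevel.front() > 0);
  double total = 0.0;
  for (std::size_t nnz : nnzPerLevel) {
    total += static_cast<double>(nnz);
  }
  return total / static_cast<double>(nnzPerLevel.front());
}
```

A value close to 1 means the coarse operators add little work per iteration; the value grows when coarse levels stay large or fill in.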

Aggressive Coarsening
● Idea: use graph partitioner to make larger aggregates
– METIS / ParMETIS
● Coarsening rate: user-determined
Fewer levels: mitigates coarse grid latency
Smaller + fewer coarse grids → lower complexity
Convergence rate could suffer
Configure options: --with-ml_metis --with-ml_parmetis3x
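
To see why a larger coarsening rate yields fewer levels, one can count how many levels are needed before the coarsest grid is small enough to solve directly. This is a rough sketch under simplifying assumptions (the rate is the same on every level and uses integer division); `estimateLevels` and its parameters are hypothetical names, not part of ML.

```cpp
#include <cassert>
#include <cstddef>

// Counts levels in a hierarchy where each coarsening step divides the number
// of unknowns by `rate` (integer division), stopping once the current level
// has at most `maxCoarseSize` unknowns. The fine level counts as level 1.
int estimateLevels(std::size_t fineSize, std::size_t rate,
                   std::size_t maxCoarseSize) {
  assert(rate > 1 && maxCoarseSize > 0);
  int levels = 1;
  std::size_t n = fineSize;
  while (n > maxCoarseSize) {
    n /= rate;   // one coarsening step
    ++levels;
  }
  return levels;
}
```

For one million fine unknowns and a coarse-size limit of 1000, a rate of 27 (the traditional 3D case) gives 4 levels under these assumptions, while a rate of 125 gives 3.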

Aggressive coarsening
Airport Simulation (App: MPSalsa)
3D transient LES (13M DOFs / 1K-node Cplant)

Coarsening with Zoltan
Main idea
–App provides coordinates on fine level (only)
–Call to Zoltan for coarsening (RCB algorithm)
–ML internally creates coordinates for coarser levels: centers of mass
Status: still in testing phase
Configure option: --with-ml_zoltan
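
The coarse-level coordinates mentioned above can be illustrated with a small stand-alone routine. This is a sketch, not ML's implementation: it assumes every fine node has unit mass (so the center of mass is the plain average), and the names `Point`, `coarseCoordinates`, and `aggregateOf` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Point = std::array<double, 3>;

// aggregateOf[i] is the aggregate id (0 .. numAggregates-1) of fine node i.
// Returns one coordinate per aggregate: the unweighted mean of the
// coordinates of its member nodes (unit mass per node is an assumption).
std::vector<Point> coarseCoordinates(const std::vector<Point>& fineCoords,
                                     const std::vector<int>& aggregateOf,
                                     int numAggregates) {
  assert(fineCoords.size() == aggregateOf.size());
  std::vector<Point> centers(numAggregates, Point{0.0, 0.0, 0.0});
  std::vector<int> counts(numAggregates, 0);
  for (std::size_t i = 0; i < fineCoords.size(); ++i) {
    const int agg = aggregateOf[i];
    assert(agg >= 0 && agg < numAggregates);
    for (int d = 0; d < 3; ++d) centers[agg][d] += fineCoords[i][d];
    ++counts[agg];
  }
  for (int a = 0; a < numAggregates; ++a) {
    if (counts[a] == 0) continue;  // empty aggregate: left at the origin
    for (int d = 0; d < 3; ++d) centers[a][d] /= counts[a];
  }
  return centers;
}
```

Applying the same routine level by level gives coordinates for every coarse grid, so the application only has to supply them once on the fine level.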

Repartitioning to Improve Parallel Performance
Load balances operators in multigrid hierarchy
Motivation
–App load balancing may be non-optimal for linear solver
–App may take large % of memory (e.g., multiphysics); linear solver gets remaining memory
–Result: low parallel efficiency
–Coarsening rate may slow as the hierarchy gets down to few unknowns / proc
Main idea
–Determine “good” partitioning with ParMETIS
–Construct permutation matrix P based on partitioning
–Apply P to multigrid coarse grid operators
[Figure: operator A and permutation P, with rows labeled by owning processor (Proc. 1, Proc. 2, Proc. 3)]
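
The two steps in the main idea (build a permutation from a partitioning, then apply it to an operator) can be sketched on a small dense matrix. This is an illustration only: ML works on distributed sparse operators, and the partition vector here merely stands in for whatever a partitioner such as ParMETIS would return. The names `permutationFromPartition` and `permuteOperator` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Dense = std::vector<std::vector<double>>;

// part[i] = destination processor of row i (assumed partitioner output).
// Returns newIndex, where newIndex[i] is the position of old row i after
// rows are grouped by destination processor, keeping the original order
// within each group.
std::vector<std::size_t> permutationFromPartition(const std::vector<int>& part,
                                                  int numProcs) {
  std::vector<std::size_t> offset(numProcs + 1, 0);
  for (int p : part) {
    assert(p >= 0 && p < numProcs);
    ++offset[p + 1];                      // count rows per processor
  }
  for (int p = 0; p < numProcs; ++p) offset[p + 1] += offset[p];  // prefix sum
  std::vector<std::size_t> newIndex(part.size());
  for (std::size_t i = 0; i < part.size(); ++i) {
    newIndex[i] = offset[part[i]]++;      // next free slot in that group
  }
  return newIndex;
}

// Symmetric permutation B = P A P^T, where P(newIndex[i], i) = 1.
Dense permuteOperator(const Dense& A, const std::vector<std::size_t>& newIndex) {
  const std::size_t n = A.size();
  assert(newIndex.size() == n);
  Dense B(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 0; i < n; ++i) {
    assert(A[i].size() == n);
    for (std::size_t j = 0; j < n; ++j) {
      B[newIndex[i]][newIndex[j]] = A[i][j];
    }
  }
  return B;
}
```

Because the same permutation is applied to rows and columns, the permuted operator has the same entries as the original; only the ownership of rows changes.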

Repartitioning applied to Zpinch simulation
No repartitioning: X, X, X, X
Repartitioning: 310 / 492s, 284 / 479s, 257 / 530s, X*
Before repartitioning on Janus… 210+ processor simulations failed
App-supplied linear system already imbalanced

Find modes not captured by MG
[Figure labels: adaptive filter, extra coarse grid, MG, GGB, GMRES \ QMR, Adaptive AMG. Plot legend: GMRES(20) + GGB/ML, GMRES(150) + ML]

Analysis / Profiling Tools
Aggregate visualization
–Assess aggregate quality
–User provides fine-level coordinates
–Centers of mass (CoM) used as coordinates on coarser levels
–Statistics calculated on average aggregate size and diameter
–Currently uses a 3rd-party package, OpenDX
Error visualization

Analysis / Profiling Tools (cont’d)
Matrix performance
–Matrix statistics
–Eigen analysis
–Detailed operator profiling: apply & communication time
–MultiLevelPreconditioner::AnalyzeMatrixCheap(), ML_Operator_Profile()
Internal memory profiling
–Lightweight
–High-water mark, largest free block
–Postprocessing for plotting

Updated Documentation
ML User’s Guide, version 3.0
–Configure & build information
–MultiLevelPreconditioner() class intro
–Exhaustive options list
ML Developer’s Guide
–Configuration, building, testing details
–Suggested practices
–Intro to tools on software.sandia.gov
Updated web pages
–Now built automatically each night
–Incorporates doxygen comments

ML with Epetra: Example of use
// Triutils: build the linear system (Comm is assumed to be an Epetra_Comm created earlier)
Trilinos_Util_CrsMatrixGallery Gallery("laplace_3d", Comm);
Gallery.Set("problem_size", 100*100*100);
// linear system matrix & linear problem
Epetra_RowMatrix * A = Gallery.GetMatrix();
Epetra_LinearProblem * Problem = Gallery.GetLinearProblem();
// AztecOO: construct outer solver object
AztecOO solver(*Problem);
solver.SetAztecOption(AZ_solver, AZ_cg);
// ML: set up multilevel precond. with smoothed aggr. defaults
Teuchos::ParameterList MLList; // parameter list for ML options
ML_Epetra::SetDefaults("SA", MLList);
MLList.set("aggregation: type", "METIS");
// create preconditioner
ML_Epetra::MultiLevelPreconditioner * MLPrec = new ML_Epetra::MultiLevelPreconditioner(*A, MLList, true);
solver.SetPrecOperator(MLPrec); // set preconditioner
// AztecOO: solve
solver.Iterate(500, 1e-12); // iterate at most 500 times
delete MLPrec;
Full example: Trilinos/packages/ml/examples/ml_example_MultiLevelPreconditioner.cpp