COMPASS All-hands Meeting, Fermilab, Sept
Scalable Solvers in Petascale Electromagnetic Simulation
Lie-Quan (Rich) Lee, Volkan Akcelik, Ernesto Prudencio, Lixin Ge (Stanford Linear Accelerator Center)
Xiaoye Li, Esmond Ng (Lawrence Berkeley National Laboratory)
Work supported by DOE ASCR, BES & HEP Divisions under contract DE-AC02-76SF00515
Overview
- Shape Determination/Optimization: V. Akcelik, L. Lee (SLAC); T. Tautges, P. Knupp, L. Diachin (ITAPS); O. Ghattas, E. Ng, D. Keyes (TOPS)
- Linear and Nonlinear Eigensolvers: L. Lee (SLAC); X. Li, E. Ng, C. Yang (LBNL/TOPS)
- Scalable Linear Solvers: L. Lee (SLAC); X. Li, E. Ng (TOPS)
Shape Determination and Optimization
Shape Determination and Optimization for SCRF Cavities
- Shape changes due to:
  - Fabrication errors
  - Addition of stiffening rings
  - Tuning for accelerating mode
- Shape changes alter HOM damping, which affects beam quality
[Figure labels: ring in the middle; HOM damping changes; tuning]
Least-squares Minimization
- Unknowns are shape deviation parameters
- Gauss-Newton with truncated SVD
- Indefinite linear systems from KKT (deferred)
- The forward problem is a Maxwell eigenvalue problem
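The Gauss-Newton iteration with a truncated SVD named on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the functions `forward` and `jacobian` and the matrix `A` are hypothetical stand-ins for the Maxwell eigenvalue solve and its sensitivities, and the truncation threshold `rel_tol` is an assumed parameter.

```python
import numpy as np

def truncated_svd_step(J, r, rel_tol=1e-8):
    """Least-squares solution of J @ dp = r, dropping small singular values."""
    U, s, Vt = np.linalg.svd(J, full_matrices=False)
    keep = s > rel_tol * s[0]          # discard directions the data cannot resolve
    return Vt[keep].T @ ((U[:, keep].T @ r) / s[keep])

def gauss_newton(forward, jacobian, target, p0, max_iter=20, tol=1e-12):
    """Minimize 0.5 * ||forward(p) - target||^2 over the parameters p."""
    p = np.array(p0, dtype=float)
    for _ in range(max_iter):
        r = target - forward(p)        # misfit between target and model response
        if np.linalg.norm(r) < tol:
            break
        p = p + truncated_svd_step(jacobian(p), r)
    return p

# Hypothetical toy forward model (assumption): a mildly nonlinear map from
# 3 "shape parameters" to 8 "observables". It stands in for the eigen-solve.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))

def forward(p):
    z = A @ p
    return z + 0.1 * np.sin(z)

def jacobian(p):
    z = A @ p
    return (1.0 + 0.1 * np.cos(z))[:, None] * A

p_true = np.array([0.3, -0.2, 0.1])    # synthetic "deformation"
target = forward(p_true)               # synthetic "measurements"
p_est = gauss_newton(forward, jacobian, target, p0=np.zeros(3))
print("max parameter error:", np.abs(p_est - p_true).max())
```

Truncating small singular values keeps the step bounded when some shape parameters are poorly constrained by the data, which is the usual motivation for pairing Gauss-Newton with a truncated SVD.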
Example 1 for ILC TDR Cavity
- Create a synthetic example by artificially deforming a 3D 9-cell ILC cavity.
- Choose a set of parameters defining shape variations: 26 independent inversion parameters in total.
  - Cell radius dr (x9) and cell length dz (x9)
  - Iris radius (x8)
- Assign random values to these variables and deform the cavity.
- Solve the Maxwell eigenvalue problem.
- Use the first 45 nonzero frequencies and the field distributions of the first 9 modes as the target values.
Results for Example 1
- The nonlinear solver converges within a handful of iterations
- Frequencies and fields match remarkably well
- Objective function decreases by 10e6
- The “target” and “inverted” cavity shapes are very close to each other
Determining TDR Shape with Measured Frequencies
- Experimental data for manufactured baseline ILC cavities from DESY
- The first 45 mode frequencies and the field distributions of the first 9 monopole modes along the cavity axis
- 82 parameters: cell radius, length, tuning, warping, and iris radius
[Figure panels: cell length error; cell radius error; deformed surface; elliptical shape]
Results
- Difference of frequencies and field values
  - Red: inverted cavity minus measured values
  - Black/blue: ideal shape minus measured values
- An article has been accepted by JCP
[Figure: axis unit MHz]
Future Work on Shape Determination
- Measurement data contain errors: a better algorithm is needed
- Choices of shape deviation parameters
- Extending the method to use frequencies, fields and external Qs
  - The forward problem is a complex nonlinear eigenvalue problem!
- Mesh smoothing (ITAPS)
[Figure: meshes near pickup gap; red: deformed, black: original]
Linear and Nonlinear Eigensolvers
RF Cavity Eigenvalue Problem
- Closed cavity
- Nedelec-type element
- Find frequency and field vector of normal modes: “Maxwell’s Eqns in Frequency Domain”
[Figure: closed cavity; labels E and M]
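For reference, a closed lossless cavity is commonly described by the frequency-domain formulation below. The slide's own equations are not preserved here, so the notation is standard textbook notation and an assumption: electric field E, relative permittivity ε and permeability μ, wavenumber k = ω/c, electric and magnetic boundaries Γ_E and Γ_M, edge-element basis functions N_i, and matrices K and M.

```latex
\nabla \times \left( \frac{1}{\mu} \nabla \times \mathbf{E} \right) - k^2 \varepsilon \mathbf{E} = 0 \quad \text{in } \Omega,
\qquad
\mathbf{n} \times \mathbf{E} = 0 \ \text{on } \Gamma_E,
\qquad
\mathbf{n} \times \left( \frac{1}{\mu} \nabla \times \mathbf{E} \right) = 0 \ \text{on } \Gamma_M .

% Expanding E in edge (Nedelec-type) basis functions, E = \sum_i x_i N_i,
% gives a generalized eigenvalue problem:
K x = k^2 M x,
\qquad
K_{ij} = \int_\Omega \frac{1}{\mu} \, (\nabla \times \mathbf{N}_i) \cdot (\nabla \times \mathbf{N}_j) \, d\Omega,
\qquad
M_{ij} = \int_\Omega \varepsilon \, \mathbf{N}_i \cdot \mathbf{N}_j \, d\Omega .
```

In this form K is symmetric positive semidefinite and M is symmetric positive definite; the eigenvalue k² gives the mode frequency and the eigenvector x gives the field.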
Cavity with Waveguide Coupling
- Vector wave equation with waveguide boundary conditions can be modeled by a non-linear eigenvalue problem
- With one waveguide mode per port only
[Figure labels: open cavity; waveguide BC]
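One way to attack a nonlinear eigenvalue problem of this kind is a self-consistent iteration: freeze the frequency-dependent term at the current estimate, solve the resulting linear eigenproblem, and repeat. The sketch below assumes the single-mode form (K + i sqrt(λ - k_c²) W) x = λ M x with λ = k²; this form, its sign convention, the matrix names K, M, W and the cutoff `kc2` are assumptions, since the slide's equation is not preserved. The small dense matrices are hypothetical toy data.

```python
import numpy as np

def self_consistent_iteration(K, M, W, kc2, lam0, max_iter=50, tol=1e-12):
    """Solve (K + i*sqrt(lam - kc2)*W) x = lam * M x by repeated linear eigensolves."""
    lam = complex(lam0)
    x = None
    for _ in range(max_iter):
        # Freeze the nonlinear term at the current estimate: a linear eigenproblem.
        A = K + 1j * np.sqrt(lam - kc2) * W
        vals, vecs = np.linalg.eig(np.linalg.solve(M, A))
        j = np.argmin(np.abs(vals - lam))      # follow the eigenvalue nearest the estimate
        lam_new, x = vals[j], vecs[:, j]
        converged = abs(lam_new - lam) <= tol * abs(lam_new)
        lam = lam_new
        if converged:
            break
    return lam, x

def nep_residual(K, M, W, kc2, lam, x):
    """Relative residual of the nonlinear eigenproblem at (lam, x)."""
    r = (K + 1j * np.sqrt(lam - kc2) * W - lam * M) @ x
    return np.linalg.norm(r) / np.linalg.norm(x)

# Hypothetical toy data (assumption): well-separated "cavity" eigenvalues and a
# weak rank-one coupling term standing in for the waveguide boundary matrix.
n = 6
K = np.diag(np.arange(1.0, n + 1) ** 2)
M = np.eye(n)
v = np.ones(n) / np.sqrt(n)
W = 0.05 * np.outer(v, v)
kc2 = 0.5

lam, x = self_consistent_iteration(K, M, W, kc2, lam0=4.0)
print("eigenvalue:", lam)
print("residual:", nep_residual(K, M, W, kc2, lam, x))
```

The iteration converges quickly here because the coupling term is small compared with the eigenvalue spacing; for strong coupling a Newton-type method such as nonlinear Arnoldi or Jacobi-Davidson is the usual alternative.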
Cavity with Waveguide Coupling for Multiple Waveguide Modes
- Vector wave equation with waveguide boundary conditions can be modeled by a non-linear eigenvalue problem (NEP)
[Figure labels: open cavity; waveguide BC]
Physics Problems and Solver Options
- Omega3P physics problems: Lossless, Lossy Material, Periodic Structure, External Coupling
- Eigensolvers: ESIL/with Restart, ISIL w/ refinement, Implicit/Explicit Restarted Arnoldi, SOAR, Self-Consistent Iteration, Nonlinear Arnoldi/JD
- Linear solvers: WSMP, MUMPS, SuperLU_Dist, Krylov Subspace Methods, Domain-specific preconditioners
- Different solver options have different performance dynamics
Path to Simulate ILC RF Unit (3-cryomodule)
- Optimized ILC single cavity routinely
- Simulated 4-cavity STF last year
- Simulating 8-cavity ILC Cryomodule this year
- Simulate ILC 3-cryomodule RF Unit: ~200M DOFs, further CS/AM advance needed, petascale
Future Work for Eigensolvers
- Parallelize AMLS; understand and improve its performance and scalability
- Nonlinear Jacobi-Davidson
  - Choice of initial space
  - Strategy for updating preconditioner and choice of preconditioners
- New algorithm development for NEP/LEP
  - Avoid shift-invert for interior eigenvalues
  - LEP helps NEP (Self-Consistent Iterations)
Scalable Linear Solvers
Linear Solver is Computational Kernel of Many Codes
- Indefinite matrices
  - Linear systems arising from shift-invert eigensolver in Omega3P
  - Indefinite linear system from KKT conditions
  - S-parameter computation in S3P
- Symmetric Positive Definite (SPD) matrices
  - From implicit time-stepping in T3P
  - From thermal and mechanical analysis in TEM3P
  - From electro/magneto static analysis in Gun3P
- Issues in petascale electromagnetic simulations:
  - Direct solver: memory usage, scalability of triangular solver
  - Iterative solver: performance, effectiveness (preconditioner)
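To illustrate why a shift-invert eigensolver produces indefinite linear systems, the sketch below runs plain inverse iteration on a small generalized eigenproblem K x = λ M x. It is an illustration only, not the Omega3P algorithm: the 1D Laplacian `K`, identity `M`, and shift `sigma` are hypothetical toy choices, and the dense `np.linalg.solve` stands in for a sparse direct or iterative solver.

```python
import numpy as np

def shift_invert_iteration(K, M, sigma, x0, n_iter=50):
    """Inverse iteration: converge to the eigenpair of K x = lam M x nearest sigma."""
    A = K - sigma * M                      # shifted matrix; indefinite for an interior shift
    x = x0 / np.linalg.norm(x0)
    for _ in range(n_iter):
        # This solve is the computational kernel. A production code would
        # factor A once and reuse the factors; np.linalg.solve re-factors each time.
        x = np.linalg.solve(A, M @ x)
        x /= np.linalg.norm(x)
    lam = (x @ K @ x) / (x @ M @ x)        # Rayleigh quotient
    return lam, x

# Hypothetical toy problem (assumption): 1D Laplacian stiffness, identity mass.
n = 50
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
M = np.eye(n)
sigma = 1.01                               # shift placed inside the spectrum

shifted_eigs = np.linalg.eigvalsh(K - sigma * M)
print("negative eigenvalues of K - sigma*M:", int((shifted_eigs < 0).sum()))
print("positive eigenvalues of K - sigma*M:", int((shifted_eigs > 0).sum()))

x0 = np.random.default_rng(0).standard_normal(n)
lam, x = shift_invert_iteration(K, M, sigma, x0)
print("eigenvalue nearest the shift:", lam)
```

Because the shift sits inside the spectrum, K - sigma*M has eigenvalues of both signs, so solvers that rely on positive definiteness (for example plain conjugate gradients or Cholesky) do not apply directly.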
Omega3P Scalability on Jaguar/XT with Iterative Linear Solver
- Model: LCLS RF Gun
- 1.5M tetrahedral elements
- NDOFs = 9.6M
- NNZ = 506M
Scalability Using Sparse Direct Solver
- Sparse direct solver (MUMPS) is effective for highly indefinite matrices
- Scalability affected by performance of triangular solver
- PSPASES triangular solver test (N=2M): N = 2,019,968, nnz = 32,024,600, no. of entries in L = 1 billion
- Need more scalable triangular solvers
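The triangular solve is hard to scale largely because of its data dependencies, which a dense forward substitution makes visible. The sketch below is a textbook illustration under that assumption, not the algorithm used in MUMPS or PSPASES; the matrix `L` and right-hand side `b` are hypothetical toy data.

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L y = b for lower-triangular L, one unknown at a time."""
    n = L.shape[0]
    y = np.zeros(n)
    for i in range(n):
        # y[i] needs every earlier y[j] with L[i, j] != 0, so the unknowns
        # form a dependency chain that limits how much can run in parallel.
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

# Hypothetical toy data (assumption): a small, well-conditioned triangular factor.
rng = np.random.default_rng(0)
n = 6
L = np.tril(rng.standard_normal((n, n))) + n * np.eye(n)
b = rng.standard_normal(n)

y = forward_substitution(L, b)
print("residual norm:", np.linalg.norm(L @ y - b))
```

Each entry of L is used only once per solve, so there is little arithmetic to hide communication behind; sparse parallel solvers recover some concurrency from independent subtrees of the elimination tree.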
More “Memory-usage” Scalable Sparse Direct Solvers
- MUMPS per-rank memory usage (MU): N = 1.11M, nnz = 46.1M, complex matrix
- Maximal per-rank MU is 4-5 times the average MU
- Once a problem cannot fit into Nprocs, it most likely will not fit into 2*Nprocs
- More “memory-usage” scalable solvers needed
Memory Saving Techniques
- Single precision for factor matrix, iterative refinement to recover double precision accuracy (F)
- Domain-specific preconditioners
  - Factorize real part of the matrix (R)
    - Real part is a good approximation to the complex matrix
  - Use single precision to factorize real part of the matrix (RF)
  - Hierarchical preconditioners (FE order is the level) (HP)
    - Single precision for (1,1)-block (HPF)
    - Real part only for (1,1)-block (HPR)
    - Single precision & real part for (1,1)-block (HPRF)
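The first technique (F), single-precision factors with iterative refinement, can be sketched as below. This is a minimal dense illustration under stated assumptions: an explicit float32 inverse stands in for the stored single-precision factors of a sparse direct solver, the test matrix is a hypothetical real, well-conditioned one, and the function name `refine` is invented for this example.

```python
import numpy as np

def refine(A, b, n_steps=5):
    """Solve A x = b with a single-precision solve operator and double-precision residuals."""
    # Stand-in for single-precision factors (assumption): an explicit float32 inverse.
    # It takes half the memory of its double-precision counterpart.
    A_inv32 = np.linalg.inv(A.astype(np.float32))
    x = (A_inv32 @ b.astype(np.float32)).astype(np.float64)
    history = [np.linalg.norm(b - A @ x) / np.linalg.norm(b)]
    for _ in range(n_steps):
        r = b - A @ x                            # residual computed in double precision
        dx = A_inv32 @ r.astype(np.float32)      # correction computed in single precision
        x = x + dx.astype(np.float64)
        history.append(np.linalg.norm(b - A @ x) / np.linalg.norm(b))
    return x, history

# Hypothetical toy system (assumption): diagonally shifted random matrix.
rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n)

x, history = refine(A, b)
for step, res in enumerate(history):
    print(f"step {step}: relative residual {res:.2e}")
```

Refinement converges only when the single-precision solve is accurate enough relative to the conditioning of the matrix; for badly conditioned systems the low-precision factors are better used as a preconditioner inside a Krylov method.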
Testing Results for Complex Shifted Linear Systems
Recent Progress of SuperLU (Xiaoye Li)
- Parallel symbolic factorization significantly reduces memory usage
[Figure panels: matrix for DDS; matrix for ILC cavity]
Future Work on Linear Solvers
- Direct versus iterative solvers, hybrid solvers
- Investigate applicability of out-of-core sparse direct solvers from TOPS
- Apply multigrid solvers from TOPS for SPD matrices
- Extend PSPASES to indefinite/complex matrices
- Develop more effective domain-specific preconditioners