COMPASS All-hands Meeting, Fermilab, Sept. 17-18 2007 Scalable Solvers in Petascale Electromagnetic Simulation Lie-Quan (Rich) Lee, Volkan Akcelik, Ernesto.

Presentation transcript:

COMPASS All-hands Meeting, Fermilab, Sept. 17-18, 2007
Scalable Solvers in Petascale Electromagnetic Simulation
Lie-Quan (Rich) Lee, Volkan Akcelik, Ernesto Prudencio, Lixin Ge (Stanford Linear Accelerator Center)
Xiaoye Li, Esmond Ng (Lawrence Berkeley National Laboratory)
Work supported by DOE ASCR, BES & HEP Divisions under contract DE-AC02-76SF00515

Overview
Shape Determination/Optimization: V. Akcelik, L. Lee (SLAC); T. Tautges, P. Knupp, L. Diachin (ITAPS); O. Ghattas, E. Ng, D. Keyes (TOPS)
Linear and Nonlinear Eigensolvers: L. Lee (SLAC); X. Li, E. Ng, C. Yang (LBNL/TOPS)
Scalable Linear Solvers: L. Lee (SLAC); X. Li, E. Ng (TOPS)

Shape Determination and Optimization

Shape Determination and Optimization for SCRF Cavities
Shape changes are due to: fabrication errors; addition of stiffening rings; tuning for the accelerating mode.
Shape changes alter HOM damping, and hence beam quality.
[Figure labels: ring in the middle; HOM damping changes; tuning]

Least-squares Minimization
Unknowns are the shape deviation parameters.
Solved by Gauss-Newton with truncated SVD.
Indefinite linear systems from the KKT conditions (deferred).
The forward problem is the Maxwell eigenvalue problem.
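The Gauss-Newton iteration with a truncated-SVD step can be sketched on a toy problem. This is a minimal illustration, not the authors' code. The forward model `model`, the sensitivity matrix `S`, the base frequencies `F0`, and the tolerance `rel_tol` are all hypothetical stand-ins for the Maxwell eigenvalue solve. Two parameters are made deliberately indistinguishable so that the Jacobian is rank-deficient and truncation matters.

```python
import numpy as np

# Hypothetical toy forward model (NOT a Maxwell solver): six "frequencies"
# that drop as the shape parameters p grow, f_i(p) = f0_i / (1 + (S p)_i).
F0 = np.array([1.30, 1.31, 1.32, 1.33, 1.34, 1.35])
S3 = 0.5 * np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                     [1, 1, 0], [0, 1, 1], [1, 0, 1]], dtype=float)
# Duplicate the last column: parameters 2 and 3 have the same effect on the
# data, so the Jacobian is rank-deficient and the inversion is ill-posed.
S = np.hstack([S3, S3[:, 2:3]])

def model(p):
    return F0 / (1.0 + S @ p)

def jacobian(p):
    # d f_i / d p_j = -f0_i * S_ij / (1 + (S p)_i)^2
    return -(F0 / (1.0 + S @ p) ** 2)[:, None] * S

def truncated_svd_step(J, r, rel_tol=1e-8):
    """Minimum-norm step d minimizing ||J d + r||, dropping small singular values."""
    U, s, Vt = np.linalg.svd(J, full_matrices=False)
    keep = s > rel_tol * s[0]          # discard directions the data cannot see
    return -(Vt[keep].T @ ((U[:, keep].T @ r) / s[keep]))

def gauss_newton(target, p0, max_iter=20, step_tol=1e-12):
    p = np.array(p0, dtype=float)
    for _ in range(max_iter):
        r = model(p) - target          # residual of the least-squares misfit
        d = truncated_svd_step(jacobian(p), r)
        p = p + d
        if np.linalg.norm(d) < step_tol:
            break
    return p

p_true = np.array([0.02, -0.01, 0.015, 0.005])
target = model(p_true)                 # synthetic "measured" data
p_est = gauss_newton(target, np.zeros(4))
misfit = np.linalg.norm(model(p_est) - target)
```

In this toy, `p_est` reproduces the data but splits the sum of the two indistinguishable parameters equally between them. That is the minimum-norm behaviour the truncation is meant to provide when the data do not determine every parameter.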

Example 1 for ILC TDR Cavity
Create a synthetic example by artificially deforming a 3D 9-cell ILC cavity.
Choose a set of parameters defining the shape variations, 26 independent inversion parameters in total: cell radius dr (x9), cell length dz (x9), and iris radius (x8).
Assign random values to these variables and deform the cavity.
Solve the Maxwell eigenvalue problem.
Use the first 45 nonzero frequencies and the field distributions of the first 9 modes as the target values.

Results for Example 1
The nonlinear solver converges within a handful of iterations.
Frequencies and fields match remarkably well.
The objective function decreases by 10e6.
The “target” and “inverted” cavity shapes are very close to each other.

Determining TDR Shape with Measured Frequencies
Experimental data for manufactured baseline ILC cavities from DESY.
Data used: the first 45 mode frequencies, and the field distributions of the first 9 monopole modes along the cavity axis.
82 parameters: cell radius, length, tuning, warping, and iris radius.
[Figure labels: cell length error; cell radius error; deformed surface; elliptical shape]

Results
Difference of frequencies and field values:
Red: inverted cavity - measured values
Black/blue: ideal shape - measured values
[Plot axis unit: MHz]
An article has been accepted by JCP.

Future Work on Shape Determination
Measurement data contain errors, so a better algorithm is needed.
Choices of shape deviation parameters.
Extending the method to use frequencies, fields, and external Qs. The forward problem is then a complex nonlinear eigenvalue problem!
Mesh smoothing (ITAPS).
[Figure: meshes near pickup gap; red: deformed, black: original]

Linear and Nonlinear Eigensolvers

RF Cavity Eigenvalue Problem EE Closed Cavity MM Nedelec-type Element Find frequency and field vector of normal modes: “Maxwell’s Eqns in Frequency Domain”

Cavity with Waveguide Coupling
The vector wave equation with waveguide boundary conditions can be modeled by a nonlinear eigenvalue problem, here with only one waveguide mode per port.
[Figure labels: Open Cavity; Waveguide BC]

Cavity with Waveguide Coupling for Multiple Waveguide Modes
The vector wave equation with waveguide boundary conditions can be modeled by a nonlinear eigenvalue problem (NEP).
[Figure labels: Open Cavity; Waveguide BC]
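The nonlinear eigenvalue problem itself is missing from the extracted slides. A commonly used form for one waveguide mode per port is K x + i sqrt(k^2 - kc^2) W x = k^2 M x, where kc is the cutoff wavenumber of the waveguide mode. The sketch below assumes that form and solves a small dense toy instance by self-consistent iteration: freeze the eigenvalue inside the square root, solve the resulting linear eigenproblem, and repeat. The matrices `K`, `M`, `W` and the value `kc2` are hypothetical toy data, not taken from the presentation.

```python
import numpy as np

def self_consistent_iteration(K, M, W, kc2, lam0, max_iter=50, tol=1e-12):
    """Solve K x + 1j*sqrt(lam - kc2) W x = lam M x for the eigenpair near lam0.

    Each pass freezes lam inside the square root, which turns the nonlinear
    problem into a linear eigenvalue problem (LEP).
    """
    lam = complex(lam0)
    x = None
    for _ in range(max_iter):
        A = K + 1j * np.sqrt(lam - kc2) * W
        # Dense generalized eigenproblem A x = lam M x (fine for a toy size).
        vals, vecs = np.linalg.eig(np.linalg.solve(M, A))
        j = np.argmin(np.abs(vals - lam))      # follow the nearest eigenvalue
        lam_new, x = vals[j], vecs[:, j]
        converged = abs(lam_new - lam) <= tol * abs(lam_new)
        lam = lam_new
        if converged:
            break
    return lam, x

# Hypothetical toy data: four "cavity modes" weakly coupled to one port.
K = np.diag([4.0, 9.0, 16.0, 25.0])
M = np.eye(4)
v = np.ones(4) / 2.0
W = 0.1 * np.outer(v, v)
kc2 = 1.0

lam, x = self_consistent_iteration(K, M, W, kc2, lam0=9.0)
nep_residual = np.linalg.norm((K + 1j * np.sqrt(lam - kc2) * W - lam * M) @ x)
```

The converged eigenvalue is complex. Its imaginary part comes from the coupling term, which is what carries the damping information in this kind of model. Convergence of this simple iteration relies on the coupling being weak, which is an assumption built into the toy data.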

Physics Problems and Solver Options (Omega3P)
Physics problems: lossless; lossy material; periodic structure; external coupling.
Eigensolvers: ESIL with restart; ISIL with refinement; implicit/explicit restarted Arnoldi; SOAR; self-consistent iteration; nonlinear Arnoldi/JD.
Linear solvers: WSMP, MUMPS, SuperLU_Dist (sparse direct); Krylov subspace methods with domain-specific preconditioners.
Different solver options have different performance dynamics.

Path to Simulate ILC RF Unit (3-cryomodule)
Optimized ILC single cavity routinely.
Simulated 4-cavity STF last year.
Simulating 8-cavity ILC cryomodule this year.
Simulate ILC 3-cryomodule RF unit: ~200M DOFs; further CS/AM advances needed; petascale.

Future Work for Eigensolvers
Parallelize AMLS; understand and improve its performance and scalability.
Nonlinear Jacobi-Davidson: choice of initial space; strategy for updating the preconditioner and choice of preconditioners.
New algorithm development for NEP/LEP: avoid shift-invert for interior eigenvalues; LEP helps NEP (self-consistent iterations).

Scalable Linear Solvers

Linear Solver is Computational Kernel of Many Codes
Indefinite matrices:
  Linear systems arising from the shift-invert eigensolver in Omega3P
  Indefinite linear systems from KKT conditions
  S-parameter computation in S3P
Symmetric positive definite (SPD) matrices:
  From implicit time-stepping in T3P
  From thermal and mechanical analysis in TEM3P
  From electro/magnetostatic analysis in Gun3P
Issues in petascale electromagnetic simulations:
  Direct solver: memory usage, scalability of triangular solver
  Iterative solver: performance, effectiveness (preconditioner)
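Shift-invert produces indefinite systems because every step solves with K - sigma*M, and a shift sigma placed inside the spectrum leaves that matrix with eigenvalues of both signs. The sketch below shows this with plain inverse iteration on a toy problem. The 1D Laplacian `K`, the identity `M`, and the shift `sigma` are hypothetical, and a production eigensolver such as the ones named in the slides would use a Lanczos or Arnoldi process rather than this single-vector iteration.

```python
import numpy as np

def shift_invert_iteration(K, M, sigma, num_iter=60):
    """Inverse iteration for K x = lam M x; converges to the eigenvalue nearest sigma."""
    A = K - sigma * M                       # the matrix every linear solve sees
    x = np.arange(1, K.shape[0] + 1, dtype=float)
    for _ in range(num_iter):
        x = np.linalg.solve(A, M @ x)       # apply (K - sigma M)^{-1} M
        x /= np.linalg.norm(x)
    lam = (x @ K @ x) / (x @ M @ x)         # Rayleigh quotient
    return lam, np.linalg.eigvalsh(A)

# Toy problem: 1D Laplacian, eigenvalues 2 - 2*cos(j*pi/(n+1)), j = 1..n.
n = 20
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
M = np.eye(n)
sigma = 1.1                                 # interior shift

lam, shifted_spectrum = shift_invert_iteration(K, M, sigma)
nearest_exact = 2.0 - 2.0 * np.cos(7 * np.pi / (n + 1))
```

Here `shifted_spectrum` contains both negative and positive values even though `K` itself is positive definite. That is why conjugate gradients does not apply directly, and why the slides treat sparse direct solvers and specialized preconditioners as the main options for these systems.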

Omega3P Scalability on Jaguar/XT with Iterative Linear Solver
LCLS RF gun: 1.5M tetrahedral elements, NDOFs = 9.6M, NNZ = 506M.

Scalability Using Sparse Direct Solver
MUMPS sparse direct solver is effective for highly indefinite matrices.
Scalability is affected by the performance of the triangular solver.
PSPASES triangular solver test (N=2M): N=2,019,968, nnz=32,024,600, number of entries in L = 1 billion.
More scalable triangular solvers are needed.

More “Memory-usage” Scalable Sparse Direct Solvers
MUMPS per-rank memory usage (MU) for a complex matrix with N=1.11M, nnz=46.1M:
The maximal per-rank MU is 4-5 times the average MU.
If the problem does not fit on Nprocs processes, it most likely will not fit on 2*Nprocs either.
More “memory-usage” scalable solvers are needed.

Memory Saving Techniques
Single precision for the factor matrix, with iterative refinement to recover double-precision accuracy (F).
Domain-specific preconditioners:
  Factorize the real part of the matrix (R); the real part is a good approximation to the complex matrix.
  Use single precision to factorize the real part of the matrix (RF).
  Hierarchical preconditioners, where the FE order is the level (HP):
    single precision for the (1,1)-block (HPF)
    real part only for the (1,1)-block (HPR)
    single precision and real part for the (1,1)-block (HPRF)
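The first technique (F) can be sketched as follows: solve in single precision, then correct the answer using residuals computed in double precision. This is a minimal dense illustration under stated assumptions, not the sparse solver setup used in the presentation. The test matrix is a hypothetical well-conditioned random matrix. For brevity `np.linalg.solve` refactorizes the float32 matrix on every call, whereas a real solver would store the single-precision factors once and reuse them, which is where the memory saving comes from.

```python
import numpy as np

def refine_with_single_precision(A, b, num_iter=10, tol=1e-14):
    """Iterative refinement: float32 solves, float64 residuals."""
    A32 = A.astype(np.float32)              # stand-in for single-precision factors
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(num_iter):
        r = b - A @ x                       # residual in double precision
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        d = np.linalg.solve(A32, r.astype(np.float32))
        x = x + d.astype(np.float64)        # accumulate correction in double
    return x

# Hypothetical well-conditioned test system.
rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n)) + n * np.eye(n)
x_true = rng.standard_normal(n)
b = A @ x_true

x_single = np.linalg.solve(A.astype(np.float32), b.astype(np.float32)).astype(np.float64)
x_refined = refine_with_single_precision(A, b)

err_single = np.linalg.norm(x_single - x_true) / np.linalg.norm(x_true)
err_refined = np.linalg.norm(x_refined - x_true) / np.linalg.norm(x_true)
```

Refinement of this kind converges only when the matrix is not too ill-conditioned relative to single precision, so the accuracy recovered on this toy system should not be read as a guarantee for the shifted systems in the slides.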

Testing Results for Complex Shifted Linear Systems

Recent Progress of SuperLU (Xiaoye Li)
Parallel symbolic factorization significantly reduces memory usage.
[Figure labels: Matrix for DDS; Matrix for ILC Cavity]

Future Work on Linear Solvers
Direct versus iterative solvers; hybrid solvers.
Investigate applicability of out-of-core sparse direct solvers from TOPS.
Apply multigrid solvers from TOPS for SPD matrices.
Extend PSPASES to indefinite/complex matrices.
Develop more effective domain-specific preconditioners.