Wakefield Computations at Extreme-scale for Ultra-short Bunches using Parallel hp-Refinement Lie-Quan Lee, Cho Ng, Arno Candel, Liling Xiao, Greg Schussman, Lixin Ge, Zenghai Li, Andreas Kabel, Vineet Rawat, and Kwok Ko SLAC National Accelerator Laboratory SciDAC 2010 Conference, Chattanooga, TN, July 11-15, 2010

Overview
* Parallel computing for accelerator modeling
* Finite-element time-domain analysis
* Moving window technique with the finite-element method
  – Wakefield calculations with p-refinement
  – Wakefield calculations with online hp-refinement
* Benchmarking and results
  – PEP-X undulator taper structure
  – PEP-X wiggler taper structure
  – Cornell ERL vacuum chamber
  – ILC/Project X coupler

Advanced Computing for Accelerators
* Particle accelerators are billion-dollar-class facilities
  – Proposed International Linear Collider: 6.75 billion dollars
  – Large Hadron Collider: 6.5 billion dollars
* Advanced computing enables virtual prototyping
  – Cost savings from design optimization through computing can be significant
[Figures: machine layouts, labeled 30 km and 27 km]

SciDAC for Accelerator Modeling
* ComPASS Accelerator Project ( )
  – TOPS/LBL (linear solvers and eigensolvers): E Ng, X Li, I Yamazaki
  – ITAPS/RPI, LLNL, Sandia (parallel curvilinear meshing and adaptation): Q Lu, M Shephard, L Diachin
  – Ultra-scale Visualization Institute/Sandia, UC Davis: K Moreland, K Ma
  – ITAPS/CSCAPES/Sandia (load balancing): K Devine, E Boman
* Collaborations on algorithmic aspects

Parallel Finite Element Code Suite ACE3P
Over more than a decade, SLAC has developed ACE3P, a conformal, higher-order, C++/MPI-based parallel finite-element suite of electromagnetic codes, under the support of the AST SciDAC1 and ComPASS SciDAC2 projects.
ACE3P modules (accelerator physics applications):
* Frequency domain: Omega3P – eigensolver (nonlinear, damping); S3P – S-parameters
* Time domain: T3P – wakefields and transients
* Particle tracking: Track3P – multipacting and dark current
* EM particle-in-cell: Pic3P – RF gun simulation
Visualization: ParaView – meshes, fields, and particles
Aiming for the virtual prototyping of accelerator structures.

INCITE Program
* INCITE Award: Petascale Computing for Tera-eV Accelerators ( )
  – ORNL (Jaguar / Lens / HPSS)
  – Kitware (ParaView visualization and analysis software)
  – Center for Scalable Application Development Software (CScADS)
* Provides major computing resources and support for using them efficiently to tackle challenging accelerator modeling problems

SLAC's INCITE Allocation Usage at NCCS
[Chart: allocation usage by year versus number of cores, with average job sizes (including 1834 and 6349 cores) and yearly usage totals of 4.5 million, 8 million (2009), and 12 million (2010, up to 07/2010). Roughly 60% of the total time was used by jobs occupying 10% to 50% of the total resources (22k to 120k cores).]

Solving CEBAF BBU Using the Shape Uncertainty Quantification Method
SciDAC success as a collaboration between accelerator simulation, computational science and experiment – Beam breakup (BBU) instabilities well below the designed beam current were observed in the CEBAF 12 GeV upgrade at Jefferson Lab (TJNAF), in which higher-order modes (HOMs) with exceptionally high quality factors (Q) were measured. Using the shape uncertainty quantification tool developed under SciDAC, the problem was traced to a deformation of the cavity shape due to fabrication errors. This discovery was a team effort between SLAC, TOPS, and JLab, which underscores the importance of the SciDAC multidisciplinary approach in tackling challenging applications.
Method of solution – Using the measured cavity parameters as inputs, the deformed cavity shape was recovered by solving the inverse problem through an optimization method. The calculations showed that the cavity was 8 mm shorter than designed, which was subsequently confirmed by measurements. The result explains why the troublesome modes have high Qs: in the deformed cavity, the fields shift away from the HOM coupler where they could be damped. This shows that quality control in cavity fabrication can play an important role in accelerator performance.
[Figures: cavity shape – ideal in silver vs. deformed in gold; field profiles of high-Q modes in the deformed cavity near the HOM coupler]

Finite Element Discretization
* Curvilinear tetrahedral elements: high-fidelity modeling
* High-order hierarchical vector basis functions (H(curl))
  – Provide the tangential continuity required by the physics
  – Make boundary conditions easy to set
  – Significantly reduce phase error
Number of basis functions per entity (p = basis order):
  Edge:    p
  Face:    p(p-1)
  Volume:  p(p-1)(p-2)/2
  Total:   6E + 4F + V   (6 edges, 4 faces, 1 volume per tetrahedron; E, F, V are the per-entity counts above)
[Figures: shape functions; example mesh of a 200 mm structure with a 0.5 mm gap]
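As a consistency check on the table (a tetrahedron has 6 edges, 4 faces, and 1 interior), the number of H(curl) basis functions per element implied by it is

\[ N_{\mathrm{elem}}(p) = 6p + 4\,p(p-1) + \tfrac{1}{2}\,p(p-1)(p-2), \]

which gives 6 for p = 1 (lowest-order edge elements), 20 for p = 2, and 45 for p = 3.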

Time-Domain Analysis
Solve the second-order vector wave equation to compute transient and wakefield effects of beams inside cavities.
[Equation and notation shown on slide]
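The equation itself is not reproduced in this transcript; the standard second-order curl-curl form for the electric field driven by a beam current J, consistent with the description above, is

\[ \nabla \times \left( \frac{1}{\mu}\,\nabla \times \vec{E} \right) + \varepsilon\,\frac{\partial^{2} \vec{E}}{\partial t^{2}} = -\,\frac{\partial \vec{J}}{\partial t}, \]

subject to appropriate boundary conditions on the cavity walls and beam-pipe ports.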

Finite-Element Time-Domain Analysis
H(curl)-conforming elements (discretized in space) with vector basis functions N_i; the semi-discrete system is an ODE in matrix and vector form.
[Equations shown on slide]
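The discretized forms are likewise not in the transcript; expanding the field in the H(curl) basis functions, a standard Galerkin treatment consistent with the slide gives

\[ \vec{E}(\vec{x},t) = \sum_i x_i(t)\,\vec{N}_i(\vec{x}), \qquad M\,\frac{d^{2} x}{dt^{2}} + S\,x = f, \]
\[ M_{ij} = \int_\Omega \varepsilon\,\vec{N}_i \cdot \vec{N}_j \, d\Omega, \qquad S_{ij} = \int_\Omega \frac{1}{\mu}\,(\nabla\times\vec{N}_i)\cdot(\nabla\times\vec{N}_j)\, d\Omega, \qquad f_i = -\int_\Omega \vec{N}_i \cdot \frac{\partial \vec{J}}{\partial t}\, d\Omega . \]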

Newmark-β Scheme for Time Stepping
* Unconditionally stable when β ≥ 0.25
* A linear system Ax = b needs to be solved at each time step
* The matrix in the linear system is symmetric positive definite
* Solved with conjugate gradient + block Jacobi / incomplete Cholesky preconditioning
* Gedney & Navsariwala, "An unconditionally stable finite element time-domain solution of the vector wave equation," IEEE Microwave and Guided Wave Letters, vol. 5, pp. , 1995
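The update formula is not shown in the transcript; applying the Newmark-β recurrence (with γ = 1/2) to the semi-discrete system M ẍ + S x = f gives the standard three-level scheme

\[ \left(M + \beta\,\Delta t^{2} S\right) x^{n+1} = \Delta t^{2}\!\left[\beta f^{n+1} + (1-2\beta) f^{n} + \beta f^{n-1}\right] + \left[2M - (1-2\beta)\Delta t^{2} S\right] x^{n} - \left(M + \beta\,\Delta t^{2} S\right) x^{n-1}, \]

so the system matrix A = M + βΔt²S is symmetric positive definite and fixed in time, matching the bullets above.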

Wakefield Calculations
* At any given time, the electric field and the magnetic flux density can be calculated from the solution [expressions shown on slide]
* Wake potential [formula shown on slide]
[Figure: cavity for the International Linear Collider, with the coupler region]
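The wake-potential formula is not reproduced here; up to a sign convention, the conventional definition of the longitudinal wake potential of a bunch of total charge q moving at the speed of light is

\[ W_{\parallel}(s) = \frac{1}{q} \int E_z\!\left(x_0, y_0, z,\; t = \frac{z+s}{c}\right) dz, \]

where s is the distance behind the reference particle and (x_0, y_0) is the transverse position of the test path.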

Parallel Finite-Element Time-Domain Code: T3P
Workflow: CAD model → mesh generation → NetCDF mesh file + input parameters → partition mesh → assemble matrices → solver/time stepping → postprocess E/B → analysis & visualization
* Ongoing work:
  – Preconditioners and multiple right-hand sides
  – Fine-tuning mesh partitioning
[Figure: strong scaling study, mesh with  million elements, p=2, NDOFs = 1.6 billion]
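To make the solver/time-stepping stage concrete, below is a minimal, self-contained C++ sketch (not the actual T3P implementation, which is distributed, sparse, MPI-parallel, and preconditioned) of the Newmark-β loop in which each step solves the SPD system with conjugate gradient.

// Minimal sketch of Newmark-beta time stepping for M x'' + S x = f, where each
// step solves the SPD system A x^{n+1} = b with unpreconditioned conjugate
// gradient. Dense matrices are used only to keep the example short.
#include <cmath>
#include <cstdio>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;   // dense, row-major, for illustration only

Vec matvec(const Mat& A, const Vec& x) {
  Vec y(x.size(), 0.0);
  for (size_t i = 0; i < A.size(); ++i)
    for (size_t j = 0; j < x.size(); ++j) y[i] += A[i][j] * x[j];
  return y;
}

double dot(const Vec& a, const Vec& b) {
  double s = 0.0;
  for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
  return s;
}

// Conjugate gradient for SPD A (tolerance and iteration cap are illustrative).
Vec cg(const Mat& A, const Vec& b, double tol = 1e-12, int maxit = 1000) {
  Vec x(b.size(), 0.0), r = b, p = r;
  double rr = dot(r, r);
  for (int it = 0; it < maxit && std::sqrt(rr) > tol; ++it) {
    Vec Ap = matvec(A, p);
    double alpha = rr / dot(p, Ap);
    for (size_t i = 0; i < x.size(); ++i) { x[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
    double rr_new = dot(r, r);
    for (size_t i = 0; i < p.size(); ++i) p[i] = r[i] + (rr_new / rr) * p[i];
    rr = rr_new;
  }
  return x;
}

int main() {
  // Toy 2-DOF system standing in for the assembled finite-element matrices.
  Mat M = {{2.0, 0.0}, {0.0, 2.0}};
  Mat S = {{4.0, -1.0}, {-1.0, 4.0}};
  const double beta = 0.25, dt = 0.05;
  const int nsteps = 200, n = 2;

  // A = M + beta*dt^2*S is assembled once and reused at every step.
  Mat A = M;
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < n; ++j) A[i][j] += beta * dt * dt * S[i][j];

  Vec xnm1(n, 0.0), xn(n, 0.0);
  xn[0] = 1.0;        // some nonzero initial state
  Vec f(n, 0.0);      // drive term, constant here (beam current term in T3P)

  for (int step = 0; step < nsteps; ++step) {
    // b = dt^2*[beta f^{n+1} + (1-2beta) f^n + beta f^{n-1}]
    //     + [2M - (1-2beta) dt^2 S] x^n - [M + beta dt^2 S] x^{n-1}
    Vec Mx = matvec(M, xn), Sx = matvec(S, xn), Axm = matvec(A, xnm1);
    Vec b(n);
    for (int i = 0; i < n; ++i)
      b[i] = dt * dt * f[i] + 2.0 * Mx[i] - (1.0 - 2.0 * beta) * dt * dt * Sx[i] - Axm[i];
    Vec xnp1 = cg(A, b);
    xnm1 = xn;
    xn = xnp1;
  }
  std::printf("x after %d steps: (%g, %g)\n", nsteps, xn[0], xn[1]);
  return 0;
}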

3D PEP-X Undulator Taper
* Need to compute the short-range wakefield due to a 100 mm smooth transition from a 50 mm × 6 mm elliptical pipe to a 75 mm × 25 mm one
* Beam bunch length is 0.5 mm in a 300 mm long structure
[Figure: computer model of the undulator taper, 50 mm × 6 mm to 75 mm × 25 mm]

Moving Window with p-Refinement
* Use a window to limit the computational domain
  – Fields to the left of the window cannot catch up, since the beam moves at the speed of light
  – Fields to the right of the window should be zero
* Use the finite-element basis function order p to implement the window (see the sketch after this list)
  – p is nonzero for elements inside the window
  – p = 0 for elements outside the window
* Move the window when the beam is close to its right boundary
  – Prepare the mesh, repartition the mesh, assemble matrices, transfer the solution, …
[Figure: schematic of the window with p=2 inside and p=0 outside; dimensions labeled f, b, d]
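As referenced above, a minimal sketch (with hypothetical data structures, not the ACE3P/T3P mesh classes) of how the window can be realized purely through the per-element basis order:

// Elements whose centroid lies inside the moving window get basis order p = 2
// (active unknowns); everything else gets p = 0 (no unknowns).
#include <cstdio>
#include <vector>

struct Element {
  double zc;   // longitudinal coordinate of the element centroid
  int    p;    // assigned basis order
};

// Assign orders for a window [zmin, zmax]; returns the number of active elements.
int assignWindowOrders(std::vector<Element>& elems, double zmin, double zmax, int pin) {
  int active = 0;
  for (auto& e : elems) {
    e.p = (e.zc >= zmin && e.zc <= zmax) ? pin : 0;
    if (e.p > 0) ++active;
  }
  return active;
}

int main() {
  // Ten dummy elements spaced 10 mm apart along the beam axis.
  std::vector<Element> elems;
  for (int i = 0; i < 10; ++i) elems.push_back({0.005 + 0.010 * i, 0});

  double beamZ = 0.035;              // current bunch position (m)
  double back = 0.02, front = 0.03;  // window extent behind/ahead of the bunch
  int n = assignWindowOrders(elems, beamZ - back, beamZ + front, 2);
  std::printf("%d of %zu elements are active in the window\n", n, elems.size());
  return 0;
}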

Solution Transfer When the Window Moves
* Scatter x^n to each element
  – Keep values in the overlapping region
  – Drop values to the left of the new window
  – Set zeros to the right of the overlapping region
  – Redistribute to different processes according to the new partitioning
* Gather values to form the new x^n
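A minimal serial sketch of this transfer (hypothetical helper, not the T3P routines, which additionally redistribute values according to the new mesh partitioning): DOFs present in both windows keep their values, new DOFs start at zero, and DOFs left behind are dropped.

#include <cstdint>
#include <cstdio>
#include <unordered_map>
#include <vector>

using GlobalId = std::uint64_t;

std::vector<double> transferWindowSolution(
    const std::unordered_map<GlobalId, double>& oldValues,  // x^n keyed by global DOF id
    const std::vector<GlobalId>& newDofIds) {               // DOF ids of the new window
  std::vector<double> xNew(newDofIds.size(), 0.0);          // zero by default (new region)
  for (size_t i = 0; i < newDofIds.size(); ++i) {
    auto it = oldValues.find(newDofIds[i]);
    if (it != oldValues.end()) xNew[i] = it->second;        // overlap: keep the old value
  }
  return xNew;                                              // dropped DOFs never appear here
}

int main() {
  std::unordered_map<GlobalId, double> oldX = {{10, 1.5}, {11, -0.2}, {12, 0.7}};
  std::vector<GlobalId> newIds = {11, 12, 13, 14};          // 11, 12 overlap; 13, 14 are new
  std::vector<double> xNew = transferWindowSolution(oldX, newIds);
  for (size_t i = 0; i < newIds.size(); ++i)
    std::printf("dof %llu -> %g\n", (unsigned long long)newIds[i], xNew[i]);
  return 0;
}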

Benchmarking with a 4 mm Bunch
* Run on XT5 with 5 nodes (60 cores)
  – Case 1: the whole structure/mesh with 2 mm mesh size: 21.5 minutes
  – Case 2: moving window with p-refinement (same mesh): 11 minutes
* Identical wakefields within the beam (40 mm) and its trailing zone (10 mm)
* The moving window significantly reduces the computational resources required
* Additional benefits:
  – Smaller core counts
  – Faster execution time
  – Ability to solve larger problems
[Figure: wake potential comparison]

Movie of Wakefields in PEP-X Undulator Taper

Wakefields of the Undulator Taper with a 0.5 mm Bunch
* 60 mm trailing area with a 0.5 mm beam bunch size
* 0.25 mm mesh size, 19 million quadratic tetrahedral elements
* p = 2 inside the window, p = 0 outside
* 15 windows (90 mm)
* Maximum NDOFs is only 54.9 million; maximum number of elements is 8.6 million
[Figure: wake potential, including reconstruction of the wake of a 3 mm bunch]

PEP-X Wiggler Taper
Model: a 400 mm transition (10× the taper height) between a rectangular beam pipe (45 mm × 15 mm) and a circular beam pipe (radius 48 mm).
Calculate: wakefields for a 0.5 mm bunch with a 60 mm tail.
[Figures: wiggler taper geometry (45 mm × 15 mm to R48 mm, 400 mm transition) and wiggler mesh; undulator taper geometry for comparison (50 mm × 6 mm to 75 mm × 25 mm, 100 mm transition)]

Challenges in the Wakefield Calculation
* The number of mesh elements would be roughly 4 to 5 times larger than for the undulator taper
* The number of time steps is about 3 to 4 times larger
* Meshing problem: such a large mesh (~100 million curvilinear elements) cannot easily be generated
  – The mesh generator is serial (parallel mesh generation is under development at ITAPS)
  – It needs a computer with a lot of memory (128 GB)
* Inefficient use of computer resources
  – The mesh in each new window needs to be read in from a file
  – Or stored in memory in parallel
Can we do online mesh refinement in the window region?
* Prepare the mesh, repartition, assemble matrices, transfer the solution, …

Mesh Refinement: Edge Splitting
* There are many different ways of splitting a tetrahedron
* Dividing all 6 edges is better suited for resolving a short beam bunch
* Each tetrahedron generates 8 smaller tetrahedral elements (see the sketch below)
[Figures: splitting a linear tetrahedron; splitting a quadratic tetrahedron]
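A minimal sketch (hypothetical data structures, not the ACE3P mesh classes) of the 1-to-8 edge split: the midpoints of all six edges become new vertices, yielding four corner tetrahedra plus four tetrahedra from the interior octahedron, split here along the m01–m23 diagonal.

#include <algorithm>
#include <array>
#include <cstdio>
#include <map>
#include <utility>
#include <vector>

using Tet = std::array<int, 4>;   // global vertex ids

// Return (creating if necessary) the id of the midpoint vertex of edge (a, b).
int midpointId(int a, int b, std::map<std::pair<int, int>, int>& edgeMid, int& nextId) {
  std::pair<int, int> key = std::minmax(a, b);
  auto it = edgeMid.find(key);
  if (it != edgeMid.end()) return it->second;
  edgeMid[key] = nextId;
  return nextId++;
}

// Split one tetrahedron (v0, v1, v2, v3) into 8 children.
std::vector<Tet> splitTet(const Tet& t, std::map<std::pair<int, int>, int>& edgeMid, int& nextId) {
  int m01 = midpointId(t[0], t[1], edgeMid, nextId), m02 = midpointId(t[0], t[2], edgeMid, nextId),
      m03 = midpointId(t[0], t[3], edgeMid, nextId), m12 = midpointId(t[1], t[2], edgeMid, nextId),
      m13 = midpointId(t[1], t[3], edgeMid, nextId), m23 = midpointId(t[2], t[3], edgeMid, nextId);
  return {
      {t[0], m01, m02, m03}, {t[1], m01, m12, m13},   // four corner tets
      {t[2], m02, m12, m23}, {t[3], m03, m13, m23},
      {m01, m23, m02, m03}, {m01, m23, m03, m13},     // interior octahedron split
      {m01, m23, m13, m12}, {m01, m23, m12, m02}};    // along the m01-m23 diagonal
}

int main() {
  std::map<std::pair<int, int>, int> edgeMid;  // midpoints shared between neighboring tets
  int nextId = 4;                              // ids 0..3 are the original vertices
  std::vector<Tet> children = splitTet({0, 1, 2, 3}, edgeMid, nextId);
  for (const Tet& c : children) std::printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]);
  return 0;
}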

Online Parallel Mesh Refinement
* The midpoints of the 6 edges are new vertices
  – Need a coordinated way to number new vertices on process boundaries
* This is similar to numbering the DOFs of 1st-order edge elements in our electromagnetic simulation
* Regard each new vertex as a DOF
* Run our parallel partitioning and numbering routines to obtain IDs that are unique across processors
* Form the refined (denser) mesh by local edge splitting of each element owned by the processor
* Only ~500 lines of additional code
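A minimal serial sketch (hypothetical, not the ACE3P routines) of the idea on this slide: the midpoint vertices created by edge splitting are numbered by treating each unique edge exactly like a 1st-order edge-element DOF; in the parallel code the same effect is obtained by running the existing parallel partitioning and numbering routines so that IDs agree across process boundaries.

#include <algorithm>
#include <array>
#include <cstdio>
#include <map>
#include <utility>
#include <vector>

using Tet = std::array<int, 4>;

// Assign one global id per unique edge; ids start after the existing vertices,
// so they can be used directly as the ids of the new midpoint vertices.
std::map<std::pair<int, int>, int> numberEdges(const std::vector<Tet>& tets, int numVertices) {
  static const int edgeOf[6][2] = {{0, 1}, {0, 2}, {0, 3}, {1, 2}, {1, 3}, {2, 3}};
  std::map<std::pair<int, int>, int> edgeId;
  int next = numVertices;
  for (const Tet& t : tets)
    for (const auto& e : edgeOf) {
      std::pair<int, int> key = std::minmax(t[e[0]], t[e[1]]);
      if (edgeId.emplace(key, next).second) ++next;   // new unique edge -> new midpoint id
    }
  return edgeId;
}

int main() {
  // Two tets sharing the face (1, 2, 3): shared edges must receive a single id.
  std::vector<Tet> tets = {{0, 1, 2, 3}, {1, 2, 3, 4}};
  auto edgeId = numberEdges(tets, /*numVertices=*/5);
  for (const auto& kv : edgeId)
    std::printf("edge (%d,%d) -> midpoint vertex id %d\n", kv.first.first, kv.first.second, kv.second);
  return 0;
}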

Solution Transfer When the Mesh Changes
Reference: "A Projection Method for Discretized Electromagnetic Fields on Unstructured Meshes," Lie-Quan Lee, CSE09, Miami, Florida, March 2-6, 2009; Tech. pub. SLAC-PUB.
[Figure: quantities x^n, x^{n-1}, f^n, f^{n-1} in the window region, with p=0 / p=2 / p=0 zones]

Solution Transfer between Meshes
* The vectors x^n, x^{n-1}, f^n, f^{n-1} need to be transferred onto the new mesh
* For x^n and x^{n-1}, a projection method is used (see the formulation below)
* Scalable parallel projection is a challenge
* The vectors f^n and f^{n-1} are simply recalculated on the new mesh
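The projection equations are not in the transcript; a standard L2 (Galerkin) projection of the old finite-element solution onto the new basis, in the spirit of the SLAC-PUB reference above, solves

\[ M^{\mathrm{new}} \, y = B \, x, \qquad M^{\mathrm{new}}_{ij} = \int_\Omega \vec{N}^{\mathrm{new}}_i \cdot \vec{N}^{\mathrm{new}}_j \, d\Omega, \qquad B_{ij} = \int_\Omega \vec{N}^{\mathrm{new}}_i \cdot \vec{N}^{\mathrm{old}}_j \, d\Omega , \]

where x holds the old coefficients and y the new ones; evaluating the mixed matrix B requires locating quadrature points of the new mesh in the old mesh, which is one reason a scalable parallel implementation is nontrivial.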

Wakefield Benchmarking
* 20 mm bunch size; window size 0.5 m (b = 0.1 m, f = 0.2 m); run on XT5 with 5 nodes (60 cores)
  – Case 1: 10 mm mesh size with hp-refinement: 36.5 minutes
  – Case 2: 5 mm mesh size with p-refinement: 23 minutes
* Identical results for s < 0.3 m
* The longer time in Case 1 is due to the overhead associated with mesh refinement
* hp-refinement uses less memory

Wakefields of a 1 mm Bunch in the PEP-X Wiggler Taper
* 11 windows (170 mm) with a 60 mm trailing area
* Maximum NDOFs: 74.7 million
* Maximum number of elements: 11.8 million
[Figure: wake potential, including reconstruction of the wake of a 3 mm bunch]

Wakefields in the Cornell ERL Vacuum Chamber
* Energy Recovery Linac (ERL) aluminum vacuum chamber
* Moving window with online mesh refinement; cores, ~5 hours on Jaguarpf at ORNL
* Calculating the wakefields of such a short bunch would not be possible without the algorithmic advances and INCITE-scale resources
* For a 0.6 mm bunch length, the loss factor is V/pC
[Figure: computer model of an ERL vacuum chamber device]

Movie of Wakefields in ERL Vacuum Chamber

Preliminary Results of Wakefields in the ILC Coupler
* Bunch length σ = 0.3 mm; Project X shares the same cavity design
* Without windows
  – Would require 256 million elements with 1.6 billion DOFs
  – 100k cores with 6 s per step
* With moving windows
  – ~30 million elements are used inside each window
  – 12k cores with 6 s per step (p = 2)
  – 12k cores with 0.4 s per step (p = 1)

Summary
* Calculating wakefields due to ultra-short bunches requires efficient computational methods
  – Moving window with p-refinement
  – Moving window with hp-refinement
  – Parallel online mesh refinement
  – Parallel solution transfer through projection
* We are making a major impact on accelerator design through extreme-scale computing
* This is only possible through
  – SciDAC support for algorithmic advancement, and
  – INCITE allocation of large computing resources