High-performance modeling of plasma-based acceleration using the full PIC method
J.-L. Vay1, R. Lehe1, H. Vincenti1,2, B. Godfrey1,3, P. Lee4, I. Haber3
1Lawrence Berkeley National Laboratory, California, USA; 2Commissariat à l’Energie Atomique, Saclay, France; 3University of Maryland, Maryland, USA; 4University of Paris-Sud, Orsay, France
2nd European Advanced Accelerator Concepts, 13-19 September 2015, Elba Island, Italy
Outline (diagram labels): Numerical Cherenkov; LPA modeling; Boosted frame; Spectral EM solver; FDTD; PSATD; Lab frame; 3D; CIRC; Spectral CIRC solver; CIRC FDTD; CIRC PSATD; PML; Novel analysis; Parallel; Multi-source model
Laser plasma acceleration (LPA): analogy
[Figure: a surfer riding a boat's wake, compared with an e- beam riding a laser's wake]
Modeling from first principles is very challenging
For a 10 GeV-scale stage: a laser of ~1 μm wavelength propagates through ~1 m of plasma, so millions of time steps are needed (similar to modeling a 5 m boat crossing the ~5000 km Atlantic Ocean).
Solution: model in a frame moving near the speed of light*
Lab frame: λ ≈ 1 μm, L ≈ 1 m, L/λ = 1 m / 1 μm = 1,000,000
Boosted frame (γ = 100): λ’ = 200 μm, L’ = 0.01 m, L’/λ’ = 0.01 m / 200 μm = 50
Compaction: ×20,000
[Portrait: Hendrik Lorentz]
BELLA-scale runs with ~5k CPU-hours: 1D run in 2006, 3D run in 2011
Alternate or complementary solutions: quasistatic, laser envelope, azimuthal Fourier decomposition (“Circ”), …
*J.-L. Vay, Phys. Rev. Lett. 98, 130405 (2007)
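The compaction figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes a frame co-propagating with the laser, so that the laser wavelength is stretched by γ(1+β) and the plasma length is contracted by γ; the function and variable names are illustrative and not taken from any code mentioned in the talk.

```python
import math

def boosted_scales(wavelength, plasma_length, gamma):
    """Return (wavelength', length') in a frame moving at Lorentz factor gamma.

    Assumption: the frame moves along the laser propagation direction, so the
    laser is Doppler-stretched by gamma*(1+beta) and the plasma is contracted by gamma.
    """
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    return wavelength * gamma * (1.0 + beta), plasma_length / gamma

lab_wavelength = 1.0e-6   # m, laser wavelength in the lab frame (slide value)
lab_length = 1.0          # m, plasma length in the lab frame (slide value)
gamma = 100.0             # Lorentz factor of the boosted frame (slide value)

boost_wavelength, boost_length = boosted_scales(lab_wavelength, lab_length, gamma)
lab_ratio = lab_length / lab_wavelength        # range of scales in the lab frame
boost_ratio = boost_length / boost_wavelength  # range of scales in the boosted frame
compaction = lab_ratio / boost_ratio           # equals gamma^2 * (1 + beta)

print(f"lab frame     L/lambda = {lab_ratio:.3g}")
print(f"boosted frame L/lambda = {boost_ratio:.3g}")
print(f"compaction             = {compaction:.3g}")
```

The compaction factor reduces to γ²(1+β), which is why it grows roughly as 2γ² for large γ.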
Relativistic plasma PIC is subject to “numerical Cherenkov”*
Numerical dispersion leads to crossing of the EM field and plasma modes -> instability.
[Figure panels: Exact Maxwell; Standard PIC]
*B. B. Godfrey, “Numerical Cherenkov instabilities in electromagnetic particle codes”, J. Comput. Phys. 15 (1974)
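The mode crossing can be illustrated with a deliberately simplified case. The sketch below assumes a 1D Yee (second-order FDTD) vacuum dispersion relation as a stand-in for "standard PIC" and a cold beam mode ω = βck without aliases; it only locates the crossing point and is not the full PIC dispersion analysis. All names and parameter values are illustrative.

```python
import math

def yee_omega(k, dz, dt, c=1.0):
    """Angular frequency of the 1D Yee vacuum mode at wavenumber k."""
    return (2.0 / dt) * math.asin((c * dt / dz) * math.sin(0.5 * k * dz))

def find_crossing(beta, dz, dt, c=1.0, tol=1e-12):
    """Bisect for k in (0, pi/dz] where the Yee light mode meets omega = beta*c*k.

    Returns None when the two modes do not cross in the first Brillouin zone.
    """
    def mismatch(k):
        return yee_omega(k, dz, dt, c) - beta * c * k

    lo, hi = 1e-6 * math.pi / dz, math.pi / dz
    if mismatch(lo) * mismatch(hi) > 0.0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mismatch(lo) * mismatch(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

dz, dt, beta = 1.0, 0.5, 0.99   # illustrative: c*dt/dz = 0.5, plasma drifting at 0.99c
k_cross = find_crossing(beta, dz, dt)
residual = yee_omega(k_cross, dz, dt) - beta * k_cross
print(f"crossing at k*dz = {k_cross * dz:.4f} "
      f"(about {2.0 * math.pi / (k_cross * dz):.1f} cells per wavelength)")
```

With exact Maxwell, ω = ck stays above βck for every k when β < 1, so no such crossing exists; it appears here only because the numerical light mode is slower than c at short wavelengths.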
Space/time discretization aliases more crossings in 2/3-D
[Figure: dispersion surfaces in (kx, kz, ω) for Exact Maxwell and Standard PIC, showing the light mode, the plasma mode at β=0.99, and their aliases]
Analysis calls for the full PIC numerical dispersion relation.
Need to consider at least the first aliases mx={-3…+3} to study stability.
Numerical dispersion relation of full-PIC algorithm*
2-D relation (Fourier space): [equation images not preserved]
Then simplify and solve with Mathematica…
*B. B. Godfrey, J. L. Vay, I. Haber, J. Comp. Phys. 248 (2013)
Rapid recent progress on analysis and mitigation of numerical Cherenkov instability
Analysis of numerical Cherenkov has been generalized:
- to finite-difference PIC codes (“magical” time step explained):
  B. B. Godfrey and J.-L. Vay, J. Comp. Phys. 248 (2013) 33.
  X. Xu et al., Comp. Phys. Comm. 184 (2013) 2503.
- to pseudo-spectral PIC codes:
  B. B. Godfrey, J.-L. Vay, I. Haber, J. Comp. Phys. 258 (2014) 689.
  P. Yu et al., J. Comp. Phys. 266 (2014) 124.
Efficient suppression techniques were recently developed:
- for finite-difference PIC codes:
  B. B. Godfrey and J.-L. Vay, J. Comp. Phys. 267 (2014) 1.
  B. B. Godfrey and J.-L. Vay, Comp. Phys. Comm., in press.
- for pseudo-spectral PIC codes:
  B. B. Godfrey, J.-L. Vay, I. Haber, IEEE Trans. Plas. Sci. 42 (2014) 1339.
  P. Yu et al., arXiv:1407.0272 (2014).
High-order FDTD + small time steps: exact solution, but expensive
[Figure panels: cΔt/Δx~0.45; higher order; cΔt/Δx~0.045]
Spectral solver offers “infinite order” but still needs small time steps
Finite-Difference Time-Domain (FDTD) vs. Pseudo-Spectral Time-Domain (PSTD), where F = FFT and F-1 is its inverse
[Figure panels: FDTD cΔt/Δx~0.45; PSTD cΔt/Δx~0.45; PSTD cΔt/Δx~0.045]
PSTD converges to the exact solution (on the grid) for Δt -> 0.
PSTD is the limit of high-order FDTD when n -> ∞.
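The statement that the spectral derivative is the infinite-order limit of finite differences can be checked on the spatial derivative alone. The sketch below assumes the standard centered staggered-grid stencil with Fornberg weights and compares the modified wavenumber of an order-p stencil with the exact kΔx; it says nothing about time stepping. The function names and the test wavelength are illustrative choices.

```python
import math

def staggered_coeffs(order):
    """Fornberg weights c_1..c_m of the centered staggered first derivative, m = order/2.

    The derivative is approximated as
    (1/dx) * sum_j c_j * [f(x + (j-1/2)dx) - f(x - (j-1/2)dx)].
    """
    m = order // 2
    odd_double_factorial = math.prod(range(1, 2 * m, 2))   # (2m-1)!!
    coeffs = []
    for j in range(1, m + 1):
        numerator = (-1) ** (j + 1) * odd_double_factorial ** 2
        denominator = (2 ** (2 * m - 2) * math.factorial(m + j - 1)
                       * math.factorial(m - j) * (2 * j - 1) ** 2)
        coeffs.append(numerator / denominator)
    return coeffs

def modified_wavenumber(theta, order):
    """Return k_mod*dx seen by the order-p stencil for a plane wave with k*dx = theta."""
    return sum(c * 2.0 * math.sin((2 * j - 1) * theta / 2.0)
               for j, c in enumerate(staggered_coeffs(order), start=1))

theta = math.pi / 2.0            # 4 cells per wavelength, a deliberately coarse case
orders = [2, 4, 8, 16, 32]
errors = [abs(modified_wavenumber(theta, p) - theta) / theta for p in orders]
for p, err in zip(orders, errors):
    print(f"order {p:2d}: stencil half-width {p // 2:2d} cells, relative error {err:.2e}")
```

The stencil half-width grows as p/2 cells, which is the price paid for the shrinking error as the order increases.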
Analytical pseudo-spectral solver offers exact solution with no Courant condition
Pseudo-Spectral Analytical Time-Domain1 (PSATD) [update equations, built from F and F-1, not preserved]
[Figure panels: PSTD cΔt/Δx~0.045; PSATD cΔt/Δx=50, 1 time step]
1I. Haber, R. Lee, H. Klein & J. Boris, Proc. Sixth Conf. on Num. Sim. Plasma, Berkeley, CA, 46-48 (1973)
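The absence of a Courant limit can be demonstrated in the simplest possible setting. The sketch below assumes a 1D periodic grid in vacuum (no current source, unlike the full PSATD algorithm), with fields Ex and By obeying ∂Ex/∂t = -c² ∂By/∂z and ∂By/∂t = -∂Ex/∂z, and integrates each Fourier mode analytically over one step. It uses NumPy FFTs; the function name, grid size and pulse shape are illustrative.

```python
import numpy as np

def psatd_vacuum_step(ex, by, dx, dt, c=1.0):
    """Advance (Ex, By) by one step of arbitrary size dt on a periodic 1D grid, in vacuum."""
    k = 2.0 * np.pi * np.fft.fftfreq(ex.size, d=dx)   # signed wavenumbers
    cos_wt = np.cos(c * k * dt)
    sin_wt = np.sin(c * k * dt)                       # odd in k, so it carries sign(k)
    ex_hat, by_hat = np.fft.fft(ex), np.fft.fft(by)
    # Exact rotation of each Fourier mode over the interval dt
    ex_new = cos_wt * ex_hat - 1j * c * sin_wt * by_hat
    by_new = cos_wt * by_hat - 1j * sin_wt * ex_hat / c
    return np.fft.ifft(ex_new).real, np.fft.ifft(by_new).real

n, dx, c = 256, 1.0, 1.0
z = np.arange(n) * dx
ex0 = np.exp(-0.5 * ((z - 64.0) / 5.0) ** 2)   # Gaussian pulse
by0 = ex0 / c                                   # By = Ex/c makes it right-propagating

shift_cells = 50
dt = shift_cells * dx / c                       # c*dt/dx = 50, as on the slide
ex1, by1 = psatd_vacuum_step(ex0, by0, dx, dt, c)

max_error = np.max(np.abs(ex1 - np.roll(ex0, shift_cells)))
print(f"max error after one step with c*dt/dx = {shift_cells}: {max_error:.2e}")
```

A single step moves the pulse by 50 cells with an error at the level of floating-point round-off, whereas an explicit FDTD or PSTD scheme would be unstable at this time step.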
Analytical theory and simulations show PSATD most stable
The Pseudo-Spectral Time-Domain algorithm has a restrictive Courant condition (UCLA team has an article on arXiv).
Energy growth levels are influenced by the number of unstable modes, the initial noise spectrum and nonlinear effects, as well as by linear growth rates.
Tailored filters were shown to lower the instability growth rate further in all cases.
*B. B. Godfrey, J.-L. Vay, I. Haber, “Numerical stability analysis of the Pseudo-Spectral Analytical Time-Domain PIC algorithm”, J. Comp. Phys. 258 (2014) 689.
Development of a spectral quasi-cylindrical PIC code*
FFT in z, Hankel transform in r
*R. Lehe, M. Kirchen, I. A. Andriyash, B. Godfrey, J.-L. Vay, “A spectral, quasi-cylindrical and dispersion-free Particle-In-Cell algorithm”, arXiv:1507.04790 (2015).
Offers a better dispersion relation, e.g. for the group velocity of a laser pulse in vacuum.
Very important for LPA, since small differences in group velocity determine the dephasing length.
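The size of the group-velocity error in a finite-difference scheme can be estimated analytically. The sketch below assumes the 1D Yee vacuum dispersion relation, sin(ωΔt/2) = (cΔt/Δx) sin(kΔx/2), as a stand-in for a finite-difference solver along the propagation axis; differentiating it gives the group velocity used here. A dispersion-free spectral solver would give exactly c in vacuum. The resolutions and the Courant number are illustrative values, not taken from the slides.

```python
import math

def yee_group_velocity(cells_per_wavelength, cfl):
    """Group velocity, in units of c, of the 1D Yee vacuum mode.

    cells_per_wavelength is lambda/dx and cfl is c*dt/dx.
    """
    half_theta = math.pi / cells_per_wavelength          # k*dx/2
    return math.cos(half_theta) / math.sqrt(1.0 - (cfl * math.sin(half_theta)) ** 2)

cfl = 0.5
for cells in (16, 32, 64):
    v_group = yee_group_velocity(cells, cfl)
    print(f"{cells:3d} cells/wavelength: v_g/c = {v_group:.6f}, deficit = {1.0 - v_group:.2e}")
```

Even at 32 cells per wavelength the deficit is a few parts in a thousand, which matters when the quantity of interest is a small difference between the group velocity and c.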
More stable to numerical Cherenkov instability
Removes the main numerical Cherenkov mode (aliases remain).
Very important for LPA: avoids spurious growth of emittance.
(NB: this can also be obtained by using a special finite-difference stencil.)
Space-time collocation reduces field-particle interpolation errors
e.g. relativistic electron co-propagating with a laser
[Figure: evolution of the transverse momentum for different resolutions; panels: staggered, nodal]
Staggered finite-difference codes introduce interpolation errors, which lead to a spurious force on the electrons.
Challenge for scaling very-high-order/pseudo-spectral solvers to a million cores
Standard: global FFTs, no halo cells. Scales poorly with the number of CPUs.
Proposed*: local FFTs with a fixed number Ng of guard cells and local exchanges. Scales, but the truncation is an approximation at very high/infinite orders!
Need to characterize/mitigate truncation errors to enable the new method!
*J.-L. Vay, I. Haber, B. Godfrey, J. Comput. Phys. 243, 260-268 (2013)
New error-predictive analytical model for stencil spatial variations/truncations in simulations
Previous model: stencil modification at one point accounted for with a single source term. Yields inaccurate results for orders p>2. Total truncation error amplitude: [equation image not preserved]
New model*: stencil modification at one point accounted for by multi-source error terms. Resolves the discrepancy at orders p>2. Total truncation error amplitude: [equation image not preserved]
*H. Vincenti, J.-L. Vay, arXiv 1507.05572 (2015)
Model predicts error from stencil truncation (λ=10Δx, staggered grid)
[Figure curves: full model; approximate formula]
Spectral: error varies as 1/Nguards^2.
Variations with wavelength (not shown) are also perfectly reproduced.
Finite very-high order requires far fewer guard cells than spectral.
*H. Vincenti, J.-L. Vay, arXiv 1507.05572 (2015)
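The slow decay of the truncation error at infinite order can be reproduced with a much simpler stand-in than the multi-source model cited above. The sketch below assumes the infinite-order staggered stencil, whose weights are c_j = (-1)^(j+1) 4/(π(2j-1)²), cuts it off after Nguards terms per side, and measures the resulting error on the derivative of a plane wave with λ=10Δx. It ignores domain decomposition entirely, and the function name is illustrative.

```python
import math

def truncated_spectral_error(n_guards, cells_per_wavelength=10.0):
    """Relative derivative error of the infinite-order staggered stencil cut at n_guards terms."""
    theta = 2.0 * math.pi / cells_per_wavelength          # k*dx of the test plane wave
    k_mod = 0.0
    for j in range(1, n_guards + 1):
        c_j = (-1) ** (j + 1) * 4.0 / (math.pi * (2 * j - 1) ** 2)
        k_mod += c_j * 2.0 * math.sin((2 * j - 1) * theta / 2.0)
    return abs(k_mod - theta) / theta

guard_counts = [4, 16, 64, 256, 1024]
for n_guards in guard_counts:
    err = truncated_spectral_error(n_guards)
    print(f"Nguards = {n_guards:5d}: relative error = {err:.2e}, "
          f"error * Nguards^2 = {err * n_guards**2:.3f}")
```

The weights decay only as 1/j², so the error envelope falls off as 1/Nguards², which is why an infinite-order stencil needs a very large number of guard cells to approach machine precision.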
Near-spectral precision achievable with very-high-order Maxwell solvers and domain decomposition
Number of guard cells required to achieve zero error at machine precision:
- finite order p: Nguards=p/2
- p=∞, double precision: Nguards=3e7
- p=∞, single precision: Nguards=1e4
New analysis confirms efficacy of PML with spectral solvers
Perfectly Matched Layer (PML) enables efficient open boundary conditions.
[Figure: simulation grid surrounded by PML; spurious reflections with no PML vs. with PML]
New analysis predicts exact reflection coefficients.
[Figure legends: simulations; new model1; previous model2]
1H. Vincenti, J.-L. Vay, arXiv 1507.05572 (2015)
2P. Lee, J.-L. Vay, Comp. Phys. Comm. 194, 1-9 (2015)
Accomplishments, lessons learned and prospects
Extension of analysis of numerical Cherenkov to various schemes in 2D/3D:
- explained the magical time step
- high-order/spectral codes are more stable than 2nd-order FDTD
- has led to the development of mitigation for FDTD, high-order and pseudo-spectral codes
Circ-spectral code based on the Hankel transform was successfully developed:
- combines the advantages of 2D-like costs and spectral accuracy
New analysis of stencil truncation:
- new parallelization scheme based on domain decomposition is possible
- confirms efficacy of Perfectly Matched Layers with high-order/spectral solvers
Latest developments and ongoing work aim to enable fast, scalable solvers toward “real-time” modeling of plasma accelerators.
Main results:
Fig1: Infinite-order error scales poorly with Nguards.
Fig2: Solution: going to very high FINITE order still allows spectral precision (over a large band of frequencies) with domain decomposition and a reasonably low number of halo cells.
Fig3: The model will be very important for predicting the minimum number of guard cells required to keep errors below machine precision.