Dey-Mittra ADI: An absolutely stable method for Maxwell’s Equations with Embedded Boundaries
Travis Austin, John Cary, David Smithe
Tech-X Corporation
Funded by US-DOE grants DE-FG02-05ER84172, DE-FC02-07ER41499, and FA9451-06-D-0115/002.
Serguei Ovtchinnikov and Johan Carlsson have also worked on ADI, and Chet Nieter and Peter Stoltz have tested the method in VORPAL.
ComPASS Meeting, Boulder, CO, Tuesday, October 6, 2009
Overview
Background
Motivation: Dey-Mittra Cut-Cell Stability
Alternating-Direction Implicit (ADI) Methods: Divergence-preserving form; Tridiagonal Solves (Smithe, Cary, Carlsson, Ovtchinnikov)
Dey-Mittra ADI Method: Implementation; Frequency Extraction Results; Performance
Future Work
Present overview relatively quickly.
Background
Maxwell’s Equations and the Yee Method.
[Figure: Yee cell with staggered field components Ex, Ey, Ez, Bx, By, Bz]
The second form of Maxwell’s equations is written in terms of P and M: P contains the positive derivative terms and M contains the negative derivative terms.
Background
Yee Method (Faraday’s Law) and the Courant-Friedrichs-Lewy Stability Condition.
Faraday’s Law is shown for the By component, where Axz is the area of the face and lx and lz are the edge lengths.
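The CFL bound for the explicit Yee scheme can be checked numerically. This is a minimal sketch assuming the standard uniform-grid form c*dt <= 1/sqrt(1/dx^2 + 1/dy^2 + 1/dz^2); the function name `cfl_time_step` is illustrative and is not a VORPAL API.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def cfl_time_step(dx, dy=None, dz=None, c=C_LIGHT):
    """Largest stable explicit Yee time step on a uniform grid.

    Pass only dx for 1-D, dx and dy for 2-D, or all three for 3-D.
    """
    inv_sq = sum(1.0 / (d * d) for d in (dx, dy, dz) if d is not None)
    return 1.0 / (c * math.sqrt(inv_sq))


# Example: a 3-D grid with 1 mm cubic cells (an assumed cell size).
dt_3d = cfl_time_step(1e-3, 1e-3, 1e-3)
```

Halving the cell size in every direction halves the bound, which is why high spatial resolution forces small explicit time steps.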
Motivation: Dey-Mittra Cut-Cell Stability
Start by asking a rhetorical question: Why consider ADI for Maxwell’s Equations?
An embedded boundary can create cells with only a fraction of their volume inside the computational domain. (Point to the four corner cells.) Via stability considerations, these fractional-volume cells lead to a significantly reduced time step. The stairstep approach does not lead to these restrictions, but it is only first-order accurate because it approximates the boundary inaccurately.
[Figure: (a) Dey-Mittra Approach; (b) Stairstep Approach. Only change Faraday update.]
Motivation: Dey-Mittra Cut-Cell Stability
Dey-Mittra-Induced Stability Condition. Determination of fDM:
fractional value between 0 and 1
based on stability derived from the Gershgorin circle theorem
too-small cut cells make the time step prohibitively small
an implicit method would overcome these restrictions
A fractional factor, fDM, is defined which, via Gershgorin results, leads to a reduction in the time step and the exclusion of “too-small” cells to maintain stability. We need an implicit method to overcome these stability considerations.
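How a small cut cell shrinks a Gershgorin-based time-step bound can be seen on a toy problem. This sketch is an assumption-laden illustration, not the fDM formula from the slide: it uses a 1-D operator K = c^2 * tridiag(-1, 2, -1) / dx^2, the leapfrog condition dt <= 2/sqrt(lambda_max), and mimics division by a small cut-cell area by dividing one row by `cut_fraction`. All names are hypothetical.

```python
import math


def gershgorin_radius(matrix):
    """Upper bound on |eigenvalue|: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in matrix)


def leapfrog_dt_bound(matrix):
    """Sufficient leapfrog bound dt <= 2/sqrt(lambda_max), using Gershgorin for lambda_max."""
    return 2.0 / math.sqrt(gershgorin_radius(matrix))


def wave_matrix_1d(n, dx=1.0, c=1.0, cut_fraction=1.0, cut_row=None):
    """Toy operator; row `cut_row` is divided by `cut_fraction` (0 < fraction <= 1)."""
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        scale = c * c / (dx * dx)
        if i == cut_row:
            scale /= cut_fraction  # division by a small area inflates this row
        k[i][i] = 2.0 * scale
        if i > 0:
            k[i][i - 1] = -scale
        if i < n - 1:
            k[i][i + 1] = -scale
    return k


dt_full = leapfrog_dt_bound(wave_matrix_1d(8))
dt_cut = leapfrog_dt_bound(wave_matrix_1d(8, cut_fraction=0.25, cut_row=4))
```

In this toy, a cell retaining one quarter of its size halves the bound, and the bound goes to zero as the fraction does, which is the motivation for discarding too-small cells or switching to an implicit method.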
Motivation: Time Step Based on Accuracy
Smithe’s Rule:
6 temporal points/period => good picture
8 temporal points/period => beat theoretician
12 temporal points/period => build something
Explicit FDTD at high spatial resolution forces us to over-resolve the simulation in time because of the CFL condition. We want an implicit approach that is not too costly and lets the temporal resolution be governed by accuracy rather than stability.
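The gap between an accuracy-based and a stability-based time step is simple arithmetic. A minimal sketch, assuming the mode frequency and the CFL time step are known; the function names and the numbers in the example are made up for illustration.

```python
def accuracy_time_step(frequency_hz, points_per_period):
    """Time step that places `points_per_period` samples in one wave period."""
    return 1.0 / (frequency_hz * points_per_period)


def cfl_multiple(frequency_hz, points_per_period, dt_cfl):
    """How many CFL time steps fit into one accuracy-limited time step."""
    return accuracy_time_step(frequency_hz, points_per_period) / dt_cfl


# Hypothetical case: a 3 GHz mode, 12 points/period, and a CFL step of 1 ps.
ratio = cfl_multiple(3.0e9, 12, 1.0e-12)
```

A ratio well above 1 means an explicit run spends most of its steps satisfying stability rather than accuracy.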
Alternating-Direction Implicit Methods: Divergence-Preserving Form
There are four 2nd-order accurate variants of the ADI algorithm, depending on the order of the operands. ZCZ is the first. We investigated the last …
Mention the work of Smithe, Carlsson, and Cary, funded by a DOE SBIR Phase I in 2007.
Alternating-Direction Implicit Methods: Divergence-Preserving Form
The operator P+M is the Yee-cell curl operator. Of the four ADI combinations, only the last form, DP, can be algebraically manipulated to show that its final operation is equivalent to a finite-difference curl. Thus it is divergence preserving, for source S.
Alternating-Direction Implicit Methods: Divergence-Preserving Form
Full details are in the paper: D. N. Smithe, J. R. Cary, J. A. Carlsson, “Divergence preservation in the ADI algorithms for electromagnetics,” J. of Comp. Physics 228, 7289 (2009).
Alternating-Direction Implicit Methods: Tridiagonal Solves
Each domain does a forward solve over its domain.
It passes boundary data to a single-process global solve.
It receives data back, then back-solves over its domain.
The colored blocks correspond to (EU^-1)*(L^-1D), which is the serial bottleneck matrix.
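The forward-solve and back-solve phases are those of the serial Thomas algorithm, sketched here as the single-domain building block. This is not the parallel scheme from the slide: the boundary exchange and the reduced global solve are left out, and the function name and band layout are assumptions.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system without pivoting.

    lower[i] multiplies x[i-1] (lower[0] is ignored) and
    upper[i] multiplies x[i+1] (upper[-1] is ignored).
    Assumes no zero pivot appears, e.g. a diagonally dominant matrix.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    # Forward solve: eliminate the sub-diagonal.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back solve: substitute from the last unknown upward.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


# Example: tridiag(-1, 2, -1) with a right-hand side built from x = [1, 2, 3, 4].
x = thomas_solve([0.0, -1.0, -1.0, -1.0], [2.0] * 4,
                 [-1.0, -1.0, -1.0, 0.0], [0.0, 0.0, 0.0, 5.0])
```

The work is linear in the number of unknowns, but the two sweeps are inherently sequential along the line, which is what makes the multi-process version a scaling concern.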
Alternating-Direction Implicit Methods: Tridiagonal Solves
Remedy is Concurrent Divide & Conquer.
In 2-D and 3-D there are multiple 1-D solves. Global solves are distributed across the processes. Reduced global solve is now a source of idle processor time.
Alternating-Direction Implicit Methods: Tridiagonal Solves
For good scaling, the local backward solve must cover latency.
N = number of cells in a process.
A longer 1-D dimension, N^(1/2) rather than N^(1/3), means more time to cover latency.
2-D scales more easily than 3-D for multiprocessor distributed computing.
Alternating-Direction Implicit Methods: Tridiagonal Solves
Scaling study on BG/L was favorable: there was good scaling as long as Ncells^(1/Ndim) was about 64 or larger. This implies typically good scaling for ADI in both 2-D and 3-D.
On an office Linux cluster, Ncells^(1/Ndim) of about 128 or larger was needed for good scaling. This implies good scaling for 2-D, and marginal scaling for 3-D.
[Figure: scaling plots; ideal scaling is the dotted line]
Dey-Mittra ADI Implementation
Accelerator devices have boundaries which are nonconvex. This breaks each row tridiagonal solve into several solves. Simply unit-fill the diagonals of rows for fully exterior field components, and set the RHS source to zero.
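The unit-fill trick can be sketched on a single grid line. In this hypothetical example, rows flagged as exterior are replaced by identity rows with zero right-hand side, so the exterior unknowns come out as zero and the interior segments on either side decouple within one ordinary tridiagonal solve. The helper names and the band layout are assumptions.

```python
def unit_fill_exterior(lower, diag, upper, rhs, exterior):
    """Return copies of the bands with exterior rows set to 1*x = 0."""
    lo, di, up, b = list(lower), list(diag), list(upper), list(rhs)
    for i, is_exterior in enumerate(exterior):
        if is_exterior:
            lo[i], di[i], up[i], b[i] = 0.0, 1.0, 0.0, 0.0
    return lo, di, up, b


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm (no pivoting); lower[0] and upper[-1] are ignored."""
    n = len(diag)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * cp[i - 1]
        cp[i] = upper[i] / denom
        dp[i] = (rhs[i] - lower[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


# A line of five unknowns whose middle one lies fully inside metal.
n = 5
lower = [0.0] + [-1.0] * (n - 1)
diag = [2.0] * n
upper = [-1.0] * (n - 1) + [0.0]
rhs = [1.0, 1.0, 9.0, 1.0, 1.0]
exterior = [False, False, True, False, False]
solution = solve_tridiagonal(*unit_fill_exterior(lower, diag, upper, rhs, exterior))
```

The two interior segments give the same answer as two separate two-unknown solves, without any change to the solver's loop structure.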
Dey-Mittra ADI Implementation: What is the Dey-Mittra Cut-Cell Algorithm?
Dey-Mittra is a metallic cut-cell algorithm giving 2nd-order accurate global solutions. It modifies Faraday’s law to use the non-metallic electric line length, dl_non-metallic, and the non-metallic magnetic flux area, dA_non-metallic.
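A cut-cell Faraday update for one By face might look like the following sketch. It assumes the integral form A * dBy/dt = -(circulation of E around the face), with each edge weighted by its non-metallic length; the function name and argument layout are hypothetical and not taken from VORPAL.

```python
def dey_mittra_update_by(by, dt, area,
                         ex_lo, ex_hi, ez_lo, ez_hi,
                         lx_lo, lx_hi, lz_lo, lz_hi):
    """Advance By on one x-z face by dt.

    ex_lo, ex_hi: Ex on the edges at the lower and upper z side of the face.
    ez_lo, ez_hi: Ez on the edges at the lower and upper x side of the face.
    lx_*, lz_*:   non-metallic lengths of those edges (0 for a fully metallic edge).
    area:         non-metallic area of the face (must be > 0).
    """
    # Discrete (curl E)_y * area = d(Ex)/dz - d(Ez)/dx, integrated over the face.
    circulation = (ex_hi * lx_hi - ex_lo * lx_lo) - (ez_hi * lz_hi - ez_lo * lz_lo)
    return by - dt * circulation / area


# Uncut unit cell versus a cell with half its area and a shortened upper edge.
by_uncut = dey_mittra_update_by(0.0, 0.1, 1.0, 0.0, 1.0, 0.0, 0.0,
                                1.0, 1.0, 1.0, 1.0)
by_cut = dey_mittra_update_by(0.0, 0.1, 0.5, 0.0, 1.0, 0.0, 0.0,
                              1.0, 0.25, 1.0, 1.0)
```

The division by `area` is where a very small cut cell makes the explicit update stiff.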
Dey-Mittra ADI Implementation: Dey-Mittra Cut-Cell Algorithm has two disadvantages
Division by small area limits the algorithm. The reduction in time step is significant: 0.5Δt for decent results, and 0.25Δt or even 0.10Δt for excellent results.
It throws away small cells, leading to occasional “pits” and “scratches” in the geometry, including particle creation/destruction surfaces.
Using ADI can eliminate both these inconveniences.
Dey-Mittra ADI Implementation
Technically, the stability of ADI requires that the two alternating-direction curl matrix operators, P and M, be anti-symmetric. The Dey-Mittra length/area factors appear to destroy this anti-symmetry. However, anti-symmetry can be recovered by solving for re-scaled fields and then removing the scale factors.
Bottom line: it is OK to use the Dey-Mittra difference matrix even though it is not anti-symmetric; i.e., it is still stable. (Even when matrix element …)
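The rescaling argument can be checked on a small matrix. This sketch assumes a toy system in which the Faraday block carries the Dey-Mittra length/area factors and the Ampere block is the plain transpose; the specific scale factors (square roots of the areas for B, square roots of the lengths for E) are an assumption chosen to make the toy work, since the slide does not state them. The incidence matrix `curl` and all values are made up.

```python
import math


def system_matrix(curl, areas, lengths):
    """Block matrix [[0, -A^-1 C L], [C^T, 0]] acting on (B, E)."""
    m, n = len(areas), len(lengths)
    k = [[0.0] * (m + n) for _ in range(m + n)]
    for i in range(m):
        for j in range(n):
            k[i][m + j] = -curl[i][j] * lengths[j] / areas[i]  # Faraday block
            k[m + j][i] = curl[i][j]                           # Ampere block
    return k


def rescale(k, areas, lengths):
    """Similarity transform S K S^-1 with S = diag(sqrt(areas), sqrt(lengths))."""
    s = [math.sqrt(a) for a in areas] + [math.sqrt(l) for l in lengths]
    size = len(s)
    return [[s[p] * k[p][q] / s[q] for q in range(size)] for p in range(size)]


def antisymmetry_defect(k):
    """Largest |K + K^T| entry; zero for an anti-symmetric matrix."""
    size = len(k)
    return max(abs(k[p][q] + k[q][p]) for p in range(size) for q in range(size))


curl = [[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]]  # toy difference matrix
areas = [1.0, 0.05]                          # second face is a small cut cell
lengths = [1.0, 0.3, 1.0]                    # middle edge is partly metallic
raw = system_matrix(curl, areas, lengths)
scaled = rescale(raw, areas, lengths)
```

Because the transform is a similarity, the rescaled matrix has the same eigenvalues as the original, so in this toy the stability properties of the anti-symmetric form carry over. The same diagonal factors apply to any splitting of `curl` into two parts.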
Dey-Mittra ADI: Frequency Extraction Results
Testing modes of the A6 magnetron.
[Figure: plot with labels Frequency (Hz) and 1/Nx^2]
ADI with Dey-Mittra is validated in VORPAL.
Dey-Mittra ADI: Frequency Extraction Results
2nd-order accuracy is verified, even for a dt that is 8 times the normal Courant limit.
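An observed order of accuracy is usually extracted from the error at two resolutions. A minimal sketch, assuming the frequency error scales as Nx^(-p); the numbers in the example are synthetic and are not measurements from these runs.

```python
import math


def observed_order(err_coarse, err_fine, n_coarse, n_fine):
    """Estimate p in error ~ N**(-p) from errors at two grid resolutions."""
    return math.log(err_coarse / err_fine) / math.log(n_fine / n_coarse)


# Synthetic errors that follow 1/Nx^2 exactly.
order = observed_order(1.0 / 50 ** 2, 1.0 / 100 ** 2, 50, 100)
```

An estimate close to 2 on successive refinements is what a straight line against 1/Nx^2 indicates.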
Dey-Mittra ADI: Frequency Extraction Results
Method investigated in 3D for the A15 cavity.
[Figure panels: Magnetic Field (z-component); Electric Field (x-component)]
tf = 343.06 ps
Time Steps = 2000
Time Step = 2.0 ΔtCFL
fDM = 0.0000001 => all cells kept in the simulation
Dey-Mittra ADI Performance: 2D, Single CPU Results
Simulation Parameters:
2D A6 Magnetron Benchmark
2500 (50x50) cells
GMRES w/ Jacobi preconditioner
1.358 ns of simulation time
Explicit: 1280 time steps
ADI-1.0: 320 time steps
ADI-2.0: 160 time steps
ADI-4.0: 80 time steps
ADI-8.0: 40 time steps
ADI-16.0: 20 time steps
ADI-32.0: 10 time steps
ADI-64.0: 5 time steps
MaxIts was raised from 100 to 500 at 8x CFL, then to 1000 at 32x CFL, and then to 5000 at 64x CFL.
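The step counts follow from the simulated time and the CFL multiple. This sketch takes the 320 steps of ADI-1.0 and the 1.358 ns of simulated time from the slide and derives the rest; reading the 1280 explicit steps as a time step of 0.25 times the CFL step is an inference from the numbers, not something the slide states.

```python
SIM_TIME_NS = 1.358   # simulated time from the slide
STEPS_AT_CFL = 320    # ADI-1.0 step count from the slide
EXPLICIT_STEPS = 1280 # explicit step count from the slide


def adi_steps(cfl_multiple, steps_at_cfl=STEPS_AT_CFL):
    """Number of steps when the time step is cfl_multiple times the CFL step."""
    return round(steps_at_cfl / cfl_multiple)


multiples = (1, 2, 4, 8, 16, 32, 64)
step_table = {m: adi_steps(m) for m in multiples}

dt_cfl_ps = 1000.0 * SIM_TIME_NS / STEPS_AT_CFL      # CFL step in picoseconds
explicit_fraction = STEPS_AT_CFL / EXPLICIT_STEPS    # explicit dt / CFL dt

for m in multiples:
    print(f"ADI-{m}.0: {step_table[m]:4d} steps, dt = {m * dt_cfl_ps:.3f} ps")
```

Fewer steps do not translate directly into less run time, since each implicit step costs more and the iteration limit had to be raised at the larger multiples.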
Dey-Mittra ADI Performance: Strong Scaling
Dey-Mittra ADI
Full details are in the paper: T. M. Austin, J. R. Cary, D. N. Smithe, C. Nieter, “Alternating Direction Implicit Methods for FDTD using the Dey-Mittra Embedded Boundary Method,” accepted in The Open Plasma Physics Journal.
Summary
Motivation for ADI methods was discussed in the context of the Dey-Mittra method.
Efficient parallel tridiagonal solves were presented and performance was verified.
Implementation of ADI for the Dey-Mittra method was introduced, and results showed stability beyond CFL.
Argument made for fast tridiagonal solves. Multiprocessor? GPUs?