1
AstroBEAR
- Finite volume hyperbolic PDE solver
- Discretizes and solves conservation laws of the form sketched below
- Solves hydrodynamic and MHD equations
- Written in Fortran, with MPI support libraries
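A minimal sketch of the equation class and its finite volume discretization, assuming the standard hyperbolic conservation-law form; the concrete flux F and source terms S depend on the physics module, which the original slide does not specify:

    \frac{\partial \mathbf{q}}{\partial t} + \nabla \cdot \mathbf{F}(\mathbf{q}) = \mathbf{S}(\mathbf{q})

    \mathbf{q}_i^{n+1} = \mathbf{q}_i^{n}
        - \frac{\Delta t}{\Delta x}\left(\mathbf{F}_{i+1/2} - \mathbf{F}_{i-1/2}\right)
        + \Delta t\,\mathbf{S}_i

Here q is the vector of conserved variables (density, momentum, energy, and magnetic field in the MHD case), updated cell by cell from the fluxes through cell interfaces.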
2
Adaptive Mesh Refinement
- Method of reducing computation in finite volume calculations
- Starts with a base resolution and overlays grids of greater refinement where higher resolution is needed
- Grids must be properly nested
- For parallelization purposes, only one parent per grid (see the sketch below)
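A minimal C sketch of what proper nesting and the single-parent constraint imply for the grid hierarchy; the struct and its field names are illustrative, not AstroBEAR's actual data structures:

    /* Illustrative grid-hierarchy node: every refined grid has exactly one
       parent, which keeps ownership unambiguous when grids are distributed
       across processors.  Names are hypothetical, not AstroBEAR's. */
    typedef struct AMRGrid {
        int level;                  /* 0 = base (root) level                */
        int lo[3], hi[3];           /* index bounds in this level's indices */
        struct AMRGrid *parent;     /* exactly one parent; NULL on level 0  */
        struct AMRGrid **children;  /* refined grids nested inside this one */
        int nchildren;
        double *data;               /* cell-centered conserved variables    */
    } AMRGrid;

    /* Proper nesting check: the child's bounds, coarsened by the refinement
       ratio, must lie inside its single parent's index range. */
    int properly_nested(const AMRGrid *child, int refine_ratio)
    {
        const AMRGrid *p = child->parent;
        for (int d = 0; d < 3; ++d) {
            if (child->lo[d] / refine_ratio < p->lo[d]) return 0;
            if (child->hi[d] / refine_ratio > p->hi[d]) return 0;
        }
        return 1;
    }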
3
AMR Algorithm (Cunningham et al., 2009)

AMR(level, dt) {
    if (level == 0) nsteps = 1;
    if (level > 0)  nsteps = refine_ratio;
    for n = 1, nsteps {
        DistributeGrids(level);
        if (level < MaxLevel) {
            CreateRefinedGrids(level + 1);
            SwapGhostData(level + 1);
        }
        Integrate(level, n);
        if (level < MaxLevel) CALL AMR(level + 1, dt/refine_ratio);
    }
    if (level > 0) synchronize_data_to_parent(level);
}
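A compilable C sketch of the recursive subcycling pattern above; the stubbed routines (distribute_grids, integrate_level, etc.) are placeholders, not AstroBEAR's actual interfaces:

    #include <stdio.h>

    #define MAX_LEVEL    2
    #define REFINE_RATIO 2

    /* Stubs standing in for the real AMR bookkeeping and integration. */
    static void distribute_grids(int level)     { (void)level; }
    static void create_refined_grids(int level) { (void)level; }
    static void swap_ghost_data(int level)      { (void)level; }
    static void sync_to_parent(int level)       { (void)level; }
    static void integrate_level(int level, double dt)
    {
        printf("integrate level %d with dt = %g\n", level, dt);
    }

    /* Recursive AMR cycle: every finer level takes REFINE_RATIO substeps of
       dt / REFINE_RATIO, so all levels reach the same physical time before
       data is synchronized back to the parent. */
    static void amr_advance(int level, double dt)
    {
        int nsteps = (level == 0) ? 1 : REFINE_RATIO;
        for (int n = 0; n < nsteps; ++n) {
            distribute_grids(level);
            if (level < MAX_LEVEL) {
                create_refined_grids(level + 1);
                swap_ghost_data(level + 1);
            }
            integrate_level(level, dt);
            if (level < MAX_LEVEL)
                amr_advance(level + 1, dt / REFINE_RATIO);
        }
        if (level > 0)
            sync_to_parent(level);
    }

    int main(void)
    {
        /* One coarse step of 0.1: level 1 takes 2 substeps of 0.05,
           level 2 takes 4 substeps of 0.025. */
        amr_advance(0, 0.1);
        return 0;
    }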
4
Parallel Communications
- Grids rely on external “ghost cells” to perform calculations (see the sketch below)
- Data from neighboring grids needs to be copied into the ghost region
- Major source of scaling problems
- Alternate fixed-grid code (AstroCUB) has a different communication method
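A minimal serial C sketch of a ghost-cell fill between two adjacent 1D grids, assuming two ghost cells per side; the layout is illustrative only, and in the parallel codes this copy becomes an MPI message whenever the two grids live on different processors:

    #define NCELL  8                      /* interior cells per grid  */
    #define NGHOST 2                      /* ghost cells on each side */
    #define NTOT   (NCELL + 2 * NGHOST)   /* total cells with ghosts  */

    /* Interior cells of a grid occupy indices NGHOST .. NGHOST+NCELL-1.
       Fill the low-side ghosts of 'right' from the last interior cells of
       'left', and the high-side ghosts of 'left' from the first interior
       cells of 'right'. */
    void fill_shared_ghosts(double left[NTOT], double right[NTOT])
    {
        for (int g = 0; g < NGHOST; ++g) {
            right[g]                 = left[NCELL + g];    /* left's last interior cells   */
            left[NGHOST + NCELL + g] = right[NGHOST + g];  /* right's first interior cells */
        }
    }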
5
AstroBEAR Parallel Communication

TransferOverlapData() {
    TransferWorkerToWorkerOverlaps();
    TransferMasterToWorkerOverlaps();
    TransferWorkerToMasterOverlaps();
}

foreach overlap transfer t {
    if (Worker(t.source)) SignalSendingProcessor(t.source);
    if (Worker(t.dest))   SignalReceivingProcessor(t.dest);
    if (Worker(t.source)) SendLocalOverlapRegion(t.source);
    if (Worker(t.dest))   RecvLocalOverlapRegion(t.dest);
}
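A hedged C/MPI sketch of the per-transfer pattern above: each overlap is negotiated and shipped individually with its own point-to-point calls, which is why the MPI call count grows with the number of grids. The OverlapTransfer type, the tags, and the signal/data ordering are assumptions for illustration, not AstroBEAR's actual protocol:

    #include <mpi.h>

    /* Hypothetical description of one overlap transfer. */
    typedef struct {
        int source_rank;   /* rank owning the sending grid   */
        int dest_rank;     /* rank owning the receiving grid */
        int ncells;        /* cells in the overlap region    */
    } OverlapTransfer;

    /* Handle one overlap at a time: a small "ready" signal, then the data.
       Every transfer pays for its own blocking MPI calls, in contrast to
       the bulk non-blocking exchange sketched for AstroCUB below. */
    void transfer_one_overlap(const OverlapTransfer *t, int my_rank,
                              const double *send_buf, double *recv_buf)
    {
        const int SIGNAL_TAG = 1, DATA_TAG = 2;
        int ready = 1;

        if (my_rank == t->dest_rank) {
            /* announce readiness, then receive the overlap region */
            MPI_Send(&ready, 1, MPI_INT, t->source_rank, SIGNAL_TAG,
                     MPI_COMM_WORLD);
            MPI_Recv(recv_buf, t->ncells, MPI_DOUBLE, t->source_rank,
                     DATA_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (my_rank == t->source_rank) {
            /* wait for the receiver's signal, then send the region */
            MPI_Recv(&ready, 1, MPI_INT, t->dest_rank, SIGNAL_TAG,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(send_buf, t->ncells, MPI_DOUBLE, t->dest_rank,
                     DATA_TAG, MPI_COMM_WORLD);
        }
    }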
6
AstroCUB Parallel Communication

TransferOverlapData(Grid g) {
    for dim = 1, ndim {
        foreach boundary along dim {
            foreach field_type {
                MPI_ISEND(...);
                MPI_IRECV(...);
                MPI_WAIT();
            }
        }
    }
}
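A self-contained C/MPI sketch of this style of exchange: post the non-blocking sends and receives for both faces along one dimension, then complete them together. The face size, tags, and neighbor ranks are illustrative assumptions, and a single MPI_Waitall stands in for the per-request MPI_WAIT of the pseudocode:

    #include <mpi.h>

    #define NGHOST 2
    #define NFACE  1024   /* cells in one boundary face (illustrative) */

    /* Exchange one field's ghost data with the two neighbors along one
       dimension using non-blocking calls, then wait for all of them. */
    void exchange_faces_along_dim(const double *send_lo, const double *send_hi,
                                  double *recv_lo, double *recv_hi,
                                  int neighbor_lo, int neighbor_hi,
                                  MPI_Comm comm)
    {
        MPI_Request req[4];
        const int TAG_LO = 10, TAG_HI = 11;

        /* post receives first, then the matching sends */
        MPI_Irecv(recv_lo, NGHOST * NFACE, MPI_DOUBLE, neighbor_lo, TAG_HI, comm, &req[0]);
        MPI_Irecv(recv_hi, NGHOST * NFACE, MPI_DOUBLE, neighbor_hi, TAG_LO, comm, &req[1]);
        MPI_Isend(send_lo, NGHOST * NFACE, MPI_DOUBLE, neighbor_lo, TAG_LO, comm, &req[2]);
        MPI_Isend(send_hi, NGHOST * NFACE, MPI_DOUBLE, neighbor_hi, TAG_HI, comm, &req[3]);

        /* one completion call covers all four outstanding operations */
        MPI_Waitall(4, req, MPI_STATUSES_IGNORE);
    }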
7
AstroBEAR/AstroCUB Comparison

AstroBEAR:
- Recalculates overlaps before each synchronization
- Each send/receive operation is handled individually
- Groups transfers based on source and destination processor (master or worker)
- 10 MPI calls per grid per timestep in 3D hydro runs

AstroCUB:
- Calculates overlaps once, prior to first synchronization
- Send/receive operations handled together
- 6 MPI calls per processor per timestep in 3D hydro runs
8
Requirements

Physics:
- Hydro/MHD
- Cooling
- Cylindrical Source
- Self-Gravity
- Sink Particles

Numerics:
- MUSCL-Hancock, Runge-Kutta
- Strang Splitting
- Constrained Transport
- Roe, Marquina Flux
9
Language Options

Python:
- Pros: Good stack traces, flexibility, resource management
- Cons: Requires SciPy; needs GPU code or hybridization for numerics

C:
- Pros: Speed, no interfaces required
- Cons: More memory- and pointer-management work falls on the developer

Fortran:
- Pros: Fast number-crunching
- Cons: Clumsy data structures; more memory and pointer management for the developer
10
Hybridization
- Not unheard-of in scientific codes, e.g. Cactus (Max Planck Institute)
- We've tried it already (HYPRE)
- Can benefit from the strengths of both scripting and compiled languages
- May result in a steeper learning curve for new developers
11
Parallelization Improvements
- Transmission caching: each processor stores its ghost zone transmission details until regrid
- Message packing: sending big blocks containing many messages (see the sketch below)
  [Diagram: msg 1, msg 2, msg 3 packed into one block]
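A hedged C/MPI sketch of the message-packing idea: several small ghost-region messages go into one buffer that is shipped with a single send, and the receiver unpacks in the same order. The two-region layout and sizes are made up for illustration:

    #include <mpi.h>
    #include <stdlib.h>

    /* Pack two small ghost-region messages into one buffer and send it with
       a single MPI call, instead of one send per region. */
    void send_packed_regions(const double *region_a, int na,
                             const double *region_b, int nb,
                             int dest, int tag, MPI_Comm comm)
    {
        int size_a, size_b, position = 0;
        MPI_Pack_size(na, MPI_DOUBLE, comm, &size_a);
        MPI_Pack_size(nb, MPI_DOUBLE, comm, &size_b);

        int bufsize = size_a + size_b;
        char *buf = malloc((size_t)bufsize);

        MPI_Pack(region_a, na, MPI_DOUBLE, buf, bufsize, &position, comm);
        MPI_Pack(region_b, nb, MPI_DOUBLE, buf, bufsize, &position, comm);

        /* one message on the wire instead of two */
        MPI_Send(buf, position, MPI_PACKED, dest, tag, comm);
        free(buf);
    }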
12
Parallelization Improvements, ctd.
- Redundancy in root domains: “stretching” root grids initially to pull in extra data from other grids (see the sketch below)
- Reduces the need for refined ghost transmissions
  [Diagram: core grid vs. stretched grid]
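A small C sketch of what stretching a root grid could mean in index space: widen each processor's core root domain by a few cells, clamped to the global domain, so the extra data is held locally from the start. The Box type and the stretch width are assumptions, not AstroBEAR's implementation:

    /* Illustrative index bounds of one processor's root-level domain. */
    typedef struct {
        int lo[3];
        int hi[3];
    } Box;

    /* Widen a core root-grid box by 'stretch' cells on every side, clamped
       to the global problem domain, so neighboring data is held redundantly
       and fewer ghost transmissions are needed later. */
    Box stretch_root_box(Box core, Box domain, int stretch)
    {
        Box out = core;
        for (int d = 0; d < 3; ++d) {
            out.lo[d] = core.lo[d] - stretch;
            out.hi[d] = core.hi[d] + stretch;
            if (out.lo[d] < domain.lo[d]) out.lo[d] = domain.lo[d];
            if (out.hi[d] > domain.hi[d]) out.hi[d] = domain.hi[d];
        }
        return out;
    }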
13
Further Options for Improvement
- Refined grids: can Berger-Rigoutsos be further simplified or parallelized?
14
Concerns for New Code
- Solver modularity: code should run on a CPU cluster or a GPU cluster
- Scalability: code must run effectively on more than 64 CPUs