AstroBEAR Parallelization Options
Areas With Room For Improvement
- Ghost Zone Resolution
- MPI Load-Balancing
- Re-Gridding Algorithm
- Upgrading MPI Library
Ghost Zone Resolution
- Can exceed 30% of total program execution time.
- Affects fixed grid as well as AMR.
- For runs using more than 2 processors, 98-99% of ghost zone execution time is MPI processing.
Ghost Zone Resolution Options: Duplex Transmission
The old version swaps ghost zone data serially between two processors. Duplex transmission would have the two processors handle sending, receiving, and copying concurrently.
- Pros: Reduces the amount of duplicated overhead. Makes more efficient use of worker processors.
- Cons: Little reduction in the amount of MPI overhead. Still has a high computation cost relative to the number of nodes.
- Status: In progress.
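The duplex idea can be illustrated with a serial Python sketch: two neighboring 1-D subdomains fill each other's ghost zones in a single paired exchange rather than two serial send/receive passes. This is a toy model, not AstroBEAR code; the ghost-zone width, array layout, and function name are invented for illustration, and in real MPI the exchange would be a paired send/receive.

```python
NG = 2  # assumed ghost zone width, for illustration only

def duplex_swap(left, right, ng=NG):
    """Exchange ghost zones across the shared face in one step.

    Each subdomain is a list laid out as [ghosts | interior | ghosts].
    Both "messages" are extracted first (as if sent concurrently),
    then copied into the opposite subdomain's ghost cells.
    """
    msg_from_left = left[-2 * ng:-ng]   # left's rightmost interior cells
    msg_from_right = right[ng:2 * ng]   # right's leftmost interior cells
    left[-ng:] = msg_from_right         # fill left's right-hand ghosts
    right[:ng] = msg_from_left          # fill right's left-hand ghosts

left = [0, 0, 1, 2, 3, 4, 0, 0]   # interior cells: 1..4
right = [0, 0, 5, 6, 7, 8, 0, 0]  # interior cells: 5..8
duplex_swap(left, right)
print(left)   # [0, 0, 1, 2, 3, 4, 5, 6]
print(right)  # [3, 4, 5, 6, 7, 8, 0, 0]
```

Because both messages are captured before either copy happens, neither processor has to wait for the other's copy to finish, which is the overhead reduction the slide describes.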
Alternate Option: Ghost Zone Broadcast
Use the MPI broadcast routines to have a grid send all of its ghost zones to its neighbors at once; the neighbors then process that data and broadcast their own ghost zones when it is their turn.
- Pros: Eliminates the need for pairwise iteration over a level (i.e., the transfer would only be done once per grid).
- Cons: Potential congestion if all of a grid's neighbors are on the same processor. No guarantee that it is an improvement over pairwise duplex transmission.
- Status: Speculative.
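The structural difference from the pairwise scheme can be sketched serially: each grid takes one turn and delivers boundary data to all of its neighbors at once, so the level needs one pass per grid instead of one per pair. The grid names, neighbor lists, and boundary values below are invented placeholders, not AstroBEAR data structures.

```python
# Each grid stores which neighbors it has, the boundary data destined
# for each neighbor, and a dict of ghost data received from neighbors.
grids = {
    "A": {"neighbors": ["B", "C"], "boundary": {"B": 1, "C": 2}, "ghosts": {}},
    "B": {"neighbors": ["A"],      "boundary": {"A": 3},         "ghosts": {}},
    "C": {"neighbors": ["A"],      "boundary": {"A": 4},         "ghosts": {}},
}

for name, g in grids.items():      # one "broadcast" turn per grid...
    for nbr in g["neighbors"]:     # ...delivering to every neighbor at once
        grids[nbr]["ghosts"][name] = g["boundary"][nbr]

print(grids["A"]["ghosts"])  # {'B': 3, 'C': 4}
```

The congestion concern on the slide corresponds to the inner loop: if all of a grid's neighbors live on one processor, that processor receives every message of the turn back-to-back.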
Load Balancing
Does it need to be done as often? The Ramses code only rebalances every ten frames: re-gridding happens locally as usual, but it is assumed that the AMR structure does not change enough between two iterations to warrant a load rebalance.
- Pros: Significant reduction in MPI overhead (BalanceLoads() gets called a lot). Non-MPI overhead will likely be reduced as well, since the current load-balancing scheme recalculates the load across the entire Forest.
- Cons: "Patch-based AMR" vs. "tree-based AMR": can it be adapted to AstroBEAR? Requires implementing a space-filling (Hilbert) curve ordering; how complex and computationally intensive would that be?
- Status: Speculative.
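As a rough sketch of the Hilbert-curve approach (the technique Ramses uses for domain decomposition), grids can be ordered by their distance along a space-filling Hilbert curve and then cut into contiguous chunks of roughly equal load, so spatially nearby grids tend to land on the same processor. This is a standalone illustration under assumed inputs: the 4x4 domain, the grid coordinates, and the loads are invented, and `hilbert_index` is the standard bit-manipulation form of the 2-D Hilbert mapping.

```python
def hilbert_index(n, x, y):
    """Distance of cell (x, y) along the Hilbert curve filling an
    n x n grid (n a power of two); standard bit-manipulation form."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate/reflect the quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def balance(grids, nprocs, n=4):
    """Split (x, y, load) grids into nprocs contiguous Hilbert chunks
    of roughly equal total load."""
    ordered = sorted(grids, key=lambda g: hilbert_index(n, g[0], g[1]))
    target = sum(g[2] for g in ordered) / nprocs
    parts, chunk, acc = [], [], 0.0
    for g in ordered:
        chunk.append(g)
        acc += g[2]
        if acc >= target and len(parts) < nprocs - 1:
            parts, chunk, acc = parts + [chunk], [], 0.0
    parts.append(chunk)
    return parts

grids = [(0, 0, 1.0), (3, 0, 1.0), (0, 3, 1.0), (3, 3, 1.0)]
print(balance(grids, 2))
# → [[(0, 0, 1.0), (0, 3, 1.0)], [(3, 3, 1.0), (3, 0, 1.0)]]
```

The cost question raised on the slide is modest here: computing one index is O(log n) bit operations per grid, plus a sort, which is far cheaper than recalculating loads across the whole Forest.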
Re-Gridding Parallelization
Parallelization of re-gridding is handled using MPI and OpenMP.
Problem: MPI-1 limits thread usage.
- Only one thread for the worker processors and two for the master processor.
- Only one thread on each processor is MPI-capable.
- Performance bottlenecks happen if one processor gets tied up.
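The "only one MPI-capable thread" restriction can be modeled in a few lines: compute threads never touch the communication layer directly, and instead funnel messages through a queue drained by a single communication thread. This is a generic sketch of funneled threading, not AstroBEAR's actual design; the queue, the fake `sent` log, and the thread counts are all invented for illustration.

```python
import queue
import threading

outbox = queue.Queue()
sent = []  # stand-in for messages delivered via MPI

def comm_thread():
    # The only thread allowed to touch the (fake) MPI layer.
    while True:
        msg = outbox.get()
        if msg is None:          # shutdown sentinel
            break
        sent.append(msg)         # stands in for an MPI send

def worker(rank):
    # Compute threads enqueue traffic instead of calling MPI directly.
    outbox.put(("ghost data", rank))

t = threading.Thread(target=comm_thread)
t.start()
workers = [threading.Thread(target=worker, args=(r,)) for r in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
outbox.put(None)
t.join()
print(len(sent))  # 4: every message passed through the one comm thread
```

The bottleneck the slide describes falls out of this structure: if the single communication thread is tied up, every compute thread's traffic queues up behind it.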
Advantage of Multiple Threads
[Diagram contrasting MPI with OpenMP using multiple threads against MPI with OpenMP using a single thread, across processors 0-3.]
Unfortunately...
LAM MPI is not thread-safe. You can write multi-threaded applications using LAM MPI, but it is explicitly not thread-safe, so we would be responsible for maintaining MPI exclusion ourselves. In a collaborative development environment like AstroBEAR, this is a bad idea. LAM is making noises about supporting thread safety eventually, but it is not there yet.
Alternatives:
- Improve the efficiency of pairwise message passing.
- Offload more re-gridding computation to the worker processors.
Status: We're looking at it.