Published by Harriet Gray. Modified over 9 years ago.
1
Partitioning using Mesh Adjacencies
- Graph-based dynamic balancing
  - Parallel construction and balancing of a standard partition graph with small cuts takes reasonable time
  - In the case of unstructured meshes, each graph node represents a mesh region, and mesh adjacencies define the edges
- Mesh adjacencies are a more complete representation than a standard partition graph
  - All mesh entities can be considered (with a graph, one has to decide what defines the graph nodes, and information on the adjacencies that define the graph edges is lost)
  - Any adjacency is obtained in O(1) time, as opposed to having to construct multiple graphs (assuming use of a complete mesh adjacency structure)
- Possible advantages
  - Avoid graph construction (assuming you have the needed adjacencies)
  - Account for multiple entity types: important for the solve process, typically the most computationally expensive step
  - Easy to use with diffusive procedures, but not ideal for “global” balancing
- Disadvantage
  - Lack of well-developed algorithms for parallel partitioning operations directly from mesh adjacencies
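As a rough illustration of how a standard partition graph relates to mesh adjacencies, the sketch below derives graph edges from a region-to-face adjacency table. The function name `partition_graph_from_adjacency`, the dictionary layout, and the tiny three-region mesh are assumptions made for this example; they are not part of ParMA or any mesh library.

```python
from collections import defaultdict
from itertools import combinations


def partition_graph_from_adjacency(region_faces):
    """Build graph edges between regions that share a face.

    region_faces: dict mapping a region id to the ids of its bounding faces
    (a hypothetical stand-in for a complete mesh adjacency structure).
    """
    # Invert the region -> face adjacency into face -> regions.
    face_regions = defaultdict(list)
    for region, faces in region_faces.items():
        for face in faces:
            face_regions[face].append(region)

    # Every face shared by two regions becomes one graph edge.
    edges = set()
    for regions in face_regions.values():
        for a, b in combinations(sorted(regions), 2):
            edges.add((a, b))
    return sorted(edges)


# Three tetrahedra in a row: r0 and r1 share face 3, r1 and r2 share face 6.
mesh = {
    "r0": [0, 1, 2, 3],
    "r1": [3, 4, 5, 6],
    "r2": [6, 7, 8, 9],
}
graph_edges = partition_graph_from_adjacency(mesh)
print(graph_edges)
```

The graph keeps only region-to-region connectivity through faces; which vertices or edges the regions share is no longer visible, which is the information loss the slide points to.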
2
ParMA: Partition Improvement
- Improve scaling of applications by reducing imbalances through exchange of mesh regions between neighboring parts
  - The current algorithm focuses on improving scalability of the solve by accounting for the balance of multiple entity types
  - Imbalance is limited to a small number of heavily loaded parts, referred to as spikes, which limit the scalability of applications
- Example: reduce the small number of entity imbalance spikes at the cost of an increase in the imbalance of regions, the entity used as the nodes in the standard graph
- Similar approaches can be used to:
  - Improve balance when using multiple parts per process; this may be as good as a full rebalance for lower total cost
  - Improve balance during mesh adaptation; this likely needs extensions beyond simple diffusive methods
3
ParMA: Application Requirements
- Example of C^0, linear shape function finite elements
  - Assembly is sensitive to mesh element imbalances
  - Solve is sensitive to mesh vertex imbalances since vertices hold the dofs; the solve is the dominant computation
  - The most heavily loaded part dictates solver performance
- Element-based partitioning results in spikes of dofs
- Element imbalance increased from 2.64% to 4.54%
- Dof imbalance reduced from 14.7% to 4.92%
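The slide does not state how the imbalance percentages are computed. A minimal sketch, assuming the common definition of imbalance as the heaviest part's load relative to the average load, shows how a partition can be perfectly balanced in elements and still have a dof spike. The per-part counts below are made up for illustration.

```python
def imbalance_percent(counts):
    """Imbalance as (max / average - 1) * 100; this definition is an assumption."""
    average = sum(counts) / len(counts)
    return (max(counts) / average - 1.0) * 100.0


# Hypothetical per-part counts for a four-part partition.
elements_per_part = [100, 100, 100, 100]
vertices_per_part = [30, 30, 30, 42]

element_imbalance = imbalance_percent(elements_per_part)
vertex_imbalance = imbalance_percent(vertices_per_part)
print(f"elements: {element_imbalance:.2f}%  vertices: {vertex_imbalance:.2f}%")
```

Because the heaviest part dictates solver performance, the maximum rather than the spread of the counts is what matters under this definition.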
4
ParMA: Algorithm
Input:
- Types of mesh entities that need to be balanced (Rgn, Face, Edge, Vtx)
- The relative importance (priority) between them (= or >)
  - The balance of entities not specified in the input is not explicitly improved or preserved
- Mesh with complete representation and communication, computation and migration weights for each entity
Algorithm:
- From high to low priority if separated by “>” (different groups)
  - From low to high dimension based on entity topology if separated by “=” (same group)
    - Compute migration schedule
    - Select regions for migration and migrate
Example: “Rgn>Face=Edge>Vtx” is the user’s input
- Step 1: improve balance for mesh regions
- Step 2.1: improve balance for mesh edges
- Step 2.2: improve balance for mesh faces
- Step 3: improve balance for mesh vertices
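The ordering rule above can be sketched as a small parser that turns the priority string into the sequence of entity types to balance. The function name `balancing_order` and the `DIMENSION` table are assumptions for this example, not ParMA's actual interface.

```python
# Topological dimension of each entity type named on the slide.
DIMENSION = {"Vtx": 0, "Edge": 1, "Face": 2, "Rgn": 3}


def balancing_order(priority):
    """Return entity types in the order they would be balanced.

    Groups separated by '>' are visited from high to low priority;
    within a group joined by '=', types are visited from low to high dimension.
    """
    steps = []
    for group in priority.split(">"):
        names = [name.strip() for name in group.split("=")]
        steps.extend(sorted(names, key=DIMENSION.__getitem__))
    return steps


order = balancing_order("Rgn>Face=Edge>Vtx")
print(order)  # ['Rgn', 'Edge', 'Face', 'Vtx']
```

For each entry in the resulting list, the algorithm would then compute a migration schedule and select regions to migrate; those two steps are not modeled here.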
5
ParMA: Application Defined Partition Criteria
- Application-defined priority list of entity types such that the imbalance of high priority types is not increased when balancing lower priority types
- Satisfying multiple constraints simultaneously becomes more difficult as more are added
  - Multi-constraint graph-based partitioning methods balance all constraints equally [Karypis1999, Karypis2003, Aykanat2008]
  - Constraint priorities give flexibility to element migration and selection procedures that can result in increased partition quality
- Quantify balance requirements with application-defined weights on mesh entities: communication, computation, and data migration
6
ParMA: Migration Schedule
- Coordination is needed to migrate elements between parts without ‘stepping on toes’
  - Example: consider three adjacent parts, two heavily loaded and one lightly loaded. If both heavily loaded parts migrate elements to the lightly loaded part, it becomes heavily loaded.
- Migrate computational load to the correct part
  - Multilevel graph schemes create several partitions before converging to the final partition; the mesh element migration cost is paid only once, to create the final partition
  - Apply Hu and Blake’s diffusive solution algorithm to determine a low-cost migration schedule that balances computational load for a given mesh entity type [HuBlake]
Figure 1. Diffusive Solution [Dongarra2002]
- Green parts are overweight by 10
- White parts are underweight by 10
- Yellow parts have average weights
- The diffusive solution is noted on each edge
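A minimal serial sketch of a Hu and Blake style diffusive solution follows, assuming the usual formulation: solve the graph Laplacian system L x = (load - average) over the part graph and send x[i] - x[j] units of load along each edge (i, j). The function name, the dense Gaussian elimination, and the three-part example are assumptions for illustration; a parallel implementation would use a distributed iterative solver instead.

```python
from fractions import Fraction


def diffusive_schedule(loads, edges):
    """Return {(i, j): amount} where a positive amount flows from part i to part j.

    Assumes the part graph described by `edges` is connected.
    """
    n = len(loads)
    average = Fraction(sum(loads), n)

    # Assemble the graph Laplacian of the part graph.
    lap = [[Fraction(0)] * n for _ in range(n)]
    for i, j in edges:
        lap[i][i] += 1
        lap[j][j] += 1
        lap[i][j] -= 1
        lap[j][i] -= 1
    rhs = [Fraction(w) - average for w in loads]

    # The Laplacian is singular, so pin the last potential to zero and
    # solve the reduced (n-1) x (n-1) system by Gauss-Jordan elimination.
    m = n - 1
    aug = [lap[r][:m] + [rhs[r]] for r in range(m)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    potential = [aug[r][m] / aug[r][r] for r in range(m)] + [Fraction(0)]

    return {(i, j): potential[i] - potential[j] for i, j in edges}


# Three parts in a chain: part 0 is overweight by 10, part 2 underweight by 10.
flows = diffusive_schedule([20, 10, 0], [(0, 1), (1, 2)])
print(flows)
```

In the chain example the load reaches part 2 through part 1, so each part only exchanges load with its neighbors and every part ends at the average weight.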
7
ParMA: Region Selection
- Vertex: vertices on inter-part boundaries bounding a small number of regions on source part P0; tips of ‘spikes’
- Edge: edges on inter-part boundaries bounding a small number of faces; ‘ridge’ edges with (a) 2 bounding faces, and (b) 3 bounding faces on source part P0
- Face/Region: regions which have two or three faces on inter-part boundaries; (a) ‘spike’ region, (b) region on a ‘ridge’
- Apply a KL/FM-like greedy heuristic to measure the relative change, or gain, in communication cost if a given mesh element is migrated
- Migrate regions that have a large ratio of computational cost to migration cost: high ‘bang for the buck’
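The ‘bang for the buck’ rule can be sketched as a greedy pick over candidate regions, ranked by the ratio of computational weight to migration weight. The tuple layout, the function names, and the stopping rule (stop once the scheduled load has been covered) are assumptions for this example; the KL/FM-like communication gain is not modeled.

```python
def rank_regions(candidates):
    """Sort (region id, computational weight, migration weight) tuples by
    computational weight per unit of migration weight, highest first.

    Assumes every migration weight is positive.
    """
    return sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)


def select_regions(candidates, load_to_send):
    """Greedily pick the best-ratio regions until the scheduled load is covered."""
    selected = []
    sent = 0.0
    for region, comp_weight, _mig_weight in rank_regions(candidates):
        if sent >= load_to_send:
            break
        selected.append(region)
        sent += comp_weight
    return selected


# Hypothetical boundary regions on the source part.
candidates = [("a", 4.0, 2.0), ("b", 3.0, 1.0), ("c", 1.0, 1.0)]
chosen = select_regions(candidates, load_to_send=5.0)
print(chosen)  # ['b', 'a']
```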
8
ParMA: Strong Scaling – 1B Mesh up to 160k Cores
- AAA 1B elements: effective partitioning at extreme scale with and without ParMA (uniform weights, iterative migration using simple schedule)
[Graph: full system; curves without ParMA, with ParMA, and PMod]
9
ParMA: Tests
- 133M region mesh on 16k parts
- Table 1: User’s input
- Table 2: Balance of partitions
- Table 3: Time usage and iterations (tests on Jaguar Cray XT5 system)