2D AFEAPI Overview
- Goals, Design
- Space Filling Curves: key generation/indexing scheme, mesh partitioning, ordering, hashtable
- Code Structure: preprocessing, AFEAPI code, postprocessing; node, element, hashtable
Available at: http://wings.buffalo.edu/eng/mae/acm2e/afeapi_download.html
Difficulties of Parallel Adaptive Codes
- Dynamic allocation of memory
- Dynamic creation/deletion of objects (e.g. nodes, elements)
- Dynamic load balancing
- Dealing with on-processor, off-processor and global information
- Global constraints
Infrastructure Requirements
- Ability to insert and delete objects during simulation: dynamic allocation and de-allocation of memory
- Automatically distribute/redistribute data and computation among processors: dynamic load balancing
- Maintain irregularity and other refinement constraints on distributed data
Data/Computation Management
- Persistent geometric/mesh data
  - Geometry: vertices, edges, faces, regions (Shephard, Flaherty …)
  - Mesh: nodes, elements, edges(?), faces(?)
- Dynamic computational data: matrices, intermediate solution vectors
  - Matrices are generated as additive blocks following the distribution of the mesh
  - Vectors follow the distribution of nodes
Data/Computation Management
Data distribution/scheduling is achieved by:
- Assigning to each object a key that is derived from the data itself
- Using these keys to define a simple ordering scheme for the data
- Partitioning the key space, which produces a data/computation distribution
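Space Filling Curve
A space filling curve is a map h_n : [0,1] -> [0,1]^n such that:
- h_n is continuous
- h_n can completely fill the unit n-dimensional hypercube: for any set of points in [0,1]^n there exists a corresponding set of points in [0,1] that h_n maps onto it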
Space Filling Curve
Characteristics of the SFC ordering (important in the case of adaptive meshes):
- Geometric locality: if points in the n-dimensional hypercube are close to each other, then their images under the SFC mapping are also close to each other in the mean sense (this can help cache performance too)
- Sub-cube property: if the entire domain is split up in a recursive fashion, the curve passes through all points in each sub-cube at a particular level before going through points in a neighboring sub-cube
- Self similarity (fractals): the curve can be generated from a basic stencil
- Possible future use for integrating GIS and simulation data
Space Filling Curve
- A unique key (identifier) of an object can be created from the SFC
- The key is stored as an array of unsigned integers of size keylength
- The location is mapped from [0,1]^d to the key space K = [0, (Maximum Unsigned Integer + 1)^keylength - 1]
- For the key, the leftmost bit is the most significant and the rightmost bit is the least significant
- Objects whose keys have “close” values in the leading digits are near each other; objects whose keys do not have “close” values in the leading digits will likely be far from each other
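The key construction described on this slide can be sketched as follows. This is a minimal Python illustration under stated assumptions, not AFEAPI's C++ implementation: it assumes a 2D Hilbert curve as the SFC (the slides do not say which curve AFEAPI uses), 32-bit unsigned words, and the hypothetical names `hilbert_index` and `sfc_key`.

```python
WORD_BITS = 32  # assumed width of one unsigned integer in the key array


def hilbert_index(n, x, y):
    """Position of integer cell (x, y) along the Hilbert curve on an n-by-n grid.

    n must be a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect so the next, finer level sees the basic stencil.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def sfc_key(x, y, keylength=2):
    """Map a point of [0,1]^2 to a key: a tuple of `keylength` unsigned words,
    most significant word first."""
    bits_per_dim = WORD_BITS * keylength // 2
    n = 1 << bits_per_dim
    # Quantize each coordinate to a cell; clamp so that 1.0 falls in the last cell.
    ix = min(int(x * n), n - 1)
    iy = min(int(y * n), n - 1)
    d = hilbert_index(n, ix, iy)
    mask = (1 << WORD_BITS) - 1
    return tuple((d >> (WORD_BITS * i)) & mask for i in reversed(range(keylength)))
```

Python compares tuples word by word from the left, so sorting a list of such keys treats the leftmost word as the most significant, as the slide requires.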
Mesh Partitioning
- To achieve good parallel efficiency, the computational load should be equally distributed (if the computational load changes dynamically, mesh partitioning needs to be done dynamically as well)
- Good partitioning: load balanced (all the processors are equally used); communication minimized (communication takes a lot of time)
- Two basic types of partitioning, based on:

  Basis                               Mesh quality   Cost
  Graph (connectivity) of the mesh    High           High
  Geometric/mesh traversal            Fair           Low
Mesh Partitioning and Repartitioning
The Space Filling Curve (SFC) based algorithm is a geometric mesh partitioning algorithm.
IDEA: use the SFC ordering, because it is easier to order objects and split a sorted list in 1 dimension than in dimensions greater than 1.
1. Determine the key of each element given by the SFC algorithm
2. Sort the list of objects by their keys
3. Divide the list into P equal pieces
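The sort-and-split idea can be sketched in a few lines. This is an illustrative Python sketch, not AFEAPI code; the function name `partition_by_sfc_key` is hypothetical, and the keys are assumed to be already computed and comparable (integers or tuples of integers).

```python
def partition_by_sfc_key(keyed_objects, num_parts):
    """Sort (key, object) pairs by key and cut the sorted list into
    num_parts contiguous pieces whose sizes differ by at most one."""
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")
    ordered = sorted(keyed_objects, key=lambda pair: pair[0])
    base, extra = divmod(len(ordered), num_parts)
    parts = []
    start = 0
    for p in range(num_parts):
        # The first `extra` pieces take one additional object each.
        size = base + (1 if p < extra else 0)
        parts.append(ordered[start:start + size])
        start += size
    return parts
```

Because each piece is a contiguous run of the SFC ordering, the geometric locality of the curve carries over to the pieces.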
Mesh Partitioning and Repartitioning
- Problem: repartitioning causes massive data migration and loss of parallel efficiency
- Solution: predictive load balancing strategies are used to compute incremental modifications. Load balancing is performed before the mesh is adapted, based on the expected amount of work after mesh adaptation.
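One way to picture predictive load balancing is to split the key-ordered element list by predicted work rather than by element count. The sketch below is a Python illustration under assumptions, not AFEAPI's algorithm: the name `predictive_partition` is hypothetical, and the predicted work per element is supplied by the caller (for instance, a larger weight for an element that is flagged for refinement).

```python
def predictive_partition(elements, num_procs):
    """elements: iterable of (key, predicted_work) pairs, predicted_work > 0.

    Returns one list of keys per processor, each a contiguous run of the
    key ordering, balanced by the work expected AFTER adaptation.
    """
    ordered = sorted(elements, key=lambda e: e[0])
    total = sum(work for _, work in ordered)
    parts = [[] for _ in range(num_procs)]
    done = 0.0
    for key, work in ordered:
        # The midpoint of this element's share of the total work picks its owner.
        midpoint = done + work / 2.0
        rank = min(int(midpoint * num_procs / total), num_procs - 1)
        parts[rank].append(key)
        done += work
    return parts
```

With eight elements of which the first two carry four times the work of the others, two processors receive two and six elements respectively (predicted work 8 and 6), so the elements can be moved before the mesh grows.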
Predictive Load Balancing
Hashtable
- The hashtable is used to access objects quickly while decreasing memory usage
- The hashtable size should be much larger than the number of objects that are accessed through the hashtable
Hashing
- An object is put in the hashtable according to its SFC key
- The address calculator finds an object's address from its SFC key and the minimum and maximum key for a given processor
- Minimum and maximum key values are calculated at creation of the hashtable. It is possible to have an object whose key is less than the minimum key or greater than the maximum key; such an object is put at the beginning or end of the hashtable, respectively
- Objects with larger key values are put in after objects with smaller key values
- If a particular place in the hashtable is already occupied, the objects at that place are stored in a linked list
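The hashing rules above can be sketched as an order-preserving table. This Python sketch is an illustration under assumptions, not the AFEAPI hashtable class: the name `SfcHashTable` and its methods are hypothetical, keys are single integers (a multi-word key would first be combined into one integer), the address is a linear interpolation between the minimum and maximum key, and a Python list stands in for the linked list at each address.

```python
import bisect


class SfcHashTable:
    def __init__(self, size, min_key, max_key):
        if size < 1 or max_key <= min_key:
            raise ValueError("need size >= 1 and max_key > min_key")
        self.size = size
        self.min_key = min_key
        self.max_key = max_key
        self.buckets = [[] for _ in range(size)]

    def address(self, key):
        """Address calculator: keys outside [min_key, max_key] are clamped
        to the beginning or end of the table."""
        if key <= self.min_key:
            return 0
        if key >= self.max_key:
            return self.size - 1
        return (key - self.min_key) * (self.size - 1) // (self.max_key - self.min_key)

    def add(self, key, obj):
        """Insert so that each bucket stays sorted by key."""
        bucket = self.buckets[self.address(key)]
        position = bisect.bisect_right([k for k, _ in bucket], key)
        bucket.insert(position, (key, obj))

    def lookup(self, key):
        """Return the object stored under key, or None if it is absent."""
        for k, obj in self.buckets[self.address(key)]:
            if k == key:
                return obj
        return None
```

Because the address grows with the key, walking the buckets from first to last visits the objects in SFC order.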
AFEAPI Hashtable
Code Structure
Using AFEAPI
Instructions at: http://wings.buffalo.edu/eng/mae/acm2e/afeapi_download.html (use version AFEAPI_VBR_04.tar.gz)
1. Preprocessing
   1. Create mesh file (preferably using Hypermesh)
   2. Run serial preprocessing code, specifying material properties and number of processors to use
2. Run parallel code
3. Serial postprocessing in Tecplot
Code Structure
1. Set up MPI (initialize, create MPI communication structures)
2. Read in data/create persistent data (e.g. hashtable)
3. Create local and global ordering of dof
4. Solve
   - Calculate sparse storage info (VBR sparse storage scheme)
   - Assemble stiffness matrices (eliminate bubble dof if they exist)
   - Solve global system (reconstruct bubble dof if they were eliminated)
5. Postprocess
   - Calculate constrained nodes
   - Put solution in node objects
   - Calculate error estimate
   - Create results file for use in Tecplot
6. If the error estimate is below the desired tolerance: end; otherwise, go on to 7
7. Perform predictive load balancing/mesh partitioning
8. Refine the element size (h adapt)
9. Refine the element polynomial order (p adapt)
10. Smooth load balance/mesh partitions
11. Go to Step 3
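The control flow of this slide (solve, estimate the error, then balance and adapt until the tolerance is met) can be sketched as a small driver loop. This is a Python illustration only, not the AFEAPI driver: `adaptive_solve` and all of its callable parameters are hypothetical placeholders supplied by the caller, and `max_passes` is an added safeguard that the slide does not mention.

```python
def adaptive_solve(mesh, solve, estimate_error, predict_balance, adapt,
                   smooth_balance, tolerance, max_passes=20):
    """Repeat solve -> error estimate -> balance/adapt until the error
    estimate is below the tolerance. Returns (solution, error, passes)."""
    for passes in range(1, max_passes + 1):
        solution = solve(mesh)                  # order dof, assemble, solve
        error = estimate_error(mesh, solution)  # postprocess
        if error < tolerance:
            return solution, error, passes
        mesh = predict_balance(mesh)            # balance for the expected work
        mesh = adapt(mesh)                      # h and p adaptation
        mesh = smooth_balance(mesh)             # smooth the partitions
    raise RuntimeError("tolerance not reached within max_passes")
```

Keeping the load balancing ahead of the adaptation mirrors the predictive strategy: data is moved while the mesh is still small.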
Code Customization
- Customization for adaptive static hp-FEM requires providing routines to compute element stiffness and error, e.g.
    subroutine elemcom(ifg, nequ, ndff, Nc, Norder, Nelb, bcvalue, Icon, xnod, ek, ef)
    subroutine errest(nequ, Norder, xnod, Utemp, Nelb, bcvalue, Icon, errorsq, solsq)
- These can be recycled from old FORTRAN codes!!
- For dynamic calculations or other discretization methods, customization may be a little more involved
Important C++ Classes in AFEAPI
3 major classes: node class, element class, hashtable class
- HASHTABLE class (csrc/header/hashtab.h)
  - Used for accessing node and element objects
- NODE class (csrc/header/node.h)
  - Key is generated from node coordinates
  - 3 types of nodes: vertex, edge and bubble
- ELEMENT class (csrc/header/element2.h)
  - Only quadrilateral elements are used/allowed
  - Geometry is defined by 9 nodes
  - Element key is generated from bubble node coordinates
  - Node ordering is counterclockwise