The Charm++ ParFUM Framework: PARallel Framework for Unstructured Meshing. Presented by Isaac Dooley, Parallel Programming Lab, University of Illinois at Urbana-Champaign.


ParFUM Overview

Charm Workshop
Why use the ParFUM Framework?
- Publicly available (included with Charm++)
- Makes parallelizing a serial code faster and easier
  - Handles mesh partitioning
  - Handles communication
  - Handles load balancing (via Charm++)
- Bindings for Fortran, C++, C
- Provides advanced features
  - Parallel Adaptivity and Incremental Mesh Modification
  - Visualization Support with ParaView or NetFEM
  - Collision Detection Library
  - Iterative Matrix Solver Library (IFEM)

Features Overview
- Originally designed for Finite Element/Volume problems
- Parallel partitioning
- Field registration and updating for shared nodes and ghosts: "User can own Data"
- Arbitrary element types, and arbitrary user data associated with elements or nodes
- Boundary condition representation for faces, edges and nodes
- Efficient ID translation mechanism for communication
- Mesh modification support
[Figures: 3-D Fractography in ParFUM; Rocket Burn Simulation, CSAR]

ParFUM Framework Users
- CSAR
  - Rocflu: Fluids solver
  - RocRem: Remeshing / Parallel Solution Transfer
- CPSD
  - SpaceTime meshing
- Frac3D: Fracture Mechanics
- Dendritic Growth: Metal Solidification process

Outline for the rest of this talk
- ParFUM concepts & philosophy
- Program Structure
- Synchronization of Ghost & Shared entities
- Topological Relationships (Adjacencies)
- Parallel Mesh Adaptivity
- Obtaining & Compiling ParFUM
- Libraries within ParFUM
  - Collision Detection
  - Visualization Tools
- One Sample Program

ParFUM Concepts

ParFUM Basics
- Nodes: points in the problem domain's space
- Elements: discretized areas or volumes representing portions of a problem domain
  - Arbitrary dimension and type
  - Each element is defined by a set of nodes
- Data: attributes that can be associated with nodes or elements
  - Data can be owned by either the framework or the application

Serial Mesh

Element   Surrounding Nodes
E1        N1  N3  N4
E2        N1  N2  N4
E3        N2  N4  N5

Partitioning
- Partition the ParFUM mesh into multiple chunks
- Distribute elements, replicate shared nodes and/or add ghosts
- Keep track of communication
- Partition so that communication is minimized

Partitioned Mesh

Chunk A
Element   Surrounding Nodes
E1        N1  N3  N4
E2        N1  N2  N3

Chunk B
Element   Surrounding Nodes
E1        N1  N2  N3

Shared Nodes
A     B
N2    N1
N4    N3

ParFUM Parallel Model: Shared Nodes
- "Shared Node" model
  - Element computations are based on the values of surrounding nodes
  - Node values are the sum of contributions from surrounding elements
- Example: Mechanical Simulation
  - Element stresses are computed from the element's nodal displacements
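The shared-node model can be sketched in a few lines, assuming two chunks that each hold only their own partial sum for the nodes they share. The names `SharedPair` and `sumSharedNodes` are illustrative and are not part of the ParFUM API; in a real program the framework performs this exchange between processors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One shared node: its local index on chunk A and on chunk B.
struct SharedPair {
    int localA;
    int localB;
};

// Replace both copies of every shared node with the sum of the two
// partial values, so each chunk ends up holding the full nodal value.
void sumSharedNodes(std::vector<double>& nodesA,
                    std::vector<double>& nodesB,
                    const std::vector<SharedPair>& shared) {
    for (std::size_t i = 0; i < shared.size(); ++i) {
        const SharedPair& s = shared[i];
        assert(s.localA >= 0 && s.localA < static_cast<int>(nodesA.size()));
        assert(s.localB >= 0 && s.localB < static_cast<int>(nodesB.size()));
        const double total = nodesA[s.localA] + nodesB[s.localB];
        nodesA[s.localA] = total;
        nodesB[s.localB] = total;
    }
}
```

Nodes that are not shared keep their local values untouched, which is why only the boundary of each chunk generates communication.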

ParFUM Parallel Model: Ghosts
- "Ghost" model
  - Element computations based only on values of surrounding nodes and elements
- Example: Fluid Dynamics
  - Element pressures and velocities come from neighboring element pressures and velocities, and node locations

Virtualization
- Charm++ Runtime System
  - Applications built using migratable objects
- Each ParFUM mesh chunk is mapped to a migratable object
- Virtualization = multiple migratable objects per processor
- Load balancing
- High (90-100%) processor utilization and scaling
[Figure: System Implementation versus User View]

Benefit of Virtualization to Structural Dynamics Applications
[Figure: ParFUM application on eight physical processors]

ParFUM Program Structure

Structure of a ParFUM Application
- Consists of at least two user-written subroutines: init and driver
- init is called on chunk 0
- driver is called on every chunk

init()

subroutine init
    read the serial mesh and configuration data
    inform the framework about the mesh
end subroutine

driver()

subroutine driver
    get local mesh chunk
    time loop
        ParFUM computations
        communication
        more ParFUM computations
    end time loop
end subroutine

Structure of a ParFUM Application
[Diagram labels: init(), Update, driver]

Data Management In ParFUM
- Framework Owns Data
  - User gives the framework the mesh in init using FEM_Mesh_data()
  - User accesses the mesh by calling FEM_Mesh_data() in driver to copy data into the user's arrays
- User Owns Data
  - User registers the mesh with the framework in init using FEM_Register_entity() and FEM_Register_array()
  - User uses their own data arrays in driver
  - User must also supply a resize function, to be called when the framework partitions the mesh and when it resizes the mesh in adaptive refinement calls

Setting/Getting Mesh Data

void FEM_Mesh_data(
    int mesh,                 // Get/Set, and multi-mesh support
    int entity,               // NODE, ELEM, SPARSE (+GHOST)
    int attr,                 // DATA, CONN, SYM, GLOBALNO, ...
    void *data,               // User data: width x length array
    int first, int length,    // Apply to first ... first+length-1
    int datatype, int width   // User data formatting
);

Mesh Access: Example 1

FEM_Mesh_data(
    FEM_Mesh_default_read(),  // Read from default mesh
    FEM_NODE,                 // Access nodal data
    FEM_DATA+23,              // Some user defined attribute #23
    coord,                    // User data: 3 x nNodes array
    0, nNodes,                // Get data for nodes 0 to nNodes-1
    FEM_DOUBLE, 3             // 3 doubles for each node
);
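A short sketch of the "width x length" layout, under the assumption that the `width` values of each entity are stored contiguously, entity after entity. That assumption matches the sample program later in this talk, which casts an array of 2D points to `double *`. The names `Point3`, `flatIndex` and `flatten` are hypothetical helpers, not framework calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 3D point type: three consecutive doubles per node.
struct Point3 {
    double x, y, z;
};

// Index of component `comp` of entity `item` in a flat array that
// stores `width` values per entity, entity after entity.
inline int flatIndex(int item, int comp, int width) {
    assert(comp >= 0 && comp < width);
    return item * width + comp;
}

// Copy an array of points into the flat width-3 layout.
std::vector<double> flatten(const std::vector<Point3>& pts) {
    std::vector<double> flat(pts.size() * 3);
    for (std::size_t i = 0; i < pts.size(); ++i) {
        const int item = static_cast<int>(i);
        flat[flatIndex(item, 0, 3)] = pts[i].x;
        flat[flatIndex(item, 1, 3)] = pts[i].y;
        flat[flatIndex(item, 2, 3)] = pts[i].z;
    }
    return flat;
}
```

With this layout, passing `first` and `length` selects a contiguous run of `length * width` values starting at `first * width`.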

Mesh Access: Example 2

FEM_Mesh_data(
    FEM_Mesh_default_write(), // Write to default mesh
    FEM_ELEM,                 // Modify element data
    FEM_CONN,                 // Element connectivity
    conn,                     // Connectivity array: 3 x nElem
    0, nElem,                 // For elements 0 to nElem-1
    FEM_INDEX_0, 3            // 3 integers for each element (0-based C-style indexing)
);

Mesh Access: Example 3

FEM_Mesh_data(
    FEM_Mesh_default_read(),  // Read from default mesh
    FEM_ELEM+FEM_GHOST,       // Access ghost element data
    FEM_CONN,
    conn,
    0, nElem,
    FEM_INDEX_0, 3
);

ParFUM Ghost Elements

Ghost Elements: Overview
- Most ParFUM programs communicate via shared nodes
- Some computations require read-only copies of remote elements, called "ghosts"
  - Stencil-type finite volume computation
  - Many kinds of mesh modification

Ghosts: 2D Example
[Figure: a serial mesh split into a left chunk and a right chunk; each chunk holds a ghost copy of the neighboring element across the cut ("Ghost of 3", "Ghost of 2")]

Defining Ghost Layers
- Add ghost elements layer-by-layer from init
- A chunk will include ghosts of all the elements it is connected to by "tuples" (sets of nodes)
  - For 2D, a tuple might be a 2-node edge
  - For 3D, a tuple might be a 4-node face
- You specify a ghost layer with FEM_Add_ghost_layer(tupleSize,ghostNodes)
  - ghostNodes indicates whether to add ghost nodes as well as ghost elements

Ghosts: Node adjacency

/* Node-adjacency: triangles have 3 nodes */
FEM_Add_ghost_layer(1,0); /* 1 node per tuple */
const static int tri2node[]={0,1,2};
FEM_Add_ghost_elem(0,3,tri2node);

Ghosts: Edge adjacency

/* Edge-adjacency: triangles have 3 edges */
FEM_Add_ghost_layer(2,0); /* 2 nodes per tuple */
const static int tri2edge[]={0,1, 1,2, 2,0};
FEM_Add_ghost_elem(0,3,tri2edge);
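To make the tuple rule concrete, here is a serial sketch of which elements a chunk would receive as one layer of edge-adjacent ghosts. It is only an illustration of the rule, assuming a triangle mesh with a known element-to-chunk assignment; `edgeGhosts` and `kTri2Edge` are made-up names, and the framework's own distributed implementation is not shown in this talk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Local node pairs forming the three edges of a triangle, laid out
// like the tri2edge array above.
static const int kTri2Edge[6] = {0, 1, 1, 2, 2, 0};

// Return the global element numbers that chunk `myChunk` would get
// as one layer of edge-adjacent ghosts.
// conn    : 3 node numbers per triangle, element after element
// chunkOf : owning chunk of each element
std::set<int> edgeGhosts(const std::vector<int>& conn,
                         const std::vector<int>& chunkOf,
                         int myChunk) {
    assert(conn.size() == chunkOf.size() * 3);
    typedef std::pair<int, int> Edge;
    // Map each edge (as a sorted node pair) to the elements using it.
    std::map<Edge, std::vector<int> > users;
    for (std::size_t e = 0; e < chunkOf.size(); ++e) {
        for (int t = 0; t < 3; ++t) {
            const int a = conn[3 * e + kTri2Edge[2 * t]];
            const int b = conn[3 * e + kTri2Edge[2 * t + 1]];
            users[Edge(std::min(a, b), std::max(a, b))]
                .push_back(static_cast<int>(e));
        }
    }
    std::set<int> ghosts;
    std::map<Edge, std::vector<int> >::const_iterator it;
    for (it = users.begin(); it != users.end(); ++it) {
        const std::vector<int>& elems = it->second;
        bool touchesMine = false;
        for (std::size_t i = 0; i < elems.size(); ++i)
            if (chunkOf[elems[i]] == myChunk) touchesMine = true;
        if (!touchesMine) continue;
        // Remote elements sharing this edge with a local one are ghosts.
        for (std::size_t i = 0; i < elems.size(); ++i)
            if (chunkOf[elems[i]] != myChunk) ghosts.insert(elems[i]);
    }
    return ghosts;
}
```

Switching to node adjacency would only change the tuple table (one node per tuple), which is why the framework asks for the tuple size and the element-to-tuple map rather than a fixed notion of "neighbor".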

ParFUM Ghosts & Shared Node Synchronization

Node Fields
- Framework handles combining data for shared nodes and keeps them in sync
- Framework does not understand the meaning of node fields, only their location and types
- Framework needs to be informed of the locations and types of fields
- Create_field once, Update_field every timestep

Create a Field

integer function FEM_Create_simple_field(datatype, len)
    integer, intent(in) :: datatype, len

Update Field: Shared Nodes

subroutine FEM_Update_Field(fid,nodes)
    integer, intent(in) :: fid
    varies, intent(inout) :: nodes

Can be used to sum forces contributed to a node in the FE method

Update Field: Ghosts

subroutine FEM_Update_ghost_field(fid,elType,elts)
    integer, intent(in) :: fid,elType
    varies, intent(inout) :: elts

Can be used to update ghost copies of local nodes (with current displacements) in the FE method

Topological Relationships: Adjacencies

Adjacencies
- User registers element to node connectivity
- Framework can derive other relationships:
  - Element to Element
  - Node to Node
  - Node to Element
- User specifies a "tuple" to define when two arbitrary elements are adjacent. The framework uses this to construct the other adjacencies if needed.

Adjacencies

const int triangleFaces[6] = {0,1,1,2,2,0};
FEM_Add_elem2face_tuples(mesh,0,2,3,triangleFaces);
FEM_Mesh_create_elem_elem_adjacency(mesh);
FEM_Mesh_create_node_elem_adjacency(mesh);
FEM_Mesh_create_node_node_adjacency(mesh);

User must explicitly have the framework generate the adjacency tables

Adjacencies
- Get a list of elements adjacent to element e:
    e2e_getAll(int e, int *neighbors);
- Get a list of elements adjacent to node n:
    n2e_getAll(int n, int **adjelements, int *sz);
- The adjacency tables are created as attributes, which can be accessed via FEM_Mesh_data() or by simple accessor functions

Adjacencies: Caveats or Limitations
- Currently some of the adjacency functions only work correctly for meshes containing one type of element
- Two nodes are adjacent if and only if the two nodes are listed in the connectivity for some element. Thus in a rectangular grid, two nodes on opposite corners of a rectangle will be listed as adjacent nodes.
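The node-to-node rule and its side effect on quadrilaterals can be reproduced with a small serial sketch. This is not the framework's implementation; `nodeNodeAdjacency` is a hypothetical helper that applies the stated rule literally to a flat connectivity array.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Build node-to-node adjacency under the rule stated above: two
// distinct nodes are adjacent exactly when some element lists both.
std::vector<std::set<int> > nodeNodeAdjacency(const std::vector<int>& conn,
                                              int nodesPerElem,
                                              int nNodes) {
    assert(nodesPerElem > 0);
    assert(conn.size() % static_cast<std::size_t>(nodesPerElem) == 0);
    std::vector<std::set<int> > adj(nNodes);
    const std::size_t nElems = conn.size() / nodesPerElem;
    for (std::size_t e = 0; e < nElems; ++e) {
        const int* nodes = &conn[e * nodesPerElem];
        for (int i = 0; i < nodesPerElem; ++i) {
            for (int j = 0; j < nodesPerElem; ++j) {
                if (i != j) adj[nodes[i]].insert(nodes[j]);
            }
        }
    }
    return adj;
}
```

For a single quadrilateral with corners listed in order 0, 1, 2, 3, node 0 ends up adjacent to node 2 even though no edge joins them, which is exactly the caveat described above.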

How Do I get a copy of ParFUM and Run a ParFUM Application?

ParFUM or FEM?
- FEM is the old name for ParFUM
- FEM and ParFUM are currently the same library in Charm++
- The FEM framework as described in the manual does not contain some new features:
  - Adjacency Data Structures, Adaptivity, Mesh Modification
- The codebase currently uses the old naming conventions of FEM: functions start with "FEM_"
- The current version of FEM/ParFUM can no longer run apart from Charm++

Where to Get It?
- ParFUM is included in the Charm++ CVS distribution
- CSH: setenv CVSROOT
- Or BASH: export
- You should now be able to do a
    > cvs login
  (no password needed, just type [Enter] at prompt) and then
    > cvs co -P charm
- ParFUM-FEM is in charm/src/libs/ck-libs/fem

How to Build It?
    > cd charm
and do
    > ./build LIBS net-linux -O
This will make a net-linux directory, with bin, include, lib etc. subdirectories.

How to Compile & Link?
- Use "charmc": available under bin
  - A multi-lingual compiler driver, understands f90
  - Knows where modules and libraries are
  - Portable across machines and compilers
- Linking
  - Use "-language femf" for F90
  - Use "-language fem" for C/C++
- See example Makefiles: charm/examples/fem/…

How to Run?
- Charmrun: a portable parallel job execution script
- Specify number of processors: +pN
- Specify number of chunks: +vpN
- Special "nodelist" file for net-* versions

./charmrun ./pgm +vp100 +p16

Example

./charmrun pgm +p4 +vp70

Nodelist file: $(HOME)/.nodelist

group main ++shell ssh
host tur0001.cs.uiuc.edu
host tur0002.cs.uiuc.edu
host tur0003.cs.uiuc.edu
host tur0004.cs.uiuc.edu

ParFUM Parallel Mesh Adaptivity

Mesh Adaptivity in ParFUM
- 2D refinement
- 2D coarsening
- 3D refinement
- 3D coarsening (coming soon)
- Support for user defined mesh operations

Mesh Modification Primitives
- ParFUM provides incremental asynchronous parallel mesh modification primitives:
  - Add_node()
  - Remove_node()
  - Add_element()
  - Remove_element()
- Higher level operations are built using these primitives:
  - edge_bisect()
  - edge_flip()
  - edge_collapse()
  - …

ParFUM Structure
[Diagram: layers of ParFUM, from the application down]
- SDG Application
- API (serial or parallel)
- Mesh Adaptivity: Edge Flip, Edge Bisect, Edge Contract, …
- Mesh Modification: Lock(), Unlock(), Add/Remove Node(), Add/Remove Element()
- Mesh Adjacency: e2e, e2n, n2e, n2n; Generate, Modify, …
- Nodes: Local, Shared, Ghost; Elements: Local, Ghost

Mesh Modification Examples
Edge Flip:
- Remove element e1
- Remove element e2
- Add element (n1,n2,n4)
- Add element (n2,n3,n4)
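The edge flip above can be sketched on two triangles in isolation. This serial illustration assumes both triangles list their nodes with the same orientation, so the shared edge appears in opposite directions in the two elements. `Tri` and `edgeFlip` are hypothetical names; the real primitive also has to handle locking, shared nodes and ghosts, which are left out here.

```cpp
#include <cassert>

// A triangle given by its three node numbers.
struct Tri {
    int n[3];
};

// Flip the edge shared by two consistently oriented triangles.
// On success writes the two replacement triangles and returns true;
// returns false (outputs untouched) if the inputs share no edge.
bool edgeFlip(const Tri& t1, const Tri& t2, Tri& out1, Tri& out2) {
    assert(&out1 != &out2);
    for (int i = 0; i < 3; ++i) {
        const int a = t1.n[i];
        const int b = t1.n[(i + 1) % 3];
        const int c = t1.n[(i + 2) % 3];  // node of t1 opposite edge a-b
        for (int j = 0; j < 3; ++j) {
            // The shared edge a->b of t1 appears reversed (b->a) in t2.
            if (t2.n[j] == b && t2.n[(j + 1) % 3] == a) {
                const int d = t2.n[(j + 2) % 3];  // node of t2 opposite
                // The new elements share edge c-d instead of a-b.
                out1.n[0] = c; out1.n[1] = a; out1.n[2] = d;
                out2.n[0] = d; out2.n[1] = b; out2.n[2] = c;
                return true;
            }
        }
    }
    return false;
}
```

Applied to e1 = (n1,n2,n3) and e2 = (n1,n3,n4), the sketch yields triangles with node sets {n2,n3,n4} and {n1,n2,n4}, the same two elements the slide adds.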

Mesh Modification Examples
Edge Bisect:
- Remove element e1
- Remove element e2
- Add node
- Add element (n1,n2,n5)
- Add element (n3,n5,n2)
- Add element (n4,n5,n3)
- Add element (n4,n1,n5)

Mesh Modification in Parallel
[Figures: mesh on Processor 1 before edge flip; mesh on Processor 2 before edge flip; mesh on Processor 2 after edge flip]

Mesh Modification in ParFUM
Primitive operations must do the following:
- Perform the operation on local and all applicable remote processors
- Convert local nodes to shared nodes when they become part of the new boundary
- Update ghost layers (nodes and elements) for all applicable processors. The ghost layers can grow or shrink.

Parallel Mesh Refinement/Coarsening
Parallel refinement and coarsening algorithms in action: a wave propagates through a block across 8 processors

Parallel 3D Refinement
- Built from the same primitives
- Longest edge based refinement
- Longest face based refinement
- Currently implementing 3D coarsening

Refinement and Coarsening in a 2D ParFUM Application
- Shock propagation and reflection down the length of the bar
- Adaptive mesh modification to capture the shock propagation

Solution Comparison
[Figures: Initial Mesh; Adaptive Mesh; Fine Mesh]

ParFUM Collision Detection

Charm++ Collision Detection
- Detect collisions/intersections/contacts between objects scattered across processors
- Built on Charm++ Arrays
  - Overlay regular 3D sparse grid of voxels (boxes)
  - Send objects to all voxels they touch
  - Collect collisions from each voxel
- Collision response is left to caller
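The voxel idea can be sketched serially as follows. This is an illustration of the approach, not the library's code: it assumes each object is represented by an axis-aligned bounding box, and the names `Box`, `VoxelKey` and `findCollisions` are invented for the example. In the real library the voxels are spread over processors as a Charm++ array.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Axis-aligned bounding box of one object.
struct Box {
    double lo[3];
    double hi[3];
};

static bool overlaps(const Box& a, const Box& b) {
    for (int d = 0; d < 3; ++d)
        if (a.hi[d] < b.lo[d] || b.hi[d] < a.lo[d]) return false;
    return true;
}

// Integer voxel coordinates, used as the key of the sparse grid.
typedef std::vector<int> VoxelKey;

// Return all pairs (i, j), i < j, of boxes that overlap.
std::set<std::pair<int, int> > findCollisions(const std::vector<Box>& boxes,
                                              double voxelSize) {
    assert(voxelSize > 0.0);
    // Sparse grid: only voxels touched by some object are stored.
    std::map<VoxelKey, std::vector<int> > grid;
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        int lo[3], hi[3];
        for (int d = 0; d < 3; ++d) {
            lo[d] = static_cast<int>(std::floor(boxes[i].lo[d] / voxelSize));
            hi[d] = static_cast<int>(std::floor(boxes[i].hi[d] / voxelSize));
        }
        // "Send" the object to every voxel it touches.
        for (int x = lo[0]; x <= hi[0]; ++x)
            for (int y = lo[1]; y <= hi[1]; ++y)
                for (int z = lo[2]; z <= hi[2]; ++z) {
                    VoxelKey key(3);
                    key[0] = x; key[1] = y; key[2] = z;
                    grid[key].push_back(static_cast<int>(i));
                }
    }
    // Collect collisions voxel by voxel; the set removes duplicates
    // found in more than one voxel.
    std::set<std::pair<int, int> > hits;
    std::map<VoxelKey, std::vector<int> >::const_iterator it;
    for (it = grid.begin(); it != grid.end(); ++it) {
        const std::vector<int>& ids = it->second;
        for (std::size_t p = 0; p < ids.size(); ++p)
            for (std::size_t q = p + 1; q < ids.size(); ++q)
                if (overlaps(boxes[ids[p]], boxes[ids[q]]))
                    hits.insert(std::make_pair(ids[p], ids[q]));
    }
    return hits;
}
```

Only objects that land in the same voxel are ever compared, so distant objects cost nothing; what to do with each reported pair is left to the caller, as the slide says.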

ParFUM Mesh Visualization

Mesh Visualization in ParFUM
Currently we support:
- NetFEM Online Visualization
- NetFEM Offline Visualization
- ParaView Offline Visualization

NetFEM Client: pretty pictures
[Figure: wave dispersion off a crack (simplified frac3d)]

ParFUM Unified Visualization Interface

n=NetFEM_Begin(…);    Start output for one timestep, specifying output method (online/offline)
NetFEM_Nodes(…);      Specify number of nodes and coordinates
NetFEM_Vector(…);     Give an array of nodal vector data (repeat to give a second array of nodal data)
NetFEM_Elements(…);   Specify number of elements and connectivity
NetFEM_Scalar(…);     Give an array of scalar element data (repeat to give a second and a third array of element data)
NetFEM_End(n);        Finish the output for this timestep

NetFEM: ParFUM Unified Visualization Interface

n=NetFEM_Begin(FEM_My_partition(),t,2,NetFEM_POINTAT);
NetFEM_Nodes(n,nnodes,(double *)g.coord,"Position (m)");
NetFEM_Vector(n,(double *)g.d,"Displacement (m)");
NetFEM_Vector(n,(double *)g.v,"Velocity (m/s)");
NetFEM_Elements(n,nelems,3,(int *)g.conn,"Triangles");
NetFEM_Scalar(n,g.S11,1,"X Stress (pure)");
NetFEM_Scalar(n,g.S22,1,"Y Stress (pure)");
NetFEM_Scalar(n,g.S12,1,"Shear Stress (pure)");
NetFEM_End(n);

NetFEM Online Visualization
- To allow the NetFEM client to connect, you add NetFEM registration calls to your server
  - Register nodes and element types
  - Register data items: scalars or spatial vectors associated with each node or element
  - You provide the display name and units for each data item
- Link your program with "-module netfem"
- Run with "++server", and connect!

NetFEM: Setup

n=NetFEM_Begin(FEM_My_partition(),timestep,dim,NetFEM_POINTAT)
- Call this each time through your timeloop, or skip it
- timestep identifies this data update
- dim is the spatial dimension (must be 2 or 3)
- Returns a NetFEM handle n used by everything else
- NetFEM_POINTAT is for online visualization

NetFEM_End(n)
- Finishes update n

NetFEM: Nodes

NetFEM_Nodes(n,nnodes,coord,"Position (m)")
- Registers node locations with NetFEM; future vectors and scalars will be associated with nodes
- n is the handle returned by NetFEM_Begin
- nnodes is the number of nodes
- coord is a dim by nnodes array of doubles
- The string describes the coordinate system and meaning of nodes
- Currently, there can only be one call to NetFEM_Nodes

NetFEM: Node Displacement

NetFEM: Elements

NetFEM_Elements(n,nelem,nodeper,conn,"Triangles")
- Registers elements with NetFEM; future vectors and scalars will be associated with these elements
- n is the handle returned by NetFEM_Begin
- nelem is the number of elements
- nodeper is the number of nodes per element
- conn is a nodeper by nelem array of node indices
- The string describes the kind of element
- Repeat to register several kinds of element
  - Perhaps: Triangles, squares, pentagons, …

NetFEM: Element Stress

NetFEM: Vectors

NetFEM_Vector(n,val,"Displacement (m)")
- Registers a spatial vector with each node or element
  - Whichever kind was registered last
- n is the handle returned by NetFEM_Begin
- val is a dim by nitems array of doubles
  - There's also a more general NetFEM_Vector_field in the manual
- The string describes the meaning and units of the vectors
- Repeat to register multiple sets of vectors
  - Perhaps: Displacement, velocity, acceleration, rotation, …

NetFEM: Element Stress

NetFEM: Element Velocity

NetFEM: Zoom in

NetFEM: Outline Elements

Offline Visualization with NetFEM
- Call NetFEM_Begin() with NetFEM_WRITE. The ParFUM application will then write binary output dump files to a new directory called "NetFEM", containing a directory for each timestep.
- Visualize with the command
    netfem NetFEM/10
- In offline mode, the "update" button fetches the next extant timestep directory

Offline Visualization with ParaView
- The directory NetFEM can be converted to a ParaView/VTK compatible XML format
- The converter can be built by calling "make" in charm/tmp/libs/ck-libs/netfem/ParaviewConverter/
- Run the resulting program NetFEM_To_Paraview to convert the NetFEM directory to a new directory called ParaViewData
- ParaView can open the files inside this directory either as individual chunks, or as the entire mesh
- Scan through the timesteps using the standard timestep slider

Offline Visualization with ParaView
- Limitations:
  - Multiple element types with varying numbers of data attributes may not work
  - The current converter produces ASCII XML files, not binary, so they use excessive disk space
- Benefits:
  - Advanced visualization functionality
  - More stable than offline NetFEM, which crashes if you try to access non-existent timesteps or huge meshes

Migration and Load Balancing

Advanced API: Migration
- Chunks may not be computationally equal: results in load imbalance
- Multiple chunks per processor
  - Chunks cannot have writable global data
- Automatic load balancing
  - Migrate chunks to balance load
- How to migrate allocated data for chunks?
  - Embed it in a user-defined type
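Why having several chunks per processor helps can be seen in a toy placement routine. The sketch below uses a simple greedy rule (heaviest chunk first, onto the least loaded processor); it is only an illustration and is not claimed to be the strategy the Charm++ load balancer uses. `greedyAssign` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Place chunks on processors: take chunks from heaviest to lightest
// and put each one on the processor with the smallest load so far.
// Returns, for every chunk, the processor it was assigned to.
std::vector<int> greedyAssign(const std::vector<double>& chunkLoad,
                              int nProcs) {
    assert(nProcs > 0);
    std::vector<int> order(chunkLoad.size());
    for (std::size_t i = 0; i < order.size(); ++i)
        order[i] = static_cast<int>(i);
    // Stable sort keeps equal-load chunks in index order.
    std::stable_sort(order.begin(), order.end(),
                     [&chunkLoad](int a, int b) {
                         return chunkLoad[a] > chunkLoad[b];
                     });
    std::vector<double> procLoad(nProcs, 0.0);
    std::vector<int> procOf(chunkLoad.size(), -1);
    for (std::size_t i = 0; i < order.size(); ++i) {
        const int chunk = order[i];
        const int target = static_cast<int>(
            std::min_element(procLoad.begin(), procLoad.end()) -
            procLoad.begin());
        procOf[chunk] = target;
        procLoad[target] += chunkLoad[chunk];
    }
    return procOf;
}
```

With one chunk per processor there is nothing to rearrange; with many small chunks, a rule like this can even out the totals, and migration is the mechanism that moves a chunk to its new processor.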

Chunk Data Example

MODULE my_block_mod
  TYPE my_block
    INTEGER :: n1,n2x,n2y
    REAL*8, POINTER, DIMENSION(:,:) :: arr
  END TYPE
END MODULE

Pack/Unpack (PUP) Routine

SUBROUTINE pup_my_block(p,m)
  USE my_block_mod
  USE pupmod
  INTEGER :: p
  TYPE(my_block) :: m
  call fpup_int(p,m%n1)
  call fpup_int(p,m%n2x)
  call fpup_int(p,m%n2y)
  IF (fpup_isUnpacking(p)) THEN
    ALLOCATE(m%arr(m%n2x,m%n2y))
  END IF
  call fpup_doubles(p,m%arr,m%n2x*m%n2y)
  IF (fpup_isDeleting(p)) THEN
    DEALLOCATE(m%arr)
  END IF
END SUBROUTINE

Registering Chunk Data

!- Fortran driver subroutine
use my_block_mod
interface
  subroutine pup_my_block(p,m)
    use my_block_mod
    INTEGER :: p
    TYPE(my_block) :: m
  end subroutine
end interface
TYPE(my_block) :: m
CALL FEM_Register(m,pup_my_block)

Migration
- Every chunk driver calls FEM_Migrate()
- Framework calls PUP
  - For getting the size of packed data
  - For packing data
- Chunk migrates to new processor
- Framework calls PUP for unpacking
- Driver returns from FEM_Migrate()
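The three PUP passes (sizing, packing, unpacking) can be imitated with a toy class. Everything below is a stand-in written for illustration: `ToyPup`, `Block`, `pupBlock` and `migrate` are hypothetical and are not the Charm++ PUP interface. The point is only that one user routine serves all three passes, as pup_my_block does in the Fortran example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy stand-in for a pack/unpack object; NOT the Charm++ PUP API.
class ToyPup {
public:
    enum Mode { SIZING, PACKING, UNPACKING };
    ToyPup(Mode mode, std::vector<char>* buffer)
        : mode_(mode), buf_(buffer), pos_(0) {}
    bool isUnpacking() const { return mode_ == UNPACKING; }
    std::size_t size() const { return pos_; }

    // Count, write, or read n bytes depending on the current mode.
    void bytes(void* data, std::size_t n) {
        if (n == 0) return;
        if (mode_ != SIZING) {
            assert(buf_ != 0 && pos_ + n <= buf_->size());
            if (mode_ == PACKING)
                std::memcpy(buf_->data() + pos_, data, n);
            else
                std::memcpy(data, buf_->data() + pos_, n);
        }
        pos_ += n;
    }

private:
    Mode mode_;
    std::vector<char>* buf_;
    std::size_t pos_;
};

// Per-chunk data with a dynamically sized array, like my_block.
struct Block {
    int rows, cols;
    std::vector<double> arr;  // rows*cols values
};

// One routine used for sizing, packing and unpacking.
void pupBlock(ToyPup& p, Block& b) {
    p.bytes(&b.rows, sizeof b.rows);
    p.bytes(&b.cols, sizeof b.cols);
    if (p.isUnpacking())  // allocate before the array is read back
        b.arr.resize(static_cast<std::size_t>(b.rows) * b.cols);
    assert(b.arr.size() == static_cast<std::size_t>(b.rows) * b.cols);
    p.bytes(b.arr.data(), b.arr.size() * sizeof(double));
}

// Simulate a migration: size, pack on the "old" processor, then
// unpack into a fresh object on the "new" one.
Block migrate(Block& src) {
    ToyPup sizer(ToyPup::SIZING, 0);
    pupBlock(sizer, src);
    std::vector<char> wire(sizer.size());
    ToyPup packer(ToyPup::PACKING, &wire);
    pupBlock(packer, src);
    Block dst;
    dst.rows = 0;
    dst.cols = 0;
    ToyPup unpacker(ToyPup::UNPACKING, &wire);
    pupBlock(unpacker, dst);
    return dst;
}
```

The scalar sizes are transferred before the array so that the unpacking side knows how much to allocate, which mirrors the ALLOCATE inside the fpup_isUnpacking test in the Fortran routine.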

A sample ParFUM Program

Simple2D
- Can be found in charm/examples/fem/simple2D
- Generates simple but interesting visualizations
- A constrained triangle Finite Element code
- Here we will show the init and driver routines, almost exactly copied from pgm.C, omitting some comments and trivial code fragments

extern "C" void init(void)
{
  CkPrintf("init started\n");
  double startTime=CmiWallTimer();
  int nPts=0; //Number of nodes
  vector2d *pts=0; //Node coordinates
  int nEle=0;
  connRec *ele=NULL;
  // Omitted for readability:
  // Read in mesh from users files to ele and pts

  // Tell framework we will be writing to the mesh
  int fem_mesh=FEM_Mesh_default_write();

  FEM_Mesh_data(
    fem_mesh,      // Add nodes to the current mesh
    FEM_NODE,      // We are registering nodes
    FEM_DATA+0,    // Register the point locations
    (double *)pts, // The array of point locations
    0,             // First node is 0
    nPts,          // The number of points
    FEM_DOUBLE,    // Coordinates are doubles
    2);            // Points have dimension 2 (x,y)

  FEM_Mesh_data(
    fem_mesh,      // Add elements to the current mesh
    FEM_ELEM+0,    // Element type 0
    FEM_CONN,      // Register the connectivity table for this element type
    (int *)ele,    // The connectivity array
    0,             // First element is 0
    nEle,          // The number of elements
    FEM_INDEX_0,   // We use zero based node numbering
    3);            // Elements have three nodes

  delete[] ele;
  delete[] pts;
  CkPrintf("Finished with init\n");
}

extern "C" void driver(void)
{
  int nnodes,nelems,ignored;
  int i, myId=FEM_My_partition();
  myGlobals g;
  FEM_Register(&g,(FEM_PupFn)pup_myGlobals);

  int mesh=FEM_Mesh_default_read();

  // Get node data
  nnodes=FEM_Mesh_get_length(mesh,FEM_NODE);
  g.coord=new vector2d[nnodes];
  FEM_Mesh_data(mesh, FEM_NODE, FEM_DATA+0, (double*)g.coord, 0, nnodes, FEM_DOUBLE, 2);

  // Read element data from framework
  nelems=FEM_Mesh_get_length(mesh,FEM_ELEM+0);
  g.nnodes=nnodes;
  g.nelems=nelems;
  g.conn=new connRec[nelems];
  g.S11=new double[nelems];
  g.S22=new double[nelems];
  g.S12=new double[nelems];
  FEM_Mesh_data(mesh, FEM_ELEM+0, FEM_CONN, (int *)g.conn, 0, nelems, FEM_INDEX_0, 3);

  //Initialize associated data
  g.R_net=new vector2d[nnodes]; //Net force
  g.d=new vector2d[nnodes]; //Node displacement
  g.v=new vector2d[nnodes]; //Node velocity
  g.a=new vector2d[nnodes]; //Node acceleration
  for (i=0;i<nnodes;i++)
    g.R_net[i]=g.d[i]=g.v[i]=g.a[i]=vector2d(0.0);

  int fid=FEM_Create_simple_field(FEM_DOUBLE,2);

  int tSteps=5000;
  for (int t=0;t<tSteps;t++) {
    //Structural mechanics
    //Compute forces on nodes exerted by elements
    CST_NL(g.coord,g.conn,g.R_net,g.d,matConst,nnodes,nelems,g.S11,g.S22,g.S12);

    //Communicate net force on shared nodes
    FEM_Update_field(fid,g.R_net);

    //Advance node positions
    advanceNodes(dt,nnodes,g.coord,g.R_net,g.a,g.v,g.d,0);

    /* perform migration-based load balancing */
    if (t%1024==0)
      FEM_Migrate();

    if (t%1024==0) { //Publish data for Visualization
      NetFEM n=NetFEM_Begin(FEM_My_partition(),t,2,NetFEM_POINTAT);
      NetFEM_Nodes(n,nnodes,(double*)g.coord,"Position (m)");
      NetFEM_Vector(n,(double*)g.d,"Displacement (m)");
      NetFEM_Vector(n,(double*)g.v,"Velocity (m/s)");
      NetFEM_Elements(n,nelems,3,(int*)g.conn,"Triangles");
      NetFEM_Scalar(n,g.S11,1,"X Stress (pure)");
      NetFEM_Scalar(n,g.S22,1,"Y Stress (pure)");
      NetFEM_Scalar(n,g.S12,1,"Shear Stress (pure)");
      NetFEM_End(n);
    }
  } // end timeloop
} // end driver

Conclusions

Conclusions: ParFUM
- ParFUM is a functional, freely available Unstructured Mesh Framework
- ParFUM has a wide range of useful features:
  - Load Balancing, Fault Tolerance, …
  - Parallel Partitioning
  - Parallel Incremental Adaptivity, Remeshing, and Refinement
  - Visualization Tools
  - Collision/Contact Detection Library
  - Matrix Solver

ParFUM Matrix Solver Library

IFEM: Intro to Matrix-based ParFUM
- Many ParFUM applications use direct matrix-free methods for solving
- ParFUM also provides a matrix solver library

IFEM: Intro to Matrix-based ParFUM
- Normal computations run element-by-element
  - Only consider one element at a time
- Matrix-based computations are holistic
  - One big matrix represents the whole system
  - Solve the matrix == solve the problem
- Each row of the matrix solves for one "Degree of Freedom"
  - E.g., a node's X coordinate
  - Typically a row has only a few nonzero entries
- Around 10^6 degrees of freedom means 10^6 rows

10^6 Rows is a Big Matrix
- Normal matrix operations are O(n^3)
  - Can't use Gaussian elimination, Jacobi, QR, …
- Almost everybody uses iterative solvers
  - Based on a fast "guess-and-check" method
  - Each iteration consists of a matrix-vector product
  - Typically converges in 10 to 100 iterations
- Popular iterative solvers:
  - Conjugate gradient (CG)
  - Biconjugate gradient (BiCG)
  - Generalized minimal residual (GMRES)

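Matrix Example
- Hooke's spring law: F = -kx, so F1 = -k(D1 - D2)
- Linear relation between node force and node displacement
- Each element corresponds to a little 2x2 matrix:

\[
\begin{pmatrix} F_1 \\ F_2 \end{pmatrix}
=
\begin{pmatrix} -k & k \\ k & -k \end{pmatrix}
\begin{pmatrix} D_1 \\ D_2 \end{pmatrix}
\]

[Figure: one spring joining the nodes with displacements D1 and D2]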

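Matrix Example: 2 Elements

\[
\begin{pmatrix} F_1 \\ F_2 \\ F_3 \end{pmatrix}
=
\begin{pmatrix} -k & k & 0 \\ k & -k-k & k \\ 0 & k & -k \end{pmatrix}
\begin{pmatrix} D_1 \\ D_2 \\ D_3 \end{pmatrix}
\]

[Figure: springs A and B joining the nodes with displacements D1, D2 and D3]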

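Matrix Example: Derivation

\[
\begin{pmatrix} F_1 \\ F_2 \\ F_3 \end{pmatrix}
=
\left[
\underbrace{\begin{pmatrix} -k & k & 0 \\ k & -k & 0 \\ 0 & 0 & 0 \end{pmatrix}}_{A}
+
\underbrace{\begin{pmatrix} 0 & 0 & 0 \\ 0 & -k & k \\ 0 & k & -k \end{pmatrix}}_{B}
\right]
\begin{pmatrix} D_1 \\ D_2 \\ D_3 \end{pmatrix}
\]

[Figure: springs A and B joining the nodes with displacements D1, D2 and D3; the two element blocks overlap at the entry for the middle node]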
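The derivation generalizes to a chain of any length: each spring adds its own 2x2 block into the rows and columns of its two nodes. The sketch below builds the dense matrix that way, assuming all springs share the same stiffness k; `assembleChain` is a name made up for this example, and a dense matrix is used only because the example is tiny.

```cpp
#include <cassert>
#include <vector>

typedef std::vector<std::vector<double> > Matrix;

// Assemble the global force matrix for a chain of identical springs
// by adding each element's 2x2 matrix [[-k, k], [k, -k]] into the
// rows and columns of the element's two nodes.
Matrix assembleChain(int nNodes, double k) {
    assert(nNodes >= 2);
    Matrix m(nNodes, std::vector<double>(nNodes, 0.0));
    for (int e = 0; e + 1 < nNodes; ++e) {  // element e joins nodes e, e+1
        m[e][e]         += -k;
        m[e][e + 1]     +=  k;
        m[e + 1][e]     +=  k;
        m[e + 1][e + 1] += -k;
    }
    return m;
}
```

For three nodes the middle diagonal entry receives a contribution from both springs and becomes -k-k, matching the two-element matrix above.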

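Matrix Example: FEM

Element A:
\[
\begin{pmatrix} F_1 \\ F_{2A} \end{pmatrix}
=
\begin{pmatrix} -k & k \\ k & -k \end{pmatrix}
\begin{pmatrix} D_1 \\ D_2 \end{pmatrix}
\]

Element B:
\[
\begin{pmatrix} F_{2B} \\ F_3 \end{pmatrix}
=
\begin{pmatrix} -k & k \\ k & -k \end{pmatrix}
\begin{pmatrix} D_2 \\ D_3 \end{pmatrix}
\]

The node with displacement D2 is a shared node of A and B. Later, sum the partial forces across shared nodes to get:
\[
F_2 = F_{2A} + F_{2B}
\]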

IFEM: Bottom Line
- If you can do a matrix-vector product, you can run iterative solvers like conjugate gradient
- The matrix-vector product amounts to the usual ParFUM calculation (displacements to forces), plus the usual ParFUM communication (update shared nodes)
- You don't even need to store the matrix
- Users can easily solve iterative problems with the ParFUM framework and IFEM solvers!
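A serial sketch of this bottom line, using the spring chain from the matrix example. It is not the IFEM interface: `applyStiffness` and `solveCG` are hypothetical names. Two assumptions are made so that conjugate gradient applies: the first node is held fixed, and the system solved is the stiffness form A d = f, where A is the negative of the force matrix shown earlier restricted to the free nodes, which makes A symmetric positive definite.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;

// y = A*x for a chain of springs of stiffness k, computed element by
// element without ever storing A. Node 0 is held fixed, and x[i] is
// the displacement of node i+1.
Vec applyStiffness(const Vec& x, double k) {
    Vec y(x.size(), 0.0);
    for (std::size_t e = 0; e < x.size(); ++e) {
        const double left = (e == 0) ? 0.0 : x[e - 1];
        const double stretch = k * (x[e] - left);
        y[e] += stretch;                 // contribution to right node
        if (e > 0) y[e - 1] -= stretch;  // contribution to left node
    }
    return y;
}

static double dot(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Conjugate gradient for A d = f, touching A only through
// applyStiffness (one matrix-vector product per iteration).
Vec solveCG(const Vec& f, double k, int maxIter, double tol) {
    Vec x(f.size(), 0.0);
    Vec r = f;  // residual for the initial guess x = 0
    Vec p = f;  // first search direction
    double rs = dot(r, r);
    for (int it = 0; it < maxIter && std::sqrt(rs) > tol; ++it) {
        const Vec Ap = applyStiffness(p, k);
        const double alpha = rs / dot(p, Ap);
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        const double rsNew = dot(r, r);
        for (std::size_t i = 0; i < p.size(); ++i)
            p[i] = r[i] + (rsNew / rs) * p[i];
        rs = rsNew;
    }
    return x;
}
```

In a parallel ParFUM program the loop inside `applyStiffness` would run over each chunk's local elements, followed by a shared-node update to combine the partial sums; the solver itself would stay the same apart from needing global dot products.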

The End Thanks