1 Rocket Science using Charm++ at CSAR Orion Sky Lawlor 2003/10/21

2 Roadmap: CSAR, FEM Framework, Collision Detection, Remeshing

3 CSAR: Rocket Simulation. Dynamic, coupled physics simulation in 3D: finite-element solids on an unstructured tet or hex mesh, finite-volume fluids on an unstructured mixed or structured hex mesh, coupled every timestep via a least-squares data transfer. Challenges: multiple developers and modules; the surface of the propellant burns away, so the mesh must adapt. [Image: Robert Fielder, Center for Simulation of Advanced Rockets]

4 CSAR: Organizational. CSAR is the Center for Simulation of Advanced Rockets, in the CSE department at UIUC. Multidisciplinary groups: CS (solution transfer, meshing), Structures (mechanics, cracks), Fluids (turbulence, gas, radiation), Combustion (burn rate). 100+ people (including me!)

5 CSAR: Multiple Modules. Two or more Charm++ frameworks are used in the same program: FEM (multiple unstructured mesh chunks), MBLOCK (multiple structured mesh blocks), and AMPI (Adaptive MPI on Charm++). All are based on the Threaded Charm++ framework (TCHARM). For example, Rocflu's communication uses the FEM framework, but it is coupled with an AMPI main program.

6 Adaptive MPI (AMPI): a virtualized MPI implementation on Charm++. Runs each MPI process as a user-level thread, with multiple MPI processes per physical processor; this helps cache usage, migration, load balancing, ... For example, a 480-processor mesh-motion bug was debugged using only 16 physical processors.
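
A quick illustration: ordinary MPI code like the sketch below runs unchanged under AMPI, with each rank becoming a migratable user-level thread. The build and run lines in the comment follow AMPI's usual conventions (ampicxx, charmrun, the +vp virtual-processor option), but the exact commands for any particular installation are an assumption.

    /* Plain MPI code; under AMPI each "rank" is a user-level thread, so many
       ranks can share one physical processor.  Typical (assumed) usage:
         ampicxx -o hello hello.cpp
         ./charmrun +p16 ./hello +vp480   (480 virtual ranks on 16 processors) */
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char **argv) {
      MPI_Init(&argc, &argv);
      int rank, size;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);  /* reports virtual, not physical, processors */
      std::printf("virtual rank %d of %d\n", rank, size);
      MPI_Finalize();
      return 0;
    }

This is how a 480-"processor" run can be reproduced and debugged on only 16 real processors: the code still sees 480 ranks.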

7 Charm++ FEM Framework. Handles the parallel details in the runtime and leaves the physics and numerics to the user. Presents a clean, “almost serial” interface: one call updates cross-processor boundaries. Not just for finite element computations! Now supports ghost cells. Builds on top of AMPI or native MPI, so it no longer depends on Charm++ directly. Allows use of advanced Charm++ features: adaptive parallel computation, dynamic and automatic load balancing. Other libraries: collision, adaptation, visualization, ...

8 FEM Mesh: Serial to Parallel

9 FEM Mesh: Communication. Summing forces from other processors takes only one call: FEM_Update_field. Values from ghost elements can also be accessed.
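
As a sketch of what that one call looks like in code, the fragment below uses the classic FEM framework C interface (FEM_Create_field / FEM_Update_field); exact names and signatures may differ between framework versions.

    #include <cstddef>   /* offsetof */
    #include "fem.h"     /* Charm++ FEM framework header */

    struct Node {
      double coord[3];
      double force[3];   /* partial sum of this chunk's element contributions */
    };

    /* Register the force field once: 3 doubles per node, starting at
       offsetof(Node, force), with sizeof(Node) bytes between consecutive nodes. */
    int createForceField(void) {
      return FEM_Create_field(FEM_DOUBLE, 3,
                              (int)offsetof(Node, force), (int)sizeof(Node));
    }

    void sumSharedForces(int forceField, Node *nodes) {
      /* Each chunk has already summed its local element forces into nodes[].force. */
      /* One call adds in the contributions from nodes shared with other chunks: */
      FEM_Update_field(forceField, nodes);
    }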

10 Charm++ Collision Detection. Detects collisions (intersections) between objects scattered across processors. Built on Charm++ arrays: overlay a regular, sparse 3D grid of voxels (boxes), send objects to all voxels they touch, then collect collisions from each voxel. Collision response is left to the caller.
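
The voxel idea can be sketched serially as below; this is only an illustration of the binning step, not the Charm++ Collision library's API. In the parallel library each voxel is an element of a sparse Charm++ array, so "send objects to voxels" and "collect collisions" become messages to and from those array elements.

    #include <algorithm>
    #include <cmath>
    #include <map>
    #include <set>
    #include <vector>

    struct Box { double lo[3], hi[3]; int id; };   /* axis-aligned bounding box */

    static bool overlaps(const Box &a, const Box &b) {
      for (int k = 0; k < 3; k++)
        if (a.hi[k] < b.lo[k] || b.hi[k] < a.lo[k]) return false;
      return true;
    }

    struct Voxel {                                 /* integer grid coordinates */
      int i, j, k;
      bool operator<(const Voxel &o) const {
        if (i != o.i) return i < o.i;
        if (j != o.j) return j < o.j;
        return k < o.k;
      }
    };

    /* Return all pairs of object ids whose bounding boxes intersect. */
    std::set<std::pair<int,int> > findCollisions(const std::vector<Box> &objs,
                                                 double voxelSize) {
      /* 1. "Send" each object to every voxel its bounding box touches. */
      std::map<Voxel, std::vector<const Box*> > grid;
      for (size_t n = 0; n < objs.size(); n++) {
        const Box &b = objs[n];
        for (int i = (int)std::floor(b.lo[0]/voxelSize); i <= (int)std::floor(b.hi[0]/voxelSize); i++)
          for (int j = (int)std::floor(b.lo[1]/voxelSize); j <= (int)std::floor(b.hi[1]/voxelSize); j++)
            for (int k = (int)std::floor(b.lo[2]/voxelSize); k <= (int)std::floor(b.hi[2]/voxelSize); k++) {
              Voxel v = { i, j, k };
              grid[v].push_back(&b);
            }
      }
      /* 2. "Collect" collisions: test pairs only within each voxel. */
      std::set<std::pair<int,int> > hits;
      std::map<Voxel, std::vector<const Box*> >::const_iterator it;
      for (it = grid.begin(); it != grid.end(); ++it) {
        const std::vector<const Box*> &v = it->second;
        for (size_t a = 0; a < v.size(); a++)
          for (size_t c = a + 1; c < v.size(); c++)
            if (overlaps(*v[a], *v[c]))
              hits.insert(std::make_pair(std::min(v[a]->id, v[c]->id),
                                         std::max(v[a]->id, v[c]->id)));
      }
      return hits;
    }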

11 Collision Detection Algorithm: Sparse 3D voxel grid (implemented as a Charm++ array)

12 Serial Scaling

13 Parallel Scaled Problem

14 Remeshing. As the solids burn away, the domain changes dramatically: the fluids mesh expands and the solids mesh contracts. This distorts the elements of the mesh, so we need to be able to fix the deformed mesh.

15 Initial mesh consists of small, good quality elements

16-23 As the solids burn away, the fluids domain becomes more and more stretched (this caption repeats on slides 16 through 23 as the deformation progresses)

24 Compared to the original, the elements are much worse!

25 Remeshing: Solution Transfer. We can use existing (off-the-shelf) tools to remesh our domain, but we must also handle the solution data: density, velocity, and displacement fields; gas pressure and temperature; and boundary conditions! Accurate transfer of solution data is a difficult mathematical problem, and the solution data (and the mesh) are scattered across processors.

26 Remeshing and Solution Transfer. FEM: reassemble a serial boundary mesh. Call serial remeshing tools: YAMS, TetMesh™. FEM: partition the new serial mesh. Collision Library: match up the old and new volume meshes. Transfer Library: conservative, accurate volume-to-volume data transfer using the common-refinement method.
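
To make the common-refinement step concrete, here is a minimal 1D sketch (not the actual Transfer Library): the common refinement of two cell-centered meshes is just the union of their cell boundaries, and integrating the source field over each overlap interval before converting back to cell averages transfers the data while conserving its integral exactly.

    #include <algorithm>
    #include <vector>

    /* xs: source cell boundaries (ns+1 values), us: source cell averages (ns values),
       xt: target cell boundaries (nt+1 values); both meshes cover the same interval.
       Returns the nt target cell averages. */
    std::vector<double> transferConservative(const std::vector<double> &xs,
                                             const std::vector<double> &us,
                                             const std::vector<double> &xt) {
      std::vector<double> integral(xt.size() - 1, 0.0);
      size_t i = 0, j = 0;                 /* current source / target cell */
      double left = std::max(xs.front(), xt.front());
      while (i + 1 < xs.size() && j + 1 < xt.size()) {
        /* Next boundary of the common refinement: whichever cell ends first. */
        double right = std::min(xs[i + 1], xt[j + 1]);
        if (right > left)
          integral[j] += us[i] * (right - left);    /* source's contribution to target cell j */
        left = right;
        if (xs[i + 1] <= xt[j + 1]) i++; else j++;
      }
      std::vector<double> ut(xt.size() - 1);
      for (size_t k = 0; k + 1 < xt.size(); k++)
        ut[k] = integral[k] / (xt[k + 1] - xt[k]);  /* back to cell averages */
      return ut;
    }

The real transfer is 3D, parallel, and handles surface and volume fields, but the conservation argument is the same: every piece of the common refinement contributes its exact integral to exactly one target cell.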

27 Remeshing: Before. Deformation has distorted the elements.

28 Remeshing: Before (Closeup). Note the stretched elements on the boundary.

29 Remeshing: After (Closeup). After remeshing, the element size and shape are much better.

30 Remeshing: After. Remeshing restores element size and quality.

31 Gas Velocity: Deformed Mesh

32 Gas Velocity: New Mesh

33 Temperature: Deformed Mesh

34 Temperature: New Mesh

35 Remeshing: Continue the Simulation. In theory, we just continue the simulation using the new mesh, solution, and boundaries! In practice, this is not so easy with the real code (genx), which has lots of little pieces: fluids, solids, combustion, interface, ... Each has its own set of needs and input file formats! Prototype: treat remeshing like a restart. The remeshing system writes the mesh, solution, and boundaries to ordinary restart files; the integrated code thinks this is an ordinary restart, so few changes are needed inside genx.

36 Remeshed, after solution transfer

37-39 Can now continue the simulation using the new mesh (prototype; this caption repeats on slides 37 through 39)

40 Remeshed simulation resolves boundary better than old!

41 Remeshing: Future Work. Automatically decide when to remesh (currently manual). Remesh the solids domain (currently only Rocflu is supported). Remesh during a real parallel run (currently only a serial data format is supported). Remesh without using restart files. Remesh only part of the domain, e.g. a burning crack (currently the entire domain is remeshed at once). Remesh without using serial tools (currently YAMS and TetMesh are completely serial).

42 Remeshing and Solution Transfer. FEM: reassemble a serial boundary mesh. Call serial remeshing tools: YAMS, TetMesh™. FEM: partition the new serial mesh. Collision Library: match up the old and new volume meshes. Transfer Library: conservative, accurate volume-to-volume data transfer using the common-refinement method.

43 Parallel Mesh Refinement. To refine an element, split its longest edge; but if the neighbor across that edge has a longer edge of its own, split that edge first. Refinement propagates across the mesh but preserves mesh quality. An initial 2D parallel implementation is built on Charm++; a 3D version, with Delaunay flipping, is in progress. It interfaces with the FEM Framework.
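
The split-the-longest-edge rule with propagation is essentially Rivara-style longest-edge bisection. Below is an illustrative serial 2D sketch of that rule; the data structures and names are invented for the sketch and are not the FEM framework's refinement API (the real implementation is parallel and being extended to 3D).

    #include <vector>

    struct Node { double x, y; };
    struct Tri  { int n[3]; bool alive; };   /* node indices; marked dead after being split */

    struct Mesh {
      std::vector<Node> nodes;
      std::vector<Tri>  tris;

      double len2(int a, int b) const {      /* squared edge length */
        double dx = nodes[a].x - nodes[b].x, dy = nodes[a].y - nodes[b].y;
        return dx*dx + dy*dy;
      }
      void longestEdge(int t, int &a, int &b) const {
        const int *n = tris[t].n;
        a = n[0]; b = n[1];
        if (len2(n[1], n[2]) > len2(a, b)) { a = n[1]; b = n[2]; }
        if (len2(n[2], n[0]) > len2(a, b)) { a = n[2]; b = n[0]; }
      }
      bool hasEdge(int t, int a, int b) const {
        int c = 0;
        for (int i = 0; i < 3; i++)
          if (tris[t].n[i] == a || tris[t].n[i] == b) c++;
        return c == 2;
      }
      int neighbor(int t, int a, int b) const {   /* O(n) scan; fine for a sketch */
        for (int u = 0; u < (int)tris.size(); u++)
          if (u != t && tris[u].alive && hasEdge(u, a, b)) return u;
        return -1;
      }
      void bisect(int t, int a, int b, int m) {   /* split t by edge (a,b) at node m */
        int c = -1;
        for (int i = 0; i < 3; i++)
          if (tris[t].n[i] != a && tris[t].n[i] != b) c = tris[t].n[i];
        tris[t].alive = false;
        Tri t1 = { { a, m, c }, true }, t2 = { { m, b, c }, true };
        tris.push_back(t1); tris.push_back(t2);
      }
      /* Refine triangle t by longest-edge bisection, propagating to neighbors. */
      void refine(int t) {
        while (tris[t].alive) {
          int a, b; longestEdge(t, a, b);
          int nbr = neighbor(t, a, b);
          if (nbr >= 0) {
            int na, nb; longestEdge(nbr, na, nb);
            if (len2(na, nb) > len2(a, b)) {   /* neighbor has a longer edge:   */
              refine(nbr);                     /*   split that one first, retry */
              continue;
            }
          }
          /* Compatible pair (or boundary edge): split the shared edge once. */
          Node mid = { 0.5*(nodes[a].x + nodes[b].x), 0.5*(nodes[a].y + nodes[b].y) };
          nodes.push_back(mid);
          int m = (int)nodes.size() - 1;
          bisect(t, a, b, m);
          if (nbr >= 0) bisect(nbr, a, b, m);
        }
      }
    };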

44 Conclusion. Charm++'s advanced runtime is coming into wider use in CSAR, and its features apply to a variety of domains: AMPI, the FEM Framework, and Collision Detection. Charm++ brings these projects faster development and better performance.