7 th Annual Workshop on Charm++ and its Applications ParTopS: Compact Topological Framework for Parallel Fragmentation Simulations Rodrigo Espinha 1 Waldemar.

Slides:

Advertisements

Similar presentations

(a) Fracture Behavior Computer Simulation (d) Lab Testing

Advertisements

A system Performance Model Instructor: Dr. Yanqing Zhang Presented by: Rajapaksage Jayampthi S.

Parallelizing stencil computations Based on slides from David Culler, Jim Demmel, Bob Lucas, Horst Simon, Kathy Yelick, et al., UCB CS267.

Coupling Continuum Model and Smoothed Particle Hydrodynamics Methods for Reactive Transport Yilin Fang, Timothy D Scheibe and Alexandre M Tartakovsky Pacific.

Parallel Algorithms in STAPL Implementation and Evaluation Jeremy Vu, Mauro Bianco, Nancy Amato Parasol Lab, Department of Computer.

Network Morphospace Andrea Avena-Koenigsberger, Joaquin Goni Ricard Sole, Olaf Sporns Tung Hoang Spring 2015.

Parallel Simulations of Fracture and Fragmentation I. Arias, J. Knap and M. Ortiz California Institute of Technology Figure 1. The tractions at each side.

Software Version Control SubVersion software version control system WebSVN graphical interface o View version history logs o Browse directory structure.

Parallel Mesh Refinement with Optimal Load Balancing Jean-Francois Remacle, Joseph E. Flaherty and Mark. S. Shephard Scientific Computation Research Center.

Fracture and Fragmentation of Thin-Shells Fehmi Cirak Michael Ortiz, Anna Pandolfi California Institute of Technology.

Adaptive MPI Chao Huang, Orion Lawlor, L. V. Kalé Parallel Programming Lab Department of Computer Science University of Illinois at Urbana-Champaign.

Akhil Langer, Harshit Dokania, Laxmikant Kale, Udatta Palekar* Parallel Programming Laboratory Department of Computer Science University of Illinois at.

Finite element computations of the dynamic contact and impact of fragment assemblies is notoriously difficult, due to the strong non-linearity and non-smoothness.

Bridge the gap between HPC and HTC Applications structured as DAGs Data dependencies will be files that are written to and read from a file system Loosely.

Parallelization: Conway’s Game of Life. Cellular automata: Important for science Biology – Mapping brain tumor growth Ecology – Interactions of species.

Charm++ Load Balancing Framework Gengbin Zheng Parallel Programming Laboratory Department of Computer Science University of Illinois at.

The sequence of graph transformation (P1)-(P2)-(P4) generating an initial mesh with two finite elements GENERATION OF THE TOPOLOGY OF INITIAL MESH Graph.

ParFUM Parallel Mesh Adaptivity Nilesh Choudhury, Terry Wilmarth Parallel Programming Lab Computer Science Department University of Illinois, Urbana Champaign.

October 2008 Automation components for simulation-based engineering.

Parallelization Of The Spacetime Discontinuous Galerkin Method Using The Charm++ FEM Framework (ParFUM) Mark Hills, Hari Govind, Sayantan Chakravorty,

11 If you were plowing a field, which would you rather use? Two oxen, or 1024 chickens? (Attributed to S. Cray) Abdullah Gharaibeh, Lauro Costa, Elizeu.

Knowledge Extraction from Aerodynamic Design Data and its Application to 3D Turbine Blade Geometries Lars Graening

Scalable Algorithms for Structured Adaptive Mesh Refinement Akhil Langer, Jonathan Lifflander, Phil Miller, Laxmikant Kale Parallel Programming Laboratory.

Grid Computing With Charm++ And Adaptive MPI Gregory A. Koenig Department of Computer Science University of Illinois.

Adaptive Mesh Modification in Parallel Framework Application of parFUM Sandhya Mangala (MIE) Prof. Philippe H. Geubelle (AE) University of Illinois, Urbana-Champaign.

PIMA-motivation PIMA: Partition Improvement using Mesh Adjacencies  Parallel simulation requires that the mesh be distributed with equal work-load and.

Application Paradigms: Unstructured Grids CS433 Spring 2001 Laxmikant Kale.

Synchronization Transformations for Parallel Computing Pedro Diniz and Martin Rinard Department of Computer Science University of California, Santa Barbara.

Supercomputing ‘99 Parallelization of a Dynamic Unstructured Application using Three Leading Paradigms Leonid Oliker NERSC Lawrence Berkeley National Laboratory.

NIH Resource for Biomolecular Modeling and Bioinformatics Beckman Institute, UIUC NAMD Development Goals L.V. (Sanjay) Kale Professor.

Workshop on Operating System Interference in High Performance Applications Performance Degradation in the Presence of Subnormal Floating-Point Values.

Presentation by Tom Hummel OverSoC: A Framework for the Exploration of RTOS for RSoC Platforms.

Mesh Coarsening zhenyu shu Mesh Coarsening Large meshes are commonly used in numerous application area Modern range scanning devices are used.

1 ©2004 Board of Trustees of the University of Illinois Computer Science Overview Laxmikant (Sanjay) Kale ©

Distributed simulation with MPI in ns-3 Joshua Pelkey and Dr. George Riley Wns3 March 25, 2011.

Partitioning using Mesh Adjacencies  Graph-based dynamic balancing Parallel construction and balancing of standard partition graph with small cuts takes.

Parallelizing Spacetime Discontinuous Galerkin Methods Jonathan Booth University of Illinois at Urbana/Champaign In conjunction with: L. Kale, R. Haber,

Data Structures and Algorithms in Parallel Computing Lecture 7.

Solid Modeling 2002 A Multi-resolution Topological Representation for Non-manifold Meshes Leila De Floriani, Paola Magillo, Enrico Puppo, Davide Sobrero.

Pipelined and Parallel Computing Partition for 1 Hongtao Du AICIP Research Nov 3, 2005.

BOĞAZİÇİ UNIVERSITY – COMPUTER ENGINEERING Mehmet Balman Computer Engineering, Boğaziçi University Parallel Tetrahedral Mesh Refinement.

CS 351/ IT 351 Modeling and Simulation Technologies Review ( ) Dr. Jim Holten.

1 Data Structures for Scientific Computing Orion Sky Lawlor /04/14.

1 Rocket Science using Charm++ at CSAR Orion Sky Lawlor 2003/10/21.

ParMA: Towards Massively Parallel Partitioning of Unstructured Meshes Cameron Smith, Min Zhou, and Mark S. Shephard Rensselaer Polytechnic Institute, USA.

1 Overview (Part 1) Background notions A reference framework for multiresolution meshes Classification of multiresolution meshes An introduction to LOD.

Large Scale Parallel Graph Coloring 1. Presentation Overview Problem Description Basic Algorithm Parallel Strategy –Work Spawning –Graph Partition Results.

Predictive Load Balancing Using Mesh Adjacencies for Mesh Adaptation  Cameron Smith, Onkar Sahni, Mark S. Shephard  Scientific Computation Research Center.

University of Texas at Arlington Scheduling and Load Balancing on the NASA Information Power Grid Sajal K. Das, Shailendra Kumar, Manish Arora Department.

Load Rebalancing for Distributed File Systems in Clouds.

Strong Scalability Analysis and Performance Evaluation of a SAMR CCA-based Reacting Flow Code Sophia Lefantzi, Jaideep Ray and Sameer Shende SAMR: Structured.

APE'07 IV INTERNATIONAL CONFERENCE ON ADVANCES IN PRODUCTION ENGINEERING June 2007 Warsaw, Poland M. Nowakiewicz, J. Porter-Sobieraj Faculty of.

 Dan Ibanez, Micah Corah, Seegyoung Seol, Mark Shephard  2/27/2013  Scientific Computation Research Center  Rensselaer Polytechnic Institute 1 Advances.

Dynamic Mobile Cloud Computing: Ad Hoc and Opportunistic Job Sharing.

High Performance Computing Seminar

Scalable Dynamic Adaptive Simulations with ParFUM Terry L. Wilmarth Center for Simulation of Advanced Rockets and Parallel Programming Laboratory University.

Parallel Algorithm Oriented Mesh Database

Ana Gainaru Aparna Sasidharan Babak Behzad Jon Calhoun

Implementing Chares in a High-Level Scripting Language

ParFUM: High-level Adaptivity Algorithms for Unstructured Meshes

Performance Evaluation of Adaptive MPI

Component Frameworks:

GENERAL VIEW OF KRATOS MULTIPHYSICS

Gengbin Zheng, Esteban Meneses, Abhinav Bhatele and Laxmikant V. Kale

Ph.D. Thesis Numerical Solution of PDEs and Their Object-oriented Parallel Implementations Xing Cai October 26, 1998.

Higher Level Languages on Adaptive Run-Time System

Support for Adaptivity in ARMCI Using Migratable Objects

Parallel Implementation of Adaptive Spacetime Simulations A

Dynamic Load Balancing of Unstructured Meshes

Presentation transcript:

7 th Annual Workshop on Charm++ and its Applications ParTopS: Compact Topological Framework for Parallel Fragmentation Simulations Rodrigo Espinha 1 Waldemar Celes 1 Noemi Rodriguez 1 Glaucio H. Paulino 2 1. Computer Science Dept., Pontifical Catholic University of Rio de Janeiro, Brazil 2. Dept. of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign

7 th Annual Workshop on Charm++ and its Applications 2 Motivation Fragmentation simulations using extrinsic cohesive models –Evolutive problems in space and time –Cohesive elements inserted dynamically –Highly refined mesh at crack tip region

7 th Annual Workshop on Charm++ and its Applications 3 ParTopS 1 Parallel framework for finite element meshes –Distributed mesh representation Extension of the TopS 2 topological data structure –Parallel algorithm for inserting cohesive elements Extension of the serial algorithm by Paulino et al Espinha R, Celes W, Rodriguez N, Paulino GH (2009) ParTopS: Compact Topological Framework for Parallel Fragmentation Simulations. Submitted to Engineering with Computers 2. Celes W, Paulino GH, Espinha R (2005) A compact adjacency-based topological data structure for finite element mesh representation. Int J Numer Methods Eng 64(11):1529– Paulino GH, Celes W, Espinha R, Zhang Z (2008) A general topology-based framework for adaptive insertion of cohesive elements in finite element meshes. Engineering with Computers 24(1):59-78

7 th Annual Workshop on Charm++ and its Applications 4 Distributed mesh representation Sample mesh

7 th Annual Workshop on Charm++ and its Applications 5 Distributed mesh representation Communication layer Proxy element Proxy node Ghost node

7 th Annual Workshop on Charm++ and its Applications 6 Topological entities Element Node Facet –Interface between elements Edge Vertex

7 th Annual Workshop on Charm++ and its Applications 7 Cohesive elements True extrinsic elements –Inserted “on the fly”, where needed and when needed –No element activation or springs Two-facet elements Inserted between two adjacent bulk elements 2D 3D

7 th Annual Workshop on Charm++ and its Applications 8 Serial insertion of cohesive elements Insert cohesive element at a facet shared by E1 and E2 1. Create cohesive element at facet 2. Traverse non-cohesive elements adjacent to edges of E2 If E1 is not visited, duplicate edge and mid-nodes (if any) 3. Traverse non-cohesive elements adjacent to vertices of E2 If E1 is not visited, duplicated vertex E1 E2 E1 E2

7 th Annual Workshop on Charm++ and its Applications 9 Parallel insertion of cohesive elements At each simulation step –Analysis application identifies fractured facets –Insert cohesive elements 1. Insert elements at local and proxy facets 2. Update new proxy entities 3. Update affected ghost entities

7 th Annual Workshop on Charm++ and its Applications 10 Identification of fractured facets

7 th Annual Workshop on Charm++ and its Applications Insert elements at local and proxy facets Serial algorithm with additional constraints –Ghost nodes are not duplicated at this moment Dependence on remote information –All the copies of a new element or node must be owned by the same partition

7 th Annual Workshop on Charm++ and its Applications Insert elements at local and proxy facets

7 th Annual Workshop on Charm++ and its Applications Insert elements at local and proxy facets

7 th Annual Workshop on Charm++ and its Applications Insert elements at local and proxy facets

7 th Annual Workshop on Charm++ and its Applications Insert elements at local and proxy facets

7 th Annual Workshop on Charm++ and its Applications Insert elements at local and proxy facets Uniform criterion for selecting representative elements –E.g. the adjacent element with smallest (partition_id, element_id) Consistent topological results in both partitions

7 th Annual Workshop on Charm++ and its Applications Insert elements at local and proxy facets

7 th Annual Workshop on Charm++ and its Applications Update new proxy entities Create references from the new proxy elements and nodes to the corresponding real entities

7 th Annual Workshop on Charm++ and its Applications Update new proxy entities

7 th Annual Workshop on Charm++ and its Applications Update new proxy entities

7 th Annual Workshop on Charm++ and its Applications Update affected ghost entities Replace ghost nodes affected by remote cohesive elements –“Per-element” approach Partitions with elements adjacent to duplicated nodes notify others

7 th Annual Workshop on Charm++ and its Applications Update affected ghost entities

7 th Annual Workshop on Charm++ and its Applications 23 Resulting mesh

7 th Annual Workshop on Charm++ and its Applications 24 Verification Cluster of 12 machines –Intel(R) Pentium(R) D processor 3.40 GHz (dual core) with 2GB of RAM, Gigabit Ethernet Cohesive elements randomly inserted at 1% of internal facets x 50 steps Meshes with different discretizations and types of elements (T3, T6, Tet4, Tet10) 10 x 1%

7 th Annual Workshop on Charm++ and its Applications 25 Results

7 th Annual Workshop on Charm++ and its Applications 26 Summary ParTopS: parallel topological framework –Dynamic insertion of cohesive elements True extrinsic cohesive elements –Inserted “on the fly”, where needed and when needed Generic branching patterns are supported General 2D or 3D meshes Executed on a limited number of machines –However, linear scaling is expected

7 th Annual Workshop on Charm++ and its Applications 27 ParTopS Implementation using Charm++ Why Charm++? –Potential for optimization –Integrated load balancers –Set of available tools Implementation (so far) –Each mesh partition as a virtual processor (Chare) –Asynchronous method calls –SDAG used to manage control flow of the three phases of the algorithm Cohesive elements created asynchronously on partitions’ boundaries Bulk updates of modified data Implementation (objectives) –Explore asynchronous communication in mesh related algorithms –Explore load balancing in parallel fragmentation applications

7 th Annual Workshop on Charm++ and its Applications 28 Future work Execute on a large number of processors Other parallel adaptive operators –E.g. refinement & coarsening Mechanical analysis computer code

7 th Annual Workshop on Charm++ and its Applications 29 Thank you!