Unstructured Grids at Sandia National Labs

Jay Lofstead (with help from Paul Lin)
Scalable System Software, Sandia National Laboratories, Albuquerque, NM, USA
gflofst@sandia.gov
EIUG Workshop, September 25, 2017
SAND2017-10341 C

Where are Unstructured Grids used?
Primarily engineering codes, but many simulations as well:
- Finite volume
- Finite element
- Particle in cell
Sierra application family (https://sierradist.sandia.gov/)
Challenges addressed:
- Coupled multiphysics: solid + fluid + thermal + chemistry
- Large unstructured meshes: 10,000,000,000s of elements (5-10 doubles/element)
- Massively parallel computing: 100,000s of processors
- Advanced algorithms: mesh erosion, h-adaptivity, multilevel, dynamic load balancing
ATDM:
- SPARC (FV) – re-entry widget in atmosphere
- Empire (EM) – electromagnetic field
SIERRA -- A Computational Framework for Engineering Mechanics Applications (http://graphics.stanford.edu/sss/sierra_stanford_april02.pdf)
https://www.osti.gov/scitech/servlets/purl/1145754
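For readers less familiar with unstructured grids: unlike a structured grid, the mesh carries explicit element-to-node connectivity and per-node coordinates. The sketch below is illustrative only; the type and field names are hypothetical and do not come from Sierra or STK.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Illustrative only: a minimal per-block connectivity layout of the kind
// FE/FV codes carry. Names and sizes are hypothetical, not Sierra/STK types.
struct ElementBlock {
    std::string topology;                    // e.g. "HEX8" or "TET4"
    int nodes_per_element = 8;
    std::vector<std::int64_t> connectivity;  // flat array: element e uses entries
                                             // [e*nodes_per_element, (e+1)*nodes_per_element)
};

struct UnstructuredMesh {
    std::vector<double> x, y, z;             // one coordinate triple per node
    std::vector<ElementBlock> blocks;        // heterogeneous element types, one block each
};
```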

Sierra Applications

Sierra Architecture

Platforms we run on
- Sequoia (LLNL): 17 PF; 98,304 nodes, 1.57 million compute cores, 1.6 PB RAM
- Trinity (SNL/LANL): > 40 PF; 19,000 nodes, 301,000 cores (Xeon + KNL), ~2 PB RAM
We currently run full machine jobs on both machines and would like to run much larger!

Output Frequency
Steady state applications
- E.g., aircraft wing design at high velocity to see where the pressure point develops
- Single output when the calculation stabilizes
Non-steady state applications
- E.g., SPARC calculations of temperature, pressure, and velocity over a path
- Output periodically, ~ every 10 timesteps
Data sizes are O(1 PiB) for large runs!
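A minimal sketch of the two output cadences described above, with hypothetical stand-ins (advance_solution, has_converged, write_output) for the solver and the writer; this is not SPARC or Sierra code, only the control flow the slide describes.

```cpp
#include <cstdio>

// Hypothetical placeholders for one solver step, a convergence test, and output.
static void advance_solution(int /*step*/) {}
static bool has_converged(int step) { return step >= 90; }          // placeholder test
static void write_output(int step)  { std::printf("output at step %d\n", step); }

int main() {
    const bool steady_state = false;     // assumption: a transient (non-steady) run
    const int output_interval = 10;      // "~ every 10 timesteps"
    const int max_steps = 100;

    for (int step = 1; step <= max_steps; ++step) {
        advance_solution(step);
        if (steady_state) {
            if (has_converged(step)) { write_output(step); break; }  // single output
        } else if (step % output_interval == 0) {
            write_output(step);          // periodic output for transient runs
        }
    }
    return 0;
}
```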

Underlying Computation Frameworks
Wide variety
Trilinos linear algebra family (C++ and fully templatized)
- 17 years old and starting to make inroads (Epetra and now Tpetra)
- Can do FE/FV well, but not recognized as great for other setups
- Physics really determines if it can be used more than anything else
Hand-coded frameworks
- Built over time and optimized for exactly the problem at hand
- Very fast, but must be maintained
- Speed makes moving to something general like Trilinos unattractive
Others
- E.g., PETSc (Fortran interface available)
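As a flavor of the templated Trilinos stack mentioned above, a generic Tpetra usage sketch is shown below: build a distributed map and a sparse matrix over MPI. It assumes a recent Trilinos/Tpetra and placeholder problem sizes; it is not taken from any Sierra application.

```cpp
#include <Teuchos_RCP.hpp>
#include <Tpetra_Core.hpp>
#include <Tpetra_CrsMatrix.hpp>
#include <Tpetra_Map.hpp>

int main(int argc, char* argv[]) {
    Tpetra::ScopeGuard tpetraScope(&argc, &argv);   // initializes MPI and Kokkos
    {
        auto comm = Tpetra::getDefaultComm();

        using map_type    = Tpetra::Map<>;           // default local/global ordinal types
        using matrix_type = Tpetra::CrsMatrix<double>;

        const Tpetra::global_size_t numGlobalRows = 1000;  // placeholder problem size
        auto rowMap = Teuchos::rcp(new map_type(numGlobalRows, 0, comm));

        matrix_type A(rowMap, 3);   // at most 3 nonzeros per row (1-D Laplacian-like)
        // ... insertGlobalValues() for each locally owned row would go here ...
        A.fillComplete();           // finish construction; matrix is now usable
    }
    return 0;                       // ScopeGuard finalizes Kokkos/MPI on destruction
}
```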

Common IO interface
IO interface is standardized for Sierra (IOSS)
- Standardized interface for reading and writing to various formats
- Handles meshes and/or data
- Also used for code coupling
Exodus II interface is for file IO
- NetCDF is the underlying format; 1 file per process, but moving to 1 file
Nemesis: Exodus II extension for unstructured parallel finite element analyses
- Adds data structures helpful in partitioning
Fortran array ordering is generally standard

Exodus II
Optimized calls tailored to the Sierra (unstructured grid) applications, on top of NetCDF
- Long developed and proven to be both sufficient and effective
Currently 1 file per process, with a data redistributor process to change the distribution for a different sized run
- Running into Lustre metadata server limits and the sheer volume of parallel access to the PFS
Moving to 1 file total (still with NetCDF underneath)
- Thought this would solve all problems, but it will introduce new problems
Sandia analysis applications all know how to read and use these files.
http://prod.sandia.gov/techlib/access-control.cgi/1992/922137.pdf
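For concreteness, a minimal sketch of the Exodus II C API (callable from C++) that sits beneath IOSS is shown below: create a file, write the global parameters, and write the coordinates and connectivity for a single HEX8 element. Sizes and file names are placeholders, and a real Sierra run would go through IOSS rather than calling these routines directly.

```cpp
#include <exodusII.h>   // Exodus II C API, layered on NetCDF

int main() {
    int cpu_word_size = 8, io_word_size = 8;   // store coordinates as doubles
    int exoid = ex_create("mesh.exo", EX_CLOBBER, &cpu_word_size, &io_word_size);
    if (exoid < 0) return 1;

    // title, #dims, #nodes, #elements, #element blocks, #node sets, #side sets
    ex_put_init(exoid, "single hex example", 3, 8, 1, 1, 0, 0);

    double x[8] = {0, 1, 1, 0, 0, 1, 1, 0};
    double y[8] = {0, 0, 1, 1, 0, 0, 1, 1};
    double z[8] = {0, 0, 0, 0, 1, 1, 1, 1};
    ex_put_coord(exoid, x, y, z);

    // one element block containing one HEX8 element, then its node connectivity
    ex_put_block(exoid, EX_ELEM_BLOCK, 1, "HEX8", 1, 8, 0, 0, 0);
    int conn[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    ex_put_conn(exoid, EX_ELEM_BLOCK, 1, conn, nullptr, nullptr);

    ex_close(exoid);
    return 0;
}
```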

Nemesis
Focus on parallel extensions to Exodus II to make things easier to manage
- This helps a lot with load balancing
Strict superset of the Exodus II API, but 100% backwards compatible file organization and contents
Still have lots of files and potentially manual manipulation
https://www.researchgate.net/publication/255655344_NEMESIS_I_A_Set_of_Functions_for_Describing_Unstructured_Finite-Element_Data_on_Parallel_Computers

Futures
Data tagging
- Annotate blocks or variables for the presence of a value crossing a threshold (min/max)
- Identify feature existence from lightweight analysis
Tagging granularity varies
- Some applications tag more at the element level, others at the region or whole-variable level
- Physics affects what sort of tagging is useful/important
Actually efficient storage and access for parallel codes
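Since data tagging is a futures item, the sketch below is purely hypothetical: one way to represent the kind of per-block or per-variable tag described above (min/max, threshold crossing) so readers can skip blocks without touching the data. None of these names come from Sierra, IOSS, or Exodus.

```cpp
#include <limits>
#include <string>
#include <vector>

// Hypothetical tag record: lightweight metadata kept alongside the output.
struct BlockTag {
    std::string variable;            // e.g. "pressure"
    int block_id = -1;               // -1 could mean the tag covers the whole variable
    double min_value =  std::numeric_limits<double>::infinity();
    double max_value = -std::numeric_limits<double>::infinity();
    bool exceeds_threshold = false;  // set by lightweight (e.g. in-transit) analysis
};

// Granularity (element, block, region, whole variable) would vary by physics,
// as noted above; here the index is simply a flat list of tags.
using TagIndex = std::vector<BlockTag>;
```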

ATDM
Multi-physics in an asynchronous many task (AMT) framework
Faodail (the Data Warehouse component) offers both an AMT-integrated service and workflow/external access
Making this scalable is challenging (PIC codes, with their communication patterns, can be slow)

Questions? What else can I tell you about?