Enabling HPC Simulation Workflows for Complex Industrial Flow Problems
C.W. Smith, S. Tran, O. Sahni, and M.S. Shephard, Rensselaer Polytechnic Institute

Presentation transcript:

Enabling HPC Simulation Workflows for Complex Industrial Flow Problems
C.W. Smith, S. Tran, O. Sahni, and M.S. Shephard, Rensselaer Polytechnic Institute
S. Singh, Indiana University

Outline
- Industry requires complete HPC workflows
- RPI efforts on HPC for industry
- Components for parallel adaptive simulation
- Science gateway
- Application to complex industrial flow problems

HPC for Industry
- Increasingly, industry requires parallel analysis to meet its simulation needs, with key drivers being:
  - Higher spatial and temporal resolution
  - More complex physics, with many multiphysics problems
  - Increased use of validation and movement toward uncertainty quantification
- Reasonable progress is being made on the analysis engines:
  - Research codes that scale to nearly 1,000,000 cores on unstructured meshes
  - Commercial codes improving scaling to thousands of cores for flow problems
  - More reasonable software pricing models
- However, the application of HPC in industry is growing slowly, even though the economics of the product design cycle indicate it should be growing quickly.

HPC for Industry
- Why is the use of industrial HPC growing slowly? The analysis codes are available; what is missing?
- To obtain the potential cost benefits, the entire simulation workflow must be integrated into the HPC environment:
  - The workflow must include the tools industry has spent years integrating and validating in its processes
  - Need to use multiple CAD and CAE tools
- Effective industrial use of large-scale parallel computations will demand simulation reliability
- Must have a very high degree of automation – a human in the loop kills scalability and performance
- Need easy access to cost-effective parallel computers:
  - Must be able to do proprietary work
  - Must have easy-to-use parallel simulation management

HPC for Industry
- Approach being taken:
  - A component-based approach to integrate from design through results quantification
  - Link to industry design data (e.g., CAD geometry)
  - Manage the model construction directly on massively parallel computers
  - Support the use of multiple analysis engines
  - Support simulation automation
  - Support in-memory integration of components as much as possible to avoid I/O bottlenecks
  - Provide a web-based portal for execution of massively parallel simulation workflows
- This presentation will focus on components developed for parallel adaptive unstructured mesh simulations

Rensselaer’s Efforts to Bring HPC to Industry
- Scientific Computation Research Center (SCOREC)
  - Parallel methods for unstructured meshes and adaptive simulation control
  - Component-based methods for developing parallel simulation workflows
- Center for Computational Innovations
  - Petaflop IBM Blue Gene/Q and clusters
  - Industry can gain guaranteed access to run proprietary applications (for a price less than cloud computing)
- Programs for HPC for industry
  - HPC2 – New York State HPC consortium
  - NSF Partnership for Innovation
  - NSF XSEDE Industrial Challenge Program
[Figure: distributed mesh with on-node and off-node part boundaries across parts P0, P1, P2 on nodes i and j]

SCOREC’s Research Builds on Broad Partnerships
- Interdisciplinary research program supported by government – NSF, DOE, DoD, NASA, NIH, NY State
- Strong industrial support – 46 different companies have supported SCOREC
  - Multiple pieces of software have been commercialized
  - The center has generated a software vendor
- Multi-way partnerships are common:
  - Large industry, software vendor, SCOREC
  - SBIR from government agencies to software vendor and SCOREC
  - Government laboratory, software vendor(s), SCOREC
  - University, SCOREC, etc.

Center for Computational Innovations
- IBM Blue Gene/Q petaflop computer
  - 5,120 compute nodes
  - Each node has 16 A2 processing cores, with a 17th core for OS functions
  - 16 GB of RAM per node, 80 TB of RAM system wide
  - 56 Gb/s InfiniBand external network
  - 160 nodes for data I/O
  - 1.2 PB parallel file system

High Performance Computation Consortium (HPC2)
- Supported by the NYSTAR Division of the Empire State Development Agency
- Goal is to provide NY State industry support in the application of high performance computing technologies in:
  - Research and discovery
  - Product development
  - Improved engineering and manufacturing processes
- HPC2 works with the NY State Centers for Advanced Technology, which serve as focal points for technology transfer to industry
- HPC2 is a distributed activity – key participants:
  - Rensselaer
  - Stony Brook/Brookhaven
  - SUNY Buffalo
  - NYSERNET
[Figure: time-averaged experimental and CFD comparison for an active flow control device (Sahni et al.)]

NSF Sponsored Activities on HPC for Industry
- Partnership for Interoperable Components for Parallel Engineering Simulations
  - Technologies to make construction of HPC workflows more efficient
  - Component-based methods supporting combinations of open source and commercial software
  - Mechanisms to help industry effectively apply HPC
- NSF XSEDE Industrial Challenge Program
  - Install components for parallel adaptive simulations on XSEDE machines
  - Develop HPC workflows for industry on XSEDE machines
  - Investigate use of Phi co-processors on the Stampede system for parallel adaptive unstructured mesh simulations

Recent Industrial Partners
- ACUSIM (now Altair)
- Ames-Goldsmith
- Blasch Ceramics
- Boeing
- Calabazas Creek Research
- Corning
- Crystal IS
- GE
- HyPerComp
- IBM
- ITT Gould Pumps
- Kitware
- Northrop Grumman
- Pliant
- Procter & Gamble
- Sikorsky
- Simmetrix
- Xerox

Component-Based Unstructured Mesh Infrastructure
[Diagram: parallel data & services – domain topology, mesh topology/shape, simulation fields, partition control, dynamic load balancing]
- Parallel data and services are the core:
  - Abstraction of geometric model topology (GMI or GeomSim)
  - Mesh also based on topology – it must be distributed (PUMI or MeshSim); growing need for distributed geometry (GeomSim)
  - Simulation fields are tensors with distributions over geometric model and mesh entities (APF or FieldSim)
  - Partition control must coordinate communication and partition updates
  - Dynamic load balancing is required at multiple steps in the workflow to account for mesh changes and application needs (Zoltan and ParMA)
- PUMI, GMI, APF, and ParMA are SCOREC research codes
- GeomSim, MeshSim, and FieldSim are component-based tools from Simmetrix
- Zoltan is from Sandia National Laboratories
[Figure: distributed mesh with on-node and off-node part boundaries across parts P0, P1, P2 on nodes i and j]
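A minimal sketch of how a workflow driver might assemble these components, assuming the open-source SCOREC core (PUMI/GMI/APF) C++ API; the file names model.dmg and mesh.smb are placeholders, and exact call names and signatures should be checked against the library version in use:

```cpp
#include <mpi.h>
#include <PCU.h>
#include <gmi_mesh.h>
#include <apf.h>
#include <apfMDS.h>
#include <apfMesh2.h>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  PCU_Comm_Init();
  gmi_register_mesh();  // register the native geometric model (GMI) reader
  // Load the geometric model and the distributed mesh (one part per process).
  apf::Mesh2* m = apf::loadMdsMesh("model.dmg", "mesh.smb");
  // Attach a vector-valued simulation field (APF) to the mesh vertices.
  apf::Field* sol = apf::createFieldOn(m, "solution", apf::VECTOR);
  apf::zeroField(sol);
  apf::verify(m);       // sanity-check mesh and partition consistency
  // ... hand the mesh and field to analysis, adaptation, and
  //     load-balancing components here ...
  apf::destroyField(sol);
  m->destroyNative();
  apf::destroyMesh(m);
  PCU_Comm_Free();
  MPI_Finalize();
  return 0;
}
```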

Distributed Mesh and Partition Control
- Distributed mesh requirements:
  - On-part operations proceed without communication
  - Communication goes through the partition model
- Services:
  - Mesh migration – moving mesh between parts
  - Ghosting – read-only copies to reduce communication
  - Changing the number of parts
[Figure: geometric model, partition model, and distributed mesh showing inter-process and intra-process part boundaries for processes i and j]
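For illustration, a sketch of the migration service assuming PUMI's migration-plan interface (apf::Migration); the selection rule here is arbitrary and only shows how elements are paired with destination parts:

```cpp
#include <PCU.h>
#include <apf.h>
#include <apfMesh2.h>

// Shift a fraction of this part's elements to the next part using a
// migration plan. The plan maps each selected element to a destination part;
// migrate() then moves the elements and updates the part boundaries.
void shiftSomeElements(apf::Mesh2* m, double fraction) {
  const int self = PCU_Comm_Self();
  const int peers = PCU_Comm_Peers();
  const int dest = (self + 1) % peers;
  apf::Migration* plan = new apf::Migration(m);
  const long target = (peers > 1)
      ? static_cast<long>(fraction * m->count(m->getDimension()))
      : 0;  // nothing to move on a single part
  long marked = 0;
  apf::MeshIterator* it = m->begin(m->getDimension());
  apf::MeshEntity* e;
  while ((e = m->iterate(it)) && marked < target) {
    plan->send(e, dest);   // element e will be migrated to part 'dest'
    ++marked;
  }
  m->end(it);
  m->migrate(plan);        // executes the plan and takes ownership of it
}
```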

Dynamic Load Balancing
- Goal: equal “work” with minimum communication
- Tools:
  - Graph-based (ParMETIS, Zoltan)
  - Geometry-based (Zoltan, Zoltan2)
  - Mesh-based (ParMA)
  - Local/global
- Load balancing throughout the simulation:
  - Need fast methods – balancing cannot dominate the runtime
  - Need predictive load balancing to account for mesh adaptation (see the sketch below)
  - Need to account for the needs of specific workflow components
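A minimal sketch of the predictive-weighting idea, with hypothetical inputs (the function and its parameters are invented for this sketch): each element's partitioning weight is set to the number of elements it is expected to become after adaptation, estimated from the ratio of its current size to the size requested by the error estimator; the weights would then be handed to a partitioner such as Zoltan or ParMA so the post-adaptation mesh is balanced.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// currentSize[i]  : current edge-length scale of element i
// requestedSize[i]: edge length requested by the error estimator for element i
// dim             : mesh dimension (3 for tetrahedra)
std::vector<double> predictiveWeights(const std::vector<double>& currentSize,
                                      const std::vector<double>& requestedSize,
                                      int dim) {
  std::vector<double> weights(currentSize.size(), 1.0);
  for (std::size_t i = 0; i < currentSize.size(); ++i) {
    double ratio = currentSize[i] / requestedSize[i];
    // Isotropic refinement splits one element into roughly ratio^dim elements.
    weights[i] = std::max(1.0, std::pow(ratio, dim));
  }
  return weights;  // pass to the partitioner as per-element weights
}
```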

Gateway Execution
- There is a high barrier to running HPC workflows: it requires knowledge of the filesystem, scheduler, scripting, runtime environment, compilers, … for each HPC system
- The XSEDE science gateway for PHASTA lowers the barrier:
  - The user specifies the problem definition, simulation parameters, and required compute resources through an experiment creation web page
  - Workflow steps are executed on the HPC system, the user is emailed, and the output is prepared for download, with the option to delete or archive it
  - Scales to multiple users and systems

Gateway Creation and Maintenance
- A system and user software expert maintains builds and execution scripts:
  - Optimized builds and runtime parameters
  - Web interface for defining the workflow
- XSEDE gateway developers quickly accommodate user requests through the SciGaP and Airavata APIs:
  - Output log monitoring – monitor job output from the web interface
  - Email notifications – completion, failure, application-specific milestones, …
  - Data persistence – an industrial user wants data deleted after the run
  - Configuring HPC system access – adding RPI Blue Gene/Q support
[Figure: twin-screw extruder axial velocity: (left) two threads of the screw and (right) cross-section across the extruder]
[Figure: PHASTA gateway experiment summary]

Component-Based Unstructured Mesh Infrastructure

- File transfer is a serious bottleneck in parallel simulation workflows
- All core parallel data and services are accessed through APIs:
  - File-based workflows require no change to components
  - First implementations are often done via files, but still go through the APIs
  - In-memory integration approaches use the APIs directly
- Support effective migration from file-based to in-memory integration:
  - For “file-based” codes – replace I/O routines with routines that use the APIs to transfer data between data structures (see the sketch below)
  - For more component-based codes, the in-memory integration was easier to implement than the file-based approach
- In-memory integration has far superior parallel performance
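A hypothetical illustration of that migration path: the solver is written against a small transfer interface, so swapping a file reader for an in-memory adapter does not change solver code. The names below (FieldTransfer, FileTransfer, InMemoryTransfer) are invented for this sketch, not APIs of the codes discussed here.

```cpp
#include <string>
#include <vector>

// Interface the solver codes against; only the adapter changes.
struct FieldTransfer {
  virtual std::vector<double> readSolution() = 0;
  virtual void writeSolution(const std::vector<double>& u) = 0;
  virtual ~FieldTransfer() {}
};

// File-based coupling: each workflow step serializes to disk (the bottleneck).
struct FileTransfer : FieldTransfer {
  explicit FileTransfer(const std::string& path) : path_(path) {}
  std::vector<double> readSolution() override { /* parse path_ ... */ return {}; }
  void writeSolution(const std::vector<double>&) override { /* write path_ ... */ }
  std::string path_;
};

// In-memory coupling: the same calls copy (or alias) data already held by the
// preceding component, avoiding the filesystem entirely.
struct InMemoryTransfer : FieldTransfer {
  explicit InMemoryTransfer(std::vector<double>* shared) : shared_(shared) {}
  std::vector<double> readSolution() override { return *shared_; }
  void writeSolution(const std::vector<double>& u) override { *shared_ = u; }
  std::vector<double>* shared_;
};
```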

Adaptive Loop Applications
- Adaptive loops to date have been used for:
  - Modeling of nuclear accidents and various flow problems with the University of Colorado’s PHASTA code
  - Solid mechanics applications with Sandia’s Albany code
  - Modeling fusion MHD with PPPL’s M3D-C1 code
  - Accelerator modeling problems with SLAC’s ACE3P code
  - Aerodynamics problems with NASA’s Fun3D code
  - Waterway flow problems with ERDC’s Proteus code
  - High-order fluids simulations with Nektar++
[Figure: modeling a dam break]
[Figure: plastic deformation of a mechanical part]
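Each of these couplings follows the same basic solve/estimate/adapt/balance cycle built from the components described earlier; a generic sketch, with hypothetical component interfaces standing in for the codes and services named above (the names are not the actual APIs):

```cpp
// Control flow of a parallel adaptive simulation loop.
struct Mesh;            // distributed mesh plus attached fields
struct Solver          { virtual void solve(Mesh&) = 0; virtual ~Solver() {} };
struct ErrorEstimator  { virtual bool withinTolerance(const Mesh&) = 0;
                         virtual void setSizeField(Mesh&) = 0;
                         virtual ~ErrorEstimator() {} };
struct MeshAdapter     { virtual void adapt(Mesh&) = 0; virtual ~MeshAdapter() {} };
struct LoadBalancer    { virtual void rebalance(Mesh&) = 0; virtual ~LoadBalancer() {} };

void adaptiveLoop(Mesh& m, Solver& solver, ErrorEstimator& est,
                  MeshAdapter& adapter, LoadBalancer& balancer, int maxCycles) {
  for (int cycle = 0; cycle < maxCycles; ++cycle) {
    solver.solve(m);                 // run the analysis code on the current mesh
    if (est.withinTolerance(m))      // stop when the error estimate is acceptable
      break;
    est.setSizeField(m);             // request new mesh resolution from the estimate
    balancer.rebalance(m);           // predictive balance before adaptation
    adapter.adapt(m);                // in-memory mesh adaptation and solution transfer
    balancer.rebalance(m);           // rebalance for the next solve
  }
}
```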

Complex Flow Simulations

Active Flow Control on Vertical Tail
- Active flow control (AFC) on the vertical tail improves its effectiveness, resulting in a reduction of the drag/size of the tail.
- Massively parallel, petascale simulations provide tremendous physical insights.
- Integrated experimental and numerical investigation at the University of Colorado Boulder and RPI.

Adaptive Two-Phase Flow
- Two-phase modeling using level sets coupled to structural activation
- Adaptive mesh control reduces the mesh required from 20 million elements to 1 million
- New ARO project using explicit interface tracking to track reacting particles

Modeling Ceramic Extrusion
- Objectives
  - Develop an end-to-end workflow for modeling ceramic extrusion
- Tools
  - SimModeler – mesh generation and problem definition
  - PHASTA – massively parallel CFD
  - Chef – pre-processing, solution transfer, and mesh adaptation driver
  - Kitware ParaView – visualization
- Status and Plans
  - Added a non-linear material model and partial-slip boundary condition to PHASTA
  - Extended the pre-processor to support the partial-slip boundary condition
  - Created an XSEDE web-based gateway for automated execution of the workflow
  - Planning gateway support for CCI
[Figure: twin-screw extruder velocity and pressure fields]

Aerodynamics Simulations

NASA Trap Wing
[Figure: zoom of the leading edge of the main wing for the initial (LEV0) and adapted (LEV2) meshes, with Cp plots near the tip]

Summary
- Technologies and tools needed to create effective HPC workflows for industry are available
  - However, it is not a “field of dreams” – just building the tools will not get industry to come use them
  - Need to work with industry to create effective simulation workflows that address their needs
- Progress is being made on developing the needed tools and mechanisms – more progress is needed:
  - Requires too much expertise
  - Takes too much time/effort
- Even with additional improvement, expect that it will still be a “contact sport” requiring interactions between computational scientists and the engineers who will use the simulations