HPCMP Benchmarking and Performance Analysis


HPCMP Benchmarking and Performance Analysis
Mark Cowan, USACE ERDC ITL, in support of the DoD HPCMP
Tuesday, April 17, 2012

What is the HPCMP?
- Initiated in 1992
- Congressional mandate to modernize DoD's HPC capabilities
- Assembled from a collection of HPC departments across Army, Air Force, and Navy labs and test centers

What is the HPCMP? FOCUS:
- Solve military and security problems using HPC hardware and software
- Assess technical and management risks: performance, time, available resources, cost, schedule
- Supports DoD objectives through research, development, test, and evaluation

Where we benchmark

Migrate to a 2-year acquisition cycle: why the radical change?
- Entice more vendors into the competition
- Vendor feedback: remove or alleviate disincentives
- Review the entirety of the TI acquisition process
  - Line-by-line justification of the benchmarking rules document
  - Address both HPC community and vendor concerns
- Comprehensive reevaluation of how we benchmark
  - Analyze the codes
  - Justify the test cases

Migrate to a 2-year acquisition cycle: dangers?
- Time the milestones poorly on the calendar and miss the release of cutting-edge technology
- A difficult problem: how do we schedule activities to maximize the likelihood of hitting publicly available products months in advance, while being blind to the intricacies of chip fabrication schedules and unforeseen recalls?

Codes considered for TI-11/12:
ABAQUS, ABINIT, ACES, ADCIRC, ADH, ALE3D, ALEGRA, AMR, AVUS, CFD++, CFDSHIP-IOWA, COAMPS, COBALT, CP2K, CPMD, CTH, ETA, FDTD, FLAPW, FLUENT, GAMESS, GASP, GAUSSIAN, HYCOM, ICEPIC, LAMMPS, LS-DYNA, MATLAB, OOCORE, OVERFLOW, SHAMRC, SIERRA, STAR CCM+, VASP, WRF, XPATCH

TI-11/12 benchmarking applications
- ADCIRC – Coastal Circulation and Storm Surge model; 100% Fortran, MPI; uses the METIS library (C); 205K LOC
- ALEGRA – Hydrodynamics and solid dynamics plus magnetic field and thermal transport; 96% C, 4% Fortran; MPI; 978K LOC
- AVUS (Cobalt-60) – Turbulent-flow CFD code; Fortran, MPI; 29K LOC
- CTH – Shock physics code; ~58% Fortran / ~42% C; MPI; 900K LOC
- GAMESS – Quantum chemistry code; Fortran, MPI; 330K LOC
- HYCOM – Ocean circulation modeling code; Fortran, MPI; 31K LOC
- ICEPIC – Particle-in-cell magnetohydrodynamics code; C, MPI; 350K LOC
- LAMMPS – Molecular dynamics code; C++, MPI; 45K LOC
█ Predicted █ Benchmarked

Components of testing packages
Applications tested on representative input sets:

CODE     CASE        Distinguished   Time (sec)    Core Counts
                     Core Count      on DIAMOND
ADCIRC   baroclinic  1024            8959          512, 768, 1024, 1280, 1536, 1792, 2048
         hurricane   1280            2082
ALEGRA   obliqueImp  1536            1640          1024, 1280, 1536, 1792, 2048
         explWire    256             944           256, 384, 512, 768, 1024
AVUS     waverider                   941           384, 512, 768, 1024, 1536
         turret-td                   1332          768, 1024, 1280, 1536, 2048
CTH      fixed-grid                  3399
         amr                         2535
GAMESS   DFT-grad                    4701          128, 192, 256, 384, 512
         MP2-grad    512             2536          128, 256, 512, 768, 1024
         CC-energy                   3658          512, 768, 1024, 1536, 2048
HYCOM    lrg         1353            3020          1001, 1353, 1516, 1770, 2045
ICEPIC   magnetron   384             2559          256, 384, 512, 768, 1024
         gyrotron    2048            3639          1536, 1792, 2048, 2304, 2560
LAMMPS   Au                          3182          128, 256, 384, 512, 1024, 1280, 1536, 2048

Some components of HPC procurement cycle

Some components of HPC procurement cycle
- Acquire new versions of codes
- Port codes to various machines
- Acquire test cases
- Develop or acquire accuracy checks
- Test codes, get times to compare
- Assemble package for vendors

Some components of HPC procurement cycle
- Run codes with test cases on installed DSRC machines
- Optimize! How fast can we go?

Some components of HPC procurement cycle
- We review the vendor submittal
  - Anything suspicious?
  - How do vendor times compare to ours?
  - How did vendors optimize?
  - How risky is the vendor's proposal?
- Present our results

Components of testing packages continued
- Timers measure the elapsed running times
- Accuracy checks ensure the validity of output files; this often requires determining acceptable error bounds (both pieces are sketched below)
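As a rough illustration only (this is not the actual HPCMP test harness), the C/MPI fragment below shows the two mechanics named above: an elapsed-time measurement around a benchmarked region using MPI_Wtime, and an accuracy check of one output quantity against a reference value within an assumed relative-error bound. The result extraction, reference value, and tolerance are placeholders.

    /* Minimal sketch, not the HPCMP test package: time a benchmarked region
     * and check one scalar result against a reference within a tolerance. */
    #include <math.h>
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        double t0 = MPI_Wtime();
        /* ... run the application test case here ... */
        double t1 = MPI_Wtime();

        /* The reported time is the wall clock of the slowest rank. */
        double elapsed = t1 - t0, max_elapsed;
        MPI_Reduce(&elapsed, &max_elapsed, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            double result    = 0.0;     /* placeholder: scalar pulled from the output file   */
            double reference = 0.0;     /* placeholder: value from a validated reference run */
            double tol       = 1.0e-6;  /* placeholder: acceptable error bound, case-dependent */
            double err = fabs(result - reference) / fmax(fabs(reference), 1.0);
            printf("elapsed %.2f s, relative error %.2e -> %s\n",
                   max_elapsed, err, err <= tol ? "PASS" : "FAIL");
        }

        MPI_Finalize();
        return 0;
    }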

How the test packages are used
- Run all test cases on 5 different DSRC machines to acquire times
- Debug test packages
- Quantify variation across/within machines (one possible statistic is sketched below)
- Compare times to proposed systems
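The slide does not say which statistic the team uses; one common way to quantify run-to-run variation on a single machine is the coefficient of variation of repeated timings, as in this small C sketch (the sample times are hypothetical):

    #include <math.h>
    #include <stdio.h>

    /* Coefficient of variation (sample std. dev. / mean) of repeated timings.
     * Illustrative only; assumes n > 1 and a positive mean. */
    double coeff_of_variation(const double *t, int n)
    {
        double mean = 0.0, ss = 0.0;
        for (int i = 0; i < n; ++i) mean += t[i];
        mean /= n;
        for (int i = 0; i < n; ++i) ss += (t[i] - mean) * (t[i] - mean);
        return sqrt(ss / (n - 1)) / mean;
    }

    int main(void)
    {
        /* Hypothetical repeated runs of one test case on one machine (seconds). */
        double times[] = { 1000.0, 1012.0, 995.0, 1023.0, 1004.0 };
        printf("run-to-run variation = %.2f%%\n", 100.0 * coeff_of_variation(times, 5));
        return 0;
    }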

Machine attributes
Architectures Used in Study:

DSRC    Name      Make   Model           Chip Set            Speed (GHz)   Interconnect      Cores   Cores/Node   OS
ERDC    Diamond   SGI    Altix ICE       Intel Xeon QC       2.8           DDR4 InfiniBand   15360   8            SUSE Linux
MHPCC   Mana      Dell   PowerEdge M610                                    DDR InfiniBand    9216                 Linux
NAVY    DaVinci   IBM    Power6          IBM P6 DC           4.7           DDR InfiniBand    4800    32           AIX
        Einstein  Cray   XT5             Opteron QC          2.3           SeaStar2+         12736                CNL
        Garnet    Cray   XE6             AMD Opteron 64-bit  2.4           Cray Gemini       20224   16           CLE

RESULTS! Graphs of runtimes

Risk Assessment: Major Areas Assessed
- Compliance assessment
  - Ability to follow benchmark rules
  - Number of test case results provided
  - Results within accuracy criteria
- Assessment of risk in meeting proposed times in acceptance tests
  - Differences between benchmarked and proposed system: processor, interconnect, and I/O system differences
  - Quality of estimation procedure: quality of explanation, soundness of the estimation procedure, and aggressiveness of the final estimate
  - Comparison with measured benchmark system times
  - Comparison with predicted times
- Assessment of likelihood of users and/or developers using proposed code modifications
  - Acceptability of proposed code modifications

Benchmarking website URL: http://www.benchmarking.hpc.mil/

Benchmarking website continued
- Narrative of website purpose, codes tested
- Heatmap of systems best suited for applications

Benchmarking website continued
- Brief description of application
- Brief description of test cases

Benchmarking website continued
- An example of how we made the heatmap for allocation choices

Benchmarking website continued
- Got a question? Want to suggest an improvement? Contact us.

Performance Team Members
- Mark Cowan – ERDC – Chair
- Larry Davis – HPCMPO
- Lloyd Slonaker – AFRL
- Tim Sell – AFRL
- Laura Brown – ERDC
- Mahbubur Rashid – ERDC
- Christine Cuicchi – NAVO
- Matt Grismer – AFRL
- Jerry Boatz – AFRL

Performance Team Advisors
- William Ward – HPCMPO
- Steve Finn – DTRA
- Carrie Leach – ERDC
- Paul Bennett – ERDC
- Tom Oppe – ERDC
- Henry Newman – Instrumental
- Michael Laurenzano – SDSC
- Bronis de Supinski – LLNL
- Joseph Swartz – LM
- Allan Snavely – SDSC
- Laura Carrington – SDSC
- Robert Pennington – NSF
- Nick Wright – NERSC
- James Ianni – ARL

Questions?

Contact me…
Mark Cowan
USACE ERDC ITL, Computational Analysis Branch
3909 Halls Ferry Road, Building 8000, Room 1255
Vicksburg, MS 39180
(601) 634-2665
Mark.A.Cowan@usace.army.mil

ADDENDA

AVUS: Code description
- CFD code, formerly COBALT_60
- Simulates 3-D turbulent viscous flow over irregular geometries
- Grid-based; reads a large grid file
- AVUS: 29K lines of Fortran 90 code; uses ParMETIS: 12K lines of C code
- Parallelism via MPI, no OpenMP
- Runs on Cray XT, IBM Power, SGI Altix, and Linux clusters

CTH: Code description
- CTA: CSM (Computational Structural Mechanics), shock physics
- A two-step, 2nd-order accurate Eulerian algorithm is used to solve the mass, momentum, and energy conservation equations
- An explicit approach that does not require solving a linear system (illustrated in the sketch below)
- Has both static and adaptive mesh capabilities
- Parallelism via MPI
- 900K LOC, 58% Fortran and 42% C
- Uses NetCDF, supplied with the distribution
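CTH's actual two-step, second-order scheme is far more elaborate, but the key property stated above, an explicit update that never assembles or solves a linear system, can be seen in a generic first-order, 1-D advection step (an illustrative sketch, not CTH code):

    /* Generic explicit Eulerian update: first-order upwind, 1-D advection.
     * Each new cell value is a local combination of old values, so no
     * linear system is assembled or solved. Not CTH's actual algorithm. */
    void advect_step(int n, double dt, double dx, double vel,
                     const double *u_old, double *u_new)
    {
        double c = vel * dt / dx;            /* CFL number; assumes 0 <= c <= 1 */
        u_new[0] = u_old[0];                 /* simple inflow boundary value    */
        for (int i = 1; i < n; ++i)
            u_new[i] = u_old[i] - c * (u_old[i] - u_old[i - 1]);
    }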

GAMESS: Code description
- CTA: CCM (Computational Chemistry, Biology, and Materials Science)
- Ab initio quantum chemistry
- Computes many energy integrals, with molecular data in the form of atom positions and electron orbitals
- Communication depends on platform: LAPI, sockets, SHMEM, MPI
- Code composition: 99% Fortran, 1% C

HYCOM: Code description
- CTA: Climate/Weather/Ocean Modeling and Simulation (CWO)
- A primitive-equation ocean general circulation model
- Communication is MPI (MPI-2 is available)
- 100% Fortran
- Version 2.2.27

HYCOM: MPI-2 details
- HYCOM may be run with MPI or MPI-2
- MPI-2 is MPI with additional features such as parallel I/O, dynamic process management, and remote memory operations
- HYCOM utilizes the parallel I/O feature (sketched below)
- Parallel I/O times have been required starting with TI-10
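For readers unfamiliar with the MPI-2 feature mentioned above, the C sketch below shows collective parallel I/O in its simplest form: each rank writes its own block of a shared file at its own offset rather than funnelling data through one rank. This is illustrative only and is not HYCOM's actual I/O layer; the routine name and data layout are assumptions.

    #include <mpi.h>

    /* Each rank writes nlocal doubles to its own offset of a shared file
     * using collective MPI-2 I/O. Illustrative only, not HYCOM routines. */
    void write_block(MPI_Comm comm, char *path, double *local, int nlocal)
    {
        int rank;
        MPI_Comm_rank(comm, &rank);

        MPI_File fh;
        MPI_File_open(comm, path, MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);

        /* Offset of this rank's contiguous block in the shared file. */
        MPI_Offset offset = (MPI_Offset)rank * nlocal * (MPI_Offset)sizeof(double);
        MPI_File_write_at_all(fh, offset, local, nlocal, MPI_DOUBLE,
                              MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
    }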

ICEPIC: Code description
- CTA: Computational Electromagnetics and Acoustics (CEA)
- Particle-in-cell plasma physics code
- Ions and electrons move under the influence of electromagnetic fields
- Particles are updated in a grid-free manner and grouped into cells, which are periodically adjusted to preserve load balance
- Fields are calculated on a structured, static grid and its dual grid according to Maxwell's equations (the basic gather/push/deposit cycle is sketched below)
- Can simulate plasmas contained in complex geometries
- Used in electromagnetic device design
- ~350K lines of code, 100% C++/C
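To make the gather/push/deposit cycle concrete, here is a bare-bones 1-D electrostatic particle-in-cell step in C. It illustrates the general method only and is not ICEPIC source: ICEPIC is fully electromagnetic and three-dimensional, and the grid field solve is omitted here.

    /* Minimal 1-D PIC cycle: gather field to particles, push particles,
     * deposit charge back to the grid. E and rho live on nx+1 grid nodes;
     * particles are assumed to stay inside the domain [0, nx*dx). */
    typedef struct { double x, v, q, m; } Particle;

    void pic_step(Particle *p, int np, const double *E, double *rho,
                  int nx, double dx, double dt)
    {
        for (int j = 0; j <= nx; ++j)
            rho[j] = 0.0;

        for (int i = 0; i < np; ++i) {
            /* Gather: linearly interpolate the grid field to the particle. */
            int    j = (int)(p[i].x / dx);
            double w = p[i].x / dx - j;
            double Ep = (1.0 - w) * E[j] + w * E[j + 1];

            /* Push: explicit update of velocity and position. */
            p[i].v += (p[i].q / p[i].m) * Ep * dt;
            p[i].x += p[i].v * dt;

            /* Deposit: scatter the particle's charge back to the grid. */
            j = (int)(p[i].x / dx);
            w = p[i].x / dx - j;
            rho[j]     += (1.0 - w) * p[i].q / dx;
            rho[j + 1] += w * p[i].q / dx;
        }
    }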

LAMMPS: Code description
- CTA: CCM (Computational Chemistry, Biology, and Materials Science)
- Classical molecular dynamics code that models particles in a liquid, solid, or gaseous state
- Calculates atomic velocities, positions, system energy, and temperature (a generic integration step is sketched below)
- After equilibration: surface tension, radial pressure, and phase change
- Post-processing: pair-correlation function and diffusion coefficients
- All actions occur within a box (usually orthogonal)
- Distributed-memory message-passing parallelism (MPI)
- Highly portable C++
- Libraries needed: MPI and a single-processor FFT
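As a generic illustration of the position/velocity update performed by classical MD codes of this kind (not LAMMPS source; the forces callback is a hypothetical placeholder), a single velocity-Verlet step looks like this:

    /* One velocity-Verlet step. `forces` is a hypothetical callback that
     * fills f[] with the force on each degree of freedom given x[]. */
    void md_step(int n, double dt, const double *m, double *x, double *v,
                 double *f, void (*forces)(int, const double *, double *))
    {
        for (int i = 0; i < n; ++i) {
            v[i] += 0.5 * dt * f[i] / m[i];   /* half kick with old forces */
            x[i] += dt * v[i];                /* drift                     */
        }
        forces(n, x, f);                      /* forces at new positions   */
        for (int i = 0; i < n; ++i)
            v[i] += 0.5 * dt * f[i] / m[i];   /* second half kick          */
    }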

ADCIRC: Code description
- ADCIRC: Coastal Circulation and Storm Surge Model
- Solves time-dependent, free-surface circulation and transport problems in 2 and 3 dimensions
- Uses the finite element method in space, which permits highly flexible, unstructured grids
- Typical ADCIRC applications have included:
  - Modeling tides and wind-driven circulation
  - Analysis of hurricane storm surge and flooding
  - Dredging feasibility and material disposal studies
  - Larval transport studies
  - Near-shore marine operations

"BASE" ALEGRA: Code description
- ALE (Arbitrary Lagrangian-Eulerian) code: provides flexibility, accuracy, and reduced numerical dissipation relative to a pure Eulerian code; modern remeshing technology allows for robust mesh smoothing and control
- Hydrodynamics and solid dynamics
- Models large distortions and strong shock propagation in multiple materials
- Finite element code; descendant of PRONTO, and uses some CTH Eulerian technology
- Energy deposition and explosive burn models
- Geometry: 2D/3D Cartesian, 2D cylindrical
- Material models in ALEGRA: equations of state, elastic-plastic models, fracture models
[Figure: pressure and temperature during formation of a jet from a shaped charge]

“BASE” ALEGRA: Code description

ALEGRA_MHD: Code description
- All hydrodynamics/solid dynamics modules of "base" ALEGRA, plus magnetic field and thermal transport effects
- Lorentz forces, Joule heating, thermal transport, and simple models for radiating excess energy
- 2D and 3D versions
- 2D modeling with the magnetic flux density vector components in or out of the plane, with the corresponding current density out of or in the plane, respectively
- 3D uses a magnetic diffusion solution based on edge and face elements, which maintains the discrete divergence-free property of the flux during the magnetic solve and the constrained-transport remap stage
- Lumped-element coupled circuit equations
- Magnetic and thermal conduction
- Advanced models for thermal and electrical conductivity
- Emission model radiates excess energy when the medium is optically thin, while accounting for reabsorption