August 12, 2004 UCRL-PRES
Outline
- Motivation
- About the Applications
- Statistics Gathered
- Inferences
- Future Work
Motivation
- Information for application developers
  - Information on the expense of basic MPI functions (recode?); a measurement sketch follows this list
  - Set expectations
- Many tradeoffs available in MPI design
  - Memory allocation decisions
  - Protocol cutoff point decisions
  - Where is additional code complexity worth it?
- Information on MPI usage is scarce
- New tools (e.g. mpiP) make profiling reasonable
  - Easy to incorporate (no source code changes)
  - Easy to interpret
  - Unobtrusive observation (little performance impact)
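As a concrete illustration of "the expense of basic MPI functions", here is a minimal microbenchmark sketch in C that times one such call (MPI_Allreduce on a single double). The iteration count and the choice of operation are illustrative assumptions, not values taken from this study.

```c
/* Minimal sketch: time one basic MPI call (MPI_Allreduce on a single
 * double) and report the average cost per call. The repetition count
 * is an arbitrary illustrative choice. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const int iters = 1000;            /* arbitrary repetition count */
    double in = 1.0, out = 0.0;
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);       /* start all ranks together */
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++)
        MPI_Allreduce(&in, &out, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    double elapsed = MPI_Wtime() - t0;

    if (rank == 0)
        printf("MPI_Allreduce of 1 double: %.2f us per call\n",
               1e6 * elapsed / iters);

    MPI_Finalize();
    return 0;
}
```

Sweeping the message size in such a loop is also one way to locate a library's protocol cutoff point (eager vs. rendezvous), one of the design tradeoffs listed above.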
About the applications…
- Amtran: discrete coordinate neutron transport
- Ares: 3-D simulation of instabilities in massive-star supernova envelopes
- Ardra: neutron transport/radiation diffusion code exploring new numerical algorithms and methods for the solution of the Boltzmann transport equation (e.g. nuclear imaging)
- Geodyne: Eulerian adaptive mesh refinement (e.g. comet-Earth impacts)
- IRS: solves the radiation transport equation by the flux-limiting diffusion approximation using an implicit matrix solution
- Mdcask: molecular dynamics code for the study of radiation damage in metals
- Linpack/HPL: solves a random dense linear system
- Miranda: hydrodynamics code simulating instability growth
- Smg: a parallel semicoarsening multigrid solver for the linear systems arising from finite difference, finite volume, or finite element discretizations
- Spheral: provides a steerable parallel environment for performing coupled hydrodynamical and gravitational numerical simulations
- Sweep3d: solves a 1-group neutron transport problem
- Umt2k: photon transport code for unstructured meshes
Percent of Time in MPI
- Overall for the sampled applications: 60% MPI, 40% remaining application
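The MPI/application split above is the kind of figure mpiP reports. As a rough illustration of how such a fraction can be measured, here is a minimal sketch using the standard PMPI profiling interface (the same interposition mechanism mpiP builds on). Only MPI_Send is wrapped for brevity; the wrapper is an illustrative assumption, not mpiP's actual implementation.

```c
/* Sketch: measure the fraction of wall-clock time spent inside MPI by
 * interposing on MPI calls through the standard PMPI profiling
 * interface. Compile into a small library and link it ahead of the MPI
 * library; no application source changes are needed. Only MPI_Send is
 * wrapped here for brevity. */
#include <mpi.h>
#include <stdio.h>

static double t_init = 0.0;   /* wall-clock time at end of MPI_Init   */
static double t_mpi  = 0.0;   /* seconds accumulated inside MPI calls */

int MPI_Init(int *argc, char ***argv)
{
    int rc = PMPI_Init(argc, argv);
    t_init = MPI_Wtime();
    return rc;
}

int MPI_Send(const void *buf, int count, MPI_Datatype type,
             int dest, int tag, MPI_Comm comm)
{
    double t0 = MPI_Wtime();
    int rc = PMPI_Send(buf, count, type, dest, tag, comm);
    t_mpi += MPI_Wtime() - t0;
    return rc;
}

int MPI_Finalize(void)
{
    double t_app = MPI_Wtime() - t_init;   /* time since MPI_Init */
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0 && t_app > 0.0)
        printf("MPI time: %.1f%% of application time\n",
               100.0 * t_mpi / t_app);
    return PMPI_Finalize();
}
```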
Top MPI Point-to-Point Calls
Top MPI Collective Calls
Comparing Collective and Point-to-Point
Average Number of Calls for Most Common MPI Functions ("Large" Runs)
Communication Patterns: most dominant message size
Communication Patterns (continued)
Frequency of Callsites by MPI Function
Scalability
Observations Summary
- General
  - People seem to scale codes to ~60% MPI/communication
  - Isend/Irecv/Wait are many times more prevalent than Sendrecv and blocking send/recv (see the sketch after this list)
  - Time spent in collectives is predominantly divided among barrier, allreduce, broadcast, gather, and alltoall
  - The most common message size is typically between 1 KB and 1 MB
- Surprises
  - Waitany is the most prevalent call
  - Almost all point-to-point messages are the same size within a run
  - Often, message size decreases with large runs
  - Some codes are driven by alltoall performance
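To make the first two observations concrete, here is a minimal sketch of the non-blocking Isend/Irecv/Waitany exchange pattern that dominates the measured point-to-point traffic. The neighbor count, message size, and the `exchange` function itself are illustrative assumptions, not taken from any of the profiled applications.

```c
/* Sketch: non-blocking neighbor exchange. Post all receives, then all
 * sends, then service completions in arrival order with MPI_Waitany;
 * a loop of this form is one way Waitany can end up as the most
 * frequently called MPI function in a run. */
#include <mpi.h>

#define NNBR   4        /* illustrative number of neighbors        */
#define NWORDS 4096     /* illustrative message length, in doubles */

void exchange(MPI_Comm comm, const int nbr[NNBR],
              double sendbuf[NNBR][NWORDS], double recvbuf[NNBR][NWORDS])
{
    MPI_Request rreq[NNBR], sreq[NNBR];

    /* post all non-blocking receives first */
    for (int i = 0; i < NNBR; i++)
        MPI_Irecv(recvbuf[i], NWORDS, MPI_DOUBLE, nbr[i], 0, comm, &rreq[i]);

    /* then the matching non-blocking sends */
    for (int i = 0; i < NNBR; i++)
        MPI_Isend(sendbuf[i], NWORDS, MPI_DOUBLE, nbr[i], 0, comm, &sreq[i]);

    /* process each incoming message as soon as it completes */
    for (int done = 0; done < NNBR; done++) {
        int idx;
        MPI_Waitany(NNBR, rreq, &idx, MPI_STATUS_IGNORE);
        /* ... consume recvbuf[idx] here ... */
    }

    /* make sure all sends have completed before reusing the buffers */
    MPI_Waitall(NNBR, sreq, MPI_STATUSES_IGNORE);
}
```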
Future Work & Concluding Remarks
- Further understanding of the applications is needed
  - Results for other test configurations
  - When can applications make better use of collectives?
  - MPI-IO usage information is needed
  - Classified applications
- Acknowledgements
  - mpiP is due to Jeffrey Vetter and Chris Chambreau
  - This work was performed under the auspices of the U.S. Department of Energy by the University of California, Lawrence Livermore National Laboratory under contract No. W-7405-Eng-48.