TOPS SciDAC Overview
David Keyes, project lead
Dept. of Mathematics & Statistics, Old Dominion University


Who we are…
… the PETSc and TAO people
… the hypre and PVODE people
… the SuperLU and PARPACK people
… as well as the builders of other widely used packages

Plus some university collaborators. Our DOE lab collaborations predate SciDAC by many years.

You may know our “Templates” … but what we are doing now goes “in between” and far beyond!

Scope for TOPS
- Design and implementation of “solvers”
  - Time integrators
  - Nonlinear solvers
  - Optimizers
  - Linear solvers
  - Eigensolvers
- Software integration
- Performance optimization
[Diagram: dependences among the solver components (optimizer, time integrator, nonlinear solver, sensitivity analyzer, eigensolver, linear solver); arrows indicate which components build on which.]
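To make the dependency picture concrete, here is a minimal sketch in C of how the solver layers stack. The type names are hypothetical, invented purely for illustration; they are not a TOPS, PETSc, or hypre API. The point is only that each higher-level capability is built on the nonlinear and linear solvers beneath it.

    /* Hypothetical types for illustration only; not a TOPS API. */
    typedef struct { int max_it; double rtol; } LinearSolver;          /* Krylov method + preconditioner */
    typedef struct { LinearSolver *ls; }        Eigensolver;           /* e.g., shift-invert solves      */
    typedef struct { LinearSolver *ls; }        NonlinearSolver;       /* Newton-step linear solves      */
    typedef struct { NonlinearSolver *ns; }     TimeIntegrator;        /* implicit time steps            */
    typedef struct { NonlinearSolver *ns; }     Optimizer;             /* optimality-system solves       */
    typedef struct { TimeIntegrator *ti; }      SensitivityAnalyzer;   /* forward/adjoint sweeps         */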

Motivation for TOPS
- Not just algorithms, but vertically integrated software suites
- Portable, scalable, extensible, tunable implementations
- Motivated by representative apps, intended for many others
- Starring hypre and PETSc, among other existing packages
- Driven by three SciDAC application groups:
  - LBNL-led “21st Century Accelerator” designs
  - ORNL-led core-collapse supernova simulations
  - PPPL-led magnetic fusion energy simulations
- Coordinated with other ISIC SciDAC groups
- Many DOE mission-critical systems are modeled by PDEs
- Finite-dimensional models for infinite-dimensional PDEs must be large for accuracy
  - Qualitative insight is not enough (Hamming notwithstanding)
  - Simulations must resolve policy controversies, in some cases
- Algorithms are as important as hardware in supporting simulation
  - Easily demonstrated for PDEs in the period 1945–2000
  - Continuous problems provide an exploitable hierarchy of approximation models, creating hope for “optimal” algorithms
- Software lags both hardware and algorithms

Salient Application Properties
- Multirate: requiring fully or semi-implicit-in-time solvers
- Multiscale: requiring finest mesh spacing much smaller than the domain diameter
- Multicomponent: requiring physics-informed preconditioners, transfer operators, and smoothers
- Not linear, not self-adjoint, not structured
[Figures: PEP-II cavity model, c/o the Advanced Computing for 21st Century Accelerator Science & Technology SciDAC group; FUN3D slotted-wing model, c/o Anderson et al.]

Keyword: “Optimal”
- Convergence rate nearly independent of discretization parameters
  - Multilevel schemes for linear and nonlinear problems (sketched below)
  - Newton-like schemes for quadratic convergence of nonlinear problems
- Convergence rate as independent as possible of physical parameters
  - Continuation schemes
  - Asymptotics-induced, operator-split preconditioning
- The solver is a key part, but not the only part, of the simulation that needs to be scalable
[Figures: parallel multigrid on a steel/rubber composite, c/o M. Adams, Berkeley-Sandia; schematic of time to solution vs. problem size (increasing with the number of processors), contrasting scalable and unscalable algorithms.]
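The multilevel bullet above is the heart of optimality: each level is visited a fixed number of times per cycle, so total work stays proportional to the number of unknowns. Below is a minimal, generic V-cycle skeleton in C. It is illustrative only, not TOPS or hypre code; the per-level operations (smoother, residual, transfers, coarse solve) are assumed to be supplied by the discretization or library.

    /* Generic multigrid V-cycle skeleton (illustrative only; the per-level
       operations are assumed to be provided by the application or library). */
    typedef struct Level {
        struct Level *coarser;                           /* NULL on the coarsest level          */
        int           n_coarse;                          /* unknowns on the next coarser level  */
        void (*smooth)(struct Level *, double *u, const double *f);
        void (*residual)(struct Level *, const double *u, const double *f, double *r);
        void (*restrict_to_coarse)(struct Level *, const double *fine, double *coarse);
        void (*prolong_and_add)(struct Level *, const double *coarse, double *fine);
        void (*coarse_solve)(struct Level *, double *u, const double *f);
        double *r, *fc, *uc;                             /* level work vectors                  */
    } Level;

    void vcycle(Level *lev, double *u, const double *f)
    {
        if (!lev->coarser) {                             /* coarsest grid: solve directly */
            lev->coarse_solve(lev, u, f);
            return;
        }
        lev->smooth(lev, u, f);                          /* pre-smoothing                  */
        lev->residual(lev, u, f, lev->r);                /* r = f - A u                    */
        lev->restrict_to_coarse(lev, lev->r, lev->fc);   /* coarse right-hand side         */
        for (int i = 0; i < lev->n_coarse; i++) lev->uc[i] = 0.0;  /* zero initial guess   */
        vcycle(lev->coarser, lev->uc, lev->fc);          /* coarse-grid correction, recursively */
        lev->prolong_and_add(lev, lev->uc, u);           /* u += P * correction            */
        lev->smooth(lev, u, f);                          /* post-smoothing                 */
    }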

It’s 2002; do you know what your solver is up to?
- Have you updated your solver in the past five years?
- Is your solver running at 1-10% of machine peak?
- Do you spend more time in your solver than in your physics?
- Is your discretization or model fidelity limited by the solver?
- Is your time stepping limited by stability?
- Are you running loops around your analysis code?
- Do you care how sensitive to parameters your results are?
If the answer to any of these questions is “yes”, please tell us at the poster session!

What we believe
- Many of us came to work on solvers through interests in applications
- What we believe about …
  - applications
  - users
  - solvers
  - legacy codes
  - software
  … will impact how comfortable you are collaborating with us
- So please give us your comments on the next five slides!

What we believe about apps
- Solution of a system of PDEs is rarely a goal in itself
  - PDEs are solved to derive various outputs from specified inputs
  - The actual goal is characterization of a response surface, or a design or control strategy
  - Together with analysis, sensitivities and stability are often desired
  => Software tools for PDE solution should also support related follow-on desires
- No general-purpose PDE solver can anticipate all needs
  - This is why we have national laboratories today, not just numerical libraries for PDEs
  - A PDE solver improves with user interaction
  - The pace of algorithmic development is very rapid
  => Extensibility is important

What we believe about users
- Solvers are used by people of varying numerical backgrounds
  - Some expect MATLAB-like defaults
  - Others want to control everything, e.g., varying the type of smoother and the number of smoothing sweeps on each level of a multigrid algorithm (see the runtime-options sketch below)
  => Multilayered software design is important
- Users’ demand for resolution is virtually insatiable
  - Relieving resolution requirements with modeling (e.g., turbulence closures, homogenization) only defers the demand for resolution to the next level
  - Validating such models requires high resolution
  => Processor scalability and algorithmic scalability (optimality) are critical
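The runtime-options sketch referred to above assumes a PETSc-style interface; exact call signatures and option names have varied across releases. The application hard-wires nothing about the solver, so a casual user gets the defaults while an expert can select a multigrid preconditioner and tune the smoother on every level from the command line, without recompiling.

    /* Fragment only; PETSc-style usage inside the application's solve routine.
       Signatures and option names vary by release.  Mat A and Vec b, x are
       assembled elsewhere by the application.
       Casual use: run with no options and accept the defaults.
       Expert use, e.g.:
         -ksp_type cg -pc_type mg -pc_mg_levels 4
         -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 2
         -mg_levels_pc_type sor -mg_coarse_pc_type lu -ksp_monitor              */
    KSP ksp;
    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A);
    KSPSetFromOptions(ksp);        /* all algorithmic choices picked up at run time */
    KSPSolve(ksp, b, x);
    KSPDestroy(&ksp);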

What we believe about legacy code
- Porting to a scalable framework does not mean starting from scratch
  - High-value meshing and physics routines in their original languages can be substantially preserved
  - Partitioning, reordering, and mapping onto distributed data structures (which we may provide) add code but little runtime
  => Distributions should include code samples exemplifying “separation of concerns”
- Legacy solvers may be limiting resolution, accuracy, and generality of the modeling overall
  - Replacing the solver may “solve” several other issues
  - However, pieces of the legacy solver may have value as part of a preconditioner
  => Solver toolkits should include “shells” for callbacks to high-value legacy routines (see the sketch below)
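One concrete form such a “shell” takes is a user-defined preconditioner whose apply operation calls back into the legacy code. The sketch below assumes a PETSc-style shell preconditioner and a real-valued build; the exact callback signatures have changed across releases, so treat it as a pattern rather than a recipe.

    /* Sketch: wrapping a high-value legacy routine as a preconditioner apply.
       Assumes a PETSc-style PCSHELL and a real-valued build; signatures vary. */
    extern void legacy_precondition_(double *in, double *out, int *n);  /* e.g., an old F77 kernel */

    static PetscErrorCode ApplyLegacyPC(PC pc, Vec x, Vec y)
    {
        const PetscScalar *xa;
        PetscScalar       *ya;
        PetscInt           n;
        VecGetLocalSize(x, &n);
        VecGetArrayRead(x, &xa);
        VecGetArray(y, &ya);
        int nn = (int)n;
        legacy_precondition_((double *)xa, ya, &nn);    /* call into the legacy code */
        VecRestoreArrayRead(x, &xa);
        VecRestoreArray(y, &ya);
        return 0;
    }

    /* ... in the setup code, on the Krylov solver's preconditioner object: */
    /*   PCSetType(pc, PCSHELL);                                            */
    /*   PCShellSetApply(pc, ApplyLegacyPC);                                */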

What we believe about solvers
- Solvers are employed as part of a larger code
  - The solver library is not the only library to be linked
  - Solvers may be called in multiple, nested places
  - Solvers typically make callbacks
  - Solvers should be swappable (see the sketch below)
  => Solver threads must not interfere with other component threads, including other active instances of themselves
- Solvers are employed in many ways over the life cycle of an application code
  - During development and upgrading, robustness (of the solver) and verbose diagnostics are important
  - During production, solvers are streamlined for performance
  => Tunability is important
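A minimal sketch of the callback and swappability points, assuming a PETSc-style nonlinear solver (signatures vary by release; AppCtx is a hypothetical application context): the application registers its residual evaluation as a callback, and because nothing else is hard-wired, the Newton variant, the inner Krylov method, and the preconditioner can all be swapped at run time without touching the physics.

    /* Illustrative fragment only; PETSc-style API, signatures vary by release. */
    static PetscErrorCode FormResidual(SNES snes, Vec u, Vec r, void *ctx)
    {
        AppCtx *app = (AppCtx *)ctx;   /* hypothetical application context */
        /* ... the physics lives here: evaluate r = F(u) for the discretized PDE ... */
        return 0;
    }

    /* ... inside the application's setup/solve routine: */
    SNES snes;
    SNESCreate(PETSC_COMM_WORLD, &snes);
    SNESSetFunction(snes, r, FormResidual, &app);   /* the solver calls back into the app */
    SNESSetFromOptions(snes);                       /* swap line search, Krylov method,   */
    SNESSolve(snes, NULL, u);                       /* preconditioner, ... at run time,   */
                                                    /* e.g. -ksp_type gmres -pc_type asm  */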

What we believe about software
- A continuous operator may appear in a discrete code in many different instances
  - Optimal algorithms tend to be hierarchical and nested iterative
  - Processor-scalable algorithms tend to be domain-decomposed and concurrently iterative
  - The majority of progress toward the desired highly resolved, high-fidelity result occurs through cost-effective, low-resolution, low-fidelity, parallel-efficient stages
  => Operator abstractions and recurrence are important (see the sketch below)
- Hardware changes many times over the life cycle of a software package
  - Processors, memory, and networks evolve annually
  - Machines are replaced every 3-5 years at major DOE centers
  - Codes persist for decades
  => Portability is critical
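One way the operator-abstraction point shows up in practice is a “shell” matrix: the solver sees only an object that knows how to apply itself to a vector, so the same abstraction can recur at every level of a nested, hierarchical algorithm. The fragment below assumes a PETSc-style interface; signatures vary by release.

    /* Sketch: an operator defined only by its action y = A*x (matrix-free).
       PETSc-style shell matrix; exact signatures vary by release. */
    static PetscErrorCode ApplyOperator(Mat A, Vec x, Vec y)
    {
        void *ctx;
        MatShellGetContext(A, &ctx);
        /* ... apply the discrete operator to x, store the result in y ... */
        return 0;
    }

    /* ... in setup code, with local size n, global size N, and context app: */
    /*   Mat A;                                                              */
    /*   MatCreateShell(PETSC_COMM_WORLD, n, n, N, N, &app, &A);             */
    /*   MatShellSetOperation(A, MATOP_MULT, (void (*)(void))ApplyOperator); */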

Why is TOPS needed?
What is wrong?
- Many widely used libraries are “behind the times” algorithmically
- Logically innermost (solver) kernels are often the most computationally complex; they should be designed from the inside out by experts and present the right “handles” to users
- Today’s components do not “talk to” each other very well
- Mixing and matching procedures too often requires mapping data between different storage structures, which taxes memory and memory bandwidth
What exists already?
- Adaptive time integrators for stiff systems: variable-step BDF methods
- Nonlinear implicit solvers: Newton-like methods, FAS multilevel methods
- Optimizers (with constraints): quasi-Newton RSQP methods
- Linear solvers: subspace projection methods (multigrid, Schwarz, classical smoothers), Krylov methods (CG, GMRES), sparse direct methods
- Eigensolvers: matrix reduction techniques followed by tridiagonal eigensolvers; Arnoldi solvers
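For readers who have not looked inside a Krylov method recently, here is a minimal, unpreconditioned conjugate gradient iteration in plain C for a dense SPD system. It is purely illustrative; production libraries such as PETSc and hypre operate on distributed sparse operators and add preconditioning, but the structure, one matrix-vector product and a handful of vector operations per iteration, is the same.

    #include <math.h>
    #include <stdlib.h>

    /* Illustrative only: unpreconditioned CG for a dense, row-major SPD matrix A. */
    static double dot(const double *u, const double *v, int n)
    {
        double s = 0.0;
        for (int i = 0; i < n; i++) s += u[i] * v[i];
        return s;
    }

    void cg_dense(const double *A, const double *b, double *x, int n,
                  double tol, int maxit)
    {
        double *r = malloc(n * sizeof *r), *p = malloc(n * sizeof *p), *Ap = malloc(n * sizeof *Ap);
        for (int i = 0; i < n; i++) { x[i] = 0.0; r[i] = b[i]; p[i] = r[i]; }
        double rr = dot(r, r, n);
        for (int k = 0; k < maxit && sqrt(rr) > tol; k++) {
            for (int i = 0; i < n; i++) {                      /* one mat-vec per iteration */
                Ap[i] = 0.0;
                for (int j = 0; j < n; j++) Ap[i] += A[i * n + j] * p[j];
            }
            double alpha = rr / dot(p, Ap, n);                 /* step length               */
            for (int i = 0; i < n; i++) { x[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
            double rr_new = dot(r, r, n);
            double beta = rr_new / rr;                         /* search-direction update   */
            for (int i = 0; i < n; i++) p[i] = r[i] + beta * p[i];
            rr = rr_new;
        }
        free(r); free(p); free(Ap);
    }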

Nonlinear Solvers
What’s ready? KINSOL (LLNL) and PETSc (ANL)
- Preconditioned Newton-Krylov (NK) methods with MPI-based objects
- Asymptotically nearly quadratically convergent and mesh-independent
- Matrix-free implementations (FD and AD access to Jacobian elements); see the sketch below
- Thousands of direct downloads (PETSc) and an active worldwide “friendly user” base
- Interfaced with hypre preconditioners (KINSOL)
- Sensitivity analysis extensions (KINSOL)
- 1999 Bell Prize for unstructured implicit CFD computation at Tflop/s on a legacy F77 NASA code
What’s next?
- Semi-automated continuation schemes (e.g., pseudo-transient continuation)
- Additive Schwarz Preconditioned Inexact Newton (ASPIN)
- Full Approximation Scheme (FAS) multigrid
- Polyalgorithmic combinations of ASPIN, FAS, and NK-MG, together with new linear solvers/preconditioners
- Automated Jacobian calculations with parallel colorings
- New grid transfer and nonlinear coarse-grid operators
- Guidance on trade-offs between cheap and expensive residual function calls
- Further forward and adjoint sensitivities
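The “matrix-free” entry rests on a simple observation: a Krylov method needs only Jacobian-vector products, and those can be approximated by differencing the residual, J(u)v ≈ [F(u + hv) - F(u)]/h, so no Jacobian matrix is ever formed or stored. Below is a plain-C sketch of that building block; it is illustrative only, not KINSOL or PETSc code (those libraries also choose the differencing parameter h carefully and offer AD alternatives).

    #include <stdlib.h>

    /* Illustrative only: finite-difference, matrix-free Jacobian-vector product,
       Jv ~ (F(u + h*v) - F(u)) / h, the kernel of matrix-free Newton-Krylov.
       F(in, out, n) evaluates the nonlinear residual; Fu already holds F(u). */
    void fd_jacobian_vector(void (*F)(const double *, double *, int),
                            const double *u, const double *Fu, const double *v,
                            double *Jv, int n, double h)
    {
        double *up = malloc(n * sizeof *up);   /* perturbed state u + h*v         */
        double *Fp = malloc(n * sizeof *Fp);   /* residual at the perturbed state */
        for (int i = 0; i < n; i++) up[i] = u[i] + h * v[i];
        F(up, Fp, n);
        for (int i = 0; i < n; i++) Jv[i] = (Fp[i] - Fu[i]) / h;
        free(up);
        free(Fp);
    }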

Optimizers
What’s ready? TAO (ANL) and VELTISTO (CMU)
- Bound-constrained and equality-constrained optimization (see the toy sketch below)
- Achieve the optimum in a number of PDE solves independent of the number of control variables
- TAO released 2000, VELTISTO 2001; both built on top of PETSc
- Applied to problems with thousands of controls and millions of constraints on hundreds of processors
- Used for design, control, and parameter identification
- Used in nonlinear elasticity, Navier-Stokes, and acoustics
- State-of-the-art Lagrange-Newton-Krylov-Schur algorithmics
What’s next?
- Extensions to inequality constraints (beyond simple bound constraints)
- Extensions to time-dependent PDEs, especially for inverse problems
- Multilevel globalization strategies
- Toleration strategies for approximate Jacobians and Hessians
- “Hardening” of promising control strategies to deal with negative curvature of the Hessian
- Pipelining of PDE solutions into sensitivity analysis
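To fix ideas about what “bound-constrained” means, here is a toy projected-gradient loop in plain C. It illustrates only the constraint handling; it is emphatically not the TAO or VELTISTO algorithm, which uses Lagrange-Newton-Krylov-Schur and quasi-Newton machinery to reach the optimum in a number of PDE solves independent of the number of controls.

    #include <stdlib.h>

    /* Toy illustration of bound-constrained minimization, min f(x) s.t. lo <= x <= hi,
       by projected gradient descent with a fixed step; not a TAO/VELTISTO method. */
    void projected_gradient(void (*grad)(const double *x, double *g, int n),
                            double *x, const double *lo, const double *hi,
                            int n, double step, int maxit)
    {
        double *g = malloc(n * sizeof *g);
        for (int k = 0; k < maxit; k++) {
            grad(x, g, n);                       /* gradient of the objective   */
            for (int i = 0; i < n; i++) {
                x[i] -= step * g[i];             /* descent step ...            */
                if (x[i] < lo[i]) x[i] = lo[i];  /* ... projected back onto the */
                if (x[i] > hi[i]) x[i] = hi[i];  /*     bound constraints       */
            }
        }
        free(g);
    }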

Linear Solvers
What’s ready? PETSc (ANL), hypre (LLNL), SuperLU (UCB), Oblio (ODU)
- Krylov, multilevel, and sparse direct methods
- Numerous preconditioners, incl. BNN, SPAI, PILU/PICC
- Mesh-independent convergence for an ever-expanding set of problems
- hypre used in several ASCI codes and milestones to date
- SuperLU in ScaLAPACK
- State-of-the-art algebraic multigrid (hypre) and supernodal (SuperLU) efforts
- Algorithmic replacements alone yield up to two orders of magnitude in DOE apps, before parallelization
What’s next?
- Hooks for physics-based, operator-split preconditionings
- AMGe, focusing on incorporation of neighbor information and strong cross-variable coupling
- Spectral AMGe for problems with geometrically oscillatory but algebraically smooth components
- FOSLS-AMGe for saddle-point problems
- Hierarchical basis ILU
- Incomplete factorization adaptations of SuperLU
- Convergence-enhancing orderings for ILU
- Stability-enhancing orderings for sparse direct methods for indefinite problems
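A sketch of two of these packages already working together: PETSc’s Krylov solvers can use hypre’s BoomerAMG as the preconditioner through a thin interface. The fragment below assumes a recent PETSc built with hypre support; call names and signatures have varied across releases, so take it as a pattern, not a recipe.

    /* Fragment only: a PETSc Krylov solver preconditioned by hypre's BoomerAMG.
       Assumes PETSc configured with hypre; exact signatures vary by release.
       Mat A and Vec b, x come from the application. */
    KSP ksp;
    PC  pc;
    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A);
    KSPSetType(ksp, KSPGMRES);           /* Krylov method                 */
    KSPGetPC(ksp, &pc);
    PCSetType(pc, PCHYPRE);              /* hand preconditioning to hypre */
    PCHYPRESetType(pc, "boomeramg");     /* algebraic multigrid           */
    KSPSetFromOptions(ksp);              /* allow run-time overrides      */
    KSPSolve(ksp, b, x);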

Eigensolvers
What’s ready?
- LAPACK and ScaLAPACK symmetric eigensolvers (UCB, UTenn, LBNL)
- PARPACK for sparse and nonsymmetric problems
- Reductions to symmetric tridiagonal or Hessenberg form, followed by the new “Holy Grail” algorithm
- Holy Grail is optimal (!): O(kn) work for k n-dimensional eigenvectors
What’s next?
- Direct and iterative linear solution methods for shift-invert Lanczos, for selected eigenpairs of large symmetric eigenproblems (see below)
- Jacobi-Davidson projection methods for selected eigenpairs
- Multilevel methods for eigenproblems arising from PDE applications
- Hybrid multilevel/Jacobi-Davidson methods
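For reference, the shift-invert transformation named above is standard material, not TOPS-specific: Lanczos is run on the shifted, inverted operator rather than on A itself, so eigenvalues near the shift become extremal and converge quickly, at the cost of one linear solve per iteration, which is exactly where the direct and iterative solver work enters.

    (A - \sigma I)^{-1} x = \theta x
        \quad\Longleftrightarrow\quad
    A x = \lambda x, \qquad \theta = \frac{1}{\lambda - \sigma}

Each Lanczos step applies the transformed operator by solving (A - sigma I) w = v with a sparse direct factorization or an iterative method.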

Goals/Success Metrics
TOPS users:
- understand the range of algorithmic options and their tradeoffs (e.g., memory versus time)
- can try all reasonable options easily, without recoding or extensive recompilation
- know how their solvers are performing
- spend more time in their physics than in their solvers
- are intelligently driving solver research, and publishing joint papers with TOPS researchers
- can simulate truly new physics, as solver limits are steadily pushed back

Expectations TOPS has of Users
- Tell us if you think our assumptions above are incorrect or incomplete
- Be willing to experiment with novel algorithmic choices; optimality is rarely achieved beyond model problems without interplay between physics and algorithmics!
- Adopt flexible, extensible programming styles in which algorithms and data structures are not hardwired
- Be willing to let us play with the real code you care about, but be willing as well to abstract out relevant compact tests
- Be willing to make concrete requests, to understand that requests must be prioritized, and to work with us in addressing the high-priority requests
- If possible, profile, profile, profile before seeking help

TOPS may be for you! For more information...