High-Performance Numerical Components and Common Interfaces. Lois Curfman McInnes, Mathematics and Computer Science Division, Argonne National Laboratory.


High-Performance Numerical Components and Common Interfaces
Lois Curfman McInnes
Mathematics and Computer Science Division, Argonne National Laboratory
June 7-8, 2005
Joint ORNL/Indiana University Workshop on Computational Frameworks for Fusion, Oak Ridge, TN

L.C. McInnes, IU/ORNL Workshop on Computational Frameworks for Fusion, 6/8/

Outline
Motivation
–Complex, multiphysics, multiscale nonlinear applications
–Distributed, multilevel memory hierarchies
Parallel Components for PDEs and Optimization
–Two-phased approach
Some Challenges
–Domain-specific common interfaces
–Dynamic adaptivity
Concluding Remarks

Motivating Scientific Applications
Application areas: astrophysics, molecular structures, aerodynamics, fusion
[Diagram: physics applications surrounded by supporting capabilities — discretization, algebraic solvers, parallel I/O, meshes, data redistribution, optimization, derivative computation, diagnostics, steering, visualization, adaptive solution]

Challenges
Community perspective
–Life-cycle costs of applications are increasing
–Applications require the combined use of software developed by different groups
–Difficult to leverage expert knowledge and advances in subfields
–Difficult to obtain portable performance
Individual scientist perspective
–Too much energy focused on too many details
–Little time to think about modeling, physics, mathematics
–Fear of bad performance without custom code
–Even when code reuse is possible, it is far too difficult
Our perspective: how to manage complexity?
–Numerical software tools that work together
–New algorithms (e.g., interactive/dynamic techniques, algorithm composition)
–Multimodel, multiphysics simulations

What are the algorithmic needs of our target applications?
Large-scale, nonlinear PDE-based simulations
–Multirate, multiscale, multicomponent
–Rich variety of time scales and strong nonlinearities
–Can run on 10,000+ processors, where systems have increasingly deep memory hierarchies
–Require 100,000s of nonlinear solves (time integration)
Need
–Fully or semi-implicit solvers
–Multilevel algorithms
–Support for adaptivity
–Support for user-defined customizations (e.g., physics-informed preconditioners, transfer operators, and smoothers)

Software for Nonlinear PDEs and Related Optimization Problems
Goal: for problems arising from PDEs, support the general solution of F(u) = 0
User provides:
–Code to evaluate F(u)
–Code to evaluate the Jacobian of F(u) (optional), or use a sparse finite difference (FD) approximation, or use automatic differentiation (AD)
–AD support via collaboration with P. Hovland and B. Norris
Goal: solve related optimization problems, generally min f(u) subject to u_l < u < u_u and c_l < c(u) < c_u
Simple example: unconstrained minimization, min f(u)
User provides:
–Code to evaluate f(u)
–Code to evaluate the gradient and Hessian of f(u) (optional), or use sparse FD or AD
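The division of labor on this slide — the user supplies F(u) and, optionally, its Jacobian, with a finite-difference fallback otherwise — can be sketched in a few lines of pure Python. This is an illustrative toy (dense matrices, direct inner solve), not the PETSc API; all names here (`newton_solve`, `fd_jacobian`, `solve_linear`) are hypothetical.

```python
def fd_jacobian(F, u, eps=1e-7):
    """Dense finite-difference approximation of F'(u), column by column."""
    n = len(u)
    F0 = F(u)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(u)
        up[j] += eps
        Fp = F(up)
        for i in range(n):
            J[i][j] = (Fp[i] - F0[i]) / eps
    return J

def solve_linear(J, b):
    """Gaussian elimination with partial pivoting (small dense systems only)."""
    n = len(b)
    A = [row[:] + [b[i]] for i, row in enumerate(J)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(A[r][k]))
        A[k], A[p] = A[p], A[k]
        for r in range(k + 1, n):
            m = A[r][k] / A[k][k]
            for c in range(k, n + 1):
                A[r][c] -= m * A[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (A[k][n] - sum(A[k][c] * x[c] for c in range(k + 1, n))) / A[k][k]
    return x

def newton_solve(F, u0, jacobian=None, tol=1e-10, max_it=50):
    """Solve F(u) = 0; the user provides F and, optionally, its Jacobian."""
    u = list(u0)
    for _ in range(max_it):
        Fu = F(u)
        if max(abs(v) for v in Fu) < tol:
            break
        J = jacobian(u) if jacobian else fd_jacobian(F, u)
        du = solve_linear(J, [-v for v in Fu])
        u = [ui + di for ui, di in zip(u, du)]
    return u
```

For example, `newton_solve(lambda u: [u[0]**2 - 2.0, u[1] - u[0]], [1.0, 1.0])` converges to (√2, √2) without the caller ever supplying a Jacobian.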

Interface Issues
How to hide complexity, yet allow customization and access to a range of algorithmic options?
How to achieve portable performance?
How to interface among external tools?
–Including multiple libraries developed by different groups that provide similar functionality (e.g., linear algebra software)
Criteria for evaluating success
–Efficiency (both per-node performance and scalability)
–Usability
–Extensibility

Two-Phased Approach to Numerical Components
Phase 1: develop parallel, object-oriented numerical libraries
–OO techniques are effective for development with a moderate-sized team
–Provide a foundation of algorithms, data structures, and implementations
Phase 2: develop CCA-compliant component interfaces
–Leverage existing code
–Provide a more effective means for managing interactions among code developed by different groups

Parallel Numerical Libraries: PETSc and TAO
PETSc: Portable, Extensible Toolkit for Scientific Computation
–S. Balay, K. Buschelman, B. Gropp, D. Kaushik, M. Knepley, L. C. McInnes, B. Smith, H. Zhang
–Targets the parallel solution of large-scale PDE-based applications
–Begun in 1991; over 13,000 downloads since 1995
TAO: Toolkit for Advanced Optimization
–S. Benson, L. C. McInnes, J. Moré, J. Sarich
–Targets the solution of large-scale optimization problems
–Begun in 1997 as part of the DOE ACTS Toolkit
Approach
–Freely available and supported research toolkits: hyperlinked docs, many examples, usable from Fortran 77/90, C, and C++
–Portable to any parallel system supporting MPI, including
 Tightly coupled systems: Cray T3E, SGI Origin, IBM SP, HP 9000, Sun Enterprise
 Loosely coupled systems, e.g., networks of workstations: Compaq, HP, IBM, SGI, Sun, PCs running Linux or Windows
–Distributed-memory "shared nothing" approach; encapsulate message-passing details in objects such as matrices, vectors, and index sets

PETSc Numerical Libraries
Nonlinear Solvers: Newton-based methods (line search, trust region), others
Time Steppers: Euler, Backward Euler, pseudo-timestepping, others
Krylov Subspace Methods: GMRES, CG, CGS, Bi-CG-STAB, TFQMR, Richardson, Chebyshev, others
Preconditioners: additive Schwarz, block Jacobi, ILU, ICC, LU (sequential only), others
Matrices: compressed sparse row (AIJ), blocked compressed sparse row (BAIJ), block diagonal (BDIAG), dense, matrix-free, others
Vectors and Distributed Arrays
Index Sets: indices, block indices, stride, others

TAO Solvers
Unconstrained Minimization: Limited-Memory Variable Metric (LMVM) method; Newton-based methods (line search, trust region); conjugate gradient methods (Fletcher-Reeves, Polak-Ribière, Polak-Ribière-Plus); others
Bound-Constrained Optimization: Newton trust region, GPCG, interior point, LMVM, KT, others
Nonlinear Least Squares: Levenberg-Marquardt, Gauss-Newton, LMVM with bound constraints, Levenberg-Marquardt with bound constraints, others
Complementarity: semi-smooth methods, others
TAO interfaces to external libraries for parallel vectors, matrices, and linear solvers:
–PETSc (initial interface)
–Global Arrays (PNNL – thanks to M. Kumar and J. Nieplocha)
–Etc.

Newton-Krylov Methods
Newton:
–Solve: F'(u_{l-1}) δu_l = –F(u_{l-1})
–Update: u_l = u_{l-1} + δu_l
Krylov: projection methods for solving linear systems, Ax = b, using the Krylov subspace K_j = span(r_0, A r_0, A² r_0, …, A^{j-1} r_0)
–Require A only in the form of matrix-vector products
–Popular methods: CG, GMRES, TFQMR, BiCGStab, etc.
Preconditioning: in practice, typically needed
–Transform Ax = b into an equivalent form, B⁻¹Ax = B⁻¹b or (AB⁻¹)(Bx) = b, where the inverse action of B approximates that of A, but at a smaller cost
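The key property above — a Krylov method needs A only through matrix-vector products — can be made concrete with a minimal sketch: conjugate gradients applied to a 1D Laplacian that exists only as a function, never as a stored matrix. This is a pedagogical toy in pure Python (CG assumes a symmetric positive definite operator); it is not PETSc code, and the names `laplacian_1d` and `cg` are illustrative.

```python
def laplacian_1d(v):
    """Apply the 1D Laplacian with Dirichlet boundaries as a matrix-free
    operator: the 'matrix' A is available only as v -> A v."""
    n = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def cg(apply_A, b, tol=1e-12, max_it=1000):
    """Conjugate gradient iteration for SPD A, given only the matvec."""
    x = [0.0] * len(b)
    r = list(b)                       # residual r = b - A*x with x = 0
    p = list(r)                       # initial search direction
    rs = sum(ri * ri for ri in r)
    for _ in range(max_it):
        Ap = apply_A(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x
```

In a Jacobian-free Newton-Krylov method the same idea goes one step further: the matvec F'(u)v itself is approximated by a finite difference of F, so the Jacobian is never formed.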

An Application Perspective: Solve F(u) = 0
Application code: application driver, application initialization, function evaluation, Jacobian evaluation (or a finite-difference approximation, or automatic differentiation code), post-processing
PETSc code: nonlinear solvers, built on
–Krylov solvers: GMRES, TFQMR, BCGS, CGS, BCG, others
–Preconditioners: ASM, ILU, B-Jacobi, SSOR, multigrid, others
–Matrices: AIJ, B-AIJ, diagonal, dense, matrix-free, others
–Vectors: sequential, parallel, others

Aerodynamics Example
Developers: D. Kaushik (Argonne), D. Keyes (Columbia Univ.), W. Gropp, B. Smith (Argonne), W. K. Anderson (NASA); based on FUN3d, a legacy NASA code developed by Anderson
Background: the Euler equations describe the conservation of mass, momentum, and energy in an inviscid fluid; here we study the flow of air over an ONERA M6 wing
Model: fully implicit, steady-state, 3D incompressible Euler model using a tetrahedral mesh
Solvers: Newton-Krylov-Schwarz method with pseudo-transient continuation
Won the Gordon Bell prize at SC99

Performance
ONERA M6 wing test case: tetrahedral grid of 2.8 million vertices (about 11 million unknowns) on up to 3072 ASCI Red nodes (each with dual 333 MHz Pentium Pro processors)

Scientific Applications
PETSc and TAO solvers have been used successfully in many scientific applications
–Aerodynamics, acoustics, biomechanics, chemistry, fusion, electromagnetics, micromagnetics, materials science, multiphase flow, nanotechnology, reactive transport, etc.
–Scale to the low 1000s of processors
PETSc usage in fusion applications includes:
–The SEL macroscopic modeling code, A. H. Glasser and X. Z. Tang, Computer Physics Communications, 164.
–A finite element Poisson solver for gyrokinetic particle simulations, Y. Nishimura, Z. Lin, J. Lewandowski, and S. Ethier, submitted to J. Comput. Phys.
–Global gyrokinetic particle-in-cell simulations with trapped electrons, J. L. V. Lewandowski, Y. Nishimura, W. W. Lee, Z. Lin, and S. Ethier, Sherwood Fusion Theory Conference, Missoula, MT.
–Electromagnetic gyrokinetic simulation with a fluid-kinetic hybrid electron model, Y. Nishimura, Z. Lin, L. Chen, J. Lewandowski, S. Ethier, and W. Wang, Sherwood Fusion Theory Conference, Missoula, MT.
–Numerical studies of a steady-state axisymmetric co-axial helicity injection plasma, X. Z. Tang and A. H. Boozer, Physics of Plasmas, 11.
–Inclusion of electromagnetic effects into gyrokinetic particle simulations, Y. Nishimura, Z. Lin, L. Chen, and W. Wang, American Physical Society 45th Annual Meeting, Division of Plasma Physics, Albuquerque, New Mexico, October 2003.
–Resistive magnetohydrodynamics simulation of fusion plasmas, X. Z. Tang, G. Y. Fu, S. C. Jardin, L. L. Lowe, W. Park, and H. R. Strauss, Princeton Plasma Physics Laboratory, PPPL-3532; presented at the 10th SIAM Conference on Parallel Processing for Scientific Computing, Portsmouth, Virginia, March 12-14, 2001.

Two-Phased Approach to Numerical Components
Phase 1: develop parallel, object-oriented numerical libraries
–OO techniques are effective for development with a moderate-sized team
–Provide a foundation of algorithms, data structures, and implementations
Phase 2: develop CCA-compliant component interfaces
–Leverage existing code
–Provide a more effective means for managing interactions among code developed by different groups

CCA Overview
CCA evolved from DOE2000 as a grass-roots effort
–Recognized the benefit of component-based software engineering (CBSE) for high-performance scientific computing
–Bridle the burgeoning hardware/software complexity!
CBSE needed to be specially crafted for HPC
–Supporting parallelism and performance requirements
–Supporting scientific languages (e.g., Fortran 90) and legacy codes
With SciDAC support, the CCA has:
–Demonstrated the effectiveness of the component-oriented approach
–Advanced scientific research across several key domains
–Grown a diverse community of users

CCA Compliance in TAO
Paradigm shift: both TAO and the application become components
Each is required to provide a default constructor and to implement the CCA Component interface
–Contains one method, "setServices", used to register ports
All interactions between components use ports
–The application provides a "go" port and uses a "taoSolver" port
–TAO provides a "taoSolver" port
There is no "main" routine
Ref: J. Sarich, A Programmer's Guide for Providing CCA Component Interfaces to the Toolkit for Advanced Optimization, Argonne technical report ANL/MCS-TM-279, December 2004.
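The port-based wiring described on this slide can be sketched schematically: each component implements one registration method, all interaction goes through named ports, and the framework (not a "main" routine) assembles and drives the application. This is a heavily simplified Python analogy, not the actual CCA/Babel API; the classes, the `set_services` spelling, and the toy gradient-descent solver are all illustrative assumptions.

```python
class Services:
    """Minimal stand-in for a CCA framework's Services object: a registry
    connecting 'provides' ports (implementations) to 'uses' ports (requests)."""
    def __init__(self):
        self._ports = {}
    def add_provides_port(self, name, port):
        self._ports[name] = port
    def get_port(self, name):
        return self._ports[name]

class TaoSolverPort:
    """Illustrative 'taoSolver' port: minimize f via toy fixed-step gradient
    descent (the real TAO offers LMVM, Newton methods, etc.)."""
    def solve(self, grad, u0, steps=200, alpha=0.1):
        u = list(u0)
        for _ in range(steps):
            g = grad(u)
            u = [ui - alpha * gi for ui, gi in zip(u, g)]
        return u

class TaoComponent:
    def set_services(self, services):
        # Register the port this component provides.
        services.add_provides_port("taoSolver", TaoSolverPort())

class ApplicationComponent:
    """Provides a 'go' port; uses the 'taoSolver' port. No main() here."""
    def set_services(self, services):
        self.services = services
        services.add_provides_port("go", self)
    def go(self):
        solver = self.services.get_port("taoSolver")
        # Minimize f(u) = (u0 - 3)^2 + (u1 + 1)^2 via its gradient.
        return solver.solve(lambda u: [2 * (u[0] - 3), 2 * (u[1] + 1)],
                            [0.0, 0.0])

# The framework wires the components together and invokes the "go" port:
framework = Services()
TaoComponent().set_services(framework)
ApplicationComponent().set_services(framework)
result = framework.get_port("go").go()
```

The point of the sketch is structural: the application never imports or names the solver implementation, so a different optimizer could be registered under "taoSolver" without touching the application component.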

Negligible CCA Overhead in TAO Optimization Components
No CCA overhead within components
Small overhead between components
Small overhead for language interoperability
No CCA overhead on parallel computing
Be aware of costs and design with them in mind
–Small costs, easily amortized
[Charts: maximum 0.2% overhead for CCA vs. native C++ code for parallel molecular dynamics on up to 170 CPUs; aggregate time for the linear solver component in an unconstrained minimization problem]
Ref: B. Norris et al., Parallel Components for PDEs and Optimization: Some Issues and Experiences, Parallel Computing, 28 (12), 2002.

CCA Application: Optimization in Quantum Chemistry
Collaboration of ANL, PNNL, and SNL researchers, working with their own packages, integrated using CCA:
–TAO (ANL): Limited-Memory Variable Metric (LMVM) algorithm
–PETSc (ANL) and Global Arrays (PNNL) for linear algebra
–MPQC (SNL) and NWChem (PNNL) chemistry packages
Significant improvements over the "traditional" BFGS optimizers built into the packages
Interoperability at the linear algebra and chemistry package levels
[Chart: number of energy and gradient evaluations for glycine, isoprene, phosphoserine, acetylsalicylic acid, and cholesterol under NWChem/native, MPQC/native, NWChem/TAO, and MPQC/TAO — a comparison of native BFGS and TAO LMVM optimization algorithms used with the MPQC and NWChem computational chemistry packages. Function evaluations in this domain are very expensive, so reducing optimization steps is very important.]
Ref: J. P. Kenny et al., Component-Based Integration of Chemistry and Optimization Software, J. Computational Chemistry, 24(14).

Outline
Motivation
–Complex, multiphysics, multiscale nonlinear applications
–Distributed, multilevel memory hierarchies
Parallel Components for PDEs and Optimization
–Two-phased approach
Some Challenges
–Domain-specific common interfaces
–Dynamic adaptivity
Concluding Remarks

The Importance of Interfaces
The CCA Forum participants do not pretend to be experts in all phases of computation, but rather just to be developing a standard way to exchange component capabilities.
Medium of exchange: interfaces
–Components interact only through explicitly defined interfaces
–The quality (generality, completeness) of interfaces varies widely
–Higher-quality interfaces...
 Require general agreement among groups or communities
 Are more easily used in front of multiple implementations
 Are more easily (re)used by many applications
 Facilitate experimentation with new algorithms, implementations, etc.

A challenge to the community: Common interfaces are central
Need experts in various areas to define sets of domain-specific common interfaces (this means you!)
–Scientific application domains, meshes, discretization, (non)linear solvers, optimization, data analysis, visualization, etc.
Caveat: developing common interfaces is difficult!
–Technical challenges: trade-offs between broad functionality and maintaining good performance
–Social challenges: agreement among diverse individuals with different priorities; few academic rewards for software
The CCA is actively developing, or promoting the development of, common domain-specific interfaces, including
–Distributed array descriptor
–Molecular geometry optimization
–MxN parallel data redistribution
–Adaptive mesh refinement (w/ APDEC SciDAC Center)
–Mesh and discretization interfaces (lead: TSTT SciDAC Center)
–Linear and nonlinear solver interfaces (lead: TOPS SciDAC Center)

Interface Definition Efforts
Collaborations with math SciDAC centers focus on unified interfaces to numerous existing and new libraries
–Users can swap libraries without having to change their code
–New libraries are more easily integrated into applications
Some info on TOPS and TSTT interfaces:
–Parallel PDE-Based Simulations Using the Common Component Architecture, Lois Curfman McInnes et al., Argonne National Laboratory preprint ANL/MCS-P, 2004; to appear in Are Magnus Bruaset, Petter Bjorstad, and Aslak Tveito, editors, Numerical Solution of PDEs on Parallel Computers, Springer-Verlag.
[Diagram: an application calling linear solver libraries (SuperLU, PETSc, Hypre, Sparskit, others) directly, vs. the same application reaching all of the solvers through common TOPS solver interfaces]
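The "swap libraries without changing application code" idea in the diagram can be sketched with a tiny common solver interface and two interchangeable backends. The class names and the two toy iterative "libraries" (Jacobi and Gauss-Seidel stand-ins, which require diagonal dominance) are hypothetical illustrations in the spirit of the TOPS interfaces, not their actual definitions.

```python
class LinearSolver:
    """Hypothetical common solver interface: applications code against this
    abstraction, not against any one library."""
    def solve(self, A, b):
        raise NotImplementedError

class JacobiSolver(LinearSolver):
    """Stand-in 'library' #1: Jacobi iteration."""
    def solve(self, A, b, iters=200):
        n = len(b)
        x = [0.0] * n
        for _ in range(iters):
            x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i))
                 / A[i][i] for i in range(n)]
        return x

class GaussSeidelSolver(LinearSolver):
    """Stand-in 'library' #2: Gauss-Seidel iteration (updates in place)."""
    def solve(self, A, b, iters=200):
        n = len(b)
        x = [0.0] * n
        for _ in range(iters):
            for i in range(n):
                x[i] = (b[i] - sum(A[i][j] * x[j]
                                   for j in range(n) if j != i)) / A[i][i]
        return x

def application(solver):
    """Application code is identical regardless of which backend is used."""
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    return solver.solve(A, b)
```

Swapping `application(JacobiSolver())` for `application(GaussSeidelSolver())` changes the library, not the application — which is exactly the property the common-interface effort is after.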

TOPS' Linear Solver Interface
Goals
–Simplicity: a small number of distinct concepts
–Generality
–Programming-language independence (via SIDL)
–High performance
–Extensibility: infrastructure for defining/implementing 'conceptual' solver interfaces
Progenitors include
–FEI (finite element interface) / C++, developed at SNL
–ESI (equation solver interface) / C++, a multi-lab effort
–Various TOPS software packages
Current drafts available via
–BitKeeper repository: bk://tops.bkbits.net:8080/tops-solver-interface
Who: B. Smith (ANL), R. Falgout (LLNL), various TOPS investigators

Object Model Concepts
Solver (is an) Operator
Vector – represents field data
–(has one or more) View – provides access to the data
–(has a) Layout – how data is laid out across processes

Conceptual (Linear System) Interfaces
A View allows users to access values in the "language of the application"
Handles any data communication transparently
Same idea as the conceptual interfaces within hypre (LLNL)
[Diagram: data layouts (structured, composite, block-struc, unstruc, CSR) mapped through conceptual linear-system interfaces to linear solvers (GMG, FAC, Hybrid, AMGe, ILU, ...)]
c/o Rob Falgout, LLNL

Views differ primarily in the way they "set" and "get" data
Classical linear algebra view – indices are scalars that represent locations in R^n:
 array getValues(array indices);
Structured mesh view – indices are 3D triples that describe "boxes of data" (think 3D Fortran arrays):
 array getValues(ilower, iupper);
Views / Layouts
–classical linear algebra access
–single structured mesh
–finite element interface
–semi-structured meshes (structured mesh "parts" with additional arbitrary connections)
–etc.
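The two `getValues` signatures above can be illustrated with a toy: the same flat storage exposed through a classical (scalar-index) view and a structured-mesh (box-of-triples) view. This is a minimal Python sketch of the idea, not the SIDL interface; the class names and Fortran-style (i fastest) ordering are assumptions for illustration.

```python
class ClassicalView:
    """Classical linear algebra view: scalar indices into flat storage."""
    def __init__(self, data):
        self.data = data
    def get_values(self, indices):
        return [self.data[i] for i in indices]

class StructuredMeshView:
    """Structured-mesh view over the same flat storage: (i, j, k) triples
    describing a 'box of data', Fortran-array style (i varies fastest)."""
    def __init__(self, data, shape):
        self.data = data
        self.nx, self.ny, self.nz = shape
    def _flat(self, i, j, k):
        return i + self.nx * (j + self.ny * k)
    def get_values(self, ilower, iupper):
        """Return every value in the inclusive box [ilower, iupper]."""
        return [self.data[self._flat(i, j, k)]
                for k in range(ilower[2], iupper[2] + 1)
                for j in range(ilower[1], iupper[1] + 1)
                for i in range(ilower[0], iupper[0] + 1)]
```

Both views hand back the same numbers; only the "language" the application uses to name them differs, which is the point of the conceptual interfaces.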

What's Coming in TOPS Solvers
Greater interface standardization
Greater solver interoperability
Better integration upwards with meshing and discretization systems
Better integration downwards with performance monitoring and engineering systems
Better algorithms!
c/o David Keyes, TOPS PI

Anticipated Impact of Common TOPS Solver Interfaces on Fusion
Easier for fusion scientists to explore different algorithms and solvers developed by different groups, such as these MHD/TOPS collaborations (for which interfaces were done manually for new algorithms callable across an Ax=b interface)
–M3D: replacement of the additive Schwarz (ASM) preconditioner with algebraic multigrid (AMG) in hypre (LLNL)
 Achieved a mesh-independent convergence rate
 4-5× improvement in execution time
–NIMROD: replacement of a diagonally scaled Krylov solver with a supernodal parallel sparse direct solver in SuperLU (LBNL)
 2D tests run 100× faster; 3D production runs are 4-5× faster

Motivating Scientific Applications
Application areas: astrophysics, molecular structures, aerodynamics, fusion
[Diagram: physics applications surrounded by supporting capabilities — discretization, algebraic solvers, parallel I/O, meshes, data redistribution, optimization, derivative computation, diagnostics, steering, visualization, adaptive solution]

Dynamic Adaptivity
Next-generation applications will need to adapt to changing computational conditions
–Changes in physics/models/algorithms in long-running simulations; different resource needs and performance characteristics
CBSE enables component substitution at runtime, based on changing application characteristics and available resources
[Diagram: an application driver and Newton-Krylov solver call a linear solver proxy (solve F'(u) δu = -F(u)); guided by component monitoring, application monitoring, and analysis/optimization/replacement/substitution decision services, the proxy selects among a substitution set of linear solver components A, B, and C]
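The proxy in the diagram can be sketched as a thin layer that forwards each solve to the currently selected component, records how long it took, and swaps components when performance degrades. This is an illustrative Python stand-in: the `SolverProxy` name, the time-budget substitution policy, and the callable "components" are all assumptions, not CCA or CQoS machinery.

```python
import time

class SolverProxy:
    """Hypothetical linear-solver proxy: the application always calls
    solve() on the proxy, which delegates to one component from a
    substitution set and may switch components between calls."""
    def __init__(self, substitution_set):
        self.candidates = list(substitution_set)   # solver components A, B, C...
        self.current = 0                           # index of the active one
        self.history = []                          # (component index, seconds)

    def solve(self, problem):
        start = time.perf_counter()
        result = self.candidates[self.current](problem)
        elapsed = time.perf_counter() - start
        self.history.append((self.current, elapsed))
        self._maybe_substitute(elapsed)
        return result

    def _maybe_substitute(self, elapsed, threshold=0.5):
        # Stand-in for the decision services: move to the next component
        # in the substitution set if the last solve blew its time budget.
        if elapsed > threshold and self.current + 1 < len(self.candidates):
            self.current += 1
```

Because the application only ever holds the proxy, substitution is invisible to it: `proxy.solve(p)` keeps working while the component behind it changes.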

Computational Quality of Service (CQoS)
Approach: automatic selection and configuration of components to suit a particular computational purpose; involves research in
–Metadata and metrics
–Performance evaluation and monitoring
–Automated application assembly and reconfiguration
–Adaptive polyalgorithmic solvers
[Diagram: application components reach provider components A, B, and C through an abstract interface and component proxy; an adaptive strategy component draws on runtime monitoring, a runtime database, and a historical database within the component framework]
Ref: P. Hovland, K. Keahey, L. McInnes, B. Norris, L. Diachin, and P. Raghavan, A Quality-of-Service Architecture for High-Performance Numerical Components, Proceedings of the Workshop on QoS in Component-Based Software Engineering, Toulouse, France, June 20.
Ref: B. Norris, J. Ray, R. Armstrong, L. McInnes, D. Bernholdt, W. Elwasif, A. Malony, and S. Shende, Computational Quality of Service for Scientific Components, Proceedings of the International Symposium on Component-Based Software Engineering (CBSE7), Edinburgh, Scotland.
Ref: B. Norris and I. Veljkovic, Performance Monitoring and Analysis Components in Adaptive PDE-Based Simulations, Argonne preprint ANL/MCS-P, January.

Concluding Remarks
High-performance numerical components can be effectively built using a two-phased process
–Object-oriented numerical libraries developed by different teams at different institutions
–Lightweight component layers
Domain-specific common interfaces that are defined by various computational science communities are critical for
–Achieving the promise of 'plug-and-play' component interoperability
–Addressing issues in dynamic component interactions (reconfiguring and recomposing)
These capabilities are becoming increasingly important for multiphysics, multiscale computational science applications (e.g., fusion simulations)