Using the PETSc Parallel Software Library in Developing MPP Software for Calculating Exact Cumulative Reaction Probabilities for Large Systems (M. Minkoff and A. Wagner, ANL MCS/CHM)

Presentation transcript:

1 Using the PETSc Parallel Software Library in Developing MPP Software for Calculating Exact Cumulative Reaction Probabilities for Large Systems
M. Minkoff and A. Wagner, ANL (MCS/CHM)
- Introduction
- Problem Description
- MPP Software Tools
- Computations
- Future Direction

2 Parallelization of Cumulative Reaction Probabilities (CRP) with PETSc
M. Minkoff (ANL/MCS) and A. Wagner (ANL/CHM)
Calculation of gas-phase rate constants
- Develop a highly scalable and efficient parallel algorithm for calculating the Cumulative Reaction Probability, P.
- Use parallel subroutine libraries for newer generations of parallel machines to develop parallel CRP simulation software.
- Implement the Miller and Manthe (1994) method for the time-independent solution of P in parallel.
  - P is determined from an eigenvalue problem whose operator involves two Green's functions. The eigenvalues are obtained with a Lanczos method (a minimal sketch follows below); the Green's functions are evaluated via a GMRES iteration with a diagonal preconditioner.
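As a concrete illustration of the Lanczos step, the following is a minimal sketch written against the current PETSc C API, not the authors' original code: the function name LanczosSweep and its interface are illustrative assumptions. For a Hermitian operator A it accumulates the tridiagonal coefficients alpha[] and beta[], whose extreme eigenvalues approximate the extreme eigenvalues of A, which is the quantity the CRP calculation extracts from the probability operator.

/* Hedged sketch (not the authors' implementation): one Lanczos sweep using
 * PETSc primitives.  For a Hermitian operator A it fills alpha[0..m-1] and
 * beta[0..m-1]; the eigenvalues of the resulting small tridiagonal matrix
 * approximate the extreme eigenvalues of A.  No reorthogonalization, for brevity. */
#include <petscmat.h>

PetscErrorCode LanczosSweep(Mat A, Vec v0, PetscInt m,
                            PetscReal alpha[], PetscReal beta[])
{
  Vec         v, vprev, w;
  PetscScalar dot;
  PetscReal   nrm;
  PetscInt    j;

  PetscFunctionBeginUser;
  PetscCall(VecDuplicate(v0, &v));
  PetscCall(VecDuplicate(v0, &vprev));
  PetscCall(VecDuplicate(v0, &w));
  PetscCall(VecCopy(v0, v));
  PetscCall(VecNorm(v, NORM_2, &nrm));
  PetscCall(VecScale(v, 1.0 / nrm));                       /* v_1 = v0 / ||v0|| */
  PetscCall(VecZeroEntries(vprev));

  for (j = 0; j < m; j++) {
    PetscCall(MatMult(A, v, w));                           /* w = A v_j */
    PetscCall(VecDot(w, v, &dot));                         /* alpha_j = v_j^H A v_j */
    alpha[j] = PetscRealPart(dot);
    PetscCall(VecAXPY(w, -alpha[j], v));                   /* w -= alpha_j v_j */
    if (j > 0) PetscCall(VecAXPY(w, -beta[j - 1], vprev)); /* w -= beta_{j-1} v_{j-1} */
    PetscCall(VecNorm(w, NORM_2, &nrm));
    beta[j] = nrm;
    if (nrm == 0.0) break;                                 /* invariant subspace found */
    PetscCall(VecCopy(v, vprev));
    PetscCall(VecCopy(w, v));
    PetscCall(VecScale(v, 1.0 / nrm));                     /* v_{j+1} = w / beta_j */
  }

  PetscCall(VecDestroy(&v));
  PetscCall(VecDestroy(&vprev));
  PetscCall(VecDestroy(&w));
  PetscFunctionReturn(PETSC_SUCCESS);
}

In practice one would add reorthogonalization and diagonalize the small tridiagonal matrix (e.g. with LAPACK) to read off the approximate eigenvalues; the point of the sweep above is that all of the heavy work reduces to MatMult and Vec operations, which PETSc executes in parallel.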

3 Benefits of Using PETSc
- Sparsity: PETSc allows arbitrarily sparse data structures.
- GMRES: PETSc offers GMRES as an option for linear solves.
- Present tests involve problems in dimensions 3 to 6. Testing is underway on an SGI Power Challenge (ANL) and an SGI/Cray T3E (NERSC). Portability is provided by MPI and PETSc, so higher-dimensional systems are planned for future work.

4 Chemical Dynamics Theory
[Figure: molecular system with 3 angles and 3 stretches, giving 6 degrees of freedom]

5 Chemical Dynamics Theory
- How fast do chemicals react?
- The rate constant "k" determines it:
  - d[X]/dt = -k1[X][Y] + k2[Z][Y]
  - many rates are at work in a device
  - rates express interactions in the chemistry
  - individual rates are measurable and calculable
  - rates depend on T and P

6 Chemical Dynamics Theory
- Rates are related to the Cumulative Reaction Probability (CRP), N(E):
  - N(E) = Tr[P(E)]
  - N(E) = 4 Tr[ e_r^(1/2) G(E)† e_p G(E) e_r^(1/2) ], where e_r and e_p are absorbing potentials in the reactant and product regions

7 Chemical Dynamics Theory
- The Probability Operator and Its Inverse
  - Using the probability operator, a few of the largest eigenvalues are calculated via iterative methods. Each iteration involves the action of two Green's functions (a matrix-free sketch of this operator action follows below).
  - Using the inverse probability operator, a direct calculation at each iteration yields the few smallest eigenvalues. Each iteration requires applying the Green's function to a vector, which leads to solving linear systems involving the Hamiltonian.
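The sketch below is an assumption about how such an operator action could be organized with PETSc, not the authors' actual code: the names GreenCtx, GreenMult, and CreateGreenOperator are hypothetical, and the KSP passed in is assumed to be preconfigured for the shifted Hamiltonian (E + i*e - H) introduced on the next slide. It wraps the Green's-function action as a matrix-free PETSc Mat, so an iterative eigensolver can apply it without ever forming G(E) explicitly.

/* Hedged sketch: wrap y = G(E) x = (E + i*e - H)^{-1} x as a matrix-free
 * PETSc Mat.  Each application of the "matrix" is one linear solve with a
 * KSP that has already been set up for the shifted Hamiltonian. */
#include <petscksp.h>

typedef struct {
  KSP ksp;   /* solver preconfigured for (E + i*e - H) */
} GreenCtx;

/* MatMult callback: applying G(E) to x is a single linear solve. */
static PetscErrorCode GreenMult(Mat G, Vec x, Vec y)
{
  GreenCtx *ctx;

  PetscFunctionBeginUser;
  PetscCall(MatShellGetContext(G, &ctx));
  PetscCall(KSPSolve(ctx->ksp, x, y));
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* Create the shell matrix; nlocal and n are the local and global sizes.
 * (A full version would also set MATOP_DESTROY to free the context.) */
PetscErrorCode CreateGreenOperator(KSP ksp, PetscInt nlocal, PetscInt n, Mat *G)
{
  GreenCtx *ctx;

  PetscFunctionBeginUser;
  PetscCall(PetscNew(&ctx));
  ctx->ksp = ksp;
  PetscCall(MatCreateShell(PETSC_COMM_WORLD, nlocal, nlocal, n, n, ctx, G));
  PetscCall(MatShellSetOperation(*G, MATOP_MULT, (void (*)(void))GreenMult));
  PetscFunctionReturn(PETSC_SUCCESS);
}

An eigensolver (for example, the Lanczos sweep sketched earlier) can then simply call MatMult on this shell operator, so each outer iteration transparently triggers an inner linear solve.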

8 Chemical Dynamics Theory
- The Green's functions have the form
  G(E) = (E + i*e - H)^(-1),
  so at each iteration we need to solve two linear systems of the form
  (E + i*e - H) y = x,
  where x is known. These systems are solved via GMRES with preconditioning (initially diagonal scaling), as sketched below.
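To make the linear-solve step concrete, here is a small self-contained sketch in C using the current PETSc API, again not the authors' code: it assembles a placeholder one-dimensional tridiagonal stand-in for the Hamiltonian, forms the shifted operator (E + i*e - H), and solves it with GMRES and diagonal (Jacobi) preconditioning as described on this slide. The values of n, E, and eps are arbitrary, and a PETSc build with complex scalars is assumed.

/* Minimal PETSc sketch (not the authors' code): solve (E + i*eps - H) y = x
 * with GMRES and diagonal (Jacobi) preconditioning.  H is a placeholder 1-D
 * tridiagonal "Hamiltonian"; a real CRP code would assemble the
 * multidimensional Hamiltonian instead.  Requires a complex-scalar build. */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat         A;               /* the shifted operator (E + i*eps)I - H */
  Vec         x, y;            /* right-hand side and solution          */
  KSP         ksp;             /* Krylov solver context                 */
  PC          pc;              /* preconditioner context                */
  PetscInt    n = 100, i, Istart, Iend;
  PetscScalar E = 1.0, eps = 0.1, shift;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* Assemble A = (E + i*eps)I - H row by row (H = placeholder tridiagonal
   * with diagonal 2 and off-diagonals -1). */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatSetUp(A));
  PetscCall(MatGetOwnershipRange(A, &Istart, &Iend));
  shift = E + eps * PETSC_i;
  for (i = Istart; i < Iend; i++) {
    if (i > 0)     PetscCall(MatSetValue(A, i, i - 1, 1.0, INSERT_VALUES));
    if (i < n - 1) PetscCall(MatSetValue(A, i, i + 1, 1.0, INSERT_VALUES));
    PetscCall(MatSetValue(A, i, i, shift - 2.0, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  PetscCall(MatCreateVecs(A, &y, &x));
  PetscCall(VecSet(x, 1.0));                 /* known right-hand side x */

  /* GMRES with diagonal (Jacobi) scaling as the preconditioner. */
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetType(ksp, KSPGMRES));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCJACOBI));
  PetscCall(KSPSetFromOptions(ksp));         /* honors -ksp_type / -pc_type flags */
  PetscCall(KSPSolve(ksp, x, y));            /* y = (E + i*eps - H)^{-1} x */

  PetscCall(KSPDestroy(&ksp));
  PetscCall(MatDestroy(&A));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&y));
  PetscCall(PetscFinalize());
  return 0;
}

Because of KSPSetFromOptions, the same executable can switch Krylov methods or preconditioners at run time (for example -ksp_type bcgs or -pc_type bjacobi), which is one way the block-structured preconditionings mentioned at the end of the talk could be explored without code changes.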

9 PETSc: Portable, Extensible Toolkit for Scientific Computing
- Focus: data structures and routines for the scalable solution of PDE-based applications
- Object-oriented design using mathematical abstractions
- Freely available and supported research code, available via the PETSc web site
- Usable from C, C++, and Fortran 77/90 (with minor limitations in Fortran 77/90 due to their syntax)
- Users manual and hyperlinked manual pages for all routines
- Many tutorial-style examples
- Support from Satish Balay, William Gropp, Lois McInnes, and Barry Smith, MCS Division, Argonne National Laboratory

10 [Diagram: structure of PETSc. PDE application codes are built on PETSc's object-oriented components (matrices, vectors, and index sets; grid management; linear solvers, i.e. preconditioners plus Krylov methods; nonlinear solvers and unconstrained minimization; ODE integrators; visualization interface), which in turn rest on computation and communication kernels (MPI, MPI-IO, BLAS, LAPACK) and a profiling interface.]
Applications can interface to whatever abstraction level is most appropriate.

11 PETSc Numerical Components
- Matrices: Compressed Sparse Row (AIJ), Blocked Compressed Sparse Row (BAIJ), Block Diagonal (BDIAG), Dense, Other
- Index Sets: Indices, Block Indices, Stride, Other
- Vectors
- Nonlinear Solvers: Newton-based Methods (Line Search, Trust Region), Other
- Preconditioners: Additive Schwarz, Block Jacobi, ILU, ICC, LU (sequential only), Others
- Time Steppers: Euler, Backward Euler, Pseudo Time Stepping, Other
- Krylov Subspace Methods: GMRES, CG, CGS, Bi-CG-STAB, TFQMR, Richardson, Chebychev, Other

12 Sample Scalable Performance
- 3D incompressible Euler on a tetrahedral grid, up to 11 million unknowns (2.8M vertices), run on a Cray T3E
- Based on a legacy NASA code, FUN3d, developed by W. K. Anderson
- Fully implicit steady-state Newton-Krylov-Schwarz algorithm with pseudo-transient continuation
- Results courtesy of Dinesh Kaushik and David Keyes, Old Dominion University

13 Computations via MPI and PETSc

14 5D/T3E Results for Varying Eigenvalue and G-S Method

15 Parallel Speedup 5D/6D ANL/SGI and NERSC/T3E

16 Storage Required for Higher Dimensions

17 Results and Future Work
- Achieved parallelization with less effort
  - Suboptimal, but perhaps within 2x of optimal performance
- Testing for 6D and 7D is underway
  - MPP CPU and memory can provide the necessary resources
  - Many degrees of freedom can be approximated, so the maximum dimension needed is ~10
- Develop block-structured preconditioning methods