Using the PETSc Linear Solvers
Lois Curfman McInnes, in collaboration with Satish Balay, Bill Gropp, and Barry Smith
Mathematics and Computer Science Division, Argonne National Laboratory
Cactus Tutorial, September 30, 1999

PETSc: Portable, Extensible Toolkit for Scientific Computing
Focus: data structures and routines for the scalable solution of PDE-based applications
- Freely available and supported research code
- Usable in C, C++, and Fortran 77/90 (with minor limitations in Fortran 77/90 due to their syntax)
- Users manual in PostScript and HTML formats
- Hyperlinked manual pages for all routines
- Many tutorial-style examples
- Available via the PETSc web site; support via e-mail

PETSc structure (diagram): PDE application codes sit atop PETSc's object-oriented components, which include grid management; matrices, vectors, and index sets; linear solvers (preconditioners + Krylov methods); nonlinear solvers and unconstrained minimization; ODE integrators; and a visualization interface. These components rest on computation and communication kernels (MPI, MPI-IO, BLAS, LAPACK), with a profiling interface throughout.

PETSc Numerical Components (diagram):
- Matrices: Compressed Sparse Row (AIJ), Blocked Compressed Sparse Row (BAIJ), Block Diagonal (BDIAG), Dense, Other
- Index Sets: Indices, Block Indices, Stride, Other
- Vectors
- Krylov Subspace Methods: GMRES, CG, CGS, Bi-CG-STAB, TFQMR, Richardson, Chebychev, Other
- Preconditioners: Additive Schwarz, Block Jacobi, ILU, ICC, LU (sequential only), Others
- Nonlinear Solvers: Newton-based methods (line search, trust region), Other
- Time Steppers: Euler, Backward Euler, Pseudo Time Stepping, Other

Sample Scalable Performance
- 3D incompressible Euler; tetrahedral grid; up to 11 million unknowns
- Based on a legacy NASA code, FUN3d, developed by W. K. Anderson
- Fully implicit steady-state
- Newton-Krylov-Schwarz algorithm with pseudo-transient continuation
- Run on a 600 MHz T3E with a 2.8M-vertex grid
- Results courtesy of Dinesh Kaushik and David Keyes, Old Dominion University

Linear PDE Solution (diagram): the Cactus driver code handles PETSc application initialization, evaluation of A and b, and post-processing; PETSc code solves Ax = b through the SLES linear solvers, which combine Krylov methods (KSP) with preconditioners (PC).

Vectors
- Fundamental objects for storing field solutions, right-hand sides, etc.
- VecCreateMPI(...,Vec *) takes:
  - MPI_Comm - the processes that share the vector
  - number of elements local to this process
  - total number of elements
- Each process locally owns a subvector of contiguously numbered global indices (diagram: a vector partitioned across processes 0-4)
A creation sketch follows.
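A minimal creation sketch, assuming the PETSc 2.x calling sequences used throughout this tutorial (header names and some signatures differ in later releases, e.g. VecDestroy() now takes a Vec *); the vector length N is an arbitrary choice:

    #include "petsc.h"   /* older releases may also need "vec.h" */

    int main(int argc, char **argv)
    {
      Vec x;
      int ierr, N = 100;   /* global vector length (arbitrary) */

      ierr = PetscInitialize(&argc, &argv, (char *)0, (char *)0); CHKERRQ(ierr);
      /* PETSC_DECIDE lets PETSc choose each process's local size */
      ierr = VecCreateMPI(MPI_COMM_WORLD, PETSC_DECIDE, N, &x); CHKERRQ(ierr);
      /* ... fill and use the vector ... */
      ierr = VecDestroy(x); CHKERRQ(ierr);
      ierr = PetscFinalize(); CHKERRQ(ierr);
      return 0;
    }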

Sparse Matrices
- Fundamental objects for storing linear operators (e.g., Jacobians)
- MatCreateMPIAIJ(...,Mat *) takes:
  - MPI_Comm - the processes that share the matrix
  - number of local rows and columns
  - number of global rows and columns
  - optional storage preallocation information
- Each process locally owns a submatrix of contiguously numbered global rows (diagram: a matrix partitioned by rows across processes 0-4)
A creation sketch follows.
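A minimal creation sketch, assuming the PETSc 2.x calling sequence; the preallocation figures (5 nonzeros per row in the diagonal block, 2 in the off-diagonal block) are illustrative only:

    Mat A;
    int ierr, N = 100;   /* global matrix dimension (arbitrary) */

    /* PETSC_DECIDE lets PETSc split rows/columns; PETSC_NULL skips
       the optional per-row preallocation arrays */
    ierr = MatCreateMPIAIJ(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE,
                           N, N, 5, PETSC_NULL, 2, PETSC_NULL, &A); CHKERRQ(ierr);
    /* ... insert entries with MatSetValues() ... */
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);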

SLES Solver Context Variable
- Key to solver organization
- Contains the complete algorithmic state, including:
  - parameters (e.g., convergence tolerance)
  - functions that run the algorithm (e.g., convergence monitoring routine)
  - information about the current state (e.g., iteration number)
- Creating the SLES solver:
  - C/C++: ierr = SLESCreate(MPI_COMM_WORLD,&sles);
  - Fortran: call SLESCreate(MPI_COMM_WORLD,sles,ierr)
- Provides an identical user interface for all linear solvers (uniprocessor and parallel, real and complex numbers)

Basic Linear Solver Code (C/C++)

    SLES sles;    /* linear solver context */
    Mat  A;       /* matrix */
    Vec  x, b;    /* solution, RHS vectors */
    int  n, its;  /* problem dimension, number of iterations */

    MatCreate(MPI_COMM_WORLD,n,n,&A);
    /* then assemble matrix */
    VecCreate(MPI_COMM_WORLD,n,&x);
    VecDuplicate(x,&b);
    /* then assemble RHS vector */
    SLESCreate(MPI_COMM_WORLD,&sles);
    SLESSetOperators(sles,A,A,DIFFERENT_NONZERO_PATTERN);
    SLESSetFromOptions(sles);
    SLESSolve(sles,b,x,&its);

Basic Linear Solver Code (Fortran)

    SLES    sles
    Mat     A
    Vec     x, b
    integer n, its, ierr

    call MatCreate(MPI_COMM_WORLD,n,n,A,ierr)
    call VecCreate(MPI_COMM_WORLD,n,x,ierr)
    call VecDuplicate(x,b,ierr)
C   then assemble matrix and right-hand-side vector
    call SLESCreate(MPI_COMM_WORLD,sles,ierr)
    call SLESSetOperators(sles,A,A,DIFFERENT_NONZERO_PATTERN,ierr)
    call SLESSetFromOptions(sles,ierr)
    call SLESSolve(sles,b,x,its,ierr)

Customization Options
- Procedural interface: provides a great deal of control on a usage-by-usage basis and gives full flexibility inside an application:
  - SLESGetKSP(SLES sles,KSP *ksp)
  - KSPSetType(KSP ksp,KSPType type)
  - KSPSetTolerances(KSP ksp,double rtol,double atol,double dtol,int maxits)
- Command-line interface: applies the same rules to all queries via a database; enables complete control at runtime, with no extra coding:
  - -ksp_type [cg,gmres,bcgs,tfqmr,...]
  - -ksp_max_it <max_it>
  - -ksp_gmres_restart <restart>
A sketch of the procedural route follows.
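A minimal sketch of procedural customization, assuming an existing SLES context named sles as created in the preceding slides; the tolerance and iteration values are illustrative:

    KSP ksp;
    ierr = SLESGetKSP(sles, &ksp); CHKERRQ(ierr);
    ierr = KSPSetType(ksp, KSPGMRES); CHKERRQ(ierr);   /* select GMRES */
    /* rtol = 1.e-6; leave atol and dtol at their defaults; maxits = 200 */
    ierr = KSPSetTolerances(ksp, 1.e-6, PETSC_DEFAULT, PETSC_DEFAULT, 200); CHKERRQ(ierr);

The equivalent at runtime, with no extra coding: -ksp_type gmres -ksp_rtol 1.e-6 -ksp_max_it 200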

Recursion: Specifying Solvers for Schwarz Preconditioner Blocks
- -sub_pc_type lu
  - -mat_reordering [nd,1wd,rcm,qmd]
  - -mat_lu_fill <fill>
- -sub_pc_type ilu
  - -sub_pc_ilu_levels <levels>
- Can also use inner iterations, e.g.:
  - -sub_ksp_type gmres
  - -sub_ksp_rtol <rtol>
  - -sub_ksp_max_it <max_it>
An illustrative command line follows.
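An illustrative command line, assuming the parallel example program ex2 and an additive Schwarz preconditioner; the option values are arbitrary, and some option spellings differ in recent PETSc releases (e.g. -sub_pc_factor_levels):

    mpirun -np 4 ex2 -pc_type asm -sub_pc_type ilu -sub_pc_ilu_levels 1 \
        -sub_ksp_type gmres -sub_ksp_rtol 1.e-3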

Linear Solvers: Monitoring Convergence
- -ksp_monitor - prints the preconditioned residual norm
- -ksp_xmonitor - plots the preconditioned residual norm
- -ksp_truemonitor - prints the true residual norm || b-Ax ||
- -ksp_xtruemonitor - plots the true residual norm || b-Ax ||
- User-defined monitors, using callbacks (see the sketch below)
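A sketch of a user-defined monitor, assuming the KSPSetMonitor calling sequence of this tutorial's PETSc generation (recent releases use KSPMonitorSet instead):

    /* Called once per iteration with the iteration number and the
       preconditioned residual norm */
    int MyMonitor(KSP ksp, int it, double rnorm, void *ctx)
    {
      PetscPrintf(MPI_COMM_WORLD, "iteration %d: residual norm %g\n", it, rnorm);
      return 0;
    }

    /* Registration, after SLESGetKSP() has returned ksp: */
    ierr = KSPSetMonitor(ksp, MyMonitor, PETSC_NULL); CHKERRQ(ierr);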

SLES: Selected Preconditioner Options
[table of preconditioner options not preserved in the transcript]
And many more options...

SLES: Selected Krylov Method Options
[table of Krylov method options not preserved in the transcript]
And many more options...

SLES: Runtime Script Example
[script listing not preserved in the transcript; an illustrative sketch follows]
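An illustrative script, assuming the parallel example ex2; the solver choices and option values are arbitrary, and this is a sketch rather than the original slide's script:

    mpirun -np 4 ex2 -ksp_type bcgs -ksp_rtol 1.e-8 \
        -pc_type asm -sub_pc_type ilu \
        -ksp_monitor -log_summary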

Viewing SLES Runtime Options
[slide content not preserved in the transcript; see the note below]
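A minimal illustration, assuming the parallel example program ex2: running any PETSc program with the -help option prints the runtime options recognized by its currently selected solvers, e.g.

    mpirun -np 1 ex2 -ksp_type gmres -pc_type asm -help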

Providing Different Matrices to Define the Linear System and Preconditioner
- The Krylov method uses A for matrix-vector products
- The preconditioner is built using either:
  - A - the matrix that defines the linear system, or
  - P - a different matrix (cheaper to assemble)
- SLESSetOperators(SLES sles, Mat A, Mat P, MatStructure flag)
- Solve Ax = b, preconditioning via M_L^(-1) A M_R^(-1) (M_R x) = M_L^(-1) b, where M_L and M_R are the left and right preconditioners built from A or P
A usage sketch follows.
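A usage sketch, assuming A and P have already been assembled:

    /* Use A for matrix-vector products, but build the
       preconditioner from the cheaper matrix P */
    ierr = SLESSetOperators(sles, A, P, DIFFERENT_NONZERO_PATTERN); CHKERRQ(ierr);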

Matrix-Free Solvers
- Use a "shell" matrix data structure:
  - MatCreateShell(..., Mat *mfctx)
- Define the operations needed by the Krylov methods:
  - MatShellSetOperation(Mat mfctx, MatOperation MATOP_MULT, ...) with a user routine int UserMult(Mat,Vec,Vec)
- Names of matrix operations are defined in petsc/include/mat.h
- Some defaults are provided for nonlinear solver usage
A sketch follows.
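A minimal sketch, assuming a user context ctx and local/global dimensions m, n, M, N are already defined; the function-pointer cast varies between PETSc releases:

    /* User routine computing y = A*x without ever storing A */
    extern int UserMult(Mat mat, Vec x, Vec y);

    Mat A;
    ierr = MatCreateShell(MPI_COMM_WORLD, m, n, M, N, (void *)&ctx, &A); CHKERRQ(ierr);
    ierr = MatShellSetOperation(A, MATOP_MULT, (void (*)(void))UserMult); CHKERRQ(ierr);
    /* A can now be passed to SLESSetOperators() like any stored matrix */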

User-Defined Customizations
- Restricting the available solvers: customize PCRegisterAll(), KSPRegisterAll()
- Adding user-defined preconditioners via the PCShell preconditioner type (see the sketch below)
- Adding preconditioners and Krylov methods in library style: method registration via PCRegister(), KSPRegister()
- Heavily commented example implementations:
  - Jacobi preconditioner: petsc/src/sles/pc/impls/jacobi.c
  - Conjugate gradient: petsc/src/sles/ksp/impls/cg/cg.c
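A minimal PCShell sketch, assuming this tutorial's generation of PETSc, in which the user's apply routine receives the shell context as its first argument (recent releases pass the PC object instead):

    /* User routine applying the preconditioner: y = M^(-1) x */
    extern int UserPCApply(void *ctx, Vec x, Vec y);

    PC pc;
    ierr = SLESGetPC(sles, &pc); CHKERRQ(ierr);
    ierr = PCSetType(pc, PCSHELL); CHKERRQ(ierr);
    ierr = PCShellSetApply(pc, UserPCApply, (void *)&ctx); CHKERRQ(ierr);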

SLES: Example Programs
Location: petsc/src/sles/examples/tutorials/
- ex1.c, ex1f.F - basic uniprocessor codes
- ex2.c, ex2f.F - basic parallel codes
- ex11.c - using complex numbers
- ex4.c - using different linear system and preconditioner matrices
- ex9.c - repeatedly solving different linear systems
- ex15.c - setting a user-defined preconditioner
And many more examples... For more information: