Parallel Computing Activities at the Group of Scientific Software
Xing Cai
Department of Informatics, University of Oslo


Numerical Simulation
[Diagram labels: Phy. phenom., Math. model, Diffpack, hardware, Algorithm]

Diffpack
- O-O software environment for scientific computation (C++)
- Rich collection of PDE solution components - portable, flexible, extensible
- H. P. Langtangen, Computational Partial Differential Equations, Springer 1999

Parallelization Objectives
- Flexible and user-friendly parallelization
- High parallel efficiency
- Full code portability (standard MPI)
  – SGI Cray Origin 2000, SGI Power Challenge
  – HP V2500
  – IBM SP2
  – Scali
  – Cluster of Linux PC nodes

Straightforward Parallelization
- Develop a sequential simulator, without paying attention to parallelism
- Use add-on libraries for parallelization-specific functionalities
- Add a few new statements for transformation to a parallel simulator

Linear-algebra-level (LAL) Approach
- Parallelize matrix/vector operations
  – local sequential operations + communication
  – inner-product of two vectors
  – matrix-vector product
  – preconditioning - block contribution from subgrids
- Keeps original sequential Diffpack libraries almost intact
- Needs inter-processor communication functionality
- Easy to use
  – access to all existing Diffpack iterative methods, preconditioners and convergence monitors
  – need only to add a few lines of new code
  – arbitrary choice of number of procs at run-time
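
The "local sequential operations + communication" pattern can be illustrated with the inner product of two distributed vectors. The following is a minimal single-process sketch, not Diffpack code: the names `SubVectorPair`, `localInnerProd`, `globalSum` and `distributedInnerProd` are hypothetical, a non-overlapping partition is assumed, and the communication step is imitated by a plain sum where an MPI program would perform a global sum reduction.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// One subdomain's share of two global vectors (hypothetical type).
// Assumes a non-overlapping partition: every global entry is owned by
// exactly one subdomain, so no entry is counted twice.
struct SubVectorPair {
    std::vector<double> x;
    std::vector<double> y;
};

// Step 1: purely local, sequential work on one subdomain.
double localInnerProd(const SubVectorPair& sub) {
    assert(sub.x.size() == sub.y.size());
    return std::inner_product(sub.x.begin(), sub.x.end(), sub.y.begin(), 0.0);
}

// Step 2: stand-in for the communication step. In an MPI program this
// would be a global sum reduction across all processes.
double globalSum(const std::vector<double>& partialResults) {
    return std::accumulate(partialResults.begin(), partialResults.end(), 0.0);
}

// Distributed inner product = local inner products + one global reduction.
double distributedInnerProd(const std::vector<SubVectorPair>& subdomains) {
    std::vector<double> partial;
    partial.reserve(subdomains.size());
    for (const SubVectorPair& sub : subdomains) {
        partial.push_back(localInnerProd(sub));
    }
    return globalSum(partial);
}
```

Only one scalar per process crosses the network in such a scheme, which is why the sequential vector classes can stay almost untouched.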

Work Load Distribution
- Through grid partition -> sub-matrix, sub-vector, no global matrix/vector!
- Need good load balance
- Flexibility & extensibility
  – Global grid -> a set of subgrids
    - Arbitrary number of procs determined at run-time
    - Non-overlapping partition
    - Controllable addition of overlap (if desired)
  – An existing set of subgrids (input from files)
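
As a rough illustration of load-balanced partitioning with a controllable overlap, the sketch below splits the nodes of a 1D grid into contiguous subgrids. It is a deliberately simple stand-in, not the partitioning used in the talk (the coding example on a later slide selects METIS, which handles general unstructured grids); `Subgrid` and `partitionGrid` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open index range [begin, end) of the nodes held by one subgrid.
struct Subgrid {
    std::size_t begin;
    std::size_t end;
    std::size_t size() const { return end - begin; }
};

// Split numNodes nodes of a 1D grid into numProcs contiguous subgrids whose
// owned sizes differ by at most one (load balance), then extend each subgrid
// by `overlap` nodes on both sides, clamped to the grid boundary.
std::vector<Subgrid> partitionGrid(std::size_t numNodes, std::size_t numProcs,
                                   std::size_t overlap) {
    assert(numProcs > 0 && numProcs <= numNodes);
    const std::size_t base = numNodes / numProcs;
    const std::size_t extra = numNodes % numProcs;  // first `extra` subgrids own one more node
    std::vector<Subgrid> subgrids;
    subgrids.reserve(numProcs);
    std::size_t start = 0;
    for (std::size_t p = 0; p < numProcs; ++p) {
        const std::size_t owned = base + (p < extra ? 1 : 0);
        const std::size_t stop = start + owned;
        const std::size_t lo = (start >= overlap) ? start - overlap : 0;
        const std::size_t hi = std::min(numNodes, stop + overlap);
        subgrids.push_back({lo, hi});
        start = stop;
    }
    return subgrids;
}
```

Because the number of subgrids is an ordinary function argument, it can be chosen at run-time, and `overlap = 0` gives the non-overlapping partition.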

An Add-on Parallelization Library
- Grid partition administration
- High-level inter-processor communication routines (hidden MPI, OpenMP?)
- class GridPartAdm
  – void GridPartAdm::prepareSubgrids()
  – void GridPartAdm::prepareCommunication()
  – void GridPartAdm::updateGlobalValues()
  – void GridPartAdm::matvec
  – void GridPartAdm::innerProd
  – void GridPartAdm::norm
- Fully portable (Origin 2000, IBM SP2, HP V )

A Simple Coding Example

    //...
    #ifdef PARALLEL_CODE
      adm->scan (menu);
      adm->prepareSubgrids ();
      adm->prepareCommunication ();
      lineq->attachCommAdm (*adm);
    #endif
      lineq->solve ();
    //...

    set subdomain list = DEFAULT
    set global grid = whole_grid.file
    set partition-algorithm = METIS
    set number of overlaps = 0

LAL Approach - Measurements
- 2D Poisson Equation on unit square

Parallel Vortex-Shedding Simulation

Simulation Snapshots
- Pressure

Electrical potential depolarization in human heart

Parallelization Approach 2
- Inspired by overlapping Schwarz methods
- Multilevel methods & parallel computing
- Simulator-Parallel: each subdomain is assigned a sequential simulator
- A generic framework: add-on library No. 2
- Systematic and flexible
  – O-O programming enables extensive code reuse
  – Easy to incorporate multilevel algorithmic modification
  – Different grid types, local solution methods etc. on different subdomains
- Parallelization at a higher level!
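
To make the overlapping Schwarz idea concrete, here is a minimal sketch of the alternating (multiplicative) Schwarz iteration with two overlapping subdomains for the 1D model problem -u'' = 2 on (0,1) with u(0) = u(1) = 0, whose solution is u = x(1 - x). This is an illustration under those assumptions, not the framework from the talk: `solveSubdomain` and `schwarzSolve` are hypothetical names, and the subdomain solver is a plain tridiagonal (Thomas) solve standing in for a sequential simulator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Solve the finite difference form of -u'' = f on grid points lo..hi
// (inclusive), using the current values u[lo-1] and u[hi+1] as Dirichlet
// data. This plays the role of the sequential simulator on one subdomain.
void solveSubdomain(std::vector<double>& u, const std::vector<double>& f,
                    std::size_t lo, std::size_t hi, double h) {
    assert(lo >= 1 && lo <= hi && hi + 1 < u.size());
    const std::size_t m = hi - lo + 1;
    std::vector<double> diag(m, 2.0), rhs(m);
    for (std::size_t k = 0; k < m; ++k) rhs[k] = h * h * f[lo + k];
    rhs[0] += u[lo - 1];      // left interface (or physical boundary) value
    rhs[m - 1] += u[hi + 1];  // right interface (or physical boundary) value
    // Thomas algorithm for the tridiagonal matrix with stencil (-1, 2, -1).
    for (std::size_t k = 1; k < m; ++k) {
        const double w = 1.0 / diag[k - 1];
        diag[k] -= w;
        rhs[k] += w * rhs[k - 1];
    }
    u[hi] = rhs[m - 1] / diag[m - 1];
    for (std::size_t k = m - 1; k-- > 0;) {
        u[lo + k] = (rhs[k] + u[lo + k + 1]) / diag[k];
    }
}

// Alternating Schwarz with two subdomains, 1..mid+overlap and
// mid-overlap+1..n, where n is the number of interior grid points.
// Returns u including the two boundary points u[0] and u[n+1].
std::vector<double> schwarzSolve(std::size_t n, std::size_t overlap, int sweeps) {
    const std::size_t mid = n / 2;
    assert(mid >= 1 && overlap <= mid && mid + overlap <= n);
    const double h = 1.0 / static_cast<double>(n + 1);
    std::vector<double> u(n + 2, 0.0);  // zero initial guess and boundary values
    std::vector<double> f(n + 2, 2.0);  // right-hand side f = 2
    for (int s = 0; s < sweeps; ++s) {
        solveSubdomain(u, f, 1, mid + overlap, h);      // left subdomain
        solveSubdomain(u, f, mid - overlap + 1, n, h);  // right subdomain
    }
    return u;
}
```

Each sweep only exchanges interface values between the two subdomain solves, and a wider overlap reduces the number of sweeps needed in this model problem.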

Generic Programming
[Class diagram: SubdomainSimulator, SubdomainFEMSolver, OldSimulator, NewSimulator]
Incorporation of a subdomain solver
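
One way the four class names on this slide could fit together is sketched below. The inheritance relations and every member function are assumptions made for illustration (the transcript preserves only the class names, not the diagram's arrows): `NewSimulator` is assumed to derive from both the existing `OldSimulator` and the generic `SubdomainFEMSolver`, so the old code is reused unchanged.

```cpp
#include <cassert>
#include <vector>

// Generic interface seen by the parallel framework: one object per subdomain.
// The member functions are hypothetical.
class SubdomainSimulator {
public:
    virtual ~SubdomainSimulator() = default;
    virtual void solveLocal() = 0;
    virtual double localSolutionValue() const = 0;
};

// Placeholder for the finite element specialisation of the generic interface.
class SubdomainFEMSolver : public SubdomainSimulator {
};

// An existing sequential simulator, written without regard to parallelism.
class OldSimulator {
public:
    virtual ~OldSimulator() = default;
    void solveProblem() { solution_ = 42.0; }  // stand-in for a real PDE solve
    double solution() const { return solution_; }
private:
    double solution_ = 0.0;
};

// The new simulator reuses OldSimulator as-is and only forwards the generic
// subdomain interface to it (assumed design, via multiple inheritance).
class NewSimulator : public SubdomainFEMSolver, public OldSimulator {
public:
    void solveLocal() override { solveProblem(); }
    double localSolutionValue() const override { return solution(); }
};

// Framework-side code works through the generic interface only.
void solveAllSubdomains(const std::vector<SubdomainSimulator*>& subdomains) {
    for (SubdomainSimulator* s : subdomains) {
        assert(s != nullptr);
        s->solveLocal();
    }
}
```

With such a design, the amount of new code in `NewSimulator` stays small because it contains only forwarding functions.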

SP Approach - Measurements
- 2D Poisson Equation on unit square
- Fixed number of subdomains
- Overlap between subdomains
- Superior numerical efficiency!

SP Approach - Unstructured Grids
- Highly unstructured grid
- Discontinuity in the coefficient K
- Run multigrid in parallel

SP Approach - Measurements I
- HP V2500

SP Approach - Measurements II
- SGI Cray Origin 2000

SP Application - System of PDEs
- 2D pressure equation in reservoir simulation

SP Application - System of PDEs
- 3D Poisson equation in water wave simulation

Parallel Efficiency
- Fixed number of subdomains M = 16.
- Subdomain grids from partition of a global 41x41x41 grid.
- DD as preconditioner of CG for the Laplace eq.
- Multigrid V-cycle as subdomain solver.