Improving the Performance of Large-Scale Unstructured PDE Applications

Presentation transcript:

Improving the Performance of Large-Scale Unstructured PDE Applications. Xing Cai, Simula Research Lab, Norway. June 18, 2004.

How can parallel PDE solvers avoid overhead due to duplicated local computations? Overlapping domain decomposition (DD) methods may involve duplicated local computations. What is needed is a categorization of the overlapping mesh points. With it, improved parallel performance is achievable.

Overlapping DD methods are important parallel PDE solvers. Overlapping zones exist between neighboring subdomains, which share some mesh points, and this sharing causes overhead.
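To make the overhead concrete, here is a minimal sketch that counts how many mesh points end up stored (and potentially computed on) more than once. It assumes a 1D mesh of 100 points, 4 subdomains, and an overlap of 3 points on each interior side; the function name `overlapping_partition` and all parameter values are hypothetical and not taken from the slides.

```python
def overlapping_partition(n_points, n_subdomains, overlap):
    """Split indices 0..n_points-1 into contiguous blocks, then extend
    each block by `overlap` points on each side (clipped at the mesh ends)."""
    base = n_points // n_subdomains
    subdomains = []
    for s in range(n_subdomains):
        start = s * base
        # The last block absorbs any remainder points.
        stop = n_points if s == n_subdomains - 1 else (s + 1) * base
        lo = max(0, start - overlap)
        hi = min(n_points, stop + overlap)
        subdomains.append(list(range(lo, hi)))
    return subdomains


n_points = 100
subdomains = overlapping_partition(n_points, n_subdomains=4, overlap=3)

# Sum of local sizes versus the true mesh size: the difference is the
# number of extra (duplicated) copies of mesh points.
total_local = sum(len(s) for s in subdomains)
duplicated = total_local - n_points
print(total_local, duplicated)
```

If every subdomain performs its local computation on all of its local points, the duplicated copies represent work that is done more than once.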

Avoiding overhead due to duplicated local computations needs a “categorization”. This involves a disjoint distribution of all the overlapping points, a re-ordering of the mesh points on each subdomain, and replacing “work” on certain points with communication.
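The three ingredients above can be illustrated with a small sketch. It is not the author's implementation: it assumes a 1D mesh, a simple ownership rule (each point is owned by the subdomain whose non-overlapping block contains it), and a three-point averaging step as the "work". Communication is simulated by copying values between dictionaries, and all names (`categorize`, `smooth`, `parallel_step`) are hypothetical.

```python
def categorize(n_points, n_subdomains, overlap):
    """Assign every mesh point exactly one owner (a disjoint distribution)
    and re-order each subdomain's points as: owned first, then ghost."""
    base = n_points // n_subdomains
    parts = []
    for s in range(n_subdomains):
        start = s * base
        stop = n_points if s == n_subdomains - 1 else (s + 1) * base
        owned = list(range(start, stop))
        lo, hi = max(0, start - overlap), min(n_points, stop + overlap)
        ghost = [g for g in range(lo, hi) if g < start or g >= stop]
        parts.append({"owned": owned, "ghost": ghost, "order": owned + ghost})
    return parts


def smooth(values, i, n):
    """Three-point average at global index i (end points reuse their own value)."""
    left = values[i - 1] if i > 0 else values[i]
    right = values[i + 1] if i < n - 1 else values[i]
    return (left + values[i] + right) / 3.0


def parallel_step(parts, local_values, n):
    """Compute only on owned points, then fill ghost points by 'communication'."""
    new_owned = [
        {g: smooth(vals, g, n) for g in part["owned"]}
        for part, vals in zip(parts, local_values)
    ]
    owner_of = {g: s for s, part in enumerate(parts) for g in part["owned"]}
    received = 0
    new_local = []
    for part, mine in zip(parts, new_owned):
        vals = dict(mine)
        for g in part["ghost"]:
            vals[g] = new_owned[owner_of[g]][g]  # value sent by the owner
            received += 1
        new_local.append(vals)
    return new_local, received


n = 20
x = [float(i * i) for i in range(n)]
parts = categorize(n, n_subdomains=4, overlap=2)
local_values = [{g: x[g] for g in part["order"]} for part in parts]

new_local, received = parallel_step(parts, local_values, n)
serial = [smooth(x, i, n) for i in range(n)]

# Every local copy (owned or ghost) agrees with the serial result.
consistent = all(
    new_local[s][g] == serial[g]
    for s, part in enumerate(parts)
    for g in part["order"]
)
print(consistent, received)
```

In this sketch each mesh point is updated exactly once, by its owner, and the ghost copies are refreshed by receiving values instead of recomputing them. Placing owned points first in the local ordering means the computational loop can run over a contiguous leading range of the local arrays.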

Improved parallel performance arises from removing duplicated local computations. Fewer points participate in local computations. Communication volume increases slightly. Sequential software is easily re-usable.