Improving the Performance of Large-Scale Unstructured PDE Applications

Presentation transcript:

Improving the Performance of Large-Scale Unstructured PDE Applications
Xing Cai, Simula Research Lab, Norway
June 18, 2004

How can parallel PDE solvers avoid overhead due to duplicated local computations?
Overlapping domain decomposition (DD) methods may involve duplicated local computations.
What is needed is a categorization of the overlapping mesh points.
With such a categorization, improved parallel performance is achievable.

Overlapping DD methods are important parallel PDE solvers.
Overlapping zones exist between neighboring subdomains, which therefore share some mesh points. This sharing is a source of overhead.

Avoiding overhead due to duplicated local computations needs a “categorization”:
A disjoint distribution of all the overlapping points.
Re-order the mesh points on each subdomain.
Replace “work” on certain points with communication.
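
The categorization and re-ordering steps can be illustrated on a toy case. The sketch below is not taken from the presentation: it assumes a 1D mesh split into contiguous blocks, and it assumes that each overlapping point is assigned to the subdomain whose non-overlapping block contains it. The function names `overlapping_subdomains` and `categorize_and_reorder` are hypothetical.

```python
def overlapping_subdomains(n_points, n_sub, overlap):
    """Split a 1D mesh of n_points into n_sub contiguous blocks and
    extend each block by `overlap` points on both sides (assumed layout)."""
    base = n_points // n_sub
    subs = []
    for s in range(n_sub):
        start = s * base
        end = n_points if s == n_sub - 1 else (s + 1) * base
        lo = max(0, start - overlap)
        hi = min(n_points, end + overlap)
        subs.append({"owned_range": (start, end), "points": list(range(lo, hi))})
    return subs


def categorize_and_reorder(sub):
    """Categorize the local points of one subdomain and re-order them.

    Points inside the non-overlapping block are 'owned' (computed locally).
    The remaining overlap points are 'non-owned': their values would be
    received from the owning neighbor instead of being recomputed.
    Returns the new local ordering (owned first) and the number of owned points.
    """
    start, end = sub["owned_range"]
    owned = [g for g in sub["points"] if start <= g < end]
    non_owned = [g for g in sub["points"] if not (start <= g < end)]
    return owned + non_owned, len(owned)


if __name__ == "__main__":
    for s, sub in enumerate(overlapping_subdomains(12, 3, 2)):
        order, n_owned = categorize_and_reorder(sub)
        print(f"subdomain {s}: compute {order[:n_owned]}, receive {order[n_owned:]}")
```

With owned points placed first, a local loop over the first `n_owned` entries touches every global point exactly once across all subdomains. Other ownership rules are possible; this one is chosen only because it is simple to state.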

Improved parallel performance arises from removing duplicated local computations:
Fewer points participate in local computations.
A slight increase of communication volume.
Sequential software is easily re-usable.
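
To make the trade-off concrete, the following sketch counts points on a toy 1D mesh. It is an illustration under stated assumptions, not a result from the presentation: the mesh is split into contiguous blocks, each extended by `overlap` points per side, and each point is computed only by the subdomain whose non-overlapping block contains it. The function name `work_and_communication` is hypothetical, and the communication already present in an overlapping DD solver is not modelled.

```python
def work_and_communication(n_points, n_sub, overlap):
    """Count point-wise work with and without duplicated computations.

    Returns a tuple (duplicated, disjoint, received):
      duplicated: total points computed if every subdomain works on all
                  of its local points, overlap included
      disjoint:   total points computed if each point has a single owner
      received:   values that subdomains obtain by communication for
                  their non-owned overlap points
    """
    base = n_points // n_sub
    duplicated = disjoint = received = 0
    for s in range(n_sub):
        start = s * base
        end = n_points if s == n_sub - 1 else (s + 1) * base
        lo = max(0, start - overlap)
        hi = min(n_points, end + overlap)
        n_local = hi - lo      # all points stored on this subdomain
        n_owned = end - start  # points this subdomain computes itself
        duplicated += n_local
        disjoint += n_owned
        received += n_local - n_owned
    return duplicated, disjoint, received


if __name__ == "__main__":
    dup, dis, rec = work_and_communication(1000, 4, 5)
    print(f"points computed with duplication: {dup}")
    print(f"points computed with one owner:   {dis}")
    print(f"values obtained by communication: {rec}")
```

In this toy count the saved computations equal the values that must be communicated instead, so whether the change pays off depends on the cost per point relative to the cost per communicated value, which the sketch does not measure.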