Stencil Pattern ITCS 4/5145 Parallel computing, UNC-Charlotte, B. Wilkinson Oct 14, 2014 slides6b.ppt


Stencil Pattern Examples
A stencil describes a 2- or 3-dimensional layout of processes, with each process able to communicate only with its neighbors. Appears in simulating many real-life situations.
Examples: solving partial differential equations using discretized methods, which may be for:
- Modeling engineering structures
- Weather forecasting (see intro to course, slides1a-1)
- Particle dynamics simulations
- Modeling chemical and biological structures

Stencil pattern
On each iteration, each node communicates with its neighbors to get their stored computed values.
[Figure: grid of compute nodes joined by two-way connections, with a source/sink]

(Iterative synchronous) stencil pattern
Often globally synchronous and iterative:
1. Processes compute and communicate only with their neighbors, exchanging results
2. Check termination condition
3. Repeat or stop

Application example of stencil pattern
Solving Laplace’s Equation (already seen in an assignment).
Solve for f over the two-dimensional x-y space. For a computer solution, finite difference methods are appropriate. The two-dimensional solution space is “discretized” into a large number of solution points.
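For reference, the standard form of Laplace's equation and its central-difference approximation are sketched below. The grid spacing symbol \Delta is an assumption introduced here for illustration; it does not appear in the slides.

```latex
% Laplace's equation in two dimensions
\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0

% Central-difference approximations on a grid with spacing \Delta
\frac{\partial^2 f}{\partial x^2} \approx
  \frac{f(x+\Delta, y) - 2f(x, y) + f(x-\Delta, y)}{\Delta^2}
\qquad
\frac{\partial^2 f}{\partial y^2} \approx
  \frac{f(x, y+\Delta) - 2f(x, y) + f(x, y-\Delta)}{\Delta^2}

% Substituting and solving for f(x, y): each point is the average of its four neighbors
f(x, y) \approx \frac{f(x-\Delta, y) + f(x+\Delta, y) + f(x, y-\Delta) + f(x, y+\Delta)}{4}
```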

Question: Do you recognize this?

Heat Distribution Problem (Steady State Heat Equation)
Finding the static distribution of heat in a space, here a 2-dimensional space but it could be 3-dimensional. An area has known temperatures along each of its borders (boundary conditions). Find the temperature distribution within. Each point is taken to be the average of its four neighboring points.
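The averaging rule above can be sketched in C as a loop that repeats until the values stop changing. This is a minimal sequential sketch, not code from the slides: the grid size N, the tolerance TOL and the function name solve_heat are all assumptions made for illustration.

```c
#define N 6        /* assumed size: grid is (N+1) x (N+1), indices 0 and N are borders */
#define TOL 1e-6   /* assumed convergence tolerance */

/* Repeatedly replace every interior point by the average of its four
   neighbors until no point changes by more than TOL, or until max_iter
   iterations have been done. Returns the number of iterations performed. */
int solve_heat(double h[N + 1][N + 1], int max_iter)
{
    double g[N + 1][N + 1];

    for (int iter = 0; iter < max_iter; iter++) {
        double max_change = 0.0;

        /* compute new values from the previous iteration's values */
        for (int i = 1; i < N; i++)
            for (int j = 1; j < N; j++) {
                g[i][j] = 0.25 * (h[i-1][j] + h[i+1][j] + h[i][j-1] + h[i][j+1]);
                double change = g[i][j] - h[i][j];
                if (change < 0.0) change = -change;
                if (change > max_change) max_change = change;
            }

        /* update interior points; border values stay fixed */
        for (int i = 1; i < N; i++)
            for (int j = 1; j < N; j++)
                h[i][j] = g[i][j];

        if (max_change < TOL)   /* termination condition */
            return iter + 1;
    }
    return max_iter;
}
```

A caller would fill the border of h with the known temperatures, set the interior to any starting guess, and then call solve_heat(h, 10000).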

Natural ordering
For convenience, edges are represented by points that have fixed values and are used in computing internal values.
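A minimal sketch of natural ordering, assuming it means numbering the points row by row. The numbering here is zero-based with n points per row, which may differ from the numbering in the original figure, and natural_index and stencil_at are hypothetical helper names.

```c
/* Row-by-row ("natural") numbering of a grid with n points per row.
   Zero-based here, which is an assumption. */
int natural_index(int row, int col, int n)
{
    return row * n + col;
}

/* Average of the four neighbors of interior point i in the flattened
   array x: the points above and below are n positions away, the points
   to the left and right are 1 position away. */
double stencil_at(const double *x, int i, int n)
{
    return 0.25 * (x[i - n] + x[i - 1] + x[i + 1] + x[i + n]);
}
```

Under this numbering, the neighbors of point i are points i-n, i-1, i+1 and i+n.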

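Relationship with a General System of Linear Equations
Using natural ordering with n points per row, the ith point $x_i$ is computed from the ith equation:
$$x_i = \frac{x_{i-n} + x_{i-1} + x_{i+1} + x_{i+n}}{4}$$
that is,
$$x_{i-n} + x_{i-1} - 4x_i + x_{i+1} + x_{i+n} = 0$$
which is a linear equation with five unknowns (except for points next to the boundary, where some of the values are known). In general form, the ith equation becomes:
$$a_{i,i-n}x_{i-n} + a_{i,i-1}x_{i-1} + a_{i,i}x_i + a_{i,i+1}x_{i+1} + a_{i,i+n}x_{i+n} = 0$$
with $a_{i,i} = -4$ and $a_{i,i-n} = a_{i,i-1} = a_{i,i+1} = a_{i,i+n} = 1$.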

Question: Will a Jacobi iterative method converge?

Sequential Code
Using a fixed number of iterations:

    for (iteration = 0; iteration < limit; iteration++) {
        for (i = 1; i < n; i++)
            for (j = 1; j < n; j++)
                g[i][j] = 0.25*(h[i-1][j]+h[i+1][j]+h[i][j-1]+h[i][j+1]);
        for (i = 1; i < n; i++)          /* update points */
            for (j = 1; j < n; j++)
                h[i][j] = g[i][j];
    }

using original numbering system (n x n array). Earlier we saw this can be improved by using a single 3-dimensional array.
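One way the single 3-dimensional array could be used is sketched below. This is an assumption about what the improvement looks like, not code from the slides: the two planes of h alternate between "previous" and "new" values, which removes the copy-back loop. The size N and the name jacobi_toggle are hypothetical.

```c
#define N 6   /* assumed size: indices 0 and N hold the fixed boundary values */

/* Jacobi iteration with one 3-dimensional array. Plane `cur` holds the
   previous iteration and plane `next` receives the new values, then the
   roles swap. Both planes must hold the boundary values on entry.
   Returns the index of the plane holding the final result. */
int jacobi_toggle(double h[2][N + 1][N + 1], int limit)
{
    int cur = 0;

    for (int iteration = 0; iteration < limit; iteration++) {
        int next = 1 - cur;
        for (int i = 1; i < N; i++)
            for (int j = 1; j < N; j++)
                h[next][i][j] = 0.25 * (h[cur][i-1][j] + h[cur][i+1][j]
                                      + h[cur][i][j-1] + h[cur][i][j+1]);
        cur = next;   /* new values become the previous values */
    }
    return cur;
}
```

The trade-off is that the boundary values have to be stored in both planes, in exchange for not copying the whole interior on every iteration.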

Algorithmic ways to improve the performance of computational stencil applications

Partially Synchronous Computations
Computations in which individual processes operate without needing to synchronize with other processes on every iteration. This is an important idea because synchronizing processes significantly slows the computation and is a major cause of reduced performance in parallel programs.

Heat Distribution Problem Re-visited: Making It Partially Synchronous
The synchronous version uses the previous iteration's results h[][] to compute the next iteration's values g[][]:

    forall (i = 1; i < n; i++)
        forall (j = 1; j < n; j++) {
            g[i][j] = 0.25*(h[i-1][j]+h[i+1][j]+h[i][j-1]+h[i][j+1]);
        }

There is a synchronization point at the end of each iteration. The waiting can be reduced by not forcing synchronization at each iteration: processes are allowed to move to the next iteration before all data points are computed, and then use data not only from the last iteration but possibly from earlier iterations. The method then becomes an “asynchronous iterative method.”
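The synchronous forall loop above could be written with OpenMP roughly as follows. This is a sketch under assumptions: the slides' forall is pseudocode, OpenMP is only one possible way to realize it, and the size N and the name jacobi_step are hypothetical. It shows the synchronous version only, not the asynchronous method.

```c
#define N 6   /* assumed size: indices 0 and N hold the fixed boundary values */

/* One synchronous iteration: every g[i][j] is computed only from h, so
   all points are independent and can be computed in parallel. */
void jacobi_step(double h[N + 1][N + 1], double g[N + 1][N + 1])
{
    /* If the compiler is not run with OpenMP enabled, the pragma is
       ignored and the loops simply run sequentially. */
    #pragma omp parallel for collapse(2)
    for (int i = 1; i < N; i++)
        for (int j = 1; j < N; j++)
            g[i][j] = 0.25 * (h[i-1][j] + h[i+1][j] + h[i][j-1] + h[i][j+1]);
    /* The implicit barrier at the end of the parallel loop is the
       synchronization point: no thread continues until all points of
       this iteration have been computed. */
}
```

The implicit barrier is exactly the waiting that a partially synchronous or asynchronous method tries to reduce.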

Asynchronous Iterative Method: Convergence Conditions
Mathematical conditions for convergence may be more strict. A process may not be allowed to use just any previous iteration's values if the method is to converge.

Chaotic Relaxation
A form of asynchronous iterative method introduced by Chazan and Miranker (1969), in which the conditions are stated as: “there must be a fixed positive integer s such that, in carrying out the evaluation of the ith iterate, a process cannot make use of any value of the components of the jth iterate if j < i - s” (Baudet, 1978).

Gauss-Seidel Relaxation
Uses some newly computed values to compute other values in that iteration.

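Gauss-Seidel Iteration Formula
For a system of N equations with coefficients $a_{i,j}$ and right-hand sides $b_i$:
$$x_i^{k} = \frac{1}{a_{i,i}}\left[b_i - \sum_{j=1}^{i-1} a_{i,j}x_j^{k} - \sum_{j=i+1}^{N} a_{i,j}x_j^{k-1}\right]$$
where the superscript indicates the iteration. With natural ordering of unknowns (n points per row, $b_i = 0$), the formula reduces to
$$x_i^{k} = -\frac{1}{a_{i,i}}\left[a_{i,i-n}x_{i-n}^{k} + a_{i,i-1}x_{i-1}^{k} + a_{i,i+1}x_{i+1}^{k-1} + a_{i,i+n}x_{i+n}^{k-1}\right]$$
At the kth iteration, two of the four values (those before the ith element) are taken from the kth iteration and two values (those after the ith element) are taken from the (k-1)th iteration. With $a_{i,i} = -4$ and the other four coefficients equal to 1, we have:
$$x_i^{k} = \frac{1}{4}\left[x_{i-n}^{k} + x_{i-1}^{k} + x_{i+1}^{k-1} + x_{i+n}^{k-1}\right]$$
In this form the method does not readily parallelize.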
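In code, Gauss-Seidel for the heat distribution problem amounts to updating a single array in place. The following is a minimal sequential sketch, assuming a two-dimensional array rather than the flattened natural-ordering vector; the size N and the name gauss_seidel_sweep are hypothetical.

```c
#define N 6   /* assumed size: indices 0 and N hold the fixed boundary values */

/* One Gauss-Seidel sweep in natural (row-by-row) order. f is updated in
   place, so f[i-1][j] and f[i][j-1] already hold this sweep's new values
   while f[i+1][j] and f[i][j+1] still hold the previous sweep's values.
   Returns the largest change made to any point. */
double gauss_seidel_sweep(double f[N + 1][N + 1])
{
    double max_change = 0.0;

    for (int i = 1; i < N; i++)
        for (int j = 1; j < N; j++) {
            double updated = 0.25 * (f[i-1][j] + f[i][j-1] + f[i+1][j] + f[i][j+1]);
            double change = updated - f[i][j];
            if (change < 0.0) change = -change;
            if (change > max_change) max_change = change;
            f[i][j] = updated;   /* immediately visible to later points */
        }
    return max_change;
}
```

Each point depends on values computed earlier in the same sweep, which is why the loops cannot simply be turned into forall loops.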

Red-Black Ordering
First, the black points are computed. Next, the red points are computed. All black points can be computed simultaneously, and all red points can be computed simultaneously.

Red-Black Parallel Code

    // compute red points
    forall (i = 1; i < n; i++)
        forall (j = 1; j < n; j++)
            if ((i + j) % 2 == 0)
                f[i][j] = 0.25*(f[i-1][j] + f[i][j-1] + f[i+1][j] + f[i][j+1]);

    // now compute black points
    forall (i = 1; i < n; i++)
        forall (j = 1; j < n; j++)
            if ((i + j) % 2 != 0)
                f[i][j] = 0.25*(f[i-1][j] + f[i][j-1] + f[i+1][j] + f[i][j+1]);

    // repeat
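A sequential C sketch of one red-black sweep is given below, with red points taken as those where (i + j) is even, as in the code above. The size N and the names update_color and red_black_sweep are assumptions made for illustration. Within each phase every updated point reads only points of the other color, so the iterations of each phase are independent and could run in parallel.

```c
#define N 6   /* assumed size: indices 0 and N hold the fixed boundary values */

/* Update all interior points whose (i + j) parity matches `parity`.
   Each such point reads only neighbors of the opposite parity, so the
   updates within one call do not depend on each other. */
static void update_color(double f[N + 1][N + 1], int parity)
{
    for (int i = 1; i < N; i++)
        for (int j = 1; j < N; j++)
            if ((i + j) % 2 == parity)
                f[i][j] = 0.25 * (f[i-1][j] + f[i][j-1] + f[i+1][j] + f[i][j+1]);
}

/* One full sweep: red points first, then black points, which use the
   red values just computed. */
void red_black_sweep(double f[N + 1][N + 1])
{
    update_color(f, 0);   /* red:   (i + j) even */
    update_color(f, 1);   /* black: (i + j) odd  */
}
```

In a parallel version, a synchronization point would be needed between the two phases and again before the next sweep.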

Multigrid Method
First, a coarse grid of points is used. With these points, the iteration process will start to converge quickly. At some stage, the number of points is increased to include the points of the coarse grid and extra points between them. Initial values of the extra points are found by interpolation. Computation continues with this finer grid. The grid can be made finer and finer as computation proceeds, or computation can alternate between fine and coarse grids. Coarser grids take into account distant effects more quickly and provide a good starting point for the next finer grid.
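The interpolation step can be sketched as follows. This illustrates only the move from a coarse grid to the next finer grid, not a full multigrid solver. It assumes the fine grid exactly halves the coarse spacing and that linear (bilinear) interpolation is used; the slides do not say which interpolation scheme is intended, and the names M, F and interpolate are hypothetical.

```c
#define M 4         /* assumed: coarse grid has (M+1) x (M+1) points */
#define F (2 * M)   /* fine grid has (F+1) x (F+1) points */

/* Fill the fine grid f from the coarse grid c. Fine points that coincide
   with coarse points copy their value; extra points between coarse
   points get the average of the 2 or 4 nearest coarse points. */
void interpolate(double c[M + 1][M + 1], double f[F + 1][F + 1])
{
    for (int i = 0; i <= F; i++)
        for (int j = 0; j <= F; j++) {
            int i0 = i / 2, j0 = j / 2;        /* coarse point at or before (i, j) */
            int i1 = (i % 2) ? i0 + 1 : i0;    /* next coarse row if i lies between rows */
            int j1 = (j % 2) ? j0 + 1 : j0;    /* next coarse column if j lies between columns */
            f[i][j] = 0.25 * (c[i0][j0] + c[i0][j1] + c[i1][j0] + c[i1][j1]);
        }
}
```

After interpolation, the iteration would continue on f, using these values as the starting point for the finer grid.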

Multigrid processor allocation

Questions