
Parallel Techniques: Embarrassingly Parallel Computations, Partitioning and Divide-and-Conquer Strategies, Pipelined Computations, Synchronous Computations, Asynchronous Computations, and strategies that achieve load balancing. ITCS 4/5145 Cluster Computing, UNC-Charlotte, B. Wilkinson, 2012.

Embarrassingly Parallel Computations. A computation that can obviously be divided into a number of completely independent parts, each of which can be executed by a separate process (or processor). There is no communication, or very little communication, between processes: each process can do its tasks without any interaction with the other processes.

Figure: Practical embarrassingly parallel computation with static process creation and a master-slave approach.
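As a minimal sketch of this pattern (MPI is assumed here as the message-passing layer; the work() task, the problem size N, and all names are illustrative, not the textbook's code), every process computes a fixed slice of the input, and the master gathers the results:

    #include <stdio.h>
    #include <mpi.h>

    #define N 1024                          /* total independent items */

    static double work(double x) { return x * x; }  /* placeholder task */

    int main(int argc, char *argv[])
    {
        int rank, nprocs;
        double local[N], result[N];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        int slice = N / nprocs;             /* assume nprocs divides N */

        /* Static assignment: each process computes its own fixed slice,
           with no interaction between processes. */
        for (int i = 0; i < slice; i++)
            local[i] = work((double)(rank * slice + i));

        /* Master (rank 0) collects all partial results, in rank order. */
        MPI_Gather(local, slice, MPI_DOUBLE, result, slice, MPI_DOUBLE,
                   0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("result[0] = %f, result[%d] = %f\n",
                   result[0], N - 1, result[N - 1]);

        MPI_Finalize();
        return 0;
    }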

Embarrassingly Parallel Computation Examples. 1. Low-level image processing. Many low-level image processing operations involve only local data, with very limited communication, if any, between areas of interest.

Figure: Image coordinate system, with origin (0, 0) and each pixel addressed by its coordinates (x, y).

Some geometrical operations. (a) Shifting. An object is shifted by Δx in the x-dimension and Δy in the y-dimension:

x′ = x + Δx
y′ = y + Δy

where x and y are the original coordinates and x′ and y′ are the new coordinates.

(b) Scaling. An object is scaled by a factor Sx in the x-direction and Sy in the y-direction:

x′ = x × Sx
y′ = y × Sy

(c) Rotation. An object is rotated through an angle θ about the origin of the coordinate system:

x′ = x cos θ + y sin θ
y′ = −x sin θ + y cos θ
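These three operations map directly onto per-pixel arithmetic, and each coordinate can be transformed independently, which is what makes them embarrassingly parallel. A small illustrative C sketch (the point type and function names are my own) applying them to a single coordinate pair:

    #include <stdio.h>
    #include <math.h>

    #define PI 3.14159265358979323846

    typedef struct { double x, y; } point;

    /* Shift by (dx, dy). */
    point shift(point p, double dx, double dy) {
        point q = { p.x + dx, p.y + dy };
        return q;
    }

    /* Scale by (sx, sy). */
    point scale(point p, double sx, double sy) {
        point q = { p.x * sx, p.y * sy };
        return q;
    }

    /* Rotate through angle theta (radians) about the origin,
       using the formulas above. */
    point rotate(point p, double theta) {
        point q = {  p.x * cos(theta) + p.y * sin(theta),
                    -p.x * sin(theta) + p.y * cos(theta) };
        return q;
    }

    int main(void) {
        point p = { 3.0, 4.0 };
        p = shift(p, 1.0, -2.0);
        p = scale(p, 2.0, 2.0);
        p = rotate(p, PI / 2);          /* 90-degree rotation */
        printf("(%.2f, %.2f)\n", p.x, p.y);
        return 0;
    }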

Parallelizing Image Operations. Partition the image into regions, one for each process. Example: a square region for each process (strips can also be used). A sketch of computing the region bounds follows.
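A minimal sketch (assuming a DIM × DIM image and a GRID × GRID process array that tiles it exactly; all names are illustrative) of how each process can derive its square region from its rank:

    #include <stdio.h>

    #define DIM  640                 /* image is DIM x DIM pixels    */
    #define GRID 4                   /* GRID*GRID processes in total */

    int main(void) {
        int side = DIM / GRID;       /* side of each square region */
        for (int rank = 0; rank < GRID * GRID; rank++) {
            int x0 = (rank % GRID) * side;   /* left edge */
            int y0 = (rank / GRID) * side;   /* top edge  */
            printf("process %2d handles pixels [%d..%d) x [%d..%d)\n",
                   rank, x0, x0 + side, y0, y0 + side);
        }
        return 0;
    }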

2. Mandelbrot Set. The set of points in the complex plane that are quasi-stable (their values increase and decrease but do not exceed some limit) when computed by iterating the function

z_{k+1} = (z_k)² + c

where z_{k+1} is the (k + 1)th iteration of the complex number z = a + bi, and c is a complex number giving the position of the point in the complex plane. The initial value of z is zero.

Mandelbrot Set continued. Iteration continues until either the magnitude of z is greater than 2, or the number of iterations reaches an arbitrary limit. The magnitude of z is the length of the vector, given by

z_length = √(a² + b²)

Sequential routine computing the value of one point and returning the number of iterations:

    typedef struct {                 /* complex number a + bi */
        float real;
        float imag;
    } complex;

    int cal_pixel(complex c)
    {
        int count, max;
        complex z;
        float temp, lengthsq;
        max = 256;                   /* arbitrary iteration limit */
        z.real = 0; z.imag = 0;      /* initial value of z is zero */
        count = 0;                   /* number of iterations */
        do {
            temp = z.real * z.real - z.imag * z.imag + c.real;
            z.imag = 2 * z.real * z.imag + c.imag;
            z.real = temp;           /* z = z*z + c */
            lengthsq = z.real * z.real + z.imag * z.imag;
            count++;
        } while ((lengthsq < 4.0) && (count < max));   /* |z| < 2 */
        return count;
    }
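A sketch of how cal_pixel might be used to compute the whole display sequentially; the display dimensions and the real_min/real_max/imag_min/imag_max window parameters are illustrative assumptions, not fixed by the slides:

    #define DISP_WIDTH  640
    #define DISP_HEIGHT 480

    /* Map each display pixel to a point c in the complex plane and
       record its iteration count (later mapped to a color). Uses the
       complex type and cal_pixel routine defined above. */
    void compute_image(int image[DISP_HEIGHT][DISP_WIDTH],
                       float real_min, float real_max,
                       float imag_min, float imag_max)
    {
        float scale_real = (real_max - real_min) / DISP_WIDTH;
        float scale_imag = (imag_max - imag_min) / DISP_HEIGHT;
        for (int y = 0; y < DISP_HEIGHT; y++)
            for (int x = 0; x < DISP_WIDTH; x++) {
                complex c;
                c.real = real_min + x * scale_real;
                c.imag = imag_min + y * scale_imag;
                image[y][x] = cal_pixel(c);
            }
    }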

Figure: The Mandelbrot set.

Parallelizing Mandelbrot Set Computation. Static task assignment: simply divide the region into a fixed number of parts, each computed by a separate processor. Not very successful, because different regions require different numbers of iterations, and hence different amounts of time.

Dynamic Task Assignment. Have processors request new regions after computing their previous regions, as in the work-pool sketch below.
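A minimal work-pool sketch of this idea (again assuming MPI; the tags, sizes, and names are illustrative, and it assumes at least one row per slave, i.e. nprocs − 1 ≤ HEIGHT). The master hands out one row at a time; a slave returns its computed row and implicitly requests more work:

    #include <mpi.h>

    #define WIDTH    640
    #define HEIGHT   480
    #define WORK_TAG 1                /* message carries a row number */
    #define DONE_TAG 2                /* no more rows to hand out     */

    void master(int nprocs)
    {
        int row = 0, results[WIDTH + 1];   /* results[0] = row number */
        MPI_Status status;

        /* Prime every slave with one row each. */
        for (int p = 1; p < nprocs; p++, row++)
            MPI_Send(&row, 1, MPI_INT, p, WORK_TAG, MPI_COMM_WORLD);

        /* Each returned row triggers the next assignment (or DONE). */
        for (int received = 0; received < HEIGHT; received++) {
            MPI_Recv(results, WIDTH + 1, MPI_INT, MPI_ANY_SOURCE,
                     MPI_ANY_TAG, MPI_COMM_WORLD, &status);
            /* ... store results[1..WIDTH] as image row results[0] ... */
            if (row < HEIGHT) {
                MPI_Send(&row, 1, MPI_INT, status.MPI_SOURCE,
                         WORK_TAG, MPI_COMM_WORLD);
                row++;
            } else {
                MPI_Send(&row, 1, MPI_INT, status.MPI_SOURCE,
                         DONE_TAG, MPI_COMM_WORLD);
            }
        }
    }

    void slave(void)
    {
        int row, results[WIDTH + 1];
        MPI_Status status;
        for (;;) {
            MPI_Recv(&row, 1, MPI_INT, 0, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            if (status.MPI_TAG == DONE_TAG)
                break;                       /* no more work */
            results[0] = row;
            for (int x = 0; x < WIDTH; x++)
                results[x + 1] = x * row;    /* stand-in: in the real
                    program this would be cal_pixel() on the point in
                    the complex plane corresponding to (x, row) */
            MPI_Send(results, WIDTH + 1, MPI_INT, 0, WORK_TAG,
                     MPI_COMM_WORLD);
        }
    }

    int main(int argc, char *argv[])
    {
        int rank, nprocs;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
        if (rank == 0) master(nprocs); else slave();
        MPI_Finalize();
        return 0;
    }

Because slow rows simply come back later, fast slaves automatically pick up more work, which is how the work pool balances the uneven iteration counts.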

3. Monte Carlo Methods. Another embarrassingly parallel computation. Monte Carlo methods use random selections.

Example: calculating π. A circle of unit radius is formed within a 2 × 2 square, so the ratio of the area of the circle to the area of the square is

(π × 1²) / (2 × 2) = π/4

Points within the square are chosen randomly, and a score is kept of how many of them happen to lie within the circle. Given a sufficient number of randomly selected samples, the fraction of points within the circle will be π/4.
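A minimal sequential sketch of this calculation in C (the sample count and seed are arbitrary choices):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        const int N = 1000000;              /* number of random points */
        int in_circle = 0;
        srand(12345);                       /* fixed seed, repeatable  */
        for (int i = 0; i < N; i++) {
            /* Random point in the 2 x 2 square centered on origin. */
            double x = 2.0 * rand() / RAND_MAX - 1.0;   /* in [-1, 1] */
            double y = 2.0 * rand() / RAND_MAX - 1.0;
            if (x * x + y * y <= 1.0)       /* inside unit circle? */
                in_circle++;
        }
        printf("pi is approximately %f\n", 4.0 * in_circle / N);
        return 0;
    }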

Computing an Integral. One quadrant of the circle can be described by the integral

∫₀¹ √(1 − x²) dx = π/4

Random pairs of numbers (x_r, y_r) are generated, each between 0 and 1, and a pair is counted as inside the circle if y_r ≤ √(1 − x_r²).

Alternative (better) Method. Generate random values of x to compute f(x), and sum the values of f(x):

Area = ∫_{x₁}^{x₂} f(x) dx ≈ (1/N) Σ_{r=1}^{N} f(x_r) × (x₂ − x₁)

where the x_r are randomly generated values of x between x₁ and x₂. The Monte Carlo method is very useful if the function cannot be integrated numerically (perhaps because it has a large number of variables).

Integral Example. Computing the integral

I = ∫_{x₁}^{x₂} (x² − 3x) dx

Sequential code:

    sum = 0;
    for (i = 0; i < N; i++) {            /* N random samples */
        xr = rand_v(x1, x2);             /* generate next random value */
        sum = sum + xr * xr - 3 * xr;    /* add f(xr) = xr*xr - 3*xr */
    }
    area = (sum / N) * (x2 - x1);

The routine rand_v(x1, x2) returns a pseudorandom number between x1 and x2.
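rand_v is not a standard C library routine; a minimal sketch of it on top of rand(), under the assumption that a uniformly distributed value in [x1, x2] is wanted:

    #include <stdlib.h>

    /* Pseudorandom double uniformly distributed between x1 and x2. */
    double rand_v(double x1, double x2) {
        return x1 + (x2 - x1) * rand() / RAND_MAX;
    }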

For parallelizing Monte Carlo code, the main issue to address is how best to generate random numbers in parallel; see the textbook.
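The textbook treats statistically sound parallel generators (such as leapfrogging a linear congruential sequence). As a simple illustrative workaround only, each process can at least seed its own generator differently; this does not guarantee independent streams:

    /* Illustrative only: a distinct seed per MPI process. A common
       quick approach, but not a statistically sound parallel RNG;
       prefer the techniques discussed in the textbook. */
    #include <stdlib.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        srand(1234 + rank);       /* distinct seed per process */
        /* ... each process now draws its own samples with rand() ... */
        MPI_Finalize();
        return 0;
    }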