Parallel Techniques • Embarrassingly Parallel Computations

Slides:



Advertisements
Similar presentations
Prepared 7/28/2011 by T. O’Neil for 3460:677, Fall 2011, The University of Akron.
Advertisements

Practical techniques & Examples
Grid Computing, B. Wilkinson, C Program Command Line Arguments A normal C program specifies command line arguments to be passed to main with:
Parallel Strategies Partitioning consists of the following steps –Divide the problem into parts –Compute each part separately –Merge the results Divide.
Graduate School of Information Sciences, Tohoku University
COMPE472 Parallel Computing Embarrassingly Parallel Computations Partitioning and Divide-and-Conquer Strategies Pipelined Computations Synchronous Computations.
Embarrassingly Parallel (or pleasantly parallel) Domain divisible into a large number of independent parts. Minimal or no communication Each processor.
CSCE Review—Fortran. CSCE Review—I/O Patterns: Read until a sentinel value is found Read n, then read n things Read until EOF encountered.
Computational Physics Lecture 4 Dr. Guy Tel-Zur.
CSE 160/Berman Programming Paradigms and Algorithms W+A 3.1, 3.2, p. 178, 6.3.2, H. Casanova, A. Legrand, Z. Zaogordnov, and F. Berman, "Heuristics.
EECC756 - Shaaban #1 lec # 8 Spring Message-Passing Computing Examples Problems with a very large degree of parallelism: –Image Transformations:
Embarrassingly Parallel Computations Partitioning and Divide-and-Conquer Strategies Pipelined Computations Synchronous Computations Asynchronous Computations.
Characteristics of Embarrassingly Parallel Computations Easily parallelizable Little or no interaction between processes Can give maximum speedup.
High Performance Computing 1 Parallelization Strategies and Load Balancing Some material borrowed from lectures of J. Demmel, UC Berkeley.
Steady-State Sinusoidal Analysis
Embarrassingly Parallel Computations Partitioning and Divide-and-Conquer Strategies Pipelined Computations Synchronous Computations Asynchronous Computations.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Unit 7 We are learning to solve two variable linear equations for one variable. CC3:
Exercise problems for students taking the Programming Parallel Computers course. Janusz Kowalik Piotr Arlukowicz Tadeusz Puzniakowski Informatics Institute.
Simulation of Random Walk How do we investigate this numerically? Choose the step length to be a=1 Use a computer to generate random numbers r i uniformly.
Slides Prepared from the CI-Tutor Courses at NCSA By S. Masoud Sadjadi School of Computing and Information Sciences Florida.
Parallelism and Robotics: The Perfect Marriage By R.Theron,F.J.Blanco,B.Curto,V.Moreno and F.J.Garcia University of Salamanca,Spain Rejitha Anand CMPS.
Designing and Evaluating Parallel Programs Anda Iamnitchi Federated Distributed Systems Fall 2006 Textbook (on line): Designing and Building Parallel Programs.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Scientific Computing Lecture 5 Dr. Guy Tel-Zur Autumn Colors, by Bobby Mikul, Mikul Autumn Colors, by Bobby Mikul,
Chapter 3 Parallel Algorithm Design. Outline Task/channel model Task/channel model Algorithm design methodology Algorithm design methodology Case studies.
Lecture 3 Process Concepts. What is a Process? A process is the dynamic execution context of an executing program. Several processes may run concurrently,
Embarrassingly Parallel Computations processes …….. Input data results Each process requires different data and produces results from its input without.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Embarrassingly Parallel Computations Partitioning and Divide-and-Conquer Strategies Pipelined Computations Synchronous Computations Asynchronous Computations.
CS 484 Designing Parallel Algorithms Designing a parallel algorithm is not easy. There is no recipe or magical ingredient Except creativity We can benefit.
Operational Research & ManagementOperations Scheduling Economic Lot Scheduling 1.Summary Machine Scheduling 2.ELSP (one item, multiple items) 3.Arbitrary.
CSCI-455/552 Introduction to High Performance Computing Lecture 9.
Computer Science 320 Load Balancing. Behavior of Parallel Program Why do 3 threads take longer than two?
MSc in High Performance Computing Computational Chemistry Module Parallel Molecular Dynamics (i) Bill Smith CCLRC Daresbury Laboratory
3/12/2013Computer Engg, IIT(BHU)1 INTRODUCTION-1.
Embarrassingly Parallel (or pleasantly parallel) Characteristics Domain divisible into a large number of independent parts. Little or no communication.
COMP7330/7336 Advanced Parallel and Distributed Computing Task Partitioning Dr. Xiao Qin Auburn University
Suzaku Pattern Programming Framework (a) Structure and low level patterns © 2015 B. Wilkinson Suzaku.pptx Modification date February 22,
Parallel Computing Chapter 3 - Patterns R. HALVERSON MIDWESTERN STATE UNIVERSITY 1.
Embarrassingly Parallel Computations
Auburn University
Courtesy: Dr. David Walker, Cardiff University
Xing Cai University of Oslo
Basic simulation methodology
Content * Overview * Project overall * PF meter * Calculation of firing angle * Generation of firing angle * Results * Comparison * Problems.
Embarrassingly Parallel (or pleasantly parallel)
Parallel Programming By J. H. Wang May 2, 2017.
Sample Problem 3.5 A cube is acted on by a force P as shown. Determine the moment of P about A about the edge AB and about the diagonal AG of the cube.
Parallel Shared Memory
Parallel and Distributed Simulation Techniques
Parallel Techniques之 易并行计算
Chapter 5 - Functions Outline 5.1 Introduction
Embarrassingly Parallel
CS 584.
CSC4005 – Distributed and Parallel Computing
Monte Carlo Integration Using MPI
Monte Carlo Methods A so-called “embarrassingly parallel” computation as it decomposes into obviously independent tasks that can be done in parallel without.
Introduction to High Performance Computing Lecture 7
Embarrassingly Parallel Computations
Lecture 4 - Monte Carlo improvements via variance reduction techniques: antithetic sampling Antithetic variates: for any one path obtained by a gaussian.
Jacobi Project Salvatore Orlando.
Ratio and Proportion Vocabulary.
Introduction to High Performance Computing Lecture 16
Math review - scalars, vectors, and matrices
Introduction to High Performance Computing Lecture 17
Mattan Erez The University of Texas at Austin
Introduction to High Performance Computing Lecture 8
Presentation transcript:

Parallel Techniques • Embarrassingly Parallel Computations • Partitioning and Divide-and-Conquer Strategies • Pipelined Computations • Synchronous Computations • Asynchronous Computations • Load Balancing and Termination Detection 3.1

Embarrassingly Parallel Computations Chapter 3 Embarrassingly Parallel Computations 3.2

Embarrassingly Parallel Computations A computation that can obviously be divided into a number of completely independent parts, each of which can be executed by a separate process(or). No communication or very little communication between processes Each process can do its tasks without any interaction with other processes 3.3

Practical embarrassingly parallel computation with static process creation and master-slave approach 3.4

Practical embarrassingly parallel computation with dynamic process creation and master-slave approach 3.5

Embarrassingly Parallel Computation Examples • Low level image processing • Mandelbrot set • Monte Carlo Calculations 3.6

Low level image processing Many low level image processing operations only involve local data with very limited if any communication between areas of interest. 3.7

Some geometrical operations Shifting Object shifted by Dx in the x-dimension and Dy in the y-dimension: x¢ = x + Dx y¢ = y + Dy where x and y are the original and x¢ and y¢ are the new coordinates. Scaling Object scaled by a factor Sx in x-direction and Sy in y-direction: x¢ = xSx y¢ = ySy 3.8

Rotation Object rotated through an angle q about the origin of the coordinate system: x¢ = x cosq + y sinq y¢ = -x sinq + y cosq 3.8

Partitioning into regions for individual processes Square region for each process (can also use strips) 3.9

Mandelbrot Set Set of points in a complex plane that are quasi-stable (will increase and decrease, but not exceed some limit) when computed by iterating the function where zk +1 is the (k + 1)th iteration of the complex number z = a + bi and c is a complex number giving position of point in the complex plane. The initial value for z is zero. Iterations continued until magnitude of z is greater than 2 or number of iterations reaches arbitrary limit. Magnitude of z is the length of the vector given by 3.10

Sequential routine computing value of one point returning number of iterations structure complex { float real; float imag; }; int cal_pixel(complex c) { int count, max; complex z; float temp, lengthsq; max = 256; z.real = 0; z.imag = 0; count = 0; /* number of iterations */ do { temp = z.real * z.real - z.imag * z.imag + c.real; z.imag = 2 * z.real * z.imag + c.imag; z.real = temp; lengthsq = z.real * z.real + z.imag * z.imag; count++; } while ((lengthsq < 4.0) && (count < max)); return count; } 3.11

Mandelbrot set 3.12

Static Task Assignment Dynamic Task Assignment Parallelizing Mandelbrot Set Computation Static Task Assignment Simply divide the region in to fixed number of parts, each computed by a separate processor. Not very successful because different regions require different numbers of iterations and time. Dynamic Task Assignment Have processor request regions after computing previous regions 3.13

Work Pool/Processor Farms Dynamic Task Assignment Work Pool/Processor Farms 3.14

Monte Carlo Methods Another embarrassingly parallel computation. Monte Carlo methods use of random selections. 3.15

Circle formed within a 2 x 2 square Circle formed within a 2 x 2 square. Ratio of area of circle to square given by: Points within square chosen randomly. Score kept of how many points happen to lie within circle. Fraction of points within the circle will be , given a sufficient number of randomly selected samples. 3.16

3.17

Computing an Integral One quadrant can be described by integral Random pairs of numbers, (xr,yr) generated, each between 0 and 1. Counted as in circle if 3.18

Alternative (better) Method Use random values of x to compute f(x) and sum values of f(x): where xr are randomly generated values of x between x1 and x2. Monte Carlo method very useful if the function cannot be integrated numerically (maybe having a large number of variables) 3.19

Example Computing the integral Sequential Code sum = 0; for (i = 0; i < N; i++) { /* N random samples */ xr = rand_v(x1, x2); /* generate next random value */ sum = sum + xr * xr - 3 * xr; /* compute f(xr) */ } area = (sum / N) * (x2 - x1); Routine randv(x1, x2) returns a pseudorandom number between x1 and x2. 3.20

For parallelizing Monte Carlo code, must address best way to generate random numbers in parallel - see textbook 3.21