Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. 2004.

Presentation transcript:

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M Pearson Education Inc. All rights reserved.

Parallel Techniques
– Embarrassingly Parallel Computations (chapter 3)
– Partitioning and Divide-and-Conquer Strategies (chapter 4)
– Pipelined Computations (chapter 5)
– Synchronous Computations (chapter 6)
– Load Balancing and Termination Detection (chapter 7)

Chapter 3: Embarrassingly Parallel Computations

Embarrassingly Parallel Computations
A computation that can obviously be divided into a number of completely independent parts, each of which can be executed by a separate process(or). Such a computation is also called naturally parallel. Ideally, there would be no communication, or very little communication, between the separate processes; that is, the computational graph would be completely disconnected.

This situation will give the maximum possible speedup if all the available processors can be assigned processes for the total duration of the computation. The key characteristic is that there is no interaction between the processes. In a practical embarrassingly parallel computation, data has to be distributed to the processes and results collected and combined in some way. A common approach is the master-slave organization.

If dynamic process creation is used, a master process is started first, and it then spawns (starts) identical slave processes. The master-slave approach can also be used with static process creation. We will introduce load balancing in this chapter, but only for cases in which there is no interaction between slave processes. (See chapter 7 for load balancing.)

[Figure: Practical embarrassingly parallel computation with static process creation and master-slave approach]

[Figure: Practical embarrassingly parallel computation with dynamic process creation and master-slave approach]

Embarrassingly Parallel Computation Examples
– Geometrical Transformations of Images
– Mandelbrot set
– Monte Carlo Calculations

Geometrical Transformations of Images
– In computer graphics, a number of graphical operations can be performed upon the stored image.
– Such graphical transformations must be done at high speed to be acceptable to the viewer. (chapter 11)
– The most basic way to store a two-dimensional image is a pixmap, in which each pixel (picture element) is stored as a binary number in a two-dimensional array (also called a bitmap).
– Black-and-white images: a single binary bit is sufficient for each pixel, a 1 if the pixel is white and a 0 if the pixel is black.
– Grayscale images require more bits, typically using 8 bits to represent 256 different monochrome intensities.
– Color images: three bytes could be used for each pixel, one byte for red, one for green, and one for blue (RGB).

– Geometrical transformations require mathematical operations to be performed on the coordinates of each pixel to move the position of the pixel without affecting its value.
– The transformation of each pixel is independent of all the others, so this is a truly embarrassingly parallel computation.

Some geometrical operations

Shifting
Object shifted by $\Delta x$ in the x-dimension and $\Delta y$ in the y-dimension:
$$x' = x + \Delta x$$
$$y' = y + \Delta y$$
where $x$ and $y$ are the original and $x'$ and $y'$ are the new coordinates.

Scaling
Object scaled by a factor $S_x$ in the x-direction and $S_y$ in the y-direction:
$$x' = x S_x$$
$$y' = y S_y$$

Rotation
Object rotated through an angle $\theta$ about the origin of the coordinate system:
$$x' = x \cos\theta + y \sin\theta$$
$$y' = -x \sin\theta + y \cos\theta$$

Clipping
This transformation applies defined rectangular boundaries to a figure and deletes from the displayed picture those points outside the defined area.
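
The four operations can be sketched in C as small per-point functions. This is only an illustration: the `Point` type, the function names, and the clip-rectangle parameters `xl`, `yl`, `xh`, `yh` are assumptions, not names from the slides. The rotation takes the cosine and sine of the angle as arguments so that a caller can compute them once and reuse them for every pixel.

```c
#include <stdbool.h>

/* A point in image coordinates (hypothetical helper type). */
typedef struct { double x, y; } Point;

/* Shifting: x' = x + dx, y' = y + dy. */
Point shift(Point p, double dx, double dy)
{
    Point q = { p.x + dx, p.y + dy };
    return q;
}

/* Scaling: x' = x * sx, y' = y * sy. */
Point scale(Point p, double sx, double sy)
{
    Point q = { p.x * sx, p.y * sy };
    return q;
}

/* Rotation about the origin: x' = x cos(t) + y sin(t),
   y' = -x sin(t) + y cos(t). The caller passes cos(t) and sin(t). */
Point rotate(Point p, double cos_t, double sin_t)
{
    Point q = { p.x * cos_t + p.y * sin_t,
               -p.x * sin_t + p.y * cos_t };
    return q;
}

/* Clipping test: true if p lies inside the rectangle
   [xl, xh] x [yl, yh]; points outside would not be displayed. */
bool in_clip_region(Point p, double xl, double yl, double xh, double yh)
{
    return p.x >= xl && p.x <= xh && p.y >= yl && p.y <= yh;
}
```

Because each function reads only its own point, any subset of pixels can be transformed by any process without coordination.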

– The input data is the bitmap, which is typically held in a file and copied into an array.
– The main parallel programming concern is the division of the bitmap into groups of pixels for each processor, because there are usually many more pixels than processes/processors.
– Suppose we use a master process and 48 slave processes and partition in groups of 10 rows (48 groups of 10 rows cover a 640 x 480 image). Each slave process processes one 640 x 10 area, returning the old coordinates and the new coordinates to the master for displaying. (The master then updates the bitmap.)
– Wherever the new image does not appear in the display area, the bitmap is set to 0 (black).
– In this scheme the results are returned one at a time; returning them as one group would reduce the message overhead time.
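
The partitioning above can be sketched as a sequential simulation in C. No message passing is shown: the call to `slave_shift` stands in for the work one slave would do, and the direct write to `new_map` stands in for the master's update after receiving the old and new coordinates. All identifiers here are assumptions made for the illustration.

```c
#include <string.h>

#define WIDTH           640
#define HEIGHT          480
#define NWORKERS        48
#define ROWS_PER_WORKER (HEIGHT / NWORKERS)   /* 10 rows per slave */

unsigned char old_map[HEIGHT][WIDTH];   /* original bitmap */
unsigned char new_map[HEIGHT][WIDTH];   /* bitmap after the shift */

/* Work of one slave: shift every pixel in its group of 10 rows.
   Pixels that move outside the display area are dropped. */
void slave_shift(int worker, int dx, int dy)
{
    int first_row = worker * ROWS_PER_WORKER;
    for (int y = first_row; y < first_row + ROWS_PER_WORKER; y++) {
        for (int x = 0; x < WIDTH; x++) {
            int nx = x + dx;
            int ny = y + dy;
            if (nx >= 0 && nx < WIDTH && ny >= 0 && ny < HEIGHT)
                new_map[ny][nx] = old_map[y][x];   /* "master" update */
        }
    }
}

/* Master: clear the new bitmap to 0 (black), then let every slave
   process its own strip. The strips are independent of each other. */
void shift_image(int dx, int dy)
{
    memset(new_map, 0, sizeof new_map);
    for (int w = 0; w < NWORKERS; w++)
        slave_shift(w, dx, dy);
}
```

In a real program the loop over `w` would be replaced by 48 processes running concurrently, and each slave would send its results back in messages.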

[Figure: Partitioning into regions for individual processes. Square region for each process (strips can also be used).]

Analysis The parallel time complexity is composed of two parts, communication and computation.
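
The formulas for this slide did not survive in the transcript. As a sketch only, the two parts can be written out under an assumed cost model: an $n \times n$ image, $p$ slaves, two computational steps per pixel for a shift, one short message from the master to each slave, and one result message of four coordinates per pixel. The symbols $t_{\text{startup}}$ and $t_{\text{data}}$ (message startup time and time per data item) are assumptions of this model.

```latex
\begin{align*}
t_{\text{comp}} &= \frac{2n^2}{p} \\
t_{\text{comm}} &= p\,(t_{\text{startup}} + t_{\text{data}})
                 + n^2\,(t_{\text{startup}} + 4\,t_{\text{data}}) \\
t_p &= t_{\text{comp}} + t_{\text{comm}}
     = O\!\left(\frac{n^2}{p}\right) + O(p + n^2) \\
\frac{t_{\text{comp}}}{t_{\text{comm}}} &\longrightarrow
  \frac{2}{p\,(t_{\text{startup}} + 4\,t_{\text{data}})}
  \quad (n \to \infty,\ p \text{ fixed})
\end{align*}
```

Under these assumptions the computation/communication ratio tends to a constant as the image grows with a fixed number of processors, so communication dominates unless the per-pixel computation is made heavier.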

O(1) for fixed processors

Mandelbrot Set
Set of points in a complex plane that are quasi-stable (will increase and decrease, but not exceed some limit) when computed by iterating the function

$$z_{k+1} = z_k^2 + c$$

where $z_{k+1}$ is the $(k+1)$th iteration of the complex number $z = a + bi$ and $c$ is a complex number giving the position of the point in the complex plane. The initial value for $z$ is zero. Iterations continue until the magnitude of $z$ is greater than 2 or the number of iterations reaches an arbitrary limit. The magnitude of $z$ is the length of the vector, given by

$$|z| = \sqrt{a^2 + b^2}$$

Sequential routine computing the value of one point, returning the number of iterations

    typedef struct {
        float real;
        float imag;
    } complex;

    int cal_pixel(complex c)
    {
        int count, max;
        complex z;
        float temp, lengthsq;
        max = 256;                  /* iteration limit */
        z.real = 0;
        z.imag = 0;
        count = 0;                  /* number of iterations */
        do {
            temp = z.real * z.real - z.imag * z.imag + c.real;
            z.imag = 2 * z.real * z.imag + c.imag;
            z.real = temp;
            lengthsq = z.real * z.real + z.imag * z.imag;
            count++;
        } while ((lengthsq < 4.0) && (count < max));
        return count;
    }
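
To produce an image, the routine has to be called once per pixel with the pixel's position mapped onto the complex plane. The following is a minimal sketch of that mapping. The function `mandelbrot`, its parameters `real_min`, `real_max`, `imag_min`, `imag_max`, and the type name `complex_t` (used here to avoid clashing with the `complex` macro from `<complex.h>`) are assumptions for illustration.

```c
#define MAX_ITER 256

typedef struct { float real; float imag; } complex_t;

/* Same iteration as the sequential routine: z = z*z + c, starting at 0. */
int cal_pixel(complex_t c)
{
    complex_t z = { 0.0f, 0.0f };
    float temp, lengthsq;
    int count = 0;
    do {
        temp = z.real * z.real - z.imag * z.imag + c.real;
        z.imag = 2 * z.real * z.imag + c.imag;
        z.real = temp;
        lengthsq = z.real * z.real + z.imag * z.imag;
        count++;
    } while (lengthsq < 4.0f && count < MAX_ITER);
    return count;
}

/* Fill counts[y * width + x] with the iteration count of the point
   that pixel (x, y) represents in the given window of the plane. */
void mandelbrot(int width, int height,
                float real_min, float real_max,
                float imag_min, float imag_max,
                int *counts)
{
    float scale_real = (real_max - real_min) / width;
    float scale_imag = (imag_max - imag_min) / height;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            complex_t c;
            c.real = real_min + x * scale_real;
            c.imag = imag_min + y * scale_imag;
            counts[y * width + x] = cal_pixel(c);
        }
    }
}
```

Each call to `cal_pixel` depends only on its own `c`, which is what makes the computation embarrassingly parallel; the iteration counts differ widely from pixel to pixel, which matters for how work is assigned.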

[Figure: Mandelbrot set]

Parallelizing Mandelbrot Set Computation

Static Task Assignment
Simply divide the region into a fixed number of parts, each computed by a separate processor. Not very successful, because different regions require different numbers of iterations and therefore different amounts of time.

Dynamic Task Assignment
Have processors request new regions after computing their previous regions.

Dynamic Task Assignment: Work Pool/Processor Farms
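
The difference between the two assignment schemes can be explored with a small simulation. This sketch makes several assumptions: one image row is one task, the total iteration count of a row is used as a stand-in for its compute time, and the image size, process count, and function names are all hypothetical. It simulates the schedules sequentially rather than running real processes.

```c
#define WIDTH    64
#define HEIGHT   64
#define NPROCS   4
#define MAX_ITER 256

/* Iteration count for the point c = cr + ci*i. */
static int cal_pixel(float cr, float ci)
{
    float zr = 0.0f, zi = 0.0f, lengthsq;
    int count = 0;
    do {
        float temp = zr * zr - zi * zi + cr;
        zi = 2 * zr * zi + ci;
        zr = temp;
        lengthsq = zr * zr + zi * zi;
        count++;
    } while (lengthsq < 4.0f && count < MAX_ITER);
    return count;
}

/* cost[y] = total iterations needed for row y of the window [-2,2]x[-2,2]. */
void row_costs(long cost[HEIGHT])
{
    for (int y = 0; y < HEIGHT; y++) {
        cost[y] = 0;
        for (int x = 0; x < WIDTH; x++)
            cost[y] += cal_pixel(-2.0f + 4.0f * x / WIDTH,
                                 -2.0f + 4.0f * y / HEIGHT);
    }
}

/* Static assignment: each process gets one contiguous block of rows.
   Returns the load of the most heavily loaded process. */
long static_makespan(const long cost[HEIGHT])
{
    long worst = 0;
    int rows = HEIGHT / NPROCS;
    for (int p = 0; p < NPROCS; p++) {
        long load = 0;
        for (int y = p * rows; y < (p + 1) * rows; y++)
            load += cost[y];
        if (load > worst)
            worst = load;
    }
    return worst;
}

/* Dynamic assignment (work pool): the next row always goes to the
   process that becomes idle first, i.e. the one with the least load. */
long dynamic_makespan(const long cost[HEIGHT])
{
    long load[NPROCS] = { 0 };
    long worst = 0;
    for (int y = 0; y < HEIGHT; y++) {
        int idle = 0;
        for (int p = 1; p < NPROCS; p++)
            if (load[p] < load[idle])
                idle = p;
        load[idle] += cost[y];
    }
    for (int p = 0; p < NPROCS; p++)
        if (load[p] > worst)
            worst = load[p];
    return worst;
}
```

Comparing `static_makespan` with `dynamic_makespan` for the same `cost` array shows how far each scheme is from the ideal of total work divided by `NPROCS`. The work pool can never be worse than that ideal by more than the cost of one row, whereas a static block has no such bound.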

Analysis

Monte Carlo Methods
Another embarrassingly parallel computation. Monte Carlo methods make use of random selections.

Circle formed within a 2 x 2 square. Ratio of area of circle to area of square given by:

$$\frac{\text{Area of circle}}{\text{Area of square}} = \frac{\pi (1)^2}{2 \times 2} = \frac{\pi}{4}$$

Points within the square are chosen randomly. A score is kept of how many points happen to lie within the circle. The fraction of points within the circle will be $\pi/4$, given a sufficient number of randomly selected samples.
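
A minimal sequential sketch of this estimate in C follows. It assumes the standard library's `rand()` is an acceptable source of randomness for illustration; the function names and the `seed` parameter are hypothetical.

```c
#include <stdlib.h>

/* Pseudorandom value in [-1, 1], covering one side of the 2 x 2 square. */
static double rand_coord(void)
{
    return 2.0 * rand() / (double)RAND_MAX - 1.0;
}

/* Estimate pi from n random points in the 2 x 2 square centred on the
   origin: the fraction landing inside the unit circle approaches pi/4. */
double estimate_pi(long n, unsigned int seed)
{
    long in_circle = 0;
    srand(seed);
    for (long i = 0; i < n; i++) {
        double x = rand_coord();
        double y = rand_coord();
        if (x * x + y * y <= 1.0)
            in_circle++;
    }
    return 4.0 * (double)in_circle / (double)n;
}
```

The samples are independent, so the loop could be split across processes with each one counting its own hits and the master summing the counts, provided each process draws from its own random number stream.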

Computing an Integral
One quadrant can be described by the integral

$$\int_0^1 \sqrt{1 - x^2}\,dx = \frac{\pi}{4}$$

Random pairs of numbers $(x_r, y_r)$ are generated, each between 0 and 1. A pair is counted as in the circle if $y_r \le \sqrt{1 - x_r^2}$; that is, $y_r^2 + x_r^2 \le 1$.

Alternative (better) Method
Use random values of $x$ to compute $f(x)$ and sum the values of $f(x)$:

$$\text{area} = \int_{x_1}^{x_2} f(x)\,dx = \lim_{N \to \infty} \frac{1}{N} \sum_{r=1}^{N} f(x_r)\,(x_2 - x_1)$$

where $x_r$ are randomly generated values of $x$ between $x_1$ and $x_2$. The Monte Carlo method is very useful if the function cannot be integrated numerically (for example, because it has a large number of variables).

Example
Computing the integral

$$I = \int_{x_1}^{x_2} (x^2 - 3x)\,dx$$

Sequential Code

    sum = 0;
    for (i = 0; i < N; i++) {            /* N random samples */
        xr = rand_v(x1, x2);             /* generate next random value */
        sum = sum + xr * xr - 3 * xr;    /* compute f(xr) */
    }
    area = (sum / N) * (x2 - x1);

Routine rand_v(x1, x2) returns a pseudorandom number between x1 and x2.
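
The fragment above can be completed into a compilable sketch. The slides do not show how `rand_v` is implemented, so the version here, built on the standard `rand()`, is an assumption, as are the function `mc_integrate` and its `seed` parameter.

```c
#include <stdlib.h>

/* Assumed implementation: pseudorandom number between x1 and x2. */
double rand_v(double x1, double x2)
{
    return x1 + (x2 - x1) * (rand() / (double)RAND_MAX);
}

/* The integrand from the example: f(x) = x^2 - 3x. */
double f(double x)
{
    return x * x - 3 * x;
}

/* Monte Carlo estimate of the integral of f over [x1, x2]:
   the mean of n random samples of f times the interval width. */
double mc_integrate(double x1, double x2, long n, unsigned int seed)
{
    double sum = 0.0;
    srand(seed);
    for (long i = 0; i < n; i++) {
        double xr = rand_v(x1, x2);   /* generate next random value */
        sum += f(xr);                 /* accumulate f(xr) */
    }
    return (sum / n) * (x2 - x1);
}
```

For the interval from 0 to 3 the exact value is $3^3/3 - 3 \cdot 3^2/2 = -4.5$, which gives a reference point for checking how the estimate converges as the number of samples grows.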

To parallelize Monte Carlo code, one must address the best way to generate random numbers in parallel; see the textbook.
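
The transcript does not say which technique the textbook recommends. One known option, shown here only as an illustration, is the leapfrog method for a linear congruential generator $x_{i+1} = (a x_i + c) \bmod m$: with $k$ processes, each process jumps $k$ steps at a time using $x_{i+k} = (A x_i + C) \bmod m$, where $A = a^k$ and $C = c(a^{k-1} + \dots + a + 1)$, so the processes share one sequence without overlapping. The constants, the modulus $2^{32}$ (obtained through `uint32_t` wraparound), and all names below are assumptions.

```c
#include <stdint.h>

/* A commonly quoted pair of LCG constants for modulus 2^32 (assumed choice). */
#define LCG_A 1664525u
#define LCG_C 1013904223u

/* One step of the base generator; uint32_t arithmetic wraps mod 2^32. */
uint32_t lcg_next(uint32_t x)
{
    return LCG_A * x + LCG_C;
}

/* State of one process's stream: current value and k-step jump constants. */
typedef struct {
    uint32_t x;
    uint32_t a_k;   /* a^k mod 2^32 */
    uint32_t c_k;   /* c * (a^(k-1) + ... + a + 1) mod 2^32 */
} Stream;

/* Set up the stream for process `rank` (0-based) out of k processes.
   The stream starts at element rank of the sequence that follows seed. */
Stream stream_init(uint32_t seed, int rank, int k)
{
    Stream s;
    s.a_k = 1;
    s.c_k = 0;
    for (int i = 0; i < k; i++) {       /* compose k single steps */
        s.a_k = s.a_k * LCG_A;
        s.c_k = s.c_k * LCG_A + LCG_C;
    }
    s.x = seed;
    for (int i = 0; i <= rank; i++)     /* advance to this rank's start */
        s.x = lcg_next(s.x);
    return s;
}

/* Return the current value, then jump k positions ahead. */
uint32_t stream_next(Stream *s)
{
    uint32_t value = s->x;
    s->x = s->a_k * s->x + s->c_k;
    return value;
}
```

Interleaving the outputs of the $k$ streams reproduces the sequential sequence exactly, so the parallel run uses the same numbers a sequential run would. Whether that sequence is of good enough statistical quality is a separate question that this sketch does not address.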