Published by Stuart Ford. Modified over 9 years ago.
1
Parallel Strategies

Partitioning consists of the following steps
–Divide the problem into parts
–Compute each part separately
–Merge the results

Divide and Conquer
–Divide the problem recursively into sub-problems of the same type
–Assign sub-problems to individual processors (e.g. Save and hold)

Domain (Data) Decomposition
–Assign parts of the data to separate processors

Functional Decomposition
–Assign application functions to separate processors
2
Partitioning and Divide and Conquer

Example Applications
–Summing Numbers
–Sorting Algorithms
–Numerical Integration
–The N-body problem
–Bucket Sort
–Adaptive Quadrature

[Figure: two addition trees, titled "Partitioning" and "Divide and Conquer"; labels: "Divide and Hold Half", "Partition", "Divide, compute and merge"]
3
Bucket Sort

Partitioning
–Communication can be reduced if each processor sends one small bucket to every other processor
–Bucket Sort works well if the numbers are uniformly distributed across a known interval (e.g. 0->1)

[Figure: two panels, "Sequential Bucket Sort" and "Parallel Bucket Sort", each showing unsorted numbers placed into buckets and then sorted; the parallel panel assigns buckets to processors starting at P0]
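The bucket sort described above can be sketched as follows. This is a minimal illustration, not the slide author's code: it assumes every value lies in the interval [0, 1), and the "processors" of the parallel version are simulated with ordinary lists. The function names `bucket_sort` and `parallel_bucket_sort` are made up for this example.

```python
def bucket_sort(nums, buckets=4):
    """Sequential bucket sort; assumes every value is in [0, 1)."""
    bins = [[] for _ in range(buckets)]
    for x in nums:
        # The bucket index grows with x, so the buckets are ordered.
        bins[min(int(x * buckets), buckets - 1)].append(x)
    out = []
    for b in bins:
        out.extend(sorted(b))
    return out


def parallel_bucket_sort(nums, procs=4):
    """Simulated parallel bucket sort with `procs` processors."""
    # Step 1: give each processor a contiguous slice of the input.
    chunk = (len(nums) + procs - 1) // procs
    parts = [nums[i * chunk:(i + 1) * chunk] for i in range(procs)]

    # Step 2: each processor splits its slice into `procs` small buckets.
    small = [[[] for _ in range(procs)] for _ in range(procs)]
    for src, part in enumerate(parts):
        for x in part:
            small[src][min(int(x * procs), procs - 1)].append(x)

    # Step 3: small bucket `dst` of every processor goes to processor `dst`,
    # which merges what it receives and sorts it locally.
    result = []
    for dst in range(procs):
        big = [x for src in range(procs) for x in small[src][dst]]
        result.extend(sorted(big))
    return result
```

Step 3 is where the communication happens in a real message-passing program; here it is only a nested loop over lists.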
4
All-To-All Broadcast

–Bucket Sort is a possible application for the all-to-all broadcast
–All-to-all is also useful for transposing matrices

[Figure: the buffer of processor i, with one entry for each of the processors P0, P1, P2, ..., Pi, ..., P-1]
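The link between the all-to-all exchange and matrix transposition can be shown with a small simulation. This is a sketch under the assumption that processor `src` holds one buffer entry per destination; `all_to_all` is a hypothetical helper, not a call from any message-passing library.

```python
def all_to_all(send_buffers):
    """Simulate an all-to-all exchange among p processors.

    send_buffers[src][dst] is the item processor `src` sends to `dst`.
    The result recv[dst][src] is the item `dst` received from `src`.
    """
    p = len(send_buffers)
    return [[send_buffers[src][dst] for src in range(p)] for dst in range(p)]


# If processor i holds row i of a matrix, the exchange transposes it.
matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
transposed = all_to_all(matrix)
```

For bucket sort, the entries would be the small buckets rather than single numbers.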
5
Divide and Conquer Sum of N Numbers

Two Conditions Required for Recursive Solutions
–How does the recursion terminate?
–How does a problem of size n relate to a problem of size < n?

Pseudo code
  If (less than two numbers) return sum
  Divide the problem into two parts
  Recursive call to sum the first part
  Recursive call to sum the second part
  Merge the two partial sums and return the total

Parallel implementation with eight processors
–P0 keeps half and sends half to P4
–P0,P4 keep half and send half to P2,P6 respectively
–P0,P2,P4,P6 keep half and send half to P1,P3,P5,P7 respectively
–Perform the computation in parallel

The merge phase
–Non-leaf processors receive and reduce results
–Non-root processors send their results to the parent processor
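The recursive pseudo code and the eight-processor distribution can be sketched together. This is an illustrative simulation, not the slide author's implementation: processors are dictionary keys, "sending" is a dictionary assignment, and `tree_sum` assumes the processor count is a power of two. Both function names are made up for this example.

```python
def recursive_sum(nums):
    """Divide and conquer sum, following the slide's pseudo code."""
    if len(nums) < 2:                      # termination condition
        return nums[0] if nums else 0
    mid = len(nums) // 2                   # divide into two parts
    return recursive_sum(nums[:mid]) + recursive_sum(nums[mid:])  # merge


def tree_sum(nums, procs=8):
    """Simulate the divide and merge phases on `procs` processors."""
    held = {0: list(nums)}                 # P0 starts with all the data
    step = procs // 2
    while step >= 1:                       # divide: keep half, send half
        for rank in list(held):
            chunk = held[rank]
            mid = len(chunk) // 2
            held[rank] = chunk[:mid]
            held[rank + step] = chunk[mid:]   # P0->P4, then P0->P2, P4->P6, ...
        step //= 2

    # Every processor sums its own part (this is the parallel step).
    partial = {rank: recursive_sum(chunk) for rank, chunk in held.items()}

    step = 1
    while step < procs:                    # merge: children send to parents
        for rank in range(0, procs, 2 * step):
            partial[rank] += partial.pop(rank + step)
        step *= 2
    return partial[0]                      # the root, P0, holds the total
```

For example, `tree_sum(range(1, 101))` walks the same P0, P4, P2, P6, ... pattern listed above and leaves the total on P0.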
6
Numerical Integration

[Figure: the area under a curve between a and b, approximated with rectangles and with trapezoids of width δ spanning p to q]

Difficulties
–How do we choose the value for δ?
–Some parts of the integral require a smaller δ than others
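The two approximations in the figure can be compared with a short sketch. It assumes a fixed δ = (b - a) / n and uses the left edge of each strip for the rectangle height; the slide does not say which edge is used, and the function names are made up for this example.

```python
def rectangle_rule(f, a, b, n):
    """Sum of n rectangles of width delta, height taken at the left edge."""
    delta = (b - a) / n
    return sum(f(a + i * delta) * delta for i in range(n))


def trapezoid_rule(f, a, b, n):
    """Sum of n trapezoids of width delta."""
    delta = (b - a) / n
    return sum(
        (f(a + i * delta) + f(a + (i + 1) * delta)) * delta / 2
        for i in range(n)
    )


def square(x):
    return x * x


# The exact integral of x^2 over [0, 1] is 1/3.
rect = rectangle_rule(square, 0.0, 1.0, 1000)
trap = trapezoid_rule(square, 0.0, 1.0, 1000)
```

With the same δ the trapezoids land closer to 1/3 than the rectangles here, which is one reason the choice of δ depends on the rule and on how sharply the function bends.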
7
Adaptive Quadrature

Pseudo code
  p = a, δ = b-a
  WHILE p < b
    DO
      (p+δ > b) ? q = b : q = p+δ
      x = (p+q)/2
      Compute A, B, and C
      IF C > tolerance
        δ /= 2
    WHILE C > tolerance
    p += δ; δ *= 2

Notes and Questions
–When do we terminate?
–Termination rates differ
–Can we balance processor load?

[Figure: two panels showing the areas A, B, and C over the interval from p to q with midpoint x; in the second panel C = 0]
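The loop above can be turned into a runnable sketch. The slide does not define A, B, and C, so this version makes an assumption: A and B are the trapezoids on the two halves [p, x] and [x, q], and C is the gap between their sum and the single trapezoid over [p, q]. The tolerance is applied per interval, and `adaptive_quadrature` is a name made up for this example.

```python
def trapezoid(f, lo, hi):
    """Area of one trapezoid under f between lo and hi."""
    return (f(lo) + f(hi)) * (hi - lo) / 2


def adaptive_quadrature(f, a, b, tolerance=1e-9):
    """Integrate f over [a, b], shrinking delta where the curve bends."""
    p, delta, total = a, b - a, 0.0
    while p < b:
        while True:
            q = b if p + delta > b else p + delta
            x = (p + q) / 2
            area_a = trapezoid(f, p, x)             # A (assumed meaning)
            area_b = trapezoid(f, x, q)             # B (assumed meaning)
            area_c = abs(trapezoid(f, p, q) - (area_a + area_b))  # C
            if area_c <= tolerance:
                break
            delta /= 2                              # too coarse: halve delta
        total += area_a + area_b
        p = q                                       # advance past [p, q]
        delta *= 2                                  # try a wider step next
    return total


def square(x):
    return x * x


estimate = adaptive_quadrature(square, 0.0, 1.0)
```

Setting `p = q` instead of `p += δ` gives the same position except on the last, clipped interval, where it avoids stepping past b. The inner loop runs a different number of times for each interval, which is the uneven termination rate the notes ask about.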
8
Parallel Numerical Integration

Sequential Algorithm
  Choose a δ
  For each region, x_i, in the integral
    Calculate sum += f(x_i) * δ

Parallel algorithm
–Static Assignment (Question: How to choose δ?)
  Send a region to each processor
  Processors perform the computation in parallel
  A reduce (add) operation computes the final result
–Dynamic Assignment
  Adaptive Quadrature varies the convergence rates
  Use the work pool approach for assigning regions
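The static assignment can be sketched with a thread pool standing in for the processors. This is an assumption-laden illustration: x_i is taken at the midpoint of each strip, the regions are equal-width slices, and the helper names are made up. Threads in CPython will not speed up pure Python arithmetic, so the sketch shows the structure rather than the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def integrate_region(f, lo, hi, delta):
    """Sequential algorithm on one region: sum += f(x_i) * delta."""
    n = max(1, round((hi - lo) / delta))   # number of strips in this region
    width = (hi - lo) / n                  # actual strip width, close to delta
    return sum(f(lo + (i + 0.5) * width) * width for i in range(n))


def parallel_integrate(f, a, b, delta, workers=4):
    """Static assignment: one equal-width region per worker, then reduce."""
    span = (b - a) / workers
    regions = [(a + k * span, a + (k + 1) * span) for k in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(
            pool.map(lambda r: integrate_region(f, r[0], r[1], delta), regions)
        )
    return sum(partials)                   # the reduce (add) step


def square(x):
    return x * x


result = parallel_integrate(square, 0.0, 1.0, delta=0.001)
```

A dynamic assignment would replace the fixed `regions` list with a work pool that hands out a new region whenever a worker finishes.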
9
Gravitational N-Body Problem

–Predict positions and movements of bodies in space
–For astrophysics and molecular dynamics
–Based on the Newtonian laws of physics

Formulae
–$F = G m_x m_y / r_{xy}^2$
–$F = m a$

Notation
–$G$ = Gravitational constant
–$m_x, m_y$ = mass of bodies x, y
–$r_{xy}$ = distance between x, y
–$a$ = acceleration
–$F$ = force between bodies

3 Dimension Force
–$F = G m_x m_y r_x / r_{xy}^3$
–$r_x$ = distance in the x direction
10
The N-body problem (astronomical systems, electrical charges, etc.)

Sequential Solution Pseudo Code
  For each time step, t
    Compute pair-wise forces ($F_x = G m_a m_b (x_a - x_b) / r^3$)
    Compute acceleration on each body ($F = m a$)
    Compute velocity for each body ($v_{t+1} = v_t + a \Delta t$)
    Compute new position of each body ($x_{t+1} = x_t + v_{t+1} \Delta t$)

Parallel Solution Notes
–Partition the bodies among the processors
–Communication costs are relatively high
–This $O(n^2)$ algorithm doesn't scale well to large numbers of bodies
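One time step of the sequential pseudo code can be sketched as below. The data layout (a list of dictionaries with `mass`, `pos`, and `vel`) and the name `step` are assumptions made for this example. The separation is taken from body i toward body j so that the force on i is attractive, and the opposite force is applied to j.

```python
import math

G = 6.674e-11  # gravitational constant in SI units


def step(bodies, dt, g=G):
    """Advance every body by one time step of length dt (in place)."""
    n = len(bodies)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]

    # Pair-wise forces: n(n-1)/2 pairs, hence the O(n^2) cost.
    for i in range(n):
        for j in range(i + 1, n):
            d = [bodies[j]["pos"][k] - bodies[i]["pos"][k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            scale = g * bodies[i]["mass"] * bodies[j]["mass"] / r ** 3
            for k in range(3):
                forces[i][k] += scale * d[k]   # i is pulled toward j
                forces[j][k] -= scale * d[k]   # equal and opposite on j

    for body, force in zip(bodies, forces):
        for k in range(3):
            a = force[k] / body["mass"]        # F = m a
            body["vel"][k] += a * dt           # v_{t+1} = v_t + a dt
            body["pos"][k] += body["vel"][k] * dt  # x_{t+1} = x_t + v_{t+1} dt


# Two unit masses one unit apart, with g = 1 to keep the numbers readable.
pair = [
    {"mass": 1.0, "pos": [0.0, 0.0, 0.0], "vel": [0.0, 0.0, 0.0]},
    {"mass": 1.0, "pos": [1.0, 0.0, 0.0], "vel": [0.0, 0.0, 0.0]},
]
step(pair, dt=0.1, g=1.0)
```

In a parallel version each processor would own a slice of `bodies`, but it would still need the positions of all the others for the force loop, which is where the communication cost comes from.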
11
Barnes and Hut Solution

–Treat distant clusters as a single body
–$N \lg N$ complexity instead of $N^2$

Pseudo code
  FOR each time step, t
    Perform recursive division
    All-to-all the essential tree
    Perform parallel calculations
    Output visualization data

Questions
–What is the best way to partition the n bodies?
–Should the partitioning be dynamic or static?

[Figure: 2-dimensional recursive division of space; a distant cluster is replaced by its center of mass at distance r]
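The idea of treating a distant cluster as a single body can be checked numerically. This sketch covers only that approximation, not the recursive division or the tree exchange of the full method. Bodies are 2-dimensional `(mass, x, y)` tuples, g is set to 1, and all function names are made up for this example.

```python
import math


def center_of_mass(cluster):
    """Replace a cluster by one body: (total mass, x, y)."""
    total = sum(m for m, _, _ in cluster)
    cx = sum(m * x for m, x, _ in cluster) / total
    cy = sum(m * y for m, _, y in cluster) / total
    return total, cx, cy


def force_on(target, source, g=1.0):
    """Force components on `target` due to `source`."""
    m1, x1, y1 = target
    m2, x2, y2 = source
    dx, dy = x2 - x1, y2 - y1
    r = math.hypot(dx, dy)
    scale = g * m1 * m2 / r ** 3
    return scale * dx, scale * dy


def exact_force(target, cluster, g=1.0):
    """Sum the force from every body in the cluster."""
    fx = fy = 0.0
    for body in cluster:
        dfx, dfy = force_on(target, body, g)
        fx += dfx
        fy += dfy
    return fx, fy


def approximate_force(target, cluster, g=1.0):
    """One force evaluation against the cluster's center of mass."""
    return force_on(target, center_of_mass(cluster), g)


target = (1.0, 0.0, 0.0)
cluster = [(1.0, 100.0, 100.0), (2.0, 101.0, 100.0),
           (1.0, 100.0, 101.0), (3.0, 101.0, 101.0)]
exact = exact_force(target, cluster)
approx = approximate_force(target, cluster)
```

The two results agree closely because the cluster is small compared with its distance from the target; a tree code applies the same substitution at every level of the recursive division.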