Published by Brittney Wilkins. Modified over 9 years ago.
1
Computer Science 320 Reduction
2
Estimating π
Throw N darts at the unit square, and let C be the number of darts that land within the quadrant of the unit circle.
Then C / N should be about the same as the ratio of the quadrant's area to the square's area.
The circle's area is π * R^2, so with R = 1 the quadrant's area is π / 4, and the square's area is 1.
Then C / N ≈ π / 4, and π ≈ 4 * C / N.
3
Sequential Program PiSeq
    // Generate n random points in the unit square, count how many are in
    // the unit circle.
    count = 0;
    for (long i = 0; i < N; ++i) {
        double x = prng.nextDouble();
        double y = prng.nextDouble();
        if (x * x + y * y <= 1.0) ++count;
    }
    // Stop timing.
    time += System.currentTimeMillis();
    // Print results.
    System.out.println("pi = 4 * " + count + " / " + N + " = " + (4.0 * count / N));
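The slide shows only the core of PiSeq, so here is a minimal self-contained sketch of the same loop. It assumes `java.util.Random` in place of the course library's PRNG, and the class name `PiSeqSketch`, the method `estimatePi`, and the seed value are hypothetical names chosen for illustration.

```java
import java.util.Random;

public class PiSeqSketch {
    // Estimate pi by sampling n points in the unit square and counting
    // how many fall inside the quadrant of the unit circle.
    // Assumption: java.util.Random stands in for the PRNG used on the slide.
    static double estimatePi(long n, long seed) {
        Random prng = new Random(seed);
        long count = 0;
        for (long i = 0; i < n; ++i) {
            double x = prng.nextDouble();
            double y = prng.nextDouble();
            if (x * x + y * y <= 1.0) ++count;
        }
        // C / N approximates pi / 4, so pi is about 4 * C / N.
        return 4.0 * count / n;
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        System.out.println("pi is approximately " + estimatePi(n, 142857L));
    }
}
```

The same seed always gives the same estimate, which makes it easy to compare this sequential result with a parallel version.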
4
Parallel Program PiSmp3
    new ParallelTeam().execute (new ParallelRegion() {
        public void run() throws Exception {
            execute (0, N-1, new LongForLoop() {
                // Set up per-thread PRNG and counter.
                Random prng_thread = Random.getInstance(seed);
                long count_thread = 0;
                // Extra padding to avert cache interference.
                long pad0, pad1, pad2, pad3, pad4, pad5, pad6, pad7;
                long pad8, pad9, pada, padb, padc, padd, pade, padf;
                // Parallel loop body.
                public void run (long first, long last) {
                    // Skip PRNG ahead to index
                    prng_thread.setSeed(seed);
                    prng_thread.skip(2 * first);
                    // Generate random points.
                    for (long i = first; i <= last; ++i) {
                        double x = prng_thread.nextDouble();
                        double y = prng_thread.nextDouble();
                        if (x * x + y * y <= 1.0) ++count_thread;
                    }
5
Reduction Step, SMP-Style
    static SharedLong count;
    ...
    public void finish() {
        // Reduce per-thread counts into shared count.
        count.addAndGet(count_thread);
    }
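The per-thread counter plus a single `addAndGet` at the end can be sketched with only the standard library. This is an illustration under assumptions, not the Parallel Java code from the slides: `AtomicLong` stands in for `SharedLong`, plain `Thread` objects stand in for `ParallelTeam`, and `SplittableRandom.split()` replaces the skip-ahead PRNG, so the random points differ from those of the sequential program. The names `PiSmpSketch` and `countHits` are hypothetical.

```java
import java.util.SplittableRandom;
import java.util.concurrent.atomic.AtomicLong;

public class PiSmpSketch {
    // Count the points that land in the quadrant, using `threads` worker threads.
    // Each thread keeps a private counter and touches the shared counter once.
    static long countHits(long n, int threads, long seed) {
        AtomicLong count = new AtomicLong(0);          // shared reduction variable
        SplittableRandom root = new SplittableRandom(seed);
        Thread[] team = new Thread[threads];
        long chunk = n / threads;
        for (int t = 0; t < threads; ++t) {
            final long first = t * chunk;
            final long last = (t == threads - 1) ? n - 1 : first + chunk - 1;
            // Assumption: split() gives each thread its own PRNG stream
            // (the slides use skip-ahead on one sequence instead).
            final SplittableRandom prngThread = root.split();
            team[t] = new Thread(() -> {
                long countThread = 0;                  // per-thread counter
                for (long i = first; i <= last; ++i) {
                    double x = prngThread.nextDouble();
                    double y = prngThread.nextDouble();
                    if (x * x + y * y <= 1.0) ++countThread;
                }
                // Reduction step: one atomic update per thread.
                count.addAndGet(countThread);
            });
            team[t].start();
        }
        for (Thread th : team) {
            try {
                th.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return count.get();
    }

    public static void main(String[] args) {
        long n = 4_000_000L;
        long hits = countHits(n, 4, 142857L);
        System.out.println("pi is approximately " + (4.0 * hits / n));
    }
}
```

Updating the shared counter once per thread, rather than once per point, keeps contention on the shared variable negligible.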
6
Monte Carlo Design for a Cluster
Could keep a global counter in process 0, but that would involve too many messages
Use reduction instead, so message passing is minimal
Each process has its own PRNG, with its own split sequence
7
Reduction vs Gather
Could allocate an array of K cells for results, where the i-th process's result is in the i-th cell; then gather these into process 0 and let process 0 reduce the end result from these
Instead, the reduce method employs all processes in computing the reduction
8
Reduction in Cluster
Concentrate data into fewer and fewer processes
When K = 8,
– processes 4-7 send their data to processes 0-3
– processes 2-3 send their results to processes 0-1
– process 1 sends its results to process 0
At most log2(K) rounds of messages, because the messages within each round are sent in parallel!
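The halving pattern above can be simulated in a single process to check the round count. This is a sketch under assumptions: an array cell `data[r]` stands in for the value held by process r, a "send" is just an addition into the receiver's cell, and the names `TreeReduceSketch` and `reduceSum` are hypothetical. It does no real message passing.

```java
public class TreeReduceSketch {
    // Simulate a tree reduction with SUM as the operator.
    // data[r] is the value held by process r; on return, data[0] holds the
    // reduced result. Returns the number of message rounds used.
    static int reduceSum(long[] data) {
        int rounds = 0;
        int active = data.length;            // processes still holding partial results
        while (active > 1) {
            int half = (active + 1) / 2;     // receivers are ranks 0 .. half-1
            // In a real cluster these sends would all happen in parallel.
            for (int r = half; r < active; ++r) {
                data[r - half] += data[r];   // process r "sends" to process r - half
            }
            active = half;
            ++rounds;
        }
        return rounds;
    }

    public static void main(String[] args) {
        long[] data = {1, 2, 3, 4, 5, 6, 7, 8};   // K = 8 per-process results
        int rounds = reduceSum(data);
        System.out.println("rounds = " + rounds + ", sum = " + data[0]);
    }
}
```

For K = 8 this takes 3 rounds and leaves the sum 36 in cell 0, with processes 4-7, then 2-3, then 1 sending in turn, as in the list above.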
9
Reduction Tree for K = 8
Messages are sent in parallel at each level, starting at the bottom
When results have been computed, messages are sent from the next level
10
Example: Add the Results
[Figures: initial state; state after the first set of messages]
11
Example: Add the Results
[Figures: state after the second set of messages; state after the third set of messages]
12
It’s Automatic: reduce
    world.reduce(0, buf, LongOp.SUM);

    // Compute the count in each processor
    ...
    // Perform the reduction step
    LongItemBuf buf = new LongItemBuf();
    buf.item = count;
    world.reduce(0, buf, LongOp.SUM);
    count = buf.item;
    ...
    if (rank == 0)
        // Output the count and the estimate of PI
13
Reduction in Mandelbrot Histogram
    int[] histogram = new int[maxiter + 1];
    ...
    world.reduce(0, IntegerBuf.buffer(histogram), IntegerOp.SUM);
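When the buffer wraps a whole array, a SUM reduction would be expected to combine the arrays cell by cell, so that cell i of the result is the sum of cell i from every process. The sketch below illustrates that outcome in a single process. It is an assumption about the intended result, not the library's implementation, and the names `HistogramReduceSketch` and `reduceSum` are hypothetical.

```java
import java.util.Arrays;

public class HistogramReduceSketch {
    // Combine per-process histograms element by element with SUM.
    // perProcess[r] is the histogram computed by process r; all rows are
    // assumed to have the same length (maxiter + 1 on the slide).
    static int[] reduceSum(int[][] perProcess) {
        int[] total = new int[perProcess[0].length];
        for (int[] histogram : perProcess) {
            for (int i = 0; i < total.length; ++i) {
                total[i] += histogram[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] perProcess = {
            {1, 0, 2},   // process 0
            {0, 3, 1},   // process 1
            {4, 1, 0}    // process 2
        };
        System.out.println(Arrays.toString(reduceSum(perProcess)));
    }
}
```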