Computer Science 320 Load Balancing for Hybrid SMP/Clusters.


1 Computer Science 320 Load Balancing for Hybrid SMP/Clusters

2 Load Balancing Strategies
For SMP, use a dynamic schedule to break the work into smaller chunks that keep the threads continually busy.
For a cluster, use the master/worker pattern with a dynamic schedule to keep the nodes continually busy.
For a hybrid, put several worker threads in each node and schedule them as in the cluster program.
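
The SMP strategy can be illustrated with a minimal sketch in plain Java rather than the Parallel Java library used in these slides. The class name DynamicScheduleSketch, the method run, and its parameters are hypothetical; the "work" is only a row count, standing in for the real computation.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a dynamic schedule: each thread repeatedly claims
// the next unclaimed chunk of rows until no rows remain.
final class DynamicScheduleSketch {
    // Processes rows 0..rows-1 in chunks of `chunk` rows using `threads`
    // threads. Returns the number of rows each thread ended up processing.
    static int[] run(int rows, int chunk, int threads) {
        AtomicInteger next = new AtomicInteger(0);    // first unclaimed row
        int[] rowsDone = new int[threads];
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; ++t) {
            final int id = t;
            pool[t] = new Thread(() -> {
                for (;;) {
                    int lb = next.getAndAdd(chunk);   // claim a chunk
                    if (lb >= rows) break;            // no more work
                    int ub = Math.min(lb + chunk, rows) - 1;
                    rowsDone[id] += ub - lb + 1;      // stand-in for real work
                }
            });
            pool[t].start();
        }
        try {
            for (Thread th : pool) th.join();         // wait for all threads
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return rowsDone;
    }
}
```

A thread that finishes a cheap chunk simply claims another, so no thread sits idle while rows remain. How the rows divide among threads varies from run to run; only the total is fixed.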

3 One-Level Scheduling Strategy
[Figure: one-level scheduling diagrams, with panels labeled Cluster and Hybrid]

4 Hybrid Mandelbrot Set Program
Each of Kp nodes has Kt worker threads.
Node 0 has one extra thread (the master).
Each worker thread is numbered, from 0 to Kt * Kp - 1.
The master thread communicates with all worker threads; message tags identify them.
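
As a small sketch of this numbering, the worker number can be computed from a process rank and a thread index, and mapped back again. The class and method names here are hypothetical; only the formula rank * Kt + thread index comes from the program.

```java
// Hypothetical helper illustrating the worker numbering 0 .. Kt * Kp - 1.
final class WorkerIds {
    // Worker number for thread `threadIndex` in process `rank`,
    // with kt worker threads per process.
    static int workerId(int rank, int threadIndex, int kt) {
        return rank * kt + threadIndex;
    }

    // Process rank that owns a given worker number.
    static int processOf(int worker, int kt) {
        return worker / kt;
    }

    // Thread index of a given worker number within its process.
    static int threadOf(int worker, int kt) {
        return worker % kt;
    }
}
```

With Kt = 4, for example, thread 1 of process 2 is worker 9, and worker 9 maps back to process 2, thread 1.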

5 Set Up and Run the Threads
ParallelTeam team = new ParallelTeam(rank == 0 ? Kt+1 : Kt);

// Every parallel team thread runs the worker section, except thread Kt
// (which exists only in process 0) runs the master section.
team.execute(new ParallelRegion(){
    public void run() throws Exception{
        if (getThreadIndex() == Kt)
            masterSection();
        else
            workerSection(rank * Kt + getThreadIndex());
    }
});

The workerSection method takes a parameter that identifies the thread in messages to and from the master thread.

6 Scheduling the Threads in the Master
private static void masterSection() throws IOException{
    int process, thread, worker;
    Range range;

    // Set up a schedule object to divide the row range into chunks.
    IntegerSchedule schedule = IntegerSchedule.runtime();
    schedule.start(K, new Range(0, height-1));

    // Send initial chunk range to each worker. If range is null, no more
    // work for that worker. Keep count of active workers.
    int activeWorkers = K; // (Kp * Kt)
    for (process = 0; process < Kp; ++ process)
        for (thread = 0; thread < Kt; ++ thread){
            // The braces keep all four statements inside the inner loop.
            worker = process * Kt + thread;
            range = schedule.next(worker);
            world.send(process, worker, ObjectBuf.buffer(range));
            if (range == null) --activeWorkers;
        }

7 Scheduling the Threads in the Master
private static void masterSection() throws IOException{
    int process, thread, worker;
    Range range;

    // Repeat until all workers have finished.
    while (activeWorkers > 0){
        // Receive an empty message from any worker.
        CommStatus status = world.receive(null, null, IntegerBuf.emptyBuffer());
        process = status.fromRank;
        worker = status.tag;

        // Send next chunk range to that specific worker.
        // If null, no more work.
        range = schedule.next(worker);
        world.send(process, worker, ObjectBuf.buffer(range));
        if (range == null) --activeWorkers;
    }
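
The master's protocol (one initial chunk per worker, then one new chunk per completion report, with a "no more work" marker ending each worker) can be tried out without a cluster. The following is a sketch only: it replaces the message-passing calls with in-memory queues from java.util.concurrent, and every name in it (MasterWorkerSketch, NO_MORE_WORK, nextRange, run) is hypothetical rather than part of the Parallel Java library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical in-memory imitation of the master/worker exchange.
final class MasterWorkerSketch {
    // Stands in for the null range that tells a worker to stop.
    private static final int[] NO_MORE_WORK = new int[0];

    // Hands out the next {lb, ub} row range, or NO_MORE_WORK when exhausted.
    private static int[] nextRange(int[] cursor, int rows, int chunk) {
        if (cursor[0] >= rows) return NO_MORE_WORK;
        int lb = cursor[0];
        int ub = Math.min(lb + chunk, rows) - 1;
        cursor[0] = ub + 1;
        return new int[] { lb, ub };
    }

    // Returns the number of rows each worker processed.
    static int[] run(int rows, int chunk, int workers) {
        List<BlockingQueue<int[]>> inbox = new ArrayList<>();
        BlockingQueue<Integer> finished = new LinkedBlockingQueue<>();
        int[] rowsDone = new int[workers];
        Thread[] pool = new Thread[workers];
        for (int w = 0; w < workers; ++w) {
            final int id = w;
            final BlockingQueue<int[]> mine = new LinkedBlockingQueue<>();
            inbox.add(mine);
            pool[w] = new Thread(() -> {
                try {
                    for (;;) {
                        int[] range = mine.take();          // receive chunk
                        if (range == NO_MORE_WORK) break;
                        rowsDone[id] += range[1] - range[0] + 1;
                        finished.put(id);                   // report completion
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[w].start();
        }
        try {
            int[] cursor = { 0 };
            int active = workers;
            // Send an initial chunk to every worker.
            for (int w = 0; w < workers; ++w) {
                int[] range = nextRange(cursor, rows, chunk);
                inbox.get(w).put(range);
                if (range == NO_MORE_WORK) --active;
            }
            // Answer each completion report with the next chunk.
            while (active > 0) {
                int w = finished.take();
                int[] range = nextRange(cursor, rows, chunk);
                inbox.get(w).put(range);
                if (range == NO_MORE_WORK) --active;
            }
            for (Thread th : pool) th.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return rowsDone;
    }
}
```

The per-worker inbox plays the role of the message tag: the master answers exactly the worker that reported completion. The sketch also covers the case of more workers than chunks, where some workers receive the stop marker immediately.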

8 Worker Thread Activity: Receive
private static void workerSection(int worker) throws IOException{
    // Image, writer, matrix, and row slice variables are now local here.
    ...
    for (;;){
        // Receive chunk range from master. If null, no more work.
        ObjectItemBuf<Range> rangeBuf = ObjectBuf.buffer();
        world.receive(0, worker, rangeBuf);
        Range range = rangeBuf.item;
        if (range == null) break;
        int lb = range.lb();
        int ub = range.ub();
        int len = range.length();

        // Allocate storage for matrix row slice if necessary.
        if (slice == null || slice.length < len)
            slice = new int [len] [width];

        // Code to compute rows and columns of slice goes here.

9 Worker Thread Activity: Send
private static void workerSection(int worker) throws IOException{
    // Image, writer, matrix, and row slice variables are now local here.
    ...
    for (;;){
        // Receive chunk range from master. If null, no more work.
        ObjectItemBuf<Range> rangeBuf = ObjectBuf.buffer();
        world.receive(0, worker, rangeBuf);
        Range range = rangeBuf.item;
        if (range == null) break;
        ...
        // Report completion of slice to master.
        world.send(0, worker, IntegerBuf.emptyBuffer());

        // Set full pixel matrix rows to refer to slice rows.
        System.arraycopy(slice, 0, matrix, lb, len);

        // Write row slice of full pixel matrix to image file.
        writer.writeRowSlice(range);
    }

10 One-Level Scheduling Performance
With one master and Kt * Kp workers, many messages are needed just to schedule them all.
Two-level scheduling:
–One worker per node, but each worker uses multiple threads
–Two schedules, one from the master for each worker and one from each worker for its threads
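
A rough message count makes the difference concrete. Under the master/worker protocol of the one-level program, each chunk costs two messages (the range sent to a worker and the worker's completion report), and each worker also receives one final "no more work" message. The sketch below counts only these scheduling messages; the class name, method name, and the example numbers (1200 rows, Kp = 4, Kt = 4, chunk sizes of 1 and 100) are assumptions chosen for illustration, not figures from the course.

```java
// Hypothetical estimate of scheduling messages in the master/worker protocol.
final class SchedulingMessages {
    // rows:    number of rows to schedule
    // chunk:   rows per chunk handed out by the master
    // workers: number of workers the master talks to
    static long count(long rows, long chunk, long workers) {
        long chunks = (rows + chunk - 1) / chunk;   // ceiling division
        // Per chunk: one range message plus one completion message.
        // Per worker: one final "no more work" message.
        return 2 * chunks + workers;
    }
}
```

With these assumed numbers, one-level scheduling with 16 workers and single-row chunks needs count(1200, 1, 16) = 2416 messages, while two-level scheduling with 4 node-level workers and 100-row chunks needs count(1200, 100, 4) = 28. Distributing rows among the threads inside a node uses shared memory and adds no messages.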

11 Two-Level Scheduling

12 Changes to Program
The master uses a schedule with a chunk size of 100; each worker uses a schedule with a chunk size of 1.
The master node has two parallel sections as well as a worker team.
No worker tags are needed.
The master section has no other changes.
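
The two chunk sizes can be pictured as one splitting routine applied twice: once by the master over the whole row range with chunk size 100, and once by each worker over the chunk it received with chunk size 1. This is a plain-Java sketch with hypothetical names (TwoLevelChunks, chunks); it uses a fixed chunk size and does not reproduce the Parallel Java schedule classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that splits an inclusive row range into fixed-size chunks.
final class TwoLevelChunks {
    // Splits lb..ub (inclusive) into {lo, hi} pairs of at most `size` rows.
    static List<int[]> chunks(int lb, int ub, int size) {
        List<int[]> out = new ArrayList<>();
        for (int lo = lb; lo <= ub; lo += size) {
            out.add(new int[] { lo, Math.min(lo + size - 1, ub) });
        }
        return out;
    }
}
```

For a hypothetical image of 250 rows, chunks(0, 249, 100) gives the node-level chunks 0..99, 100..199, and 200..249; a worker that receives 200..249 would then hand its threads the 50 single-row chunks from chunks(200, 249, 1).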

13 Set Up and Run the Threads
// In master process, run master section and worker section in parallel.
if (rank == 0)
    new ParallelTeam(2).execute(new ParallelRegion(){
        public void run() throws Exception{
            execute(
                new ParallelSection(){
                    public void run() throws Exception{
                        masterSection();
                    }
                },
                new ParallelSection(){
                    public void run() throws Exception{
                        workerSection();
                    }
                });
        }
    });
// In worker process, run only worker section.
else
    workerSection();

14 Worker Thread Activity
private static void workerSection() throws IOException{
    // Image, writer, matrix, and row slice variables are now local here.
    ...
    // Parallel team to calculate each slice in multiple threads.
    ParallelTeam team = new ParallelTeam();
    for (;;){
        // Receive chunk range from master. If null, no more work.
        ObjectItemBuf<Range> rangeBuf = ObjectBuf.buffer();
        world.receive(0, rangeBuf);
        Range range = rangeBuf.item;
        if (range == null) break;
        final int lb = range.lb();
        final int ub = range.ub();
        final int len = range.length();

        // Allocate storage for matrix row slice if necessary.
        if (slice == null || slice.length < len)
            slice = new int [len] [width];

15 Worker Thread Activity
private static void workerSection() throws IOException{
    // Image, writer, matrix, and row slice variables are now local here.
    ...
    // Parallel team to calculate each slice in multiple threads.
    ParallelTeam team = new ParallelTeam();
    for (;;){
        ...
        // Compute rows of slice in parallel threads.
        team.execute(new ParallelRegion(){
            public void run() throws Exception{
                execute(lb, ub, new IntegerForLoop(){
                    // Use the thread-level loop schedule.
                    public IntegerSchedule schedule(){
                        return thrschedule;
                    }
                    // Compute all rows and columns in slice.
                    public void run(int first, int last){
                        for (int r = first; r <= last; ++ r){
                            // Yadah, yadah, yadah

