Dynamic Load Balancing and Job Replication in a Global-Scale Grid Environment: A Comparison IEEE Transactions on Parallel and Distributed Systems, Vol.


1 Dynamic Load Balancing and Job Replication in a Global-Scale Grid Environment: A Comparison
IEEE Transactions on Parallel and Distributed Systems, Vol. 20, No. 2, February 2009
Menno Dobber, Student Member, IEEE, Rob van der Mei, and Ger Koole
Presented by Chen, Ting-Wei

2 Index (2009.05.07, Chen, Ting-Wei)
- Introduction
- Preliminaries
- Experimental Setup
- Experimental Results
- Conclusions

3 Introduction (cont.)
- Dynamics of grid environments
- Dynamic Load Balancing (DLB)
- Job Replication (JR)
- Easy-to-measure statistic Y
- Corresponding threshold value Y*
- If Y > Y*, DLB outperforms JR
- If Y < Y*, JR outperforms DLB
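The threshold rule on this slide can be written as a one-line decision. The sketch below is only an illustration: the function name `choose_strategy` is hypothetical, Y and Y* are taken as already-measured numbers (the slide does not say how they are computed), and the tie case Y = Y* is not covered by the slide, so falling back to JR there is an arbitrary assumption.

```python
def choose_strategy(y: float, y_star: float) -> str:
    """Return "DLB" when the statistic Y exceeds the threshold Y*, else "JR".

    Both arguments are assumed to be given; how Y and Y* are obtained
    is not described on the slide.
    """
    # Assumption: the slide leaves Y == Y* undefined; this sketch picks JR there.
    return "DLB" if y > y_star else "JR"


if __name__ == "__main__":
    print(choose_strategy(2.0, 1.0))
    print(choose_strategy(0.5, 1.0))
```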

4 Introduction (cont.)
- Easy-to-implement approach
- Make dynamic decisions about whether to use DLB or JR
- Two types of investigation can verify the approach accurately
  - Trace-driven simulation
  - Real implementation

5 Introduction (cont.)
- Real implementation
  - To acquire more knowledge about DLB
- By means of trace-driven simulations
  - Require detailed knowledge about the processes
  - Take less time
  - More extensive analyses can be performed

6 Introduction (cont.)
- Analyze and compare the effectiveness of ELB, DLB, and JR
- Using trace-driven simulations
- Using data gathered from a global-scale grid testbed

7 Preliminaries (cont.)
- Bulk Synchronous Processing (BSP)
- The problem can be divided into subproblems or jobs
- I iterations, P jobs, P processors
- Each processor receives one job per iteration
- After computing the jobs, all the processors send their data and wait for each other's data before the next iteration starts
- The standard BSP program is implemented according to the ELB principle
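Because every processor waits for all others before the next iteration starts, the slowest processor determines the length of each iteration. A minimal sketch of that consequence, assuming communication time is ignored and that `job_times[i][p]` (a hypothetical input layout) holds the time processor p needs for its job in iteration i:

```python
from typing import Sequence


def bsp_runtime(job_times: Sequence[Sequence[float]]) -> float:
    """Total runtime of a BSP run with one equal-sized job per processor.

    job_times[i][p] is the job time of processor p in iteration i.
    Assumption: send/wait overhead is zero, so only computation counts.
    """
    # Each iteration ends only when its slowest processor has finished.
    return sum(max(iteration) for iteration in job_times)


if __name__ == "__main__":
    traces = [[10.0, 12.0, 30.0], [11.0, 40.0, 9.0]]
    print(bsp_runtime(traces))
```

With these made-up numbers the two iterations last 30 and 40 time units, even though most processors finish far earlier; that idle time is what DLB and JR try to reduce.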

8 Preliminaries (cont.)
- Implementations on ELB

9 Preliminaries (cont.)
- Dynamic Load Balancing (DLB)
- DLB starts with the execution of an iteration, which is the same as in BSP
- At the end of each iteration, the processors predict their processing speed for the next iteration
- Select one processor to be the DLB scheduler
- After every N iterations, the processors send their predictions to this scheduler

10 Preliminaries (cont.)
- The scheduler processor calculates the "optimal" distribution
- It sends the relevant information to each processor
- All processors redistribute the load
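The slides do not say how the scheduler computes the "optimal" distribution. One common choice, assumed here purely for illustration, is to give each processor a share of the load proportional to its predicted speed, so that all processors are expected to finish at the same moment. The function name `dlb_shares` and the input format are hypothetical.

```python
from typing import List, Sequence


def dlb_shares(predicted_job_times: Sequence[float]) -> List[float]:
    """Fraction of the total load assigned to each processor.

    predicted_job_times[p] is the predicted time processor p would need
    for one equal-sized job in the next iteration.
    Assumption: "optimal" means shares proportional to predicted speed.
    """
    if not predicted_job_times or min(predicted_job_times) <= 0:
        raise ValueError("predicted job times must be positive and non-empty")
    # Speed is the inverse of the predicted time for an equal-sized job.
    speeds = [1.0 / t for t in predicted_job_times]
    total = sum(speeds)
    return [s / total for s in speeds]


if __name__ == "__main__":
    # A processor predicted to be twice as fast receives twice the load.
    print(dlb_shares([10.0, 20.0, 20.0]))
```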

11 Preliminaries (cont.)
- Implementations on DLB

12 Preliminaries (cont.)
- Job Replication (JR)
- Two copies of a job
- R copies of all P jobs are distributed to the P processors
- When a processor has finished one of the copies, it sends a message to the other processors
- The other processors can then kill that job and start the next job
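The finish, notify, and kill behaviour described above can be sketched as a small event loop for one group of R processors sharing R replicated jobs. Everything beyond the slide text is an assumption made for illustration: each processor starts with a different job and then cycles through the remaining ones, messages arrive instantly, work on a killed copy is discarded, and each processor has a single constant job time. The function name `group_iteration_time` is hypothetical.

```python
from typing import Sequence


def group_iteration_time(full_job_times: Sequence[float]) -> float:
    """Time until a group of R processors has completed all R replicated jobs.

    full_job_times[p] is the time processor p needs for one complete job
    (assumed constant; a real trace would vary per job).
    """
    r = len(full_job_times)
    if r == 0:
        raise ValueError("need at least one processor")
    done = [False] * r
    current = list(range(r))   # processor p starts with the copy of job p
    started = [0.0] * r        # time at which p started its current copy
    while True:
        # The next event is the earliest completion among the running copies.
        finish = [started[p] + full_job_times[p] for p in range(r)]
        first = min(range(r), key=finish.__getitem__)
        now = finish[first]
        finished_job = current[first]
        done[finished_job] = True
        if all(done):
            return now
        # Every processor holding a copy of the finished job drops it
        # (the winner is done, the others kill it) and starts the next open job.
        for p in range(r):
            if current[p] == finished_job:
                nxt = (current[p] + 1) % r
                while done[nxt]:
                    nxt = (nxt + 1) % r
                current[p] = nxt
                started[p] = now


if __name__ == "__main__":
    # Two copies (R = 2): a fast processor (10) paired with a slow one (30).
    print(group_iteration_time([10.0, 30.0]))
```

In the example the fast processor completes its own job at time 10, then starts the slow processor's job and completes it at time 20, before the slow processor would have finished at time 30. Without replication the pair would have needed 30 time units.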

13 Preliminaries (cont.)
- Implementations on JR

14 Experimental Setup (cont.)
- Data-Collection Procedure

15 Experimental Setup (cont.)
- On a completely available Pentium 4, 3.0-GHz processor, the computations in the jobs would take 10000 ms
- Set one: average job time of 72500 ms
  - Distributed within the USA
  - More coherence between the generated datasets
- Set two: average job time of 65000 ms
  - Shows more burstiness and larger differences between the average job times on the processors
  - Globally distributed

16 Experimental Setup (cont.)
- Trace-driven simulation analyses [parameter formulas not preserved in the transcript]

17 Experimental Setup (cont.)
- Simulation Details
- Trace-driven DLB simulations
- Assume a linear relation between job size and job time in BSP
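The linear assumption is what lets a trace of equal-sized jobs be reused for unequal loads: if a processor handles a fraction s of the total load instead of 1/P, its traced job time is scaled by s·P. A minimal sketch of that scaling, where the function name and the input layout are hypothetical and communication time is ignored:

```python
from typing import Sequence


def dlb_iteration_time(job_times: Sequence[float], shares: Sequence[float]) -> float:
    """Iteration time when processor p handles the load fraction shares[p].

    job_times[p] is the traced time of processor p for an equal-sized job,
    that is, for a share of 1/P. Assumption: job time grows linearly with
    job size, as stated on the slide.
    """
    if len(job_times) != len(shares) or not job_times:
        raise ValueError("job_times and shares must be non-empty and equally long")
    p_count = len(job_times)
    # A share s corresponds to s * P times the size of an equal-sized job.
    scaled = [t * s * p_count for t, s in zip(job_times, shares)]
    # The iteration still ends when the slowest processor has finished.
    return max(scaled)


if __name__ == "__main__":
    times = [10.0, 20.0, 20.0]
    print(dlb_iteration_time(times, [1 / 3, 1 / 3, 1 / 3]))  # equal shares
    print(dlb_iteration_time(times, [0.5, 0.25, 0.25]))      # speed-proportional shares
```

With equal shares the iteration lasts as long as the slowest traced job; with shares proportional to speed all three processors finish together and the iteration is shorter.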

18 Experimental Setup (cont.)
- DLB simulation
  1. Randomly select a resource set
  2. The DES-based prediction
  3. Derive the IT
  4. Derive the runtime of the R-JR
  5. Derive the expected runtime of a DLB run
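The slide does not expand "DES" or describe the predictor. Assuming it refers to an exponential-smoothing style forecast of the next job time, the idea can be illustrated with plain exponential smoothing and a fixed weight. This is a generic textbook technique and not necessarily the variant used by the authors; the function name and the default `alpha` are hypothetical.

```python
from typing import Sequence


def exp_smoothing_forecast(history: Sequence[float], alpha: float = 0.5) -> float:
    """Forecast the next job time from past job times.

    Plain exponential smoothing with a fixed weight alpha in (0, 1].
    Assumption: this stands in for the "DES-based prediction" step,
    whose details are not given on the slide.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    forecast = history[0]
    for observed in history[1:]:
        # Blend the newest observation with the previous forecast.
        forecast = alpha * observed + (1.0 - alpha) * forecast
    return forecast


if __name__ == "__main__":
    print(exp_smoothing_forecast([10.0, 20.0, 30.0]))
```

A larger `alpha` follows recent job times more closely, which suits bursty processors; a smaller one smooths out short spikes.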

19 Experimental Setup (cont.)
- JR simulation
  1. Same as step one of the DLB simulation
  2. Divide the set of processors into execution groups
  3. Derive the effective job times for all P processors
  4. Derive the IT by repeating step two R times
  5. Derive the runtime of the R-JR run by repeating step three
  6. Derive the expected runtime of an R-JR run on P processors

20 Experimental Setup (cont.)
- Dynamic Selection Method
- Analysis

21 Experimental Results (cont.)
- Simulate the runtimes of DLB for different numbers of processors with sets one and two
- Simulate runs of BSP parallel applications that use JR and analyze the expected speedups for different numbers of processors, numbers of replicas, data sets, and CCR values

22 Experimental Results (cont.)
- Compare the runtimes and the speedups of ELB, DLB, and JR
- Simulate the speedups of the proposed selection method

23 Experimental Results (cont.)
- DLB

24 Experimental Results (cont.)
- Job Replication

25 Experimental Results (cont.)
- Comparison of ELB, DLB, and JR
- Runtimes of DLB and JR with CCR 0.01

26 Experimental Results (cont.)
- Speedups of DLB and JR with sets of 40 and 90 data sets with CCR 0.01

27 Experimental Results (cont.)
- Statistic Y against ITs of DLB and JR

28 Experimental Results (cont.)
- Speedup of selection method, DLB and JR

29 Conclusions
- Made an extensive assessment and comparison between DLB and JR
- If Y > Y*, DLB outperforms JR
- If Y < Y*, JR outperforms DLB
- Propose the so-called DLB/JR method

30 Outlook
- Bring the results to a higher level of realism
- Make use of mathematical techniques to provide a more solid foundation
- Determine the optimal number of job replicas needed to obtain the best speedup performance

31 Thanks for your attention

