Scheduling Algorithms Performance Evaluation in Grid Environments
R. Zhang, C. Koelbel, K. Kennedy
Motivation
Effective scheduling is critical for the performance of an application launched onto a Grid (distributed) environment.
Many scheduling algorithms have been proposed, studied, and compared, but few studies compare their performance in Grid environments.
The Grid environment has the unique property of drastic cost differences between inter-cluster and intra-cluster data transfers.
Two schemes
In this paper, we compare several scheduling algorithms that represent two major schemes: level-based and list-based algorithms.
The list-based approach first orders the nodes in the DAG by a pre-calculated priority (e.g., b-level or critical path). The scheduler then considers the nodes in that order, assigning each to the resource that minimizes a suitable cost function (makespan).
The level-based methods first organize the DAG into levels in which all the nodes are independent and then schedule the jobs within each level.
We also examine some hybrid versions.
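The two preprocessing steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary-based DAG representation, and the choice to ignore communication costs in the b-level are all assumptions made for illustration.

```python
from collections import defaultdict


def b_levels(succ, cost):
    """b-level of a node: longest path from the node to an exit node,
    counting the node's own cost (communication costs ignored here)."""
    memo = {}

    def visit(node):
        if node not in memo:
            children = succ.get(node, [])
            memo[node] = cost[node] + max((visit(c) for c in children), default=0)
        return memo[node]

    for node in cost:
        visit(node)
    return memo


def levels(succ, nodes):
    """Group nodes by depth (longest edge count from an entry node).
    Nodes that share a depth have no dependencies on one another."""
    preds = defaultdict(list)
    for parent, children in succ.items():
        for child in children:
            preds[child].append(parent)

    depth = {}

    def visit(node):
        if node not in depth:
            depth[node] = 1 + max((visit(p) for p in preds[node]), default=-1)
        return depth[node]

    grouped = defaultdict(list)
    for node in nodes:
        grouped[visit(node)].append(node)
    return [grouped[k] for k in sorted(grouped)]


# Hypothetical four-task DAG: A feeds B and C, which both feed D.
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
cost = {"A": 2, "B": 3, "C": 1, "D": 2}

bl = b_levels(succ, cost)
list_order = sorted(cost, key=lambda n: -bl[n])  # list-based: highest b-level first
level_groups = levels(succ, cost)                # level-based: independent groups
```

For this small DAG, the list-based order is A, B, C, D, and the level decomposition is [A], [B, C], [D].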
Algorithms
Heterogeneous Earliest Finish Time (HEFT) scheduling algorithm
  Uses b-level to sort
  A version of MCP adapted to heterogeneous resources
Levelized Heuristic Based Scheduling (LHBS) algorithm
  Greedy heuristic (simple LHBS)
  Min-min heuristic (sophisticated LHBS)
  Min-max heuristic (sophisticated LHBS)
  Sufferage heuristic (sophisticated LHBS)
Hybrid Heuristic Scheduling (HHS) algorithm
  Different combinations are possible; we use LHHS
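As an illustration of how one of the level-based heuristics works, here is a minimal sketch of the min-min heuristic applied to the independent tasks of a single level. It assumes a hypothetical table of per-resource execution times and ignores data-transfer costs, which the schedulers evaluated here would have to account for; it is a simplified textbook form, not the code used in the experiments.

```python
def min_min(exec_time, num_resources):
    """Min-min: repeatedly pick the (task, resource) pair with the smallest
    completion time among all pending tasks, then commit that assignment.

    exec_time[task][r] is the (assumed known) run time of task on resource r.
    Returns the schedule as (task, resource, finish_time) tuples and the makespan.
    """
    ready = [0.0] * num_resources  # time at which each resource becomes free
    pending = set(exec_time)
    schedule = []
    while pending:
        # Minimum over tasks of each task's minimum completion time.
        # Ties are broken by task name, then resource index.
        finish, task, res = min(
            (ready[r] + exec_time[t][r], t, r)
            for t in pending
            for r in range(num_resources)
        )
        ready[res] = finish
        pending.remove(task)
        schedule.append((task, res, finish))
    return schedule, max(ready)


# Hypothetical level with three independent tasks and two resources.
times = {"t1": [4, 6], "t2": [2, 5], "t3": [3, 1]}
schedule, makespan = min_min(times, 2)
```

The sufferage heuristic follows the same loop but, as it is usually defined, picks the task that would lose the most if it were denied its best resource.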
Experimental Environment
We defined and generated the experimental environment, including the universe of compute and network resources and the DAGs representing different applications.
Three resource universes:
  The universal environment: over 18,000 processors in 500 clusters
  The many-cluster environment: around 300 processors in 20 clusters
  The big-cluster environment: around 300 processors in 4 clusters
DAG Design
We use DAGs taken from two real Grid applications (EMAN and Montage) and three classes of artificially generated DAGs that abstract certain characteristics of these applications.
The DAG generator can produce DAGs in different formats. Currently, we support fully random, level, and choke formats. The generator also takes other parameters.
For each DAG, we have a cost model instance that estimates the performance of the tasks in that DAG.
Experiment Design
We produced our DAGs with the following parameters:
  Type = {random, level, choke}
  Total number of nodes = {300, 1000, 3000}
  Shape = {0.5, 1.0, 5.0}
  Average out-degree = {1.0, 2.0, 5.0}
We generated 5 random DAGs for each possible parameter combination. In addition, we used 30 EMAN and 30 Montage DAGs.
For each of those DAGs, we applied our cost model with the following parameters:
  DataSize = {20, 1000}, {100, 1000}, {500, 1000}
  CCR = {0.1, 1.0, 10}
  Complexity = {0.85, 1.15}, {0.6, 1.4}, {0.15, 1.85}
In total, our experiments schedule and evaluate over 10,000 DAG/environment combinations.
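The size of this parameter space can be checked with a short enumeration. The sketch below assumes that every cost-model setting is applied to every DAG as a full cross product, and it reads the second Complexity pair as {0.6, 1.4}. The slides do not state exactly how the "over 10,000" figure is counted, so the totals here are illustrative only.

```python
from itertools import product

# DAG generator parameters as listed on the slide.
dag_types = ["random", "level", "choke"]
node_counts = [300, 1000, 3000]
shapes = [0.5, 1.0, 5.0]
out_degrees = [1.0, 2.0, 5.0]

dag_configs = list(product(dag_types, node_counts, shapes, out_degrees))
random_dags = len(dag_configs) * 5   # 5 random DAGs per combination
total_dags = random_dags + 30 + 30   # plus 30 EMAN and 30 Montage DAGs

# Cost model parameters as listed on the slide.
data_sizes = [(20, 1000), (100, 1000), (500, 1000)]
ccrs = [0.1, 1.0, 10]
complexities = [(0.85, 1.15), (0.6, 1.4), (0.15, 1.85)]

cost_configs = list(product(data_sizes, ccrs, complexities))

# Assumption: a full cross product of DAGs and cost-model settings.
dag_cost_instances = total_dags * len(cost_configs)

print(len(dag_configs), random_dags, total_dags)  # 81 405 465
print(len(cost_configs), dag_cost_instances)      # 27 12555
```

Under that assumption there are 465 DAGs and 27 cost-model settings, or 12,555 DAG instances, which is consistent with the "over 10,000" figure.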
Performance
Different DAG types
Different CCR
Different Resources
Cluster Settings
Cluster fitting
Constrained Resource
Current work
Look ahead
  Idea: Instead of dispatching the task to the resource that results in the earliest finish time, try to find the resource that will give the earliest finish time for the child-level tasks.
  Drawback: The complexity rises to O(VP^2*d), where d is the average out-degree.
Dynamic sorting
  Idea: Update the t-level of each task after its parents are scheduled, then schedule the tasks in one level in order of t-level + b-level.
  Drawback: The preliminary results didn't show any advantage over HEFT or LHHS.
Cluster constraint
  Idea: Select the clusters to schedule on. We introduced the aggregated computing power property of a cluster.
  Drawback: It doesn't perform much better when the DAG is computationally intensive.
  Hope: It works very well in other situations, outperforming the one-level version of the same algorithm by as much as 50%. To complement this, we can simply choose the fastest clusters in computationally intensive cases.
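The dynamic sorting idea can be sketched as follows. This is only one possible reading: the function name is hypothetical, the t-level is approximated as the latest finish time among already scheduled parents (communication costs ignored), and the descending sort order is an assumption, since the slide does not state the direction.

```python
def dynamic_priority_order(level_tasks, preds, finish_time, b_level):
    """Order the tasks of one level by (dynamic t-level + b-level), highest first.

    finish_time holds the actual finish times of already scheduled parents,
    so the t-level reflects the schedule built so far, not a static estimate.
    """
    def t_level(task):
        parents = preds.get(task, [])
        return max((finish_time[p] for p in parents), default=0.0)

    # Ties are broken by task name to keep the order deterministic.
    return sorted(level_tasks, key=lambda t: (-(t_level(t) + b_level[t]), t))


# Hypothetical level: C depends on A, D depends on B.
preds = {"C": ["A"], "D": ["B"]}
finish_time = {"A": 2.0, "B": 6.0}   # parents already placed by the scheduler
b_level = {"C": 5.0, "D": 3.0}

order = dynamic_priority_order(["C", "D"], preds, finish_time, b_level)
```

In this example a static b-level ordering would place C first, while the dynamic priority places D first because its parent finished late.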