Presentation transcript:

1 Authors: Weiwei Chen, Ewa Deelman. 9th International Conference on Parallel Processing and Applied Mathematics.

2  Introduction  Related work  System design  Experiments and Evaluations  Conclusions

3  Introduction  Related work  System design  Experiments and Evaluations  Conclusions

4  In recent years, scientific workflows have been widely applied in astronomy, seismology, genomics, etc.  This paper aims to address the problem of scheduling large workflows onto multiple execution sites with storage constraints.  We model workflows as Directed Acyclic Graphs (DAGs), where nodes represent computation and directed edges represent data-flow dependencies between nodes.
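To make the DAG model concrete, here is a minimal sketch (not from the original slides) of how such a workflow graph might be represented. The Workflow class, its field names, and the per-job data size are illustrative assumptions that the later sketches also build on.

```python
from collections import defaultdict

# Hypothetical DAG representation: jobs are nodes, directed edges are
# data-flow dependencies from a parent job to a child job.
class Workflow:
    def __init__(self):
        self.children = defaultdict(set)  # job -> set of child jobs
        self.parents = defaultdict(set)   # job -> set of parent jobs
        self.data_size = {}               # job -> data footprint in GB (assumed known)

    def add_job(self, job, data_gb=0.0):
        self.data_size[job] = data_gb
        self.children[job]  # touch entries so sources/sinks appear in the maps
        self.parents[job]

    def add_dependency(self, parent, child):
        self.children[parent].add(child)
        self.parents[child].add(parent)
```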

5  control or data dependencies between jobs  the mapping of jobs in the workflow onto resources that are often distributed in the wide area  data-intensive workflows that require a significant amount of storage ◦ the entire CyberShake earthquake science workflow has 16,000 sub-workflows; each sub-workflow has more than 24,000 individual jobs and requires 58 GB of data.

6  Introduction  Related work  System design  Experiments and Evaluations  Conclusions

7  Heuristic scheduling ◦ HEFT, Min-Min, Max-Min, MCT  These algorithms do not take storage constraints into consideration, and they need to check and schedule every job individually.  Workflow partitioning can be classified as a network cut problem, where a sub-workflow is viewed as a sub-graph.

8  Introduction  Related work  System design  Experiments and Evaluations  Conclusions

9  The site catalog provides information about the available resources.

10

11  The major challenge in partitioning workflows is to avoid cross dependency: a chain of dependencies that forms a cycle in the graph (in this case, a cycle between sub-workflows), which would result in a deadlock loop.
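As an illustration of the cross-dependency check, here is a hedged sketch (building on the hypothetical Workflow class above, not the paper's code): collapse each sub-workflow to a single node and test the resulting quotient graph for a cycle.

```python
from collections import defaultdict

def has_cross_dependency(workflow, partition):
    """partition: job -> sub-workflow id. Returns True if the
    sub-workflows form a dependency cycle (a deadlock loop)."""
    # Build the quotient graph: one node per sub-workflow, with an edge
    # wherever a data dependency crosses sub-workflow boundaries.
    quotient = defaultdict(set)
    for parent, kids in workflow.children.items():
        for child in kids:
            a, b = partition[parent], partition[child]
            if a != b:
                quotient[a].add(b)

    WHITE, GRAY, BLACK = 0, 1, 2
    color = defaultdict(int)

    def dfs(u):
        color[u] = GRAY
        for v in quotient[u]:
            if color[v] == GRAY or (color[v] == WHITE and dfs(v)):
                return True  # back edge: a cycle between sub-workflows
        color[u] = BLACK
        return False

    return any(color[u] == WHITE and dfs(u) for u in list(quotient))
```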

12  Jobs that have parent-child relationships usually share a lot of data, since they have data dependencies.  Three heuristics are proposed to first partition the workflow into sub-workflows.

13  Our heuristic only checks three particular types of nodes: ◦ fan-out: where the output of a job is input to many children ◦ fan-in: where the output of several jobs is aggregated by a child ◦ pipeline nodes: 1 parent, 1 child  Our algorithm reduces the time complexity of the check operations by a factor of n, where n is the average depth of the fan-in-fan-out structure.
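A minimal sketch of how these three node types could be classified; treating a job that is both fan-in and fan-out as "fan-out" is my assumption, not from the slides.

```python
def classify(workflow, job):
    n_parents = len(workflow.parents[job])
    n_children = len(workflow.children[job])
    if n_children > 1:
        return "fan-out"   # output of this job is input to many children
    if n_parents > 1:
        return "fan-in"    # aggregates the output of several parent jobs
    if n_parents == 1 and n_children == 1:
        return "pipeline"  # 1 parent, 1 child
    return "other"         # e.g., isolated source or sink jobs
```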

14  Aggressive search ◦ checks whether it is possible to add the whole fan structure into the sub-workflow  Less-aggressive search ◦ performed on the job's parent jobs; includes all of its predecessors until the search reaches a fan-out job.

15  Conservative search ◦ includes all of its predecessors until the search reaches a fan-in job or a fan-out job.
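The following sketch contrasts the scope of the three searches for one fan-in job. It only shows which predecessors each variant would visit and omits the storage and cross-dependency feasibility checks the real heuristics also perform; the function and its stopping rules are my illustrative reading of the slides.

```python
def search_predecessors(workflow, fan_in_job, mode):
    """mode is 'aggressive', 'less-aggressive', or 'conservative'."""
    seen, stack = set(), list(workflow.parents[fan_in_job])
    while stack:
        job = stack.pop()
        if job in seen:
            continue
        seen.add(job)
        is_fan_out = len(workflow.children[job]) > 1
        is_fan_in = len(workflow.parents[job]) > 1
        if mode == "less-aggressive" and is_fan_out:
            continue  # stop expanding once a fan-out job is reached
        if mode == "conservative" and (is_fan_out or is_fan_in):
            continue  # stop at either a fan-in or a fan-out job
        stack.extend(workflow.parents[job])  # 'aggressive' never stops early
    return seen
```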

16  We assume that the size of each input file and output file is known.

17  adds a job to a sub-workflow if all of its unscheduled children can be added to that sub-workflow without causing cross dependencies or exceeding the storage constraint.

18 1. For a job with multiple children, each child has already been scheduled. 2. After adding this job to the sub-workflow, the data size does not exceed the storage constraint.
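Read together, slides 17 and 18 suggest an admission check along these lines. This is a hedged sketch: scheduled (job -> sub-workflow id), usage (sub-workflow id -> GB in use), and the rule that the children must sit in the same sub-workflow are my assumptions.

```python
def can_add(workflow, job, sub_id, scheduled, usage, constraint_gb):
    # Condition 1: a job with multiple children is only admitted once every
    # child has already been scheduled (here: into this same sub-workflow,
    # so adding the job cannot create a cross dependency).
    kids = workflow.children[job]
    if len(kids) > 1 and any(scheduled.get(c) != sub_id for c in kids):
        return False
    # Condition 2: the sub-workflow must stay within its storage
    # constraint after accounting for this job's data.
    return usage[sub_id] + workflow.data_size[job] <= constraint_gb
```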

19  Critical Path: the longest path through the sub-workflow, weighted by the runtime of each job.  Average CPU Time: the cumulative CPU time of all jobs divided by the number of available resources.  The HEFT estimator uses the calculated earliest finish time of the last sink job as the makespan of a sub-workflow.
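Two of these estimators are simple enough to sketch directly. This is a hedged illustration assuming a runtime dict mapping each job to its runtime; the HEFT estimator is omitted since it requires computing the full HEFT schedule.

```python
def critical_path(workflow, runtime):
    """Longest runtime-weighted path from any source job to any sink."""
    memo = {}
    def longest(job):
        if job not in memo:
            kids = workflow.children[job]
            memo[job] = runtime[job] + (max(map(longest, kids)) if kids else 0.0)
        return memo[job]
    sources = [j for j in runtime if not workflow.parents[j]]
    return max(map(longest, sources))

def average_cpu_time(runtime, n_resources):
    """Cumulative CPU time of all jobs divided by the available resources."""
    return sum(runtime.values()) / n_resources
```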

20  Re-ordering: the partitioning step has already guaranteed that there is a valid mapping  Scheduling algorithm: HEFT, Min-min  There are two differences compared to their original versions: ◦ First, the data transfer cost within a sub-workflow is ignored, since we use a shared file system in our experiments. ◦ Second, the data constraints must be satisfied for each sub-workflow.
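To illustrate the second difference, here is a hedged Min-min-style sketch in which a site is only a candidate for a sub-workflow if it can hold that sub-workflow's data. est_finish, data_gb, and free_gb are illustrative inputs, and the finish-time estimates are kept static for brevity (real Min-min would update them as sites fill up).

```python
def schedule_subworkflows(subs, sites, est_finish, data_gb, free_gb):
    """subs: sub-workflow ids; sites: site ids;
    est_finish[(s, t)]: estimated finish time of sub-workflow s on site t."""
    mapping, unscheduled = {}, set(subs)
    while unscheduled:
        # Min-min style: among storage-feasible (sub-workflow, site) pairs,
        # pick the pair with the minimum estimated finish time.
        candidates = [(est_finish[(s, t)], s, t)
                      for s in unscheduled for t in sites
                      if free_gb[t] >= data_gb[s]]
        _, s, t = min(candidates, key=lambda c: c[0])  # partitioning guarantees non-empty
        mapping[s] = t
        free_gb[t] -= data_gb[s]
        unscheduled.remove(s)
    return mapping
```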

21  Introduction  Related work  System design  Experiments and Evaluations  Conclusions

22  Eucalyptus [14]: infrastructure software that provides on-demand access to Virtual Machine (VM) resources.  The submit host, which performs workflow planning and sends jobs to the execution sites, is a Linux 2.6 machine equipped with 8 GB of RAM and an Intel 2.66 GHz quad-core CPU. 14. Eucalyptus Systems. http://www.eucalyptus.com/

23  We use Condor [6] pools as execution sites.  HTCondor is a specialized workload management system for compute-intensive jobs. Like other full-featured batch systems, HTCondor provides a job queueing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management. 6. M. Litzkow, M. Livny, et al., Condor—A Hunter of Idle Workstations. In Proceedings of the 8th International Conference on Distributed Computing Systems, New York, June 1988.

24  Performance Metrics ◦ Satisfying the Storage Constraints ◦ Improving the Runtime Performance

25  Workflows Used ◦ Montage: an astronomy application used to construct large image mosaics of the sky. ◦ CyberShake: a seismology application that calculates Probabilistic Seismic Hazard curves for several geographic sites in the Southern California area. ◦ Epigenomics: a bioinformatics application that maps short DNA segments collected with high-throughput gene sequencing machines to a reference genome.

26  They were chosen because they represent a wide range of application domains and a variety of resource requirements. ◦ Montage: I/O intensive ◦ CyberShake: memory intensive ◦ Epigenomics: CPU intensive

27  storage constraint: 30 GB  The default workflow has no storage constraint.

28

29  Performance with Different Storage Constraints

30  CyberShake

31  Montage

32  Epigenomics

33

34  The performance with the three workflows shows that this approach is able to satisfy the storage constraints and reduce the makespan significantly, especially for Epigenomics, which has fewer fan-in (synchronization) jobs.  For the workflows we used, scheduling them onto two or three execution sites is best, due to a tradeoff between increased data transfer and increased parallelism.

35 ◦ The Average CPU Time doesn't take the dependencies into consideration. ◦ The Critical Path doesn't consider the resource availability.

36  Introduction  Related work  System design  Experiments and Evaluations  Conclusions

37  Three heuristics are proposed and compared to show the close relationship between cross dependency and runtime improvement.  The performance with three real-world workflows shows that this approach is able to satisfy storage constraints and improve the overall runtime by up to 48% over default whole-workflow scheduling.

38

