1
Workflows and Scheduling in Grids
Ramin Yahyapour, University of Dortmund
Leader, CoreGRID Institute on Resource Management and Scheduling
CoreGRID Summer School, Budapest, 05 September 2007
2
CoreGRID RMS Institute Objective
Objectives:
– Development of a common and generic solution for Grid resource management/scheduling in Next Generation Grids
– Development of new algorithms for coordinated scheduling for all resource types, including data, network etc.
– Support of Grid business models in the scheduling process
Goal: linking theoretical foundation and practical implementation on the different levels of resource management (architecture, algorithms, implementations)
3
Current Institute Roadmap
– Common Scheduling/Brokerage Architecture Model
– Algorithms for coordinated scheduling/negotiation
– Support for SLA Management and Negotiation
– Domain-specific solutions for Computational Grids
– Solutions for Evaluation, Testing, Prediction
4
Participants (20 participating institutes, 89 researchers):
CETIC, Belgium; IPP-BAS, Bulgaria; CNR-ISTI, Italy; CNRS, France; Delft University, Netherlands; EPFL, Switzerland; Fraunhofer Gesellschaft, Germany; Research Center Jülich, Germany; PSNC, Poland; MTA SZTAKI, Hungary; University of Münster, Germany; University of Calabria, Italy; University of Cyprus; University of Dortmund, Germany; University of Manchester, UK; EAI-FR, Switzerland; University of Westminster, UK; Technical University of Catalonia, Spain; Zuse Institute Berlin, Germany; University of Innsbruck, Austria
5
Grid Scheduling
6
Key Question
Which services/resources to use for an activity, and when, where, and how?
Typically, a particular user, business application, or component application needs one or several services/resources for an activity under given constraints:
– Trust & Security
– Timing & Economics
– Functionality & Service level
– Application-specifics & Inter-dependencies
– Scheduling and Access Policies
This question has to be answered in an automatic, efficient, and reliable way; it is part of the invisible and smart infrastructure.
7
Motivation
Resource management for Future/Next Generation Grids! But what are Future Generation Grids? That depends on who you ask:
– HPC Computing: parallel computing, cluster computing, desktop computing
– Enterprise Grids: business services, application servers, SOA/web services
– Ambient Intelligence / Ubiquitous Computing: PDAs, mobile devices
8
Resource Definition
Concluding from the different interpretations of Grid: for broad acceptance, a Grid RMS should probably cover the whole scope. Resources include:
– Compute
– Network
– Storage
– Data
– Software (components, licenses)
– Services (functionality, ability)
Management of some resources is less complex, while other resources require coordination and orchestration to be effective (e.g. hardware and software).
9
Resource Management Layer
A Grid resource management system consists of:
– a local resource management system (Resource Layer): the basic resource management unit, providing a standard interface for using remote resources (e.g. GRAM)
– a global resource management system (Collective Layer): coordinates all local resource management systems within multiple or distributed Virtual Organizations (VOs) and provides high-level functionalities to use all resources efficiently, such as job submission, resource discovery and selection, scheduling, co-allocation and job monitoring (e.g. a meta-scheduler or resource broker)
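As a rough illustration of this split (my own sketch, not from the slides; all class and method names are hypothetical), a global scheduler can be modelled as a component that talks to local resource management systems only through a small standard interface:

    from abc import ABC, abstractmethod

    class LocalRMS(ABC):
        """Resource layer: one local system behind a standard (GRAM-like) interface."""

        @abstractmethod
        def query_state(self) -> dict:
            """Return e.g. {'free_nodes': int, 'queued': int}."""

        @abstractmethod
        def submit(self, job: dict) -> str:
            """Submit a job description and return a local job id."""

    class GlobalScheduler:
        """Collective layer: discovery, selection and submission across local systems."""

        def __init__(self, local_systems):
            self.local_systems = local_systems

        def submit(self, job: dict) -> str:
            # Resource discovery and selection: keep systems with enough free nodes,
            # then pick the one with the most headroom and delegate the submission.
            states = {s: s.query_state() for s in self.local_systems}
            candidates = [s for s, st in states.items() if st["free_nodes"] >= job["nodes"]]
            if not candidates:
                raise RuntimeError("no suitable resource found")
            best = max(candidates, key=lambda s: states[s]["free_nodes"])
            return best.submit(job)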
10
Grid RMS (architecture diagram)
Layered view:
– User/Application
– Higher-Level Services: Resource Broker, Grid Resource Manager
– Core Grid Infrastructure Services (Grid Middleware): Information Services, Monitoring Services, Security Services
– Local Resource Management: PBS, LSF, …
– Resource
11
Grid Scheduling
Diagram: a Grid user submits a job to the Grid scheduler, which dispatches it to the local schedulers of Machines 1–3; each local scheduler maintains its own job queue and schedule over time.
12
Select a Resource for Execution
Most systems do not provide advance information about future job execution:
– user information is not accurate, as mentioned before
– newly arriving jobs may surpass current queue entries due to higher priority
The Grid scheduler might consider the current queue situation, but this does not give reliable information about future executions: a job may wait long in a short queue while it would have been executed earlier on another system.
Available information:
– the Grid information service gives the state of the resources and possibly authorization information
– prediction heuristics estimate a job's wait time on a given resource, based on the current state and the job's requirements
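As a minimal sketch of such a prediction heuristic (my own illustration; the field names and sample numbers are assumptions, not data from the slides), the wait time on a machine can be roughed out from the backlog of user runtime estimates ahead of the new job:

    def estimate_wait_minutes(queued_jobs, total_nodes, free_nodes, job_nodes):
        """Crude wait-time estimate for a new job on one machine.

        queued_jobs: jobs ahead in the queue, each with user-estimated
                     'runtime_min' and 'nodes' (estimates are often inaccurate).
        """
        if free_nodes >= job_nodes:
            return 0.0
        # Approximate the backlog as queued node-minutes spread over all nodes.
        backlog = sum(j["runtime_min"] * j["nodes"] for j in queued_jobs)
        return backlog / max(total_nodes, 1)

    # Example: 3 queued jobs on a 64-node machine; the new job needs 16 nodes.
    queue = [{"runtime_min": 120, "nodes": 32},
             {"runtime_min": 60,  "nodes": 16},
             {"runtime_min": 30,  "nodes": 64}]
    print(estimate_wait_minutes(queue, total_nodes=64, free_nodes=4, job_nodes=16))  # 105.0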
13
Co-allocation
It is often requested that several resources are used for a single job, i.e. the scheduler has to assure that all resources are available when needed:
– in parallel (e.g. visualization and processing)
– with time dependencies (e.g. a workflow)
The task is especially difficult if the resources belong to different administrative domains:
– either the actual allocation time must be known for co-allocation,
– or the different local resource management systems must synchronize with each other (wait for the availability of all resources).
14
Example: Multi-Site Job Execution
A job uses several resources at different sites in parallel; network communication between the sites is an issue.
Diagram: the Grid scheduler places the parts of one multi-site job into the schedules of Machines 1–3 at the same time.
15
Advanced Reservation
Co-allocation and other applications require a priori information about the precise resource availability. With the concept of advanced reservation, the resource provider guarantees a specified resource allocation; this includes a two- or three-phase commit for agreeing on the reservation.
Implementations:
– GARA/DUROC/SNAP provide interfaces for Globus to create advanced reservations
– implementations for network QoS are available, e.g. setup of a dedicated bandwidth between endpoints
– WS-Agreement defines a protocol for agreement management
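A toy sketch of the two-phase commit idea behind such reservations (my own illustration, not GARA or WS-Agreement code; hold/confirm/release are hypothetical provider operations): the requester first obtains tentative holds from every provider and confirms only if all of them succeed.

    def co_reserve(providers, start, duration):
        """Two-phase reservation across several resource providers."""
        tickets = []
        for p in providers:                      # phase 1: tentative holds
            t = p.hold(start, duration)
            if t is None:                        # one provider cannot guarantee the slot
                for q, held in tickets:          # abort: release everything held so far
                    q.release(held)
                return None
            tickets.append((p, t))
        for p, t in tickets:                     # phase 2: confirm all reservations
            p.confirm(t)
        return [t for _, t in tickets]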
16
Using Service Level Agreements
The mapping of jobs to resources can be abstracted using the concept of Service Level Agreements (SLAs).
An SLA is a contract negotiated between
– a resource provider, e.g. a local scheduler, and
– a resource consumer, e.g. a Grid scheduler or an application.
SLAs provide a uniform approach for the client to specify resource and QoS requirements, while hiding from the client details about the resources, such as queue names and current workload.
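To make the abstraction concrete, here is a minimal sketch (an assumed structure, not the WS-Agreement schema) of SLA terms that a Grid scheduler might negotiate with a local scheduler, together with the client-side acceptance check that needs no knowledge of queue names or current workload:

    from dataclasses import dataclass

    @dataclass
    class SLAOffer:
        nodes: int              # number of nodes guaranteed
        earliest_start: float   # time from which the allocation is available
        deadline: float         # completion guaranteed by this time
        price: float            # cost charged if the SLA is accepted

    def acceptable(offer, required_nodes, latest_finish, budget):
        # The consumer only reasons about the guaranteed terms of the contract.
        return (offer.nodes >= required_nodes
                and offer.deadline <= latest_finish
                and offer.price <= budget)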
17
Execution Alternatives
Time sharing: the local scheduler starts multiple processes per physical CPU with the goal of increasing resource utilization (multi-tasking). The scheduler may also suspend jobs to keep the system load under control (preemption).
Space sharing: the job uses the requested resources exclusively; no other job is allocated to the same set of CPUs. The job has to be queued until sufficient resources are free.
18
Job Classifications
Batch jobs vs. interactive jobs:
– batch jobs are queued until execution
– interactive jobs need immediate resource allocation
Parallel vs. sequential jobs:
– a parallel job requires several processing nodes in parallel
The majority of HPC installations are used to run batch jobs in space-sharing mode:
– a job is not influenced by other co-allocated jobs
– the assigned processors, node memory, caches etc. are exclusively available to a single job
– the overhead for context switches is minimized
– these are important aspects for parallel applications
19
Parallel Application Types
– Rigid: requires a fixed number of processors
– Moldable: the number of processors can be adapted only at the start of the execution
– Malleable: the number of assigned processors can be changed during runtime (i.e. grow/shrink)
(Figure: number of processors over time for rigid, moldable and malleable jobs.)
D. G. Feitelson and L. Rudolph, "Toward convergence in job schedulers for parallel supercomputers", in Job Scheduling Strategies for Parallel Processing (JSSPP), 1996
20
Preemption
A job is preempted by interrupting its current execution:
– the job might be put on hold on a CPU set and later resumed; the job then stays resident on those nodes (and keeps consuming memory)
– alternatively, a checkpoint is written and the job is migrated to another resource, where it is restarted later
Preemption can be useful to reallocate resources due to new job submissions (e.g. with higher priority) or if a job runs longer than expected.
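A schematic sketch of such a preemption policy (my own illustration; the job attributes and the suspend/checkpoint calls are hypothetical placeholders for what a real local RMS would provide):

    def handle_arrival(new_job, running_jobs, free_nodes, checkpointable):
        """Decide what to do when a new job arrives and nodes are scarce."""
        if free_nodes >= new_job.nodes:
            return "start"                      # enough free nodes, no preemption needed
        if not running_jobs:
            return "queue"
        victim = min(running_jobs, key=lambda j: j.priority)
        if new_job.priority <= victim.priority:
            return "queue"                      # not important enough to preempt anything
        if victim in checkpointable:
            victim.checkpoint()                 # write state, free the nodes, restart elsewhere later
        else:
            victim.suspend()                    # stays resident on its nodes, keeps its memory
        return "preempt"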
21
Job Scheduling
A job is assigned to resources through a scheduling process that is responsible for
– identifying available resources,
– matching job requirements to resources,
– making decisions about job ordering and priorities.
HPC resources are typically subject to high utilization, so resources are not immediately available and jobs are queued for future execution:
– the time until execution is often quite long (many production systems have an average delay until execution of more than one hour)
– jobs may run for a long time (several hours, days or weeks)
22
Typical Scheduling Objectives
– Minimizing the average weighted response time
– Maximizing machine utilization / minimizing idle time
These are conflicting objectives; the criterion is usually static for an installation and given implicitly by the scheduling algorithm.
Notation: r is the submission time of a job, t its completion time, and w its weight/priority.
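The formula itself did not survive the transcription; using the notation above, the average weighted response time over a set of jobs J is usually defined as

    \mathrm{AWRT} = \frac{\sum_{j \in J} w_j (t_j - r_j)}{\sum_{j \in J} w_j}

i.e. the weighted mean of the time each job spends in the system between submission and completion.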
23
Job Steps
Diagram: a Grid user hands a job description to the management node (job management) of an HPC machine; the job enters the local job queue, is scheduled, and is then executed on the machine.
A user job enters a job queue; the scheduler (its strategy) decides on the start time and the resource allocation of the job.
24
Example of Grid Scheduling Decision Making
A Grid user submits a job to the Grid scheduler, which sees three machines with their local schedulers and queues:
– Machine 1: 15 jobs running, 20 jobs queued
– Machine 2: 5 jobs running, 2 jobs queued
– Machine 3: 40 jobs running, 80 jobs queued
Where to put the Grid job?
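A naive selection rule for this situation (illustrative only; as the next slide explains, queue length alone is not a reliable predictor) could pick the machine with the fewest queued jobs per running job:

    machines = {
        "Machine 1": {"running": 15, "queued": 20},
        "Machine 2": {"running": 5,  "queued": 2},
        "Machine 3": {"running": 40, "queued": 80},
    }

    def pick_machine(machines):
        # queued jobs per running slot as a rough proxy for the expected wait
        return min(machines,
                   key=lambda m: machines[m]["queued"] / max(machines[m]["running"], 1))

    print(pick_machine(machines))   # -> "Machine 2" under this heuristic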
25
Available Information from the Local Schedulers
Decision making is difficult for the Grid scheduler:
– only limited information about the local schedulers is available
– the available information may not be reliable
Possible information:
– queue length, running jobs
– detailed information about the queued jobs (execution length, process requirements, …)
– a tentative schedule of future job executions
This information is often not technically provided by the local scheduler; in addition, it may be subject to privacy concerns.
26
Grid-Level Scheduler
The Grid-level scheduler discovers and selects the appropriate resource(s) for a job. If the selected resources are under the control of several local schedulers, a meta-scheduling action is performed.
Architecture:
– Centralized: all lower-level schedulers are under the control of a single Grid scheduler; not realistic in global Grids
– Distributed: lower-level schedulers are under the control of several Grid scheduler components; a local scheduler may receive jobs from several components of the Grid scheduler
27
Towards Grid Scheduling
Grid scheduling methods:
– support for individual scheduling objectives and policies
– multi-criteria scheduling models
– economic scheduling methods for Grids
Architectural requirements:
– generic job description
– negotiation interface between higher- and lower-level schedulers
– economic management services
– workflow management
– integration of data and network management
28
Scheduling Objectives in the Grid
In contrast to local computing, there is no single general scheduling objective anymore:
– minimizing response time, minimizing cost
– trade-offs between quality, cost, response time etc.
Cost and different service qualities come into play:
– the user will introduce individual objectives
– the Grid can be seen as a market in which resources are competing alternatives
Similarly, the resource provider has individual scheduling policies.
29
Workflow Scheduling
30
Workflows
What is a workflow? Example: a simple job chain
Task 1 → Task 2 → Task 3 → Task 4
Dependencies between tasks/job steps: control and/or data dependencies.
31
Example of a Workflow
A simple workflow from climate research with data dependencies:
– Task 1: select interesting data from the climate archive, producing a data subset
– Task 2: simulate on the data subset, producing new results
– Task 3: visualize the new results
32
Communication/Data Dependencies
Workflows can cover different communication models:
– synchronous (e.g. streaming between multiple active jobs), or
– asynchronous (e.g. via files)
Synchronous communication requires co-allocation of jobs and data-streaming management; asynchronous communication requires file/data management in distributed Grid environments.
33
Impact of Coordinated Scheduling (1)
Consider an application example with a simple workflow consisting of 4 consecutive tasks/steps, each running 5 minutes:
Task 1 → Task 2 → Task 3 → Task 4
Consider also a Grid resource with a batch queuing system (e.g. Torque) that has an average queue waiting time of 60 minutes, and apply just-in-time scheduling (each task is submitted only when its predecessor has finished).
How long will it take to execute the whole workflow?
Task 1 waits for 1 h and runs for 5 minutes; Task 2 is submitted when Task 1 completes and waits in the queue again; all other tasks analogously.
Total = 4 × 1 h + 4 × 5 min = 4 h 20 min
34
Impact of Coordinated Scheduling (2)
How to improve?
– put several steps into the queue and keep them on hold while the preceding step has not finished (this might produce idle times on the resources), or
– use advance reservations (= planning).
How long will it ideally take to execute the whole workflow?
Task 1 waits for 1 h and runs for 5 minutes; Task 2 starts immediately after Task 1; all other tasks analogously.
Total = 1 h + 4 × 5 min = 1 h 20 min
(A worked comparison of the two cases is sketched below.)
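A small sketch of the comparison (my own, using the 60-minute average wait and 5-minute task runtime from the example):

    def makespan_just_in_time(n_tasks, wait_min, run_min):
        # Each task is submitted only after its predecessor finishes,
        # so every task pays the full queue waiting time.
        return n_tasks * (wait_min + run_min)

    def makespan_coordinated(n_tasks, wait_min, run_min):
        # With advance reservations (or pre-queued, held steps that run back to back),
        # only the first task pays the queue waiting time.
        return wait_min + n_tasks * run_min

    print(makespan_just_in_time(4, 60, 5))   # 260 min = 4 h 20 min
    print(makespan_coordinated(4, 60, 5))    # 80 min  = 1 h 20 min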
35
More Complex Workflows (1): Concurrent Activities
Task 1 → {T 2.1, T 2.2, T 2.3} → Task 3 → {T 4.1, T 4.2, T 4.3} → Task 5
(the tasks T 2.x, and likewise T 4.x, can run concurrently)
36
More Complex Workflows (2): Using Loops
Task 1 → Task 2 → Task 3 → Task 4, with a loop that repeats part of the chain until a condition is met.
37
Example: DAGMan (Directed Acyclic Graph Manager)
DAGMan allows you to specify the dependencies between your Condor-G jobs, so it can manage them automatically for you (e.g., don't run job B until job A has completed successfully).
A DAG is defined by a .dag file, listing each of its nodes and their dependencies; for the diamond-shaped DAG A → (B, C) → D:
    # diamond.dag
    Job A a.sub
    Job B b.sub
    Job C c.sub
    Job D d.sub
    Parent A Child B C
    Parent B C Child D
Source: Miron Livny
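For reference (not shown on the slide): such a file is handed to DAGMan with Condor's condor_submit_dag command, e.g. condor_submit_dag diamond.dag, after which DAGMan submits a.sub through d.sub in dependency order and releases D only once both B and C have completed successfully.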
38
Dynamic Workflows vs. Static Workflows
Some workflows are not known in advance; their structure might only be determined at run time (dynamic workflows). Static workflows are known in advance. This distinction has a major impact on planning and scheduling workflows.
39
Promoter Identification Workflow (figure). Source: Matt Coleman (LLNL)
40
Source: NIH BIRN (Jeffrey Grethe, UCSD)
41
Ecology: GARP Analysis Pipeline for Invasive Species Prediction (figure)
The pipeline queries registered EcoGrid databases for species presence/absence points and environmental layers, integrates the layers and samples the data, calculates a GARP rule set from training and test samples, and generates and validates native-range and invasion-area prediction maps together with model quality parameters; metadata is generated and selected prediction maps are archived back to the EcoGrid.
Source: NSF SEEK (Deana Pennington et al., UNM)
42
http://www.gridlab.org/
43
Triana Prototype: GEO 600 Coalescing Binary Search (figure)
44
Workflow Taxonomy
Main dimensions (figure): workflow design and specification, component/service discovery, scheduling and enactment, data management, operational attributes, workflow system structure, model/specification composition.
Source: Omer Rana
45
Workflow Composition
Taxonomy (figure): composition can be user-directed (user-defined, scripting) or automated (planner); specifications can be language-based (markup, functional, logic) or graph-based (DAG, UML, Petri net, process calculi); further composition concepts include templates, design patterns, sub-workflows and factories.
Source: Omer Rana
46
Taxonomy of Workflow Scheduling (1)
Scheduling criteria:
– single vs. multiple criteria
Number of workflows considered during a scheduling step:
– single (optimizing a single workflow) vs.
– multiple (optimizing several or all workflows at the same time)
Dynamicity:
– full-ahead vs.
– just-in-time vs.
– hybrid
Source: CoreGRID report by U. Innsbruck and FhG FIRST Berlin
47
Taxonomy of Workflow Scheduling (2)
Optimization model:
– workflow-oriented (considering the benefit of a single workflow/user) vs.
– Grid-wide (overall optimization goal)
Advance reservation:
– with AR (using reservations/SLAs) vs.
– without AR
Source: CoreGRID report by U. Innsbruck and FhG FIRST Berlin
48
Taxonomy of Workflow Scheduling Systems (figure). Source: Jia Yu, Rajkumar Buyya
49
Workflow Languages
Plenty of them, see the Grid Workflow Forum. Workflow languages (scientific and industrial): AGWL, BPEL4WS, BPML, DGL, DPML, GJobDL, GSFL, GFDL, GWorkflowDL, MoML, SWFL, WSCL, WSCI, WSFL, XLANG, YAWL, SCUFL/XScufl, WPDL, PIF, PSL, OWL-S, xWFL
Source: Grid Workflow Forum (www.gridworkflow.org)
50
Excerpt of Workflow Scheduling Systems
DAGMan, Pegasus, Triana, ICENI, Taverna, GridAnt, GrADS, GridFlow, Unicore, Gridbus Workflow, Askalon, Karajan, Kepler
Source: Grid Workflow Forum (www.gridworkflow.org)
51
Scheduling of a Workflow (1)
Schedules without advance reservation:
– all times depend on the local queues
– the probability of an accidental schedule that reflects the logical flow of the workflow tasks is rather low
– in many cases the workflow will be broken
52
Scheduling of a Workflow (2)
Schedules without advance reservation, done more intelligently:
– all times still depend on the state of the local queues
– a subsequent task is submitted only when the previous one terminates
– the logical flow of the workflow's tasks is maintained
– the overall time depends on the local queues, and the probability of a longer makespan is quite high
53
Scheduling of a Workflow (3)
Optimal schedules with advance reservation:
– t_3 = t_2 and t_5 = t_4
– in case of data transfers of lengths t_d12 and t_d23 between the tasks: t_3 = t_2 + t_d12 and t_5 = t_4 + t_d23
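Reading t_2 and t_4 as the finish times of the earlier tasks and t_3 and t_5 as the reserved start times of their successors, the rule can be sketched as follows (my interpretation of the figure; the numbers are made up):

    def reservation_starts(first_start, runtimes, transfer_times):
        """Reserved start time of each task in a chain: a task starts when its
        predecessor finishes plus the data-transfer time between them."""
        starts = [first_start]
        for i in range(1, len(runtimes)):
            finish_prev = starts[i - 1] + runtimes[i - 1]
            starts.append(finish_prev + transfer_times[i - 1])
        return starts

    # Three tasks of 5 minutes each, 2 minutes of data transfer in between:
    print(reservation_starts(0, [5, 5, 5], [2, 2]))   # [0, 7, 14]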
54
Thanks!
55
Background Information
Surveys on Grid workflow scheduling:
– J. Yu and R. Buyya, "A Taxonomy of Workflow Management Systems for Grid Computing", Journal of Grid Computing, vol. 3, no. 3–4, 2005, pp. 171–200.
– M. Wieczorek, A. Hoheisel and R. Prodan, "Taxonomy of the Multi-criteria Grid Workflow Scheduling Problem", CoreGRID Technical Report.