Combining the strengths of UMIST and The Victoria University of Manchester
Utility-based Adaptive Workflow Execution on the Grid
Kevin Lee
School of Computer Science, University of Manchester
11th March 2009
Talk Overview
1) Overview
2) Technical Background
3) Generic Adaptivity Framework
4) Instantiation for Workflow Execution
5) Experimental Evaluation
6) Future work
7) Questions
1. Overview
Scientists carry out research that requires large-scale computation.
Computation is expensive, and researchers are unlikely to have sufficient local or regional computation resources.
Globally, institutions share resources to maximize usage and value.
Computation in the form of workflows can execute across multiple resources.
We want to make sure that resources are used as efficiently as possible and that workflows execute as fast as possible.
As we'll see, there is potential for bad decisions that lead to inefficient execution.
The solution presented here is to use runtime adaptation to improve execution.
2. Technical Background
Abstract workflows:
  Workflows that can be specified by hand or by a higher-level program
  Replicas (inputs and outputs) specified as logical files
  Transformations (programs) specified as logical transformations
  Dependencies between tasks in digraph form (workflows)
Compiler and management software:
  Looks up the replicas and transformations in various databases and services
  Creates a concrete (executable) workflow that contains the locations of resources
  Local digraph manager that executes the workflow
Resources:
  Grid-based resources that can accept the individual jobs that form the workflow
2. Technical Background: a workflow
[Figure: mosaic created by Montage from a run of the M101 galaxy images]
[Figure: a simple Montage workflow]
Montage workflows can be of varying sizes, depending on the size of the area of sky covered by the mosaic.
The numbers in the workflow figure represent the level of each task in the overall workflow.
The workflow shown corresponds to the size used in our experiments (25 tasks, equivalent to a 0.2 degree area).
Montage:
  Delivers science-grade mosaics on demand
  Produces mosaics from a wide range of data sources
  User-specified parameters of projection, coordinates, size, rotation and spatial sampling
2. Technical Background: Pegasus Workflow Management System
2. Technical Background
More detailed view:
  Compilation
  Submission
  Execution
  Reporting
2. Technical Background
Execution characteristics of Pegasus workflow execution:
  Very long running
  Small delays can have large effects due to dependencies
  Involves highly distributed resources
  Limited control over resources
  Uncertain execution times
  Uncertain queue waiting times
Pegasus schedules a workflow before it starts executing, using current information about the execution environment.
What happens if the environment changes?
  Resources appear or disappear
  Loads change due to resources being used
Obvious solution: adapt at runtime!
3. Generic Adaptivity Framework
Developing infrastructure to support the systematic development of adaptive systems:
  Ease the development of adaptive systems
  Support the development of better adaptive systems
  Investigate the use of the infrastructure in a number of different domains
  Leverage the best tools for each part of the adaptation process
  Use the infrastructure to improve the general understanding of adaptive systems
3. Generic Adaptivity Framework (IBM autonomic vision)
Monitor: events from a source (log files, in-memory processes, sensors).
Analyze: when an event occurs, what to do about it...
Plan: after the event is detected and analysed, the system needs to determine what to do about it.
Execute: perform the necessary changes.
Next, each in more detail.
3. Generic Adaptivity Framework: Monitoring
Transforms sensor events from the system into something more useful
Events are expected to be in XML
An XSLT transformation converts each event to a standard style
Output events are stored in the knowledge base
Control events indicate to components in the pipeline that new data is available in the knowledge base
3. Generic Adaptivity Framework: Analysis
Performs analysis on data in the knowledge base
Performs analysis in response to an indication that new data is available
Analysis is based on a series of CQL stream queries
Output events are stored in the knowledge base
Control events indicate to components in the pipeline that new data is available in the knowledge base
3. Generic Adaptivity Framework: Planning
Performs optimisation of utility functions, based on data in the knowledge base
Performs optimisation in response to an indication that new data is available
An optimisation algorithm is used to maximise utility
Output events are stored in the knowledge base
Control events indicate to components in the pipeline that new data is available in the knowledge base
3. Generic Adaptivity Framework: Execution
Executes a workflow on data in the knowledge base
Executes the workflow in response to an indication that new data is available
Uses a BPEL workflow containing services which effect adaptations on the system
Output events are stored in the knowledge base
So, how does this help improve workflow execution?
4. Instantiation for Workflow Execution
Combining the framework and Pegasus:
  No changes to Pegasus
  Touch points via sensors and effectors
4. Instantiation for Workflow Execution
Aim
At the time the workflow is compiled and scheduled, the resources selected may be correct.
As time goes on, these decisions are likely to diverge from the ideal.
For computation resources, the largest cause of delay is contention for the resource.
This manifests itself as increased batch queue times for submitted jobs.
Therefore, when applying the framework to Pegasus, we focus on adapting to queue times.
4. Instantiation for Workflow Execution
Sensors -> Monitoring
To monitor the progress of an executing workflow, we parse the live log.
Example:
  2/17 11:53:14 Event: ULOG_GRID_SUBMIT for Condor Node mBackground_ID (4713.0)
  2/17 11:53:14 Event: ULOG_EXECUTE for Condor Node mBackground_ID (4709.0)
  2/17 11:53:14 Number of idle job procs: 4
  2/17 11:53:20 Event: ULOG_EXECUTE for Condor Node mBackground_ID (4708.0)
  2/17 11:53:20 Number of idle job procs: 3
  2/17 11:53:28 Event: ULOG_JOB_TERMINATED for Condor Node mBackground_ID (4710.0)
  2/17 11:53:28 Node mBackground_ID job proc (4710.0) completed successfully.
RegEx:
  ([\d]+)/([\d]+).([\d]+):([\d]+):([\d]+).Event:.([\S]+_[\S]+).for.Condor.Node.([a-zA-Z0-9_]+)
Result: XML events for jobs being queued, executed and terminated, made available to analysis as a stream.
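The parsing step can be illustrated with a minimal Python sketch. It applies the regular expression from the slide to individual log lines and serialises each match as a small XML element. The function names and the XML element and attribute names are hypothetical and are not taken from the framework itself.

```python
import re
from xml.etree import ElementTree as ET

# Pattern as given on the slide; each "." matches the single space in the log.
EVENT_RE = re.compile(
    r"([\d]+)/([\d]+).([\d]+):([\d]+):([\d]+).Event:.([\S]+_[\S]+)"
    r".for.Condor.Node.([a-zA-Z0-9_]+)"
)

def parse_line(line):
    """Return a dict for an event line, or None for any other log line."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    month, day, hour, minute, second, event, job = match.groups()
    return {
        "month": int(month), "day": int(day),
        "h": int(hour), "m": int(minute), "s": int(second),
        "event": event, "job": job,
    }

def to_xml(event):
    """Serialise a parsed event as XML (element and attribute names are assumed)."""
    elem = ET.Element("event", {key: str(value) for key, value in event.items()})
    return ET.tostring(elem, encoding="unicode")

sample = "2/17 11:53:14 Event: ULOG_GRID_SUBMIT for Condor Node mBackground_ID (4713.0)"
parsed = parse_line(sample)
xml_event = to_xml(parsed)
```

Lines that do not describe an event, such as the "Number of idle job procs" lines, simply return None and can be skipped.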
4. Instantiation for Workflow Execution
Analysis
Uses the CQL continuous query language to group and analyse the events. CQL is SQL-like, but with extensions for queries over time.
1. Calculates current average job queue times over a period of time
2. Causes re-planning when queue times are more or less than expected
Queries:
  select h*3600+m*60+s,job,site,est from workflowlog where event="ULOG_SUBMIT";
  register stream submittedjobs (time int, job char(22), site char(22), est int);
  select h*3600+m*60+s,job from workflowlog where event="ULOG_EXECUTE";
  register stream executedjobs (time int, job char(22));
  Rstream (select executed.time-submitted.time, executed.job, submitted.site, submitted.est from executedjobs[Range 180 Seconds] as executed,submittedjobs as submitted where executed.job=submitted.job);
  register stream jobdelay (delay int, job char(22), site char(22), est int);
  select site, delay, est, (delay-est) from jobdelay where (delay-est)>20;
Output from the final query causes planning.
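For readers unfamiliar with CQL, the logic of the queries can be approximated in plain Python. This is only a batch-style sketch under simplifying assumptions: it evaluates the 180 second window at one chosen instant rather than continuously, and the function and parameter names are hypothetical.

```python
def seconds(h, m, s):
    """Time of day in seconds, as computed by h*3600+m*60+s in the queries."""
    return h * 3600 + m * 60 + s

def job_delays(submitted, executed, now, window=180):
    """Join executions seen in the last `window` seconds with their submissions."""
    by_job = {sub["job"]: sub for sub in submitted}
    rows = []
    for ex in executed:
        in_window = now - window <= ex["time"] <= now
        if in_window and ex["job"] in by_job:
            sub = by_job[ex["job"]]
            rows.append({
                "delay": ex["time"] - sub["time"],  # observed queue time
                "job": ex["job"],
                "site": sub["site"],
                "est": sub["est"],                  # estimated queue time
            })
    return rows

def replanning_triggers(rows, slack=20):
    """Rows whose observed delay exceeds the estimate by more than `slack` seconds."""
    return [
        (r["site"], r["delay"], r["est"], r["delay"] - r["est"])
        for r in rows
        if r["delay"] - r["est"] > slack
    ]

submitted = [
    {"time": seconds(11, 50, 0), "job": "a", "site": "s1", "est": 30},
    {"time": seconds(11, 52, 0), "job": "b", "site": "s2", "est": 30},
]
executed = [
    {"time": seconds(11, 53, 14), "job": "a"},
    {"time": seconds(11, 52, 40), "job": "b"},
]
triggers = replanning_triggers(job_delays(submitted, executed, now=seconds(11, 53, 20)))
```

Here job "a" waited 194 seconds against an estimate of 30, so it is reported; job "b" waited 40 seconds, which is within the 20 second slack.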
4. Instantiation for Workflow Execution
Planning
Planning has the task of calculating a better assignment for the workflow.
Data we have:
  Workflow DAG
  Current assignment
  Collected data about resources: number of CPUs, execution times, average queue time
  What we've submitted since the execution started
Approach:
  Use a Matlab-based utility function optimiser
  We write a function that, given the values above and a potential new assignment, returns a value for that assignment (the higher the better)
  The optimiser (MADS) searches over potential assignments, calling the function many times
4. Instantiation for Workflow Execution
Planning
Firstly, for each proposed new assignment we calculate estimated queue times.
Estimated queue time (EQT): based on the external demand, the new demand and the change in actual queue times.
  An estimate of external demand for a period p
  Assigned demand for the period p
  The candidate demand: the demand we'll put on the resources
Full explanation in the papers.
4. Instantiation for Workflow Execution
Planning
Next, calculate the Predicted Response Time (PRT) for the workflow.
PRT is the completion time of the last task plus any adaptation cost.
A recursive formula estimates the completion time of the last task.
So, now we have an estimate of how long a workflow will take for each new assignment.
We need a way of judging how good an assignment is in relation to its PRT and the resources used.
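The exact recursive formula is given in the papers. As a rough illustration only, the sketch below assumes that a task completes after its slowest parent has completed, plus the queue time of its assigned site, plus its execution time there. The function name, the data layout and that recurrence are assumptions made for this example, not the published definition.

```python
from functools import lru_cache

def predicted_response_time(parents, assignment, exec_time, queue_time,
                            adaptation_cost=0.0):
    """Estimate workflow completion time for one candidate assignment.

    parents:    task -> tuple of tasks it depends on (the workflow DAG)
    assignment: task -> site
    exec_time:  (task, site) -> estimated execution time
    queue_time: site -> estimated queue time
    """
    @lru_cache(maxsize=None)
    def completion(task):
        # A task can only be queued once all of its parents have finished.
        ready = max((completion(p) for p in parents.get(task, ())), default=0.0)
        site = assignment[task]
        return ready + queue_time[site] + exec_time[(task, site)]

    # The last task to finish determines the response time of the workflow.
    return max(completion(task) for task in assignment) + adaptation_cost

parents = {"c": ("a", "b")}
assignment = {"a": "s1", "b": "s2", "c": "s1"}
exec_time = {("a", "s1"): 10, ("b", "s2"): 5, ("c", "s1"): 4}
queue_time = {"s1": 2, "s2": 20}
prt = predicted_response_time(parents, assignment, exec_time, queue_time,
                              adaptation_cost=3)
```

In this toy case task "b" finishes at 25, so "c" finishes at 31 and the estimate including the adaptation cost is 34.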
4. Instantiation for Workflow Execution
Planning
Option 1: Utility for response time
Purely tries to use the fastest resources available to complete the workflow.
EQT ensures a resource isn't overloaded.
The utility is therefore just: [equation shown on slide]
The higher the utility value, the better.
The optimiser will try multiple candidate assignments until a 'good' one is found.
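To make the search concrete, here is a deliberately simple Python sketch. It replaces the MADS optimiser used in the talk with an exhaustive search, which is only feasible for very small workflows because the number of candidates grows exponentially with the number of tasks. The toy utility (a negated makespan in which each site runs its tasks one after another) and all names are hypothetical.

```python
from itertools import product

def best_assignment(tasks, sites, utility):
    """Try every task-to-site mapping and keep the one with the highest utility."""
    best, best_u = None, float("-inf")
    for choice in product(sites, repeat=len(tasks)):
        candidate = dict(zip(tasks, choice))
        u = utility(candidate)
        if u > best_u:  # strict: the first of several equally good candidates wins
            best, best_u = candidate, u
    return best, best_u

# Hypothetical toy model: time each site needs per task.
TIME_PER_TASK = {"slow": 4, "fast": 2}

def toy_utility(assignment):
    """Negated makespan, so a shorter response time gives a higher utility."""
    load = {site: 0 for site in TIME_PER_TASK}
    for site in assignment.values():
        load[site] += TIME_PER_TASK[site]
    return -max(load.values())

best, best_u = best_assignment(["t1", "t2", "t3"], ["slow", "fast"], toy_utility)
```

Even in this toy model the best candidate does not put everything on the fast site: sending one task to the slow site shortens the overall time, which mirrors the role the estimated queue time plays in discouraging overload.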
4. Instantiation for Workflow Execution
Planning
Option 2: Utility for profit
As resources are not free, we attach a value to using resources:
  A reward for completing a workflow within a target time
  A cost for using a resource to execute a task
Cost for a workflow assignment: [equation shown on slide]
Profit is a measure of utility minus cost.
The utility is a calculation of how likely the assignment is to complete before the target response time.
The larger the 'profit', the better for the optimiser.
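A minimal Python sketch of the profit idea follows. It simplifies the likelihood of meeting the target into a hard step (full reward if the predicted response time is within the target, none otherwise), which is an assumption made here for brevity and not the utility defined in the papers. The reward of 100 and the per-task costs of 1 and 2 are the values quoted for the experiments; the function and site names are hypothetical.

```python
def assignment_cost(assignment, cost_per_task):
    """Total cost of an assignment: each task pays the cost of its site."""
    return sum(cost_per_task[site] for site in assignment.values())

def profit(assignment, prt, target, reward, cost_per_task):
    """Reward for meeting the target response time, minus resource cost.

    Assumption: a hard step stands in for the likelihood of meeting the target.
    """
    met_target = 1.0 if prt <= target else 0.0
    return reward * met_target - assignment_cost(assignment, cost_per_task)

cost_per_task = {"w1": 1, "w2": 2}
assignment = {"t1": "w1", "t2": "w2", "t3": "w2"}
on_time = profit(assignment, prt=50, target=60, reward=100, cost_per_task=cost_per_task)
late = profit(assignment, prt=70, target=60, reward=100, cost_per_task=cost_per_task)
```

When the target cannot be met, the reward term vanishes and only the cost remains, so the most profitable choice becomes the cheapest assignment.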
4. Instantiation for Workflow Execution
Planning
Some example runs of the optimizer with the profit utility:
  The first has a high target response time, which it easily meets and then improves on further.
  The second has a lower target response time, which it slowly gets closer to.
Note: the graphs show the utility being minimized rather than maximized.
4. Instantiation for Workflow Execution
Execution/Deploying a new assignment
For a new assignment:
1. Tell the local DAG manager to halt the workflow(s)
2. Collect the locations of all the partial results
3. Modify local databases with this new data
4. Replan the workflow(s) with the new assignment
5. Deploy the workflow
6. Continue monitoring the new execution
This repeats every time a new assignment is available.
5. Experimental Evaluation
Workflow:
  27-node Montage workflow of M17
  Takes between 20 minutes and a few hours, depending on resources
  Profit gain is 100 for completing within the target
Resources:
  Previous work used TeraGrid clusters (see papers)
  These experiments are new, so they use 2 workstations
  (1) is less powerful, with longer queue times
  (2) is more powerful, with shorter queue times
  (2) costs more than (1): (1) costs 1, (2) costs 2
5. Experimental Evaluation
Experiment 1
Single workflow. Periodic load applied to Cluster 1.
The adaptive version performs an adaptation, which results in a faster workflow.
5. Experimental Evaluation
Experiment 1
For different target response times:
  U(RT) always performs the best.
  U(Profit) meets the high and mid target response times at less cost than U(RT).
  U(Profit) fails to meet the low target response time, so it uses the cheapest resources.
5. Experimental Evaluation
Experiment 2
Two Montage workflows. Periodic load applied to Cluster 1.
Achieved by submitting and monitoring two workflows at the same time.
Utility is the sum of all U(RT) and U(Profit) for all workflows.
  U(RT) always performs the best.
  U(Profit) meets the high and mid target response times at less cost than U(RT).
  U(Profit) fails to meet the low target response time, so it uses the cheapest resources.
6. Current/Future Work
Current work:
  Scaling up the number of workflows to 10 and more
  Scaling up to more sites
  Managing workflows arriving over time, rather than at the start
  More workflow types (we've used linear types and Montage in our papers)
Future work:
  Further refinement of the conditions for when to adapt
  Large-scale experiments and tighter integration into Pegasus
Lots of interesting problems.
Publications
I've tried to give the general picture and some details. For more, see:
WORKS 2007: K. Lee, R. Sakellariou, N. W. Paton and A. A. A. Fernandes, Workflow Adaptation as an Autonomic Computing Problem, 2nd Workshop on Workflows in Support of Large-Scale Science (WORKS 07), in Proceedings of HPDC 2007, Monterey Bay, California, June.
WAGE 2008: K. Lee, N. W. Paton, R. Sakellariou, E. Deelman, A. A. A. Fernandes, G. Mehta, Adaptive Workflow Processing and Execution in Pegasus, 3rd International Workshop on Workflow Management and Applications in Grid Environments (WaGe08), May, Kunming, China.
CCGRID 2009: K. Lee, N. W. Paton, R. Sakellariou, A. A. A. Fernandes, Utility based scheduling for Adaptive workflow execution, 9th IEEE International Symposium on Cluster Computing and the Grid (CCGRID 2009), to appear.
Acknowledgements
Rizos Sakellariou, Norman W. Paton and Alvaro A. A. Fernandes
{klee, rizos, norm,
University of Manchester, UK
Ewa Deelman, Gaurang Mehta
Information Sciences Institute, University of Southern California, US
Questions/Comments?