Combining the strengths of UMIST and The Victoria University of Manchester Adaptive Workflow Processing and Execution in Pegasus Kevin Lee School of Computer.

Presentation transcript:

Combining the strengths of UMIST and The Victoria University of Manchester
Adaptive Workflow Processing and Execution in Pegasus
Kevin Lee
School of Computer Science, University of Manchester
25th May 2008

Contributors
Rizos Sakellariou, Norman W. Paton and Alvaro A. A. Fernandes
{klee, rizos, norm,
University of Manchester, UK
Ewa Deelman, Gaurang Mehta
Information Sciences Institute, University of Southern California, US

Talk Overview
1) Background: Adaptivity at Manchester
2) Background: Pegasus Workflow Execution
3) Adaptive Pegasus
4) Experiments and Results
5) Conclusions and Future work
6) Questions

1. Background: Adaptivity at Manchester
Creating an infrastructure to support the systematic development of adaptive systems, based on the ideas presented today:
* Ease the development of adaptive systems.
* Support the development of better adaptive systems.
* Investigate the use of the infrastructure in a number of different domains.
* Use the infrastructure to improve the general understanding of adaptive systems.
Applying the infrastructure to related domains:
* Workflow processing with the Pegasus team
* Concurrent web-service workflows
* Distributed query processing

2. Background: Pegasus Workflow Execution

3. Adaptive Pegasus Execution
Characteristics of Pegasus workflow execution:
* Very long running
* Small delays can have large effects due to dependencies
* Involves highly distributed resources
* Limited control over resources
* Uncertain execution times
* Uncertain queue waiting times
Pegasus schedules a workflow before it starts executing, using current information about the execution environment.
What happens if the environment changes?
* Resources appear or disappear
* Loads change due to resources being used

3. Adaptive Pegasus
* We combined the adaptivity work at Manchester with Pegasus.
* We retrofitted Pegasus with an adaptivity framework.
* We focused on adapting to site queue length, one of the biggest sources of delay in execution.
* The result is a Pegasus instantiation that can react dynamically to the environment.
Main components:

3. Adaptive Pegasus
Monitoring: monitors the progress of an executing workflow.
* Events: job queue, execute, termination.
* Sensed from the Pegasus log.
Analysis: establishes whether the workflow is performing according to the expectations held when it was compiled.
* Uses the CQL continuous query language to group and analyse the events produced by monitoring.
* CQL is SQL-like, but with extensions for queries over time.
* Detailed in the paper.
Planning: triggered when analysis detects a sustained change in batch queue times for a site.
* Re-schedules using a scheduler that takes historic data into account.
* Algorithm in the paper.
Execution: halts the current workflow and deploys the newly planned one.
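The monitoring and analysis steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the talk uses CQL queries over the Pegasus log, whereas this sketch substitutes a plain sliding window, and the event labels (`QUEUE`, `EXECUTE`), the class name `QueueWaitAnalyser`, and the `window` and `threshold` parameters are all hypothetical rather than taken from Pegasus or the paper.

```python
from collections import defaultdict, deque


class QueueWaitAnalyser:
    """Tracks recent queue waits per site and flags a sustained increase."""

    def __init__(self, window=5, threshold=2.0):
        self.window = window        # assumed: number of recent jobs examined
        self.threshold = threshold  # assumed: ratio over the expected wait
        self.queued_at = {}         # job id -> (site, time the job was queued)
        self.waits = defaultdict(lambda: deque(maxlen=window))

    def on_event(self, job, site, kind, timestamp):
        """Monitoring: consume one event sensed from the workflow log."""
        if kind == "QUEUE":
            self.queued_at[job] = (site, timestamp)
        elif kind == "EXECUTE" and job in self.queued_at:
            queued_site, queued = self.queued_at.pop(job)
            self.waits[queued_site].append(timestamp - queued)

    def sustained_change(self, site, expected_wait):
        """Analysis: true only if every wait in a full window exceeds
        threshold * expected_wait, so a single slow job does not trigger."""
        recent = self.waits[site]
        if len(recent) < self.window:
            return False
        return all(w > self.threshold * expected_wait for w in recent)


analyser = QueueWaitAnalyser(window=3, threshold=2.0)
for job, queued, started in [("j1", 0, 50), ("j2", 10, 70), ("j3", 20, 90)]:
    analyser.on_event(job, "cluster1", "QUEUE", queued)
    analyser.on_event(job, "cluster1", "EXECUTE", started)

# Planning would be triggered only when this flag is set.
needs_replanning = analyser.sustained_change("cluster1", expected_wait=10)
```

Requiring a full window of slow jobs is one simple way to express "sustained"; the paper's CQL queries may define it differently.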

4. Experiments and Results: Overview
* The experiment investigates the effect of the adaptive approach on workflow response time.
* Pegasus operates on abstract workflows in the form of Directed Acyclic Graphs (DAGs).
* We used two styles of DAG in our experiments: a linear workflow and a Montage workflow.
* The experiments used 2 clusters, each running the Condor scheduler.
* We apply loads to the clusters by submitting additional workflows, and submit the workflow with adaptive support.

4. Experiments and Results: Workflow type 1
This is simply a DAG where each subsequent task depends on the file created by the previous task; it may contain any number of tasks. With these dependencies present, the tasks in the workflow execute in series. In our experiments we considered an instance with 50 tasks.
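A linear workflow of this kind can be sketched as a dependency map in Python. The task names and the helper `linear_workflow` are hypothetical; the sketch uses the standard-library `graphlib` module (Python 3.9+) to confirm that the dependencies force the tasks to run in series, and it does not reproduce the Pegasus workflow format.

```python
from graphlib import TopologicalSorter


def linear_workflow(n_tasks):
    """Return {task: set of parent tasks} for a chain of n_tasks tasks."""
    return {
        f"task{i}": ({f"task{i - 1}"} if i > 0 else set())
        for i in range(n_tasks)
    }


# The instance size used in the experiments: 50 tasks.
dag = linear_workflow(50)

# Each task has exactly one parent, so only one execution order is valid.
order = list(TopologicalSorter(dag).static_order())
```

Because every task waits on its predecessor, a queue delay on any one task is added directly to the response time of the whole workflow.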

4. Experiments and Results: Workflow type 2, Montage (NASA and NVO)
* Delivers science-grade custom mosaics on demand.
* Produces mosaics from a wide range of data sources (possibly in different spectra).
* User-specified parameters of projection, coordinates, size, rotation and spatial sampling.
Figure: mosaic created by Pegasus-based Montage from a run of the M101 galaxy images on the TeraGrid.
Figure: a simple Montage workflow. Montage workflows can be of varying sizes, depending on the size of the area of sky covered by the mosaic. The numbers represent the level of each task in the overall workflow. The workflow shown corresponds to the size used in our experiments (25 tasks, equivalent to a 0.2 degree area).
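The notion of a task's level in a workflow can be illustrated with a short Python sketch. The assumption here is that a task with no parents is at level 1 and every other task is one level below its deepest parent; the toy DAG and its task names are hypothetical and far smaller than the 25-task Montage workflow used in the experiments.

```python
from graphlib import TopologicalSorter


def task_levels(parents):
    """Map each task to its level: 1 for tasks without parents,
    otherwise 1 + the largest level among the task's parents."""
    levels = {}
    for task in TopologicalSorter(parents).static_order():
        levels[task] = 1 + max(
            (levels[p] for p in parents.get(task, ())), default=0
        )
    return levels


# Hypothetical toy DAG, given as {task: set of parent tasks}.
toy = {
    "project_1": set(),
    "project_2": set(),
    "diff": {"project_1", "project_2"},
    "fit": {"diff"},
    "background_1": {"project_1", "fit"},
    "background_2": {"project_2", "fit"},
    "add": {"background_1", "background_2"},
}

levels = task_levels(toy)
```

Tasks that share a level have no dependencies on each other, so unlike the linear workflow they can be queued on different sites at the same time.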

4. Experiments and Results: Experiment 4
* The linear workflow is scheduled in a round-robin fashion to clusters 1 and 2.
* Cluster 1 is constantly loaded with an additional 50 linear workflows.
* Jobs sent to Cluster 1 are queued longer, as is visible on the graph.
* The adaptive workflow adapts early to the constant load.
* The result is that the adaptive workflow has a better response time.
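The initial round-robin placement, and one possible reaction to a loaded cluster, can be sketched as follows. The `replan` function is an assumption made for illustration only: it simply avoids sites whose observed mean queue wait exceeds a limit, and it is not the re-scheduling algorithm from the paper. All names and numbers are hypothetical.

```python
from itertools import cycle


def round_robin(tasks, sites):
    """Assign tasks to sites in turn: first task to first site, and so on."""
    return dict(zip(tasks, cycle(sites)))


def replan(pending, sites, mean_wait, max_wait):
    """Re-assign pending tasks round-robin over the sites whose observed
    mean queue wait is at most max_wait; use all sites if none qualify."""
    usable = [s for s in sites if mean_wait.get(s, 0.0) <= max_wait]
    return round_robin(pending, usable or list(sites))


tasks = [f"task{i}" for i in range(6)]
sites = ["cluster1", "cluster2"]

# Initial schedule, made before execution starts.
initial = round_robin(tasks, sites)

# After two tasks have run, cluster1 shows long queue waits (in seconds),
# so the remaining tasks are planned again.
adapted = replan(
    tasks[2:], sites, {"cluster1": 600.0, "cluster2": 30.0}, max_wait=120.0
)
```

In this sketch the adapted plan sends every remaining task to cluster 2, which mirrors the early adaptation described for the constantly loaded case.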

4. Experiments and Results: Experiment 5
* The Montage workflow is scheduled in a round-robin fashion to clusters 1 and 2.
* Cluster 1 is constantly loaded with an additional 50 linear workflows.
* Jobs sent to Cluster 1 are queued longer, as is visible on the graph.
* The adaptive workflow adapts early to the constant load.
* The result is that the adaptive workflow has a better response time.

4. Experiments and Results: Experiment 6
* The linear workflow is scheduled in a round-robin fashion to clusters 1 and 2.
* Cluster 1 is temporarily loaded during execution with an additional 50 linear workflows, starting 60 minutes into the execution and lasting 60 minutes.
* Jobs sent to Cluster 1 are queued longer, as is visible on the graph.
* The adaptive workflow adapts twice: after the load is applied and after it is removed.
* The result is that the adaptive workflow has a marginally better response time.
* The temporary load is roughly equivalent to 2 adaptations.

5. Conclusions and Future work
* Adaptive Pegasus succeeds in improving the response time of workflows.
* Retrofitting Pegasus with dynamic behaviour required minimal interference with Pegasus.
Ongoing work:
* Current work involves the use of utility functions to accomplish generic planning and make better decisions.
* Continuing with framework development
* Continuing with multiple case studies

Questions/Comments?