Scheduling Strategies for Mapping Application Workflows Onto the Grid
A. Mandal, K. Kennedy, C. Koelbel, G. Marin, J. Mellor-Crummey, B. Liu, L. Johnsson



The Forest
Performance Prediction + Scheduling Heuristics
Static Schedule for Workflow Components
G. Marin, 2004; T. Braun, 1999

Environment
GrADSoft
  – Runs on top of Globus
  – Facilitates scheduling, launching, and monitoring of grid apps
Extend GrADSoft to deal with workflows (not only tightly coupled apps)

What’s a workflow?
A set of applications (workflow components) that must be run in a specific order
DAG: Directed Acyclic Graph
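A workflow DAG can be sketched as a mapping from each component to the components it depends on. The example below is illustrative only: the component names are hypothetical and are not taken from the slides. It uses Python's standard `graphlib` module (Python 3.9+) to derive one valid execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical four-component workflow: each key maps to the set of
# components that must finish before it can start.
workflow = {
    "preprocess": set(),
    "classify": {"preprocess"},
    "align": {"preprocess"},
    "reconstruct": {"classify", "align"},
}


def run_order(dag):
    """Return one execution order that respects every dependency edge."""
    # static_order() raises graphlib.CycleError if the graph has a cycle,
    # i.e. if the input is not actually a DAG.
    return list(TopologicalSorter(dag).static_order())


order = run_order(workflow)
print(order)
```

Components with no path between them ("classify" and "align" here) may appear in either order, and they are the ones a scheduler is free to place on different machines at the same time.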

Workflow Scheduling
Condor DAGMan: dynamic, effectively random scheduling
The approach here is static scheduling
  – Classic problem: given a set of machines, a set of jobs, and the performance of each job on each machine, schedule all jobs so as to minimize the total makespan
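To make the objective concrete, here is a minimal sketch of how the makespan of a static schedule could be computed. It assumes independent jobs (no DAG dependencies) and made-up run times; the job and machine names are hypothetical.

```python
def makespan(assignment, run_time):
    """Finish time of the busiest machine.

    assignment: job -> machine it is mapped to
    run_time:   job -> {machine: predicted run time on that machine}
    Assumes independent jobs that run back to back on their machine.
    """
    load = {}
    for job, machine in assignment.items():
        load[machine] = load.get(machine, 0.0) + run_time[job][machine]
    return max(load.values()) if load else 0.0


# Hypothetical predicted run times (seconds).
run_time = {
    "j1": {"m1": 4.0, "m2": 6.0},
    "j2": {"m1": 3.0, "m2": 2.0},
    "j3": {"m1": 5.0, "m2": 9.0},
}

everything_on_m1 = {"j1": "m1", "j2": "m1", "j3": "m1"}
spread_out = {"j1": "m2", "j2": "m2", "j3": "m1"}

print(makespan(everything_on_m1, run_time))
print(makespan(spread_out, run_time))
```

In this example the second mapping gives a shorter makespan even though it runs j1 on its slower machine, which is why the problem cannot be solved by picking each job's fastest machine in isolation.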

Determining Machine Fitness
Marin and Mellor-Crummey’s performance models
  – For each workflow component and target machine, produce a performance model
  – Advantage of performance models over cycle-accurate simulations!
Add a data transfer penalty (using the Network Weather Service)
We now have the expected time to completion (ETC) of every task on every machine.
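One way to picture the resulting ETC table is predicted compute time plus a transfer penalty for each task and machine pair. The sketch below is an assumption-laden illustration: the numbers are invented, `build_etc` is a hypothetical helper, and the bandwidth figures stand in for whatever a network monitor would report. It does not call the Network Weather Service or the actual performance models.

```python
def build_etc(compute_time, input_bytes, bandwidth):
    """Combine predicted compute time with a data transfer penalty.

    compute_time: task -> {machine: predicted seconds of computation}
    input_bytes:  task -> bytes of input that must be moved to the machine
    bandwidth:    machine -> assumed bytes/second available to that machine
    Returns task -> {machine: expected time to completion in seconds}.
    """
    etc = {}
    for task, per_machine in compute_time.items():
        etc[task] = {}
        for machine, seconds in per_machine.items():
            transfer = input_bytes[task] / bandwidth[machine]
            etc[task][machine] = seconds + transfer
    return etc


# Hypothetical inputs: the Opteron computes faster but sits behind a slower link.
compute_time = {"classify": {"itanium": 100.0, "opteron": 80.0}}
input_bytes = {"classify": 2e9}
bandwidth = {"itanium": 1e8, "opteron": 2.5e7}

etc = build_etc(compute_time, input_bytes, bandwidth)
print(etc)
```

With these made-up figures the machine with the faster predicted compute time ends up with the worse ETC once the transfer penalty is added, which is the reason for including the penalty at all.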

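Minimum Multiprocessor Scheduling Problem
Classic problem is NP-complete
Use traditional heuristics:
  – Min-Min: schedule the job with the smallest minimum completion time first
  – Max-Min: schedule the job with the largest minimum completion time first
  – Sufferage: schedule the job with the most to lose by waiting (largest gap between its best and second-best completion time)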
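The three heuristics differ only in which unscheduled job they commit next. The sketch below is a simplified illustration for independent jobs, not the authors' implementation: `schedule` is a hypothetical function, the ETC values are invented, and the Sufferage branch is a one-job-per-round variant rather than the full published algorithm.

```python
def schedule(etc, heuristic="min-min"):
    """Greedy mapping of independent tasks onto machines.

    etc: task -> {machine: expected time to completion on an idle machine}
    Returns (assignment, makespan).
    """
    machines = sorted({m for row in etc.values() for m in row})
    ready = {m: 0.0 for m in machines}  # time at which each machine is free
    unscheduled = set(etc)
    assignment = {}
    while unscheduled:
        candidates = []
        for task in sorted(unscheduled):
            # Completion time of this task on every machine, best first.
            times = sorted((ready[m] + etc[task][m], m) for m in etc[task])
            best_time, best_machine = times[0]
            second = times[1][0] if len(times) > 1 else best_time
            sufferage = second - best_time
            candidates.append((task, best_machine, best_time, sufferage))
        if heuristic == "min-min":
            pick = min(candidates, key=lambda c: c[2])
        elif heuristic == "max-min":
            pick = max(candidates, key=lambda c: c[2])
        elif heuristic == "sufferage":
            pick = max(candidates, key=lambda c: c[3])
        else:
            raise ValueError(f"unknown heuristic: {heuristic}")
        task, machine, finish, _ = pick
        assignment[task] = machine
        ready[machine] = finish
        unscheduled.remove(task)
    return assignment, max(ready.values())


# Hypothetical ETC table (seconds).
etc = {
    "a": {"m1": 2, "m2": 4},
    "b": {"m1": 3, "m2": 8},
    "c": {"m1": 10, "m2": 11},
}

for name in ("min-min", "max-min", "sufferage"):
    print(name, schedule(etc, name))
```

Which heuristic wins depends on the ETC table, so a practical scheduler can run all three and keep the mapping with the smallest makespan.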

Is This a Workflow Problem?
Only one component is easy (Marin already showed this works)
Scheduling many may not be tractable

Evaluation
EMAN: Electron Micrograph Analysis
Almost entire time spent here

Evaluation
RN: Random Scheduling (DAGMan)
RA: Weighted Random
HC: Heuristic Scheduling with crude performance models (CPU speed)
HA: Heuristic Scheduling with accurate performance models (this scheme)

Evaluation Testbed
147 machines, 4 types
  – 64 dual-processor 900 MHz Itanium IA-64 nodes (RTC, Houston)
  – 16 Opteron 2009 MHz nodes (Medusa, Houston)
  – 60 dual-processor 1300 MHz Itanium IA-64 nodes (acrl, Houston)
  – 7 Pentium IA-32 nodes (Knoxville): used?

Results
2.2x improvement over random

Discussion
Static vs Dynamic Scheduling
  – Problems?
  – Why not use performance models dynamically?
Application to workflows or more to parameter sweeps?
How did they achieve load balance?
Barriers to adoption?