New Workflow Scheduling Techniques Presentation: Anirban Mandal


New Workflow Scheduling Techniques
Presenter: Anirban Mandal
VGrADS Workshop @ UCSD, Sep 2005

Outline
- Drawbacks of Workflow Scheduler v.0
- Middle-Out Scheduling
- Scheduling onto systems with batch queues
- Scheduling onto Abstract Resource Classes

Premise: automate good application-level scheduling using performance models, taking advantage of vgES features.

Top-Down Scheduling

Until all components are mapped:
    While all available components are not mapped:
        For each (component, resource) pair:
            ECT(c,r) = rank(c,r) + EAT(r)
        End For each
        Run the min-min, max-min and sufferage heuristics
        Store the mapping produced by each heuristic
    End While
    Select the mapping with minimum makespan
End Until
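
To make the inner loop concrete, here is a minimal, runnable Python sketch of one scheduling iteration (illustrative only, not the VGrADS implementation; the rank and eat tables are assumed to be filled in from the performance models, and only the min-min heuristic is shown):

def min_min_step(available, resources, rank, eat):
    """Map the currently available components with the min-min heuristic.
    rank[(c, r)] is the estimated run (plus transfer) time of component c
    on resource r; eat[r] is the estimated availability time of resource r."""
    mapping = {}
    unmapped = set(available)
    while unmapped:
        # For each unmapped component, find its best resource by ECT.
        best = {}
        for c in unmapped:
            res = min(resources, key=lambda r: rank[(c, r)] + eat[r])
            best[c] = (res, rank[(c, res)] + eat[res])
        # min-min: commit the component whose best ECT is smallest.
        # (max-min would pick the largest best ECT; sufferage uses the gap
        # between the best and second-best resource.)
        c = min(unmapped, key=lambda comp: best[comp][1])
        res, ect = best[c]
        mapping[c] = res
        eat[res] = ect  # the resource is busy until this component finishes
        unmapped.remove(c)
    return mapping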

Drawbacks of Workflow Scheduler v.0

The top-down workflow scheduler suffers from:
- Myopia: top-down traversal implies no look-ahead, so critical steps can be mapped poorly because of decisions taken higher up in the workflow.
- Assumption of instant resource availability: many systems have batch-queue front ends, so a job has to wait before it starts.
- Scaling problems: scheduling onto individual nodes poses scaling problems in large resource environments (an issue raised at the site visit).

Addressing the Drawbacks

We address the drawbacks as follows:
- Myopia: middle-out scheduling schedules the critical step first and propagates the mapping up and down the workflow.
- Assumption of instant resource availability: incorporate predicted batch-queue wait times into scheduling decisions (joint work: Rice + UCSB).
- Scaling problems: use a two-level scheduling strategy, pruning resources explicitly using vgDL or other means and then scheduling (joint work: Rice + UCSD + Hawaii); schedule onto abstract resource classes/clusters (see Ryan's talk).

Middle-Out Scheduling

[Figure: the same workflow DAG traversed top-down and middle-out, with the key step highlighted as the starting point of the middle-out traversal.]
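
As a minimal Python sketch of the traversal order only (the real scheduler's propagation logic is richer; names are illustrative): the key step is scheduled first, then the mapping is propagated toward the workflow's sources and sinks.

def middle_out_order(levels, key):
    """Visit DAG levels starting from the key (critical) level, then
    upward toward the sources, then downward toward the sinks."""
    order = [key]
    order += list(range(key - 1, -1, -1))       # propagate mapping up
    order += list(range(key + 1, len(levels)))  # propagate mapping down
    return [levels[i] for i in order]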

Middle-Out Scheduling: Results

Compared makespans for middle-out vs. top-down scheduling:
- Resource set: 5 clusters (2 Opteron clusters and 3 Itanium clusters).
- 6 resource-topology scenarios: combinations of Opteron clusters close, normal and far with fast and slow Itaniums, e.g. {(Opteron close, fast Itanium), ...}.
- Application: the actual EMAN DAG with 3 different communication-to-computation ratios (CCR): 0.1, 1 and 10.
- Used known performance-model values for the computational components.
- Varied file sizes to obtain the desired CCR between each pair of synchronization points.

Middle-Out Scheduling: Results (CCR = 0.1)

Computation is 10 times the communication.
- A fast Itanium causes the top-down scheduler to "get stuck" on the Itanium clusters.
- Since the key computation step is scheduled on both Opteron clusters, the makespan depends on the Opteron connectivity.
- In the slow-Itanium case, the top-down scheduler "got lucky", so the gain from middle-out scheduling is small.

Middle-Out Scheduling: Results (CCR = 1)

Communication and computation are equal.
- A fast Itanium causes the top-down scheduler to "get stuck" on the Itanium clusters.
- Since the key computation step is scheduled on both Opteron clusters, the makespan depends on the Opteron connectivity.
- In the slow-Itanium case, the top-down scheduler "got lucky".

Middle-Out Scheduling: Results (CCR = 10)

Communication is 10 times the computation.
- A fast Itanium causes the top-down scheduler to "get stuck" on the Itanium clusters.
- Since the key computation step is scheduled on both Opteron clusters, the makespan depends on the Opteron connectivity.
- In the slow-Itanium case, the top-down scheduler "got lucky".

Middle-Out Scheduling: Results

With increasing communication, the middle-out scheduler performs increasingly better in the cases where the top-down scheduler gets stuck.

Outline
- Drawbacks of Workflow Scheduler v.0
- Middle-Out Scheduling
- Scheduling onto systems with batch queues
- Scheduling onto Abstract Resource Classes

Scheduling onto Batch-Queue Systems

Incorporated point-value predictions for batch-queue wait times via a slight modification to the top-down scheduler:
- At every scheduling step, include the estimated time the job has to wait in the queue in the job's estimated completion time [ECT(c,r) in the algorithm].
- Keep track of the queue wait time for each cluster and the number of nodes that correspond to that wait time.
- With each mapping, update the estimated availability time [EAT in the algorithm] with the queue wait time, as required.

Joint work with Dan Nurmi and Rich Wolski.
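
A hedged sketch of how the predicted wait time might enter the ECT computation; the exact bookkeeping (the nodes_at_wait table and the update rule) is our assumption, not the slide's:

def ect_with_queue_wait(c, cluster, rank, eat, queue_wait, nodes_at_wait):
    """Estimated completion time of component c on a batch-queue cluster."""
    if nodes_at_wait[cluster] > 0:
        # The current wait-time prediction still covers a node: the job
        # starts once both the queue wait and the cluster's EAT have passed.
        start = max(eat[cluster], queue_wait[cluster])
    else:
        # Prediction exhausted: charge a fresh queue wait on top of EAT.
        start = eat[cluster] + queue_wait[cluster]
    return start + rank[(c, cluster)]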

Scheduling onto Batch-Queue Systems: Example

[Figure: an input DAG of four tasks R0..R3 being mapped onto two clusters over a timeline T. Queue wait time for Cluster 0 = 20, with 1 node at this wait time; queue wait time for Cluster 1 = 10, with 2 nodes at this wait time.]
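
Plugging in the figure's numbers (with a hypothetical per-task runtime of 15 time units, chosen only for illustration), a wait-time-aware scheduler prefers Cluster 1 even though it must run the four tasks in two rounds on two nodes:

import math

queue_wait = {"cluster0": 20, "cluster1": 10}
nodes = {"cluster0": 1, "cluster1": 2}
runtime = 15  # hypothetical per-task runtime, assumed equal on both clusters

for c in ("cluster0", "cluster1"):
    rounds = math.ceil(4 / nodes[c])  # 4 tasks R0..R3 run in rounds
    finish = queue_wait[c] + rounds * runtime
    print(c, finish)  # cluster0: 20 + 4*15 = 80; cluster1: 10 + 2*15 = 40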

Outline
- Drawbacks of Workflow Scheduler v.0
- Middle-Out Scheduling
- Scheduling onto systems with batch queues
- Scheduling onto Abstract Resource Classes
  - Addresses the scaling problem
  - Modify the scheduler to schedule onto clusters instead of individual nodes

Scheduling onto Clusters

Input:
- Workflow DAG with restricted structure: nodes at the same level perform the same computation.
- Set of available clusters (number of nodes, architecture, CPU speed, etc.) and inter-cluster network connectivity.
- Per-node performance models for each cluster.

Output:
- Mapping: for each level, the number of instances mapped to each cluster.

Objective: minimize the makespan at each step.

Scheduling onto Clusters: Modeling

Abstract model of the mapping problem for one DAG level. Given:
- N instances and M clusters;
- r1, ..., rM nodes per cluster;
- t1, ..., tM: rank value per node per cluster (incorporates both computation and communication).

Aim: find a partition (n1, n2, ..., nM) of N, with n1 + n2 + ... + nM = N, such that the overall time is minimized.

Analytical solution: no "obvious" closed form because of the discrete nature of the problem.
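
Assuming the instances on a cluster execute in rounds (as the round(j) bookkeeping on the next slide suggests), the problem can be written as the small integer program below; this formalization is our reading of the slide, not a formula from it:

\min_{n_1,\dots,n_M} \; \max_{1 \le j \le M} \; t_j \left\lceil \frac{n_j}{r_j} \right\rceil
\quad \text{s.t.} \quad \sum_{j=1}^{M} n_j = N, \quad n_j \ge 0 \text{ integer.}

The ceiling term is exactly what makes the problem discrete and rules out an obvious analytical solution.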

Scheduling onto Clusters

An iterative (greedy) approach solves the problem and addresses the scaling issue:

For each instance i from 1 to N:
    For each cluster j from 1 to M:
        Tentatively map i onto j
        Record the resulting makespan, taking round(j) into account
    End For each
    Find the cluster p with the minimum makespan increase
    Map i to p
    Update round(p), numMapped(p)
End For each

Complexity: O(#instances × #clusters).
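
A runnable Python sketch of the greedy loop under the round-based makespan model above; here "minimum makespan increase" is approximated by the smallest resulting cluster makespan, a judgment call since the slide does not spell out tie-breaking:

import math

def map_level(N, t, r):
    """Greedily assign N identical instances to clusters. t[j] is the
    per-node rank value of cluster j and r[j] its node count."""
    M = len(t)
    n = [0] * M  # instances mapped to each cluster so far
    for _ in range(N):
        # Makespan of cluster j if it receives one more instance:
        # ceil((n[j] + 1) / r[j]) rounds of t[j] time units each.
        p = min(range(M), key=lambda j: t[j] * math.ceil((n[j] + 1) / r[j]))
        n[p] += 1
    return n

# Example: 10 instances, two clusters (4 nodes at t=3, 2 nodes at t=2).
print(map_level(10, t=[3, 2], r=[4, 2]))  # -> [6, 4]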

Discussions…
